Abstract: The purpose of this research is to specify effective approaches for improving the semantic analysis of graphic content in big data. The article considers images and video scenes as examples of such complex content. The proposed approach takes into account the special features of this content and creates a hybrid annotation model that extends the text annotation model with more specific elements; for visual data, these are visualization characteristics. Determining the similarity of information content is a critical problem in big data tasks. It is the basis for big data categorization and enables the composition of documents, the conversion of unstructured content into relevant knowledge structures, and the visualization of information. Semantic analysis of information content is usually based on its metadata, which form the basis of semantic annotations. Metadata are also elements of a structured semantic description of the content and the basis for its automated processing. The approach uses ontologies to define semantic annotations. Ontologies provide various sources of knowledge for measuring semantic similarity: they contain extensive information about the interpretation of concepts and about other semantic relationships, organized in a hierarchical structure based on hyponymy relations. In recent years, however, the number of image and video resources has grown rapidly, and the available visual information has been significantly enriched. It is often easier to judge visually whether two concepts are similar. Therefore, integrating the semantic and visual information of an image makes it possible to optimize ontological methods of similarity estimation and to obtain similarity metrics that are more consistent with human perception.
In effect, such assessments of the complex semantic similarity of concepts are defined by the composition of two functions: the first is an ontological measure of similarity, and the second is built on a complex feature vector. This vector is a concatenation of semantic and visual characteristics with an established weight balance between the two types of features. Combining visualization features with the semantic and ontological characteristics of the content in the similarity metrics is the central idea of this study.
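
The two-part similarity described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the toy hyponymy hierarchy, the use of Wu-Palmer similarity as the ontological measure, cosine similarity over the weighted concatenated vector, and the weighted sum (`alpha`) used to combine the two functions. The abstract does not specify any of these choices.

```python
import math

# Hypothetical toy hyponymy hierarchy (child -> parent), for illustration only.
PARENT = {
    "animal": None,
    "dog": "animal",
    "cat": "animal",
    "poodle": "dog",
}

# Hypothetical (semantic, visual) feature vectors per concept.
FEATURES = {
    "dog": ([1.0, 0.0], [0.9, 0.1]),
    "cat": ([0.8, 0.2], [0.7, 0.3]),
}


def path_to_root(concept):
    """Return the concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth in the hierarchy, counting the root as depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Assumed ontological measure: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # The first ancestor of `a` that is also an ancestor of `b` is the
    # least common subsumer (lcs).
    lcs = next(c for c in path_to_root(a) if c in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_vector(semantic, visual, w):
    """Concatenate semantic and visual features, weighted by w and 1 - w."""
    return [w * x for x in semantic] + [(1 - w) * x for x in visual]


def hybrid_similarity(a, b, features, w=0.5, alpha=0.5):
    """Assumed combination: weighted sum of the ontological and feature scores."""
    onto = wu_palmer(a, b)
    feat = cosine(combined_vector(*features[a], w),
                  combined_vector(*features[b], w))
    return alpha * onto + (1 - alpha) * feat


if __name__ == "__main__":
    print(hybrid_similarity("dog", "cat", FEATURES))
```

In this sketch, `w` sets the balance between semantic and visual features inside the vector, while `alpha` sets how much the ontological score contributes relative to the feature-based score; both are free parameters that would need tuning against human similarity judgments.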

Metadata as a tool of the semantic analysis of the complex contents of the big data. The images

A fine-grained perspective on big data knowledge creation: dimensions, insights, and mechanism from a pilot study

MetaInsight: Automatic Discovery of Structured Knowledge for Exploratory Data Analysis

Making Metadata More FAIR Using Large Language Models

Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models

Inferring Tabular Analysis Metadata by Infusing Distribution and Knowledge Information

Knowledge and Metadata Integration for Warehousing Complex Data

AnaMeta: A Table Understanding Dataset of Field Metadata Knowledge Shared by Multi-dimensional Data Analysis Tasks

Semantic Modelling of Organizational Knowledge as a Basis for Enterprise Data Governance 4.0 -- Application to a Unified Clinical Data Model

Detecting customers knowledge from social media big data: toward an integrated methodological framework based on netnography and business analytics

A Framework for Capturing and Analyzing Unstructured and Semi-structured Data for a Knowledge Management System

DETEXA: declarative extensible text exploration and analysis through SQL

Combining large language models with enterprise knowledge graphs: a perspective on enhanced natural language understanding

A Metadata Generation System with Semantic Understanding for Video Retrieval in Film Production

Towards Accurate and Efficient Document Analytics with Large Language Models

Lightweight Knowledge Representations for Automating Data Analysis

Automatic Construction of Enterprise Knowledge Base

Digitising Cultural Complexity: Representing Rich Cultural Data in a Big Data environment