What problem does this paper attempt to address?

This paper addresses the extraction of entities and their relationships in social networks. Specifically, the author focuses on how to extract social networks from web pages, where the data is heterogeneous and lacks semantic structure. The paper proposes an information-retrieval-based method to address these challenges and compares the performance of several methods for extracting social relationships, including strength relations and relations based on online academic databases.

### Main problems:

1. **Heterogeneity of web-page data and lack of semantic structure**: Most web documents are unstructured and carry no explicit semantic information, which makes it difficult to extract social networks from them.
2. **Entity recognition and relationship extraction**: How to effectively identify entities (such as individuals and organizations) in web pages, together with the relationships between them, especially in large-scale data.
3. **Performance evaluation of methods**: How to evaluate different methods for extracting social networks, especially in terms of precision and recall.

### Solutions:

1. **Information-retrieval-driven method**: Use information-retrieval techniques to extract social networks from web pages, focusing on entity recognition and relationship extraction.
2. **Comparison of multiple methods**: Compare the performance of supervised and unsupervised learning methods in extracting social networks, especially strength relations (SRS) and underlying strength relations based on URLs (USR).
3. **Experimental verification**: Verify the performance of the different methods experimentally, using a data set of 539 web pages and comparing the extracted networks with a benchmark graph from the DBLP online database.
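The comparison of an extracted network against a benchmark graph can be sketched as follows. This is a minimal illustration, not the paper's evaluation procedure: it assumes relations are undirected edges between named entities and scores them as plain edge sets, a common simplification. The function name `evaluate_edges` and the toy names are hypothetical.

```python
def evaluate_edges(extracted, benchmark):
    """Return (precision, recall, F-value) of extracted edges vs. a benchmark.

    Assumption: relations are undirected, so (a, b) and (b, a) are the same edge.
    """
    ext = {frozenset(edge) for edge in extracted}
    ref = {frozenset(edge) for edge in benchmark}
    correct = len(ext & ref)  # edges found in both graphs

    precision = correct / len(ext) if ext else 0.0
    recall = correct / len(ref) if ref else 0.0
    denom = precision + recall
    f_value = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_value


# Toy data (hypothetical): three extracted relations, four benchmark relations.
extracted = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Dave")]
benchmark = [("Bob", "Alice"), ("Carol", "Bob"), ("Carol", "Dave"), ("Dave", "Erin")]

precision, recall, f_value = evaluate_edges(extracted, benchmark)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_value:.3f}")
```

Storing each edge as a `frozenset` makes the comparison independent of the order in which the two endpoints were extracted.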
### Formula summary:

- **Jaccard coefficient**:
  \[ \text{sim}_{\text{jac}}(a, b)=\frac{|a\cap b|}{|a| + |b|-|a\cap b|} \]
- **Conditional probability**:
  \[ p(b_i|a)=\frac{|(q\Rightarrow b_i) = T|}{|M|} \]
- **Improved Jaccard coefficient**:
  \[ \text{sim}(a, b_i)=\frac{|(a\Rightarrow b_i) = T|}{|M|+|D_{b_i}|-|(q\Rightarrow b_i) = T|} \]
- **TF-IDF calculation**:
  \[ \text{TF.IDF}_w=\text{tf}(w)\cdot\text{idf}(w)=\left(\sum_{j = 1}^{N}\sum_{i = 1}^{m}\frac{1}{n}\right)\log\frac{N}{\text{df}(w)} \]
- **Normalized TF-IDF**:
  \[ \text{tfidf}_{\text{nor}}=(\text{TF.IDF})\left(\frac{N}{\sigma}\right) \]
- **Recall**:
  \[ \text{Rec}(S_i)=\frac{| \{ S\in P(S_i):C(S)=C(S_i)\}|}{| \{ S\in P(S_i)\}|} \]
- **Precision**:
  \[ \text{Prec}(S_i)=\frac{| \{ S\in C(S_i):P(S)=P(S_i)\}|}{| \{ S\in C(S_i)\}|} \]
- **F-value**:
  \[ F = \frac{2\cdot\text{Rec}\cdot\text{Prec}}{\text{Rec}+\text{Prec}} \]

Through these methods and formulas, the paper aims to improve the efficiency and accuracy of extracting social networks from web pages.
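As a worked illustration of the Jaccard coefficient above, the following sketch computes it in two equivalent ways: from explicit sets, and from counts alone. It assumes that `a` and `b` stand for the sets of documents mentioning each entity, and that in a search-engine setting only hit counts are available. The function names and the toy document IDs are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / (|a| + |b| - |a ∩ b|) for two collections."""
    a, b = set(a), set(b)
    common = len(a & b)
    union = len(a) + len(b) - common
    return common / union if union else 0.0


def jaccard_from_hits(hits_a, hits_b, hits_ab):
    """Same coefficient computed from counts only.

    Assumption: hits_a and hits_b are the numbers of documents mentioning each
    entity, and hits_ab is the number mentioning both.
    """
    union = hits_a + hits_b - hits_ab
    return hits_ab / union if union else 0.0


# Toy example: document IDs in which each entity name occurs.
docs_a = {1, 2, 3, 4}
docs_b = {3, 4, 5}

print(jaccard(docs_a, docs_b))        # from explicit sets
print(jaccard_from_hits(4, 3, 2))     # from counts alone
```

Both calls describe the same situation (four documents for one entity, three for the other, two shared), so they return the same value, 2 / 5.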

Social Network Extraction: Superficial Method and Information Retrieval

A Methodology to Extract Social Network from the Web Snippet

Information Retrieval Model: A Social Network Extraction Perspective

Extracted Social Network Mining

Social Network Extraction of Academic Researchers

Solution to Large Scale Extraction of Social Relations of Persons Based on Web

Learning a Probabilistic Semantic Model from Heterogeneous Social Networks for Relationship Identification

Social network mashup: Ontology-based social network integration for statistic learning

Enhanced Semantic Graph Based Approach With Sentiment Analysis For User Interest Retrieval From Social Sites

Extracting Academic Information from Conference Web Pages

Mining Online Social Networks: Deriving User Preferences through Node Embedding

Network embedding enhanced intelligent recommendation for online social networks

A Social Search Model for Large Scale Social Networks

Strong Social Component-Aware Trust Sub-network Extraction in Contextual Social Networks

Parallelization in Extracting Fresh Information from Online Social Network

KG-CFSA: a comprehensive approach for analyzing multi-source heterogeneous social network knowledge graph

Semantic Mining of Social Networks

A Study on Online Social Networks Theme Semantic Computing Model

Mining Social Data to Extract Intellectual Knowledge

Contributive Social Capital Extraction From Different Types of Online Data Sources

Survey of network embedding techniques for social networks