Abstract:<p>The revolution in information technology generates vast amounts of data that people must be able to process efficiently. Automatic text summarization has therefore been applied in various domains to find the most relevant information and support fast, critical decisions. For Arabic, text summarization techniques suffer from several problems. First, most existing methods do not consider the context or domain to which the document belongs. Second, the majority of existing approaches are based on the traditional bag-of-words representation, which produces high-dimensional, sparse data and makes it difficult to capture relevant information. Third, research on Arabic text summarization is relatively limited and recent compared with that on English and other languages, owing to the shortage of Arabic corpora, resources, and automatic processing tools. In this paper, we address these limitations by proposing a new approach that uses document clustering, topic modeling, and unsupervised neural networks to build an efficient document representation model. First, a new document clustering technique based on the Extreme Learning Machine is applied to a large text collection. Second, topic modeling is applied to the document collection to identify the topics present in each cluster. Third, each document is represented in a topic space by a matrix whose rows represent the document's sentences and whose columns represent the cluster topics. Several unsupervised neural networks and ensemble learning algorithms are then trained on this matrix to build an abstract representation of the document in the concept space. Important sentences are ranked and extracted according to a graph model with a redundancy elimination component. The proposed approach is evaluated on the Essex Arabic Summaries Corpus and compared against other Arabic text summarization approaches using the ROUGE measure. 
Experimental results showed that models trained on the topic representation learn better representations and significantly improve summarization performance. In particular, ensemble learning models demonstrated a notable improvement in ROUGE recall and promising results on F-measure.</p>
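
The final ranking step can be illustrated with a minimal sketch. The abstract does not specify the graph model, so the choices here are assumptions: sentences are rows of a sentence-by-topic matrix, edges are weighted by cosine similarity, scores come from a PageRank-style iteration, and redundancy is removed with a greedy similarity threshold. All function names, parameters, and the toy matrix are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_sentences(matrix, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration (assumed graph model).

    matrix: list of rows, one per sentence, each a list of topic weights.
    """
    n = len(matrix)
    if n == 0:
        return []
    # Similarity graph between sentences, without self-loops.
    sim = [[cosine(matrix[i], matrix[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            incoming = sum(sim[j][i] / out_weight[j] * scores[j]
                           for j in range(n) if out_weight[j] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def select_summary(matrix, k=2, redundancy_threshold=0.9):
    """Greedily pick up to k top-scoring sentences, skipping near-duplicates."""
    scores = rank_sentences(matrix)
    order = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        # Illustrative redundancy check: reject a sentence that is too
        # similar to one already selected.
        if all(cosine(matrix[i], matrix[j]) < redundancy_threshold
               for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return sorted(chosen)  # restore document order


if __name__ == "__main__":
    # Toy sentence-by-topic matrix: sentences 0 and 1 are duplicates.
    toy = [
        [0.9, 0.1, 0.0],
        [0.9, 0.1, 0.0],
        [0.1, 0.8, 0.1],
        [0.0, 0.1, 0.9],
    ]
    print(select_summary(toy, k=2))
```

In this toy input the first two rows are identical, so the threshold check keeps at most one of them in the summary. In the approach described above, the rows would come from the learned concept-space representation rather than from hand-written weights.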

Clustering Web Search Results For Effective Arabic Language Browsing

Learning to Cluster Web Search Results

Clustering Web Search Results Using Semantic Information

Lexical Ambiguity in Arabic Information Retrieval: The Case of Six Web-Based Search Engines

An Accuracy-Enhanced Stemming Algorithm for Arabic Information Retrieval

Ameliorating Search Results Recommendation System Based on K-Means Clustering Algorithm and Distance Measurements

Personalized Concept-Based Clustering of Search Engine Queries

An online clustering algorithm for Chinese web snippets based on Generalized Suffix Array

Web Search Clustering and Labeling with Hidden Topics

Query Result Clustering For Object-Level Search

An efficient user-oriented clustering of web search results

An Effective Clustering-Based Web Page Recommendation Framework for E-Commerce Websites

ISTC: A New Method for Clustering Search Results

C4-2: Combining Link and Contents in Clustering Web Search Results to Improve Information Interpretation

Use Link-Based Clustering To Improve Web Search Results

A Chinese Web Page Clustering Algorithm Based on the Suffix Tree

Unsupervised neural networks for automatic Arabic text summarization using document clustering and topic modeling

Web Pages Clustering: A New Approach

Suffix Tree Based Label Generation Method for Web Search Results Clustering

Link Based Clustering of Web Search Results

On Combining Link and Contents Information for Web Page Clustering