Abstract: Web content mining covers the classification, clustering, and attribute analysis of large numbers of text documents and multimedia files on the web. Specific tasks include retrieving data from the Internet search engine tool W and the structured processing and analysis of web data. Web log analysis today raises security concerns, and we conduct experiments to investigate its security. From these experiments we draw the following conclusions: (1) Web log mining can apply efficient data mining algorithms to systematically extract logs from web servers, determine users' main access types or interests, and then, to a certain extent, analyze users' access settings and behavior from the discovered usage patterns. (2) On both the test set and the mixed test set, the curve value of deep mining is very stable at 0.95, while the curve values of the fuzzy statistics method and the quantitative statistics method stay within the interval 0.90–0.95. The results also show that the data mining method has the highest identification accuracy and the best security performance. (3) Web usage analysis requires data abstraction for pattern discovery. This abstraction can be achieved through data preprocessing, which covers the different formats of web server log files and how web server log data is preprocessed for web usage analysis. Web log mining is one of the most critical parts of the web mining field: it uses powerful data mining algorithms to systematically mine web server logs, learn users' access preferences and interests, and characterize user preferences and behavior from the discovered patterns. Because current web log analysis faces security problems, we conduct experiments to verify the security performance of web logs and draw our conclusions from the results.
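To make the preprocessing step in conclusion (3) concrete, the following is a minimal sketch of parsing and cleaning web server log entries. It is an illustration only, not the method evaluated in this paper. It assumes the logs are in the Common Log Format. The names `parse_line`, `clean`, `pages_per_host`, and `SAMPLE_LOG`, the list of static-file suffixes, and the sample entries are all hypothetical.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Assumed list of embedded-resource suffixes to discard during cleaning.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def parse_line(line):
    """Return the fields of one log entry as a dict, or None if malformed."""
    match = CLF_PATTERN.match(line)
    return match.groupdict() if match else None


def clean(records):
    """Keep successful (2xx) page requests; drop malformed and static entries."""
    kept = []
    for rec in records:
        if rec is None:
            continue
        if not rec["status"].startswith("2"):
            continue
        if rec["path"].lower().endswith(STATIC_SUFFIXES):
            continue
        kept.append(rec)
    return kept


def pages_per_host(records):
    """Count page requests per client host as a crude usage profile."""
    counts = {}
    for rec in records:
        counts.setdefault(rec["host"], Counter())[rec["path"]] += 1
    return counts


# Hypothetical sample entries, for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2023:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512',
    '192.0.2.2 - - [10/Oct/2023:13:56:01 +0000] "GET /missing HTTP/1.1" 404 0',
    '192.0.2.1 - - [10/Oct/2023:13:57:10 +0000] "GET /products.html HTTP/1.1" 200 4100',
    'not a log line',
]

cleaned = clean(parse_line(line) for line in SAMPLE_LOG)
profiles = pages_per_host(cleaned)
```

Grouping by client host is only a rough stand-in for user identification. A fuller pipeline would also need session identification and robot filtering, which this sketch leaves out.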
