Abstract: English is now widely used around the world as an international language. As a symbol of the development of human civilization, English characters provide an important medium and tool for mankind. In the current information age, the English vocabulary keeps growing in volume and appears almost everywhere. Against this background of a growing number of English words and increasingly quantified relationships between them, this work integrates the characteristics of the language to carry out similarity measurement and calculation for English words, together with vocabulary classification based on those measurements. The main results are as follows: (1) The development of English words is analyzed, the research direction of the experiment is determined, the concept of English character features is proposed, and a similarity calculation method is selected according to the different features, in order to simplify the complex and difficult-to-understand relationships of meaning between English words. (2) Text features are extracted through similarity-based feature selection over language and text; the extraction of features indirectly affects the effectiveness of classification. Similarity word embedding vectors are used to map English words to vectors for analysis and comparison. The distance between the numerical variables of English words and their similarity coefficient are then calculated to evaluate how similar the words are. The two main methods for calculating the similarity coefficient are the angle cosine method and the correlation coefficient method.
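
The two similarity coefficients named in the abstract can be illustrated with a minimal sketch. It assumes each word has already been mapped to a numeric vector; the function names and the toy vectors below are hypothetical and are not taken from the paper. The angle cosine method is taken here to be cosine similarity, and the correlation coefficient method to be the Pearson correlation coefficient, which equals the cosine of the mean-centered vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def correlation_coefficient(u, v):
    """Pearson correlation: cosine similarity of the mean-centered vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    centered_u = [a - mean_u for a in u]
    centered_v = [b - mean_v for b in v]
    return cosine_similarity(centered_u, centered_v)


# Hypothetical 4-dimensional word vectors, for illustration only.
embeddings = {
    "cat": [0.9, 0.1, 0.3, 0.0],
    "dog": [0.8, 0.2, 0.4, 0.1],
    "car": [0.1, 0.9, 0.0, 0.7],
}

for a, b in [("cat", "dog"), ("cat", "car")]:
    cos = cosine_similarity(embeddings[a], embeddings[b])
    corr = correlation_coefficient(embeddings[a], embeddings[b])
    print(f"{a} vs {b}: cosine={cos:.3f}, correlation={corr:.3f}")
```

Both coefficients lie in the range [-1, 1], and a larger value indicates more similar words. Centering makes the correlation coefficient insensitive to a constant offset added to every component of a vector, which cosine similarity is not.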

A String Similarity Calculation for Recognising Keywords of Coined Profanities

An adaptive method for text domain similarity calculation

A Pivotal Prefix Based Filtering Algorithm for String Similarity Search

A Study of Discriminatory Speech Classification Based on Improved Smote and SVM-RF

Robust Quick String Matching Algorithm for Network Security

An Approximate String Matching Algorithm for Chinese Information Retrieval Systems

Curbing Profanity Online: A Network-Based Diffusion Analysis of Profane Speech on Chinese Social Media

Message similarity calculating approach of P2P searching

An Improved Chinese String Comparator for Bloom Filter Based Privacy-Preserving Record Linkage

On a New Algorithm for Removing Repeating Patterns in Similarity Analysis

A Similarity Metric Method of Obfuscated Malware Using Function-Call Graph

Online community thread similarity measurement algorithm based on author analysis

The Methodology and an Application to Fight Against Unicode Attacks

Research on algorithm for networks new words identification

Similarity Measurement and Classification of English Characters Based on Language Features

A New Similarity Computing Method Based on Concept Similarity in Chinese Text Processing

A study on the classification of stylistic and formal features in English based on corpus data testing

Chinese Word Similarity Computing Based on Combination Strategy

YZR-net: Self-supervised Hidden representations Invariant to Transformations for profanity detection

Design and Implementation of FAQ Automatic Return System Based on Similarity Computation

A similarity reinforcement algorithm for heterogeneous web pages