Abstract: We propose a metric for the space of multiple sequence alignments that can be used to compare two alignments to each other. In the case where one of the alignments is a reference alignment, the resulting accuracy measure improves upon previous approaches and provides a balanced assessment of the fidelity of both matches and gaps. Furthermore, in the case where a reference alignment is not available, we provide empirical evidence that the distance from an alignment produced by one program to predicted alignments from other programs can be used as a control for multiple alignment experiments. In particular, we show that low-accuracy alignments can be effectively identified and discarded. We also show that in the case of pairwise sequence alignment, it is possible to find an alignment that maximizes the expected value of our accuracy measure. Unlike previous approaches based on expected accuracy alignment, which tend to maximize sensitivity at the expense of specificity, our method is able to identify unalignable sequence, thereby increasing overall accuracy. In addition, the algorithm allows for control of the sensitivity/specificity tradeoff via the adjustment of a single parameter. These results are confirmed with simulation studies showing that unalignable regions can be distinguished from homologous, conserved sequences. Finally, we propose an extension of the pairwise alignment method to multiple alignment. Our method, which we call AMAP, outperforms existing protein sequence multiple alignment programs on benchmark datasets. A webserver and software downloads are available at http://bio.math.berkeley.edu/amap/.

An Error-Sensitive Evaluation Metric for Word Alignment

Experimental Investigation into Alignment-based Acoustic Confidence Measures in Keyword Verification for Mandarin Speech

Multilingual Word Error Rate Estimation: e-WER3

Automatic Speech Recognition System-Independent Word Error Rate Estimation

Improving Statistical Word Alignment with Various Clues

Minimum Error Rate Training for Bilingual News Alignment

A New Evaluation Method: Evaluation Data and Metrics for Chinese Grammar Error Correction

AI-Assisted Human Evaluation of Machine Translation

Error Span Annotation: A Balanced Approach for Human Evaluation of Machine Translation

Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric

Beyond Levenshtein: Leveraging Multiple Algorithms for Robust Word Error Rate Computations And Granular Error Classifications

Advocating Character Error Rate for Multilingual ASR Evaluation

MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators

Consistency-Aware Search for Word Alignment

Enhancing Empathic Accuracy: Penalized Functional Alignment Method to Correct Misalignment in Emotional Perception

Comparative Study of Word Alignment Heuristics and Phrase-Based SMT

Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages

Alignment Metric Accuracy

A Cross-Entropy-Guided (CEG) Measure for Speech Enhancement Front-End Assessing Performances of Back-End Automatic Speech Recognition

A New Psychometric-inspired Evaluation Metric for Chinese Word Segmentation