Speech Length Threshold in Forensic Speaker Comparison by Using Long-Term Cumulative Formant (LTCF) Analysis

Cao Honglin, Kong Jiangping
DOI: https://doi.org/10.1109/imccc.2012.103
2012-01-01
Abstract: Long-Term Formant distribution (LTF) is a relatively new method in forensic speaker comparison whose results have been shown to contain important speaker-specific information. However, few studies have addressed the fundamental question of how long a speech sample needs to be. This paper investigates the speech length threshold (SLT) using Long-Term Cumulative Formant (LTCF) analysis, one of the LTF methods. The speech sample for each speaker was segmented into one-second sub-samples. Pearson's correlation coefficients were calculated between the LTCF values of the whole speech sample and those of a series of cumulative samples, formed by starting from the first sub-sample and successively appending the immediately following sub-sample. The results show that the SLT can be placed at about 70 seconds of natural speech recording (approximately 20 seconds of vocalic-only material), which is adequate to represent the overall vocal tract resonance characteristics.
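The cumulative-correlation procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes that LTCF values can be approximated by a normalized histogram of per-frame formant frequencies, and the function names, the bin layout, the frame rate, and the stability criterion `r_min=0.99` are all hypothetical choices that do not come from the paper.

```python
import numpy as np

def ltcf_histogram(formants_hz, bins):
    """Normalized histogram of formant values (an assumed stand-in for LTCF)."""
    counts, _ = np.histogram(formants_hz, bins=bins)
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

def cumulative_correlations(formants_hz, frames_per_second, bins):
    """Pearson r between the whole sample and each cumulative prefix.

    Element k-1 of the result compares the first k seconds with the whole
    sample. Trailing frames that do not fill a full second are dropped.
    """
    formants_hz = np.asarray(formants_hz, dtype=float)
    n_seconds = len(formants_hz) // frames_per_second
    formants_hz = formants_hz[: n_seconds * frames_per_second]
    whole = ltcf_histogram(formants_hz, bins)
    rs = []
    for k in range(1, n_seconds + 1):
        # Grow the sample by one one-second sub-sample at a time.
        prefix = formants_hz[: k * frames_per_second]
        rs.append(np.corrcoef(ltcf_histogram(prefix, bins), whole)[0, 1])
    return np.array(rs)

def speech_length_threshold(rs, r_min=0.99):
    """Shortest duration in seconds from which r stays at or above r_min.

    r_min is a hypothetical stability criterion, not a value from the paper.
    """
    below = np.nonzero(rs < r_min)[0]
    if below.size == 0:
        return 1
    first_stable = below[-1] + 1  # index of the first stable entry
    return int(first_stable) + 1 if first_stable < len(rs) else None
```

With real data, `formants_hz` would hold the measured values of one formant (for example F2) over the vocalic frames of a recording, and the returned duration would be counted in seconds of that vocalic material rather than of the full recording.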