Abstract:Cognitive Components of Speech at Different Time Scales Ling Feng (lf@imm.dtu.dk) Informatics and Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby, Denmark Lars Kai Hansen (lkh@imm.dtu.dk) Informatics and Mathematical Modelling Technical University of Denmark 2800 Kgs. Lyngby, Denmark Abstract statistics. This has been demonstrated by a variety of inde- pendent component analysis (ICA) algorithms, whose rep- resentations closely resemble those found in natural percep- tual systems. Examples are, e.g., visual features (Bell & Se- jnowski, 1997; Hoyer & Hyvrinen, 2000), and sound features (Lewicki, 2002). Within an attempt to generalize these findings to higher cognitive functions we proposed and tested the independent cognitive component hypothesis, which basically asks the question: Do humans also use information theoretically opti- mal ICA methods in more generic and abstract data analysis? Cognitive component analysis (COCA) is thus simply defined as the process of unsupervised grouping of abstract data such that the ensuing group structure is well-aligned with that re- sulting from human cognitive activity (Hansen, Ahrendt, & Larsen, 2005). For the preliminary research on COCA, hu- man cognitive activity is restricted to the human labels in su- pervised learning methods. This interpretation is not compre- hensive, however it is capable of representing some intrinsic mechanism of human cognition. Further more, COCA is not limited to one specific technique, but rather a conglomerate of different techniques. We envision that efficient representa- tions of high level processes are based on sparse distributed codes and approximate independence, similar to what has been found for more basic perceptual processes. As men- tioned, independence can dramatically reduce the perception- to-action mappings by using factorial codes rather than com- plex codes based on the full joint distribution. Hence, it is a natural starting point to look for high-level statistically inde- pendent features when aiming at high-level representations. In this paper we focus on cognitive processes in digital speech signals. The paper is organized as follows: First we discuss the specifics of the cognitive component hypothesis in rela- tion to speech, then we describe our specific methods, present results obtained for the TIMIT database, and finally, we con- clude and draw some perspectives. Cognitive component analysis (COCA) is defined as unsu- pervised grouping of data leading to a group structure well- aligned with that resulting from human cognitive activity. We focus here on speech at different time scales looking for pos- sible hidden ‘cognitive structure’. Statistical regularities have earlier been revealed at multiple time scales corresponding to: phoneme, gender, height and speaker identity. We here show that the same simple unsupervised learning algorithm can de- tect these cues. Our basic features are 25-dimensional short- time Mel-frequency weighted cepstral coefficients, assumed to model the basic representation of the human auditory system. The basic features are aggregated in time to obtain features at longer time scales. Simple energy based filtering is used to achieve a sparse representation. Our hypothesis is now basi- cally ecological: We hypothesize that features that are essen- tially independent in a reasonable ensemble can be efficiently coded using a sparse independent component representation. The representations are indeed shown to be very similar be- tween supervised learning (invoking cognitive activity) and un- supervised learning (statistical regularities), hence lending ad- ditional support to our cognitive component hypothesis. Keywords: Cognitive component analysis; time scales; en- ergy based sparsification; statistical regularity; unsupervised learning; supervised learning. Introduction The evolution of human cognition is an on-going interplay between statistical properties of the ecology, the process of natural selection, and learning. Robust statistical regularities will be exploited by an evolutionary optimized brain (Barlow, 1989). Statistical independence may be one such regularity, which would allow the system to take advantage of factorial codes of much lower complexity than those pertinent to the full joint distribution. In (Wagensberg, 2000), the success of given ‘life forms’ is linked to their ability to recognize in- dependence between predictable and un-predictable process in a given niche. This represents a precision of the classical Darwinian paradigm by arguing that natural selection sim- ply favors innovations which increase the independence of the agent and un-predictable processes. The agent can be an individual or a group. The resulting human cognitive sys- tem can model complex multi-agent scenery, and use a broad spectrum of cues for analyzing perceptual input and for iden- tification of individual signal producing processes. The optimized representations for low level perception are indeed based on independence in relevant natural ensemble Cognitive Component Analysis In sensory coding it is proposed that visual system is near to optimal in representing natural scenes by invoking ‘sparse distributed’ coding (Field, 1994). The sparse signal consists of relatively few large magnitude samples in a background of numbers of small signals. When mixing such indepen-

Data-driven Temporal Processing Using Independent Component Analysis for Robust Speech Recognition

Research of Independent Component Analysis for Preprocessing Phonocardiograms

The Efficiency of Ica-Based Representation Analysis: Application to Speech Feature Extraction

Maximum Likelihood I-Vector Space Using PCA for Speaker Verification.

Temporal Coding of Local Spectrogram Features for Robust Sound Recognition

Ica Feature Extract Ion-Framework and Application

Adaptive Speech Separation Based on Beamforming and Frequency Domain-Independent Component Analysis

Optimizing Ica-Based Spatial Filters For Long-Term Bci Users

Statistical Beamformer Exploiting Non-stationarity and Sparsity with Spatially Constrained ICA for Robust Speech Recognition

Separation and Recognition of Electroencephalogram Patterns Using Temporal Independent Component Analysis

Experimental evaluation of a new speaker identification framework using PCA.

Improved estimation of the number of independent components for functional magnetic resonance data by a whitening filter.

Cognitive Components of Speech at Different Time Scales

Application of independent component analysis on artifacts removal in the time-frequency analysis of transient evoked otoacoustic emissions

A Novel I-Vector Framework Using Multiple Features and PCA for Speaker Recognition in Short Speech Condition

Robust Feature Extraction Using Temporal Context Averaging for Speaker Identification in Diverse Acoustic Environments

Text-independent Speaker Identification Based on Spectral Weighting Functions

Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition

Analysis of fMRI data by blind separation of data in a tiny spatial domain into independent temporal component

Face Recognition Based On Independent Component Analysis And Fuzzy Support Vector Machine

Noise reduction in speech signals using adaptive independent component analysis (ICA) for hands free communication devices