Abstract: Word segmentation in Chinese reading has recently become an important research issue, especially the question of how readers segment sequences of characters into words. The feed-forward hypothesis assumes that the visual information obtained from Chinese characters is first fed into a character recognition system, and that word segmentation follows character recognition and leads to word recognition. The holistic hypothesis assumes that word segmentation influences character recognition through feedback. According to the interactive hypothesis, word segmentation is an interactive process involving both characters and words. In the present study, we investigated the timing of segmentation of word n+1 and word n to test these three hypotheses. The sentences used in the experiment consisted of 7 to 10 two-character words. The stimuli were counterbalanced across four conditions following a Latin-square design. Experiment 1 contained two sub-experiments (Experiments 1A and 1B). In Experiment 1A, we manipulated the delay time in conditions facilitating the segmentation of word n+1. There were four conditions in total: a control condition and three facilitating-segmentation conditions, in which the color of word n+1 changed from red to black after word n had been fixated for 40, 120, or 160 ms. In the control (baseline) condition, normal sentences were presented in red. The results showed that none of the conditions facilitating the segmentation of word n+1 improved sentence reading time, but they did positively influence eye movement measures, with a reliable main effect of delay time; this pattern did not fit the predictions of the holistic hypothesis. However, the influence of exogenous attention could not be excluded from the results of Experiment 1A. Therefore, we conducted Experiment 1B, which used conditions similar to those in Experiment 1A. In this sub-experiment, however, two adjacent characters that did not belong to the same word were grouped together, and their color was changed simultaneously.
Thus, these manipulations interrupted the segmentation of word n+1. This sub-experiment also allowed us to test the three hypotheses described above. The results showed that all of the interrupting manipulations negatively influenced eye movement measures, and that their influence on sentence reading time followed an inverted U shape as a function of delay time. The patterns of eye movements and sentence reading time in Experiment 1B differed from those in Experiment 1A, which excludes the influence of exogenous attention from the results obtained in Experiment 1A. The results of Experiment 1B did not fit the predictions of the feed-forward hypothesis. Experiment 2 also contained two sub-experiments (Experiments 2A and 2B), in which we manipulated the delay time in conditions facilitating or interrupting the segmentation of word n. Four conditions were used in Experiment 2A: a control condition serving as the baseline and three conditions facilitating the segmentation of word n, in which the color changed from red to black after a fixation time of 40, 120, or 160 ms. These facilitating conditions did not improve sentence reading time, and they negatively influenced eye movement measures, without a reliable main effect of delay time. In contrast, interrupting the segmentation of word n (Experiment 2B) negatively influenced both sentence reading time and eye movement measures, with a reliable main effect of delay time. The results of Experiment 2 did not fit the predictions of either the feed-forward or the holistic hypothesis. In summary, the results of both experiments (Experiments 1 and 2) indicate an interactive process between characters and words.
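
The Latin-square balancing of stimuli across the four conditions can be illustrated with a minimal sketch. It is an assumption about how such a rotation is commonly implemented, not the authors' procedure: the condition labels, the function names, and the item count of 8 are hypothetical, since the abstract does not report the number of sentences or presentation lists.

```python
from collections import Counter

# Hypothetical labels for the control and the three delay conditions.
CONDITIONS = ["control", "delay_40ms", "delay_120ms", "delay_160ms"]


def latin_square_condition(item_index, list_index, conditions=CONDITIONS):
    """Rotate conditions so that each item meets every condition across lists."""
    return conditions[(item_index + list_index) % len(conditions)]


def build_lists(n_items, conditions=CONDITIONS):
    """Build one presentation list per condition; list[l][i] is item i's condition."""
    return [
        [latin_square_condition(i, l, conditions) for i in range(n_items)]
        for l in range(len(conditions))
    ]


# Assumed item count for illustration only.
lists = build_lists(n_items=8)
for list_index, assignment in enumerate(lists):
    print(list_index, dict(Counter(assignment)))
```

With an item count that is a multiple of four, every list contains each condition equally often, and every sentence appears in each condition exactly once across the four lists, so no participant reads the same sentence twice.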

Forgetting Word Segmentation in Chinese Text Classification with L1-Regularized Logistic Regression.

Long Short-Term Memory Neural Networks for Chinese Word Segmentation.

Chinese Word Segmentation with Maximum Entropy and N-gram Language Model

Neural Word Segmentation Learning for Chinese

Chinese Word Segmentation with Character Abstraction.

Probabilistic Chinese word segmentation with non-local information and stochastic training

Lattice LSTM for Chinese Sentence Representation

Enhancing Domain Portability of Chinese Segmentation Model Using Chi-Square Statistics and Bootstrapping

Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization

Enhancing Chinese Word Segmentation Using Unlabeled Data

Is Word Segmentation Necessary for Deep Learning of Chinese Representations?

Test the Activation Model of Transforming Characters to Words in Chinese Reading: Evidence from Delay Word-Boundary Effects

A morphology-based Chinese word segmentation method

Bidirectional LSTM-CRF Attention-based Model for Chinese Word Segmentation

Chinese Word Segmentation Without Using Lexicon and Hand-Crafted Training Data

Chinese Web Page Classification Based on Statistical Word Segmentation

A Discriminative Latent Variable Chinese Segmenter with Hybrid Word/Character Information.

Study on influences of different Chinese word segmentation methods to text automatic classification based on LDA model

Compositional Recurrent Neural Networks for Chinese Short Text Classification

State-of-the-art Chinese Word Segmentation with Bi-LSTMs