Dynamic Assessment of Information Acquisition Effort During Interactive Search.
Michael J. Cole, Jacek Gwizdka, Chang Liu, Nicholas J. Belkin
DOI: https://doi.org/10.1002/meet.2011.14504801149
2011-01-01
Proceedings of the American Society for Information Science and Technology, Volume 48, Issue 1, p. 1-10. First published: 11 January 2012.
Michael J. Cole (m.cole@rutgers.edu), Jacek Gwizdka (asist2011@gwizdka.com), Chang Liu (changl@eden.rutgers.edu), Nicholas J. Belkin (belkin@rutgers.edu). School of Communication and Information, Rutgers, The State University of New Jersey, 4 Huntington Street, New Brunswick, NJ 08901, USA.

Abstract
We present a method to measure some aspects of cognitive effort by a user while reading during a search session. We measured reading eye movement properties and patterns of eye movement in a user study (n=32) of participants carrying out realistic journalism IR work tasks. The results show the cognitive effort measures correlate positively with the information task characteristics that, by hypothesis, contribute to task difficulty. They also correlate well with participant self-assessed task difficulty. Our methodology can be applied to eye tracking data in any (textual) information setting and used to give dynamic estimates of these aspects of cognitive processing during search.

INTRODUCTION
How difficult is the task? When was it most difficult? The topic of assessing information task difficulty has received widespread attention (e.g., Byström, 2002; Byström & Järvelin, 1995; Kim & Rieh, 2005; Kim, 2006; Gwizdka & Spence, 2006; Toms et al., 2007; Gwizdka, 2008; Aula, Khan, & Guan, 2010; Liu, Gwizdka, Liu & Belkin, 2010).
Task difficulty has been correlated with multiple aspects of user task performance, such as search effort and efficiency, and with user characteristics, such as cognitive style and ability (e.g., Gwizdka, 2008, Kim, 2006; Aula, Khan & Guan, 2010). The cognitive nature of information interaction makes cognitive effort a significant component of the search effort. Cognitive effort measurement has been mostly post-task assessment. There is some work on direct measures of cognitive effort using physiological measurement by electro-encephalography (EEG), functional near infrared spectroscopy (fNIRS), and by measuring changes in pupil size (e.g., Tungare & Pérez-Quinones, 2009). User studies have generally assessed task difficulty, typically by asking users, only after task performance was completed. Gwizdka (2010) showed that measuring effort averaged over entire task sessions can mask important fluctuations in the cognitive demand on the user during performance of the task. In this paper we propose and validate new and objective cognitive effort measures based on the effort required to acquire information by reading. The measures are derived from eye-movement patterns and allow for dynamic and direct assessment of this important aspect of the user's cognitive effort during information search. Eye-movement-derived measures are well-suited to representing information search processes. In (textual) interactive information retrieval (IIR), information acquisition is mediated by eye movement patterns in service of the reading process. Eye movements are known to be cognitively-controlled and research into the reading process has identified several observable indicators of cognitive effort associated with reading eye movements. This paper extends previous work relating eye movement patterns to task and page types (Cole, et al., 2010; 2011) by measuring these cognitive effort features in units of reading, i.e. 
sequences of eye fixations, that we identify from eye tracking logs of user studies. Our cognitive effort measures are closely associated with semantic processing. This work provides a foundation for research to investigate the use of eye movement patterns to infer user domain knowledge, task properties, task progress, and document usefulness. In principle, these calculations could be made while the person is engaged in their task, allowing an information system to adapt dynamically to the user and improve its effectiveness.

RELATED WORK

Task difficulty
Task difficulty has attracted much research attention and has been found to be a significant factor influencing users' search behaviors and performance. In difficult tasks, users are more likely to visit more web pages (Kim, 2006; Gwizdka & Spence, 2006), issue more queries (Kim, 2006; Aula, Khan, & Guan, 2010), and spend more time on search engine result pages (SERPs) (Aula, Khan, & Guan, 2010). Factors that correlate with subjective task difficulty have also been examined. For example, Gwizdka (2008) showed that the number of result pages and individual documents visited, the number of documents marked as relevant, and individual cognitive differences correlated well with subjective task difficulty. Liu, Gwizdka, Liu & Belkin (2010) reported how behavioral predictors of task difficulty vary across task types. They showed that task type affects the relationships between task difficulty and user behaviors and so task difficulty predictions should take account of task type. Gwizdka (2010) introduced a dual task methodology to measure cognitive loading during information search sessions. He showed that averaging cognitive load measurements over entire task sessions can mask important fluctuations in the cognitive demand on the user during task performance.
Consequently, it is important to measure cognitive effort dynamically during search tasks to discover points of peak cognitive loading and detect changes in cognitive loading by task phase. Measurement of the user cognitive load can identify when the user may benefit from specific types of system support, for example, by changes in the user interface, or by making available resources to externalize memory.

Task facets and task difficulty
As pointed out in the introduction to this paper, many have commented on, and/or studied the relationship between the tasks that lead people to engage in information seeking behavior, or their actual information retrieval tasks, and the difficulties experienced in accomplishing such tasks. But few have characterized these tasks in any systematic way; most such studies have instead characterized tasks a priori according to arbitrarily defined degrees of difficulty (e.g. Aula, Khan & Guan, 2010), or considered just one aspect of task, such as complexity with respect to behavior (e.g. Byström & Järvelin, 1995), or identified task types with respect to empirical data but without reference to any underlying logic (e.g. White & Kelly, 2006). Li (2008; Li & Belkin, 2008) has proposed a faceted classification of task types, applied to both the tasks that lead people to engage in information seeking behavior, and the resulting information searching tasks. Her work has studied the relationship between the task facets and searchers' behaviors, including searchers' perceptions of task difficulty. Because Li's schema allows systematic variation of aspects of a task, we base the work reported here on that scheme, extended by us to some extent. As discussed in Liu, et al., 2010, several of these facets have obvious implications for task difficulty, including task Product, task Complexity, task Goal, and whether the product is Named.
The values of Product are factual, intellectual, and physical; the prediction with respect to difficulty is that factual will be less difficult than intellectual. The values of Complexity in Li's scheme are characterized according to the number of different activities implied by a task; the prediction with respect to difficulty is that less complex tasks will be less difficult than more complex tasks. The values of Goal are specific, amorphous, and mixed; the prediction with respect to difficulty is that tasks with specific goals will be less difficult than tasks with either amorphous or mixed (amorphous and specific) goals, with amorphous tasks being the most difficult. The prediction with respect to Named is that tasks in which the desired product is actually named will be less difficult than tasks in which the product is not named.

Eye movements and reading
Cognitive effort in reading is associated with the processes of accessing word meanings, sentence and phrase parsing, and text comprehension. As such, it involves basic user characteristics, such as domain knowledge. Research has connected these aspects of processing effort with properties of reading eye movements. Eye movement patterns are cognitively controlled and reading patterns have long been studied (Rayner, 1998). A number of results relate eye movements and properties of fixations to semantic and cognitive processing states. Models of the reading process have been developed that explain observed fixation duration and word skipping behaviors (Reichle et al., 2004).

The E-Z Reader reading model
The E-Z Reader model is a cognitively-controlled, serial-attention model of reading eye movements (Reichle et al., 2006). It takes word identification, visual processing, attention, and control of the oculomotor system as joint determinants of eye movement in the reading process.
It is a processing model based on the assumption that reading is a cognitively-controlled process where the saccade (i.e., a very fast movement of the eyes during which they do not acquire any visual information) to the next word is programmed while the person is cognitively processing the text available in the currently attended fixation. The saccade programming has a labile stage. If the next word is recognized during this stage, the programmed saccade is canceled and a saccade to the following word is programmed instead. There is a distinction between high acuity text in the foveal region and progressively lower acuity text in the parafoveal region. It has been shown that some text information can be extracted from the parafoveal region. Frequently, enough information is acquired to permit planning a saccade target that is several words away from the current fixation. There is evidence for recognition and use of word length (Juhasz et al., 2008), orthographic features, and some semantics such as the predictability of the word in context (Drieghe et al., 2007) and morphological features (Drieghe et al., 2010).

Processing of text during fixations has been shown to take place in several stages. Lexical processing takes at least 113 ms (Reingold and Rayner, 2006). The labile lexical processing period for reprogramming the pending saccade runs from 113 ms to 168 ms. The pending saccade will be executed after the cognitive processing is completed. This explains why observations of eye movements can be connected with the semantics of information processing. Eyes remain fixated during the lexical processing period independently of the stimuli, for example even if the word is removed (Findlay and Gilchrist, 2003). The next saccade takes place only after cognitive processing is completed. It has long been known that the unfamiliarity and conceptual complexity of the text processed are positively correlated with fixation duration (e.g. Rayner and Duffy, 1986).
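The timing windows described above can be expressed as a small classifier. The following Python sketch is illustrative only: the stage labels and function names are hypothetical, treating the 113 ms and 168 ms boundaries as inclusive is an assumption, and the excess is simply measured against the 113 ms lexical minimum.

```python
LEXICAL_MS = 113     # minimum lexical processing time (Reingold and Rayner, 2006)
LABILE_END_MS = 168  # end of the labile period for reprogramming the saccade


def processing_stage(duration_ms: float) -> str:
    """Label a fixation by the processing window its duration reaches.

    The labels are hypothetical names chosen for this sketch.
    """
    if duration_ms < LEXICAL_MS:
        return "sub-lexical"   # too short for lexical processing to complete
    if duration_ms <= LABILE_END_MS:
        return "labile"        # pending saccade could still be reprogrammed
    return "post-labile"       # time beyond the labile period


def duration_excess(duration_ms: float) -> float:
    """Fixation time beyond the lexical minimum, floored at zero."""
    return max(0.0, duration_ms - LEXICAL_MS)
```

Under these assumptions, a 200 ms fixation is labeled post-labile and carries an excess of 87 ms over the lexical minimum.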
A limitation of the E-Z Reader model is that it does not account for higher-order cognitive processes, for example those involving language comprehension and conceptual processing. The model provides an explanation of the moment-to-moment reading process when linguistic processing is running smoothly (Reichle et al., 2004). Even though E-Z Reader is a lexical model, there is empirical evidence of relationships between fixation duration excesses and comprehension and anaphor resolution (Rayner et al., 2006). Our work is directed at correlating measures of information acquisition effort during information search with high-level search behaviors. For this purpose, it is enough to be able to take the fixation duration and patterns of movement as indicators of effort associated with higher level cognitive processing.

Eye movements and cognitive effort during reading
There are several indications of reading effort that can be gleaned from eye movement patterns. One is the ratio of the amount of text processed to the processing time. If the text is easy to read and it does not require additional reflection, the reading speed will be greater (Rayner & Pollatsek, 1989). Reading speed is affected by word familiarity (Williams & Morris, 2004), by words used in less frequently encountered senses (Sereno, O'Donnell, & Rayner, 2006), and by the need for additional reflection to comprehend the concepts involved (Morris, 1994). Sentence parsing can also impact reading speed. Reading speed is associated with several components in the description of reading eye movements. Fixation duration is an indicator of the cognitive processing required to establish the meaning of the word, and the meaning of the word in context. Rayner et al. (2006) show that conceptual processing is reflected to some degree in eye movements. Text passages of greater conceptual difficulty resulted in more fixations and slightly longer mean fixation duration. Fixation spacing is associated with cognitive processing constraints.
Perceptual span is the amount of text one takes in at a time and studies of reading in different orthographic systems show it is associated with basic human cognitive processing properties. It has been demonstrated that fixation spacing while reading Chinese is about three characters (Inhoff and Liu, 1998; Tsai and McConkie, 1995). Studies of non-logographic systems showed differences between phoneme-encoded systems, such as Hebrew, and character-based systems, e.g. English and Dutch. Pollatsek et al. (1986) showed that bilingual Hebrew and Dutch speakers shifted their perceptual span when switching between reading the two languages. When reading Hebrew the perceptual span was about 5 characters. In Dutch it was closer to 14. What is striking in these results is that across the orthographic systems, approximately the same number of concepts can be expressed in the different perceptual spans observed. The LATER model of neural-decision making for saccade programming (Carpenter and McDonald 2007) explains the relationship between increasing fixation duration, presumed to be due to semantic processing requirements, and the location of the next fixation, as a function of competing cognitive mechanisms. Word familiarity affects the rate at which the mechanism reaches the next saccade decision threshold, but fixation proximity to the previous fixation increases the probability of executing the programmed saccade to the next word. The result is that when the fixation spacing is short, processing of even familiar words will tend not to increase the average fixation spacing. So decreasing perceptual span is expected to correlate with unfamiliar words and conceptually difficult passages. The sequence pattern of fixations is also associated with reading effort. Reading sequences include cases where the next eye fixation will return to a previous point in the text passage. 
The number of regressions in a reading sequence, and the fixation durations of the regression fixations, have been associated with the difficulty of reading passages, resolution of ambiguous (sense) words, conceptual complexity of text, parsing difficulties and the reading goal (Rayner & Pollatsek, 1989; Rayner, 2006). Regression is a common feature of reading eye movement sequences and can have an incidence of 10–15% of the total fixations in the reading sequence (Boland, 2004).

Summary of related work
The cognitive nature of information interaction connects cognitive effort measurement with task difficulty assessment. It has been demonstrated that task difficulty is affected by the nature of the task. To date, there has been little work on direct measurement of cognitive effort in search. The existing work has also focused on measures that can be calculated only for the entire task. It has been shown (Gwizdka, 2010) that averaging such measures over the entire task has serious deficiencies. Research into the psychophysics of the oculomotor reading process has identified a number of observable features that indicate cognitive processing states. Observable eye fixations can be associated with semantic and cognitive processing during reading in ways that relate directly to cognitive effort. Word familiarity, sense disambiguation in context, and the conceptual difficulty of a text passage can all be related to eye movement features, specifically the duration of eye fixations, their spacing, and sequence patterns. In our previous work, we investigated relationships between the user's task type and transitions in reading strategies from scanning to extended reading (Cole, et al., 2010), and differences in the influence of page types (search results pages vs. content documents) on text acquisition and page processing when different tasks are being executed (Cole, et al., 2011).
This paper extends the previous work by investigating eye movement-based measures related explicitly to cognitive effort during the reading process, and showing correlations with task characteristics associated with difficulty and with users' assessments of overall task difficulty.

RESEARCH OBJECTIVES
The objective of this paper is to present eye-movement based measures as a way of assessing the cognitive effort involved in information acquisition from web pages. The new measures include reading speed, number of eye fixation regressions, perceptual span, and fixation duration excess. Our goal is to examine relationships between these new measures and subjective assessment of task difficulty as well as search task types. We also examine differences between the new measures of cognitive effort on web pages of different types, such as on search engine result pages (SERPs) and content pages.

METHODOLOGY

Experimental Procedure
Our user study investigated behaviors associated with different task types for 32 undergraduate journalism students carrying out realistic professional journalism tasks. Each participant was given a tutorial and performed four tasks involving web search (described below). Participants were asked to continue searching until they had gathered enough information to accomplish the task or 20 minutes had elapsed. During the search, all of the participants' interactions with the computer system, including eye gaze, were logged.

Tasks
Our study concerned the work domain of journalism because it can be associated with any topic, yet has a small number of task types. A set of four tasks was identified by interviewing journalism faculty and practicing journalists. The tasks were designed to vary according to values of the characteristics which we believed could affect search behavior (Li, 2009).
After a training task, participants completed four tasks in counterbalanced order: advance obituary (OBI), interview preparation (INT), copy editing (CPE), and background information collection (BIC). The tasks varied in several dimensions: complexity, defined as the number of necessary steps needed to achieve the task goal (for example, identifying an expert and then finding their contact information), the task product (factual vs. intellectual, e.g., fact checking vs. production of a document), the information object (a complete document vs. a document segment), whether the search target is specifically identified (Named), and the nature of the task goal (specific vs. amorphous). In Table 1, one can see the advance obituary and the copy editing tasks have the least similarity.

Table 1. Task characteristics
Task  Product  Named    Level     Goal  Complexity
OBI   F        Unnamed  Document  A     High
CPE   F        Named    Segment   S     Low
INT   F, I     Unnamed  Document  A, S  Low
BIC   F, I     Unnamed  Document  S     High
F = Factual, I = Intellectual, A = Amorphous, S = Specific

The tasks were:

Background Information Collection (BIC)
Your assignment: You are a journalist at the New York Times, working with several others on a story about "whether and how changes in US visa laws after 9/11 have reduced enrollment of international students at universities in the US". You are supposed to gather background information on the topic, specifically, to find what has already been written on this topic.
Your Task: Please find and save all the stories and related materials that have already been published in the last two years in the New York Times on this topic, and also in five other important newspapers, either US or foreign.
The BIC task is a Mixed Product because identifying "important" newspapers is intellectual, but finding topical documents is factual. It is Document Level because whole stories are judged.
It has the Specific Goal of finding documents on a well-defined topic, but is Unnamed because the search targets are not specifically identified (compare with CPE). It has High Complexity because of the number of sources to be consulted and the activities that need to be done.

Interview Preparation (INT)
Your assignment: Your assignment editor asks you to write a news story about "whether state budget cuts in New Jersey are affecting financial aid for college and university students".
Your Task: Please find the names of two people with appropriate expertise that you are going to interview for this story and save just the pages or sources that describe their expertise and how to contact them.
INT is a Mixed Product, because defining expertise is intellectual, and contact information is a fact. It is at the Document Level, because expertise is determined by a whole page. The Goal Quality is Mixed, because determining expertise is amorphous but contact information is specific. It is Unnamed because the search targets are not specifically identified in the task. It has Low Complexity because only two people need to be found.

Advance Obituary (OBI)
Your assignment: Many newspapers commonly write obituaries of important people years in advance, before they die, and in this assignment, you are asked to write an advance obituary for a famous person.
Your Task: Please collect and save all the information you will need to write an advance obituary of the artist Trevor Malcolm Weeks.
OBI is a Factual Product, because facts about the person are needed. It is at the Document Level because entire documents need to be examined. It is Unnamed because the search targets are not specifically identified in the task. The Goal Quality is Amorphous because "all the information" is undefined. It has High Complexity because many facts need to be found.

Copy Editing (CPE)
Your assignment: You are a copy editor at a newspaper and you have only 20 minutes to check the accuracy of the three underlined statements in the excerpt of a piece of news story below.
"New South Korean President Lee Myung-bak takes office Lee Myung-bak is the 10th man to serve as South Korea's president and the first to come from a business background. He won a landslide victory in last December's election. He pledged to make the economy his top priority during the campaign. Lee promised to achieve 7% annual economic growth, double the country's per capita income to US$4,000 over a decade and lift the country to one of the top seven economies in the world. Lee, 66, also called for a stronger alliance with top ally Washington and implored North Korea to forgo its nuclear ambitions and open up to the outside world, promising a better future for the impoverished nation. Lee said he would launch massive investment and aid projects in the North to increase its per capita income to US$3,000 within a decade "once North Korea abandons its nuclear program and chooses the path to openness."
Your Task: Please find and save an authoritative page that either confirms or disconfirms each statement.
CPE is a Factual Product, because facts have to be identified. It is at the Segment Level, because items within a document need to be found. It has the Specific Goal of confirming facts, but is Low Complexity because only three facts need to be confirmed. It is Named because the search targets are specified.

Data Collection
We used a multi-source logging system (Bierig et al., 2009). Eye data was collected with a Tobii T-60 eye-tracker (1280×1024 @ 60 Hz). We used eye fixation data as calculated by the Tobii Studio software. Logging issues for two participants prevented data analysis for all four tasks. In the following we report on results for 30 of the 32 participants.

Reading Models
In most previous eye tracking work in information search settings, reports of reading behavior have been based on analysis of eye gaze position aggregates ('hot spots'), without distinguishing the fixation sub-sequences that comprise true reading behavior. The eye fixation analysis in our work is based on our implementation of the E-Z Reader reading model (Reichle et al., 2006). The inputs to our algorithm are successive fixation locations and their durations. The output is a classification of the sequences of fixations as members of a reading sequence, or as isolated lexical fixations, which we call 'scanning' fixations. Reading sequences and scanning instances are restricted to lexical fixations, that is, fixations that exceed the lexical processing threshold of 113 ms. The foveal (in focus) region is operationalized by taking the Tobii default of 35 pixels for projection of the foveal radius on the display. The display projection of the parafoveal annulus was taken to extend from 35 pixels to 120 pixels. As suggested by the E-Z Reader model, we distinguish between the left hand side of the parafoveal region and the right hand side to model the left to right reading pattern of the English text presented in the studies. Our algorithm should generalize to many languages with similar orthographic encoding, and with suitable changes in parameters, may also be usable for logogram or phonetically-encoded languages. The algorithm represents the reading process as sequences of fixations where the succeeding fixation is within the region looked at previously. This region may be stepwise extended to the right, provided the new fixation is within the right-hand-side parafoveal region. Fixation regressions to any point in the previously fixated text are allowed. It is possible to extend the text region on the left hand side by regression to the beginning of the reading area.
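As a rough illustration of the classification just described, the following Python sketch groups lexical fixations into reading sequences and isolated scanning fixations. It is a simplified sketch under stated assumptions, not the implementation used in the study: the `Fixation` type, the function names, the use of the foveal radius as a vertical line tolerance, the treatment of 113 ms as an inclusive threshold, and the choice to skip sub-lexical fixations are all hypothetical. Only the 113 ms, 35 pixel, and 120 pixel values come from the text.

```python
from dataclasses import dataclass
from typing import List

LEXICAL_MS = 113    # lexical processing threshold from the text
FOVEA_PX = 35       # foveal radius projected on the display
PARAFOVEA_PX = 120  # outer edge of the parafoveal annulus


@dataclass
class Fixation:
    x: float            # horizontal screen position in pixels
    y: float            # vertical screen position in pixels
    duration_ms: float


def segment(fixations: List[Fixation]) -> List[List[Fixation]]:
    """Group lexical fixations into units.

    A unit with two or more fixations is treated as a reading sequence;
    a unit with a single fixation is a 'scanning' fixation.
    """
    units: List[List[Fixation]] = []
    current: List[Fixation] = []
    left = right = line_y = 0.0
    for f in fixations:
        if f.duration_ms < LEXICAL_MS:
            continue  # assumption: sub-lexical fixations are simply skipped
        if current:
            # Assumed vertical tolerance: one foveal radius around the line.
            same_line = abs(f.y - line_y) <= FOVEA_PX
            # Refixation or regression into text already swept out.
            in_swept = left - FOVEA_PX <= f.x <= right + FOVEA_PX
            # Step to the right within the right-hand parafoveal region.
            extends = right < f.x <= right + PARAFOVEA_PX
            if same_line and (in_swept or extends):
                current.append(f)
                left, right = min(left, f.x), max(right, f.x)
                continue
            units.append(current)  # the sequence ends here
        current = [f]
        left = right = f.x
        line_y = f.y
    if current:
        units.append(current)
    return units


def count_regressions(unit: List[Fixation]) -> int:
    """Count fixations that land to the left of the preceding fixation."""
    return sum(1 for a, b in zip(unit, unit[1:]) if b.x < a.x)
```

In this sketch the swept region is tracked only along the horizontal axis of a single line, which keeps the end-of-sequence rule simple: a fixation that neither falls in the swept text nor extends it to the right starts a new unit.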
Research has shown that although the center of the fixation may be above or below the text centerline, people read just the line of text and not words above or below (Hornof & Halverson, 2002). A reading sequence ends when a fixation is not in the right-hand-side parafoveal region and is not a regression to a point in the line of text already swept out in the reading instance. The algorithm was used to distinguish reading fixation sequences from scanning fixations. Notice that reading behavior that might typically be described as 'scanning', that is a disciplined processing of text by skipping some of the text and reading phrases, is modeled as a series of short reading sequences by our model. After applying the reading model to the eye movement data, a user task session is analyzed as a contiguous sequence of these reading units. This representation of the task session provides a description of an important portion of the information acquired by the user. We operationalize several cognitive/semantic effort indicators: fixation duration in excess of the minimum required for lexical processing, the existence and number of regression fixations in the reading sequence, and the spacing of fixations in the reading sequence. We also define reading speed as the length of text acquired per unit time. These measures are commonly used in reading studies, although they are not typically construed as cognitive effort, per se. Our interests are somewhat different than the typical goals of research into reading. For one, at least at this st