Exploiting Context When Learning to Classify

Peter D. Turney
DOI: https://doi.org/10.48550/arXiv.cs/0212035
2002-12-13
Abstract: This paper addresses the problem of classifying observations when features are context-sensitive, specifically when the testing set involves a context that is different from the training set. The paper begins with a precise definition of the problem, then general strategies are presented for enhancing the performance of classification algorithms on this type of problem. These strategies are tested on two domains. The first domain is the diagnosis of gas turbine engines. The problem is to diagnose a faulty engine in one context, such as warm weather, when the fault has previously been seen only in another context, such as cold weather. The second domain is speech recognition. The problem is to recognize words spoken by a new speaker, not represented in the training set. For both domains, exploiting context results in substantially more accurate classification.
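The abstract does not spell out the strategies, so the following is only an illustrative sketch of one plausible way to exploit context: rescaling each feature relative to the other observations from the same context before classifying. The readings, the thresholds, and the function names are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev

# Hypothetical readings: (context, exhaust_temperature, true_label).
COLD = [("cold", 500, "healthy"), ("cold", 505, "healthy"),
        ("cold", 495, "healthy"), ("cold", 500, "healthy"),
        ("cold", 560, "faulty")]
# Assumption: warm weather shifts every reading upward by the same amount.
WARM = [("warm", temp + 50, label) for _, temp, label in COLD]


def normalize_by_context(rows):
    """Return (z_score, label) pairs, standardizing within each context."""
    by_context = {}
    for context, value, _ in rows:
        by_context.setdefault(context, []).append(value)
    # Fall back to 1.0 if a context has zero spread, to avoid dividing by zero.
    stats = {c: (mean(v), pstdev(v) or 1.0) for c, v in by_context.items()}
    return [((value - stats[c][0]) / stats[c][1], label)
            for c, value, label in rows]


def classify_raw(value, threshold=530):
    """Rule fitted to cold-weather readings only (hypothetical threshold)."""
    return "faulty" if value > threshold else "healthy"


def classify_normalized(z_score, threshold=1.0):
    """Same rule expressed on context-normalized values."""
    return "faulty" if z_score > threshold else "healthy"


if __name__ == "__main__":
    print([classify_raw(v) for _, v, _ in WARM])
    print([classify_normalized(z) for z, _ in normalize_by_context(WARM)])
```

In this toy setup the raw threshold learned in cold weather flags every warm-weather engine as faulty, while the same rule applied to context-normalized values separates the one faulty engine from the healthy ones. Real contexts rarely shift features this cleanly, so the sketch shows the idea rather than a realistic result.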
Machine Learning, Computer Vision and Pattern Recognition