Cost-sensitive Classification: Status and Beyond

Hsuan-Tien Lin
Abstract: The rows of the cost matrix represent the actual patient status, and the columns represent the diagnosis made by the doctor. For instance, on any correct diagnosis, society pays no (additional) cost. However, if an H1N1-infected patient is predicted as cold-infected or healthy, the whole of society may suffer a huge cost. On the other hand, if a cold-infected patient is predicted as healthy, society needs to pay some cost, but one less serious than in the previous scenario. These different costs are important for a human doctor when making any diagnosis. For instance, the doctor would be very careful about even the slightest H1N1 symptom to prevent the “1000000”-level misprediction. If we were to build an automatic system, a “computer doctor”, to make the diagnosis, how can the system use the cost information appropriately? Many real-world applications with similar needs can be found in medical decision making, target marketing, and object recognition. Those applications belong to cost-sensitive classification. In fact, cost-sensitive classification can be used to express any finite-choice and bounded-loss machine learning problem [2]. Thus, it has attracted much research attention in the past decade [3], [4], [5], [6], [7], [8], [2], [9], [10], [11], [12], [1], [13].
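One common way a "computer doctor" could use the cost information is to predict the diagnosis with the lowest expected cost rather than the most probable one. The sketch below illustrates that idea only; it is not necessarily the method studied in the paper. The cost values are hypothetical placeholders chosen to match the description above (only the "1000000" level appears in the text), and the function names and the class probabilities are assumptions made for illustration.

```python
# Illustrative sketch: expected-cost-minimizing prediction.
# All cost values are hypothetical except the "1000000" level from the text.
LABELS = ["H1N1", "cold", "healthy"]

# COST[actual][predicted]: rows are the actual status, columns the diagnosis.
COST = [
    [0, 100000, 1000000],  # actual H1N1: any misdiagnosis is very costly
    [100, 0, 3000],        # actual cold: predicted healthy costs something, but less
    [100, 30, 0],          # actual healthy
]


def expected_costs(probs):
    """Expected cost of each possible diagnosis, given P(actual status)."""
    n = len(LABELS)
    return [sum(probs[a] * COST[a][p] for a in range(n)) for p in range(n)]


def cost_sensitive_predict(probs):
    """Pick the diagnosis with the smallest expected cost."""
    costs = expected_costs(probs)
    return LABELS[min(range(len(LABELS)), key=costs.__getitem__)]


def cost_blind_predict(probs):
    """Pick the most probable status, ignoring costs."""
    return LABELS[max(range(len(LABELS)), key=probs.__getitem__)]


# Hypothetical patient: probably healthy, with a slight chance of H1N1.
probs = [0.10, 0.10, 0.80]
print(cost_blind_predict(probs))      # most probable status
print(cost_sensitive_predict(probs))  # lowest expected cost
```

With these assumed numbers, the cost-blind rule diagnoses the patient as healthy, while the cost-sensitive rule diagnoses H1N1, mirroring the careful doctor who reacts to the slightest H1N1 symptom.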