


Entropy-based Concept Shift Detection

Peter Vorburger, Abraham Bernstein
University of Zurich, Department of Informatics
Binzmühlestrasse 14, 8050 Zurich, Switzerland
{vorburger, bernstein}@ifi.unizh.ch

Abstract

When monitoring sensory data (e.g., from a wearable device) the context oftentimes changes abruptly: people move from one situation (e.g., working quietly in their office) to another (e.g., being interrupted by one's manager). These context changes can be treated like concept shifts, since the underlying data generator (the concept) changes while moving from one context situation to another. We present an entropy-based measure for data streams that is suitable to detect concept shifts in a reliable, noise-resistant, fast, and computationally efficient way. We assess the entropy measure under different concept shift conditions. To support our claims we illustrate the concept shift behavior of the stream entropy. We also present a simple algorithm control approach to show how useful and reliable the information obtained by the entropy measure is compared to an ensemble learner as well as an experimentally inferred upper limit. Our analysis is based on three large synthetic data sets representing real, virtual, and a combination of both concept drifts under different noise conditions (up to 50%). Last but not least, we demonstrate the usefulness of the entropy-based measure as a context switch indicator in a real-world application in the context-awareness/wearable computing domain.

1 Introduction

In real-world applications the mining of data streams, rather than time-independent data, is increasingly important. In many applications data (e.g., from the financial industry, sensor data, multimedia content) is gathered over time, which raises the problem that the concepts to be learned may drift (i.e., change) over time [5]. Also, the increasing amount of data (e.g., multimedia content, data warehouses) and the limitation of computing power due to miniaturization (e.g., wearable computing) call for faster and more resource-friendly algorithms. The motivation for this paper is a real-world problem that exemplifies the problems mentioned above: the analysis of sensor data on wearable devices. In our research on context-awareness [1], where we learned classifiers predicting people's anticipated behavior based on sensory input, we found that contexts (or contextual situations) switch rather than gradually change. We also found that contextual information could be reused, even for new, not yet encountered situations. Therefore, an ongoing monitoring of the sensor stream is needed. An online pattern matching mechanism comparing the sensor stream to the entire library of already known contexts is, however, computationally complex and not yet suitable for today's wearable devices. One solution is to indicate possible candidates (or hot spots) for context changes, limiting the computationally intensive context (re-)determination to those candidates. Thus, a computationally "cheap" technique to find such context-switch candidates would be very helpful.

From the machine learning point of view the context generating the sensor data can be viewed as the underlying concept generating the data stream, and the context switches can be viewed as "abrupt concept drifts", also referred to as concept shifts. This paper introduces an entropy-based measure to detect concept shifts. In the following we will show that this measure is very sensitive to concept shifts while remaining noise-tolerant. Additionally, it can distinguish between different shift intensities. In order to be able to assess this measure, we introduce a coarse concept shift adapting algorithm, which we show to (1) provide mostly a better prediction quality than conventional approaches, (2) require limited computational power, (3) exhibit quick reaction time, and (4) show good performance under noisy conditions. After the assessment of the algorithm on synthetic data sets we apply our approach to sensor data obtained by a context-aware wearable computing setup [1], where the entropy measure clearly indicates context switches on the basis of audio and accelerometer recordings.

The next section provides a short review of related work and relates our contributions to other projects in the field.
