A hands-on approach to tasks and techniques in data stream mining and
real-time analytics, with examples in MOA, a popular freely available
open-source software framework.
Today many information sources, including sensor networks, financial
markets, social networks, and healthcare monitoring, are so-called data
streams, arriving sequentially and at high speed. Analysis must take
place in real time, with partial data and without the capacity to store
the entire data set. This book presents algorithms and techniques used
in data stream mining and real-time analytics. Taking a hands-on
approach, the book demonstrates the techniques using MOA (Massive Online
Analysis), so that readers can try them out after reading the
explanations.
The book first offers a brief introduction to the topic, covering big
data mining, basic methodologies for mining data streams, and a simple
example of MOA. More detailed discussions follow, with chapters on
sketching techniques, change, classification, ensemble methods,
regression, clustering, and frequent pattern mining. Most of these
chapters include exercises, an MOA-based lab session, or both. Finally,
the book discusses the MOA software, covering the MOA graphical user
interface, the command line, use of its API, and the development of new
methods within MOA. The book will be an essential reference for readers
who want to use data stream mining as a tool, researchers in innovation
or data stream mining, and programmers who want to create new algorithms
for MOA.