Text analytics is a field that lies at the interface of information
retrieval, machine learning, and natural language processing, and this
textbook carefully covers a coherently organized framework drawn from
these intersecting topics. The chapters of this textbook are organized
into three categories:
- Basic algorithms: Chapters 1 through 7 discuss the classical
algorithms for machine learning from text such as preprocessing,
similarity computation, topic modeling, matrix factorization,
clustering, classification, regression, and ensemble analysis.
- Domain-sensitive mining: Chapters 8 and 9 discuss methods for learning
from text in combination with other domains such as multimedia and the
Web. Information retrieval and Web search are also discussed in the
context of their relationship with ranking and machine learning methods.
- Sequence-centric mining: Chapters 10 through 14 discuss various
sequence-centric and natural language applications, such as feature
engineering, neural language models, deep learning, text summarization,
information extraction, opinion mining, text segmentation, and event
detection.
This textbook covers machine learning topics for text in detail. Since
the coverage is extensive, multiple courses can be offered from the same
book, depending on course level. Even though the presentation is
text-centric, Chapters 3 to 7 cover machine learning algorithms that are
often used in domains beyond text data. Therefore, the book can be used
to offer courses not just in text analytics but also from the broader
perspective of machine learning (with text as a backdrop).
This textbook targets graduate students in computer science, as well as
researchers, professors, and industrial practitioners working in these
related fields. This textbook is accompanied by a solution manual for
classroom teaching.