One consequence of the pervasive use of computers is that most documents
originate in digital form. Widespread use of the Internet makes them
readily available. Text mining - the process of analyzing unstructured
natural-language text - is concerned with how to extract information
from these documents. Developed from the authors' highly successful
Springer reference on text mining, Fundamentals of Predictive Text
Mining is an introductory textbook and guide to this rapidly evolving
field. Integrating topics spanning the varied disciplines of data
mining, machine learning, databases, and computational linguistics, this
uniquely useful book also provides practical advice for text mining.
In-depth discussions are presented on issues of document classification,
information retrieval, clustering and organizing documents, information
extraction, web-based data-sourcing, and prediction and evaluation.
Background on data mining is beneficial, but not essential. Where
advanced concepts require mathematical maturity for a proper
understanding, intuitive explanations are also provided for less
advanced readers.

Topics and features: presents a comprehensive,
practical and easy-to-read introduction to text mining; includes chapter
summaries, useful historical and bibliographic remarks, and
classroom-tested exercises for each chapter; explores the application
and utility of each method, as well as the optimum techniques for
specific scenarios; provides several descriptive case studies that take
readers from problem description to systems deployment in the real
world; includes access to industrial-strength text-mining software that
runs on any computer; describes methods that rely on basic statistical
techniques, thus allowing for relevance to all languages (not just
English); contains links to free downloadable software and other
supplementary instructional material.

Fundamentals of Predictive Text
Mining is an essential resource for IT professionals and managers, as
well as a key text for advanced undergraduate computer science students
and beginning graduate students.

Dr. Sholom M. Weiss is a Research Staff
Member with the IBM Predictive Modeling group, in Yorktown Heights, New
York, and Professor Emeritus of Computer Science at Rutgers University.
Dr. Nitin Indurkhya is Professor at the School of Computer Science and
Engineering, University of New South Wales, Australia, as well as
founder and president of data-mining consulting company Data-Miner Pty
Ltd. Dr. Tong Zhang is Associate Professor at the Department of
Statistics and Biostatistics at Rutgers University, New Jersey.