Interested in how an efficient search engine works? Want to know what
algorithms are used to rank resulting documents in response to user
requests? The authors answer these and other key information retrieval
design and implementation questions.
This book is not yet another high-level text. Instead, algorithms are
thoroughly described, making this book ideally suited for both computer
science students and practitioners who work on search-related
applications. As stated in the foreword, this book provides a current,
broad, and detailed overview of the field and is the only one that does
so. Examples are used throughout to illustrate the algorithms.
The authors explain how a query is ranked against a document collection
using either a single retrieval strategy or a combination of strategies,
and how an assortment of utilities is integrated into the query processing
scheme to improve these rankings. Methods for building and compressing
text indexes, querying and retrieving documents in multiple languages,
and using parallel or distributed processing to expedite the search are
likewise described.
This edition is a major expansion of the one published in 1998. Besides
updating the entire book with current techniques, it includes new
sections on language models, cross-language information retrieval,
peer-to-peer processing, XML search, mediators, and duplicate document
detection.