With the proliferation of huge amounts of (heterogeneous) data on the
Web, the importance of information retrieval (IR) has grown considerably
over the last few years. Big players in the computer industry, such as
Google, Microsoft and Yahoo!, are the primary contributors of technology
for fast access to Web-based information, and search capabilities are
now integrated into most information systems, ranging from business
management software and customer relationship systems to social networks
and mobile phone applications.
Ceri and his co-authors aim to take their readers from the foundations
of modern information retrieval to the most advanced challenges of Web
IR. To this end, their book is divided into three parts. The first part
addresses the principles of IR and provides a systematic and compact
description of basic information retrieval techniques (including binary,
vector space and probabilistic models as well as natural language search
processing) before focusing on its application to the Web. Part two
addresses the foundational aspects of Web IR by discussing the general
architecture of search engines (with a focus on the crawling and
indexing processes), describing link analysis methods (specifically
PageRank and HITS), addressing recommendation and diversification, and
finally presenting advertising in search (the main source of revenue
for search engines). The third and final part describes advanced aspects
of Web search, each chapter providing a self-contained, up-to-date
survey of current Web research directions. Topics in this part include
meta-search and multi-domain search, semantic search, search in the
context of multimedia data, and crowd search.
The book is ideally suited to courses on information retrieval, as it
covers all Web-independent foundational aspects. Its presentation is
self-contained and does not require prior background knowledge. It can
also be used in the context of classic courses on data management,
allowing the instructor to cover both structured and unstructured data
in various formats. Its classroom use is facilitated by a set of slides,
which can be downloaded from www.search-computing.org.