This book provides modern technical answers to the legal requirements of
pseudonymisation as recommended by privacy legislation. It covers topics
such as modern regulatory frameworks for sharing and linking sensitive
information, concepts and algorithms for privacy-preserving record
linkage and their computational aspects, practical considerations such
as dealing with dirty and missing data, as well as privacy, risk, and
performance assessment measures. Existing techniques for
privacy-preserving record linkage are evaluated empirically, and
real-world application examples that scale to population sizes are
described. The book also includes pointers to freely available software
tools, benchmark data sets, and tools to generate synthetic data that
can be used to test and evaluate linkage techniques.
This book consists of fourteen chapters grouped into four parts, and two
appendices. The first part introduces the reader to the topic of linking
sensitive data, the second part covers methods and techniques to link
such data, the third part discusses aspects of practical importance, and
the fourth part provides an outlook on future challenges and open
research problems relevant to linking sensitive databases. The
appendices describe, and provide pointers to, freely available, open-source
software systems that allow the linkage of sensitive data, and provide
further details about the evaluations presented. A companion Web site at
https://dmm.anu.edu.au/lsdbook2020 provides additional material and
Python programs used in the book.
This book is mainly written for applied scientists, researchers, and
advanced practitioners in governments, industry, and universities who
are concerned with developing, implementing, and deploying systems and
tools to share sensitive information in administrative, commercial, or
medical databases.
"The book describes how linkage methods work and how to evaluate their
performance. It covers all the major concepts and methods and also
discusses practical matters such as computational efficiency, which are
critical if the methods are to be used in practice - and it does all
this in a highly accessible way!" David J. Hand, Imperial College,
London