Hashing algorithms scramble data and create pseudo-uniform data
distribu- tions. Bucket algorithms operate on raw untransformed data
which are parti- tioned Into groups according to membership In
equl-slzed d-dlmenslonal hyperrec- tangles, called cells or buckets. The
bucket data structure Is rather sensitive to the distribution of the
data. In these lecture notes, we attempt to explain the connection
between the expected time of various bucket algorithms and the dis-
tribution of the data. The results are Illustrated on standard
searching, sorting and selection problems, as well as on a variety of
problems In computational geometry and operations research. The notes
grew partially from a graduate course on probability theory In computer
science. I wish to thank Elizabeth Van Gulick for her help with the
manuscript, and David Avis, Hanna AYukawa, Vasek Chvatal, Beatrice
Devroye, Hossam EI Glndy, Duncan McCallum, Magda McCallum, Godfrled
Toussaint and Sue Whltesldes"for making the School of Computer Science
at McGill University such an enjoyable place. The work was supported by
NSERC Grant A3456 and by FCAC Grant EQ-1679. INTRODUCTION 1 INTRODUCTION
It Is not a secret that methods based upon the truncation of data have
good expected time performance. For example, for nice distributions of
the data, searching Is often better done via a hashing data structure
Instead of via a search tree. The speed one observes In practice Is due
to the fact that the truncation operation Is a constant time operation.