This textbook, suitable for courses from the early undergraduate to the
graduate level, provides an overview of many basic principles and techniques
needed for modern data analysis. In particular, this book was designed
and written as preparation for students planning to take rigorous
Machine Learning and Data Mining courses. It introduces key conceptual
tools necessary for data analysis, including concentration of measure
and PAC bounds, cross-validation, gradient descent, and principal
component analysis. It also surveys basic techniques in supervised
(regression and classification) and unsupervised learning
(dimensionality reduction and clustering) through an accessible,
simplified presentation. Students should have some background in
calculus, probability, and linear algebra. Some familiarity with
programming and algorithms is useful for understanding the advanced
topics on computational techniques.