Data Science: A First Introduction focuses on using the R
programming language in Jupyter notebooks to perform data manipulation
and cleaning, create effective visualizations, and extract insights from
data using classification, regression, clustering, and inference.
The text emphasizes workflows that are clear, reproducible, and
shareable, and covers the basics of version control. All source code
is available online and demonstrates good, reproducible project
workflows.
Based on educational research and active learning principles, the book
uses a modern approach to R and includes accompanying autograded Jupyter
worksheets for interactive, self-directed learning. The book will leave
readers well prepared for data science projects.
The book is designed for learners from all disciplines with minimal
prior knowledge of mathematics and programming. The authors have honed
the material through years of experience teaching thousands of
undergraduates in the University of British Columbia's DSCI100:
Introduction to Data Science course.