The world of data has been evolving for a while now. First there were designers. Then database administrators. Then CIOs. Then data architects. This book signals the next step in the evolution and maturity of the industry. It is a must read for anyone who takes their profession and career seriously.
— Bill Inmon, Creator of the data warehouse
 
 

Pre-release chapters of Fundamentals of Data Engineering are available now. Website editors may have included affiliate links in this post, at zero additional cost to readers.

Fundamentals of Data Engineering

We’re excited to announce that our upcoming book, Fundamentals of Data Engineering (O’Reilly), is on track for digital publication in July; pre-orders of the print version should be delivered in August. If you have an O’Reilly Learning Platform subscription, you can read the pre-release chapters now, though we’ve made quite a few edits in the final stretch. You can also pre-order it from your favorite bookseller, such as Amazon.

When we set out to write this book, Jess Haberman, our acquisitions editor, repeatedly impressed on us the immensity of the task ahead. Data engineering is a vast and ever-growing practice. Given the challenge in front of us, why write a book on data engineering?

After surveying the landscape of books on data engineering, we realized that we wanted to address a very specific gap: while there were many great technical manuals on data engineering technologies, there was no single book that discussed the overall practice of data engineering. Somehow the literature had overlooked this, and we felt the gap needed to be filled. It’s easy to write a book about tools and much harder to zoom out and describe the field of data engineering in its entirety. Data engineering is a foundational discipline for stitching together technologies and teams to serve data for solving business problems.

What can you expect to learn from Fundamentals of Data Engineering? Here’s the table of contents to give you an idea of the material.

  • Data Engineering Described

  • The Data Engineering Lifecycle (and its undercurrents)

  • Designing Good Data Architecture

  • Choosing Technologies Across the Data Engineering Lifecycle

  • Data Generation in Source Systems

  • Storage

  • Ingestion

  • Queries, Modeling, and Transformation

  • Serving Data for Analytics, Machine Learning, and Reverse ETL

  • Security and Privacy

  • The Future of Data Engineering

  • Appendices: Technical details on Compression/Serialization and Cloud Networking

Each chapter is approachable for beginners and experts alike, balancing broad, pragmatic coverage with deep technical discussion of the material. We look beyond the technical aspects of data engineering and also discuss ways to communicate with stakeholders and what to watch out for along the way.

With Fundamentals of Data Engineering, we believe we have achieved what we set out to accomplish. If you’re a data engineer, or would like to become one, you will benefit greatly from this book.

And don’t just take our word for it.

 
Fundamentals of Data Engineering is a great introduction to the business of moving, processing, and handling data. I’d highly recommend it for anyone wanting to get up to speed in data engineering or analytics, or for existing practitioners who want to fill in any gaps in their understanding.
— Jordan Tigani, Founder & CEO, MotherDuck and Founding engineer & co-creator of Google BigQuery