The Fundamentals of Data Engineering is now in early release (Chapters 1 & 2)! Also, why are we writing this book?

As promised, the first two draft chapters of The Fundamentals of Data Engineering are in early release. We’re stoked to have our ideas in the public arena. So far, the feedback and response have been overwhelmingly positive. This gives us a bit of reassurance that we’re not totally full of BS, and that our ideas are resonating with people.

Why are we writing this book? As a friend bluntly asked me, “Why are you writing a book on data engineering, when the field itself is changing in real time?” Matt and I thought long and hard about this question, and the best answer we have is this: we’re writing this book precisely because the field of data engineering is changing so fast.

On a daily basis, we field questions both from experienced data engineers and from people who want to break into the field. Oftentimes, the questions revolve around how to make sense of the field right now. We can’t blame them. There’s no shortage of “how to become a data engineer” articles and courses out there, and the vast majority focus on tools (“Spark for data engineers”), which we think really confuses people. While tools are important, data engineering is much more than tools. We feel that tool-focused approaches lack the bigger-picture context of data engineering. As a result, people who want to get into data engineering, or improve their existing capabilities, jump between technologies, often missing the forest for the trees.

For people asking our advice about how to approach data engineering, we suggest focusing on the things that won’t change, namely the data engineering lifecycle and its undercurrents (covered in Chapter 2). The data engineering lifecycle consists of several key stages: source systems, data ingestion, storage, transformation, and serving. Once you understand the lifecycle, and how to manage it, data engineering becomes much more approachable.

So, instead of creating the umpteenth resource on “Technology X for Data Engineering”, we chose to peel back the layers of data engineering and start from first principles. It’s taken a lot of hard thinking, and we’re far from done. As far as we know, no other book explains the fundamentals of data engineering in a comprehensive manner like this. Though the book is still in rough form, we believe that it will stand the test of time and cut through the noise in a field that is changing at light speed. Enjoy!

Please check out The Fundamentals of Data Engineering. It’s in draft form, so if you find issues with either our prose or ideas, please email book_feedback@ternarydata.com. Thanks!

Joseph Reis