Mitch Seymour - Mastering Kafka Streams and ksqlDB (2020)

GuDron · Aug 9, 2022

Mastering Kafka Streams and ksqlDB
Author: Mitch Seymour (2020)

For data engineers and data scientists, there’s never a shortage of technologies that are competing for our attention. Whether we’re perusing our favorite subreddits, scanning Hacker News, reading tech blogs, or weaving through hundreds of tables at a tech conference, there are so many things to look at that it can start to feel overwhelming. But if we can find a quiet corner to just think for a minute, and let all of the buzz fade into the background, we can start to distinguish patterns from the noise.

You see, we live in the age of explosive data growth, and many of these technologies were created to help us store and process data at scale. We’re told that these are modern solutions for modern problems, and we sit around discussing “big data” as if the idea is avant-garde, when really the focus on data volume is only half the story. Technologies that only solve for the data volume problem tend to have batch-oriented techniques for processing data. This involves running a job on some pile of data that has accumulated for a period of time. In some ways, this is like trying to drink the ocean all at once. With modern computing power and paradigms, some technologies actually manage to achieve this, though usually at the expense of high latency.

Instead, there’s another property of modern data that we focus on in this book: data moves over networks in steady and never-ending streams. The technologies we cover in this book, Kafka Streams and ksqlDB, are specifically designed to process these continuous data streams in real time, and provide huge competitive advantages over the ocean-drinking variety. After all, many business problems are time-sensitive, and if you need to enrich, transform, or react to data as soon as it comes in, then Kafka Streams and ksqlDB will help get you there with ease and efficiency.

Learning Kafka Streams and ksqlDB is also a great way to familiarize yourself with the larger concepts involved in stream processing. This includes modeling data in different ways (streams and tables), applying stateless transformations of data, using local state for more advanced operations (joins, aggregations), understanding the different time semantics and methods for grouping data into time buckets/windows, and more. In other words, your knowledge of Kafka Streams and ksqlDB will help you distinguish and evaluate different stream processing solutions that currently exist and may come into existence sometime in the future.

I’m excited to share these technologies with you because they have both made an impact on my own career and helped me accomplish technological feats that I thought were beyond my own capabilities. In fact, by the time you finish reading this sentence, one of my Kafka Streams applications will have processed nine million events. The feeling you’ll get by providing real business value without having to invest exorbitant amounts of time on the solution will keep you working with these technologies for years to come, and the succinct and expressive language constructs make the process feel more like an art form than a labor. And just like any other art form, whether it be a life-changing song or a beautiful painting, it’s human nature to want to share it. So consider this book a mixtape from me to you, with my favorite compilations from the stream processing space available for your enjoyment: Kafka Streams and ksqlDB, Volume 1.
EPUB

PDF
