Subhadip Chanda and Harsha Pasala are experts in real-time data engineering, specializing in scalable Spark and Databricks streaming architectures. Combining deep production experience with practical design insight, they guide readers beyond prototypes to build resilient, low-latency, and future-ready analytics pipelines that operate reliably at enterprise scale.
About the Authors
Subhadip Chanda is a Solutions Architect at Databricks, based in Canada. Over the past several years, he has worked across data platform engineering, real-time analytics, and data governance, helping organizations design and operationalize systems built on Spark, Delta Lake, and Unity Catalog. Before Databricks, Subhadip spent time in solution architecture roles that gave him a practitioner's perspective on what works in production, and what only works in slide decks.
The book started with a gap Subhadip kept running into: although Apache Spark is the dominant engine for large-scale data processing, practitioners still lacked a comprehensive, end-to-end guide to building production-grade streaming systems. Each chapter reflects problems he has encountered, debugged, and solved in real environments. Subhadip wrote this book so that the next engineer would not have to piece the answers together from scattered documentation and Stack Overflow threads.
Harsha Pasala is a Specialist Solutions Architect at Databricks with over a decade of experience in data. He works with some of Canada’s largest organizations, including major banks, healthcare providers, and national railways, to solve the high-stakes challenges of moving data at scale.
His work focuses on the practical side of data engineering: fixing underperforming streaming pipelines, optimizing data layouts to reduce cloud costs, and ensuring that low-latency requirements hold up under pressure. This book reflects the years Harsha has spent in design reviews and technical deep dives alongside his colleagues at Databricks. It is designed as a pragmatic guide for engineers who need Spark Streaming to work reliably in the real world.