Two very rewarding and interesting episodes, thanks for that!

I could not resist thinking about Apache Hadoop and Apache Spark while listening to these episodes. I might now be asking you for really stupid explanations, but please be understanding; I am not a developer, just an architect in the role of property owner/digitalisation strategist.

If I was lucky enough to draw the right conclusions, then Kafka results in visualisation/streaming, while Hadoop and Spark produce answers or solutions (on demand) after processing very large amounts of data located in different places. Right?

If that is the case, then I would like to understand whether those IT ecosystems are synchronised, whether they supplement each other, whether their activities overlap, and how they work together to create value for the same client.

Maybe it could be a topic for episode no. 3 :)

Kind regards,

/Andrea G.


Andrea, thank you! No reason to apologize, of course.

Kafka doesn't really have anything to do with visualization. It's an open source framework that helps move data in real time in a standardized way, solving the challenges of real-time capability at scale. It started out as a message broker, but many people and companies now leverage it as the backbone of an event-driven architecture, enabling decisions to be made at the right moment, by the right people. As such, it's a great fit for smart city architecture, which Kai writes about here:


Nicolas Waern talking about event streaming and Kafka at a blockchain conference in Mallorca, Spain, in 2019:


Heavily inspired by the seminal talk from Neha Narkhede, former CTO of Confluent (where Kai works), here:


Example material from Kai about Kafka and its use cases:


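To make the event-driven idea above a bit more concrete: at its core, Kafka keeps an append-only log of events per topic, and each consumer reads that log independently at its own pace, tracking its own offset. The following is only a toy in-memory sketch of that abstraction in Python (it is not the Kafka API, and the topic name and sensor readings are made up for illustration):

```python
from collections import defaultdict

class ToyEventLog:
    """A minimal in-memory imitation of Kafka's core idea: an
    append-only log per topic, which multiple consumers can read
    independently, each tracking its own offset."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> append-only list of events
        self._offsets = defaultdict(int)   # (topic, consumer) -> next offset to read

    def produce(self, topic, event):
        # Producers only append; events are never modified in place.
        self._topics[topic].append(event)

    def consume(self, topic, consumer):
        """Return every event this consumer has not yet seen,
        then advance that consumer's offset."""
        offset = self._offsets[(topic, consumer)]
        events = self._topics[topic][offset:]
        self._offsets[(topic, consumer)] = len(self._topics[topic])
        return events

log = ToyEventLog()
log.produce("building-sensors", {"room": 101, "temp_c": 21.5})
log.produce("building-sensors", {"room": 102, "temp_c": 23.0})

# Two independent consumers read the same stream at their own pace —
# this decoupling is what enables "decisions at the right moment".
dashboard = log.consume("building-sensors", "dashboard")
alerting = log.consume("building-sensors", "alerting")
```

The key design point this sketch tries to show is that the producer never knows (or cares) who consumes the events, which is what makes it easy to add new consumers, a new dashboard, an alerting service, an archiver, without touching existing systems.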
It might be more important to think about the problems Kafka can solve, though understanding the foundation the future will be built on is equally important, of course. Hadoop is mainly meant for storage, while Spark is:

"Apache Spark is an open-source, distributed processing system used for big data workloads. It utilizes in-memory caching and optimized query execution for fast queries against data of any size. Simply put, Spark is a fast and general engine for large-scale data processing."

As for the suggested episode 3 covering these things from a more architectural angle: that could definitely be interesting!

In your line of work, where do you see real-time capabilities mattering most in real estate?
