My Synology disk crashed and so did my Docker set up. Basically, the CI/CD pipeline for my programs no longer existed. The wonderful thing of an awful crash like this, is that I could rethink my setup. The result is what I would call “a poor man’s CI/CD”. It’s just Git, Docker, Docker Compose and Cron. It is easy to set up and it might be all you need.
At Wehkamp we use Apache Kafka in our event driven service architecture. It handles high loads of messages really well. We use Apache Spark to run analysis. From time to time, I need to read a Kafka topic into my Databricks notebook. In this article, I’ll show what I use to read from a Kafka topic that has no schema attached to it. We’ll also dive into how we can render the JSON schema in a human-readable format.
Last week I was working on a Databricks script that needed to produce a Slack message as its final outcome. I lifted some code that used a Slack client that was PIP-installed. Unfortunately, I could not use the package on my cluster. Fortunately, the Slack API is so simple, that you don’t really need a package to post a simple message to a channel. In this blog I’ll show you the simplest way of producing awesome messages in Slack.
Today we’ll be looking at sorting and reducing an array of a complex data type. I’m using Databricks to do Spark, but I’m sure the code is compatible. I’ll be using Spark SQL to show the steps. I’ve tried to keep the data as simple as possible. The example should apply to scenarios that are more complex.