Archive of 2019

Kafka, Spark and schema inference

At Wehkamp we use Apache Kafka in our event driven service architecture. It handles high loads of messages really well. We use Apache Spark to run analysis. From time to time, I need to read a Kafka topic into my Databricks notebook. In this article, I’ll show what I use to read from a Kafka topic that has no schema attached to it. We’ll also dive into how we can render the JSON schema in a human-readable format.

Read the article Kafka, Spark and schema inference

Simple Python code to send messages to a Slack channel (without packages)

Last week I was working on a Databricks script that needed to produce a Slack message as its final outcome. I lifted some code that used a Slack client that was PIP-installed. Unfortunately, I could not use the package on my cluster. Fortunately, the Slack API is so simple, that you don’t really need a package to post a simple message to a channel. In this blog I’ll show you the simplest way of producing awesome messages in Slack.

Read the article Simple Python code to send messages to a Slack channel (without packages)

My Little List of Tools for Prototyping

As a developer I love to prototype to see if an idea works. Thinking big and starting small is actually one of Wehkamp’s principles. And, let’s face it, that’s not easy!

Usually it starts by getting an idea of the core concept that should be validated. Especially when working with teams, communication is key. This list of tools helped me over the years to draw or code out some of these concepts and get a discussion started.

Every tool on this list is free and online.

Read the article My Little List of Tools for Prototyping

AWS Lambda Size: PIL+TF+Keras+Numpy?

At Wehkamp we’ve been using machine learning for a while now. We’re training models in Databricks (Spark) and Keras. This produces a Keras file that we use to make the actual predictions. Training is one thing, but getting them to production is quite another!

The main problem we’ve faced was that it was too big to actually fit into a lambda. This blogs shows how we’ve dealt with that problem.

Read the article AWS Lambda Size: PIL+TF+Keras+Numpy?
expand_less brightness_auto