Anything can be technology. But the dictionary defines it as:
1: a: the practical application of knowledge especially in a particular area; b: a capability given by the practical application of knowledge.
2: a manner of accomplishing a task especially using technical processes, methods, or knowledge.
3: the specialized aspects of a particular field of endeavor.
Recently, I worked on my theme for KeesTalksTech. To gain performance, I need to rely less on plugins. That’s why I needed a simple way to show small lists of posts in my sidebar.
I’ve created two shortcodes: one that shows recent posts, used in the new section, and one that shows specific posts, used in the highlights section.
At Wehkamp we use Redis a lot. It is fast, available and implemented as a managed AWS service called ElastiCache. Sometimes we need to extract data from Redis, and usually I use the redis-cli to interact with it from the command line. But what if you need to get the values of 400k+ keys? What would you do? Is there an effective way to query multiple key/values from Redis?
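One possible way to approach the question above is to fetch many values per round trip with the Redis MGET command instead of issuing one GET per key. The sketch below only builds the command lines in batches; it is not necessarily the approach the post itself takes. The function names and the batch size of 1000 are assumptions for illustration, and the code assumes keys contain no whitespace or quotes.

```python
from typing import Iterable, Iterator, List


def batched(keys: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` keys, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


def mget_commands(keys: Iterable[str], size: int = 1000) -> Iterator[str]:
    """Build one MGET command line per batch of keys.

    Assumption: keys contain no whitespace or quotes, so they can be
    joined with spaces without any escaping.
    """
    for batch in batched(keys, size):
        yield "MGET " + " ".join(batch)


# Example with hypothetical keys and a tiny batch size.
commands = list(mget_commands(["user:1", "user:2", "user:3"], size=2))
```

The resulting lines could be written to a file and fed to redis-cli on standard input. Smaller batches keep each individual command short, which limits how long a single MGET occupies the server.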
This week I needed to query an ElastiCache instance on AWS, which is Amazon’s version of Redis. I could not find a decent free client to query this remote dictionary, so I ended up using redis-cli on Ubuntu. Turns out: Redis is a wonderful and powerful system to work with.
I have no idea how I came to this point, but the yellow colors in my terminal (both cmd and PowerShell) are not bright yellow anymore. So I want to reset my colors back to the old values! Turns out that getting them back is not as straightforward as I had hoped…
This week we’ve been looking at joining two huge tables in Spark into a single table. It turns out that it is not a straightforward exercise to join data based on an array of IDs. In this blog I’ll show one way of doing this.
Our data strategy specifies that we should store data on S3 for further processing. Raw S3 data is not the best way of dealing with data on Spark, though. In this blog I’ll show how you can use Spark Structured Streaming to write JSON records of a Kafka topic into a Delta table.
There is a lot of code that needs to make a selection based on a maximum value. One example is Kafka reads: we only want the latest offset for each key, because that’s the latest record. What is the fastest way of doing this?
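To make the selection concrete, here is a minimal plain-Python sketch of "keep the record with the highest offset per key". It illustrates the logic only and says nothing about which approach is fastest on Spark. The record shape `(key, offset, value)` and the function name are assumptions for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: (key, offset, value).
Record = Tuple[str, int, str]


def latest_per_key(records: Iterable[Record]) -> Dict[str, Record]:
    """Keep, for every key, the record with the highest offset."""
    latest: Dict[str, Record] = {}
    for record in records:
        key, offset, _ = record
        current = latest.get(key)
        # Replace the stored record only when this one is newer.
        if current is None or offset > current[1]:
            latest[key] = record
    return latest


records = [
    ("a", 1, "first"),
    ("b", 2, "only"),
    ("a", 5, "newest"),
    ("a", 3, "middle"),
]
result = latest_per_key(records)
```

A single pass with a dictionary avoids sorting the whole input, and the order in which records arrive does not matter.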
Tired of the dull Python syntax highlighting in Databricks? Just copy this code into your Magic CSS editor, change it (to your own style), pin it & enjoy!
At Wehkamp we use Apache Kafka in our event driven service architecture. It handles high loads of messages really well. We use Apache Spark to run analysis. From time to time, I need to read a Kafka topic into my Databricks notebook. In this article, I’ll show what I use to read from a Kafka topic that has no schema attached to it. We’ll also dive into how we can render the JSON schema in a human-readable format.
Last week I was working on a Databricks script that needed to produce a Slack message as its final outcome. I lifted some code that used a Slack client that was pip-installed. Unfortunately, I could not use the package on my cluster. Fortunately, the Slack API is so simple that you don’t really need a package to post a simple message to a channel. In this blog I’ll show you the simplest way of producing awesome messages in Slack.
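As a rough idea of what posting without a package can look like, the sketch below uses only the Python standard library. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the post itself may use a different endpoint. The function names and the 10-second timeout are assumptions for illustration.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying a simple text message as JSON."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_slack(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code.

    Assumption: `webhook_url` is an incoming webhook URL created for
    the target channel. urlopen raises an HTTPError on error statuses.
    """
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `post_to_slack(webhook_url, "Job finished")` at the end of a notebook would send the message. Splitting request construction from sending makes it possible to inspect the payload without touching the network.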