Last week I was working on a Databricks script that needed to produce a Slack message as its final outcome. I lifted some code that used a pip-installed Slack client. Unfortunately, I could not use the package on my cluster. Fortunately, the Slack API is so simple that you don’t really need a package to post a simple message to a channel. In this blog I’ll show you the simplest way of producing awesome messages in Slack.
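To give an idea of how little is needed, here is a minimal sketch that posts a message using only the Python standard library. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field; the function names and the webhook URL in the comment are placeholders of mine, not something taken from the original script.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request that carries a plain text message as JSON."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_slack(webhook_url: str, text: str) -> int:
    """Send the message and return the HTTP status code of the response."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request) as response:
        return response.status


# Usage (hypothetical URL, replace it with your own webhook):
# post_to_slack("https://hooks.slack.com/services/XXX/YYY/ZZZ", "Job finished")
```

Building the request is kept separate from sending it, so the payload can be inspected without touching the network.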
I like to validate my application configuration upon startup. Especially when doing local development, I want to know which application settings are missing. I also like to know where I should add them. This blog shows how to implement validation of your configuration classes using data annotations.
Today we’ll be looking at sorting and reducing an array of a complex data type. I’m using Databricks to do Spark, but I’m sure the code is compatible with other Spark environments. I’ll be using Spark SQL to show the steps. I’ve tried to keep the data as simple as possible. The example should also apply to more complex scenarios.
Last week we had some problems with the Google Ads bot. It was not able to crawl a bunch of URLs while the browser had no problem getting through. The only difference was the User-Agent. This sent us on a debugging journey through Cloudflare, gateways and micro-sites. To assist us, we’ve created a small bash script to visit a URL and show some debug info.
At Wehkamp we use AWS Lambda to classify images on S3. The Lambda is triggered when a new image is uploaded to the S3 bucket. Currently we have over 6,400,000 images in the bucket. Now we would like to run the Lambda for all images in the bucket. In this blog I’ll show how we did this with a Python 3.6 script.
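The general idea can be sketched as follows: list every key in the bucket and invoke the Lambda asynchronously with a hand-made S3 event. This is an illustration under assumptions, not the script from the post. It assumes boto3 clients are passed in, that the Lambda only reads the bucket name and object key from the event, and all function names are hypothetical.

```python
import json


def s3_event(bucket, key):
    """Build a trimmed-down S3 event (assumption: only bucket and key are read)."""
    return {"Records": [{"s3": {"bucket": {"name": bucket}, "object": {"key": key}}}]}


def iter_keys(s3_client, bucket):
    """Yield every object key in the bucket, one page of results at a time."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # An empty bucket (or page) has no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def invoke_for_all(s3_client, lambda_client, bucket, function_name):
    """Invoke the Lambda once per object and return the number of invocations."""
    count = 0
    for key in iter_keys(s3_client, bucket):
        lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="Event",  # asynchronous: do not wait for the result
            Payload=json.dumps(s3_event(bucket, key)).encode("utf-8"),
        )
        count += 1
    return count


# Usage (hypothetical bucket and function names):
# import boto3
# invoke_for_all(boto3.client("s3"), boto3.client("lambda"), "my-bucket", "my-function")
```

Passing the clients in as arguments keeps the sketch free of credentials and makes it easy to try against stand-in objects first. With millions of objects you would likely also want throttling or batching, which is left out here.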
At Wehkamp we’ve been using machine learning for a while now. We’re training models in Databricks (Spark) and Keras. This produces a Keras file that we use to make the actual predictions. Training is one thing, but getting them to production is quite another!
The main problem we’ve faced was that it was too big to actually fit into a Lambda. This blog shows how we’ve dealt with that problem.