AWS

Amazon Web Services (AWS) is a subsidiary of Amazon that provides on-demand cloud computing platforms to individuals, companies, and governments, on a metered pay-as-you-go basis. In aggregate, these cloud computing web services provide a set of primitive abstract technical infrastructure and distributed computing building blocks and tools.

Streaming a Kafka topic in a Delta table on S3 using Spark Structured Streaming
Amazon S3, Databricks / Spark, Kafka, PySpark

Our data strategy specifies that we should store data on S3 for further processing. Raw files on S3 are not the best way of working with data in Spark, though. In this blog I’ll show how you can use Spark Structured Streaming to write the JSON records of a Kafka topic into a Delta table.
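To give an idea of what such a pipeline can look like, here is a minimal PySpark sketch. It is not the code from the post itself: the function names, the broker address, the topic, the schema and the S3 paths are all hypothetical, and it assumes a Spark session that has the Kafka source and Delta Lake available (as on Databricks).

```python
def kafka_source_options(bootstrap_servers, topic):
    """Build the reader options for the Kafka source (hypothetical helper)."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Read the topic from the beginning the first time the stream starts.
        "startingOffsets": "earliest",
    }


def stream_topic_to_delta(spark, options, schema, table_path, checkpoint_path):
    """Parse JSON records from Kafka and append them to a Delta table."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql.functions import col, from_json

    raw = spark.readStream.format("kafka").options(**options).load()

    # Kafka delivers the payload as bytes in the `value` column:
    # cast it to a string and parse the JSON with the given schema.
    records = (
        raw.select(from_json(col("value").cast("string"), schema).alias("record"))
        .select("record.*")
    )

    return (
        records.writeStream.format("delta")
        .outputMode("append")
        # The checkpoint lets the stream resume where it left off.
        .option("checkpointLocation", checkpoint_path)
        .start(table_path)
    )
```

You would call it with something like `stream_topic_to_delta(spark, kafka_source_options("broker:9092", "my-topic"), schema, "s3://my-bucket/delta/my-topic", "s3://my-bucket/checkpoints/my-topic")`, where every value is a placeholder.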

AWS Lambda Size: PIL+TF+Keras+Numpy?
Amazon S3, AWS Lambda, bash, Python

At Wehkamp we’ve been using machine learning for a while now. We’re training models in Databricks (Spark) and Keras. This produces a Keras file that we use to make the actual predictions. Training is one thing, but getting the models to production is quite another!

The main problem we faced was that the whole package was too big to actually fit into a Lambda. This blog shows how we dealt with that problem.
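A quick way to see whether a package has a chance of fitting is to measure the unpacked directory and compare it with the Lambda quotas for zip deployments (50 MB zipped for a direct upload, 250 MB unzipped including layers). The sketch below only does that measurement; it is an illustration, not the approach from the post, and the function names and the `./package` path are hypothetical.

```python
import os

# AWS Lambda quotas for zip-based deployment packages.
ZIPPED_LIMIT_MB = 50      # direct upload of the .zip file
UNZIPPED_LIMIT_MB = 250   # unpacked size, including layers


def directory_size_mb(path):
    """Return the total size of all regular files under `path`, in MB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total / (1024 * 1024)


def fits_in_lambda(path, limit_mb=UNZIPPED_LIMIT_MB):
    """Check whether the unpacked package stays within the given limit."""
    return directory_size_mb(path) <= limit_mb
```

For example, `fits_in_lambda("./package")` tells you whether the directory you are about to zip stays under the unzipped limit.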