When you are training a machine learning image classification model, you often need to resize the images your dataset into smaller ones. When you retrain your model on new data, you resize the images once more. In this blog I’ll share how S3 can be used to cache the resized images.
At Wehkamp we use AWS Lambda to classify images on S3. The Lambda is triggered when a new image is uploaded to the S3 bucket. Currently we have over 6.400.000 images in the bucket. Now we would like to run the Lambda for all images of the bucket. In this blog I’ll show how we did this with a Python 3.6 script.
At Wehkamp we’ve been using machine learning for a while now. We’re training models in Databricks (Spark) and Keras. This produces a Keras file that we use to make the actual predictions. Training is one thing, but getting them to production is quite another!
The main problem we’ve faced was that it was too big to actually fit into a lambda. This blogs shows how we’ve dealt with that problem.