AWS Lambda
AWS Lambda is best-in-class when it comes to serverless computing on AWS. Fast, reliable and traceable, it offers everything we need and is one of the building blocks for other teams at RavenPack.
Unfortunately for us, due to Lambda's deployment package size limits (250 MB unzipped), we were unable to deploy our machine learning libraries and models. That was the case until the last re:Invent event: in December 2020 AWS announced Container Image support for Lambda, and we enthusiastically seized the opportunity.
Our use case
Once this technical limitation was overcome, we wanted to be able to:
- Quickly deploy models on the AWS Cloud to receive market feedback.
- Test the behavior locally for debugging.
- Create a scalable, cost-effective and efficient framework.
In this article, we will share how we did it and greatly reduced our Time to Innovate window.
Our ML model
In this use case, our machine learning model was a classifier for a Computer Vision task. We used transfer learning, with a classifier on top of VGG19 and TensorFlow 2 as the framework. This is not crucial, but to give an idea of the latencies: on our machine (i7-10700K @ 3.8 GHz, 32 GB DDR4 @ 2400 MHz), CPU inference times were around 500 ms/image.
Automating the workflow
We created a simple bash script to orchestrate the pipeline, from creating the docker image to uploading it to ECR.
All good things must come… to a docker container
When we launch the bash script, the following steps are performed (a sketch follows the list):
- The previous deployment is removed, ensuring that old models are deleted or overwritten.
- The lambda folder is created.
- The requirements.txt, dockerfile and machine learning models are copied to their paths.
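A minimal sketch of those preparation steps, with illustrative folder and file names:

#!/bin/bash
# Remove the previous deployment so stale models are not shipped
rm -rf lambda_deployment
# Recreate the lambda folder
mkdir lambda_deployment
# Copy the build inputs into place (paths are illustrative)
cp requirements.txt Dockerfile lambda_deployment/
cp -r models lambda_deployment/models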
Once this was done, the docker image could be built as the next step in the bash script. This step downloaded (or reused) the AWS Lambda python3.8 base image and installed the required packages into it. To conclude, it set the handler function as the entrypoint, which AWS Lambda requires to work properly.
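A minimal Dockerfile along those lines might look like this (the handler module name, app.handler, is an assumption for illustration):

# Official AWS Lambda Python 3.8 base image
FROM public.ecr.aws/lambda/python:3.8
# Install the ML dependencies inside the image
COPY requirements.txt .
RUN pip install -r requirements.txt
# Copy the handler code and the model artifacts
COPY app.py ${LAMBDA_TASK_ROOT}
COPY models ${LAMBDA_TASK_ROOT}/models
# Set the handler function as the entrypoint for the Lambda runtime
CMD ["app.handler"]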
Test the docker image in the local environment
Uploading the image, setting it up and testing it on AWS would have been expensive in terms of time, even more so if there were bugs in our code, so it was preferable to test the image locally before submitting it to ECR.
For this, the lambda base image provided an interface (the Lambda Runtime Interface Emulator) that emulated the lambda behavior. We only had to run the docker image locally and send our requests to its endpoint.
Let’s run it, mapping the container's port 8080 to our local port 9000, from a terminal:
sudo docker run -p 9000:8080 --env MY_ENVIRONMENT_VARIABLES lambda_deployment:latest
This started a server the same way running a Flask app locally would. Now we could start debugging from Python or any other language and iterate.
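For a quick smoke test from a terminal, the emulator exposes the standard invocation path (the payload format here assumes a handler that accepts a base64-encoded image under an "image" key):

curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{"image": "<base64-encoded image>"}'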
Unfortunately, we had to regenerate the image for every bug fix we made. For that reason, when doing lambda deployments it was more convenient to rely on docker-compose: we would create our own base image, containing the AWS Lambda base image plus all the packages in the requirements.txt file, and then use that as the base image so that only the handler function or the models had to be rebuilt. Regenerating the image usually takes around 3 minutes, but with this approach we could take it down to less than 30 seconds.
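Whatever the tooling on top, the gist is to bake the heavy dependencies into a reusable base image so that day-to-day rebuilds only copy the handler and the models. A sketch with plain Dockerfiles and illustrative image names:

# base.Dockerfile — rebuilt only when requirements.txt changes
FROM public.ecr.aws/lambda/python:3.8
COPY requirements.txt .
RUN pip install -r requirements.txt

# Dockerfile — rebuilt on every iteration, in seconds
FROM lambda_base:latest
COPY app.py ${LAMBDA_TASK_ROOT}
COPY models ${LAMBDA_TASK_ROOT}/models
CMD ["app.handler"]

The base image is built once with sudo docker build -f base.Dockerfile -t lambda_base . and then reused.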
This is a sample request sent to our model using Python:
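A minimal sketch (the payload format is an assumption; here, a base64-encoded image under an "image" key):

import base64

import requests

# Local endpoint exposed by the Lambda Runtime Interface Emulator
URL = "http://localhost:9000/2015-03-31/functions/function/invocations"

# Assumed payload: a base64-encoded image under an "image" key
with open("sample.jpg", "rb") as f:
    payload = {"image": base64.b64encode(f.read()).decode("utf-8")}

response = requests.post(URL, json=payload)
print(response.json())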
On every call, the container returns the result, but in the terminal it also displays the total duration, billed duration and resources used. This is useful to size our Lambda accordingly, as otherwise we would either over-provision resources at additional expense, or risk long inference times and errors.
In our case, latency initially went up to 1200 ms, but after refactoring the code and loading the model outside the handler body it came down to 800 ms: a good tradeoff for going serverless!
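The pattern is simply to load the model at import time so warm invocations reuse it. A minimal sketch (model path, preprocessing and payload format are assumptions):

import base64
import io

import numpy as np
import tensorflow as tf
from PIL import Image

# Loaded once per container, outside the handler body, so only cold
# starts pay the model-loading cost; warm invocations reuse it
MODEL = tf.keras.models.load_model("models/classifier")

def handler(event, context):
    # Assumed payload: a base64-encoded image under an "image" key
    image_bytes = base64.b64decode(event["image"])
    # VGG19-style input: 224x224 RGB, scaled to [0, 1]
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB").resize((224, 224))
    batch = np.expand_dims(np.asarray(image, dtype="float32") / 255.0, axis=0)
    prediction = MODEL.predict(batch)
    return {"class": int(np.argmax(prediction, axis=1)[0])}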
Once everything is debugged, we can submit our image to ECR.
Submitting the image to ECR
We first needed to create an ECR repository, for example via the AWS console; in this case, ravenpack-test.
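The repository can also be created from the AWS CLI:

aws ecr create-repository --repository-name ravenpack-test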
Once everything had been debugged and we knew for sure the image worked, it was time to upload it to ECR. In our bash script, we just needed to uncomment the last three lines (shown below) to retag the image and upload it to AWS ECR. To avoid an upload error, it is very important to authenticate your docker client prior to pushing the image, this way:
# Fetch a temporary ECR password and use it to log the docker client in
aws ecr get-login-password >password.txt
password=$(<password.txt)
sudo docker login -u AWS -p "$password" <aws_account_id>.dkr.ecr.<region>.amazonaws.com
Then, be sure to correctly tag the image and upload it to the repository. Although the cost of keeping several images on ECR is not very high, we advise checking beforehand that the image behaves as expected locally, and uploading only when necessary.
image_name=$aws_account_id.dkr.ecr.$aws_region.amazonaws.com/ravenpack-test # proper naming for the image on the ECR repository
sudo docker tag lambda_deployment $image_name # rename the image to the ECR repository name
sudo docker push $image_name # push the image
Load the image in AWS Lambda
Now that the image is in ECR, we just need to create a new lambda function and select the Container Image option.
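The same step can be scripted with the AWS CLI (the execution role ARN is a placeholder):

aws lambda create-function \
  --function-name ravenpack-test \
  --package-type Image \
  --code ImageUri=$aws_account_id.dkr.ecr.$aws_region.amazonaws.com/ravenpack-test:latest \
  --role arn:aws:iam::<aws_account_id>:role/<lambda_execution_role>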
Conclusions
In this article we have shown the steps to serve a Machine Learning model using AWS Lambda and its new container image feature.
This proves to be a cost-effective solution: at 100 inferences/day (roughly 3,000 per month), the total cost would be less than US$1/month: about US$0.50 fixed cost for hosting plus US$0.15 variable cost per 1,000 inferences, i.e. around US$0.45 variable, for a total near US$0.95/month.