How to containerize a HuggingFace Transformers Model using Docker and Flask?
HuggingFace has made a huge impact on the Natural Language Processing domain by making many Transformer models available online. One problem I faced during my MLOps process was deploying one of those HuggingFace models for sentiment analysis. In this post, I will briefly summarize how I deployed a HuggingFace model using Docker and Flask.
I assume the reader has basic knowledge of the Docker, TensorFlow, Transformers, Flask, and PyTorch libraries. The source code can be found in my GitHub repo.
Required libraries: Flask, Transformers, TensorFlow (install with pip or conda as you wish; I used pip). If you are using TensorFlow, as I do, you will need PyTorch only to load a HuggingFace model that was trained in PyTorch, using the flag from_pt=True. Once the model has been saved locally, reloading it no longer requires PyTorch, so PyTorch does not need to be in your container.
Step 1: Load and save the transformer model in a local directory using save_hf_model.py
Your saved models/transformers directory should look like this:
Then load the model back from the local directory and verify it by comparing the outputs of the two models, the original and the reloaded one.
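The two parts of Step 1 can be sketched as below. This is a hypothetical version of save_hf_model.py, not the repo's exact code: the model id, the directory path, and the helper names are my assumptions.

```python
# Hypothetical sketch of save_hf_model.py; MODEL_NAME and SAVE_DIR are
# illustrative assumptions, not necessarily what the repo uses.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
SAVE_DIR = "models/transformers"

def save_model(model_name: str = MODEL_NAME, save_dir: str = SAVE_DIR) -> None:
    # transformers is imported lazily so the comparison helper below
    # can be reused without the heavy dependency installed.
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Add from_pt=True here if the checkpoint only exists as PyTorch weights.
    model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer.save_pretrained(save_dir)
    model.save_pretrained(save_dir)

def logits_close(a, b, tol=1e-5):
    """Check that the original and the reloaded model produce (nearly) identical logits."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

# Calling save_model() downloads the checkpoint and writes it to SAVE_DIR;
# afterwards you can reload with from_pretrained(SAVE_DIR) and compare the
# two models' logits on the same input with logits_close().
```

The comparison step matters because from_pt conversion or a partial save can silently change weights; identical logits on a few test sentences give reasonable confidence the reloaded model is equivalent.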
Step 2: Create a minimal Flask app; in fact, you can use the one in my GitHub repo without changing anything, just replace the model with the one in your models/transformers directory. I recommend testing your app again at this stage by running it with Flask.
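A minimal app along those lines could look like this. It is a sketch, not the repo's app: the /predict route, the "text" payload key, and the label mapping are all assumptions you would adjust to match your model.

```python
# Hypothetical minimal Flask app (e.g. app.py); route, payload keys, and
# label order are assumptions, not taken from the repo.
LABELS = {0: "NEGATIVE", 1: "POSITIVE"}  # assumed order for a binary sentiment head

def to_response(label_id: int, score: float) -> dict:
    """Shape a single prediction as a JSON-serializable dict."""
    return {"label": LABELS[label_id], "score": round(score, 4)}

def create_app(model_dir: str = "models/transformers"):
    # Heavy imports kept inside so to_response stays dependency-free.
    import tensorflow as tf
    from flask import Flask, jsonify, request
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    app = Flask(__name__)
    # Load once at startup from the local directory saved in Step 1;
    # no PyTorch needed here, so none in the container.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = TFAutoModelForSequenceClassification.from_pretrained(model_dir)

    @app.route("/predict", methods=["POST"])
    def predict():
        text = request.get_json(force=True)["text"]
        inputs = tokenizer(text, return_tensors="tf", truncation=True)
        probs = tf.nn.softmax(model(**inputs).logits, axis=-1).numpy()[0]
        label_id = int(probs.argmax())
        return jsonify(to_response(label_id, float(probs[label_id])))

    return app

# To test locally before containerizing:
#   create_app().run(host="0.0.0.0", port=5000)
```

Loading the model once in create_app(), rather than per request, keeps request latency down to tokenization plus a single forward pass.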
Step 3: Containerize the app using Dockerfile:
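A Dockerfile for this setup could be sketched as follows; the file names (app.py, requirements.txt) and base image are assumptions, so adapt them to your repo layout.

```dockerfile
# Hedged sketch; file names and base image are assumptions.
FROM python:3.9-slim
WORKDIR /app
# Install dependencies first so Docker caches this layer across rebuilds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the app code along with the locally saved model (models/transformers).
COPY . .
EXPOSE 5000
# Assumes app.py starts the Flask app with app.run(host="0.0.0.0", port=5000).
CMD ["python", "app.py"]
```

Note that the locally saved model is copied into the image, which is exactly why PyTorch can be left out of requirements.txt.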
docker build --tag mlapp .
docker run -i -p 9000:5000 mlapp
(add the -d flag to run detached in the background; change 9000 to whatever host port you need)
- Check that your container is up and running: docker ps
- Check that the container is responding: curl 127.0.0.1:9000 -v
Step 4: Test your model with make_req.py. Note that your request data must be in the same format you used when testing the model in save_hf_model.py.
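A hypothetical make_req.py could look like the sketch below; the URL, route, and payload key are assumptions and must match whatever your Flask app actually expects.

```python
# Hypothetical make_req.py sketch; URL, route, and the "text" payload key
# are assumptions that must mirror the Flask app.
URL = "http://127.0.0.1:9000/predict"

def build_payload(text: str) -> dict:
    # Keep the same request format you tested against the model in Step 1.
    return {"text": text}

def send(text: str) -> dict:
    import requests  # imported here so build_payload stays dependency-free
    resp = requests.post(URL, json=build_payload(text))
    resp.raise_for_status()
    return resp.json()

# Usage (with the container running on host port 9000):
#   print(send("I love this movie!"))
```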
Step 5: To stop your Docker container, pass the container ID reported by docker ps to docker stop:
docker stop 1fbcac69069c
Your model is now running in your container, ready to deploy anywhere.
Happy machine learning!