How to containerize a HuggingFace Transformers Model using Docker and Flask?
HuggingFace has made a huge impact on the Natural Language Processing domain by making many Transformer models available online. One problem I faced during my MLOps process was deploying one of those HuggingFace models for sentiment analysis. In this post, I will briefly summarize how I deployed a HuggingFace model using Docker and Flask.
I assume the reader has basic knowledge of the Docker, TensorFlow, Transformers, Flask, and PyTorch libraries. The source code can be found in my GitHub repo.
Required libraries: Flask, Transformers, TensorFlow (install with pip or conda as you wish; I used pip). If you are using TensorFlow, as I do, you will need PyTorch only to load a HuggingFace model that was trained in PyTorch, using the flag from_pt=True. Once the model has been saved locally, reloading it no longer requires PyTorch, so PyTorch does not need to be in your container.
Step 1: Load and save the transformer model in a local directory using save_hf_model.py
Your saved models/transformers directory should look like this:
Then load the model back from the local directory and verify it by comparing the outputs of the two models, the original and the reloaded one.
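The two parts of Step 1 can be sketched as below. This is a hypothetical version of save_hf_model.py, not the repo's exact code: the model id, the directory path, and the helper names are my assumptions.

```python
# Hypothetical sketch of save_hf_model.py; MODEL_NAME and SAVE_DIR are
# illustrative assumptions, not necessarily what the repo uses.
MODEL_NAME = "distilbert-base-uncased-finetuned-sst-2-english"
SAVE_DIR = "models/transformers"

def save_model(model_name: str = MODEL_NAME, save_dir: str = SAVE_DIR) -> None:
    # transformers is imported lazily so the comparison helper below
    # can be reused without the heavy dependency installed.
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Add from_pt=True here if the checkpoint only exists as PyTorch weights.
    model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
    tokenizer.save_pretrained(save_dir)
    model.save_pretrained(save_dir)

def logits_close(a, b, tol=1e-5):
    """Check that the original and the reloaded model produce (nearly) identical logits."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))

# Calling save_model() downloads the checkpoint and writes it to SAVE_DIR;
# afterwards you can reload with from_pretrained(SAVE_DIR) and compare the
# two models' logits on the same input with logits_close().
```

The comparison step matters because from_pt conversion or a partial save can silently change weights; identical logits on a few test sentences give reasonable confidence the reloaded model is equivalent.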
Step 2: Create a minimal Flask app; in fact, you can use the one in my GitHub repo without changing anything, just replace the model with the one in your models/transformers directory. I recommend testing your app again at this stage by running it with Flask.
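A minimal app along those lines could look like this. It is a sketch, not the repo's app: the /predict route, the "text" payload key, and the label mapping are all assumptions you would adjust to match your model.

```python
# Hypothetical minimal Flask app (e.g. app.py); route, payload keys, and
# label order are assumptions, not taken from the repo.
LABELS = {0: "NEGATIVE", 1: "POSITIVE"}  # assumed order for a binary sentiment head

def to_response(label_id: int, score: float) -> dict:
    """Shape a single prediction as a JSON-serializable dict."""
    return {"label": LABELS[label_id], "score": round(score, 4)}

def create_app(model_dir: str = "models/transformers"):
    # Heavy imports kept inside so to_response stays dependency-free.
    import tensorflow as tf
    from flask import Flask, jsonify, request
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    app = Flask(__name__)
    # Load once at startup from the local directory saved in Step 1;
    # no PyTorch needed here, so none in the container.
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = TFAutoModelForSequenceClassification.from_pretrained(model_dir)

    @app.route("/predict", methods=["POST"])
    def predict():
        text = request.get_json(force=True)["text"]
        inputs = tokenizer(text, return_tensors="tf", truncation=True)
        probs = tf.nn.softmax(model(**inputs).logits, axis=-1).numpy()[0]
        label_id = int(probs.argmax())
        return jsonify(to_response(label_id, float(probs[label_id])))

    return app

# To test locally before containerizing:
#   create_app().run(host="0.0.0.0", port=5000)
```

Loading the model once in create_app(), rather than per request, keeps request latency down to tokenization plus a single forward pass.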
Step 3: Containerize the app using Dockerfile:
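A Dockerfile for this setup could be sketched as follows; the file names (app.py, requirements.txt) and base image are assumptions, so adapt them to your repo layout.

```dockerfile
# Hedged sketch; file names and base image are assumptions.
FROM python:3.9-slim
WORKDIR /app
# Install dependencies first so Docker caches this layer across rebuilds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the app code along with the locally saved model (models/transformers).
COPY . .
EXPOSE 5000
# Assumes app.py starts the Flask app with app.run(host="0.0.0.0", port=5000).
CMD ["python", "app.py"]
```

Note that the locally saved model is copied into the image, which is exactly why PyTorch can be left out of requirements.txt.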
docker build --tag mlapp .
docker run -i -p 9000:5000 mlapp
(add the -d flag to run detached in the background; change 9000 to whatever host port you need)
- Check that your container is up and running: docker ps
- Check that the container is responding: curl 127.0.0.1:9000 -v
Step 4: Test your model with make_req.py. Note that your request data must be in the same format you used when testing the model in save_hf_model.py.
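A hypothetical make_req.py could look like the sketch below; the URL, route, and payload key are assumptions and must match whatever your Flask app actually expects.

```python
# Hypothetical make_req.py sketch; URL, route, and the "text" payload key
# are assumptions that must mirror the Flask app.
URL = "http://127.0.0.1:9000/predict"

def build_payload(text: str) -> dict:
    # Keep the same request format you tested against the model in Step 1.
    return {"text": text}

def send(text: str) -> dict:
    import requests  # imported here so build_payload stays dependency-free
    resp = requests.post(URL, json=build_payload(text))
    resp.raise_for_status()
    return resp.json()

# Usage (with the container running on host port 9000):
#   print(send("I love this movie!"))
```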
Step 5: To stop your Docker container, pass the container ID reported by docker ps to docker stop:
docker stop 1fbcac69069c
Your model is now running in your container, ready to deploy anywhere.
Happy machine learning!