0 votes
in Machine Learning by (19k points)

I have a web application written in Flask. Since everyone advises against using Flask's built-in development server in production, I am thinking of running Flask under Gunicorn.

The Flask application loads several machine learning models, about 8 GB in total. The application may need to handle up to 1000 concurrent requests, and the machine has 15 GB of RAM.

So what is the best way to run this application?

1 Answer

0 votes
by (33.1k points)

You can run your app under Gunicorn with multiple worker processes, with threads, or with async workers.

For example:


# server.py
from flask import Flask

app = Flask(__name__)


@app.route("/")
def hello():
    return "Hello World!"


if __name__ == "__main__":
    # Development server only; Gunicorn imports `app` directly.
    app.run()

Gunicorn with the gevent async worker:

gunicorn server:app -k gevent --worker-connections 1000

Gunicorn 1 worker 12 threads:

gunicorn server:app -w 1 --threads 12

Gunicorn with 4 workers (multiprocessing):

gunicorn server:app -w 4
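With 8 GB of models and 15 GB of RAM, the worker count matters: if each worker process loads its own copy of the models, four workers would not fit in memory. The sketch below is one way to express that arithmetic in a Gunicorn config file. The file name `gunicorn.conf.py`, the helper `workers_that_fit`, and the 2 GB reserve are assumptions for illustration; `workers` and `threads` are standard Gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file name)
# Assumed usage: gunicorn -c gunicorn.conf.py server:app

RAM_GB = 15      # machine memory, from the question
MODEL_GB = 8     # combined model size, from the question
RESERVE_GB = 2   # assumed headroom for the OS and request handling


def workers_that_fit(ram_gb, model_gb, reserve_gb):
    """Workers that fit if every worker holds its own copy of the models."""
    usable = ram_gb - reserve_gb
    return max(1, int(usable // model_gb))


# Gunicorn reads these module-level names as settings.
workers = workers_that_fit(RAM_GB, MODEL_GB, RESERVE_GB)
threads = 12  # threads share the single in-process copy of the models
```

Under these assumptions only one worker fits, so the single-worker threaded or gevent options above are likely the safer starting point on this machine.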

Hope this answer helps.
