Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shutdown process is broken in 0.15 #1160

Closed
2 tasks done
KharchenkoDmitriy opened this issue Aug 18, 2021 · 24 comments · Fixed by #1269
Closed
2 tasks done

Shutdown process is broken in 0.15 #1160

KharchenkoDmitriy opened this issue Aug 18, 2021 · 24 comments · Fixed by #1269
Labels

Comments

@KharchenkoDmitriy
Copy link

Checklist

  • The bug is reproducible against the latest release and/or master.
  • There are no similar issues or pull requests to fix it yet.

Describe the bug

My FastAPI ASGI server cannot shutdown properly with uvicorn==0.15 while it can with 0.14

To reproduce

Setup minimal FastAPI app and add some functions with logs(prints) to shutdown event

Expected behavior

You see all logs(prints) from functions on shutdown

Actual behavior

Get ASGI 'lifespan' protocol appears unsupported. without --lifespan on
Get error trace with --lifespan on

Debugging material

uvicorn scheduler.main:app --host=0.0.0.0 --port ${WEB_PORT:-8000} --reload --lifespan on
INFO: Will watch for changes in these directories: ['/home/dmytro/storage/chimplie/projects/raok-main/raok-scheduler']
INFO: Uvicorn running on http://0.0.0.0:8004 (Press CTRL+C to quit)
INFO: Started reloader process [177653] using statreload
INFO: Started server process [177655]
INFO: Waiting for application startup.
INFO: Tortoise-ORM started, {'default': <tortoise.backends.asyncpg.client.AsyncpgDBClient object at 0x7f63d4a10e50>}, {'models': {'Task': <class 'scheduler.models.task.Task'>, 'Aerich': <class 'aerich.models.Aerich'>}}
INFO: Application startup complete.
^CINFO: Shutting down
INFO: Finished server process [177655]
ERROR: Exception in 'lifespan' protocol
Traceback (most recent call last):
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/uvicorn/lifespan/on.py", line 84, in main
await app(scope, self.receive, self.send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 75, in call
return await self.app(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/fastapi/applications.py", line 199, in call
await super().call(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/starlette/applications.py", line 112, in call
await self.middleware_stack(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/starlette/middleware/errors.py", line 146, in call
await self.app(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/starlette/middleware/cors.py", line 70, in call
await self.app(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/starlette/exceptions.py", line 58, in call
await self.app(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/starlette/routing.py", line 569, in call
await self.lifespan(scope, receive, send)
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/starlette/routing.py", line 544, in lifespan
await receive()
File "/home/dmytro/.local/share/virtualenvs/raok-scheduler-hpGGYNLi/lib/python3.8/site-packages/uvicorn/lifespan/on.py", line 135, in receive
return await self.receive_queue.get()
File "/usr/lib64/python3.8/asyncio/queues.py", line 163, in get
await getter
asyncio.exceptions.CancelledError
INFO: Stopping reloader process [177653]

image

Environment

  • Fedora 34 / Python 3.8 / Uvicorn version: 0.15 - bug, 0.14 - ok
  • command to run: uvicorn main:app --host=0.0.0.0 --port 8000 --reload
@Kludex
Copy link
Member

Kludex commented Aug 19, 2021

When I press CTRL + C it sends two signals to handle_exit(), first a SIGNIT followed by a SIGTERM.

I'm debugging it.

@Kludex
Copy link
Member

Kludex commented Aug 19, 2021

Ok... It was quite fast to find it 😅

The problem is that on reload mode, we terminate the process on the shutdown(): de53c23

@euri10
Copy link
Member

euri10 commented Aug 19, 2021

Ok... It was quite fast to find it

The problem is that on reload mode, we terminate the process on the shutdown(): de53c23

seems more like an issue in op lifespan event, cant reproduce here
the actual app would help

@KharchenkoDmitriy
Copy link
Author

Ok... It was quite fast to find it

The problem is that on reload mode, we terminate the process on the shutdown(): de53c23

I see. I am using a script for same purpose (send SIGTERM to process group) in docker exactly to avoid dead reloader while server still working. It runs into the same problem with CTRL+C as it`s sending signal to whole group and as a result child processes receive two SIGTERMs instead of one.
In my case I decide to use a boolean that indicate if my script is propagating signals to process group or not

@euri10
Copy link
Member

euri10 commented Aug 19, 2021

this simple app reloads fine and go through lifespan events fine too:

routes = [
    Route("/index", index),
]

async def lifespan(app):
    print(f'startup lifespan')
    yield
    print('shutdown lifespan')

app = Starlette(routes=routes, lifespan=lifespan)

/home/lotso/PycharmProjects/uvicorn/venv/bin/python -m uvicorn dev.app:app --reload --lifespan=on --reload-dir=dev --log-config=dev/logging_dev.yaml
INFO:     Will watch for changes in these directories: ['/home/lotso/PycharmProjects/uvicorn/dev']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [407820] using watchgod
WARNING   asyncio - 44 - [SFJAPI-dev]- Executing <Task pending name='Task-1' coro=<Server.serve() running at /home/lotso/PycharmProjects/uvicorn/./uvicorn/server.py:86> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fa9758adca0>()] created at /home/lotso/.asdf/installs/python/3.9.5/lib/python3.9/asyncio/locks.py:223> cb=[run_until_complete.<locals>.done_cb()] created at /home/lotso/.asdf/installs/python/3.9.5/lib/python3.9/asyncio/runners.py:44> took 0.167 seconds
startup lifespan
INFO:     Started server process [407827]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     127.0.0.1:38310 - 2021-08-19 09:28:30,450 -'GET / HTTP/1.1' 404 Not Found
INFO:     127.0.0.1:38310 - 2021-08-19 09:28:30,703 -'GET /favicon.ico HTTP/1.1' 404 Not Found
INFO:     127.0.0.1:38310 - 2021-08-19 09:28:35,012 -'GET /index HTTP/1.1' 200 OK
WARNING:  WatchGodReload detected file change in '['/home/lotso/PycharmProjects/uvicorn/dev/app.py']'. Reloading...
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [407827]
shutdown lifespan
INFO:     Started server process [409414]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
WARNING   asyncio - 44 - [SFJAPI-dev]- Executing <Task pending name='Task-1' coro=<Server.serve() running at /home/lotso/PycharmProjects/uvicorn/./uvicorn/server.py:86> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7f0b05d7ac10>()] created at /home/lotso/.asdf/installs/python/3.9.5/lib/python3.9/asyncio/locks.py:223> cb=[run_until_complete.<locals>.done_cb()] created at /home/lotso/.asdf/installs/python/3.9.5/lib/python3.9/asyncio/runners.py:44> took 0.167 seconds
startup lifespan
INFO:     127.0.0.1:38764 - 2021-08-19 09:29:01,921 -'GET /index HTTP/1.1' 200 OK

@KharchenkoDmitriy
Copy link
Author

Ok... It was quite fast to find it
The problem is that on reload mode, we terminate the process on the shutdown(): de53c23

seems more like an issue in op lifespan event, cant reproduce here
the actual app would help

I`m using just fastapi app itself
image

And it shutdown properly without --reload flag, and failed with it

@Kludex
Copy link
Member

Kludex commented Aug 19, 2021

@euri10 I'm reproducing your code, and it does raise CancelledError here (it doesn't handle the lifespan shutdown event).

EDIT: On reload is fine. It's just when you CTRL + C there.

@euri10
Copy link
Member

euri10 commented Aug 19, 2021

ok can get the error on ctrl+c, thought it was the same as kill -2 which works fine, as well as kill -15
I wouldn't worry too much

@KharchenkoDmitriy
Copy link
Author

KharchenkoDmitriy commented Aug 19, 2021

That's because kill sends signal to single process, but CTRL+C sends signal to process group. That's why with reloaded up(which send additional signal to worker), web worker would receive two signals instead of one.

@v1b1
Copy link

v1b1 commented Aug 19, 2021

I get the same error with Starlette 0.16.0 when I send a SIGINT:

[2021-08-19 11:01:19,614][25024] INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
[2021-08-19 11:01:23,189][25024] ERROR:    Traceback (most recent call last):
  File "c:\Users\...\.venv\lib\site-packages\starlette\routing.py", line 624, in lifespan
    await receive()
  File "c:\Users\...\.venv\lib\site-packages\uvicorn\lifespan\on.py", line 135, in receive
    return await self.receive_queue.get()
  File "C:\Users\...\python\current\lib\asyncio\queues.py", line 166, in get
    await getter
asyncio.exceptions.CancelledError

I never had to create a lifespan function in previous versions, but I found that the error disappears if I bootstrap Starlette like so:

async def _lifespan(app):
  import asyncio
  try:
    yield
  except asyncio.exceptions.CancelledError:
    pass
server_app = Starlette(routes=routes, lifespan=_lifespan)

@Kludex Kludex added the bug label Aug 28, 2021
tsukumijima added a commit to tsukumijima/KonomiTV that referenced this issue Aug 29, 2021
…ードでの起動に変更

Uvicorn の --reload オプションはファイルに変更があった時に自動でリロードしてくれる優れものなんだけど、Linux 環境では Uvicorn 0.15 のバグで Ctrl+C が効かない
そもそも通常利用でリロードモードにする必要はないので、この機会にコマンドを分ける
ref: encode/uvicorn#1160
@igormcsouza
Copy link

igormcsouza commented Oct 12, 2021

I'm having the exactly same issue!! Here's the code https://github.com/igormcsouza/kitty-api/blob/master/api/__init__.py

I downgrade to 0.14 just to avoid error

@Kludex
Copy link
Member

Kludex commented Oct 21, 2021

Downgrade is not recommended.

Reload works normally, it's just when you CTRL+C, it will force exit, but network resources are closed anyway.

@igormcsouza
Copy link

Even though the problem persist! I tried once more yesterday! That error at the end causes issues on my routine! I will keep on 0.14 until it is fixed!

@afparsons
Copy link

For passers-by: I encountered a similar issue when setting spaCy's Language.pipe n_process parameter to > 1.

@haptiq-darko
Copy link

Hello, we're having this issue when we Nuitka compile our app. Here is the output

0:00:00.901396  INFO       13 show_systeminfo  Processor: Intel64 Family 6 Model 165 Stepping 2, GenuineIntel
0:00:00.902397  INFO       14 show_systeminfo  Machine  : AMD64
0:00:00.903399  INFO       15 show_systeminfo  Node     : DESKTOP-VF4OT9Q
0:00:00.904396  INFO       16 show_systeminfo  System   : Windows
0:00:00.910397  INFO       17 show_systeminfo  Platform : Windows-10-10.0.19044-SP0
0:00:00.911397  INFO       26 startup_event    Starting: Tasks
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:53259 (Press CTRL+C to quit)
INFO:     Shutting down
INFO:     Finished server process [15628]
0:00:01.801396  INFO       30 shutdown_event   Shutting down...
0:00:01.802398  INFO       32 shutdown_event   Shutting down: Tasks
ERROR:    Traceback (most recent call last):
  File "C:\Users\darko\AppData\Local\Temp\ONC4D5~1\starlette\routing.py", line 624, in lifespan
  File "C:\Users\darko\AppData\Local\Temp\ONC4D5~1\uvicorn\lifespan\on.py", line 135, in receive
  File "C:\Users\darko\AppData\Local\Temp\ONC4D5~1\asyncio\queues.py", line 166, in get
asyncio.exceptions.CancelledError

@francipvb
Copy link

Hello all,

I got this only under a container, but if not running inside a container it works fine.

Hope this helps a little.

@nodreb-borrus
Copy link

I am running uvicorn programatically with workers and getting this issue when I Ctrl-C the main process.

It only happens with workers, I think one of the worker processes is generating the exception since catching the asyncio exception on my uvicorn.run call does not stop the stack trace.

Also, it does not happen every time I shutdown, only randomly/sporadically.

Since I am starting uvicorn from within python, is there some way to prevent the default Ctrl-C behavior and shutdown uvicorn programatically?

Referring to: https://twitter.com/graingert/status/1539697480631197703

How about running your own lifespan task and sending a "lifespan.shutdown.complete" event?

Is this applicable here?

INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started parent process [10954]
INFO:     Started server process [10958]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [10959]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [10961]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Started server process [10960]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
^CINFO:     Shutting down
INFO:     Shutting down
INFO:     Shutting down
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [10959]
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [10961]
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [10960]
INFO:     Finished server process [10958]
ERROR:    Traceback (most recent call last):
  File "/home/user/code/project/.venv/lib/python3.8/site-packages/starlette/routing.py", line 638, in lifespan
    await receive()
  File "/home/user/code/project/.venv/lib/python3.8/site-packages/uvicorn/lifespan/on.py", line 137, in receive
    return await self.receive_queue.get()
  File "/home/user/.asdf/installs/python/3.8.13/lib/python3.8/asyncio/queues.py", line 163, in get
    await getter
asyncio.exceptions.CancelledError

INFO:     Stopping parent process [10954]

@andreaevooq
Copy link

I still have this issue on the latest version

@euri10
Copy link
Member

euri10 commented Sep 21, 2022

please the open a discussion in potential issues with as many details as possible

@rafrafek
Copy link

rafrafek commented Nov 8, 2022

I have made a new repo with this exact problem, but the problem occurs (for me) only when using Docker. If it makes any difference: I use m1 Mac, so my docker image is linux/arm64.

You can play with the repo:

https://github.com/rafrafek/playground

image

@itrare
Copy link

itrare commented Nov 15, 2022

I have the same issue in the FastAPI app and it caused pod restarts.

gunicorn --bind :8080 app.main:app --workers 2 --timeout 200 -k uvicorn.workers.UvicornWorker

However, decreasing the number of workers to 1 solves the issue of CancelledError. Although still looking to solve this without having to decrease the workers.

@ibe-314
Copy link

ibe-314 commented Nov 17, 2022

I get this issue with uvicorn==0.19.0, fastapi== 0.87.0 and python 3.9 when using uvicorn together with krenew (I need kerberos).

The code I run is contained in main.py with content:

from fastapi import FastAPI
import uvicorn

if __name__ == "__main__":
    app = FastAPI()
    uvicorn.run(app)

When I run

python3 main.py

and quit with ctrl+c everything is okay.

Running

krenew python3 main.py

and stopping with ctrl+c yields the following error

ERROR:    Traceback (most recent call last):
  File "xxx/starlette/routing.py", line 674, in lifespan
    await receive()
  File "xxx/uvicorn/lifespan/on.py", line 137, in receive
    return await self.receive_queue.get()
  File "xxx/asyncio/queues.py", line 166, in get
    await getter
asyncio.exceptions.CancelledError

@nodreb-borrus
Copy link

nodreb-borrus commented Dec 27, 2022

I was looking at this issue again to see if I could track it down, and it comes down to handle_exit on Server:

uvicorn/uvicorn/server.py

Lines 305 to 310 in dd9d5d7

def handle_exit(self, sig: int, frame: Optional[FrameType]) -> None:
if self.should_exit and sig == signal.SIGINT:
self.force_exit = True
else:
self.should_exit = True

Whenever handle_exit is called twice on one worker, force_exit is set which creates this stack trace. You can make it happen for all your workers by rapidly pressing Ctrl-C twice. I see the purpose of this behavior when the regular shutdown is blocked and you need to force your app to close.

The bug described in this issue is that handle_exit is being called twice on one worker randomly/sporadically, causing it to force_exit even though Ctrl-C is only pressed once.

I added some debug messages locally to show this, printing the process ID, and when it runs handle_exit/force_exit:

14253: handle_exit
14255: handle_exit
14254: handle_exit
14252: handle_exit
14252: handle_exit
14252: force_exit

This is with 4 workers, and again this only happens sometimes (roughly 1 in 5 in my case).

Since this issue is considered closed, I will figure out some hack for my app to remove this stack trace, but I wanted to share this info hoping it may lead to an eventual proper fix.

@Kludex
Copy link
Member

Kludex commented Dec 27, 2022

I'm locking this issue since the issue mentioned on the OP was solved, and to avoid confusion.

If you think you've found a similar issue, please create a discussion.

@encode encode locked as resolved and limited conversation to collaborators Dec 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet