Dashboard: Can't delete workspaces which failed to be created due to lack of memory on pod #8274

felladrin · 2022-02-17T12:14:47Z

Bug description

After I tried to open the https://github.com/gitpod-io/gitpod repository on Gitpod and it failed due to lack of memory, the workspaces that failed to open (it failed twice, as you can see in the screenshot below) got stuck in the dashboard with a “Failed” status. When I click the Delete Workspace button, they are not deleted, and the error in the screenshot below is triggered.

Steps to reproduce

1. Try reproducing the OutOfMemory error by opening a big repository, like https://github.com/gitpod-io/gitpod
2. After it fails to create, go to your dashboard, click the three-dots icon, and select "Delete Workspace"
3. You'll see that nothing happens, and an error is displayed in the DevTools console.

Workspaces affected

gitpodio-gitpod-wdvp0x65v3a
gitpodio-gitpod-px58o9iw73v

Note: Both workspaces were automatically deleted by the garbage collector on 2022-03-02. [1]

Expected behavior

The workspace should disappear from my dashboard after clicking the Delete Workspace button.

Example repository

I can't share the workspace either. As it failed to be created, when I click the Share button, it triggers the following error:

Anything else?

No response

sagor999 · 2022-02-17T17:18:19Z

I think this is for @gitpod-io/engineering-webapp to handle this case appropriately.
I will remove workspace for now from this issue, but feel free to tag us in if needed.

geropl · 2022-02-18T07:57:05Z

@felladrin How long did you wait between 1) workspace failed and 2) try to delete?
We do not allow any state-changing operation (e.g., delete) on workspaces that are still running. We rely on ws-manager to report the status and to terminate/delete workspaces that failed. For rare cases where workspaces are, for example, "stuck in stopping", we have timeouts: 1h in this case.
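For illustration only, the rule described above could be sketched as follows. This is not Gitpod's actual implementation: the names `InstancePhase`, `InstanceStatus`, `STOPPING_TIMEOUT_MS`, and `canDelete` are hypothetical, and the only value taken from this thread is the 1h timeout for the "stuck in stopping" case.

```typescript
// Hypothetical phases; the real set of phases reported by ws-manager may differ.
type InstancePhase = "pending" | "creating" | "running" | "stopping" | "stopped";

interface InstanceStatus {
  phase: InstancePhase;
  lastUpdate: Date; // time of the last status update (assumed to come from ws-manager)
}

// 1h, as mentioned in the comment above for workspaces stuck in stopping.
const STOPPING_TIMEOUT_MS = 60 * 60 * 1000;

// Returns true if a state-changing operation such as delete would be allowed.
function canDelete(status: InstanceStatus, now: Date): boolean {
  if (status.phase === "stopped") {
    return true;
  }
  if (status.phase === "stopping") {
    // Allow the operation once the instance has been stopping for longer than the timeout.
    return now.getTime() - status.lastUpdate.getTime() >= STOPPING_TIMEOUT_MS;
  }
  // pending, creating, running: still considered active, so the delete is refused.
  return false;
}
```

In this sketch, an instance that is still reported as running is never deletable, however long ago its last update arrived.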

> I will remove workspace for now from this issue

@sagor999 What did you do? Manipulate the DB? Or remove the pod from the k8s control plane? 🤔

felladrin · 2022-02-18T08:09:20Z

> @felladrin How long did you wait between 1) workspace failed and 2) try to delete?

It failed on Feb 16th at 11:15 AM, and I tried to delete it on Feb 17th at 12:38 PM, so a difference of more than 25 hours.

> I will remove workspace for now from this issue
> @sagor999 What did you do? Manipulate the DB? Or remove the pod from the k8s control plane? 🤔

I believe @sagor999 was talking about removing the "team: workspace" tag (which I added when I created the issue) from this issue and adding "team: webapp" in its place, because the workspace records are still listed in my dashboard:

geropl · 2022-02-18T08:16:57Z

> I believe @sagor999 was talking about removing the tag "team: workspace"

Ok. We'll need to investigate. 👍

JanKoehnlein · 2022-02-18T13:19:08Z

Scheduled for investigation

sagor999 · 2022-02-18T14:43:42Z

@geropl I believe this happens when the workspace never had a chance to actually start. Because of the out-of-memory error, the pod was scheduled and ws-manager considered it started, but it never actually started.
When that happens, the workspace gets stuck in limbo like this.
For what it is worth, this PR should improve ws-manager's handling of such edge cases from the workspace side, but you may need to handle this on the webapp side as well.

geropl · 2022-02-18T14:53:22Z

> I believe this happens when workspace never had a chance to actually start

💡 That is indeed the case: so far the (implicit) contract has been that once StartWorkspace has succeeded, we rely on updates from ws-manager. We already have a timeout for such cases; I wonder why it did not kick in (this requires the investigation I mentioned earlier).
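As a rough illustration of that contract, a timeout fallback could look like the sketch below. Everything here is an assumption made for the example: the `Instance` shape, the `applyStartTimeout` function, and the `timeoutMs` parameter are hypothetical and do not reflect the actual webapp code, and the thread does not state which timeout value applies to this case.

```typescript
// Hypothetical phases and instance record, for illustration only.
type Phase = "pending" | "creating" | "running" | "stopping" | "stopped";

interface Instance {
  id: string;
  phase: Phase;
  startedAt: Date;          // when StartWorkspace succeeded
  lastManagerUpdate?: Date; // undefined if ws-manager never sent an update
}

// Marks instances as stopped when no update has arrived for at least timeoutMs.
// Instances that are already stopped, or that were updated recently, are returned unchanged.
function applyStartTimeout(instances: Instance[], now: Date, timeoutMs: number): Instance[] {
  return instances.map((inst): Instance => {
    if (inst.phase === "stopped") {
      return inst;
    }
    // Fall back to the start time when ws-manager never reported anything.
    const lastSignal = inst.lastManagerUpdate ?? inst.startedAt;
    const silentForMs = now.getTime() - lastSignal.getTime();
    return silentForMs >= timeoutMs ? { ...inst, phase: "stopped" } : inst;
  });
}
```

Falling back to `startedAt` is the design choice that matters in this sketch: it covers an instance whose pod was scheduled but which never produced a single status update.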

geropl · 2022-04-21T06:39:49Z

This might become more common with an upcoming workspace PR (#9438), so we should prioritize this.

stale · 2022-08-11T23:18:27Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

j-elmer123 · 2022-08-13T18:44:25Z

This also happened to me.

NguyenCongVN · 2022-08-14T23:13:06Z

Is there any workaround here? Some kind of delete with a notice? I experienced this yesterday.

felladrin added the team: workspace and type: bug labels Feb 17, 2022

felladrin added this to 🌌 Workspace Team Feb 17, 2022

sagor999 added the team: webapp label and removed the team: workspace label Feb 17, 2022

sagor999 removed this from 🌌 Workspace Team Feb 17, 2022

sagor999 added this to 🍎 WebApp Team Feb 17, 2022

JanKoehnlein moved this to Scheduled in 🍎 WebApp Team Feb 18, 2022

geropl removed the status in 🍎 WebApp Team Apr 14, 2022

geropl moved this to Scheduled in 🍎 WebApp Team Apr 14, 2022

geropl mentioned this issue Apr 21, 2022

[ws-manager] fix workspace status flipping pending to deleted #9438

Merged

geropl removed the status in 🍎 WebApp Team Apr 28, 2022

geropl mentioned this issue Jun 24, 2022

[bridge] Update prebuild status when controlling instance timeouts (5/5) #10882

Merged


stale bot added the meta: stale label Aug 11, 2022

stale bot removed the meta: stale label Aug 14, 2022

axonasif added the meta: never-stale label Aug 15, 2022

axonasif mentioned this issue Aug 15, 2022

can't delete workspace after failed docker build #635

Closed
