DEADLINE_EXCEEDED: gRPC communication trouble #5414

Closed
meysholdt opened this issue Aug 27, 2021 · 10 comments · Fixed by #5455 or #5510
Labels
groundwork: awaiting deployment; priority: highest (user impact) (Directly user impacting); team: workspace (Issue belongs to the Workspace team); type: bug (Something isn't working); type: incident (Gitpod.io service is unstable)

Comments

@meysholdt
Member

meysholdt commented Aug 27, 2021

the query

SELECT * from `d_b_workspace_instance` as i 
    WHERE i.creationTime >= "2021-08-27" and status LIKE "%DEADLINE_EXCEEDED%";

returns 74 matches, all on cluster eu15 and all with status

{
    "phase": "stopped",
    "message": "Workspace cannot be started: Error: 4 DEADLINE_EXCEEDED: Deadline exceeded",
    "conditions": {
        "failed": "Error: 4 DEADLINE_EXCEEDED: Deadline exceeded"
    }
}
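
The `failed` condition follows the `Error: <code> <NAME>: <details>` shape that also shows up in the grpc-js stack traces in this thread. A minimal sketch of pulling the numeric code and name out of such a status, assuming the status column holds JSON shaped like the object above. `InstanceStatus`, `GrpcFailure`, and `parseGrpcFailure` are hypothetical names for illustration, not taken from the Gitpod codebase:

```typescript
// Hypothetical shape, modelled on the status JSON quoted above.
interface InstanceStatus {
  phase: string;
  message?: string;
  conditions: { failed?: string };
}

interface GrpcFailure {
  code: number;    // numeric gRPC status code, e.g. 4
  name: string;    // symbolic name, e.g. DEADLINE_EXCEEDED
  details: string; // free-text remainder of the message
}

// Extracts "<code> <NAME>: <details>" from the failed condition. Returns
// undefined when the condition is absent or does not match that shape.
function parseGrpcFailure(status: InstanceStatus): GrpcFailure | undefined {
  const failed = status.conditions.failed;
  if (!failed) return undefined;
  const match = /^Error: (\d+) ([A-Z_]+): (.*)$/.exec(failed);
  if (!match) return undefined;
  const [, code = "", name = "", details = ""] = match;
  return { code: Number(code), name, details };
}

const example: InstanceStatus = JSON.parse(`{
  "phase": "stopped",
  "message": "Workspace cannot be started: Error: 4 DEADLINE_EXCEEDED: Deadline exceeded",
  "conditions": { "failed": "Error: 4 DEADLINE_EXCEEDED: Deadline exceeded" }
}`);

console.log(parseGrpcFailure(example));
```

For the status above this yields code 4 with name DEADLINE_EXCEEDED, which is the standard gRPC status code for a call whose deadline expired before it completed.
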
@meysholdt
Member Author

SELECT region, count(region) from `d_b_workspace_instance` as i 
    WHERE i.creationTime >= "2021-08-20" and status LIKE "%DEADLINE_EXCEEDED%" 
    GROUP BY `region`

returns

region	count(region)
eu15	107
k3s01	84
us16	433

it's weird that the error does not seem to occur in eu16 and us15
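
Note that a `GROUP BY` only returns regions with at least one match, so eu16 and us15 are simply absent from the result rather than listed with a zero. A small sketch that makes the zeros explicit, assuming the five region names mentioned in this thread are the clusters of interest (that list is an assumption, and `fillMissingRegions` is a hypothetical helper):

```typescript
// Row shape of the GROUP BY result quoted above.
interface RegionCount {
  region: string;
  count: number;
}

// Returns one row per known region, using 0 where the query returned no row.
function fillMissingRegions(knownRegions: string[], rows: RegionCount[]): RegionCount[] {
  const byRegion = new Map<string, number>();
  for (const row of rows) {
    byRegion.set(row.region, row.count);
  }
  return knownRegions.map((region) => ({ region, count: byRegion.get(region) ?? 0 }));
}

// Counts copied from the query result above.
const rows: RegionCount[] = [
  { region: "eu15", count: 107 },
  { region: "k3s01", count: 84 },
  { region: "us16", count: 433 },
];

// Assumed list of clusters, taken from the names that appear in this thread.
const regions = ["eu15", "eu16", "us15", "us16", "k3s01"];

for (const { region, count } of fillMissingRegions(regions, rows)) {
  console.log(`${region}\t${count}`);
}
```
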

@meysholdt changed the title from "74 workspaces could not start today." to "DEADLINE_EXCEEDED: Errors starting workspaces" on Aug 27, 2021
@meysholdt
Member Author

Occurrences of the error during the last week, according to the DB-UPDATE log. The error seems to occur rarely, but in intense bursts when it does.

cluster eu15:
image

cluster us16:
image

log query

logName="projects/gitpod-191109/logs/cloudsql.googleapis.com%2Fmysql-general.log"
textPayload:"DEADLINE_EXCEEDED: Deadline exceeded"
textPayload:"UPDATE"
textPayload:"eu15"
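
To put numbers on how bursty the error is, the matching log entries can be bucketed by hour. A minimal sketch, assuming the timestamps have been exported as ISO-8601 strings; the sample values below are made up for illustration and `countPerHour` is a hypothetical helper:

```typescript
// Buckets ISO-8601 timestamps by UTC hour ("YYYY-MM-DDTHH") and counts
// the entries in each bucket. Throws a RangeError on unparseable input,
// because Date.prototype.toISOString rejects invalid dates.
function countPerHour(timestamps: string[]): Map<string, number> {
  const buckets = new Map<string, number>();
  for (const ts of timestamps) {
    const hour = new Date(ts).toISOString().slice(0, 13);
    buckets.set(hour, (buckets.get(hour) ?? 0) + 1);
  }
  return buckets;
}

// Illustrative timestamps only, not real log data.
const sample = [
  "2021-08-27T06:01:10.000Z",
  "2021-08-27T06:19:46.617Z",
  "2021-08-27T06:58:02.000Z",
  "2021-08-27T11:30:00.000Z",
];

countPerHour(sample).forEach((count, hour) => {
  console.log(`${hour}\t${count}`);
});
```

A few hours with large counts surrounded by empty hours would match the "rare but intense" pattern described above.
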

@meysholdt
Member Author

meysholdt commented Aug 27, 2021

Let's take a look at the logs from today: (Query: "DEADLINE_EXCEEDED: Deadline exceeded")
image

Most of the messages are the following error from server:
"Request sendHeartBeat failed with internal server error"

Error: 4 DEADLINE_EXCEEDED: Deadline exceeded
    at Object.callErrorFromStatus (/app/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:179:52)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:336:141)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
    at /app/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
    at processTicksAndRejections (internal/process/task_queues.js:79:11)
    at runNextTicks (internal/process/task_queues.js:66:3)
    at listOnTimeout (internal/timers.js:518:9)
    at processTimers (internal/timers.js:492:7)

image

@meysholdt
Member Author

among the other errors are:

from workspace pods:

[main 2021-08-27T06:19:46.617Z] code server: 1:63e3d7ac-a087-44c4-88c0-be63fa2e0a0a terminal: resize failed: Error: 4 DEADLINE_EXCEEDED: Deadline exceeded
    at Object.callErrorFromStatus (/ide/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/ide/node_modules/@grpc/grpc-js/build/src/client.js:176:52)
    at Object.onReceiveStatus (/ide/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:336:141)
    at Object.onReceiveStatus (/ide/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
    at /ide/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
    at processTicksAndRejections (internal/process/task_queues.js:77:11) {

server says "stopWorkspace error: "

Error: 4 DEADLINE_EXCEEDED: Deadline exceeded
    at Object.callErrorFromStatus (/app/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:179:52)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:336:141)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
    at /app/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
    at processTicksAndRejections (internal/process/task_queues.js:79:11)
    at runNextTicks (internal/process/task_queues.js:66:3)
    at listOnTimeout (internal/timers.js:518:9)
    at processTimers (internal/timers.js:492:7)

server says "Request openPort failed with internal server error"

Error: 4 DEADLINE_EXCEEDED: Deadline exceeded
    at Object.callErrorFromStatus (/app/node_modules/@grpc/grpc-js/build/src/call.js:31:26)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client.js:179:52)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:336:141)
    at Object.onReceiveStatus (/app/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:299:181)
    at /app/node_modules/@grpc/grpc-js/build/src/call-stream.js:145:78
    at processTicksAndRejections (internal/process/task_queues.js:79:11)

...and more.

@meysholdt added the labels "type: bug" (Something isn't working) and "type: incident" (Gitpod.io service is unstable) on Aug 27, 2021
@meysholdt changed the title from "DEADLINE_EXCEEDED: Errors starting workspaces" to "DEADLINE_EXCEEDED: gRPC communication trouble" on Aug 27, 2021
@csweichel
Contributor

/team workspace

@roboquat added the label "team: workspace" (Issue belongs to the Workspace team) on Aug 30, 2021
@csweichel added the label "priority: highest (user impact)" (Directly user impacting) on Aug 30, 2021
@csweichel
Contributor

/schedule

@roboquat
Contributor

@csweichel: Issue scheduled in the workspace team (WIP: 0)

In response to this:

/schedule

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@geropl
Member

geropl commented Aug 30, 2021

How is this related to this PR we recently deployed? Did it mainly occur after deploying it?

#5322

@csweichel
Contributor

> How is this related to this PR we recently deployed? Did it mainly occur after deploying it?
>
> #5322

I reckon this mostly relates to server, as the workspaces fail to start in the first place, i.e. their StartWorkspace request never makes it to ws-manager.

5 participants