Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[aptos-workspace-server] various fixes and improvements #15571

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

vgao1996
Copy link
Contributor

@vgao1996 vgao1996 commented Dec 12, 2024

This PR implements several fixes and improvements to aptos-workspace-server:

  • Shared Docker Network:
    • Instead of creating a new Docker network for each instance, we now use a shared network (aptos-workspace).
    • This helps avoid hitting Docker's default network limit (30 networks per host).
  • Error Propagation Fixes:
    • Fixed issues with error propagation so we can know why services actually fail
  • Automatic Docker Image Pulling:
    • Docker images are now automatically pulled if they do not already exist locally.
  • Shared Docker Client Initialization:
    • All docker operations now await a single shared future to initialize the Docker client, rather than creating separate instances.
  • Timeout:
    • Automatically initiates the shutdown sequence when a timeout is reached. This gives us a chance to tear down the server in case the front-end is not responding, hopefully avoiding slowing down the user's computer semi-permanently.

@vgao1996 vgao1996 requested review from banool and 0xmaayan December 12, 2024 02:40
Copy link

trunk-io bot commented Dec 12, 2024

⏱️ 51m total CI duration on this PR
Job Cumulative Duration Recent Runs
rust-move-tests 13m 🟩
rust-move-tests 13m 🟩
rust-move-tests 13m
rust-cargo-deny 5m 🟩🟩🟩
check-dynamic-deps 4m 🟩🟩🟩🟩
semgrep/ci 1m 🟩🟩🟩🟩
general-lints 1m 🟩🟩🟩
file_change_determinator 40s 🟩🟩🟩
permission-check 12s 🟩🟩🟩🟩
permission-check 11s 🟩🟩🟩🟩
rust-move-tests 1s

settingsfeedbackdocs ⋅ learn more about trunk.io

@vgao1996 vgao1996 changed the title [aptos-workspace-server] fix error propagation + pull docker images + use only 1 docker network [aptos-workspace-server] various fixes and improvements Dec 17, 2024
@vgao1996 vgao1996 marked this pull request as ready for review December 17, 2024 21:38
future::Future,
net::{IpAddr, Ipv4Addr},
sync::Arc,
};

/// An wrapper to ensure propagation of chain of errors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
/// An wrapper to ensure propagation of chain of errors.
/// A wrapper to ensure propagation of chain of errors.

Comment on lines +204 to +205
#[arg(long, default_value_t = 1800)]
timeout: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment and maybe better naming could be nice here. My first thought was that this would be a startup or shutdown timeout, but it really seems like a "max lifespan".

I don't fully know exactly how you use workspace, but I suppose there is a good reason that we always do this, vs making it optional?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it'd be good to suffix whatever name we choose with _secs

bollard::errors::Error::DockerResponseServerError {
status_code: 409, ..
} => {
println!("Docker network {} already exists, not creating it", name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.


handle
.await
.map_err(|err| anyhow!("failed to join task handle: {}", err))?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map_err(|err| anyhow!("failed to join task handle: {}", err))?
.map_err(|err| anyhow!("failed to join task handle: {:#}", err))?

I think this is the best way to get the most context? Not 100% sure though.

},
Err(_err) => {
println!(
"Docker image {} does not exist. Pulling image..",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny nit

Suggested change
"Docker image {} does not exist. Pulling image..",
"Docker image {} does not exist. Pulling image...",

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants