feat: Support batch session timeout #3066

Merged
merged 7 commits into main from topic/11-11-feat_support_batch_session_timeout
Nov 14, 2024

Conversation

fregataa
Member

@fregataa fregataa commented Nov 11, 2024

resolves #2357

Test CLI command

# ./backend.ai session create --type batch -c "echo \"hello world\" && sleep 10" --timeout 5 -t test-batch <IMAGE_NAME>
./backend.ai session create --type batch -c "echo \"hello world\" && sleep 10" --timeout 5 -t test-batch python
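
The command above starts a batch job that sleeps for 10 seconds under a 5-second timeout, so the job is expected to be cut off before it completes. The following is a minimal, POSIX-only sketch of that general behavior using the Python standard library. It is not the Backend.AI implementation: the function name `run_with_timeout`, the `BatchResult` type, and the status strings are all hypothetical and chosen only for illustration.

```python
import os
import signal
import subprocess
import time
from typing import NamedTuple


class BatchResult(NamedTuple):
    status: str      # hypothetical labels: "finished", "failed", or "timed-out"
    output: str      # combined stdout/stderr captured before the command ended
    elapsed: float   # wall-clock seconds


def run_with_timeout(command: str, timeout_sec: float) -> BatchResult:
    """Run a shell command and kill its whole process group on timeout."""
    started = time.monotonic()
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        start_new_session=True,  # new session, so proc.pid is also the group id
    )
    try:
        output, _ = proc.communicate(timeout=timeout_sec)
        status = "finished" if proc.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        # Kill the shell and every child it spawned (e.g. `sleep`).
        os.killpg(proc.pid, signal.SIGKILL)
        # Collect whatever was written before the kill.
        output, _ = proc.communicate()
        status = "timed-out"
    return BatchResult(status, output, time.monotonic() - started)
```

Calling `run_with_timeout('echo "hello world" && sleep 10', 5)` mirrors the test command: the `echo` output is captured, and the call returns with the hypothetical `"timed-out"` status once the 5-second limit passes.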

Result example

Before #3085
The record from the sessions table
image

After #3085
The record from the sessions table
image

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation

📚 Documentation preview 📚: https://sorna--3066.org.readthedocs.build/en/3066/


📚 Documentation preview 📚: https://sorna-ko--3066.org.readthedocs.build/ko/3066/

@github-actions github-actions bot added area:docs Documentations comp:manager Related to Manager component comp:agent Related to Agent component comp:client Related to Client component comp:cli Related to CLI component require:db-migration Automatically set when alembic migrations are added or updated size:L 100~500 LoC labels Nov 11, 2024
Member Author

fregataa commented Nov 11, 2024

This stack of pull requests is managed by Graphite.

@fregataa fregataa added this to the 24.12 milestone Nov 11, 2024
@fregataa fregataa requested a review from agatha197 November 11, 2024 05:31
@fregataa fregataa changed the title Support batch session timeout feat: Support batch session timeout Nov 11, 2024
@fregataa fregataa marked this pull request as ready for review November 11, 2024 05:35
Contributor

@agatha197 agatha197 left a comment


How can I test this? Could you update the PR description?

@fregataa fregataa force-pushed the topic/11-11-feat_support_batch_session_timeout branch from 6511022 to b506512 Compare November 12, 2024 07:50
@fregataa fregataa changed the base branch from main to topic/11-12-fix_session_status_info_not_reflecting_batch_failures November 12, 2024 07:50
@fregataa fregataa force-pushed the topic/11-12-fix_session_status_info_not_reflecting_batch_failures branch from e948dd1 to c597151 Compare November 12, 2024 08:02
@fregataa fregataa force-pushed the topic/11-11-feat_support_batch_session_timeout branch from b506512 to 022612c Compare November 12, 2024 08:02
@fregataa fregataa requested a review from agatha197 November 12, 2024 08:12
agatha197

This comment was marked as duplicate.

Contributor

@agatha197 agatha197 left a comment


LGTM

Base automatically changed from topic/11-12-fix_session_status_info_not_reflecting_batch_failures to main November 13, 2024 08:34
@fregataa fregataa force-pushed the topic/11-11-feat_support_batch_session_timeout branch from 022612c to 8fe6356 Compare November 13, 2024 08:46
@fregataa fregataa added this pull request to the merge queue Nov 14, 2024
agatha197 added a commit to lablup/backend.ai-webui that referenced this pull request Nov 14, 2024
resolves #2812
related PR: lablup/backend.ai#3066

Add a batch job timeout duration to the session launcher.
Users can select a time unit and enter the timeout duration.

**How to test:**
> test endpoint: 10.82.230.49
1. Modify the supported version of `batch-timeout` in `backend.ai-client-esm` for testing. (24.09)
2. Set the session type to batch.
3. Enable `Batch Job Timeout Duration`.
  - e.g.
    start command: sleep 20 && echo \"hello world\"
    timeout: 3s
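
The launcher takes a numeric value plus a time unit (the example uses `3s`). As a rough illustration of how such input could be validated and normalized, here is a small Python sketch. It is not the web UI's code: the helper name `parse_duration` and the set of unit suffixes (`s`, `m`, `h`, `d`) are assumptions, since the units the launcher actually offers are not listed here.

```python
import re
from datetime import timedelta

# Assumed unit suffixes; the real launcher may offer a different set.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

_DURATION_RE = re.compile(r"(\d+(?:\.\d+)?)\s*([a-zA-Z])")


def parse_duration(text: str) -> timedelta:
    """Convert a string such as '3s' or '1.5h' into a timedelta."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid duration: {text!r}")
    value, unit = float(match.group(1)), match.group(2).lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unknown time unit: {unit!r}")
    if value <= 0:
        raise ValueError("timeout must be positive")
    return timedelta(seconds=value * _UNIT_SECONDS[unit])
```

Under these assumptions, `parse_duration("3s")` yields a 3-second `timedelta`, while malformed input such as `"3x"` or `"abc"` raises `ValueError`.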

**Checklist:**
- [ ] Batch job timeout value remains after refreshing
- [ ] The 'Confirm and Launch' page has the same value if you set the 'Batch job timeout duration'.
- [ ] (API is not implemented yet) Create a new session with batch session timeout option.

**Screenshots:**
![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/2HueYSdFvL8pOB5mgrUQ/5adbf314-e521-4410-bed8-075124fad6c8.png)

![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/2HueYSdFvL8pOB5mgrUQ/c9dd0f12-695e-4960-ba98-9b6371c6a47a.png)

![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/2HueYSdFvL8pOB5mgrUQ/4075faa3-a61e-4ce3-833c-c858e6d6b75a.png)

![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/2HueYSdFvL8pOB5mgrUQ/600fd8cb-0040-43e4-b8fe-307730d292c0.png)

![image.png](https://graphite-user-uploaded-assets-prod.s3.amazonaws.com/2HueYSdFvL8pOB5mgrUQ/55116de0-e7eb-4ef7-8077-abde8ea73b3b.png)

**Checklist:**

- [ ] Documentation
- [ ] Test case: Verify timeout can be enabled/disabled
- [ ] Test case: Confirm time picker accepts valid duration formats
- [ ] Test case: Check default value initialization
- [ ] Test case: Validate timeout enforcement on batch sessions
Merged via the queue into main with commit 5e4d821 Nov 14, 2024
23 of 24 checks passed
@fregataa fregataa deleted the topic/11-11-feat_support_batch_session_timeout branch November 14, 2024 11:04
Labels
area:docs Documentations comp:agent Related to Agent component comp:cli Related to CLI component comp:client Related to Client component comp:manager Related to Manager component require:db-migration Automatically set when alembic migrations are added or updated size:L 100~500 LoC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

User-defined timeout for batch sessions
2 participants