test: long-running nexmark on madsim #5170

wangrunji0408 · 2022-09-07T05:19:19Z

We hope to run nexmark for a long period in deterministic simulation to find more stability issues.

Some potential challenges:

Limited compute resources: a test can only run on one CPU core because of the determinism requirement.
Limited memory capacity: a long-running task generates a huge amount of data, which can exceed the memory capacity. We may need to dump data to disk.

lmatz · 2022-11-08T14:49:05Z

> limited compute resource: one test can only be run on 1 CPU core due to the determinism requirement.

For the set of computations without any side effects, i.e. pure expression evaluation, can they be executed concurrently on multiple cores?

wangrunji0408 · 2022-11-15T16:48:07Z

I missed it. Sorry for the late response.

> For the set of computations without any side effects, i.e. pure expression evaluation, can they be executed concurrently on multiple cores?

Yes. Unfortunately, it's hard for madsim to identify which tasks are pure computation. In practice, tasks without any side effects almost never exist: they nearly always interact with each other through channels or shared state. Once that happens, we have to fix the order of the two tasks, otherwise determinism is broken. If we could intervene every time a task makes a side effect, parallel execution seems possible. But I feel that it would take a lot of effort, determinism would be hard to guarantee, and I'm afraid it could not be parallelized well given the ubiquitous dependencies. 🥹
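To make the ordering constraint concrete, here is a minimal sketch of a single-threaded scheduler whose interleaving is decided only by a seed. It is not madsim's actual implementation: `Rng`, `run_simulation`, and the "append to a shared log" side effect are all hypothetical names and behavior chosen for illustration.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
struct Rng(u64);

impl Rng {
    fn new(seed: u64) -> Self {
        // Mix the seed and force a non-zero state (xorshift never leaves zero).
        Rng(seed.wrapping_mul(0x9E37_79B9_7F4A_7C15) | 1)
    }

    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Runs `tasks` hypothetical tasks of `steps` steps each on one thread.
/// Every step has a side effect on shared state (it appends the task id
/// to `log`), so the interleaving is observable in the returned trace.
/// The seed alone decides which runnable task goes next.
fn run_simulation(seed: u64, tasks: usize, steps: usize) -> Vec<usize> {
    let mut rng = Rng::new(seed);
    let mut remaining = vec![steps; tasks];
    let mut log = Vec::with_capacity(tasks * steps);
    loop {
        // Collect the tasks that still have work to do.
        let runnable: Vec<usize> = (0..tasks).filter(|&t| remaining[t] > 0).collect();
        if runnable.is_empty() {
            break;
        }
        // Seeded choice of the next task: same seed, same choice.
        let pick = runnable[(rng.next_u64() % runnable.len() as u64) as usize];
        remaining[pick] -= 1;
        log.push(pick);
    }
    log
}
```

Calling `run_simulation(42, 3, 4)` twice yields the same trace both times, because the scheduler is the only source of ordering. If two of these tasks ran on separate cores and both touched `log`, the order of their writes would depend on the OS scheduler instead of the seed, which is the problem described above.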

Looking at it from the other side, simply speeding up execution may not be the right direction for this problem. Concurrency bugs usually have a small depth, which means they can be triggered within a few steps if the schedule is constructed carefully. So they should be found quickly by running many simulations with different seeds. If they aren't, the reason could be that some precondition is not satisfied, for example, the stored data is not large enough to trigger compaction. The only way to meet this condition from scratch is to run data ingestion for a long time. But why do we have to start from scratch? If our simulator supports loading from a checkpoint, we can prepare a large dataset in advance and start directly from there. That's what we plan to do next.
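A toy sketch of this idea, under loudly stated assumptions: `State`, `COMPACTION_THRESHOLD`, `short_run`, and `sweep` are hypothetical, the "bug" is made up, and the seed is used directly as the schedule (bit `i` picks the task at step `i`) rather than feeding a real simulator. It only illustrates why a short seed sweep finds nothing from an empty store but succeeds quickly from a pre-populated checkpoint.

```rust
/// Hypothetical storage state captured by a checkpoint.
#[derive(Clone)]
struct State {
    stored_rows: u64,
}

/// Assumed store size at which compaction starts to run.
const COMPACTION_THRESHOLD: u64 = 1_000_000;

/// Replays one short schedule. Bit `i` of `seed` decides which task runs
/// at step `i`: 0 = ingest one row, 1 = attempt compaction. The made-up
/// bug fires when compaction runs in two consecutive steps on a store
/// that is at least `COMPACTION_THRESHOLD` rows large (depth 2).
fn short_run(seed: u64, start: &State, steps: u32) -> bool {
    let mut state = start.clone();
    let mut prev_compacted = false;
    for i in 0..steps.min(64) {
        let compact = (seed >> i) & 1 == 1;
        if compact && state.stored_rows >= COMPACTION_THRESHOLD {
            if prev_compacted {
                return true; // bug reached
            }
            prev_compacted = true;
        } else {
            if !compact {
                state.stored_rows += 1; // ingestion step
            }
            prev_compacted = false;
        }
    }
    false
}

/// Tries every seed in `seeds` from the same starting state and returns
/// the first seed whose schedule reaches the bug, if any.
fn sweep(start: &State, mut seeds: std::ops::Range<u64>, steps: u32) -> Option<u64> {
    seeds.find(|&seed| short_run(seed, start, steps))
}
```

With `State { stored_rows: 0 }`, `sweep(&state, 0..16, 8)` returns `None`: eight steps can never grow the store to the threshold, so no schedule reaches the bug. Starting from `State { stored_rows: COMPACTION_THRESHOLD }`, the same sweep returns `Some(3)`, because seed 3 schedules compaction at steps 0 and 1. The failing seed can then be replayed on its own with `short_run(3, &state, 8)`.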

lmatz · 2022-11-16T07:31:47Z

Thanks for the detailed explanation!

> If our simulator supports loading from a checkpoint, we can prepare a large dataset in advance and directly start from here. That's what we plan to do next.

It makes sense!

TennyZhuang · 2023-02-27T03:24:33Z

Any updates?

wangrunji0408 · 2023-02-27T04:10:28Z

After some rethinking, I decided to make this issue low-priority, because a long-running test is also slow to reproduce. We wouldn't benefit much from it compared with the existing longevity test. Instead, I have been adding more short-term fault injection tests (e.g. #7623) so that problems can be found more efficiently.

wangrunji0408 added the type/feature label Sep 7, 2022

wangrunji0408 self-assigned this Sep 7, 2022

github-actions bot added this to the release-0.1.13 milestone Sep 7, 2022

wangrunji0408 mentioned this issue Sep 7, 2022

Tracking: deterministic simulation testing #4180

Open

22 tasks

neverchanje mentioned this issue Sep 7, 2022

Automatic micro-testing #5164

Closed

15 tasks

fuyufjh modified the milestones: release-0.1.13, next-release-0.1.14 Sep 26, 2022

wangrunji0408 mentioned this issue Nov 16, 2022

feat: try to enable recovery test in ci #6347

Merged

3 tasks

wangrunji0408 modified the milestones: release-0.1.14, release-0.1.15 Nov 18, 2022

wangrunji0408 modified the milestones: release-0.1.15, release-0.1.16 Dec 19, 2022

wangrunji0408 modified the milestones: release-0.1.16, release-0.1.17 Jan 30, 2023

TennyZhuang modified the milestones: release-0.1.17, next-release-0.1.19 Feb 27, 2023

wangrunji0408 modified the milestones: release-0.19, future-release-0.21 May 19, 2023

fuyufjh added the priority/low label Aug 8, 2023

fuyufjh removed this from the release-1.1 milestone Aug 8, 2023

wangrunji0408 closed this as not planned Jul 31, 2024
