Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: make crypto.timingSafeEqual test less flaky #8456

Conversation

not-an-aardvark
Copy link
Contributor

@not-an-aardvark not-an-aardvark commented Sep 8, 2016

Checklist
  • make -j4 test (UNIX), or vcbuild test nosign (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

crypto

Description of change

WIP; do not merge.

The crypto.timingSafeEqual test still seems to be a bit flaky. This makes a few changes to the test:

  • Separates the basic usage and the benchmarking into different tests
  • Moves the timing-sensitive benchmark function into a separate module, and reparses the module on every iteration of the loop to avoid shared state between timing measurements.

If this doesn't work, an alternative would be to start a separate child process for each individual timing measurement, which would completely avoid shared state between measurements (although it would also probably make the test much more CPU-intensive).

/cc @Trott

Refs: #8040, #8203, #8304

@nodejs-github-bot nodejs-github-bot added the test Issues and PRs related to the tests. label Sep 8, 2016
@Trott
Copy link
Member

Trott commented Sep 8, 2016

Haven't been able (yet) to force failures via stress test repetition but have seen it coming up on full CI runs, so here's three of those:

CI: https://ci.nodejs.org/job/node-test-pull-request/3975/

CI again: https://ci.nodejs.org/job/node-test-pull-request/3976/

CI one more time: https://ci.nodejs.org/job/node-test-pull-request/3977/

const crypto = require('crypto');

const BENCHMARK_FUNC_PATH =
'../fixtures/crypto-timing-safe-equal-benchmark-func';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: common.fixturesDir but probably no need to worry about that until we see if this even works.

@not-an-aardvark
Copy link
Contributor Author

not-an-aardvark commented Sep 9, 2016

  • 3 test timeouts on ARM: 1, 2, 3
  • 1 build failure on Ubuntu 18.04: here

(The test takes a bit longer now because it has to call require() and parse a module 20000 times.)

@Trott
Copy link
Member

Trott commented Sep 9, 2016

Stress test on debian8-x86 with master shows a 30% failure rate. https://ci.nodejs.org/job/node-stress-single-test/894/nodes=debian8-x86/console

Here's a stress test on debian8-x86 against this PR for comparison: https://ci.nodejs.org/job/node-stress-single-test/895/nodes=debian8-x86/console

@Trott
Copy link
Member

Trott commented Sep 9, 2016

Timeout on Raspberry Pi is probably acceptable. A few other tests do this to skip tests on machines with low RAM (like the old Pi devices):

if (!common.enoughTestMem) {
  common.skip(skipMessage);
  return;
}

We can do that on this test too if the Raspberry Pi 1 can't handle it.

@not-an-aardvark
Copy link
Contributor Author

not-an-aardvark commented Sep 9, 2016

I think this stresstest is running the wrong test; it should be test-crypto-timing-safe-equal-benchmarks for this PR since the test file was split into two.

@Trott
Copy link
Member

Trott commented Sep 9, 2016

Whoops, yes, let's re-run that stress test but with the correct test this time: https://ci.nodejs.org/job/node-stress-single-test/nodes=debian8-x86/898/console

@Trott
Copy link
Member

Trott commented Sep 9, 2016

I don't want to monopolize our only debian8-x86 test machine for 14 hours (which is what I'm calculating it would take to run this test 10K times... 10K * 5 seconds = 50K seconds = ~14 hours).

So I'm going to stop it now after ~80 runs without a single failure. Certainly more than enough to feel confident that the failure rate on debian8-x86 will be much less than the 30% we are seeing on current master.

Maybe add the skip code and then we can run regular CI a few more times to see if everything is A-OK?

}

if (!common.enoughTestMem) {
common.skip('skipping memory-intensive test');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: common.skip() prepends text indicating the test has been skipped so the word 'skipping' is redundant. I'm sure this is actually an issue with a bunch of other tests and is fine to leave as-is (hence the 'Nit' prefix) if you wish, as fixing these throughout the tests would probably make a good first contribution for a newcomer anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on a search in the test/ folder it seems like all the other tests actually do this correctly, so I fixed this one to do it correctly as well.

@Trott
Copy link
Member

Trott commented Sep 9, 2016

@not-an-aardvark not-an-aardvark changed the title WIP: test: make crypto.timingSafeEqual test less flaky test: make crypto.timingSafeEqual test less flaky Sep 9, 2016
@Trott
Copy link
Member

Trott commented Sep 9, 2016

CI looks great. (Windows failure looks unrelated, will open separate issue to investigate if one isn't already open.)

@nodejs/testing @nodejs/crypto

@jasnell
Copy link
Member

jasnell commented Sep 9, 2016

LGTM

Trott pushed a commit to Trott/io.js that referenced this pull request Sep 12, 2016
The `crypto.timingSafeEqual` test still seems to be a bit flaky. This
makes a few changes to the test:

* Separates the basic usage and the benchmarking into different tests

* Moves the timing-sensitive benchmark function into a separate module,
and reparses the module on every iteration of the loop to avoid shared
state between timing measurements.

PR-URL: nodejs#8456
Reviewed-By: James M Snell <[email protected]>
@Trott
Copy link
Member

Trott commented Sep 12, 2016

Landed in c678ecb 🎉

@Trott Trott closed this Sep 12, 2016
@not-an-aardvark not-an-aardvark deleted the fix-more-timing-safe-equal-flakes branch September 12, 2016 04:09
Fishrock123 pushed a commit that referenced this pull request Sep 14, 2016
The `crypto.timingSafeEqual` test still seems to be a bit flaky. This
makes a few changes to the test:

* Separates the basic usage and the benchmarking into different tests

* Moves the timing-sensitive benchmark function into a separate module,
and reparses the module on every iteration of the loop to avoid shared
state between timing measurements.

PR-URL: #8456
Reviewed-By: James M Snell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Issues and PRs related to the tests.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants