Questions 😅 #1

Open
mindplay-dk opened this issue Aug 16, 2023 · 8 comments

@mindplay-dk

First off, this looks really promising - I'm hoping this might be exactly what I've been looking for.

Okay, so, question 1:

It's not actually clear to me how this API achieves isolation of the benchmarked dependencies. Do you have to create multiple scopes, and then run your tests individually, or what's the idea?

I mean, you only get to set up the arguments for your tests once, in the constructor argument to Scope - so if you have different units you want to test, and you need to avoid loading them to avoid cross-pollution, well, how?

import { IsoBench } from "iso-bench"
import { function1 } from "module1"
import { function2 } from "module2"

const scope = new IsoBench.Scope({}, () => [function1, function2])

scope
  .add('function1', (function1) => {
    function1()
  })
  .add('function2', (_, function2) => {
    function2()
  })
  .result()
  .run()

I mean, this doesn't effectively isolate function1 from function2, does it? They've both been loaded - so even if you're not using them both in each test, there will be cross-pollution here, or not?

Question 2:

How do you get the results? The run method returns Promise<void>.

Do they just get printed on screen, or what's the idea?

Question 3:

Any idea if this should work with tsx aka typescript-execute?

All I've managed to get thus far is a ReferenceError saying the symbol is not defined.

I noticed you're compiling ahead-of-time with tsc, and I don't fully understand the V8 wizardry you're doing with this library, so I'm not sure if this is expected to work or not?

I tried copy-pasting some examples from your test and couldn't get those to work either.


Here's a repo with a minimal preliminary iso-bench setup for the thing I'm trying to benchmark:

https://github.com/mindplay-dk/sigma/blob/try-iso-bench/benchmarks/src/json/index.ts

When I run it, it just prints:

parseSigmaDefer - ReferenceError: import_sigma is not defined
[TESTS COMPLETED]

I tried without the async/await wrapper as well - also not sure if that's expected to work or not? But I figured, if I want to test these two functions, it can't happen in the same Scope instance, since that forces me to create both test subjects at the same time?

If you can help me figure this out, I'd like to help improve the README - it spends a lot of time framing the problem and explaining implementation details, and it's great to have that information somewhere, but it's probably not what most people need first when they want to start writing a benchmark.

I'm trying to solve the benchmarking problem for the sigma project that I'm currently contributing to, and this might make a good first showcase for this library.

If I can get it working, I also might hop in and try to help with the library itself - it doesn't look like it's doing much in terms of statistics on the actual measurements, it's just an average, I think? I have some code lying around that could probably improve the stability of the output numbers. 🙂

@Llorx
Owner

Llorx commented Aug 16, 2023

EDIT: This applies to v1. v2 has a simpler API.

First off, this looks really promising - I'm hoping this might be exactly what I've been looking for.

Hello! I'm glad someone noticed the very same problem I did ^^. I created this tool just to fit my needs, and added a bit of documentation because why not, but I didn't really plan for anyone else to use it haha.

It's not actually clear to me how this API achieves isolation of the benchmarked dependencies. Do you have to create multiple scopes, and then run your tests individually, or what's the idea?

You just need to create one Scope, and each test in that Scope is going to be isolated from the other tests in the same Scope (and also isolated from other Scopes, of course). The isolation is achieved by forking the process, so an individual process is used for each test. The Scope is the "window" from the main process to these subprocesses, so the Scope setup function can't access the main scope. If you have items in your main scope, you just need to send them to the Scope setup function by adding extra arguments to the Scope constructor.

These arguments are going to be "sent" to the subprocesses, so they need to be serializable by V8. If you happen to have non-serializable values, such as a function to be tested, you need to import/require that function inside the Scope setup function.

I modified your test and it works now:

import { IsoBench } from 'iso-bench'

import { SAMPLE } from './@sample'

(async () => {
  const scope = new IsoBench.Scope({
    minMs: 1000,
    __dirname: __dirname
  }, (SAMPLE) => {
    const {parse} = require("./sigma");
    return [parse, SAMPLE];
  }, SAMPLE);
  await scope
    .add('parseSigmaDefer', (parseSigmaDefer, SAMPLE) => {
      parseSigmaDefer(SAMPLE)
    })
    .result()
    .run()
})();

As you can see, the SAMPLE is sent to the Scope without a problem because it is a plain object, which works perfectly with v8.serialize.

On the other hand, the parse function in the ./sigma file needs to be required from the Scope setup, as functions cannot be serialized. For this to work I need to set the Scope __dirname option to the actual __dirname, so that the requires inside the Scope setup are resolved relative to this folder.
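
Just to illustrate the constraint with plain Node.js (this is not iso-bench code, only the serializer it relies on): plain data round-trips through v8.serialize, while functions throw, which is why parse has to be required inside the setup instead of being passed in:

import * as v8 from "v8"

// Plain data (objects, arrays, strings, numbers...) can be serialized and
// sent to a forked subprocess:
const buf = v8.serialize({ sample: [1, 2, 3] })
console.log(v8.deserialize(buf)) // { sample: [ 1, 2, 3 ] }

// Functions cannot be cloned, so trying to send one this way throws:
try {
  v8.serialize((x: string) => x.length)
} catch (err) {
  console.error("functions are not serializable:", (err as Error).message)
}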

How do you get the results? The run method returns Promise<void>.
Do they just get printed on screen, or what's the idea?

Yes, that's the idea for now. I was thinking about adding an output processor, so you can log it, create a table, create an image or whatever you want, but right now it just spits the output to console.log.

If I can get it working, I also might hop in and try to help with the library itself - it doesn't look like it's doing much in terms of statistics on the actual measurements, it's just an average, I think? I have some code lying around that could probably improve the stability of the output numbers. 🙂

That would be awesome ^^. As I said, I created this tool to fit my own needs, so it is a bit clunky. I didn't expect someone from "the real world" to run it, so I'm actually excited hahaha, and yours seems like a nice test case for this tool. All help is very much appreciated.

PS: I'm thinking of another way of having isolated processes, but with way less hassle than this. Trying it right now. Maybe in 2 days we'll have another update with a better API.
EDIT: New API done :-P

@Llorx
Owner

Llorx commented Aug 16, 2023

@mindplay-dk Released v2.0.1 with a new, easier API. Not backwards compatible, but without the confusing scoping problems.

Your code turns into this with the new version:

import { IsoBench } from 'iso-bench';

import { SAMPLE } from './@sample';
import { parse as parseSigmaDefer } from './sigma';

(async () => {
  const bench = new IsoBench("My bench", { minMs: 2000 });
  await bench
    .add('parseSigmaDefer', () => {
      parseSigmaDefer(SAMPLE)
    })
    .run()
})();

Way simpler :-)

Still, there's no output management and such.

@mindplay-dk
Author

This looks much simpler! Will give this a try later today. 🙂👍

If this pans out, what do you think about making the run function just return the raw numbers?

This way, you can keep this library in the realm of "do one thing and do it well" - I could then look into fixing or forking benny or benchmark by integrating this, without you needing to get involved in a lot of opinionated math and reporting features and so on.
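
For example (purely a sketch of what I mean, not an existing iso-bench API), run could resolve with something like:

// Hypothetical return shape for run(), just to illustrate "raw numbers" -
// none of these names exist in iso-bench today:
interface RawTestResult {
  name: string       // label passed to add()
  samples: number[]  // elapsed milliseconds per measured batch
  cycles: number     // iterations executed per batch
}

// Callers could then do their own statistics and reporting, e.g.:
// const results: RawTestResult[] = await bench.run()
// const opsPerSec = results.map(r => r.cycles / (Math.min(...r.samples) / 1000))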

@Llorx
Copy link
Owner

Llorx commented Aug 17, 2023

@mindplay-dk Sure! That's cool because you can use built-in processors to output these raw numbers to the console, a file, HTML or whatever, or you can build your own processor.
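
Something along these lines, for example (hypothetical names only, nothing shipped yet):

// Hypothetical only: a pluggable "processor" receives the raw results and
// decides how to present them (console, file, HTML, ...):
type RawResult = { name: string; totalMs: number; cycles: number }

interface Processor {
  onResult(result: RawResult): void    // called as each test finishes
  onEnd(results: RawResult[]): void    // called once the whole run is done
}

// A trivial console processor:
const consoleProcessor: Processor = {
  onResult: r => console.log(`${r.name}: ${(r.cycles / (r.totalMs / 1000)).toFixed(0)} ops/s`),
  onEnd: () => console.log("[TESTS COMPLETED]"),
}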

@mindplay-dk
Author

Okay, but wait, so version 2 no longer works with browsers?

But it seems like what you're saying in the benny thread is that it didn't really work completely with browsers as it was anyway?

So still no browser support.

Hm, yeah, solving this is a bit of a different project in itself though... probably would involve something like Playwright and opening a dedicated tab for each test - or, as you suggested, reloading the page, which is something I've done before, but that's also a bit of a different project. 🤔

Actually, I have to wonder if forking is even strictly enough - since, technically, the benchmark code itself could have effects on the code you're benchmarking... and, strictly speaking, it could have different effects on different scripts.

Ideally, probably every benchmark should actually be authored as a fully isolated, stand-alone script - but then measuring the time it takes to run would require modifying the running code... I guess, if you really wanted a totally robust V8 benchmarking framework, it would need to be written in C++ to make sure there is nothing but the engine itself affecting the timing of the benchmarked scripts... but that is definitely a project in itself. 😅

Well, one thing at a time. I will test this later today in the context of this project and see if the numbers look reliable for this benchmark - and if so, I will probably submit a little PR to return the raw numbers, and then just hand-roll the rest.

@mindplay-dk
Author

I can confirm that the 2.0.1 release yields reliable results for these benchmarks. 🙂👍

Thanks! I'll prepare a PR to expose the raw numbers.

@Llorx
Owner

Llorx commented Aug 17, 2023

Nice! Going to keep this issue open, as it has useful information, until I clear up the repo.

@Llorx
Owner

Llorx commented Aug 17, 2023

Okay, but wait, so version 2 no longer works with browsers?

Since I removed support for worker threads and switched to forks, it stopped working in browsers. That change was made in version 1.

Actually, I have to wonder if forking is even strictly enough - since, technically, the benchmark code itself could have effects on the code you're benchmarking... and, strictly speaking, it could have different effects on different scripts.

Ideally, probably every benchmark should actually be authored as a fully isolated, stand-alone script - but then measuring the time it takes to run would require modifying the running code... I guess, if you really wanted a totally robust V8 benchmarking framework, it would need to be written in C++ to make sure there is nothing but the engine itself affecting the timing of the benchmarked scripts... but that is definitely a project in itself. 😅

That's close to how v1 works. It only ran the bare minimum JavaScript needed to get the metrics (just the time difference between runs: take the time, run, then subtract the time again), with no extra additions apart from the user setup before the tests. But the Scope setup is complicated to understand and prepare, especially if you just want to test basic things quickly. I've found myself reading my own README to implement some tests after some time without using iso-bench 😅
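
The core of that measurement is nothing more than this (a generic sketch, not the actual iso-bench implementation):

// Measure how long `cycles` iterations of `fn` take, in milliseconds:
function measure(fn: () => void, cycles: number): number {
  const start = process.hrtime.bigint() // nanosecond-resolution timestamp
  for (let i = 0; i < cycles; i++) {
    fn()
  }
  const end = process.hrtime.bigint()
  return Number(end - start) / 1e6      // elapsed milliseconds
}

// e.g. const ms = measure(() => JSON.parse(sampleJson), 10_000)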

If you want to spend time preparing a complete benchmark because it is going to be important, then you can go with v1, as it is almost pollution-free. v2 is really lightweight though, with 0 dependencies, so I expect it not to pollute regular performance tests. At least the tests of mine that get polluted with default benchmark tools are still pollution-free with v2. Anyway, V8 gets updated with new optimizations every now and then, so I'm not sure how this is going to affect v2 over time.

I'm going to add some automatic tests that check some typically polluted scenarios to see if they are still pollution-free with iso-bench, and when a new Node.js version comes out, they will run on it to check whether it still works as expected.

Also, as you say, maybe a benchmarking tool for Node.js, but built in native code, is not a bad idea. I love C actually :P. To the TODO list!