Improve the Correctness & Performance of Time Formatting #576

Merged
merged 1 commit into from
Oct 5, 2022

Conversation

CryZe
Collaborator

@CryZe CryZe commented Oct 5, 2022

This improves the correctness and the performance of the time formatting by making use of the fact that the underlying times are represented as a pair of integers (the seconds and the subsecond nanoseconds). Instead of formatting the fractional part as a floating point number, we can read it directly from the nanoseconds. On top of that, we can improve the performance a little by removing the data dependency that we introduced by calculating the hours from the total minutes (and so on), when we can just calculate the hours directly from the total seconds. This allows for a little bit of instruction-level parallelism, where the days, hours, minutes and seconds can all be calculated in parallel.
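
To illustrate the idea, here is a minimal sketch and not the actual `livesplit-core` code. The `Time` struct, its field names and the output layout are assumptions made up for this example, and it only handles non-negative times. Every component is derived straight from the total seconds, so none of the divisions has to wait for another one, and the fraction comes from the integer nanoseconds.

```rust
use std::fmt;

/// Hypothetical stand-in for a time stored as a pair of integers.
/// The real type in `livesplit-core` may look different.
struct Time {
    /// Whole seconds (assumed non-negative in this sketch).
    seconds: u64,
    /// Subsecond part, always below 1_000_000_000.
    nanoseconds: u32,
}

impl fmt::Display for Time {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let total = self.seconds;

        // Each value depends only on `total`, not on a previously
        // computed component, so the CPU can overlap the divisions.
        let seconds = total % 60;
        let minutes = (total / 60) % 60;
        let hours = (total / 3_600) % 24;
        let days = total / 86_400;

        // The fraction is truncated from the integer nanoseconds, so no
        // floating point rounding can turn `59.9995` into `60.000`.
        let millis = self.nanoseconds / 1_000_000;

        if days > 0 {
            write!(f, "{days}d {hours}:{minutes:02}:{seconds:02}.{millis:03}")
        } else if hours > 0 {
            write!(f, "{hours}:{minutes:02}:{seconds:02}.{millis:03}")
        } else {
            write!(f, "{minutes}:{seconds:02}.{millis:03}")
        }
    }
}
```

With this sketch, `Time { seconds: 3_725, nanoseconds: 123_456_789 }.to_string()` yields `1:02:05.123`.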

@CryZe CryZe added the enhancement (An improvement for livesplit-core.) and performance (Affects the performance of the code.) labels Oct 5, 2022
@CryZe CryZe merged commit 66fab23 into LiveSplit:master Oct 5, 2022
@CryZe CryZe deleted the integer-time-formatting branch October 5, 2022 23:42
@CryZe CryZe added this to the v0.13 milestone Oct 5, 2022
CryZe added a commit to CryZe/livesplit-core that referenced this pull request Oct 6, 2022
Going through the standard library's formatter is usually quite slow.
Most of the time we want to show minutes or seconds that are in the
range from 0 to 59, or hundredths from 0 to 99. For those we can just
define a lookup table of the formatted strings and directly call
`write_str` on the formatter instead of going through the `write!`
macro, which requires a lot more setup. Additionally we already have
`itoa` as an indirect dependency, which we now use directly for
formatting integers that are not bounded that way. `itoa` generally
uses the same algorithm as `std` but does so without going through any
formatting machinery and is thus a lot faster, but also much less
customizable.
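
The lookup table approach could look roughly like the following sketch. The names `build_table`, `TWO_DIGITS` and `write_two_digits` are hypothetical and not taken from the commit. To keep the example free of dependencies, values outside the table fall back to `write!` here, whereas the commit uses `itoa` for those.

```rust
use std::fmt::{self, Write};

/// Builds the ASCII digits for every value from 0 to 99 at compile time.
/// (Hypothetical helper, not part of `livesplit-core`.)
const fn build_table() -> [[u8; 2]; 100] {
    let mut table = [[0u8; 2]; 100];
    let mut i = 0;
    while i < 100 {
        table[i] = [b'0' + (i / 10) as u8, b'0' + (i % 10) as u8];
        i += 1;
    }
    table
}

/// Entry `n` holds the zero-padded two digit text of `n`.
static TWO_DIGITS: [[u8; 2]; 100] = build_table();

/// Writes `value` zero-padded to two digits using a single `write_str`
/// call when it is in the table's range.
fn write_two_digits<W: Write>(out: &mut W, value: u8) -> fmt::Result {
    match TWO_DIGITS.get(value as usize) {
        // The table only contains ASCII digits, so the conversion to
        // `&str` cannot fail in practice.
        Some(bytes) => {
            out.write_str(std::str::from_utf8(bytes).map_err(|_| fmt::Error)?)
        }
        // Out of range: fall back to the general formatting machinery
        // (this sketch's substitute for `itoa`).
        None => write!(out, "{value}"),
    }
}
```

Writing `7`, a `':'` and then `59` into a `String` this way produces `07:59`. The sketch still validates UTF-8 on each lookup, which a table of string slices would avoid.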

Overall this and LiveSplit#576 together result in a `~3.86x` performance
improvement when formatting a time.

| When        |        Time |
|-------------|------------:|
| Both PRs    | `59.205 ns` |
| Previous PR | `154.63 ns` |
| Before      | `228.47 ns` |
CryZe added a commit that referenced this pull request Oct 6, 2022
CryZe added a commit that referenced this pull request Dec 29, 2022
- The `livesplit-hotkey` crate is now documented. (@CryZe)
  [#479](#479)
- Not every key press emits a scan code on Windows. For those the
  virtual key code is now translated to a scan code. (@CryZe)
  [#480](#480)
- Time parsing is now a lot more robust, handles more edge cases, and is
  also a lot more accurate. (@CryZe)
  [#483](#483) and
  [#578](#578)
- When parsing a GDI based font name, platforms other than Windows now
  don't attempt to parse "normal" as part of the font name anymore as it
  is too ambiguous. It could either refer to a font weight or stretch.
  (@kadiwa4)
  [#487](#487)
- The text engine can now be customized. You can either provide your own
  text engine or use the one provided by `livesplit-core`. The one
  provided is now behind the `path-based-text-engine` and converts all
  glyphs to paths that can easily be drawn. (@CryZe)
  [#495](#495)
- The path based text engine now caches the width of digits for tabular
  numbers, as well as the ellipsis glyph and its width, so that they can
  be laid out faster. (@kadiwa4)
  [#490](#490) and
  [#499](#499)
- On Windows GDI is now used to resolve GDI based font names. (@CryZe)
  [#500](#500)
- (Total) Possible Time Save now properly indicates that it's updating
  frequently. This results in faster rendering times. (@kadiwa4)
  [#501](#501)
- Initial support for auto splitting has landed in `livesplit-core`.
  Auto splitters are provided as WebAssembly modules. Support can be
  activated via the `auto-splitting` feature. (@P1n3appl3)
  [#477](#477)
- Auto splitting is also supported via the C API when activating its
  `auto-splitting` feature. (@DarkRTA)
  [#503](#503)
- A watchdog for the Auto Splitting Runtime was added which unloads
  scripts that aren't responsive. (@CryZe)
  [#528](#528)
- Splits and layouts can now be parsed and saved on `no_std` platforms.
  (@CryZe) [#532](#532)
- The splits component column labels can now be queried via the C API.
  (@MichaelJBerk)
  [#526](#526)
- The Software Renderer is now supported on `no_std` platforms. (@CryZe)
  [#536](#536)
- The parsers are now faster because they don't allocate as much memory
  anymore. (@CryZe)
  [#546](#546)
- The auto splitters have unstable support for the `WebAssembly System
  Interface` via the `unstable-auto-splitting` feature. (@CryZe)
  [#547](#547)
- The Timer component can now use the color of the delta for its
  background. (@Hurricane996)
  [#539](#539)
- The splits component now takes the font into account when calculating
  the width of the columns. (@Hurricane996)
  [#550](#550)
- The `Resource Allocator` now decodes the images, allowing the
  underlying renderer to do the encoding by itself. (@CryZe)
  [#562](#562)
- Cargo's `--crate-type` parameter is now used to build the C API.
  (@CryZe) [#565](#565)
- The columns of the splits component can now show the custom variables.
  (@CryZe) [#566](#566)
- On the web, the `keydown` event may not always pass a `KeyboardEvent`
  despite the specification saying that this should be the case. This is
  now properly handled. (@CryZe)
  [#567](#567)
- An integer overflow in the `FuzzyList` used for searching game and
  category names has been fixed. (@CryZe)
  [#569](#569)
- The way the background is handled in the Detailed Timer component has
  been fixed. (@CryZe)
  [#572](#572)
- The times are now formatted as strings without going through floating
  point numbers which increases both the correctness and the
  performance. (@CryZe)
  [#576](#576)
- Instead of using `core::fmt` formatting machinery to format the times
  as strings, we now use a custom implementation that's much faster.
  (@CryZe) [#577](#577)
  and [#580](#580)
- Holding down a hotkey on Windows now doesn't cause it to be triggered
  over and over again. Other platforms already behaved this way.
  (@CryZe) [#584](#584)
- The `base64` crate is now replaced with `base64-simd` which uses SIMD
  to speed up the decoding of the images. (@CryZe)
  [#585](#585)
- Splits from `SpeedRunIGT`, which is a Minecraft speedrunning mod, can
  now be parsed. (@CryZe)
  [#591](#591)
- It turns out using `evdev` for the hotkeys on Linux requires the user
  to be in the `input` group, which is not always the case. Therefore we
  now fall back to `X11` if `evdev` is not usable. (@CryZe)
  [#592](#592)
- When an auto splitter wants to attach to a Process by name, the start
  time and process id are now used to prioritize duplicate processes.
  (@Eein) [#589](#589)
- It is now possible to resolve the key codes to the particular name of
  the key based on the current keyboard layout on Linux and the web.
  This was already the case on Windows and macOS. (@CryZe)
  [#594](#594) and
  [#595](#595)
- It is now possible to trust the user of the C API to always pass valid
  UTF-8 strings to the C API via the optional
  `assume-str-parameters-are-utf8` feature. This is also always the case
  when using WebAssembly on the web. This improves the performance
  because no validation of the strings is necessary. (@CryZe)
  [#597](#597)
- There is now a new `max-opt` cargo profile that can be used to
  maximally optimize the resulting executable. The release profile is
  now using its default configuration again. (@CryZe)
  [#598](#598)
- When encountering images `livesplit-core` checks their dimensions to
  potentially automatically shrink them if they are larger than
  necessary. It turns out that checking the dimensions of PNG images was
  a lot less efficient than it could have been. This even improves
  parsing speed of entire splits files by up to 30%. (@CryZe)
  [#600](#600)
- The documentation now uses links to types mentioned. (@Eein)
  [#596](#596)
- Auto splitters can now query the size of the modules of a process.
  (@CryZe) [#602](#602)
- The log messages emitted by auto splitters can now be consumed
  directly instead of always being emitted via the `log` crate. (@CryZe)
  [#603](#603)
- The auto splitters can provide settings that can be configured. For
  now the auto splitters need to be reloaded when the settings change.
  (@CryZe) [#606](#606)
- The file path used to be tracked in the `Run`, but no frontend even
  used this. So it has been removed. (@CryZe)
  [#616](#616)
- The documentation states that the title component's lines store the
  unabbreviated line as their last element. This was not actually the
  case and has been fixed. (@DarkRTA)
  [#615](#615)