Skip to content

Commit

Permalink
Improve Node#each_recursive performance (#139)
Browse files Browse the repository at this point in the history
Fix #134

## Summary

This PR does:

- Add `benchmark/each_recursive.yaml`
- Rewrite `Node#each_recursive` implementation for performance
- Add a test for `Node#each_recursive`

The performance of `Node#each_recursive` is improved 60~80x faster.

## Details

`each_recursive` is too much slow as I described in #134.
I improved this performance by rewriting its implementation in this PR.

Also, I added a benchmark in `benchmark/each_recursive.yaml` and the
following is a result on my laptop:

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/makenowjust/Projects/github.com/makenowjust/simple-dotfiles/.asdf/installs/ruby/3.3.2/bin/ruby -v -S benchmark-driver /Users/makenowjust/Projects/github.com/ruby/rexml/benchmark/each_recursive.yaml
ruby 3.3.2 (2024-05-30 revision e5a195edf6) [arm64-darwin23]
Calculating -------------------------------------
                     rexml 3.2.6      master  3.2.6(YJIT)  master(YJIT)
      each_recursive      11.279     686.502       17.926        1.470k i/s -     100.000 times in 8.866303s 0.145666s 5.578360s 0.068018s

Comparison:
                   each_recursive
        master(YJIT):      1470.2 i/s
              master:       686.5 i/s - 2.14x  slower
         3.2.6(YJIT):        17.9 i/s - 82.01x  slower
         rexml 3.2.6:        11.3 i/s - 130.35x  slower

```

We can see that the performance is improved 60~80x faster.

Additionally, I added a new test for `Node#each_recursive`.
It was missing, but we need it to confirm not to break the previous
behavior.

Thank you.

---------

Co-authored-by: Sutou Kouhei <[email protected]>
  • Loading branch information
makenowjust and kou authored Jun 7, 2024
1 parent da67561 commit dab8065
Show file tree
Hide file tree
Showing 3 changed files with 84 additions and 4 deletions.
40 changes: 40 additions & 0 deletions benchmark/each_recursive.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
loop_count: 100
contexts:
- gems:
rexml: 3.2.6
require: false
prelude: require 'rexml'
- name: master
prelude: |
$LOAD_PATH.unshift(File.expand_path("lib"))
require 'rexml'
- name: 3.2.6(YJIT)
gems:
rexml: 3.2.6
require: false
prelude: |
require 'rexml'
RubyVM::YJIT.enable
- name: master(YJIT)
prelude: |
$LOAD_PATH.unshift(File.expand_path("lib"))
require 'rexml'
RubyVM::YJIT.enable
prelude: |
require 'rexml/document'
xml_source = +"<root>"
100.times do
x_node_source = ""
100.times do
x_node_source = "<x>#{x_node_source}</x>"
end
xml_source << x_node_source
end
xml_source << "</root>"
document = REXML::Document.new(xml_source)
benchmark:
each_recursive: document.each_recursive { |_| }
12 changes: 8 additions & 4 deletions lib/rexml/node.rb
Original file line number Diff line number Diff line change
Expand Up @@ -52,10 +52,14 @@ def parent?

# Visit all subnodes of +self+ recursively
def each_recursive(&block) # :yields: node
self.elements.each {|node|
block.call(node)
node.each_recursive(&block)
}
stack = []
each { |child| stack.unshift child if child.node_type == :element }
until stack.empty?
child = stack.pop
yield child
n = stack.size
child.each { |grandchild| stack.insert n, grandchild if grandchild.node_type == :element }
end
end

# Find (and return) first subnode (recursively) for which the block
Expand Down
36 changes: 36 additions & 0 deletions test/test_document.rb
Original file line number Diff line number Diff line change
Expand Up @@ -209,6 +209,42 @@ def test_gt_linear_performance_attribute_value
end
end

def test_each_recursive
xml_source = <<~XML
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root name="root">
<x name="1_1">
<x name="1_2">
<x name="1_3" />
</x>
</x>
<x name="2_1">
<x name="2_2">
<x name="2_3" />
</x>
</x>
<!-- comment -->
<![CDATA[ cdata ]]>
</root>
XML

expected_names = %w[
root
1_1 1_2 1_3
2_1 2_2 2_3
]

document = REXML::Document.new(xml_source)

# Node#each_recursive iterates elements only.
# This does not iterate XML declerations, comments, attributes, CDATA sections, etc.
actual_names = []
document.each_recursive do |element|
actual_names << element.attributes["name"]
end
assert_equal(expected_names, actual_names)
end

class WriteTest < Test::Unit::TestCase
def setup
@document = REXML::Document.new(<<-EOX)
Expand Down

0 comments on commit dab8065

Please sign in to comment.