Cut memory usage in half by iterating arrays on the fly #66

daniel-rusu · 2023-10-20T18:48:49Z

Instead of copying collections by pushing each array element individually onto the stack (in the MeasurementStack), we could create a pushArray method in the MeasurementStack that keeps track of the entire array reference instead of copying all the elements from the array. The measurement stack could keep track of the index into the array so when the pop method is called and an array is present then fetch that element from the array at that index and decrement the index. When the index becomes negative then actually pop the array itself.

This will make analyzing collections much more efficient in both time and space.

The text was updated successfully, but these errors were encountered:

daniel-rusu · 2023-10-20T18:54:49Z

An alternative approach would be to have 2 measurement stacks, one for arrays and the other for regular objects so the logic would look something like this:

while (arrayStack.isNotEmpty() || objectStack.isNotEmpty) {
    while (arrayStack.isNotEmpty()) {
        val array = arrayStack.pop()
        for (element in array) {
            if (!shouldSkip(element)) {
              measureObject(element)
            }
        }
    }
    while (objectStack.isNotEmpty()) {
        measureObject(objectStack.pop())
    }
}

daniel-rusu · 2023-10-22T03:06:22Z

To clarify, when encountering an array with N eligible elements, Jamm currently makes 2N copies. Inside the MeasurementStack, all the array element references get copied to the tracker, and another copy of all the references gets added to the stack.

This enhancement cuts the memory usage in half since the references only need to be lazily added to the tracker without needing to copy them to the stack.

I tried an implementation using a variation of the first proposal above so the rest of the code can remain mostly unchanged. The MeasurementStack contained 2 stacks, objectStack & arrayStack, and found that this improved performance by about 5% while cutting the memory consumption in half since both stacks remained tiny and only the tracker grew to keep track of all the elements.

Since the process of measuring an array element can end up adding additional arrays, we need to keep track of where we left off so the arrayStack stored an ArrayWrapper that contained a reference to the array along with the next index to be processed. When fetching the current array element, the index would be decremented until finding the next eligible element and the wrapper is removed from the stack when it contains no more elements. The isEmpty function returns true when both stacks are empty.

blerer · 2023-10-24T08:06:27Z

@daniel-rusu I am not sure that I understand your way of computing memory usage. Considering the way Java deal with memory when we add an array to the stack we only copy the reference to the array not its value. Therefore, the impact on memory is near neglectable.

daniel-rusu · 2023-10-24T15:23:29Z

@blerer there seems to be a misunderstanding here as this has nothing to do with the Java memory model. Yes, when first encountering an array field, a reference to that array is added to the stack and this is negligible as you say. However, when popping that reference and discovering that it's pointing to an array then every single eligible array element reference is copied to the stack. See this section of the code for details:

jamm/src/org/github/jamm/MemoryMeter.java

Line 496 in fbed9dd

addArrayElements((Object[]) current, stack);

So this does actually cut the memory usage of the Jamm library in half when analyzing a large array of fairly shallow objects using the first proposal assuming we prioritize popping from the object stack. That's because each array element is popped directly from the array and analyzing that element would add a small number of references to the stack to be analyzed and emptied before popping the next array element. So instead of blowing up the size of the stack by copying all the array elements, the stack would instead remain tiny since only direct field references would be copied there and cleared before moving to the next array element.

blerer · 2023-10-26T12:04:05Z

Sorry, I did not read your 3rd comment before. I now understand your idea and it make sense. Fill free to propose a patch :-)

daniel-rusu mentioned this issue Oct 20, 2023

Please create CachedMemoryMeterStrategy that caches shallow sizes of classes #65

Open

daniel-rusu changed the title ~~Iterate arrays instead of copying each element into the MeasurementStack~~ Cut memory usage in half by iterating arrays on the fly Oct 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cut memory usage in half by iterating arrays on the fly #66

Cut memory usage in half by iterating arrays on the fly #66

daniel-rusu commented Oct 20, 2023 •

edited

Loading

daniel-rusu commented Oct 20, 2023 •

edited

Loading

daniel-rusu commented Oct 22, 2023 •

edited

Loading

blerer commented Oct 24, 2023

daniel-rusu commented Oct 24, 2023

blerer commented Oct 26, 2023

Cut memory usage in half by iterating arrays on the fly #66

Cut memory usage in half by iterating arrays on the fly #66

Comments

daniel-rusu commented Oct 20, 2023 • edited Loading

daniel-rusu commented Oct 20, 2023 • edited Loading

daniel-rusu commented Oct 22, 2023 • edited Loading

blerer commented Oct 24, 2023

daniel-rusu commented Oct 24, 2023

blerer commented Oct 26, 2023

daniel-rusu commented Oct 20, 2023 •

edited

Loading

daniel-rusu commented Oct 20, 2023 •

edited

Loading

daniel-rusu commented Oct 22, 2023 •

edited

Loading