-
Notifications
You must be signed in to change notification settings - Fork 336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SAMZA-2778: Make AzureBlobOutputStream buffer initialization size configurable. #1662
Conversation
@PawasChhokra can you review this change? |
@ehoner can you please follow the PR guidelines outlined in this SEP - https://cwiki.apache.org/confluence/display/SAMZA/SEP-25%3A+PR+Title+And+Description+Guidelines |
remove empty test
samza-azure/src/main/java/org/apache/samza/system/azureblob/avro/AzureBlobAvroWriter.java
Show resolved
Hide resolved
samza-azure/src/main/java/org/apache/samza/system/azureblob/AzureBlobConfig.java
Show resolved
Hide resolved
...a-azure/src/main/java/org/apache/samza/system/azureblob/avro/AzureBlobAvroWriterFactory.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API changes looks fine to me. I have few minor comments. PR looks good otherwise.
samza-azure/src/main/java/org/apache/samza/system/azureblob/avro/AzureBlobAvroWriter.java
Show resolved
Hide resolved
samza-azure/src/main/java/org/apache/samza/system/azureblob/AzureBlobConfig.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to merge once you have resolved the comments
Symptom:
JVM crashes due to OOM exceptions. When the system has a large number of
AzureBlobWriterObjects
, the memory becomes heavily fragmented and can be susceptible to crashing when a newAzureBlobOutputStream
is created.Cause:
The crashes are caused by inefficient memory management when using the G1GC. (The default garbage collector in Java 11+.) What causes (code paths) this issue and why this is a problem for the G1GC are examined separately.
What causes this issue?
The underlying issue is caused by the
AzureBlobOutputStream
. When a new instance is created, it creates a new ByteArrayOutputStream initialized to maxFlushThresholdSize.12 TheByteArrayOutputStream
is the buffer used by the parent to accumulate messages between flush intervals. It requires 10MB (current default) of memory to initialize. This allows the buffer to accumulate messages without resize operations, however, it does not prevent resizing. The maxFlushThresholdSize is enforced in the parent duringwrite()
- see AzureBlobAvroWriter.Why this is a problem for the G1GC?
The focus here is on the G1GC and humongous objects (G1 specific).3 The G1 GC introduced a new memory management strategy that divides the heap into regions,
-XX:G1HeapRegionSize=n
. The GC can operate on regions concurrently and copies live objects between regions during full GC to reclaim/empty regions.4 The default behavior creates ~2048 regions, sized to a factor of 2 between 1MB and 32MB. Any object larger than half of a region size, is considered a humongous object.Humongous objects are allocated an entire region (or consecutive regions if larger than a single region) and are not copied between regions during full GC. Any space not used by the object within a region is non-addressable for the life of the object.5 A JVM heap size of 31GB,
-Xmx31G
, will default to 16MB regions. Considering the current default size is 10MB, each buffer requires an entire region and prevents the use of 6MB, regardless of the how much data is in the buffer. For a heap smaller than 16GB, each buffer would require multiple regions.The 10MB buffer size can exhaust the regions and cause OOMs or create fragmentation that causes OOMs. A fragmentation caused OOM occurs in the following sequence. On
new
, the JVM attempts to create the object in Eden. If there is insufficient space in Eden a minor GC is performed. If there is insufficient space after minor GC, the object is immediately promoted Old Gen. If there is insufficient space in Old Gen, a full GC is performed. If a full GC cannot allocate memory or region(s) for a humongous object the JVM will exit with OOM.Changes:
The javadocs, where appropriate, have been updated to reflect changes or describe new behaviors. No public APIs were removed, they were marked deprecated and migrated to the new default initialization value. All of the changes are itemized below.
AzureBlobConfig
Adding two new public fields and one public method. The new configuration is made accessible in the same manner as existing configs (see #configs SEP-26), also consistent the coding-guide. There is a new public config key:
SYSTEM_INIT_BUFFER_SIZE
- named initBufferSize.bytes. The default value is public fieldSYSTEM_INIT_BUFFER_SIZE_DEFAULT
. The user provided configuration value is accessible with new public methodgetInitBufferSizeBytes(..)
. The method returns the configuration value between 32 andgetMaxFlushThresholdSize(..)
inclusive. 32 is the default initialization size of aByteArrayOutputStream
in the parameterless constructor.AzureBlobWriterFactory
There are two changes to this interface, both to the method,
getWriterInstance(..)
. The existing implementation is marked@Deprecated
and a new method with an additional parameter is added. The new parameter is anint
that is expected to be the _ initBufferSize_.AzureBlobAvroWriterFactory
The modifications here are consistent with the changes to interface
AzureBlobWriterFactory
. However, the deprecated implementation uses the new fieldAzureBlobConfig.SYSTEM_INIT_BUFFER_SIZE_DEFAULT
when calling the new public API. This will migrate users to the new initialization behavior.AzureBlobAvroWriter
There are two public, one package-private, and two private changes. Both public changes are to constructors. The existing public constructor is marked deprecated and invokes the new public constructor with the new field
AzureBlobConfig.SYSTEM_INIT_BUFFER_SIZE_DEFAULT
. The new constructor sets the new private fieldinitBufferSize
. The package-private constructor is modified with an additionalint
parameter and the tests were changed accordingly. The remaining private change modifies creation of newAzureBlobOutputStream
instances to include the additional private fieldinitBufferSize
.AzureBlobOutputStream
The existing public API is marked
@Deprecated
and theByteArrayOutputStream
is initialized with the new fieldAzureBlobConfig.SYSTEM_INIT_BUFFER_SIZE_DEFAULT
. The new public constructor includes the new parameterinitBufferSize
and initializes theByteArrayOutputStream
to that size.Tests:
Both test classes,
TestAzureBlobAvroWriter
andTestAzureBlobOutputStream
were modified to use the new fieldAzureBlobConfig.SYSTEM_INIT_BUFFER_SIZE_DEFAULT
.No unit or integration tests were added to verify the GC behavior. The performance of the new system analyzed with JXray and compared to previous analyses. The number of humongous objects was reduced from >80% to <25%.
Footnotes
The configuration values are set from either SYSTEM_MAX_FLUSH_THRESHOLD_SIZE or SYSTEM_MAX_FLUSH_THRESHOLD_SIZE_DEFAULT. ↩
Call sequence from configuration to
ByteArrayOutputStream
initialization (referencing existing implementation).AzureBlobConfig
- getMaxFlushThresholdSize() returns user provided value or default.AzureBlobSystemProducer.blockFlushThresholdSize
- initialization from config value.AzureBlobSystemProducer.createNewWriter(...)
- new AzureBlobWriter instance (determined by factory).AzureBlobAvroWriter.maxBlockFlushThresholdSize
- initialization (Samza provided implementation).AzureBlobAvroWriter.startNextBlob(..)
- new AzureBlobOutputStream (internal implementation).AzureBlobOutputStream.maxBlockFlushThresholdSize
- initialize new ByteArrayOutputStream to max blob configuration value."Garbage-First Garbage Collector: Humongous Objects" ↩
"Part 1: Introduction to the G1 Garbage Collector ↩
"What’s the deal with humongous objects in Java?" ↩