PacketBuffer: pool EMPTY (CON-1394) #1132
Comments
@FMistrorigo Looking into this. |
@FMistrorigo I ran ( Could you please share:
|
This issue has lots of history: I still see it. It is not bothering me because it is transient and will self-clear after a minute. My best guess? Packet transmissions are failing and that starts tying up retry buffers. After a while all of the buffers are in use. Then wait a minute, the retries time out and the buffers become free. I have switched over to dynamic wifi buffers and pushed them into PSRAM. I have up to 32 TX and 32 RX. If you are using a low-memory chip like the C3, RAM can get fragmented to the point where wifi buffers can't be allocated. |
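The guess above (failed transmissions hold retry buffers until a timeout, so the pool drains and then recovers on its own) can be illustrated with a toy model. This is only a sketch of the hypothesis, not the CHIP or lwIP implementation: `toy_pool_t`, `POOL_SIZE`, `RETRY_TIMEOUT_S` and both functions are made-up names and values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4         /* hypothetical pool size, not a CHIP or lwIP value */
#define RETRY_TIMEOUT_S 60  /* hypothetical retry timeout in seconds */

typedef struct {
    bool in_use[POOL_SIZE];
    uint32_t release_at[POOL_SIZE]; /* time at which a held buffer is returned */
} toy_pool_t;

/* Return every buffer whose retry timeout has elapsed to the pool. */
static void toy_pool_expire(toy_pool_t *pool, uint32_t now_s)
{
    assert(pool != NULL);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool->in_use[i] && now_s >= pool->release_at[i]) {
            pool->in_use[i] = false;
        }
    }
}

/* Take a buffer for a packet whose transmission fails and is held for retry.
 * Returns the buffer index, or -1 when every buffer is still held. */
static int toy_pool_take_for_retry(toy_pool_t *pool, uint32_t now_s)
{
    toy_pool_expire(pool, now_s);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            pool->release_at[i] = now_s + RETRY_TIMEOUT_S;
            return (int)i;
        }
    }
    return -1; /* the toy equivalent of "pool EMPTY" */
}
```

With a zero-initialized `toy_pool_t`, four failed sends at time 0 use up the pool, a fifth attempt at 30 s returns -1, and an attempt at 60 s succeeds again because the held buffers have timed out.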
Hi @shubhamdp, Before proceeding I would like to clarify the behavior of our device that we are finishing certifying. That being said, I did as you asked and gathered all the information. FAIL LOG on TH.txt HEAP during run.txt -> Recap of the heap in different parts of the code. For security reasons for my company, I cannot provide mdns traffic data. As said by @jonsmirl, looking at the log the problem could be in the retry buffers, but I'm not sure. Thanks, |
I am not an Espressif employee, but I have been working with ESP chips and Matter since the beginning. I would strongly recommend against shipping products which are sitting right on the edge of not being able to fit into a chip. Matter grows and grows over time. While you might be able to just squeeze everything in at the moment, after the next update you might not fit anymore. Then you will tell the programmers to make it fit anyway. And that is where you can end up in a lot of trouble, because the effort needed to keep making it fit is going to get larger and larger and cost more and more. This path typically ends up costing more than just using a larger chip to begin with and initially shipping with some spare RAM and flash available. We are using the ESP32-S3-PICO-1-N8R2. 4MB flash is too close to the edge for us since it only allows a 1.8MB OTA partition. 8MB allows a 3.8MB image, of which we are using 2MB, so 1.8MB of room to grow. I got tired of spending major amounts of effort shoehorning into the S3 on-chip RAM. Adding 2MB of PSRAM gives us 1.8MB for future expansion; meanwhile I am using it to shadow the flash for increased performance. We use the ESP32-S3-PICO-1-N8R2 in every product. That is a huge benefit because now we can have a single binary image which supports every product we make. When it boots it queries the product ID and then enables the correct support code -- that even includes some products which have LCDs and some which don't. This single image makes development far easier and saves more on coding costs than we lose by spending a little more on hardware. The only things I wouldn't do this way are light bulbs and sensors. Light bulbs have no need for continual upgrades; instead they get replaced, so it is reasonable to use right up to the edge of the chip in a light bulb. And our sensors use another technology which is far more power efficient than Matter. |
Hi, The problem persists, we always get the error "chip[CSL]: PacketBuffer: pool EMPTY." The problem happens both when I try to connect to wifi more than once via our app, and when I try to do the TC-SC-3.6 test. I have already tried to increase CHIP_DEVICE_CONFIG_MAX_EVENT_QUEUE_SIZE (to 50 or even 100) from menuconfig but nothing changes. I share with you our sdkconfig in the hope that the problem is there. |
I have moved up to ESP IDF 5.2.3 since I saw some patches in this area in the git logs. |
I am also using: I have a MB+ of free PSRAM, so plenty of room. |
Hi, After a lot of time and a lot of debugging I finally found the problem and came up with a solution.
I confirmed that the largest free block was smaller than the block the program was trying to allocate. By following every single tip to reduce heap usage (https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/performance/ram-usage.html), changing the esp-idf version to 5.2.3 to get the latest fixes, minimizing the FreeRTOS task stack sizes, and also following the lwIP minimum RAM usage guidance (https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/lwip.html#minimum-ram-usage), I was able to reach 80KB of free heap (with wifi connected and Matter initialized). In my case, at least 70KB of free heap is needed to pass the TC-SC-3.6 test every time (I used this function to find out the peak heap memory that lwIP consumes: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-guides/lwip.html#peak-buffer-usage), so now, apart from exceptional cases of fragmentation, the problem is solved. Thanks @jonsmirl for the patience and your suggestions. Fabio. |
I run this little task all of the time
|
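The snippet from the comment above was not preserved here. As a stand-in, and explicitly not the commenter's code, this is a minimal sketch of what a periodic heap-monitoring task could look like on ESP-IDF. The task name, reporting period, stack size, and the 1600-byte request size are assumptions chosen for illustration; the `heap_caps_*` calls are the standard ESP-IDF heap query functions. The ESP-IDF part is guarded by `ESP_PLATFORM` so the helper can also be compiled on a host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when enough bytes are free in total, but no single free block can
 * hold the request, which is the situation described in this issue. */
static bool heap_is_fragmented_for(size_t free_bytes, size_t largest_block,
                                   size_t request)
{
    assert(request > 0);
    return request <= free_bytes && request > largest_block;
}

#ifdef ESP_PLATFORM /* defined by the ESP-IDF build system */
#include "esp_heap_caps.h"
#include "esp_log.h"
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

#define HEAP_MONITOR_PERIOD_MS 10000 /* hypothetical reporting period */
#define HEAP_MONITOR_REQUEST   1600  /* hypothetical allocation size to check */

static void heap_monitor_task(void *arg)
{
    (void)arg;
    for (;;) {
        size_t free_bytes = heap_caps_get_free_size(MALLOC_CAP_INTERNAL);
        size_t largest = heap_caps_get_largest_free_block(MALLOC_CAP_INTERNAL);
        size_t min_ever = heap_caps_get_minimum_free_size(MALLOC_CAP_INTERNAL);
        bool frag = heap_is_fragmented_for(free_bytes, largest,
                                           HEAP_MONITOR_REQUEST);
        ESP_LOGI("heapmon", "internal free=%u largest=%u min_ever=%u%s",
                 (unsigned)free_bytes, (unsigned)largest, (unsigned)min_ever,
                 frag ? " FRAGMENTED" : "");
        vTaskDelay(pdMS_TO_TICKS(HEAP_MONITOR_PERIOD_MS));
    }
}

/* Call once after startup to begin periodic reporting. */
void heap_monitor_start(void)
{
    xTaskCreate(heap_monitor_task, "heapmon", 3072, NULL, 1, NULL);
}
#endif /* ESP_PLATFORM */
```

Logging the largest free block next to the total is the point of the sketch: the total alone would have looked healthy in the failing case reported in this thread.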
You could also switch over to static wifi buffers. |
@FMistrorigo With this commit, the RAM usage in the data model is reduced by around 4-5KB. Please check if you have this commit, if you are using the main branch. |
Hi @dhrishi, I'm not using the main branch so I don't have the fix, but I solved the problem by following the "minimizing RAM usage" page in the documentation and reaching the amount of free heap I needed for lwIP to allocate every pool at its peak heap consumption. Thanks, |
Description
I'm trying to get my product certified and the last test I can't pass is the TC-SC-3.6 test, which tries to establish different PASE Sessions (3) for each Fabric.
The problem I'm facing is this one:
It looks like the largest free memory block is not big enough for the allocations that are needed.
I logged the largest free block and found that just before the error I only have (INTERNAL) 1080 bytes, even though the free heap is 43784 bytes.
Can you help me find a solution to this problem?
Is there a defragmentation algorithm I can use or something that might help solve this problem?
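To make the gap between the two numbers concrete, here is a small host-side sketch of why 43784 free bytes can coexist with a largest block of 1080 bytes. The free-list layout used in the usage note is invented to match those totals and is not taken from the device. `heap_summary_t`, `summarize_free_list` and `can_allocate` are hypothetical names, and real allocators add per-block overhead that this toy ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t total_free;    /* sum of all free blocks */
    size_t largest_block; /* biggest single contiguous free block */
} heap_summary_t;

/* Summarize a list of free block sizes (a toy stand-in for a real heap). */
static heap_summary_t summarize_free_list(const size_t *blocks, size_t count)
{
    assert(blocks != NULL || count == 0);
    heap_summary_t s = {0, 0};
    for (size_t i = 0; i < count; i++) {
        s.total_free += blocks[i];
        if (blocks[i] > s.largest_block) {
            s.largest_block = blocks[i];
        }
    }
    return s;
}

/* A single allocation needs one contiguous block, so only the largest
 * free block matters, not the total. */
static bool can_allocate(const heap_summary_t *s, size_t request)
{
    return request <= s->largest_block;
}
```

For example, forty free blocks of 1080 bytes plus one of 584 bytes sum to 43784 bytes, yet a 2048-byte request fails while a 1024-byte one succeeds.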
Environment
Additional Details
I'm doing the test using th-fall2023.