Help Test SDIO for RP2040/RP2350 #2562

Open
greiman opened this issue Oct 28, 2024 · 12 comments

@greiman

greiman commented Oct 28, 2024

Please help test this beta version of SdFat that supports fast SDIO.

A number of users have requested this feature and hope it will be included in this package for RP2040/RP2350.

Here are some of my test results for the SdFat bench example with a Lexar Silver Plus card:

Pico 2 512 byte transfers at 150 MHz

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
15526.96,38,32,32
15478.89,39,32,32

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
15526.96,449,32,32
15526.96,452,32,32

Pico 2 large transfers at 250 MHz

FILE_SIZE_MB = 100
BUF_SIZE = 32768 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
27263.48,11293,1197,1201
27256.04,11098,1197,1201

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
27013.02,2400,1210,1212
26788.63,9307,1210,1222

Pico 2 small 64 byte transfers at 150 MHz

FILE_SIZE_MB = 100
BUF_SIZE = 64 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
9980.04,11721,2,6
9984.03,8736,2,6

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
10145.07,75,1,5
10145.07,76,1,5
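
As a rough sanity check on the tables above, the reported speed should be close to the transfer size divided by the average latency. The sketch below is not part of the bench example; `approxKBPerSec` is a hypothetical helper, and it assumes the bench reports KB as 1000 bytes.

```cpp
// Hypothetical helper, not part of SdFat: estimate throughput from the
// transfer size and the average per-call latency reported by bench.
// Assumes 1 KB = 1000 bytes.
double approxKBPerSec(unsigned bufSizeBytes, double avgLatencyUsec) {
  // Bytes per microsecond equals MB/s; multiply by 1000 to get KB/s.
  return bufSizeBytes / avgLatencyUsec * 1000.0;
}
```

For the 512 byte run, `approxKBPerSec(512, 32)` gives 16000, against a measured 15526.96. For the 32768 byte run, `approxKBPerSec(32768, 1201)` gives about 27284, against a measured 27263.48. The averages are rounded to whole microseconds, so a small gap is expected.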

@earlephilhower
Owner

earlephilhower commented Nov 1, 2024

Very nice!

Using my just made SD->Pico (RP2040 @ stock 133MHZ) adapter on an old SanDisk Extreme 32GB "U3" "V30" card I'm getting ~13MB/s

Pinout and clocks

#define SPI_CLOCK SD_SCK_MHZ(50)
#define RP_CLK_GPIO 14
#define RP_CMD_GPIO 15
#define RP_DAT0_GPIO 18 

Results

Type any character to start
FreeStack: 253848
Type is FAT32
Card size: 31.91 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SE32G
Revision: 8.0
Serial number: 0X7B761DA4
Manufacturing date: 8/2016

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
13227.51,805,37,37
13333.33,781,37,37

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
12658.23,62,39,39
12658.23,69,39,39

Done

Adapter

(image of the adapter)

I couldn't find a way of low-level erasing the card beforehand (like secure erase for SSDs) under Linux, so assume these #s are on a card that's been beaten to death in an old cell phone.

The stack free function needs a bit of work, and there look to be some functions in the SDIO routine with >500 bytes of stack needed (which is not an error but might cause weirdness in real apps with other stack users in the call chain).

Is there something specific you wanted to try out? My real high perf cards for my DSLRs are all full-size, so I can't use this adapter. But 13MB/s seems like a pretty good # even so on a random old one...

@earlephilhower
Owner

There might be some CPU limitation, it seems. I bumped to 200MHZ on the same card and got the following

FreeStack: 253848
Type is FAT32
Card size: 31.91 GB (GB = 1E9 bytes)

Manufacturer ID: 0X3
OEM ID: SD
Product: SE32G
Revision: 8.0
Serial number: 0X7B761DA4
Manufacturing date: 8/2016

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
19920.32,1325,24,25
19920.32,1338,24,25

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
19011.41,50,26,26
19011.41,39,26,26

Done

@greiman
Author

greiman commented Nov 1, 2024

I couldn't find a way of low-level erasing the card beforehand

Use the SdFormatter example. The erase option quickly low level erases an SD.

The stack free function needs a bit of work, and there look to be some functions in the SDIO routine with >500 bytes of stack needed.

I may remove the free stack function from SdFat examples since it is over 10 years old and was for 328 boards.

Buffering to handle alignment is a problem. I have avoided allocating dynamic memory and don't like using the stack. Any suggestions?

But 13MB/s seems like a pretty good # even so on a random old one...

I suspect few apps need more.

@earlephilhower
Owner

earlephilhower commented Nov 1, 2024

+1 for minimizing dynamic memory allocation! It's not nearly as important on the 2040(256K) or 2350 (512K!!!) as the 8266 or AVRs, but every little bit helps avoid memory fragmentation.

Not sure what you mean by buffer alignment, but using __attribute__((aligned(4))) (or whatever) should work. Stack variables should already be 4-byte aligned if I understand the ARM ABI properly.

We also have a HW DMA engine that's 2x faster for large blocks than memcpy. It only works for 4-byte aligned offset, 4-byte aligned length, though. But it's a drop-in-replacement of memcpy with rp2040.memcpyDMA (and falls back to the ROM memcpy when it can't handle things). For smallish copies (32-bytes) it's about even with ROM memcpy, so if you're moving small blocks this won't do much.

--edit-- A quick sprinkling of rp2040.memcpyDMA in the spots where whole sectors were being copied didn't move the needle. So, no simple speed up there. :(
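
For illustration, here is a minimal sketch of the alignment constraints mentioned above: a sector buffer forced to 4-byte alignment and a check for whether a copy meets the 4-byte address and length requirements. The names `isAligned`, `dmaCopyEligible`, and `sectorBuf` are hypothetical and not part of SdFat or the core. The sketch uses standard `alignas(4)`, which has the same effect here as `__attribute__((aligned(4)))`.

```cpp
#include <cstddef>
#include <cstdint>

// True when ptr is a multiple of `alignment` (which must be a power of two).
inline bool isAligned(const void* ptr, std::size_t alignment = 4) {
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// True when a copy satisfies the constraints described above:
// 4-byte aligned destination and source, and a length that is a multiple of 4.
inline bool dmaCopyEligible(const void* dst, const void* src, std::size_t len) {
  return isAligned(dst) && isAligned(src) && (len % 4 == 0);
}

// A sector-sized buffer forced to 4-byte alignment.
alignas(4) static std::uint8_t sectorBuf[512];
```

A caller would not need such a check before `rp2040.memcpyDMA`, since it already falls back to the ROM memcpy, but the check shows which copies can take the fast path.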

@earlephilhower
Owner

SdFormatter didn't seem to change the results on the SanDisk card, so it was probably still relatively clean. I did get a different result on a generic MicroCenter-branded "U10" card, whose results follow. I suppose the 12-13MB/s read is due to a bottleneck somewhere in the MCU since it's the same as the "good" card, while the write limitation is down to the very cheap card. In any case, very consistent success using the SDIO mode even with spaghetti wiring!

Type is FAT32
Card size: 15.59 GB (GB = 1E9 bytes)

Manufacturer ID: 0X27
OEM ID: PH
Product: SD16G
Revision: 6.0
Serial number: 0XDA603B0C
Manufacturing date: 2/2019

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
8503.40,134410,37,58
8695.65,134520,37,58

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
12406.95,69,40,40
12406.95,54,40,40

Done

@greiman
Author

greiman commented Nov 2, 2024

Not sure what you mean by buffer alignment

Read/write calls often occur with non-aligned buffers. Also, if the file position is not a multiple of four bytes, the copies to form complete sectors will not be aligned.

The bench example with 512 byte transfers will never need the copies. Try bench with 513 byte transfers to ensure there are no crashes due to alignment problems.

Performance suffers with lots of nonaligned copies.

Here is RP2040 with 512 byte transfers at 133 MHz using a low cost PNY 32GB microSD.

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
12658.23,421,37,39
12658.23,416,37,39

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
12626.26,68,39,39
12626.26,68,39,39

Here is the result with 513 byte transfers:

FILE_SIZE_MB = 5
BUF_SIZE = 513 bytes

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
8103.73,1068,38,62
8130.08,459,38,62

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
8064.52,96,44,62
8077.54,99,44,62
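
To illustrate why 513 byte transfers need so many more copies, here is a simplified model, not SdFat's actual code. It assumes whole 512 byte sectors that start on a sector boundary can go straight to the card, while partial sectors are staged through a sector buffer. `CopyStats` and `simulateSequentialIo` are hypothetical names.

```cpp
#include <cstdint>

constexpr std::uint32_t kSectorSize = 512;

struct CopyStats {
  std::uint64_t directBytes;  // whole sectors that need no staging copy
  std::uint64_t cachedBytes;  // partial-sector bytes copied through a buffer
};

// Walk a file sequentially in bufSize chunks and classify every byte.
CopyStats simulateSequentialIo(std::uint64_t fileSize, std::uint32_t bufSize) {
  CopyStats stats{0, 0};
  std::uint64_t pos = 0;
  while (pos < fileSize) {
    std::uint64_t remaining = fileSize - pos;
    std::uint64_t end = pos + (remaining < bufSize ? remaining : bufSize);
    while (pos < end) {
      std::uint64_t offset = pos % kSectorSize;
      std::uint64_t left = end - pos;
      if (offset == 0 && left >= kSectorSize) {
        // On a sector boundary with at least one full sector available.
        std::uint64_t whole = left - (left % kSectorSize);
        stats.directBytes += whole;
        pos += whole;
      } else {
        // Head or tail fragment: fill the rest of the current sector at most.
        std::uint64_t room = kSectorSize - offset;
        std::uint64_t chunk = left < room ? left : room;
        stats.cachedBytes += chunk;
        pos += chunk;
      }
    }
  }
  return stats;
}
```

In this model, ten 512 byte transfers stage nothing, while two 513 byte transfers (1026 bytes) stage 514 bytes and send only 512 directly.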

I suppose the 12-13MB/s read is due to a bottleneck somewhere in the MCU

Modern SD cards have huge flash pages, 32KB or more. Low end SD cards don't pipeline reads as well as high end cards.

High end cards have lots of buffering and they read ahead and pipeline the data stream for sequential reads. High end SD cards are incredibly complex, some even use Pseudo-SLC cache like SSD drives.

I was amazed to see high end SD cards set up read stream buffer policy based on how a file was written. SD cards expect a standard file format as specified by the SD Association. The SD expects standard locations and sizes for clusters and other file structures.

FAT areas are managed in a different way than data areas.

Some SD cards try to optimize for multiple open files or random I/O.

For best results use the Official SD Association formatter.

@earlephilhower
Owner

Gotcha. Misaligned accesses (byte-wise? ugh!) are always brutal, anyway. The DMA copy won't help you there in most cases, sadly.

Your examples have several boards w/SDIO pins defined manually. I can add those defines to the board variant headers and it will "just work" without you needing to manually include the values for every example.

In any case, is there a timeline for beta->release on the new SDFAT? My fork had minimal changes to work with our File and other minutiae, so I may need to pull there and not directly from your release. But, I'd need a release to start work. :)

@greiman
Author

greiman commented Nov 2, 2024

Your examples have several boards w/SDIO pins defined manually. I can add those defines to the board variant headers and it will "just work" without you needing to manually include the values for every example.

About the only variant that is safe is AdaFruit Metro RP2040, it has an onboard SDIO/SPI socket. Other users of the beta select different pins than my test cases.

The current beta has lots of changes for other boards. I have done tests with some of the most popular Arduino and AdaFruit boards. There are now thousands of "Arduino Compatible" boards plus custom boards that use SdFat so I can no longer test a fraction of these boards.

If I don't get any serious issues, I will post a release on SdFat in about a week.

@julianblanco

julianblanco commented Nov 18, 2024

Just chiming in:
Built with PlatformIO, Metro RP2040, Samsung EVO Select 512 GB, SPI_CLOCK SD_SCK_MHZ(150)

Type any character to start
FreeStack: 253848
Type is exFAT
Card size: 512.71 GB (GB = 1E9 bytes)

Manufacturer ID: 0X1B
OEM ID: SM
Product: GF8S5
Revision: 3.0
Serial number: 0X1CC763E2
Manufacturing date: 4/2022

FILE_SIZE_MB = 5
BUF_SIZE = 512 bytes
Starting write test, please wait.

write speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
12852.65,1979,38,38
12819.69,2124,37,38

Starting read test, please wait.

read speed and latency
speed,max,min,avg
KB/Sec,usec,usec,usec
12562.01,54,39,40
12562.01,54,39,40

Done

@earlephilhower
Owner

earlephilhower commented Nov 19, 2024

@greiman do you support bit reversal on the SDIO data pins? Looking at the SparkFun Thing Plus RP2040 it looks like GPIOs map from a low GPIO SDIO3 to a higher GPIO SDIO0.

https://cdn.sparkfun.com/assets/5/4/f/6/b/RP2040_Thing_Plus_Schematic.pdf

SDIO3=GPIO 9
SDIO2=GPIO 10
...

@greiman
Author

greiman commented Nov 19, 2024

do you support bit reversal on the SDIO data pins?

I noticed the SparkFun Thing Plus RP2040 but never pursued it.

It seemed I might be able to flip the bits in each nibble by using IN_SHIFTDIR and OUT_SHIFTDIR in the PIO state machine. I assumed the nibbles in a 32-bit word would then be in the wrong order, but I never verified this.

Here are the places for IN_SHIFTDIR and OUT_SHIFTDIR.

Edit: I looked at the datasheet and it looks like the bits in a nibble are not changed by shift direction:

IN always uses the least significant Bit count bits of the source data. For example, if PINCTRL_IN_BASE is set to 5, the
instruction IN PINS, 3 will take the values of pins 5, 6 and 7, and shift these into the ISR. First the ISR is shifted to the left
or right to make room for the new input data, then the input data is copied into the gap this leaves. The bit order of the
input data is not dependent on the shift direction.

@earlephilhower
Owner

Thanks for checking. I guess software-based bit reversal would be needed there. The RP2350 version of that SparkFun Thing Plus board has a SD slot but only 1-bit wired so it's probably not worth delving into.
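
For reference, a minimal sketch of what software-based bit reversal within each nibble could look like. `reverseNibbleBits` is a hypothetical helper, not something in SdFat, and it assumes the data arrives as 32-bit words holding eight 4-bit nibbles whose order is already correct.

```cpp
#include <cstdint>

// Reverse the bit order inside every 4-bit nibble of a 32-bit word,
// leaving the nibbles themselves in place.
inline std::uint32_t reverseNibbleBits(std::uint32_t w) {
  // Swap adjacent bits: b3 b2 b1 b0 becomes b2 b3 b0 b1.
  w = ((w & 0x55555555u) << 1) | ((w >> 1) & 0x55555555u);
  // Swap adjacent bit pairs within each nibble: result is b0 b1 b2 b3.
  w = ((w & 0x33333333u) << 2) | ((w >> 2) & 0x33333333u);
  return w;
}
```

For example, `reverseNibbleBits(0x12345678)` returns `0x84C2A6E1`, and applying it twice returns the original word. Running this on every word of every sector would add CPU work to each transfer.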
