pyreadr.custom_errors.LibrdataError: Unable to read from file for large RDS files #99

Open
pty0111 opened this issue Nov 11, 2023 · 4 comments
Labels
waiting for librdata changes: the issue needs some fixes to the C library librdata before it can be solved

Comments

@pty0111

pty0111 commented Nov 11, 2023

Is there an upper limit on the size of RDS files that can be loaded using pyreadr?
When reading an RDS file of a small matrix, the code works well, but when reading large matrices (>10GB in size), I get the following error:
pyreadr.custom_errors.LibrdataError: Unable to read from file

@ofajardo
Owner

I don't think there is such a limit. Also, if size were the problem, you would probably get a memory error rather than an "unable to read from file" error, so I suspect something else is happening with that file. Is the file something you created yourself with R, or did somebody else generate it? If somebody else generated it, then, as mentioned before, I think the problem is something other than the size. If you created it, please share simplified code to reproduce the issue.

@pty0111
Author

pty0111 commented Nov 12, 2023

This is a matrix that I generated from my data. It is a 31595 by 39643 matrix saved using the saveRDS(my.mtx, file = "expr.rds") command. When I subset it to fewer rows, e.g., saveRDS(my.mtx[1:5000,], file = "expr.rds"), pyreadr works without any issue.
My pyreadr version is 0.4.9.
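
For context, here is a rough size estimate for the two matrices described above. It assumes the values are stored as R doubles (8 bytes each), which the thread does not state, and it ignores RDS headers and compression, so the file on disk may well be smaller. The comparison against the signed 32-bit maximum is only an observation about the numbers; it does not establish the cause of the error.

```python
# Back-of-the-envelope size check for the matrices described in this thread.
# Assumption (not stated in the thread): values are R doubles, 8 bytes each.
BYTES_PER_DOUBLE = 8
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def dense_matrix_bytes(n_rows: int, n_cols: int, item_size: int = BYTES_PER_DOUBLE) -> int:
    """Return the uncompressed payload size of a dense matrix in bytes."""
    return n_rows * n_cols * item_size


full = dense_matrix_bytes(31595, 39643)   # the matrix that fails to load
subset = dense_matrix_bytes(5000, 39643)  # the 5000-row subset that loads fine

for label, size in (("full matrix", full), ("5000-row subset", subset)):
    print(
        f"{label}: {size:,} bytes ({size / 1e9:.2f} GB), "
        f"exceeds int32 max: {size > INT32_MAX}"
    )
```

Under that assumption the full matrix comes to roughly 10 GB of raw data, in line with the ">10GB" figure reported above, while the 5000-row subset is about 1.6 GB.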

@elaude

elaude commented Apr 22, 2024

I can confirm this bug exists. I submitted a fix to librdata (WizardMac/librdata#49), please consider updating once it is merged.

@ofajardo
Owner

ofajardo commented Jul 2, 2024

Sure, I will update here once the PR is merged into librdata.

@ofajardo added the "waiting for librdata changes" label (the issue needs some fixes to the C library librdata before it can be solved) on Jul 2, 2024