Alternative approach to build a transfer table #114
Comments
Thanks @FlxPo. Strange that you're getting out-of-memory errors. That shouldn't happen, and definitely not from distance calculations, which are all internalised here in C++ code, and don't use |
I just realized I was using an old version of gtfsrouter, I will do some tests to see if I still get an error. |
No, the error is real. It comes from |
Let me know if that's okay now. See Rdatatable/data.table issue 5676 for background - |
I'm always interested in opportunities for

``` r
library (gtfsrouter)
library(data.table)
transfer_table <- function(gtfs, d_limit = 200, crs = 2154) {
    # ... your function from above ...
}
path <- "./feeds/helsinki.zip"
gtfs <- extract_gtfs (path)
#> ▶ Unzipping GTFS archive✔ Unzipped GTFS archive
#> ▶ Extracting GTFS feed✔ Extracted GTFS feed
#> ▶ Converting stop times to seconds✔ Converted stop times to seconds
#> ▶ Converting transfer times to seconds✔ Converted transfer times to seconds
system.time (
    t1 <- gtfs_transfer_table (gtfs, d_limit = 200)
)
#>    user  system elapsed
#>   2.131   0.004   2.055
system.time (
    t2 <- transfer_table(gtfs, d_limit = 200)
)
#>    user  system elapsed
#>  95.689   0.449  95.685
```

Created on 2024-01-31 with reprex v2.1.0

So basically 50 times faster than |
I just tested the latest development version on the [Helsinki GTFS](https://www.hsl.fi/en/hsl/open-data) feed (is this the one you use?). This is weird: my function seems a little bit faster, and I cannot reproduce the 50x difference.
|
Oh, the difference is that I commented out the |
Thanks for the clarification. For the record, I'm not sure I see why distances should be calculated from lat/lon coordinates (at least at a regional scale), and I wouldn't mind specifying the CRS for each of my projects. |
I tried to use gtfs_transfer_table on a big GTFS dataset with 46,000 stops (from IDFM, for the Ile-de-France region: https://eu.ftp.opendatasoft.com/stif/GTFS/IDFM-gtfs.zip). A transfer table is already provided, but I needed to recompute it to be able to merge this feed with other feeds.
For now, gtfs_transfer_table seems to compute all pairwise distances between stops with geodist, which results in an out-of-memory error given the high number of stops.
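To give a sense of scale, here is a back-of-the-envelope estimate of what a dense pairwise distance matrix would cost for this feed. It assumes one 8-byte double per stop pair and a full square matrix; whether gtfs_transfer_table actually allocates memory this way is an assumption on my part, not something I have checked in the source.

``` r
# Rough size of a dense n x n matrix of doubles (assumption: 8 bytes
# per entry, full square matrix, no sparsity or symmetry savings).
n_stops <- 46000                      # approximate stop count quoted above
bytes_per_double <- 8
matrix_bytes <- n_stops^2 * bytes_per_double
matrix_gib <- matrix_bytes / 1024^3   # convert bytes to GiB
print(round(matrix_gib, 1))
```

That comes to roughly 16 GiB for a single matrix, before any intermediate copies, which would be enough to exhaust memory on many machines.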
Here is an alternative approach using sf. It is quite fast and adapted to my needs, but it could be made to return exactly the same thing as gtfs_transfer_table:
It would require taking a dependency on sf.
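For readers who only see the discussion and not the function itself, the general idea can be sketched as follows. This is not the function posted in this issue: `transfer_table_sketch` and its output column names are hypothetical, and it assumes a stops table with the standard GTFS `stop_id`, `stop_lon` and `stop_lat` columns. It projects the stops to a metric CRS and asks sf for neighbours within `d_limit`, so distances are only computed for pairs that are already known to be close.

``` r
library(sf)

# Hypothetical sketch, NOT the function from this issue.
# `stops` is assumed to have stop_id, stop_lon and stop_lat columns.
transfer_table_sketch <- function(stops, d_limit = 200, crs = 2154) {
    stops <- as.data.frame(stops)

    # Build points from WGS84 lon/lat, then project to a metric CRS.
    pts <- st_as_sf(stops, coords = c("stop_lon", "stop_lat"), crs = 4326)
    pts <- st_transform(pts, crs)

    # Sparse neighbour list: for each stop, the stops within d_limit.
    nbrs <- st_is_within_distance(pts, dist = d_limit)

    from <- rep(seq_along(nbrs), lengths(nbrs))
    to <- unlist(nbrs)
    keep <- from != to                # drop each stop paired with itself
    from <- from[keep]
    to <- to[keep]

    # Distances only for the retained pairs, not the full n x n matrix.
    d <- st_distance(pts[from, ], pts[to, ], by_element = TRUE)

    data.frame(
        from_stop_id = stops$stop_id[from],
        to_stop_id = stops$stop_id[to],
        distance = as.numeric(d)
    )
}

# Three made-up stops near Paris: A and B about 100 m apart, C far away.
stops_demo <- data.frame(
    stop_id = c("A", "B", "C"),
    stop_lon = c(2.3522, 2.3522, 2.3522),
    stop_lat = c(48.8566, 48.8575, 48.9000)
)
tt_demo <- transfer_table_sketch(stops_demo, d_limit = 200, crs = 2154)
```

The default `crs = 2154` (Lambert-93) only makes sense for feeds in France; any other region would need its own projected CRS.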