Update WOF import pipeline. #1868

Merged (2 commits) on Apr 17, 2019

zerebubuth

Previously, we would look at the metadata CSVs, load all the referenced GeoJSON files over HTTP, parse them, and put them into PostgreSQL. The hosted `pgdump` version was just a dump from a manual import of this data.

Because we're now doing global builds, this data has grown stale. This change adds a script to download and parse the WOF "bundles", which are `tar.gz` files containing the GeoJSON files, so we download 4 files instead of thousands. Instead of loading the data into a database, we dump it out as a SQL file, ready to be imported at database setup time.
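
For readers who want a feel for the bundle-to-SQL step, here is a minimal sketch. It is not the script added in this PR. The table name, column names, and the choice of `wof:id`, `wof:name`, and `wof:placetype` properties are assumptions made for illustration, as is the use of PostGIS `ST_GeomFromGeoJSON` to build the geometry.

```python
import io
import json
import tarfile


def sql_quote(value):
    """Quote a value as a SQL string literal, doubling any single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def bundle_to_sql(fileobj, table="wof_neighbourhood"):
    """Yield one INSERT statement per GeoJSON feature in a tar.gz bundle.

    The table and column names here are hypothetical, not the real schema.
    """
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar:
            # Skip directories and anything that is not a GeoJSON file.
            if not (member.isfile() and member.name.endswith(".geojson")):
                continue
            feature = json.load(tar.extractfile(member))
            props = feature.get("properties", {})
            # Re-serialise the geometry so it can be embedded as a literal.
            geom = json.dumps(feature["geometry"], sort_keys=True)
            yield (
                "INSERT INTO {} (wof_id, name, placetype, geom) VALUES "
                "({}, {}, {}, ST_SetSRID(ST_GeomFromGeoJSON({}), 4326));"
            ).format(
                table,
                int(props["wof:id"]),
                sql_quote(props.get("wof:name", "")),
                sql_quote(props.get("wof:placetype", "")),
                sql_quote(geom),
            )
```

Writing the yielded statements to a file gives a plain SQL dump that can be loaded at database setup time, with no database needed while the bundles are being processed.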

The SQL dump is put into the `shapefiles.tar.gz` versioned static data asset, similar to Natural Earth and the OSMData land/water polygons. This means it's stable across releases, but we can update it as part of a regular asset rebuild.

Connects to #1808.

-shapefiles.tar.gz: {{ tgt_shapefile_zips }}
-	tar czf shapefiles.tar.gz {{ tgt_shapefile_zips }}
+shapefiles.tar.gz: {{ tgt_shapefile_zips }} wof_snapshot.sql
+	tar czf shapefiles.tar.gz $^

NOTE: the `$^` line noise is Make's automatic variable for all of the rule's prerequisites, i.e. everything after the colon on the rule line. In this case, that is `{{ tgt_shapefile_zips }}` and `wof_snapshot.sql`.

@zerebubuth merged commit bfc0e46 into master on Apr 17, 2019
@zerebubuth deleted the zerebubuth/1808-update-wof branch on April 17, 2019 10:24