ADBC: add spatial support for DuckDB databases and GeoParquet #11459

Merged
rouault merged 1 commit into OSGeo:master from rouault:adbc_spatial
Dec 10, 2024

Conversation

rouault
Member

@rouault rouault commented Dec 8, 2024

  • Automatically load the duckdb_spatial extension when it is installed and the dataset is DuckDB or Parquet
  • Retrieve geometries (GEOMETRY type) as OGR geometries
  • Read GeoParquet metadata to figure out spatial extent, CRS and geometry type per geometry column
  • Use duckdb_spatial ST_Intersects() for faster spatial filtering (when done with OGRLayer::SetSpatialFilter()), potentially leveraging DuckDB RTree when it is available.
  • Use the GeoParquet bounding box column to complement the above
  • Pass through WHERE clauses expressed through OGRLayer::SetAttributeFilter()
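
For readers unfamiliar with the GeoParquet metadata mentioned above, here is a minimal Python sketch of the kind of information it carries per geometry column. It is not the code added by this pull request (which is C++ inside the ADBC driver); the function name `summarize_geo_metadata` and the sample values are hypothetical, and the JSON layout follows my reading of the GeoParquet specification (`columns`, `geometry_types`, `bbox`, `crs`, and the 1.1 `covering` entry).

```python
import json

# Sample value of the "geo" key in a Parquet file's key-value metadata.
# The values are made up for illustration.
GEO_METADATA = json.dumps({
    "version": "1.1.0",
    "primary_column": "geometry",
    "columns": {
        "geometry": {
            "encoding": "WKB",
            "geometry_types": ["Polygon", "MultiPolygon"],
            "bbox": [-180.0, -90.0, 180.0, 90.0],
            "crs": {"id": {"authority": "EPSG", "code": 4326}},
            "covering": {
                "bbox": {
                    "xmin": ["bbox", "xmin"],
                    "ymin": ["bbox", "ymin"],
                    "xmax": ["bbox", "xmax"],
                    "ymax": ["bbox", "ymax"],
                }
            },
        }
    },
})


def summarize_geo_metadata(geo_json):
    """Return extent, CRS and geometry types for each geometry column."""
    meta = json.loads(geo_json)
    summary = {}
    for name, col in meta["columns"].items():
        if "crs" not in col:
            # The specification defaults to OGC:CRS84 when the key is absent.
            crs = "OGC:CRS84"
        elif col["crs"] is None:
            # An explicit null means the CRS is unknown.
            crs = None
        else:
            ident = col["crs"].get("id") or {}
            if "authority" in ident and "code" in ident:
                crs = f'{ident["authority"]}:{ident["code"]}'
            else:
                # Keep the full PROJJSON object when there is no short identifier.
                crs = col["crs"]
        # Name of the per-row bounding box column, if one is declared.
        covering = col.get("covering", {}).get("bbox", {})
        bbox_column = covering["xmin"][0] if "xmin" in covering else None
        summary[name] = {
            "geometry_types": col.get("geometry_types", []),
            "extent": col.get("bbox"),
            "crs": crs,
            "bbox_column": bbox_column,
        }
    return summary


if __name__ == "__main__":
    for column, info in summarize_geo_metadata(GEO_METADATA).items():
        print(column, info)
```

The `bbox_column` entry is what a reader can use to pre-filter rows on plain numeric comparisons before evaluating an exact geometry predicate.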

@rouault rouault added this to the 3.10.1 milestone Dec 8, 2024
@rouault rouault mentioned this pull request Dec 8, 2024
@rouault rouault modified the milestones: 3.10.1, 3.11.0 Dec 8, 2024
@rouault rouault force-pushed the adbc_spatial branch 2 times, most recently from c81e67b to db5035c Compare December 8, 2024 18:08
@coveralls
Collaborator

coveralls commented Dec 8, 2024

Coverage Status

coverage: 68.541% (-0.04%) from 68.579%
when pulling e7332ae on rouault:adbc_spatial
into d511807 on OSGeo:master.

@paleolimbot
Contributor

This is amazing! I looked through for anything that seemed off about ADBC usage but it seems great. We've talked about providing a libpq-like "escape string"/"escape identifier" to help with people generating SQL, but here you're mostly generating SQL destined for a specific source.
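
To illustrate what such "escape string"/"escape identifier" helpers would do, here is a minimal Python sketch. These functions are hypothetical: they are not part of ADBC or of this pull request, and they assume a backend that follows standard SQL quoting (double quotes around identifiers, single quotes around string literals, with embedded quotes doubled). Backends that treat backslashes specially, or that reject NUL bytes, would need extra handling.

```python
def escape_identifier(name: str) -> str:
    """Quote a table or column name, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def escape_literal(value: str) -> str:
    """Quote a string literal, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def build_select(table: str, column: str, value: str) -> str:
    """Assemble a simple equality query from untrusted names and values."""
    return (
        f"SELECT * FROM {escape_identifier(table)} "
        f"WHERE {escape_identifier(column)} = {escape_literal(value)}"
    )


if __name__ == "__main__":
    print(build_select("my table", "name", "O'Brien"))
```

Where the driver supports it, bound parameters are generally preferable for values; quoting helpers of this kind mostly matter for identifiers, which cannot be bound.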

The Snowflake, BigQuery, and PostgreSQL drivers also can have geometry queried...eventually I think we'll have them marked with the "geoarrow.wkb" extensions (hopefully also from DuckDB eventually).

@rouault rouault merged commit 8fcd4f7 into OSGeo:master Dec 10, 2024
38 checks passed