Docker-based GitHub Action that aggregates pump data from OpenStreetMap and stores it in a GeoJSON file.

technologiestiftung/giessdenkiez-de-osm-pumpen-harvester

Giess den Kiez Pumpen aggregation from OSM

This is a Docker-based GitHub Action that aggregates pump data from OpenStreetMap and stores it in a GeoJSON file. It is defined in ./action.yml.

The aggregated data provides the locations of, and information about, the street pumps shown in the frontend of Gieß den Kiez. The data is retrieved through the Overpass API for OSM by fetching all nodes tagged "man_made"="water_well" and "description"="Berliner Straßenbrunnen".

The corresponding query is defined in the script fetch.py. It can be overridden by providing a custom Overpass query statement.
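As a rough illustration of the fetch step, the sketch below builds an Overpass QL statement for the two tags named above and posts it to an Overpass endpoint. The function names, the endpoint URL and the exact statement are assumptions made for this example; the query actually shipped in fetch.py may differ.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a public Overpass endpoint; the action may use a different one.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def build_pump_query(description="Berliner Straßenbrunnen"):
    """Build an Overpass QL statement for water wells with a given description.

    Hypothetical helper, not the query from fetch.py.
    """
    return (
        "[out:json];"
        f'(node["man_made"="water_well"]["description"="{description}"];);'
        "out;"
    )


def fetch_pumps(query, url=OVERPASS_URL, timeout=60):
    """POST the query as form field `data` and return the list of OSM elements."""
    payload = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(url, data=payload, timeout=timeout) as response:
        return json.load(response)["elements"]
```

Calling `fetch_pumps(build_pump_query())` needs network access and returns the raw node elements, each carrying `id`, `lat`, `lon` and a `tags` dictionary.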

The raw OSM data obtained this way is then filtered. utils.py drops every attribute that is available in the OSM data but that we do not need. To include an attribute in the final data set, add it to the filter list.
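The filtering idea can be sketched as an allowlist applied to each OSM node before it is turned into a GeoJSON feature. The tag names in `KEEP_TAGS` and both function names are hypothetical and chosen only for illustration; the real filter list is the one in utils.py.

```python
# Hypothetical allowlist; the real filter list lives in utils.py.
KEEP_TAGS = ["addr:full", "pump:style", "check_date"]


def filter_tags(element, keep=KEEP_TAGS):
    """Keep only the allowlisted tags of one OSM element."""
    tags = element.get("tags", {})
    return {key: tags[key] for key in keep if key in tags}


def to_feature(element, keep=KEEP_TAGS):
    """Convert one Overpass node element into a GeoJSON Point feature."""
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {
            "type": "Point",
            "coordinates": [element["lon"], element["lat"]],
        },
        "properties": {"id": element["id"], **filter_tags(element, keep)},
    }
```

Adding a tag name to the allowlist is then enough for that attribute to show up in the feature properties.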

Run locally

  • Create a Python virtualenv and activate it
  • pip install -r requirements.txt or pip install -r requirements-mac.txt (if you are on macOS)
  • Run python harvester/main.py pumps.geojson to generate the pumps.geojson file
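After a local run, a quick sanity check of the generated file could look like the sketch below. It assumes the output is a GeoJSON FeatureCollection; `summarize_geojson` is a hypothetical helper and not part of the harvester.

```python
import json


def summarize_geojson(path):
    """Return (feature count, sorted property names) of a FeatureCollection."""
    with open(path, encoding="utf-8") as handle:
        collection = json.load(handle)
    features = collection.get("features", [])
    # Collect every property name that occurs on at least one feature.
    keys = sorted(
        {key for feature in features for key in (feature.get("properties") or {})}
    )
    return len(features), keys
```

For example, `summarize_geojson("pumps.geojson")` shows how many pumps were written and which attributes survived the filtering.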

Inputs to the GitHub Action

outfile-path

Required. The path where the GeoJSON file should be written. Default: "public/data/pumps.geojson".

query

A custom Overpass query statement to retrieve pumps from OpenStreetMap. When omitted, the action retrieves the Berlin pumps.
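A custom statement for another city can be assembled from a bounding box. The sketch below reproduces the shape of the Magdeburg query used in the workflow examples of this README; `build_bbox_query` is a hypothetical helper. Overpass expects the bounding box in the order south, west, north, east.

```python
def build_bbox_query(south, west, north, east):
    """Build an Overpass QL statement for water wells inside a bounding box."""
    # This simple check does not handle boxes that cross the antimeridian.
    if not (south < north and west < east):
        raise ValueError("expected south < north and west < east")
    bbox = f"{south},{west},{north},{east}"
    return f'[out:json][bbox:{bbox}];(node["man_made"="water_well"];);out;>;out;'
```

The returned string can be passed as the `query` input of the action.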

Outputs from the GitHub Action

file

The path to where the file was written.

Example Usage

The GitHub Action defined in this repository is built to be reusable. What you do with the generated pumps.geojson file is up to you and depends on your specific use case.

Usage for giessdenkiez-de repository

For giessdenkiez-de, the custom GitHub Action defined here in ./action.yml is used in a periodically triggered GitHub Actions workflow, which is defined in giessdenkiez-de -> pumps.yml. In this use case, the generated pumps.geojson file is subsequently uploaded to a Supabase storage location. For details, refer to the workflow definition in giessdenkiez-de -> pumps.yml.

Your own public repository

Reference the GitHub Action defined in this repository in your own GitHub Actions workflow file. Use the generated pumps.geojson in a way that fits your architecture.

File: .github/workflows/pumps.yml

on:
  workflow_dispatch:
  schedule:
    # every sunday morning at 4:00
    - cron: "0 4 * * 0"

jobs:
  hello_world_job:
    runs-on: ubuntu-latest
    name: A job to aggregate pumps data from OpenStreetMap
    steps:
      - name: Pumps data generate step
        # use tags if you want to fix on a specific version
        # e.g.
        # uses: technologiestiftung/giessdenkiez-de-osm-pumpen-harvester@<version-tag>
        # use master if you like to gamble
        uses: technologiestiftung/giessdenkiez-de-osm-pumpen-harvester@master
        id: pumps
        with:
          outfile-path: "out/pumps.geojson"
          # Pass "query" argument to specify custom overpass query string (see example below for the city of Magdeburg)
          # query: '[out:json][bbox:52.0124,11.4100, 52.2497,11.8330];(node["man_made"="water_well"];);out;>;out;'
      # Use the output from the `pumps` step
      - name: File output
        run: echo "The file was written to ${{ steps.pumps.outputs.file }}"

Your own private repository

You can copy the code from this public repository into your own private repository and reference the action locally from your own GitHub Actions workflow file. Use the generated pumps.geojson in a way that fits your architecture.

File: .github/workflows/main.yml

on:
  workflow_dispatch:
  schedule:
    # * is a special character in YAML so you have to quote this string
    - cron: "0 4 * * 0"

jobs:
  hello_world_job:
    runs-on: ubuntu-latest
    name: A job to aggregate pumps data from OpenStreetMap
    steps:
      # To use this repository's private action,
      # you must check out the repository
      - name: Checkout
        uses: actions/checkout@v2
      - name: Pumps data generate step
        uses: ./ # Uses an action in the root directory
        id: pumps
        with:
          outfile-path: "out/pumps.geojson"
          # Pass "query" argument to specify custom overpass query string (see example below for the city of Magdeburg)
          # query: '[out:json][bbox:52.0124,11.4100, 52.2497,11.8330];(node["man_made"="water_well"];);out;>;out;'
      # Use the output from the `pumps` step
      - name: File output
        run: echo "The file was written to ${{ steps.pumps.outputs.file }}"

Development

See also https://docs.github.com/en/actions/creating-actions/creating-a-docker-container-action

Python

Run the script with python3 harvester/main.py path/to/out/file.geojson

Docker

Build the container and run it.

mkdir out
docker build --tag technologiestiftung/giessdenkiez-de-osm-pumpen-harvester .
docker run -v "$PWD/out:/scripts/out" technologiestiftung/giessdenkiez-de-osm-pumpen-harvester /scripts/out/outfile.json

Test

pytest
pytest --cov=harvester --cov-fail-under 75 --cov-config=.coveragerc

Contributors ✨

Thanks goes to these wonderful people (emoji key):

  • Fabian Morón Zirfas 💻 📖
  • Lisa-Stubert 💻 📖
  • Lucas Vogel 📖
  • Jens Winter-Hübenthal 💻

This project follows the all-contributors specification. Contributions of any kind are welcome!
