
Investigate European wildfire probability model for physrisk #319

Open · joemoorhouse opened this issue Jul 8, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@joemoorhouse
Collaborator

Hi @jmcano-arfima, please feel free to adopt and edit this issue!

The paper 'Climate change-related statistical indicators' presents an approach based on calculating fire probability.
https://www.ecb.europa.eu/pub/pdf/scpsps/ecb.sps48~e3fd21dd5a.en.pdf

This issue is to investigate whether it is possible to onboard such a dataset into physrisk, potentially in collaboration with the authors.

@joemoorhouse added the enhancement label on Jul 8, 2024
@jmcano-arfima

Hi @joemoorhouse, indeed we could try collaborating with the authors (which would probably be the fastest solution).

We were also contemplating replicating the methodology ourselves, but we still don't fully understand the implications of becoming a "data provider"...

I take this opportunity to tag @csanmillan, who has just joined us on the quant side and will be working closely on this issue.

@csanmillan

Hi @joemoorhouse,

I wanted to update you on the progress I've made over the past few weeks. I've been reading the ECB article to understand how they create the probability map using machine learning, and I've been acquiring the necessary datasets.

The regression requires four inputs, each of which needs preprocessing:

  1. Copernicus FWI databases (baseline until 2005, extended with RCP projections until 2100; these data have already been converted to .csv format).

  2. Distance to city, railway, and road (maps in .tif format, which I have already converted to .csv; see the sketch after this list).

These first two inputs have been downloaded and are ready for processing.
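
For the .tif → .csv step, something along these lines is a minimal sketch, assuming rioxarray is available (file and column names are purely illustrative):

```python
import rioxarray as rxr

# Hypothetical file and column names, for illustration only.
da = rxr.open_rasterio("distance_to_road.tif", masked=True).squeeze("band", drop=True)
df = da.to_dataframe(name="dist_road").reset_index().dropna()  # columns: y, x, dist_road
df.to_csv("distance_to_road.csv", index=False)
```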

I am currently preparing the remaining inputs:

  3. Burnt Area: MCD64A1.061 (MODIS)
  4. Land Cover Type: MCD12Q1.061 (MODIS)

These last two are the most challenging because they are in the Google Earth Engine database and have a native resolution of 500 meters. However, we are working at a 2500-meter resolution to reproduce the results of the paper, and there will be time to improve this later.

Both datasets are images, with one per year over the past 20 years, making it difficult to obtain all of them for an entire continent over such a period. Therefore, I am developing a Python script to download them as efficiently as possible. Below is a brief example for Spain, at a resolution of 25000 meters (which simplifies refining the code).
[image: example burnt-area download for Spain]
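
For reference, a minimal sketch of the kind of Earth Engine export loop involved (the dataset and boundary asset IDs are real GEE catalogue entries; the region, years, and naming are illustrative, and this is not necessarily the script mentioned above):

```python
import ee

ee.Initialize()  # may require a cloud project argument depending on your setup

# Country geometry from the FAO GAUL boundaries (Spain as the test region).
spain = (ee.FeatureCollection("FAO/GAUL/2015/level0")
         .filter(ee.Filter.eq("ADM0_NAME", "Spain"))
         .geometry())

for year in range(2004, 2024):
    # Annual burnt-area composite: BurnDate > 0 in any month means the cell burned.
    burned = (ee.ImageCollection("MODIS/061/MCD64A1")
              .filterDate(f"{year}-01-01", f"{year}-12-31")
              .select("BurnDate")
              .max()
              .gt(0)
              .unmask(0))
    task = ee.batch.Export.image.toDrive(
        image=burned.clip(spain),
        description=f"burnt_area_spain_{year}",
        region=spain,
        scale=25000,      # coarse test resolution, as in the example above
        maxPixels=1e9,
    )
    task.start()
```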

Hope to have the data input ready by next week once I have optimized the download process.

@csanmillan

csanmillan commented Oct 2, 2024

I am working on a Jupyter notebook (.ipynb) to prepare the data for machine learning, using files retrieved about a month ago. The data input has the following structure:
[image: data input structure]

Variables/datasets currently prepared/downloaded:

  1. FWI_mean and FWI_max_to_mean: data downloaded from Copernicus about a month ago in .csv format (see the sketch after this list).
    [image: FWI data sample]

  2. Land Cover Type and Burnt Area: data extracted country by country from MODIS using Google Earth Engine. All the countries have been combined into a single Europe map with two layers.
    [image: combined Europe land cover / burnt area map]
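
For the FWI features, the yearly aggregation can be sketched roughly like this with xarray (file and variable names are assumptions; the actual Copernicus variable name may differ):

```python
import xarray as xr

ds = xr.open_dataset("fwi_daily.nc")              # daily FWI on a lat/lon grid
fwi = ds["fwi"]                                   # variable name assumed

fwi_mean = fwi.groupby("time.year").mean("time")  # FWI_mean per year and cell
fwi_max = fwi.groupby("time.year").max("time")
features = xr.Dataset({
    "FWI_mean": fwi_mean,
    "FWI_max_to_mean": fwi_max / fwi_mean,        # ratio of annual max to mean
})
features.to_dataframe().reset_index().to_csv("fwi_features.csv", index=False)
```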


Datasets pending download and/or processing:

  1. Country codes: I still need to obtain these, which is straightforward using an external database.

  2. Critical infrastructure: the challenging part will be the roads, railways, and urban centers dataset. Calculating the distance to roads and railways is simple since they are stored explicitly (see the distance-transform sketch after this list). The urban centers will be harder, since they are deduced from two .tif maps: hospitals and educational centers.
    [image: hospitals and educational centers maps]
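
A common way to turn a rasterised infrastructure layer into a distance map is a Euclidean distance transform; a minimal sketch, assuming a binary road mask in a GeoTIFF with square cells in a projected CRS (file name hypothetical):

```python
import rasterio
from scipy.ndimage import distance_transform_edt

# Hypothetical rasterised road mask: 1 = road cell, 0 = background.
with rasterio.open("roads_mask.tif") as src:
    roads = src.read(1)
    cell_size = src.transform.a  # pixel width in map units

# distance_transform_edt gives each non-zero cell its distance to the nearest
# zero cell, so feed it the inverted mask: background cells get the distance
# to the nearest road, and road cells get 0.
dist_to_road = distance_transform_edt(roads == 0) * cell_size
```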

Next I will work on the input parquet file and then start on the XGBoost model with monotonic constraints.
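
For reference, monotonic constraints in XGBoost are just a per-feature sign vector; a minimal sketch, assuming the parquet layout above (file name, feature names, constraint signs, and hyperparameters are illustrative, not taken from the ECB paper):

```python
import pandas as pd
import xgboost as xgb

df = pd.read_parquet("wildfire_inputs.parquet")   # hypothetical file name

features = ["FWI_mean", "FWI_max_to_mean", "dist_road", "dist_rail", "dist_city"]
X, y = df[features], df["burned"]                 # burned: 1 if the cell burned that year

# One constraint per feature, in order: +1 = prediction non-decreasing in that
# feature, -1 = non-increasing, 0 = unconstrained. Signs here are illustrative.
model = xgb.XGBClassifier(
    monotone_constraints="(1,1,-1,-1,-1)",
    n_estimators=500,
    learning_rate=0.05,
    objective="binary:logistic",
)
model.fit(X, y)
p_fire = model.predict_proba(X)[:, 1]             # per-cell fire probability
```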
We're using xarray for multidimensional data handling, Dask for parallel processing, GeoPandas for spatial analysis, and Zarr for efficient storage of large datasets in the notebook.
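As a small illustration of that tooling combination (file names hypothetical):

```python
import rioxarray as rxr
import xarray as xr

# Open a large GeoTIFF lazily with Dask chunks, then persist it as Zarr.
da = rxr.open_rasterio("landcover_europe.tif", chunks={"x": 2048, "y": 2048})
ds = da.squeeze("band", drop=True).to_dataset(name="landcover")
ds.to_zarr("landcover_europe.zarr", mode="w")     # chunked, compressed on disk

# The store reopens lazily, so downstream steps run out-of-core via Dask.
ds2 = xr.open_zarr("landcover_europe.zarr")
```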
