Utilities to scrape the API of the Forged Alliance Forever online game.

yaniv-aknin/fafdata

fafdata

A data engineering toolkit to extract metadata and replays from api.faforever.com and load them into a data lake like BigQuery. The intention is to reconstruct (part of) the Forged Alliance Forever database as a public BigQuery dataset.

The dataset

Using this toolkit, I've scraped the API and created a dataset of all game models and some associated models (player, gamePlayerStats, mapVersion, etc).

It lets you make stuff like this: [image: scatter plot panels]

At the time of this writing, there are three public ways to use this dataset:

  • A simple Data Studio dashboard for quick browsing
  • A Kaggle dataset where I've flattened, filtered and documented two CSVs
  • A publicly accessible BigQuery dataset for your own queries (← the good stuff is here)
    • Try the query SELECT COUNT(id) FROM `fafalytics.faf.games` WHERE DATE(startTime) = "2022-01-01"
      (you might pay a tiny amount for this)
    • Try pinning the fafalytics project in Cloud Console
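
The query above can also be run from Python. The following is a minimal sketch, not part of this toolkit: it assumes the `google-cloud-bigquery` client library is installed and that you have a Google Cloud project of your own to bill the query to. The helper names `games_on_day_sql` and `count_games` and the `billing_project` argument are hypothetical and only serve the illustration; the table name is taken from the query above.

```python
import datetime

# Table name taken from the example query in this README.
TABLE = "fafalytics.faf.games"


def games_on_day_sql(table: str = TABLE) -> str:
    """Build the example query, with the date as a named parameter (@day)."""
    return (
        f"SELECT COUNT(id) AS games FROM `{table}` "
        "WHERE DATE(startTime) = @day"
    )


def count_games(day: datetime.date, billing_project: str) -> int:
    """Run the query and return the number of games started on `day`.

    `billing_project` is an assumption: your own Google Cloud project,
    which pays for the (small) amount of data scanned.
    """
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", day)]
    )
    rows = client.query(games_on_day_sql(), job_config=job_config).result()
    return next(iter(rows))["games"]
```

Something like `count_games(datetime.date(2022, 1, 1), "my-billing-project")` would then return the count, given valid credentials. Passing the date as a query parameter rather than formatting it into the SQL string keeps the query reusable for other days.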

The utilities

The toolkit includes utilities to extract, transform and load FAF metadata and replay data. Here's a demo session using faf.extract and faf.transform to create a BigQuery table:

from faforever to bigquery in 30s

An overview of all utilities:

  • faf.extract: Scrapes models from api.faforever.com, storing them as JSON files on disk.
  • faf.transform: Transforms extracted JSON files into JSONL files ready for loading to a data lake.
  • faf.parse: Parses a downloaded .fafreplay file into a .pickle; this speeds up subsequent dumps of the replay.
  • faf.dump: Dumps the content of a replay (raw .fafreplay or pre-parsed .pickle) into a JSONL file to be loaded to the lake.
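
To give a feel for the transform step, here is a rough sketch of turning extracted JSON pages into JSONL rows. It is not the actual faf.transform code: it assumes each extracted page has a JSON:API-style envelope (a top-level `data` list whose items carry `type`, `id` and `attributes`), and the function names `flatten_resource` and `pages_to_jsonl` are made up for this example.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def flatten_resource(resource: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one resource into a single-level row (assumed page layout)."""
    row = {"id": resource["id"], "type": resource["type"]}
    # Attributes are hoisted to the top level so each one becomes a column.
    row.update(resource.get("attributes", {}))
    return row


def pages_to_jsonl(pages: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield one JSON-encoded line per resource across all pages."""
    for page in pages:
        for resource in page.get("data", []):
            yield json.dumps(flatten_resource(resource), sort_keys=True)


if __name__ == "__main__":
    # Tiny made-up page, only to show the shape of the output.
    sample = {
        "data": [
            {"type": "game", "id": "1", "attributes": {"name": "demo"}},
        ]
    }
    for line in pages_to_jsonl([sample]):
        print(line)
```

One object per line is what makes the output convenient to load: BigQuery and similar systems can ingest newline-delimited JSON directly, without reading a whole file into memory first.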

Epilogue

This is a bit of a fork/rewrite of fafalytics, another project of mine with a much larger scope (not just scraping the API, but also downloading and analysing the binary replay files). I now think it's better to approach this with three smaller-scoped projects - one for data engineering, one for dataviz and analytics, and one for ML.
