A Python package to manage Google Cloud Data Catalog Fileset export scripts.
Disclaimer: This is not an officially supported Google product.
- Executing in Cloud Shell
- 1. Environment setup
- 2. Export Filesets to CSV file
# Set your service account credentials; for instructions, see 1.3. Auth credentials
# This file name is just a suggestion; feel free to follow your own naming conventions
export GOOGLE_APPLICATION_CREDENTIALS=~/datacatalog-fileset-exporter-sa.json
# Install datacatalog-fileset-exporter
pip3 install datacatalog-fileset-exporter --user
# Add to your PATH
export PATH=~/.local/bin:$PATH
# Look for available commands
datacatalog-fileset-exporter --help
Using virtualenv is optional, but strongly recommended unless you use Docker.
git clone https://github.com/mesmacosta/datacatalog-fileset-exporter
cd ./datacatalog-fileset-exporter
All paths starting with ./ in the next steps are relative to the datacatalog-fileset-exporter folder.
pip install --upgrade virtualenv
python3 -m virtualenv --python python3 env
source ./env/bin/activate
pip install --upgrade .
Docker may be used as an alternative to run the script. In this case, please disregard the Virtualenv setup instructions.
- Data Catalog Admin
This file name is just a suggestion; feel free to name it following your own naming conventions:
./credentials/datacatalog-fileset-exporter-sa.json
This step may be skipped if you're using Docker.
export GOOGLE_APPLICATION_CREDENTIALS=./credentials/datacatalog-fileset-exporter-sa.json
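Before running the exporter, it can help to confirm that the variable above points to a usable key file. The following is a minimal sketch, not part of this package: `check_credentials` is a hypothetical helper name, and the only thing it relies on is that Google service account key files are JSON documents with a `type` field set to `service_account`.

```python
import json
import os


def check_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the key-file path from env_var, or raise if it is unusable.

    Hypothetical helper for illustration; not provided by the exporter.
    """
    raw_path = os.environ.get(env_var)
    if not raw_path:
        raise RuntimeError(f"{env_var} is not set")

    # Expand a leading ~ in case the value was set without shell expansion.
    path = os.path.expanduser(raw_path)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{env_var} points to a missing file: {path}")

    with open(path, encoding="utf-8") as key_file:
        key = json.load(key_file)

    # Service account key files carry "type": "service_account".
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    return path
```

Calling `check_credentials()` in the same shell session where the variable was exported either returns the resolved path or raises an error naming the problem.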
In the exported CSV file, each Fileset spans as many lines as required to represent all of its fields. The columns are described as follows:
Column | Description | Mandatory |
---|---|---|
entry_group_name | Entry Group Name. | Y |
entry_group_display_name | Entry Group Display Name. | Y |
entry_group_description | Entry Group Description. | Y |
entry_id | Entry ID. | Y |
entry_display_name | Entry Display Name. | Y |
entry_description | Entry Description. | Y |
entry_file_patterns | Entry File Patterns. | Y |
schema_column_name | Schema column name. | N |
schema_column_type | Schema column type. | N |
schema_column_description | Schema column description. | N |
schema_column_mode | Schema column mode. | N |
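Because one Fileset can span several lines, a consumer of the CSV usually needs to group lines back into Filesets. The sketch below shows one way to do that with the standard `csv` module. It rests on assumptions that are not confirmed by this document: the file is comma-delimited with a header row matching the column names above, and every line repeats the mandatory columns. The sample values and the `group_filesets` name are hypothetical.

```python
import csv
import io

# Hypothetical sample data: values are illustrative, not real exporter output.
SAMPLE_CSV = (
    "entry_group_name,entry_group_display_name,entry_group_description,"
    "entry_id,entry_display_name,entry_description,entry_file_patterns,"
    "schema_column_name,schema_column_type,schema_column_description,"
    "schema_column_mode\n"
    "my_entry_group,My Group,Example group,my_fileset,My Fileset,"
    "Example fileset,gs://my_bucket/*.csv,first_name,STRING,First name,REQUIRED\n"
    "my_entry_group,My Group,Example group,my_fileset,My Fileset,"
    "Example fileset,gs://my_bucket/*.csv,last_name,STRING,Last name,REQUIRED\n"
)


def group_filesets(csv_file):
    """Group CSV lines into Filesets keyed by (entry_group_name, entry_id)."""
    filesets = {}
    for row in csv.DictReader(csv_file):
        key = (row["entry_group_name"], row["entry_id"])
        # The first line seen for a key supplies the entry-level fields.
        fileset = filesets.setdefault(key, {
            "entry_display_name": row["entry_display_name"],
            "entry_file_patterns": row["entry_file_patterns"],
            "schema_columns": [],
        })
        # Schema columns are optional, so skip lines that carry none.
        if row.get("schema_column_name"):
            fileset["schema_columns"].append({
                "name": row["schema_column_name"],
                "type": row.get("schema_column_type") or "",
                "description": row.get("schema_column_description") or "",
                "mode": row.get("schema_column_mode") or "",
            })
    return filesets


filesets = group_filesets(io.StringIO(SAMPLE_CSV))
```

With the sample above, `filesets` holds a single Fileset whose `schema_columns` list has two entries. To read a real export, pass an open file handle instead of the `io.StringIO` wrapper.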
- Python + virtualenv
datacatalog-fileset-exporter filesets export --project-ids my-project --file-path CSV_FILE_PATH
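If the export needs to run from a Python script rather than a shell, the command above can be wrapped with `subprocess`. This is a minimal sketch: `build_export_command` and `run_export` are hypothetical helper names, and only the subcommand and flags shown in the command above are used. The `project_ids` value is passed through verbatim, since how multiple project IDs are separated is not stated here.

```python
import shutil
import subprocess

CLI_NAME = "datacatalog-fileset-exporter"


def build_export_command(project_ids, file_path):
    """Build the argument list for the export command shown above."""
    return [
        CLI_NAME, "filesets", "export",
        "--project-ids", project_ids,
        "--file-path", file_path,
    ]


def run_export(project_ids, file_path):
    """Run the export; raises if the CLI is missing or exits with an error."""
    if shutil.which(CLI_NAME) is None:
        raise RuntimeError(f"{CLI_NAME} was not found on PATH")
    subprocess.run(build_export_command(project_ids, file_path), check=True)
```

For example, `run_export("my-project", "filesets.csv")` would invoke the same command as the shell line above, with `filesets.csv` standing in for CSV_FILE_PATH.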