Merge pull request #215 from hackalog/add_paths
Add paths for output, figures
hackalog authored May 18, 2021
2 parents 87ddaf7 + 1cefe0e commit d6d8f2b
Showing 4 changed files with 30 additions and 27 deletions.
40 changes: 17 additions & 23 deletions README.md
@@ -3,7 +3,7 @@
[![Coverage Status](https://coveralls.io/repos/github/hackalog/cookiecutter-easydata/badge.svg?branch=master)](https://coveralls.io/github/hackalog/cookiecutter-easydata?branch=master)
[![Documentation Status](https://readthedocs.org/projects/cookiecutter-easydata/badge/?version=latest)](https://cookiecutter-easydata.readthedocs.io/en/latest/?badge=latest)

# Cookiecutter EasyData
# EasyData

_A python framework and git template for data scientists, teams, and workshop organizers
aimed at making your data science **reproducible**_
@@ -18,8 +18,8 @@ In other words, Easydata is a template, library, and workflow that lets you **ge

## What is Easydata?

Easydata is a python cookiecutter for building custom data science git repos that provides:
* An **opinionated workflow** for collaboration, storytelling,
Easydata is a framework for building custom data science git repos that provides:
* A **prescribed workflow** for collaboration, storytelling,
* A **python framework** to support this workflow
* A **makefile wrapper** for conda and pip environment management
* prebuilt **dataset recipes**, and
@@ -32,7 +32,7 @@ Easydata is **not**
* a prescribed data format.


### Requirements to use this cookiecutter template:
### Requirements to use this framework:
- anaconda (or miniconda)
- python3.6+ (we use f-strings. So should you)
- [Cookiecutter Python package](http://cookiecutter.readthedocs.org/en/latest/installation.html) >= 1.4.0: This can be installed with pip or conda, depending on how you manage your Python packages:
@@ -49,7 +49,7 @@ python -m pip install -f requirements.txt
### To start a new project, run:
------------

cookiecutter https://github.com/hackalog/cookiecutter-easydata
cookiecutter https://github.com/hackalog/easydata


### The resulting directory structure
@@ -59,8 +59,13 @@ The directory structure of your new project looks like this:


* `LICENSE`
* Terms of use for this repo
* `Makefile`
* top-level makefile. Type `make` for a list of valid commands
* `Makefile.include`
* Global includes for makefile routines. Included by `Makefile`.
* `Makefile.env`
* Commands for maintaining a reproducible conda environment. Included by `Makefile`.
* `README.md`
* this file
* `catalog`
@@ -80,12 +85,6 @@ The directory structure of your new project looks like this:
* A default Sphinx project; see sphinx-doc.org for details
* `framework-docs`
* Markdown documentation for using Easydata
* `models`
* Trained and serialized models, model predictions, or model summaries
* `models/trained`
* Trained models
* `models/output`
* predictions and transformations from the trained models
* `notebooks`
* Jupyter notebooks. Naming convention is a number (for ordering),
the creator's initials, and a short `-` delimited description,
@@ -96,13 +95,9 @@ The directory structure of your new project looks like this:
* Generated analysis as HTML, PDF, LaTeX, etc.
* `reports/figures`
* Generated graphics and figures to be used in reporting
* `reports/tables`
* Generated data tables to be used in reporting
* `reports/summary`
* Generated summary information to be used in reporting
* `environment.yml`
* (if using conda) The YAML file for reproducing the analysis environment
* `environment.(platform).lock.yml`
* The user-readable YAML file for reproducing the conda/pip environment.
* `environment.(platform).lock.yml`
* resolved versions, result of processing `environment.yml`
* `setup.py`
* Turns contents of `MODULE_NAME` into a
@@ -116,9 +111,6 @@ The directory structure of your new project looks like this:
* code to fetch raw data and generate Datasets from them
* `MODULE_NAME/analysis`
* code to turn datasets into output products
* `tox.ini`
* tox file with settings for running tox; see tox.testrun.org


### Installing development requirements
The first time:
@@ -142,6 +134,8 @@ make delete_environment
```


## History
Early versions of Easydata were based on
[cookiecutter-data-science](http://drivendata.github.io/cookiecutter-data-science/).
## Credits and Thanks
* Early versions of Easydata were based on the excellent
[cookiecutter-data-science](http://drivendata.github.io/cookiecutter-data-science/)
template.
* Thanks to the [Tutte Institute](https://github.com/TutteInstitute) for supporting the development of this framework.
12 changes: 8 additions & 4 deletions {{ cookiecutter.repo_name }}/README.md
@@ -4,7 +4,7 @@ _Author: {{ cookiecutter.author_name }}_

{{cookiecutter.description}}

This repo is build on the cookiecutter-easydata template and workflow for making it easy to share your work with others and
This repo is built on the easydata template and workflow, making it easy to share your work with others and
to build on the work of others. This includes:

* managing conda environments in a consistent and reproducible way,
@@ -27,7 +27,7 @@ in [Setting up git and Checking Out the Repo](framework-docs/git-configuration.m
order to check-out the code and set-up your remote branches

Note: These instructions assume you are using SSH keys (and not HTTPS authentication) with {{ cookiecutter.upstream_location }}.
If you haven't set up SSH access to {{ cookiecutter.upstream_location }}, see [Configuring SSH Access to {{cookiecutter.upstream_location}}](https://github.com/hackalog/cookiecutter-easydata/wiki/Configuring-SSH-Access-to-Github). This also includes instuctions for using more than one account with SSH keys.
If you haven't set up SSH access to {{ cookiecutter.upstream_location }}, see [Configuring SSH Access to {{cookiecutter.upstream_location}}](https://github.com/hackalog/easydata/wiki/Configuring-SSH-Access-to-Github). This also includes instructions for using more than one account with SSH keys.

Once you've got your local, `origin`, and `upstream` branches configured, you can follow the instructions in this handy [Git Workflow Cheat Sheet](framework-docs/git-workflow.md) to keep your working copy of the repo in sync with the others.

@@ -115,6 +115,8 @@ Project Organization
* `catalog`
* Data catalog. This is where config information such as data sources
and data transformations are saved.
* `catalog/config.ini`
* Local Data Store. This configuration file is for local data only, and is never checked into the repo.
* `data`
* Data directory. Often symlinked to a filesystem with lots of space.
* `data/raw`
@@ -138,7 +140,9 @@ Project Organization
* `reports/figures`
* Generated graphics and figures to be used in reporting.
* `environment.yml`
* The YAML file for reproducing the conda/pip environment.
* The user-readable YAML file for reproducing the conda/pip environment.
* `environment.(platform).lock.yml`
* resolved versions, result of processing `environment.yml`
* `setup.py`
* Turns contents of `{{ cookiecutter.module_name }}` into a
pip-installable python module (`pip install -e .`) so it can be
@@ -154,4 +158,4 @@

--------

<p><small>This project was built using <a target="_blank" href="https://github.com/hackalog/cookiecutter-easydata">cookiecutter-easydata</a>, a python template aimed at making your data science workflow reproducible.</small></p>
<p><small>This project was built using <a target="_blank" href="https://github.com/hackalog/easydata">Easydata</a>, a python template aimed at making your data science workflow reproducible.</small></p>
2 changes: 2 additions & 0 deletions {{ cookiecutter.repo_name }}/catalog/config.ini
@@ -4,4 +4,6 @@ raw_data_path = ${data_path}/raw
interim_data_path = ${data_path}/interim
processed_data_path = ${data_path}/processed
project_path = ${catalog_path}/..
output_path = ${project_path}/reports
figures_path = ${output_path}/figures
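
Each `${...}` reference in `config.ini` resolves against the other keys in the file, so every location ultimately hangs off `catalog_path`. The sketch below illustrates the idea using Python's standard `configparser` with `ExtendedInterpolation`; the `[Paths]` section name and the `catalog_path`/`data_path` values are assumptions for illustration (that part of the file sits above this hunk), and Easydata's own catalog code may load the file differently.

```python
# Sketch only: shows how the chained ${...} references in catalog/config.ini
# resolve. The [Paths] section name and the catalog_path/data_path values are
# assumed for this example; Easydata's catalog module may read the file
# through its own machinery.
from configparser import ConfigParser, ExtendedInterpolation

parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string("""
[Paths]
catalog_path = /home/user/my_project/catalog
data_path = ${catalog_path}/../data
raw_data_path = ${data_path}/raw
interim_data_path = ${data_path}/interim
processed_data_path = ${data_path}/processed
project_path = ${catalog_path}/..
output_path = ${project_path}/reports
figures_path = ${output_path}/figures
""")

# Each lookup resolves the whole chain of references.
print(parser["Paths"]["figures_path"])
# -> /home/user/my_project/catalog/../reports/figures
```

Because the new `output_path` and `figures_path` entries chain off `project_path`, relocating the project only requires changing the paths defined above this hunk; everything added here follows automatically.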

Expand Up @@ -9,6 +9,9 @@
'interim_data_path': '${data_path}/interim',
'processed_data_path': '${data_path}/processed',
'project_path': '${catalog_path}/..',
'output_path': '${project_path}/reports',
'figures_path': '${output_path}/figures'

}
_catalog_file = _module_dir.parent / "catalog" / "config.ini"
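
These in-code defaults mirror the new `config.ini` entries, giving figures and other report outputs a standard home. The following is a hypothetical usage sketch: it assumes the generated module is named `src` and exposes the resolved catalog entries as a dict-like `paths` object; substitute your own `MODULE_NAME` and adapt to however your generated repo actually exposes these values.

```python
# Hypothetical usage sketch -- assumes the generated module is called "src"
# and exposes the resolved catalog entries as a dict-like `paths` object.
# Adjust the import to your own MODULE_NAME if it differs.
from pathlib import Path

from src import paths

# Resolve the standard figures location (defaults to <project>/reports/figures)
figures_dir = Path(paths["figures_path"])
figures_dir.mkdir(parents=True, exist_ok=True)

# Write report artifacts to the standard location instead of hard-coding paths.
(figures_dir / "placeholder.txt").write_text("figure output goes here\n")
```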

