flower-classifier

Authors: Jeremy Jordan and John Huffman

John and I were walking through a garden one day and kept pointing out flowers that we thought looked cool. The only problem was... we didn't know the names of any of the flowers! As machine learning engineers, our first thought was "let's build an image classifier" and this project was born.

[Image: passion flower]

Getting started

  1. Spin up a Colab notebook.
  2. Install colabcode.
  3. Start the code server.
from colabcode import ColabCode
ColabCode(port=10000, mount_drive=True)
  4. Go to the ngrok link provided.
  5. Clone the repo.
git clone https://github.com/jeremyjordan/flower-classifier.git
  6. Run make colab to set up the project on your Colab instance (or run make init if running locally).
  7. Start a training job by running train, optionally providing configuration options.
    • e.g. for a quick check, you can run train trainer=smoke_test

Training a model

You can start a training job from the command line using the train script. We use Hydra to manage the job's configuration, which lets us compose separate config files for different parts of the system into a single hierarchical config. You can see all of the available config groups in the conf/ directory. The default values are specified in conf/config.yaml, and any of them can be overridden using Hydra's override syntax.

Examples:

Start training job using default values

train

Override a configuration group (referencing non-default config files)

train trainer=smoke_test dataset=random

Override specific values in a config file (+ appends a new value, ~ deletes a value from the config)

train model.architecture=resnest200e +model.extra_arg=example ~model.dropout_rate

Realistic example

train model.architecture=efficientnet_b3 \
    model.dropout_rate=0.35 \
    optimizer.lr=0.01 \
    dataset.batch_size=64 \
    optimizer=sgd \
    lr_scheduler=onecycle \
    trainer.max_epochs=50 \
    dataset=folder

You can run train --help to view available configuration options.

Contributing

In order to commit code from a Colab machine, you'll need to do the following:

  1. Make sure you have a GitHub auth token (https://github.com/settings/tokens)
  2. Configure the git settings on the machine
git config --global user.name "Jeremy Jordan"
git config --global user.email ""
gh auth login --with-token <<< INSERT_TOKEN_HERE

Note: make sure you've run make colab before setting this up.