Refactor train.py and val.py loggers #4137

Merged
glenn-jocher merged 27 commits into master from update/loggers
Jul 24, 2021

Conversation

glenn-jocher
Member

@glenn-jocher glenn-jocher commented Jul 24, 2021

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Major refactoring of the logging system in YOLOv5, moving towards a modular design.

📊 Key Changes

  • 🔥 Removed direct warnings and SummaryWriter imports and calls from train.py.
  • ✨ Introduced utils/loggers directory, centralizing logging utilities.
  • 👉 Moved W&B logging code from wandb_utils.py to utils/loggers/wandb/.
  • 🚀 Created Loggers class to handle logging actions during training.
  • 🔄 Renamed variables and methods for clarity and better adherence to the logging abstraction.
  • ♻️ Refactored calls to loggers to use the new logging class methods instead of direct access.

🎯 Purpose & Impact

  • 🎨 The purpose of these changes is to modularize the logging aspect of the training pipeline, making the codebase cleaner and easier to maintain.
  • 🧩 Modularization allows easier introduction or replacement of logging systems without affecting core training code.
  • ⚡ Users will benefit from a more stable and reliable logging experience during training, potentially with less exposure to implementation details for various logging backends (like W&B, TensorBoard, etc.).
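
The modular design summarized above can be pictured as one object that fans each training event out to whichever backends are installed. The following is only a minimal sketch of that idea, not the code in utils/loggers: `MemoryLogger`, `dispatch` and the event name `on_fit_epoch_end` are hypothetical names chosen for illustration.

```python
class MemoryLogger:
    """Hypothetical backend that keeps logged metrics in memory."""

    def __init__(self):
        self.records = []

    def on_fit_epoch_end(self, epoch, metrics):
        # Store a copy so later mutation by the caller does not change the record
        self.records.append((epoch, dict(metrics)))


class Loggers:
    """Hypothetical dispatcher: forwards an event to every backend that handles it."""

    def __init__(self, backends):
        self.backends = list(backends)

    def dispatch(self, event, *args, **kwargs):
        for backend in self.backends:
            handler = getattr(backend, event, None)
            if callable(handler):  # backends without this event are skipped
                handler(*args, **kwargs)


memory = MemoryLogger()
loggers = Loggers([memory])
loggers.dispatch("on_fit_epoch_end", 0, {"loss": 1.25})  # a single line in the training loop
loggers.dispatch("on_train_end")  # no backend implements this event, so nothing happens
```

With this shape the training loop only ever calls the dispatcher, so adding or removing a backend would not touch the loop itself.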

@glenn-jocher
Member Author

@AyushExel this is my logger refactor attempt for train.py and val.py. Changes are +168 and -90, so a net gain in code unfortunately, but train.py and val.py have reduced by about 40 lines and are a bit more readable. All of the logging activity is now in single-line callbacks and is parameterized to accommodate future expansion.

@glenn-jocher
Member Author

@AyushExel can you take a thorough look at all the changes to make sure all the functionality still works? Thanks!!

@glenn-jocher changed the title from "Refactor train.py and valy.py loggers" to "Refactor train.py and val.py loggers" Jul 24, 2021
@glenn-jocher merged commit efe60b5 into master Jul 24, 2021
@glenn-jocher deleted the update/loggers branch July 24, 2021 23:18
@glenn-jocher
Member Author

@AyushExel I've merged this PR! I tested basic functionality, i.e. training with and without wandb installed on CPU and single GPU locally and in Colab and everything seems to work well. Here's an example of a run logged with this PR branch:
https://wandb.ai/glenn-jocher/YOLOv5/runs/1j76ofmb?workspace=user-glenn-jocher

I have not tested the advanced functionality though, including resuming from artifacts etc. Can you run some tests to make sure there are no new issues introduced?

One existing issue I see is that both opt.hyp and hyp are logged to W&B, duplicating content in the 'Config' section of the run here:
https://wandb.ai/glenn-jocher/YOLOv5/runs/1j76ofmb/overview?workspace=user-glenn-jocher

@glenn-jocher glenn-jocher self-assigned this Jul 24, 2021
@AyushExel
Contributor

@glenn-jocher Hey this looks great!! I just tested this for resuming from artifacts, and dataset upload and I think everything works well. This refactoring will be very helpful in updating APIs and other features.

And yes, I'm aware of opt.hyp being logged in config. It enables seamless resuming of runs from W&B artifacts, where it just sets opt.hyp = run.config.hyp. Maybe instead of saving the data in the config, we can just log the hyp file as an artifact, but it seems like overkill for just logging one file. Anyway, we can definitely do that if it helps de-clutter the config panel.
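
For readers following along, the resume path described here can be sketched roughly as below. This is an illustration under assumptions: a plain dict stands in for the run's stored config, and the hyperparameter values are made up.

```python
from types import SimpleNamespace

# Stand-in for the config saved with a logged run (hypothetical values)
run_config = {"hyp": {"lr0": 0.01, "momentum": 0.937}, "epochs": 300}

# On resume, the options object starts without hyperparameters
opt = SimpleNamespace(resume=True, hyp=None)

if opt.resume:
    # Restore the hyperparameters from the stored config instead of a local file
    opt.hyp = run_config["hyp"]
```

The convenience comes from the hyperparameters travelling with the run itself, which is also why they show up in the config panel.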

@glenn-jocher
Member Author

@AyushExel hmm I see. Maybe we should just remove the hyp logging then and use only the opt logging?

@AyushExel
Contributor

Yeah. Makes sense. I'll clean it up this week

@glenn-jocher
Member Author

Great thanks!

robin-maillot added a commit to robin-maillot/yolov5 that referenced this pull request Sep 22, 2021
* ConfusionMatrix `normalize=True` fix (ultralytics#3587)

* train.py GPU memory fix (ultralytics#3590)

* train.py GPU memory fix

* ema

* cuda

* cuda

* zeros input

* to device

* batch index 0

* W&B: Allow changes in config variable (ultralytics#3588)

* Update `dataset_stats()` (ultralytics#3593)

@kalenmike this is a PR to add image filenames and labels to our stats dictionary and to save the dictionary to JSON. Save location is next to the train labels.cache file. The single JSON contains all stats for entire dataset.

Usage example:
```python
from utils.datasets import *

dataset_stats('coco128.yaml', verbose=True)
```

* Delete __init__.py (ultralytics#3596)

* Simplify README.md (ultralytics#3530)

* Update README.md

* added hosted images

* added new logo

* testing image hosting

* changed svgs to pngs

* removed old header

* Update README.md

* correct colab image source

* splash.jpg

* rocket and W&B fix

* added contributing template

* added social media to top section

* increased size of top social media

* cleanup and updates

* rearrange quickstarts

* API cleanup

* PyTorch Hub cleanup

* Add tutorials

* cleanup

* update CONTRIBUTING.md

* Update README.md

* update wandb link

* Update README.md

* remove tutorials header

* update environments and integrations

* Comment API image

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* double spaces after section

* Update README.md

* Update README.md

Co-authored-by: Glenn Jocher <[email protected]>

* Update datasets.py (ultralytics#3591)

* 'changes-in_dataset'

* Update datasets.py

Co-authored-by: Glenn Jocher <[email protected]>

* Download COCO and VOC by default (ultralytics#3608)

* Suppress wandb images size mismatch warning (ultralytics#3611)

* suppress wandb images size mismatch warning

* suppress wandb images size mismatch warning

* PEP8 reformat and optimize imports

Co-authored-by: Glenn Jocher <[email protected]>

* Fix incorrect end epoch comment (ultralytics#3612)

* Update `check_file()` (ultralytics#3622)

* Update `check_file()`

* Update datasets.py

* Update README.md (ultralytics#3624)

* FROM nvcr.io/nvidia/pytorch:21.05-py3 (ultralytics#3633)

* Add `**/*.torchscript.pt` (ultralytics#3634)

* Update `verify_image_label()` (ultralytics#3635)

* RUN pip install --no-cache -U torch torchvision (ultralytics#3637)

* Assert non-premature end of JPEG images (ultralytics#3638)

* premature end of JPEG images

* PEP8 reformat

Co-authored-by: Glenn Jocher <[email protected]>

* Update CONTRIBUTING.md (ultralytics#3645)

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md (ultralytics#3647)

* `is_coco` list fix (ultralytics#3646)

* Update README.md (ultralytics#3650)

Be more user-friendly to new users

* Update `dataset_stats()` to list of dicts (ultralytics#3657)

* Update `dataset_stats()` to list of dicts

@kalenmike

* Update datasets.py

* Remove `/weights` directory (ultralytics#3659)

* Remove `/weights` directory

* cleanup

* Update download_weights.sh comment (ultralytics#3662)

* Update train.py (ultralytics#3667)

* Update `train(hyp, *args)` to accept `hyp` file or dict (ultralytics#3668)

* Update TensorBoard (ultralytics#3669)

* Update `WORLD_SIZE` and `RANK` retrieval (ultralytics#3670)

* Cache v0.3: improved corrupt image/label reporting (ultralytics#3676)

* Cache v0.3: improved corrupt image/label reporting

Fix for ultralytics#3656 (comment)

* cleanup

* EMA changes for pre-model's batch_size (ultralytics#3681)

* EMA changes for pre-model's batch_size

* Update train.py

* Update torch_utils.py

Co-authored-by: Glenn Jocher <[email protected]>

* Update README.md (ultralytics#3684)

* Update cache check (ultralytics#3691)

Swapped order of operations for faster first per ultralytics@f527704#r52362419

* Skip HSV augmentation when hyperparameters are [0, 0, 0] (ultralytics#3686)

* Create short-circuit in augment_hsv when hyperparameters are zero

* implement faster opt-in

Co-authored-by: Glenn Jocher <[email protected]>

* Slightly modify CLI execution (ultralytics#3687)

* Slightly modify CLI execution

This simple change makes it easier to run the primary functions of this
repo (train/detect/test) from within Python. An object which represents
`opt` can be constructed and fed to the `main` function of each of these
modules, rather than having to call the lower level functions directly,
or run the module as a script.

* Update export.py

Add CLI parsing update for more convenient module usage within Python.

Co-authored-by: Lewis Belcher <[email protected]>
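
The pattern described in this commit message (build an `opt` object and hand it to `main`) might look roughly like the sketch below. The flags `--epochs` and `--batch-size` and the body of `main` are placeholders for illustration, not the repository's actual arguments or entry point.

```python
import argparse


def parse_opt(argv=None):
    """Build the options namespace from CLI-style arguments (placeholder flags)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, default=300)
    parser.add_argument("--batch-size", type=int, default=16)
    return parser.parse_args(argv)


def main(opt):
    """Stand-in entry point: reports its options instead of training."""
    return f"epochs={opt.epochs} batch_size={opt.batch_size}"


# From Python: construct the namespace directly and pass it to main()
opt = argparse.Namespace(epochs=3, batch_size=8)
summary = main(opt)

# From the CLI path: parse arguments first, then call the same main()
default_summary = main(parse_opt([]))
```

Keeping parsing separate from `main` means both callers go through the same code path.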

* Reformat (ultralytics#3694)

* Update DDP for `torch.distributed.run` with `gloo` backend (ultralytics#3680)

* Update DDP for `torch.distributed.run`

* Add LOCAL_RANK

* remove opt.local_rank

* backend="gloo|nccl"

* print

* print

* debug

* debug

* os.getenv

* gloo

* gloo

* gloo

* cleanup

* fix getenv

* cleanup

* cleanup destroy

* try nccl

* return opt

* add --local_rank

* add timeout

* add init_method

* gloo

* move destroy

* move destroy

* move print(opt) under if RANK

* destroy only RANK 0

* move destroy inside train()

* restore destroy outside train()

* update print(opt)

* cleanup

* nccl

* gloo with 60 second timeout

* update namespace printing

* Eliminate `total_batch_size` variable (ultralytics#3697)

* Eliminate `total_batch_size` variable

* cleanup

* Update train.py

* Add torch DP warning (ultralytics#3698)

* Add `train.run()` method (ultralytics#3700)

* Update train.py explicit arguments

* Update train.py

* Add run method

* Update DDP backend `if dist.is_nccl_available()` (ultralytics#3705)

* [x]W&B: Don't resume transfer learning runs (ultralytics#3604)

* Allow config change

* Allow val change in wandb config

* Don't resume transfer learning runs

* Add entity in log dataset

* Update 4 main ops for paths and .run() (ultralytics#3715)

* Add yolov5/ to path

* rename functions to run()

* cleanup

* rename fix

* CI fix

* cleanup find models/export.py

* Fix `img2label_paths()` order (ultralytics#3720)

* Fix `img2label_paths()` order

* fix, 1

* Fix typo (ultralytics#3729)

* Backwards compatible cache version checks (ultralytics#3730)

* Update readme.

* Update `check_datasets()` for dynamic unzip path (ultralytics#3732)

@kalenmike

* Create `data/hyps` directory (ultralytics#3747)

* Force non-zero hyp evolution weights `w` (ultralytics#3748)

Fix for ultralytics#3741

* Edit comment (ultralytics#3759)

edit comment

* Add optional dataset.yaml `path` attribute (ultralytics#3753)

* Add optional dataset.yaml `path` attribute

@kalenmike

* pass locals to python scripts

* handle lists

* update coco128.yaml

* Capitalize first letter

* add test key

* finalize GlobalWheat2020.yaml

* finalize objects365.yaml

* finalize SKU-110K.yaml

* finalize SKU-110K.yaml

* finalize VisDrone.yaml

* NoneType fix

* update download comment

* voc to VOC

* update

* update VOC.yaml

* update VOC.yaml

* remove dashes

* delete get_voc.sh

* force coco and coco128 to ../datasets

* Capitalize Argoverse_HD.yaml

* Capitalize Objects365.yaml

* update Argoverse_HD.yaml

* coco segments fix

* VOC single-thread

* update Argoverse_HD.yaml

* update data_dict in test handling

* create root

* COCO annotations JSON fix (ultralytics#3764)

* Add `xyxy2xywhn()` (ultralytics#3765)

* Edit Comments for numpy2torch tensor process

Edit Comments for numpy2torch tensor process

* add xyxy2xywhn

add xyxy2xywhn

* add xyxy2xywhn

* formatting

* pass arguments

pass arguments

* edit comment for xyxy2xywhn()

edit comment for xyxy2xywhn()

* cleanup datasets.py

Co-authored-by: Glenn Jocher <[email protected]>

* Remove DDP MultiHeadAttention fix (ultralytics#3768)

* fix/incorrect_fitness_import (ultralytics#3770)

* W&B: Update Tables API and comply with new dataset_check (ultralytics#3772)

* Update tables API and windows path fix

* update dataset check

* NGA xView 2018 Dataset Auto-Download (ultralytics#3775)

* update clip_coords for numpy

* uncomment

* cleanup

* Add autosplits

* fix

* cleanup

* Update README.md fix banner width (ultralytics#3785)

* Objectness IoU Sort (ultralytics#3610)

Co-authored-by: U-LAPTOP-5N89P8V7\banhu <[email protected]>

* Update objectness IoU sort (ultralytics#3786)

* Create hyp.scratch-p6.yaml (ultralytics#3787)

* Fix datasets for aws and get_coco.sh (ultralytics#3788)

* merge master

* Update get_coco.sh

* Update seeds for single-GPU reproducibility (ultralytics#3789)

For seed=0 on single-GPU.

* Update Usage examples (ultralytics#3790)

* nvcr.io/nvidia/pytorch:21.06-py3 (ultralytics#3791)

* Update Dockerfile (ultralytics#3792)

* FROM nvcr.io/nvidia/pytorch:21.05-py3 (ultralytics#3794)

* Fix competition link (ultralytics#3799)

* link to the competition repaired

* Update README.md

Co-authored-by: Glenn Jocher <[email protected]>

* Fix warmup `accumulate` (ultralytics#3722)

* gradient accumulation during warmup in train.py

Context:
`accumulate` is the number of batches/gradients accumulated before calling the next optimizer.step().
During warmup, it is ramped up from 1 to the final value nbs / batch_size. 
Although I have not seen this in other libraries, I like the idea. During warmup, when gradients are large, overly large steps are more of an issue than gradient noise due to small steps.

The bug:
The condition to perform the opt step is wrong
> if ni % accumulate == 0:
This produces irregular step sizes if `accumulate` is not constant. It becomes relevant when batch_size is small and `accumulate` changes many times during warmup.

This demo also shows the proposed solution, to use a ">=" condition instead:
https://colab.research.google.com/drive/1MA2z2eCXYB_BC5UZqgXueqL_y1Tz_XVq?usp=sharing

Further, I propose not to restrict the number of warmup iterations to >= 1000. If the user changes hyp['warmup_epochs'], this causes unexpected behavior. Also, it makes evolution unstable if this parameter was to be optimized.

* replace last_opt_step tracking by do_step(ni)

* add docstrings

* move down nw

* Update train.py

* revert math import move

Co-authored-by: Glenn Jocher <[email protected]>
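
The warmup bug described in this commit message can be reproduced with a small simulation. This is a sketch under assumptions: `ramp` is a simplified linear ramp with rounding rather than the schedule in train.py, and `step_gaps` is a hypothetical helper that records how many batches pass between optimizer steps under each condition.

```python
def ramp(ni, nw, target):
    """Simplified warmup: ramp the accumulation count from 1 to `target` over `nw` iterations."""
    if ni >= nw:
        return target
    return max(1, round(1 + (target - 1) * ni / nw))


def step_gaps(nw, target, total, use_modulo):
    """Return the number of batches between successive optimizer steps."""
    gaps, last_step = [], -1
    for ni in range(total):
        accumulate = ramp(ni, nw, target)
        if use_modulo:
            do_step = ni % accumulate == 0  # original condition
        else:
            do_step = ni - last_step >= accumulate  # proposed ">=" condition
        if do_step:
            gaps.append(ni - last_step)
            last_step = ni
    return gaps


modulo_gaps = step_gaps(nw=100, target=8, total=200, use_modulo=True)
geq_gaps = step_gaps(nw=100, target=8, total=200, use_modulo=False)
```

In this simulation the modulo condition yields gaps that jump up and down while `accumulate` is changing, whereas the `>=` condition yields gaps that only grow until they reach the final value.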

* Add feature map visualization (ultralytics#3804)

* Add feature map visualization

Add a feature_visualization function to visualize the mid feature map of the model.

* Update yolo.py

* remove boolean from forward and reorder if statement

* remove print from forward

* General cleanup

* Indent

* Update plots.py

Co-authored-by: Glenn Jocher <[email protected]>

* Update `feature_visualization()` (ultralytics#3807)

* Update `feature_visualization()`

Only plot for data with height, width > 1

* cleanup

* Cleanup

* Fix for `dataset_stats()` with updated data.yaml (ultralytics#3819)

@kalenmike

* Move IoU functions to metrics.py (ultralytics#3820)

* Concise `TransformerBlock()` (ultralytics#3821)

* Update setup.py to use utf8 everywhere.

* Update setup.py to use utf8 everywhere again.

* Fix `LoadStreams()` dataloader frame skip issue (ultralytics#3833)

* Update datasets.py to read every 4th frame of streams

* Update datasets.py

Co-authored-by: Glenn Jocher <[email protected]>

* Plot `AutoShape()` detections in ascending order (ultralytics#3843)

* Copy-Paste augmentation for YOLOv5 (ultralytics#3845)

* Copy-paste augmentation initial commit

* if any segments

* Add obscuration rejection

* Add copy_paste hyperparameter

* Update comments

* Created using Colaboratory

* Created using Colaboratory

* Add EXIF rotation to YOLOv5 Hub inference (ultralytics#3852)

* rotating an image according to its exif tag

* Update common.py

* Update datasets.py

* Update datasets.py

faster

* delete extraneous gpg file

* Update common.py

Co-authored-by: Glenn Jocher <[email protected]>

* `--evolve 300` generations CLI argument (ultralytics#3863)

* evolve command accepts argument for number of generations

* evolve generations argument used in evolve for loop

* evolve argument boolean fixes

* default to 300 evolve generations

* Update train.py

Co-authored-by: John San Soucie <[email protected]>
Co-authored-by: Glenn Jocher <[email protected]>

* Add multi-stream saving feature (ultralytics#3864)

* Added the recording feature for multiple streams

Thanks for the very cool repo!!
I was trying to record multiple feeds at the same time, but the current version of the detector had only one video writer and one vid_path!
So the streams were not being saved: each was initialized with only one frame, and the whole feed was never recorded.

Fix:
I made lists of `vid_writer` and `vid_path`, and the index `i` from the loop over `pred` selects the writer that needs to be used!

I hope this helps, Thanks!

* Cleanup list lengths

* batch size variable

* Update datasets.py

Co-authored-by: Glenn Jocher <[email protected]>
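
The fix described above (one writer and one path per stream, indexed by `i`) can be sketched as follows. `FakeWriter` and `save_frame` are hypothetical stand-ins so the example runs without a video library; the strings used as frames are placeholders.

```python
class FakeWriter:
    """Stand-in for a video writer; collects frames in memory."""

    def __init__(self, path):
        self.path = path
        self.frames = []

    def write(self, frame):
        self.frames.append(frame)


batch_size = 2  # number of streams processed together
vid_path = [None] * batch_size    # one output path per stream
vid_writer = [None] * batch_size  # one writer per stream


def save_frame(i, save_path, frame):
    """Write `frame` to the writer that belongs to stream `i`."""
    if vid_path[i] != save_path:  # first frame for this output file
        vid_path[i] = save_path
        vid_writer[i] = FakeWriter(save_path)
    vid_writer[i].write(frame)


for t in range(3):  # three time steps
    for i in range(batch_size):  # index over the per-stream predictions
        save_frame(i, f"stream{i}.mp4", f"frame{t}-s{i}")
```

With a single shared writer and path, each new stream would reset the writer on every frame, which matches the symptom reported above.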

* Created using Colaboratory

* Models `*.yaml` reformat (ultralytics#3875)

* Create `utils/augmentations.py` (ultralytics#3877)

* Create `utils/augmentations.py`

* cleanup

* Improved BGR2RGB speeds (ultralytics#3880)

* Update BGR2RGB ops

* speed improvements

* cleanup

* Evolution commented `hyp['anchors']` fix (ultralytics#3887)

Fix for the `KeyError: 'anchors'` error when starting hyperparameter evolution:
```bash
python train.py --evolve
```

```bash
Traceback (most recent call last):
  File "E:\yolov5\train.py", line 623, in <module>
    hyp[k] = max(hyp[k], v[1])  # lower limit
KeyError: 'anchors'
```

* Hub models `map_location=device` (ultralytics#3894)

* Hub models `map_location=device`

* cleanup

* YOLOv5 + Albumentations integration (ultralytics#3882)

* Albumentations integration

* ToGray p=0.01

* print confirmation

* create instance in dataloader init method

* improved version handling

* transform not defined fix

* assert string update

* create check_version()

* add spaces

* update class comment

* Save PyTorch Hub models to `/root/hub/cache/dir` (ultralytics#3904)

* Create hubconf.py

* Add save_dir variable

Co-authored-by: Glenn Jocher <[email protected]>

* Feature visualization update (ultralytics#3920)

* Feature visualization update

* Save to jpg (faster)

* Save to png

* Fix `torch.hub.list('ultralytics/yolov5')` pathlib bug (ultralytics#3921)

* Update `setattr()` default for Hub PIL images (ultralytics#3923)

Fix inference from PIL source.

* `feature_visualization()` CUDA fix (ultralytics#3925)

* Update `dataset_stats()` for zipped datasets (ultralytics#3926)

* Update `dataset_stats()` for zipped datasets

@kalenmike

* cleanup

* Fix inconsistent NMS IoU value for COCO (ultralytics#3934)

Evaluation of 'best' and 'last' models will use the same params as the evaluation during the training phase. 
This PR fixes ultralytics#3907

* Created using Colaboratory

* Feature visualization improvements 32 (ultralytics#3947)

* Update augmentations.py (ultralytics#3948)

* Cache v0.4 update (ultralytics#3954)

* Numerical stability fix for Albumentations (ultralytics#3958)

* Update `albumentations>=1.0.2` (ultralytics#3966)

* Update `np.random.random()` to `random.random()` (ultralytics#3967)

* Update requirements.txt `albumentations>=1.0.2` (ultralytics#3972)

* `Ensemble()` visualize fix (ultralytics#3973)

* fix visualize error

* Revert "fix visualize error"

* add visualise profile

* Created using Colaboratory

* Update `probability` to `p` (ultralytics#3980)

* Alert (no detections) (ultralytics#3984)

* `Detections()` class `print()` overload

* Update common.py

* Update README.md (ultralytics#3996)

* Rename `test.py` to `val.py` (ultralytics#4000)

* W&B sweeps support (ultralytics#3938)

* Add support for W&B Sweeps

* Update and reformat

* Update search space

* reformat

* reformat sweep.py

* Update sweep.py

* Move sweeps files to wandb dir

* Remove print

Co-authored-by: Glenn Jocher <[email protected]>

* Update greetings.yml (ultralytics#4024)

* Update greetings.yml

* Update greetings.yml

* Add `--sync-bn` known issue (ultralytics#4032)

* Add `--sync-bn` known issue

* Update train.py

* Update greetings.yml (ultralytics#4037)

* Update README.md (ultralytics#4041)

* Update README.md

* Update README.md

* Update README.md

* AutoShape PosixPath support (ultralytics#4047)

* AutoShape PosixPath support

Usage example:

```python
from pathlib import Path

model = ...
file = Path('data/images/zidane.jpg')

results = model(file)
```

* Update common.py

* `val.py` refactor (ultralytics#4053)

* val.py refactor

* cleanup

* cleanup

* cleanup

* cleanup

* save after eval

* opt.imgsz bug fix

* wandb refactor

* dataloader to train_loader

* capitalize global variables

* runs/hub/exp to runs/detect/exp

* refactor wandb logging

* Refactor wandb operations (ultralytics#4061)

Co-authored-by: Ayush Chaurasia <[email protected]>

* Module `super().__init__()` (ultralytics#4065)

* Module `super().__init__()`

* remove NMS

* Missing `nc` and `names` handling in check_dataset() (ultralytics#4066)

* Created using Colaboratory

* Albumentations >= 1.0.3 (ultralytics#4068)

* W&B: fix refactor bugs (ultralytics#4069)

* Refactor `export.py` (ultralytics#4080)

* Refactor `export.py`

* cleanup

* Update check_requirements()

* Update export.py

* Addition refactor `export.py` (ultralytics#4089)

* Addition refactor `export.py`

* Update export.py

* Add train.py `--img-size` floor (ultralytics#4099)

* Update resume.py (ultralytics#4115)

* Fix indentation in `log_training_progress()` (ultralytics#4126)

* Update README.md (ultralytics#4134)

* ONNX inference update (ultralytics#4073)

* Rename `opset_version` to `opset` (ultralytics#4135)

* Update train.py (ultralytics#4136)

* Refactor train.py

* Update imports

* Update imports

* Update optimizer

* cleanup

* Refactor train.py and val.py `loggers` (ultralytics#4137)

* Update loggers

* Config

* Update val.py

* cleanup

* fix1

* fix2

* fix3 and reformat

* format sweep.py

* Logger() class

* cleanup

* cleanup2

* wandb package import fix

* wandb package import fix2

* txt fix

* fix4

* fix5

* fix6

* drop wandb into utils/loggers

* fix 7

* rename loggers/wandb_logging to loggers/wandb

* Update message

* Update message

* Update message

* cleanup

* Fix x axis bug

* fix rank 0 issue

* cleanup

* Update README.md (ultralytics#4143)

* Add `export.py` ONNX inference suggestion (ultralytics#4146)

* Created using Colaboratory

* New CSV Logger (ultralytics#4148)

* New CSV Logger

* cleanup

* move batch plots into Logger

* rename comment

* Remove total loss from progress bar

* mloss :-1 bug fix

* Update plot_results()

* Update plot_results()

* plot_results bug fix

* Created using Colaboratory

* Update dataset headers (ultralytics#4162)

* Update script headers (ultralytics#4163)

* Update download script headers

* cleanup

* bug fix attempt

* bug fix attempt2

* bug fix attempt3

* cleanup

* Improve docstrings and run names (ultralytics#4174)

* Update comments header (ultralytics#4184)

* Train from `--data path/to/dataset.zip` feature (ultralytics#4185)

* Train from `--data path/to/dataset.zip` feature

* Update dataset_stats()

* cleanup

* cleanup2

* Create yolov5-bifpn.yaml (ultralytics#4195)

* Update Hub Path inputs (ultralytics#4200)

* W&B: Restructure code to support the new dataset_check() feature (ultralytics#4197)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

Co-authored-by: Glenn Jocher <[email protected]>

* Update yolov5-bifpn.yaml (ultralytics#4208)

* W&B: More improvements and refactoring (ultralytics#4205)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

* Remove redundant try catch

* More refactoring and bug fixes

* retry

* Reformat using pycharm

* respect LOGGERS include list

Co-authored-by: Glenn Jocher <[email protected]>

* PyCharm reformat (ultralytics#4209)

* PyCharm reformat

* YAML reformat

* Markdown reformat

* Add `@try_except` decorator (ultralytics#4224)

* Explicit `requirements.txt` location (ultralytics#4225)

* Suppress torch 1.9.0 max_pool2d() warning (ultralytics#4227)

* Created using Colaboratory

* Created using Colaboratory

* Fix weight decay comment (ultralytics#4228)

* Update profiler (ultralytics#4236)

* Add `python train.py --freeze N` argument (ultralytics#4238)

* Add freeze as an argument

I train on different platforms and sometimes I want to freeze some layers. I have to go into the code and change it and also keep track of how many layers I froze on what platform. Please add the number of layers to freeze as an argument in future versions thanks.

* Update train.py

* Update train.py

* Cleanup

Co-authored-by: Glenn Jocher <[email protected]>
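
A freeze-by-count option like the one requested here could work roughly as sketched below. This is an assumption-laden illustration: `Param` stands in for a framework parameter object, `freeze_layers` is a hypothetical helper, and the `model.<index>.` naming scheme is assumed for the example.

```python
class Param:
    """Stand-in for a trainable parameter with a requires_grad flag."""

    def __init__(self):
        self.requires_grad = True


def freeze_layers(named_params, n):
    """Disable gradients for parameters in the first `n` layers; return the frozen names."""
    # The trailing dot keeps 'model.1.' from also matching 'model.10.'
    prefixes = [f"model.{i}." for i in range(n)]
    frozen = []
    for name, param in named_params:
        param.requires_grad = not any(name.startswith(p) for p in prefixes)
        if not param.requires_grad:
            frozen.append(name)
    return frozen


params = [(f"model.{i}.conv.weight", Param()) for i in range(12)]
frozen = freeze_layers(params, 2)
```

Passing the count as an argument removes the need to edit the training script on each platform.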

* Update `profile()` for CUDA Memory allocation (ultralytics#4239)

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Update profile()

* Cleanup

* Add `train.py` and `val.py` callbacks (ultralytics#4220)

* added callbacks

* Update callbacks.py

* Update train.py

* Update val.py

* Fix CamelCase add staticmethod

* Refactor logger into callbacks

* Cleanup

* New callback on_val_image_end()

* Add curves and results images to TensorBoard

Co-authored-by: Glenn Jocher <[email protected]>

* W&B: suppress warnings (ultralytics#4257)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

* Remove redundant try catch

* More refactoring and bug fixes

* retry

* Reformat using pycharm

* respect LOGGERS include list

* call wandblogger.log instead of wandb.log

Co-authored-by: Glenn Jocher <[email protected]>

* Update AP calculation (ultralytics#4260)

* Update AP calculation

* Cleanup

* Remove original

* Update Autoshape forward header (ultralytics#4271)

* Update variables (ultralytics#4273)

* Add `DWConvClass()` (ultralytics#4274)

* Add `DWConvClass()`

* Cleanup

* Cleanup2

* Update 'results saved to' string (ultralytics#4275)

* W&B: Fix sweep bug (ultralytics#4276)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

* Remove redundant try catch

* More refactoring and bug fixes

* retry

* Reformat using pycharm

* respect LOGGERS include list

* call wandblogger.log instead of wandb.log

* Fix Sweep bug

Co-authored-by: Glenn Jocher <[email protected]>

* Feature `python train.py --cache disk` (ultralytics#4049)

* Add cache-on-disk and cache-directory to cache images on disk

* Fix load_image with cache_on_disk

* Add no_cache flag for load_image

* Revert the parts('logging' and a new line) that do not need to be modified

* Add the assertion for shapes of cached images

* Add a suffix string for cached images

* Fix boundary-error of letterbox for load_mosaic

* Add prefix as cache-key of cache-on-disk

* Update cache-function on disk

* Add psutil in requirements.txt

* Update train.py

* Cleanup1

* Cleanup2

* Skip existing npy

* Include re-space

* Export return character fix

Co-authored-by: Glenn Jocher <[email protected]>

* Fixed logging level in distributed mode (ultralytics#4284)

Co-authored-by: fkwong <[email protected]>

* Simplify callbacks (ultralytics#4289)

* Evolve in CSV format (ultralytics#4307)

* Update evolution to CSV format

* Update

* Update

* Update

* Update

* Update

* reset args

* reset args

* reset args

* plot_results() fix

* Cleanup

* Cleanup2

* Update newline (ultralytics#4308)

* Update README.md (ultralytics#4309)

remove unnecessary "`"

* Simpler code for DWConvClass (ultralytics#4310)

* Simpler code for DWConvClass

Simpler code for DWConvClass

* remove DWConv function

* Replace DWConvClass with DWConv

* `int(mlc)` (ultralytics#4385)

* Fix module count in parse_model (ultralytics#4379)

Co-authored-by: yangyuantao <[email protected]>

* Created using Colaboratory

* Update README.md (ultralytics#4387)

* W&B: Add advanced features tutorial (ultralytics#4384)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

* Remove redundant try catch

* More refactoring and bug fixes

* retry

* Reformat using pycharm

* respect LOGGERS include list

* Initial readme update

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

Co-authored-by: Glenn Jocher <[email protected]>

* W&B: Fix for 4360 (ultralytics#4388)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

* Remove redundant try catch

* More refactoring and bug fixes

* retry

* Reformat using pycharm

* respect LOGGERS include list

* Fix

* fix

Co-authored-by: Glenn Jocher <[email protected]>

* Fix rename `utils.google_utils` to `utils.downloads` (ultralytics#4393)

* Simplify ONNX inference command (ultralytics#4405)

* No cache option for reading datasets (ultralytics#4376)

* no cache option

* no cache option

* bit change

* changed to 0,1 instead of True False

* Update train.py

* Update datasets.py

Co-authored-by: Glenn Jocher <[email protected]>

* Update plots.py (ultralytics#4407)

* Add `yolov5s-ghost.yaml` (ultralytics#4412)

* Add yolov5s-ghost.yaml

* Finish C3Ghost

* Add C3Ghost to list

* Add C3Ghost to number of repeats if statement

* Fixes

* Cleanup

* Remove `encoding='ascii'` (ultralytics#4413)

* Remove `encoding='ascii'`

* Reinstate `encoding='ascii'` in emojis()

* Merge PIL and OpenCV in `plot_one_box(use_pil=False)` (ultralytics#4416)

* Merge PIL and OpenCV box plotting functions

* Add ASCII check to plot_one_box

* Cleanup

* Cleanup2

* Created using Colaboratory

* Standardize headers and docstrings (ultralytics#4417)

* Implement new headers

* Reformat 1

* Reformat 2

* Reformat 3 - math

* Reformat 4 - yaml

* Add `SPPF()` layer (ultralytics#4420)

* Add `SPPF()` layer

* Cleanup

* Add credit

* Created using Colaboratory

* Remove DDP process group timeout (ultralytics#4422)

* Update hubconf.py attempt_load import (ultralytics#4428)

* TFLite prep (ultralytics#4436)

* Add TensorFlow and TFLite export (ultralytics#1127)

* Add models/tf.py for TensorFlow and TFLite export

* Set auto=False for int8 calibration

* Update requirements.txt for TensorFlow and TFLite export

* Read anchors directly from PyTorch weights

* Add --tf-nms to append NMS in TensorFlow SavedModel and GraphDef export

* Remove check_anchor_order, check_file, set_logging from import

* Reformat code and optimize imports

* Autodownload model and check cfg

* update --source path, img-size to 320, single output

* Adjust representative_dataset

* Put representative dataset in tfl_int8 block

* detect.py TF inference

* weights to string

* weights to string

* cleanup tf.py

* Add --dynamic-batch-size

* Add xywh normalization to reduce calibration error

* Update requirements.txt

TensorFlow 2.3.1 -> 2.4.0 to avoid int8 quantization error

* Fix imports

Move C3 from models.experimental to models.common

* Add models/tf.py for TensorFlow and TFLite export

* Set auto=False for int8 calibration

* Update requirements.txt for TensorFlow and TFLite export

* Read anchors directly from PyTorch weights

* Add --tf-nms to append NMS in TensorFlow SavedModel and GraphDef export

* Remove check_anchor_order, check_file, set_logging from import

* Reformat code and optimize imports

* Autodownload model and check cfg

* update --source path, img-size to 320, single output

* Adjust representative_dataset

* detect.py TF inference

* Put representative dataset in tfl_int8 block

* weights to string

* weights to string

* cleanup tf.py

* Add --dynamic-batch-size

* Add xywh normalization to reduce calibration error

* Update requirements.txt

TensorFlow 2.3.1 -> 2.4.0 to avoid int8 quantization error

* Fix imports

Move C3 from models.experimental to models.common

* implement C3() and SiLU()

* Fix reshape dim to support dynamic batching

* Add epsilon argument in tf_BN, which is different between TF and PT

* Set stride to None if not using PyTorch, and do not warmup without PyTorch

* Add list support in check_img_size()

* Add list input support in detect.py

* sys.path.append('./') to run from yolov5/

* Add int8 quantization support for TensorFlow 2.5

* Add get_coco128.sh

* Remove --no-tfl-detect in models/tf.py (Use tf-android-tfl-detect branch for EdgeTPU)

* Update requirements.txt

* Replace torch.load() with attempt_load()

* Update requirements.txt

* Add --tf-raw-resize to set half_pixel_centers=False

* Add --agnostic-nms for TF class-agnostic NMS

* Cleanup after merge

* Cleanup2 after merge

* Cleanup3 after merge

* Add tf.py docstring with credit and usage

* pb saved_model and tflite use only one model in detect.py

* Add use cases in docstring of tf.py

* Remove redundant `stride` definition

* Remove keras direct import

* Fix `check_requirements(('tensorflow>=2.4.1',))`

Co-authored-by: Glenn Jocher <[email protected]>

* Fix default `--weights yolov5s.pt` (ultralytics#4458)

* Fix missing labels after albumentations (ultralytics#4455)

* fix missing labels after augmentation

* Update datasets.py

Cleanup

Co-authored-by: Huu Quan <[email protected]>
Co-authored-by: Glenn Jocher <[email protected]>

* `check_requirements(('coremltools',))` (ultralytics#4478)

* `check_requirements(('coremltools',))`

* Update ci-testing.yml

* Update ci-testing.yml

* W&B: Refactor the wandb_utils.py file (ultralytics#4496)

* Improve docstrings and run names

* default wandb login prompt with timeout

* return key

* Update api_key check logic

* Properly support zipped dataset feature

* update docstring

* Revert tutorial change

* extend changes to log_dataset

* add run name

* bug fix

* bug fix

* Update comment

* fix import check

* remove unused import

* Hardcode .yaml file extension

* reduce code

* Reformat using pycharm

* Remove redundant try catch

* More refactoring and bug fixes

* retry

* Reformat using pycharm

* respect LOGGERS include list

* Fix

* fix

* refactor constructor

* refactor

* refactor

* refactor

* PyCharm reformat

Co-authored-by: Glenn Jocher <[email protected]>

* Add `install=True` argument to `check_requirements` (ultralytics#4512)

* Add `install=True` argument to `check_requirements`

* Update general.py

* Automatic TFLite uint8 determination (ultralytics#4515)

* Auto TFLite uint8 detection

This PR automatically determines if TFLite models are uint8 quantized rather than accepting a manual argument.

The quantization determination is based on a comment by @zldrobit in ultralytics#1127.

* Cleanup
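
A minimal sketch of the uint8 determination described for ultralytics#4515, not the code from the PR. It assumes the input details come from `tf.lite.Interpreter.get_input_details()`, whose entries are dicts carrying a `dtype` and a `quantization` pair of `(scale, zero_point)`. The helper names `is_uint8_quantized` and `quantize_pixel` are hypothetical, and the check is done by dtype name so the sketch runs without TensorFlow or NumPy installed.

```python
def is_uint8_quantized(input_detail):
    """Return True if a TFLite input-detail dict reports a uint8 input dtype.

    `input_detail` is assumed to be one entry of
    `interpreter.get_input_details()`; only its "dtype" key is read.
    """
    dtype = input_detail["dtype"]
    # NumPy scalar types expose __name__, dtype instances expose .name,
    # and a plain string falls through to str().
    name = getattr(dtype, "__name__", None) or getattr(dtype, "name", None) or str(dtype)
    return name == "uint8"


def quantize_pixel(value, scale, zero_point):
    """Map a float input value to uint8 using the model's quantization params."""
    q = round(value / scale + zero_point)
    return max(0, min(255, q))  # clamp to the uint8 range


# Hypothetical input details for a quantized and a float model.
quantized = {"dtype": "uint8", "quantization": (1 / 255, 0)}
floating = {"dtype": "float32", "quantization": (0.0, 0)}

for detail in (quantized, floating):
    if is_uint8_quantized(detail):
        scale, zero_point = detail["quantization"]
        print("uint8 model, 1.0 ->", quantize_pixel(1.0, scale, zero_point))
    else:
        print("float model, input passed through unchanged")
```

The point of the design is that the model file itself says whether it is quantized, so no manual flag has to be kept in sync with the exported weights.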

* Fix for `python models/yolo.py --profile` (ultralytics#4541)

Profiling fix copies input to Detect layer to circumvent inplace changes to the feature maps.

* Auto-fix corrupt JPEGs (ultralytics#4548)

* Autofix corrupt JPEGs

This PR automatically re-saves corrupt JPEGs and trains with the resaved images. WARNING: this will overwrite the existing corrupt JPEGs in a dataset and replace them with correct JPEGs, though the filesize may increase and the image contents may not be exactly the same due to lossy JPEG compression schemes. Results may vary by JPEG decoder and hardware.

Current behavior is to exclude corrupt JPEGs from training with a warning to the user, but many users have been complaining about large parts of their dataset being excluded from training.

* Clarify re-save reason

* Fix for corrupt JPEGs auto-fix PR (ultralytics#4560)

Auto-fix corrupt JPEGs PR introduced a bug whereby the f.seek() operation read all of the bytes in the image, resulting in the PIL image having nothing to read upon the .save() operation. 

Fix was to re-open the image using PIL before saving.
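
A minimal sketch of how a corrupt JPEG might be detected before re-saving, as discussed in ultralytics#4548 and ultralytics#4560. It assumes corruption shows up as a missing end-of-image marker (`FF D9`) in the last two bytes of the file; that criterion and the function name `jpeg_is_truncated` are assumptions for illustration. The re-save itself is left as a comment because it needs Pillow.

```python
import os

JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker


def jpeg_is_truncated(path):
    """Return True if the file does not end with the JPEG end-of-image marker."""
    if os.path.getsize(path) < 2:
        return True  # too short to hold the marker at all
    with open(path, "rb") as f:
        f.seek(-2, os.SEEK_END)  # jump to the last two bytes
        return f.read() != JPEG_EOI


# Usage idea (requires Pillow, not executed here): re-open the file by path
# before saving, rather than reusing a handle whose bytes were already read.
#
#   from PIL import Image
#   if jpeg_is_truncated(path):
#       Image.open(path).save(path, format="JPEG")
```

Because the check opens its own file handle and closes it before any re-save, the image library always starts reading from a fresh handle, which is the behaviour the follow-up fix restores.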

* Fix for AP calculation limits 0.0 - 1.0 (ultralytics#4563)

This PR brings alignment in AP computation practices with Detectron2 and MMDetection. 

Problem first noted by @yusiyoh in ultralytics#4546
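
A minimal sketch of an AP computation whose recall axis is pinned to the 0.0 - 1.0 limits, under the assumption that the fix in ultralytics#4563 concerns the sentinel values appended to the precision-recall curve. This is not the repository's implementation: it is pure Python, integrates the precision envelope step-wise rather than by interpolation, and the function name `average_precision` is hypothetical.

```python
def average_precision(recall, precision):
    """Area under the precision envelope over recall in [0.0, 1.0].

    `recall` must be sorted ascending and `precision` must hold the
    matching precision values.
    """
    # Sentinels: the curve starts at recall 0.0 and ends at recall 1.0.
    mrec = [0.0] + list(recall) + [1.0]
    mpre = [1.0] + list(precision) + [0.0]

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])

    # Sum rectangle areas wherever recall advances.
    ap = 0.0
    for i in range(1, len(mrec)):
        ap += (mrec[i] - mrec[i - 1]) * mpre[i]
    return ap


print(average_precision([0.5, 1.0], [1.0, 0.5]))  # 0.75
print(average_precision([0.5], [1.0]))            # 0.5
```

With the end sentinel fixed at recall 1.0 and precision 0.0, any recall the detector never reaches contributes zero area, so the result stays within 0.0 - 1.0.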

* ONNX opset 13 (ultralytics#4566)

* Add EarlyStopping feature (ultralytics#4576)

* Add EarlyStopping feature

* Add comment

* Cleanup

* Cleanup2

* debug

* debug2

* debug3

* debug3

* debug4

* debug5

* debug6

* debug7

* debug8

* debug9

* debug10

* debug11

* debug12

* Cleanup

* Add TODO for known DDP issue

* Remove `image_weights` DDP code (ultralytics#4579)

* Initial commit

* Update

* Add `Profile()` profiler (ultralytics#4587)

* Add `Profile()` profiler

* CamelCase Timeout

* Fix bug in `plot_one_box` when label is `None` (ultralytics#4588)

* Create `Annotator()` class (ultralytics#4591)

* Add Annotator() class

* Download Arial

* 2x for loop

* Cleanup

* tuple 2 list

* max_size=1920

* bold logging results to

* tolist()

* im = annotator.im

* PIL save in detect.py

* Smart asarray in detect.py

* revert to cv2.imwrite

* Cleanup

* Return result asarray

* Add `Profile()` profiler

* CamelCase Timeout

* Resize after mosaic

* pillow>=8.0.0

* daemon imwrite

* Add cv2 support

* Remove plot_wh_methods and plot_one_box

* pil=False for hubconf.py annotations

* im.shape bug fix

* colorstr common.py

* join daemons

* Update t.daemon

* Removed daemon saving

* Auto-UTF handling (ultralytics#4594)

* Re-order `plots.py` to class-first (ultralytics#4595)

* Created using Colaboratory

* Update mosaic plots font size (ultralytics#4596)

* TensorBoard `on_train_end()` speed improvements (ultralytics#4605)

* Created using Colaboratory

* Auto-download Arial.ttf on init (ultralytics#4606)

* Auto-download Arial.ttf on init

* Fix ROOT

* Fix: add P2 layer 21 to yolov5-p2.yaml `Detect()` inputs (ultralytics#4608)

Layer 21 includes the information of xsmall objects

* Update `check_git_status()` warning (ultralytics#4610)

* W&B: Don't log models in evolve operation (ultralytics#4611)

* Close `matplotlib` plots after opening (ultralytics#4612)

* Close plots

* Replace fig.close() for plt.close()

* DDP `torch.jit.trace()` `--sync-bn` fix (ultralytics#4615)

* Remove assert

* debug0

* trace=not opt.sync

* sync to sync_bn fix

* Cleanup

* Fix for Arial.ttf redownloads with hub inference (ultralytics#4627)

* Fix 2 for Arial.ttf redownloads with hub inference (ultralytics#4628)

* Fix 3 for Arial.ttf redownloads with hub inference (ultralytics#4629)

Fix 3 for Arial.ttf redownloads with hub inference, follow-on to ultralytics#4628.

* Checkpoint code.

* Fix for `plot_evolve()` string argument (ultralytics#4639)

* Fix `is_coco` on missing `data['val']` key (ultralytics#4642)

* Fix workers to 1 for Windows and fix issue with image_size not being used correctly during training

* Remove mojo files.

* Add mojo_test.py and update gitignore.

* Move entity and project to variables.

* Update installation of dependencies to only if needed and make whl search more generic.

* Fix missing parameter in _find_module_wheel_path.

* Remove extra prints.

* Fix weights download bug and pretraining always using yolov5s weights.

* Update code to work with Ultralytics YOLOv5:4 env.

* Add confidence threshold plot

* Minor cleanup of azure_wrapper.

* Fix click/typer incompatibility before 4.0.0

* Restore gitignore and remove wrong error import print in Azure wrapper.

* Fix wrong typer version in requirements.

Co-authored-by: Glenn Jocher <[email protected]>
Co-authored-by: Ayush Chaurasia <[email protected]>
Co-authored-by: Kalen Michael <[email protected]>
Co-authored-by: masood azhar <[email protected]>
Co-authored-by: Wei Quan <[email protected]>
Co-authored-by: xiaowk5516 <[email protected]>
Co-authored-by: Mai Thanh Minh <[email protected]>
Co-authored-by: SpongeBab <[email protected]>
Co-authored-by: ZouJiu1 <[email protected]>
Co-authored-by: lb-desupervised <[email protected]>
Co-authored-by: Lewis Belcher <[email protected]>
Co-authored-by: fcakyon <[email protected]>
Co-authored-by: Robin <[email protected]>
Co-authored-by: Yonghye Kwon <[email protected]>
Co-authored-by: Piotr Skalski <[email protected]>
Co-authored-by: U-LAPTOP-5N89P8V7\banhu <[email protected]>
Co-authored-by: batrlatom <[email protected]>
Co-authored-by: yellowdolphin <[email protected]>
Co-authored-by: Zigarss <[email protected]>
Co-authored-by: Feras Oughali <[email protected]>
Co-authored-by: Valentin Aliferov <[email protected]>
Co-authored-by: san-soucie <[email protected]>
Co-authored-by: John San Soucie <[email protected]>
Co-authored-by: ketan-b <[email protected]>
Co-authored-by: johnohagan <[email protected]>
Co-authored-by: jmiranda-laplateforme <[email protected]>
Co-authored-by: Eldar Kurtic <[email protected]>
Co-authored-by: KEN <[email protected]>
Co-authored-by: imyhxy <[email protected]>
Co-authored-by: IneovaAI <[email protected]>
Co-authored-by: junji hashimoto <[email protected]>
Co-authored-by: fkwong <[email protected]>
Co-authored-by: Sudhanshu Singh <[email protected]>
Co-authored-by: Yuantao Yang <[email protected]>
Co-authored-by: yangyuantao <[email protected]>
Co-authored-by: Ahmad Mustafa Anis <[email protected]>
Co-authored-by: Omid Sadeghnezhad <[email protected]>
Co-authored-by: Jiacong Fang <[email protected]>
Co-authored-by: Huu Quan, CAP <[email protected]>
Co-authored-by: Huu Quan <[email protected]>
Co-authored-by: Takumi Karasawa <[email protected]>
Co-authored-by: Yukun Xia <[email protected]>
Co-authored-by: vincent <[email protected]>
BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request Aug 26, 2022
* Update loggers

* Config

* Update val.py

* cleanup

* fix1

* fix2

* fix3 and reformat

* format sweep.py

* Logger() class

* cleanup

* cleanup2

* wandb package import fix

* wandb package import fix2

* txt fix

* fix4

* fix5

* fix6

* drop wandb into utils/loggers

* fix 7

* rename loggers/wandb_logging to loggers/wandb

* Update message

* Update message

* Update message

* cleanup

* Fix x axis bug

* fix rank 0 issue

* cleanup