Working with Images documentation #743

Merged: 4 commits, Apr 9, 2021

thisiseshan
Contributor

Add "Working with Images on Hub" section
Add "docs/source/img" directory to store images linked from the documentation
Add link to the Colab notebook for Working with Images

github-actions bot commented Apr 3, 2021

Locust summary

Git references

Initial: f19d566
Terminal: 56e9248

benchmarks/benchmark_dataset_comparison.py
Changes:
hub/api/dataset.py
Changes:
  • Name: Dataset
    Type: class
    Changed lines: 1
    Total lines: 924
    Changes:
hub/exceptions.py
Changes:
hub/report.py
Changes:
hub/store/dynamic_tensor.py
Changes:
hub/store/metastore.py
Changes:
benchmarks/benchmark_compress_time.py
Changes:
benchmarks/benchmark_dataset_iter.py
Changes:
benchmarks/benchmark_random_access.py
Changes:
benchmarks/benchmark_sequential_access.py
Changes:
  • Name: time_batches
    Type: function
    Changed lines: 23
    Total lines: 23
  • Name: time_tiledb
    Type: function
    Changed lines: 56
    Total lines: 56
  • Name: time_zarr
    Type: function
    Changed lines: 36
    Total lines: 36
  • Name: time_hub
    Type: function
    Changed lines: 11
    Total lines: 11
benchmarks/benchmark_sequential_write.py
Changes:
  • Name: time_batches
    Type: function
    Changed lines: 25
    Total lines: 25
  • Name: time_tiledb
    Type: function
    Changed lines: 25
    Total lines: 25
  • Name: time_zarr
    Type: function
    Changed lines: 12
    Total lines: 12
  • Name: time_hub
    Type: function
    Changed lines: 22
    Total lines: 22
hub/auto/infer.py
Changes:
  • Name: _find_root
    Type: function
    Changed lines: 3
    Total lines: 29

codecov bot commented Apr 3, 2021

Codecov Report

Merging #743 (97f86ac) into master (971e422) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #743   +/-   ##
=======================================
  Coverage   89.26%   89.26%           
=======================================
  Files          63       63           
  Lines        4378     4378           
=======================================
  Hits         3908     3908           
  Misses        470      470           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 971e422...97f86ac.

mynameisvinn commented Apr 3, 2021
Contributor

Hey @thisiseshan, thanks for the PR.

I noticed you resized images before pushing to Hub. Could you convert the dataset to Hub format first, and then use Hub to resize the images? It would feel more functional (rather than procedural), which should be the goal of any dataflow pipeline.

A good place to start is with this notebook.
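
To make the functional-versus-procedural distinction above concrete, here is a minimal plain-Python sketch. It deliberately does not use Hub's API: `resize_nearest`, `make_transform`, and the tiny nested-list "images" are hypothetical stand-ins chosen only for illustration. The idea is that the raw data is stored first, and resizing is declared as a per-sample function mapped over the stored dataset.

```python
from typing import Callable, Iterable, Iterator, List

# Illustrative stand-in: a grayscale image as rows of pixel values.
Image = List[List[int]]


def resize_nearest(image: Image, height: int, width: int) -> Image:
    """Nearest-neighbour resize of a 2-D image to (height, width)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def make_transform(
    fn: Callable[[Image], Image],
) -> Callable[[Iterable[Image]], Iterator[Image]]:
    """Wrap a per-sample function so it maps lazily over a dataset."""
    def apply(samples: Iterable[Image]) -> Iterator[Image]:
        return (fn(sample) for sample in samples)
    return apply


# Raw images are stored first, at their original sizes ...
raw_dataset = [
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]],
    [[0, 255], [255, 0]],
]

# ... and resizing is declared as a transform over the stored dataset.
to_2x2 = make_transform(lambda img: resize_nearest(img, 2, 2))
resized = list(to_2x2(raw_dataset))
```

Because the transform is a pure function of one sample, it can be composed with other transforms or re-run at a different target size without touching the stored raw data.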

mynameisvinn self-requested a review April 3, 2021 13:35
mynameisvinn added the documentation (Improvements or additions to documentation) label Apr 3, 2021

@thisiseshan
Contributor Author

Absolutely! 😄

@mynameisvinn
Contributor

@thisiseshan Thanks for including an example of hub.Transform.

One more thing: I noticed you collected image filenames in a pd.DataFrame. Is there a reason you decided to use a pd.DataFrame instead of something more lightweight, perhaps a list?

@thisiseshan
Contributor Author

I prefer pd.DataFrame over lists for structured data.
In my case, with labelled data, I can generate a CSV file for the complete dataset (which isn't even available on Kaggle), all the while being able to visualize classes and images side by side in tabular form.
Especially when using Hub, I feel pd.DataFrame is the way to go, as it makes uploading data very easy and the code much more readable. Just my thoughts 💭

@Diveafall
Contributor

Hey @thisiseshan! Can you please add a quick note in the guide that using a DataFrame is not required? We want to make sure our users know that there are multiple ways of doing this 🙂
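
As one illustration of the "multiple ways" mentioned above, the following is a minimal sketch of the lightweight list-based alternative, using only the standard library. The `collect_samples` helper and the `<class_name>/<image_file>` layout are assumptions made for this example; they are not taken from the guide or the Colab notebook.

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def collect_samples(paths: List[str]) -> List[Tuple[str, str]]:
    """Pair each image path with a label taken from its parent folder name."""
    return [(p, PurePosixPath(p).parent.name) for p in sorted(paths)]


# Hypothetical file layout: <class_name>/<image_file>
paths = ["dogs/002.jpg", "cats/001.jpg", "dogs/001.jpg"]

samples = collect_samples(paths)
filenames = [path for path, _ in samples]
labels = [label for _, label in samples]
```

A list of `(filename, label)` tuples carries the same information as a two-column DataFrame; the DataFrame mainly adds conveniences such as tabular display and CSV export.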

@thisiseshan
Contributor Author

The same has also been added to the linked Colab file 😄

@Diveafall
Contributor

Thanks a lot @thisiseshan!

Diveafall merged commit 914413c into activeloopai:master on Apr 9, 2021