Add a new ImageDataset Class #579

Closed
Marsmaennchen221 opened this issue Mar 22, 2024 · 3 comments · Fixed by #645
Assignees
Labels
enhancement 💡 New feature or request released Included in a release

Comments

@Marsmaennchen221
Contributor

Marsmaennchen221 commented Mar 22, 2024

Is your feature request related to a problem?

The ImageDataset should accept an ImageList as input and one of three output types: an ImageList, a column that is one-hot encoded internally, or a numerical table with values between 0 and 1. If the output is a table, it should have the same number of columns as the NN module has output neurons, and all columns should contain values between 0 and 1. If the output is a column, the number of one-hot encoded values should not exceed the number of output neurons of the NN module.
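To make the table constraint concrete, here is a minimal sketch in plain Python. It is not the library's API: the function name `validate_table_output` and the representation of a table as a dict of column lists are assumptions made only for illustration.

```python
def validate_table_output(columns: dict[str, list[float]], output_neurons: int) -> None:
    """Raise ValueError if the table cannot serve as the target of the network.

    `columns` maps a column name to its values (a hypothetical stand-in for a table).
    """
    # The table needs exactly one column per output neuron.
    if len(columns) != output_neurons:
        raise ValueError(f"expected {output_neurons} columns, got {len(columns)}")

    # Every value must lie in the closed interval [0, 1].
    for name, values in columns.items():
        for value in values:
            if not 0 <= value <= 1:
                raise ValueError(f"column {name!r} contains {value}, which is outside [0, 1]")


# A two-class target for a network with two output neurons passes silently.
validate_table_output({"cat": [1.0, 0.0], "dog": [0.0, 1.0]}, output_neurons=2)
```

Failing early in the dataset, rather than during training, would keep shape and range errors close to the data that caused them.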

Desired solution

Add a new ImageDataset class.
This class should also provide batching methods for NN models.
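As a rough idea of what batching could look like, the sketch below yields aligned slices of inputs and targets. The name `iter_batches` and the use of plain sequences instead of an ImageList are assumptions for illustration, not a proposal for the final interface.

```python
from collections.abc import Iterator, Sequence
from typing import TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def iter_batches(
    inputs: Sequence[X],
    targets: Sequence[Y],
    batch_size: int,
) -> Iterator[tuple[Sequence[X], Sequence[Y]]]:
    """Yield (inputs, targets) slices of at most `batch_size` items each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")

    # Step through the data in strides of batch_size; the last batch may be shorter.
    for start in range(0, len(inputs), batch_size):
        stop = start + batch_size
        yield inputs[start:stop], targets[start:stop]


for batch_inputs, batch_targets in iter_batches(["a.png", "b.png", "c.png"], [0, 1, 0], batch_size=2):
    print(batch_inputs, batch_targets)
```

Whether the last, shorter batch should be kept or dropped is a design choice this sketch leaves open by simply keeping it.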

Possible alternatives (optional)

No response

Screenshots (optional)

No response

Additional Context (optional)

No response

@Marsmaennchen221 Marsmaennchen221 added the enhancement 💡 New feature or request label Mar 22, 2024
@Marsmaennchen221 Marsmaennchen221 self-assigned this Mar 22, 2024
@github-project-automation github-project-automation bot moved this to Backlog in Library Mar 22, 2024
@lars-reimann
Member

Should the output not be a table instead of a column? Say we're doing object classification, then the output could be described as a one-hot encoded vector.

@Marsmaennchen221
Contributor Author

> Should the output not be a table instead of a column? Say we're doing object classification, then the output could be described as a one-hot encoded vector.

Yes, you are right: the table should have the same number of columns as the NN module has output neurons. In this case, the table should only contain numerical columns with values between 0 and 1. For ease of use, we could also allow a column that will be one-hot encoded in the dataset.
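A sketch of the convenience path for a label column, assuming a hypothetical helper `one_hot_encode` and a dict of column lists as a stand-in for a table (neither is taken from the library):

```python
def one_hot_encode(labels: list[str], output_neurons: int) -> dict[str, list[float]]:
    """Turn a column of class labels into one 0/1 column per distinct label."""
    classes = sorted(set(labels))

    # The encoding may not need more columns than the network has output neurons.
    if len(classes) > output_neurons:
        raise ValueError(
            f"{len(classes)} distinct labels do not fit into {output_neurons} output neurons"
        )

    # Each class column holds 1.0 where the row has that label and 0.0 elsewhere.
    return {cls: [1.0 if label == cls else 0.0 for label in labels] for cls in classes}


encoded = one_hot_encode(["cat", "dog", "cat"], output_neurons=2)
print(encoded)  # {'cat': [1.0, 0.0, 1.0], 'dog': [0.0, 1.0, 0.0]}
```

The result only contains values 0 and 1, so it already satisfies the range requirement for table outputs.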

lars-reimann pushed a commit that referenced this issue May 6, 2024

Closes #579, #580, #581 

### Summary of Changes

feat: added `Convolutional2DLayer`, `ConvolutionalTranspose2DLayer`,
`FlattenLayer`, `MaxPooling2DLayer` and `AvgPooling2DLayer`
feat: added `InputConversionImage`, `OutputConversionImageToColumn`,
`OutputConversionImageToTable` and `OutputConversionImageToImage`
feat: added generic `ImageDataset`
feat: added class `ImageSize` and methods `ImageList.sizes` and
`Image.size` to get the sizes of the respective images
feat: added ability to iterate over `SingleSizeImageList`
feat: added param to return filenames in `ImageList.from_files`
feat: added option `None` for no activation function in `ForwardLayer`
feat: added `Image.__array__` to convert an `Image` to a `numpy.ndarray`
feat: added equals check to `OneHotEncoder`
fix: fixed bug #581 by removing
the Softmax function from the last layer in `NeuralNetworkClassifier`
refactor: move `image.utils` to `image._utils`
refactor: extracted test devices from `test_image` to `helpers.devices`

---------

Co-authored-by: megalinter-bot <[email protected]>
@github-project-automation github-project-automation bot moved this from Backlog to ✔️ Done in Library May 6, 2024
lars-reimann pushed a commit that referenced this issue May 9, 2024
## [0.24.0](v0.23.0...v0.24.0) (2024-05-09)

### Features

* `Column.plot_histogram()` using `Table.plot_histograms` for consistent results ([#726](#726)) ([576492c](576492c))
* `Regressor.summarize_metrics` and `Classifier.summarize_metrics` ([#729](#729)) ([1cc14b1](1cc14b1)), closes [#713](#713)
* `Table.keep_only_rows` ([#721](#721)) ([923a6c2](923a6c2))
* `Table.remove_rows` ([#720](#720)) ([a1cdaef](a1cdaef)), closes [#698](#698)
* Add `ImageDataset` and Layer for ConvolutionalNeuralNetworks ([#645](#645)) ([5b6d219](5b6d219)), closes [#579](#579) [#580](#580) [#581](#581)
* added load_percentage parameter to ImageList.from_files to load a subset of the given files ([#739](#739)) ([0564b52](0564b52)), closes [#736](#736)
* added rnn layer and TimeSeries conversion ([#615](#615)) ([6cad203](6cad203)), closes [#614](#614) [#648](#648) [#656](#656) [#601](#601)
* Basic implementation of cell with polars ([#734](#734)) ([004630b](004630b)), closes [#712](#712)
* deprecate `Table.add_column` and `Table.add_row` ([#723](#723)) ([5dd9d02](5dd9d02)), closes [#722](#722)
* deprecated `Table.from_excel_file` and `Table.to_excel_file` ([#728](#728)) ([c89e0bf](c89e0bf)), closes [#727](#727)
* Larger histogram plot if table only has one column ([#716](#716)) ([31ffd12](31ffd12))
* polars implementation of a column ([#738](#738)) ([732aa48](732aa48)), closes [#712](#712)
* polars implementation of a row ([#733](#733)) ([ff627f6](ff627f6)), closes [#712](#712)
* polars implementation of table ([#744](#744)) ([fc49895](fc49895)), closes [#638](#638) [#641](#641) [#649](#649) [#712](#712)
* regularization for decision trees and random forests ([#730](#730)) ([102de2d](102de2d)), closes [#700](#700)
* Remove device information in image class ([#735](#735)) ([d783caa](d783caa)), closes [#524](#524)
* return fitted transformer and transformed table from `fit_and_transform` ([#724](#724)) ([2960d35](2960d35)), closes [#613](#613)

### Bug Fixes

* make `Image.clone` internal ([#725](#725)) ([215a472](215a472)), closes [#626](#626)

### Performance Improvements

* improved performance of `TabularDataset.__eq__` by a factor of up to 2 ([#697](#697)) ([cd7f55b](cd7f55b))
@lars-reimann
Member

🎉 This issue has been resolved in version 0.24.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@lars-reimann lars-reimann added the released Included in a release label May 9, 2024