v0.24.0
0.24.0 (2024-05-09)
This release features completely rewritten containers for tabular data (currently experimental). They use the extremely fast polars library as their backend. Together with a drastically more efficient implementation of our own interface, operations on tabular data are now as fast as they should be.
Previously, even operations on small tables (10000 rows x 50 columns) took very long, as this comparison of `Table` methods shows:
method | old (s) | new (s) | speedup (factor) |
---|---|---|---|
`remove_duplicate_rows` | 0.25474 | 0.01306 | 19.5 |
`remove_rows_with_missing_values` | 0.25159 | 0.00946 | 26.6 |
`remove_rows_with_outliers` | 0.28816 | 0.01034 | 27.9 |
`remove_rows` | 2.69647 | 0.00242 | 1114.2 |
`shuffle_rows` | 0.24690 | 0.00204 | 121.0 |
`slice_rows` | 0.12313 | 0.00011 | 1119.4 |
`sort_rows` | 4.67574 | 0.00372 | 1256.9 |
`split_rows` | 0.24764 | 0.00219 | 113.1 |
`transform_column` | 2.89572 | 0.00030 | 9652.4 ❗ |
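The speedup column is the old runtime divided by the new runtime. As a quick sanity check, the sketch below recomputes that factor for a few rows copied from the table. The names `timings`, `speedup`, and `speedups` are illustrative only and not part of the library.

```python
# Timings in seconds, copied from the comparison table: method -> (old, new).
timings = {
    "remove_duplicate_rows": (0.25474, 0.01306),
    "remove_rows": (2.69647, 0.00242),
    "sort_rows": (4.67574, 0.00372),
    "transform_column": (2.89572, 0.00030),
}


def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new implementation is."""
    return old_seconds / new_seconds


# Round to one decimal place, matching the precision used in the table.
speedups = {
    name: round(speedup(old, new), 1) for name, (old, new) in timings.items()
}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```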
You can find a full list of changes below. Special thanks to all contributors:
Features
- `Column.plot_histogram()` using `Table.plot_histograms` for consistent results (#726) (576492c)
- `Regressor.summarize_metrics` and `Classifier.summarize_metrics` (#729) (1cc14b1), closes #713
- Add `ImageDataset` and Layer for ConvolutionalNeuralNetworks (#645) (5b6d219), closes #579 #580 #581
- added load_percentage parameter to ImageList.from_files to load a subset of the given files (#739) (0564b52), closes #736
- added rnn layer and TimeSeries conversion (#615) (6cad203), closes #614 #648 #656 #601
- Basic implementation of cell with polars (#734) (004630b), closes #712
- deprecate `Table.add_column` and `Table.add_row` (#723) (5dd9d02), closes #722
- deprecated `Table.from_excel_file` and `Table.to_excel_file` (#728) (c89e0bf), closes #727
- Larger histogram plot if table only has one column (#716) (31ffd12)
- polars implementation of a column (#738) (732aa48), closes #712
- polars implementation of a row (#733) (ff627f6), closes #712
- polars implementation of table (#744) (fc49895), closes #638 #641 #649 #712
- regularization for decision trees and random forests (#730) (102de2d), closes #700
- Remove device information in image class (#735) (d783caa), closes #524
- return fitted transformer and transformed table from `fit_and_transform` (#724) (2960d35), closes #613
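
The last item above means that `fit_and_transform` now hands back both the fitted transformer and the transformed table. A minimal sketch of that return shape follows. It uses a toy stand-in, not the library's classes: `ToyScaler`, its method signatures, and the dict-of-lists "table" are assumptions made purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyScaler:
    """Hypothetical transformer that rescales one column into [0, 1]."""

    minimum: float | None = None
    maximum: float | None = None

    def fit(self, table: dict, column: str) -> ToyScaler:
        # Return a new, fitted copy instead of mutating self.
        values = table[column]
        return replace(self, minimum=min(values), maximum=max(values))

    def transform(self, table: dict, column: str) -> dict:
        # Assumes the column is not constant (maximum != minimum).
        span = self.maximum - self.minimum
        scaled = [(value - self.minimum) / span for value in table[column]]
        return {**table, column: scaled}

    def fit_and_transform(self, table: dict, column: str) -> tuple[ToyScaler, dict]:
        # Return both results so the fitted transformer can be reused later.
        fitted = self.fit(table, column)
        return fitted, fitted.transform(table, column)


table = {"a": [0.0, 5.0, 10.0]}
fitted_scaler, scaled_table = ToyScaler().fit_and_transform(table, "a")
```

Returning the fitted transformer alongside the table lets a caller apply the same fitted parameters to other data later without fitting again.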