Changelog#
v4.3.0 (2024-06-08)#
Features#
model: Add check for fitted model in LGBMModel fingerprint. (f6a0933)
Bug Fixes#
v4.2.0 (2024-05-21)#
Features#
transformer: Update ExpressionTransformer to use TypedDict instead of tuples. (3950abd)
v4.1.0 (2024-05-18)#
Features#
tuning: Add support for enqueuing trials in OptunaTuner. (9e0b6b2)
data splitting: Add support for stratification on multiple features in the RandomSplitter. (d745434)
transformer: Add metadata option for the ExpressionTransformer that allows for creation of meta features not tracked in the DataSchema. (f16ea8b)
transformer: Add ExpressionTransformer for creating features using the vaex expression system. (c0faf74)
v4.0.0 (2024-05-09)#
BREAKING CHANGES#
Features#
exporter: Add LocalManifest support for LocalExporter which simplifies caching logic and enables S3 manifest translations. (2199ff0)
exporter: Add support for multiple data export using LocalExporter. (ff988b6)
data source: Add support for reading manifest files from S3 buckets in S3Ingester. (9c68a9b)
pipeline: Add disable_cache parameter to Pipeline execution. (da1e31a)
Bug Fixes#
Code Refactoring#
data source: Extract shared S3 logic to utils which can then be used by S3Exporter. (97a7974)
v3.2.0 (2024-04-18)#
Features#
tuning: Add support for RDSStorage using the OptunaTuner (cc06ddd)
Bug Fixes#
v3.1.0 (2024-04-12)#
Features#
v3.0.0 (2024-04-05)#
BREAKING CHANGES#
model: Update LGBMModel to use dependency injection, now expects a lightgbm.LGBMModel as argument. (7250f34)
Bug Fixes#
v2.2.0 (2024-03-22)#
Features#
filter: Add ImblearnResamplingFilter which is a wrapper for imblearn over- and under-samplers. (77a3d7d)
filter: Add ExpressionFilter and base class for simple DataFrame filtering using vaex expressions. (dc679ff)
cache: Add disable_cache argument to all cached functions to completely bypass all caching functionality. (fbdfc5d)
Documentation#
Update CHANGELOG.md format to include missing categories. (d97b32c)
v2.1.0 (2024-02-24)#
Features#
Update Titanic dataset to mleko 2.0 API. (62bf991)
tuning: Add optuna-dashboard support to OptunaTuner including automatically generated experiment notes. (29d81c2)
transformer: Improve flexibility of LabelEncoderTransformer by adding optional null encoding and manual dictionary mapping. (f7b30a9)
Set cache_directory as optional argument, with custom default locations. (08e8777)
Bug Fixes#
data cleaning: Fix meta_columns not being forcefully cast to correct data type in CSVToVaexConverter. (b42b9ed)
Documentation#
Update year in Copyright in README.md (#192) (eeb56e1)
Tests#
Fix test cases generating cache directory outside temporary directory. (ba57fbf)
v2.0.0 (2024-02-07)#
BREAKING CHANGES#
pipeline: Refactor PipelineStep to use TypedDict for both inputs and outputs. (2eb623c)
Features#
Bug Fixes#
Code Refactoring#
Documentation#
Refactor mleko package documentation to format bullet list correctly. (76ee895)
Continuous Integration#
v1.2.6 (2024-01-25)#
Bug Fixes#
Bump patch release. (ff5f94e)
v1.2.5 (2024-01-25)#
Bug Fixes#
Fix CHANGELOG.md template location (141c9b7)
v1.2.4 (2024-01-25)#
Bug Fixes#
Trigger patch release. (7269dca)
Build#
semantic versioning: Update CHANGELOG.md template and semantic versioning logic. (1727e09)
v1.2.3 (2024-01-25)#
Bug Fixes#
Remove coverage from workflow (09eb09d)
v1.2.2 (2024-01-25)#
Bug Fixes#
Switch to trusted publishing (e84712d)
v1.2.1 (2024-01-25)#
Bug Fixes#
Experiment with semantic versioning (0942196)
Build#
v1.2.0 (2023-10-09)#
Features#
data source: Add support for pattern matching in *Ingester and add LocalManifest to index fetched files. (75974a4)
Bug Fixes#
logging: Fix LGBM logging routing to correct log level. (0e5fa77)
Style#
Build#
Bump gitpython to resolve CVE-2023-41040 and CVE-2023-40590. (79627bd)
v1.1.0 (2023-09-27)#
Features#
tuning: Add hyperparameter tuning functionality, initially including OptunaTuner. (be38c07)
Tests#
tuning: Add test cases for TuneStep. (d811c7d)
v1.0.0 (2023-09-20)#
BREAKING CHANGES#
Improve README.md with more up-to-date information. (b388b59)
Features#
transformer: Add DataSchema API to transformers fit, transform and fit_transform. (e053c85)
Documentation#
Add example notebook for Titanic dataset. (e651af9)
v0.8.1 (2023-09-07)#
Bug Fixes#
config: Fix readthedocs build to only generate html. (13fc207)
v0.8.0 (2023-09-06)#
Features#
model: Add LGBMModel along with base class which can be extended for all types of future models. (b47a241)
Add DataSchema which tracks dataset features throughout the pipeline and methods. (e03bd2c)
feature selection: Update BaseFeatureSelector and children to use the fit, transform and fit_transform pattern. (62e4dd1)
transformer: Add fit, transform and fit_transform to all Transformers, along with API and caching simplifications. (5cc4ebc)
cache: Add CacheHandler which allows customization of read/write functions for each cached return value individually. (609e084)
Bug Fixes#
feature selection: Add DataSchema as partial return from all fit methods in feature selectors. (ebf2484)
Code Refactoring#
cache: Replace disable_cache with a check if cache_size=0 for LRUCacheMixin. (cfd7592)
v0.7.0 (2023-07-11)#
Features#
Bug Fixes#
data cleaning: Switched to HDF5 as file format for faster I/O and better SageMaker support. (61f9e42)
v0.6.1 (2023-06-30)#
Bug Fixes#
Build#
config: Switch mypy for pyright and update configuration. (5631aed)
v0.6.0 (2023-06-26)#
Features#
v0.5.0 (2023-06-17)#
Features#
transformer: Add MinMaxScalerTransformer for normalizing numerical features. (9b26c00)
transformer: Add MaxAbsScalerTransformer that scales numerical features. (1fd2a93)
transformer: Add CompositeTransformer for chaining together multiple transformers sequentially. (006d741)
transformer: Add LabelEncoderTransformer for ordinal encoding. (41a4c45)
transformer: Add FrequencyEncoderTransformer along with support for pipeline. (465e6db)
Code Refactoring#
Switch to tqdm.auto to prevent breaking in Jupyter notebooks. (dc139cf)
Tests#
Now _get_local_filenames returns a sorted list of filenames to ensure stability. (774e8eb)
v0.4.2 (2023-06-11)#
Performance improvements#
Optimize VarianceFeatureSelector when threshold is 0. (906dde3)
Code Refactoring#
Remove pandas dependency. (40e264c)
Continuous Integration#
semantic versioning: Add more sections to changelog based on conventional commit categories. (e5b1594)
v0.4.1 (2023-06-04)#
Bug Fixes#
v0.4.0 (2023-06-03)#
Features#
feature selection: Add feature selector that filters out invariant features. (798c261)
feature selection: Add PearsonCorrelationFeatureSelector which drops highly correlated features. (66e5cd2)
feature selection: Add CompositeFeatureSelector, for chaining multiple feature selection steps on the same DataFrame. (3d75079)
feature selection: Add standard deviation feature selector. (c56177b)
feature selection: Add missing rate feature selector. (d5ba8b5)
Bug Fixes#
Fix typeguard breaking changes causing build to fail. (66c6a8e)
Code Refactoring#
v0.3.1 (2023-05-21)#
Bug Fixes#
:bug: Added notes to pipeline step docstrings. (d94f899)
Code Refactoring#
data source: :bug: Added note to the KaggleDataSource init docstring. (d5f12d3)
Continuous Integration#
:rocket: Removed semantic PR workflow and updated test workflow to not run on release commits. (8138745)
v0.3.0 (2023-05-21)#
Features#
new notes (#54) (21239f7)
Bug Fixes#
Continuous Integration#
:rocket: Updated release to only trigger if the commit message does not contain chore(release). (c9f3f3f)
v0.2.0 (2023-05-21)#
Features#
add data splitting step (#53) (a668b1a)
Documentation#
v0.1.3 (2023-05-13)#
Bug Fixes#
cache: :bug: Cache modules exposed in subpackage init. (fd65e9d)
v0.1.2 (2023-05-13)#
Bug Fixes#
Documentation#
:memo: Fixed sphinx-autoapi build warnings. (040963a)
v0.1.0 (2023-05-12)#
Features#
data source: :sparkles: Add KaggleDataSource to download the dataset from Kaggle by providing a destination directory, owner slug, dataset slug, and necessary API credentials. (3fa07b6)
Bug Fixes#
cache: :bug: Fixed test by not testing it… (e3a0ce9)
cache: :bug: Try logging using assert to fix GH issue (5e247ec)
cache: :bug: Attempting to fix test case failing in GH actions. (4892591)
cache: :bug: LRUCacheMixin now relies on file modification time instead of access time due to system limitations. (127d657)
:bug: Fixed docstrings for private methods in KaggleDataSource and removed xdoctest from build steps (bb55cf5)