Metadata-Version: 2.1
Name: pandaslearn
Version: 0.1.9
Summary: `pandaslearn` is a small wrapper on top of `scikit-learn` to automate common modeling tasks.
Home-page: https://github.com/soumendra/pandaslearn
License: MIT
Keywords: python,machinelearning
Author: Soumendra Prasad Dhanee
Author-email: soumendra@gmail.com
Requires-Python: >=3.8,<3.11
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: SQLAlchemy (>=1.4.25)
Requires-Dist: alembic (>=1.7.3)
Requires-Dist: backports.weakref (>=1.0.post1)
Requires-Dist: backports.zoneinfo (>=0.2.1)
Requires-Dist: catboost (>=0.26)
Requires-Dist: category-encoders (>=2.2.2)
Requires-Dist: fastapi (>=0.68.1)
Requires-Dist: joblib (>=1.0.1)
Requires-Dist: lightgbm (>=3.2.1)
Requires-Dist: loguru (>=0.5.3)
Requires-Dist: missingno (>=0.5.0)
Requires-Dist: optuna (>=2.9.1)
Requires-Dist: pandas (>=1.2.5)
Requires-Dist: pandas-profiling (>=3.0.0)
Requires-Dist: pandas-ta (>=0.3.2-beta.0)
Requires-Dist: pdpipe (>=0.0.53)
Requires-Dist: plotnine (>=0.8.0)
Requires-Dist: psycopg2 (>=2.9.1)
Requires-Dist: pydantic (>=1.8.2)
Requires-Dist: scikit-learn (>=1.0)
Requires-Dist: scikit-lego (>=0.6.7)
Requires-Dist: sqlmodel (>=0.0.4)
Requires-Dist: stackprinter (>=0.2.5)
Requires-Dist: streamlit
Requires-Dist: streamlit-pandas-profiling (>=0.1.2)
Requires-Dist: xgboost (>=1.4.2)
Requires-Dist: yellowbrick (>=1.3.post1)
Project-URL: Repository, https://github.com/soumendra/pandaslearn
Description-Content-Type: text/markdown

# pandaslearn

`pandaslearn` is a small wrapper on top of `scikit-learn` to automate common modeling tasks.

* Create a `Trainer` instance from `Dataset` and `Model` instances. `Trainer.__init__()` should populate the `logger` attribute of both the `Dataset` and the `Model` instance. Call methods on `Dataset` and `Model` only after that, so that everything gets logged appropriately.
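
The wiring described above could look roughly like the sketch below. It is an illustration only: the class bodies, the `describe()` and `fit()` methods, the `name` arguments, and the use of the standard-library `logging` module are all assumptions made for this example, not the actual `pandaslearn` API.

```python
import logging
from typing import Optional


class Dataset:
    """Hypothetical stand-in; the real pandaslearn Dataset may differ."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Left empty on purpose: Trainer.__init__() fills this in.
        self.logger: Optional[logging.Logger] = None

    def describe(self) -> str:
        message = f"dataset {self.name!r} described"
        if self.logger is not None:
            self.logger.info(message)
        return message


class Model:
    """Hypothetical stand-in; the real pandaslearn Model may differ."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Left empty on purpose: Trainer.__init__() fills this in.
        self.logger: Optional[logging.Logger] = None

    def fit(self, dataset: Dataset) -> str:
        message = f"model {self.name!r} fitted on {dataset.name!r}"
        if self.logger is not None:
            self.logger.info(message)
        return message


class Trainer:
    """Ties a Dataset and a Model together and shares one logger with both."""

    def __init__(
        self,
        dataset: Dataset,
        model: Model,
        logger: Optional[logging.Logger] = None,
    ) -> None:
        if logger is None:
            # Assumed logger name, chosen only for this sketch.
            logger = logging.getLogger("pandaslearn.sketch")
        self.logger = logger
        self.dataset = dataset
        self.model = model
        # Populate the logger attributes so later calls are logged.
        self.dataset.logger = self.logger
        self.model.logger = self.logger


logging.basicConfig(level=logging.INFO)

dataset = Dataset("iris")
model = Model("baseline")
trainer = Trainer(dataset, model)

# Only call Dataset/Model methods once the Trainer exists.
dataset.describe()
model.fit(dataset)
```

In this sketch, a method called before the `Trainer` is created still runs but logs nothing, which is why the ordering matters.
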

# TODO

* TODO: visualization: add barcharts (plotnine)
* TODO: visualization: add histograms (plotnine)
* TODO: visualization: add scatterplots (plotnine)
* TODO: visualization: add lineplots (plotnine)
* TODO: visualization: add boxplots (plotnine)
* TODO: visualization: add violin plots (plotnine)
* TODO: visualization: add function to change theme (xkcd, ?)
* TODO: add a `geo` namespace (+ feature engineering, plots)
* TODO: add tests against a few standard fixtures (precompute values and test against them)
* TODO: integrate missingno package: functions to only compute/sort nullity
* TODO: integrate missingno package: plotnine functions matching missingno plots (including geo)
* TODO: integrate missingno package: timeseries nullity plots (just plot all timelines with gaps)
* TODO: pandas-profiling provides many analyses that are useful for ML. Integrate those (provide textual outcomes such as dicts or DataFrames instead of plots)
* TODO: future integration targets: https://compose.alteryx.com/en/stable/
* TODO: future integration targets: https://featuretools.alteryx.com/en/stable/
* TODO: future integration targets: https://evalml.alteryx.com/en/stable/

