I'm excited to announce the launch of the open source project sk-dist.
The goal of the project is to provide a general framework for distributing scikit-learn meta-estimators with Spark.
Examples of meta-estimators are decision tree ensembles (random forests and extremely randomized trees), hyperparameter tuners (grid search and randomized search) and multi-class techniques (one-vs-rest and one-vs-one).
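To make the idea concrete, here is a minimal sketch of the three kinds of meta-estimators using plain scikit-learn only. It does not use the sk-dist API, and the dataset, base estimator, and parameter values are illustrative assumptions. Each meta-estimator wraps many independent model fits, which is the kind of work that lends itself to being spread across a cluster.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier

# Small built-in dataset with 3 classes (chosen only for illustration).
X, y = load_iris(return_X_y=True)

# Ensemble: each of the 20 decision trees is fit independently.
forest = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

# Hyperparameter tuner: one fit per parameter setting per cross-validation fold.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=3,
).fit(X, y)

# Multi-class technique: one binary classifier per class.
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

# Count the independent fits hidden inside each meta-estimator.
n_trees = len(forest.estimators_)
n_grid_fits = len(search.cv_results_["params"]) * search.n_splits_
n_ovr_models = len(ovr.estimators_)
print(n_trees, n_grid_fits, n_ovr_models)
```

The counts printed at the end are the numbers of separate model fits each meta-estimator performs internally (not counting the grid search's final refit on the full data).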
To get started with sk-dist, check out the link below. The code repository also contains a library of examples that illustrate some of its use cases. All are welcome to submit issues and contribute to the project.
Read more: https://github.com/Ibotta/sk-dist