SciKit Data
About SciKit Data
The purpose of this library is to make the data analysis process easier and more automated.
This library builds on several important libraries, such as:
- pandas;
- jupyter;
- matplotlib;
- scikit-learn;
- Free software: MIT license
- Documentation: https://skdata.readthedocs.io.
Features
Books used as references to guide this project:
- https://www.packtpub.com/big-data-and-business-intelligence/clean-data
- https://www.packtpub.com/big-data-and-business-intelligence/python-data-analysis
- https://www.packtpub.com/big-data-and-business-intelligence/mastering-machine-learning-scikit-learn
Other materials used as references:
- https://github.com/rsouza/MMD/blob/master/notebooks/3.1_Kaggle_Titanic.ipynb
- https://github.com/agconti/kaggle-titanic/blob/master/Titanic.ipynb
- https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/kaggle/titanic.ipynb
This project is intended to cover the following features:
- Data conversions:
  - soon ...
- Data collection:
  - soon ...
- Data cleaning:
  - ...
- Data storage:
  - soon ...
- Data integration:
  - soon ...
- Data manipulation:
  - ...
- Outliers removal:
  - ...
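To give a rough idea of what an outlier-removal step can look like in general, here is a minimal sketch of the common interquartile-range (IQR) rule using only the Python standard library. It is an illustration only and not this library's API: the function name `remove_outliers_iqr`, the parameter `k`, and the sample data are all hypothetical.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only the values within k * IQR of the first and third quartiles."""
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical sample: six similar readings and one obviously extreme value.
sample = [10, 12, 11, 13, 12, 11, 95]
cleaned = remove_outliers_iqr(sample)
print(cleaned)
```

The multiplier `k=1.5` is the conventional default for this rule; a larger value keeps more of the extreme points.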
Installing scikit-data
Using conda
To install scikit-data from the conda-forge channel, first add conda-forge to your channels:
$ conda config --add channels conda-forge
Once the conda-forge channel has been enabled, scikit-data can be installed with:
$ conda install scikit-data
To list all of the versions of scikit-data available on your platform, run:
$ conda search scikit-data --channel conda-forge
Using pip
To install scikit-data with pip, run this command in your terminal:
$ pip install skdata
If you don’t have pip installed, a Python installation guide can walk you through the process.
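After installing with either method, one way to confirm that the package is visible to your Python environment is a quick check like the sketch below. It assumes the import name is `skdata`, inferred from the pip package name above; the helper `is_installed` is hypothetical and not part of the library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# "skdata" is an assumed import name, based on the pip package name.
if is_installed("skdata"):
    print("skdata is importable")
else:
    print("skdata was not found in this environment")
```

The check only locates the module and does not import it, so it has no side effects on the current session.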