# Obviously Awesome

## Scikit-Learn Guides - Jupyter Notebooks

(These are HTML pages converted from notebooks with `nbconvert`, so they do not support Jekyll markup.)
(Edits in progress. Not final.)
Getting Started
Estimator basics
Transformers & preprocessors
Model evaluation
Automatic parameter searches
Linear Models
`Ordinary Least Squares (OLS)`
`Ridge` regression
`Lasso` regression
`Akaike` & `Bayes` info criteria
`Elastic Net` regression
`Least Angle (LARS)` regression
`OrthogonalMatchingPursuit` (OMP)
`BayesianRidge` regression
Generalized Linear Regression (GLR)
GLR with `Tweedie`
`Stochastic Gradient Descent (SGD)` regressor & classifier
`Passive Aggressive` algos
`RANSAC, Huber, Theil-Sen` robustness algos
`Polynomial` regression
Logistic Regression (LR)
Logistic function (Wikipedia)
Binary, One-vs-Rest, Multinomial options
Solvers:
`liblinear`
`lbfgs, sag, newton-cg`: l2 penalty support
`sag`: uses Stochastic Average Gradient (SAG) descent
`saga`: l1 penalty support
Discriminant Analysis (LDA, QDA)
`Linear` DA
`Quadratic` DA
Shrinkage
Estimators
Kernel Ridge Regression (KRR)
Example: KRR vs SVR
Example: execution time
Support Vector Machines (SVMs)
Classification
Classification (multiclass)
Scoring
Weights
Regression
Complexity & kernel options
Gram matrix
Classification (std, multiclass, weighted, averaged)
Regression (std, sparse data)
Tips
Nearest Neighbors (NNs)
Options
`KNN` vs `Radius`-based
Ball tree vs KD tree vs Brute Force
`NearestCentroid`
`NeighborhoodComponentsAnalysis` (NCA)
Gaussian Processes
`Gaussian Process Regression` (GPR)
Gaussian vs Kernel Ridge
Cross Decomposition / Partial Least Squares (PLS)
`Canonical PLS`
`SVD PLS`
`PLS regression`
`Canonical Correlation Analysis` (CCA)
Naive Bayes (NB) classifiers
`Gaussian NB`
`Multinomial NB`
`Complement NB`
`Bernoulli NB`
`Categorical NB`
Decision Trees (DTs)
`DT classifier`
`Graphviz`
`DT regressor`
Multiple outputs
Complexity
`ID3`, `C5.0`, `CART`
Impurity functions (Gini, Entropy, Misclassification, MSE, MAE)
Minimal cost-complexity pruning
Decision Trees / Bagging
Methods
`Random Forest`
`Extra Trees`
Feature importance
`Random Tree Embedding`
Decision Trees / Boosting
`AdaBoost`
`Gradient Boosted` DTs
Shrinkage vs Learning Rate
Subsampling
`Histogram-based Gradient Boosting`
`Stacked Generalization`
Voting Classifiers
Hard & soft voting classifiers
Voting regressor
Multiclass & Multioutput Algorithms
`Label Binarizer`
`One-vs-Rest` classifier
`Multilabel` classifier
`One-vs-One` classifier
`Output Code` classifier
`Multioutput` classifier
`Classifier Chains`
`Multi Output` regressor
`Regressor Chains`
Feature Selection (FS)
`Variance-based`
`Univariate`
`Recursive`
`Model-based`
`Impurity-based`
`Sequential`
`FS & pipelines`
Semi-Supervised Algorithms
`SelfTrainingClassifier`
`LabelSpreading`
`LabelPropagation`
Calibration Curves
Using cross validation
Performance scores
Regressors
Multiclass support
Multilayer Perceptrons (MLPs)
`MLP classifier`
Multilabel & Multiclass classification
`MLP regressor`
Regularization
Tips
Gaussian Mixtures
Expectation Maximization (EM)
Variational Bayes GM
Manifolds
`Isomap`
`Locally Linear Embedding` (LLE)
`Modified LLE`
`Hessian LLE`
`Local Tangent Space Alignment` (LTSA)
`Multi Dimensional Scaling` (MDS)
`Random Tree Embedding`
`Spectral Embedding`
`t-distributed Stochastic Neighbor Embedding` (t-SNE)
`Neighborhood Components Analysis` (NCA)
Clustering
`K-Means`
`Affinity Propagation`
`Mean Shift`
`Spectral`
`Agglomerative`
`Dendrograms`
`DBSCAN`
`OPTICS`
`Birch`
Clustering Metrics
`rand_score`
`mutual_info_score`
`Homogeneity, completeness & v-measure`
`Fowlkes-Mallows score`
`Silhouette coefficient`
`Calinski-Harabasz index`
`Davies-Bouldin index`
`Contingency matrix`
`Pair confusion matrix`
Biclustering
`Spectral co-clustering`
`Spectral bi-clustering`
Metrics
Component Analysis / Matrix Factorization
`Principal Component Analysis` (PCA)
`Incremental PCA`
PCA with random SVD
PCA & sparse data
`Kernel PCA`
Truncated SVD (aka `Latent Semantic Analysis`, LSA)
Dictionary Learning
`Factor Analysis` (FA)
`Independent Component Analysis` (ICA)
`Non-Negative Matrix Factorization` (NNMF)
`Latent Dirichlet Allocation` (LDA)
Covariance
Empirical (observed) covariance
Shrunk covariance
Ledoit-Wolf (LW) shrinkage
Oracle approx shrinkage (OAS)
Precision matrix
Min covariance determinant (MCD) estimators
Mahalanobis distances
Novelty & Outlier Detection
Intro
`One-class SVM` vs `Elliptic Envelope` vs `Isolation Forest` vs `Local Outlier Factor`
Novelties
Outliers
Density Analysis
Histograms
Kernel density estimation (KDE)
Restricted Boltzmann Machines (RBMs)
Learning methods
Cross Validation (CV)
Intro
`cross_val_score`
`cross_validate`
`cross_val_predict`
`KFold`, `Stratified KFold`
`Leave One Out` (LOO)
`Leave P Out` (LPO)
CV on grouped data
Time series splits
Permutation testing
Visualizations
Hyperparameter Settings
Grid search
Randomized search
Successive Halving (SH)
Alternatives to brute-force search
Info criteria (AIC, BIC) regularization
Classifier Metrics
`Accuracy`
`Top K accuracy`
`Balanced accuracy`
`Cohen's kappa`
`Confusion matrix`
`Classification report`
`Hamming loss`
`Precision, recall, F-measure`
`Precision-recall curve`
`Average precision`
`Jaccard similarity`
`Hinge loss`
`Log loss`
`Matthews correlation coefficient`
`Receiver operating characteristic` (ROC)
`Detection error tradeoff` (DET)
`Zero-one loss`
`Brier score`
Multi-label Ranking Metrics
`Coverage error`
`Label ranking avg precision` (LRAP)
`Label ranking loss`
`Discounted cumulative gain` (DCG)
Regression Metrics
`Explained variance`
`Max error`
`Mean absolute error` (MAE)
`Mean squared error` (MSE)
`Mean squared log error` (MSLE)
`Mean absolute pct error` (MAPE)
`R2` (coefficient of determination)
`Tweedie deviance error`
"Dummy" Metrics
`Dummy classifier`
`Dummy regressor`
Metrics Overview
`make_scorer`
Learning Curves
`validation_curve`
`learning_curve`
Partial Dependence Plots (PDPs)
PDP - 2D example
PDP - 3D example
Individual conditional expectation (ICE) plot example
Permutation Feature Importance (PFI) plots
Tree-based models: impurity vs permutation
Visualization: ROC curves
Example using `svc_disp`
Customized Partial Dependence plots
Multiple examples
Visualization Examples
`Confusion matrix display`
`ROC curve display`
`Precision recall display`
Composite Transformers
`Pipelines`
`Regression target transformers`
`Feature unions`
`Column transformers`
Feature Extraction (Text)
Bag of Words (BoW)
`Count Vectorizer`
`TfIdf Transformer`
`TfIdf Vectorizer`
Decoding text files
The Hashing Trick
`Hashing Vectorizer`
Custom vectorizers
Feature Extraction (Image Patches)
`extract_patches_2d`
`reconstruct_from_patches_2d`
Connectivity graphs
Preprocessing Techniques
`Standard scaler`
`MinMax scaler`
`MaxAbs scaler`
`Robust scaler`
`Kernel centerer`
`Quantile transform`
`Power Map`

`Normalizer`
`Ordinal encoder`
`One Hot encoder`
`K Bins discretizer` (aka binning)
`Polynomial feature generation`
Imputation Techniques
`Simple` (univariate)
`Iterative` (multivariate)
`Nearest Neighbors`
`Missing Indicator`
Dimensionality Reduction: Random Projections (RP)
`random_projection`
`Johnson-Lindenstrauss lemma`
`Gaussian RP`
`Sparse data RP`
Kernel Approximations
Nystroem approximation
`RBF sampler`
`Additive Chi-squared sampler`
`Skewed Chi-squared sampler`
`Polynomial sampler`
Pairwise Operations
`pairwise_distances`
`pairwise_kernels`
Cosine similarity
Kernels: linear, polynomial, sigmoid, RBF, laplacian, chi-squared
Transforming Prediction Targets
`Label binarizer`
`Multi-label binarizer`
`Label encoding`
Simple Datasets
Boston house prices (regression)
Iris (classification)
Diabetes (regression)
Digits (classification)
Linnerud (regression)
Wine (classification)
Breast cancer (classification)

`fetch_olivetti_faces`
`fetch_20newsgroups`
`fetch_lfw_people` (Labeled faces in the wild)
`fetch_covtype` (Forest covertype)
`fetch_rcv1` (Reuters Newswire corpus)
`fetch_kddcup99` (KDD CUP - intrusion detection)
`fetch_california_housing`
Artificial Data Generators
(classifications)
`make_blobs`
`make_classification`
`make_gaussian_quantiles`
`make_circles`
`make_moons`
(multilabel classifications)
`make_multilabel_classification`
`make_hastie_10_2`
`make_biclusters`
`make_checkerboard`
(regression)
`make_regression`
`make_sparse_uncorrelated`
`make_friedman(1,2,3)`
(manifolds)
`make_s_curve`
`make_swiss_roll`
(decompositions)
`make_low_rank_matrix`
`make_sparse_coded_signal`
`make_spd_matrix` (symmetric positive definite)
Other Example Datasets
`load_sample_images`
`fetch_openml`
Other API tools - pandas, scipy, numpy, scikit-image, imageio
Performance / Scaling
Out-of-core operations example
Performance / Latency
Bulk vs Atomic mode
Number of features
Input datatypes
Feature extraction
Linear algebra - BLAS, LAPACK usage
Memory limits
Model reshaping
Performance / Parallel Ops Tools
Joblib
OpenMP
NumPy/SciPy
`sklearn.set_config`
Persistence (File I/O)
Pickle
Joblib `dump`, `load`