Obviously Awesome

Scikit-Learn Guides - Jupyter Notebooks

(These are HTML pages converted with nbconvert, so they do not support Jekyll markup.)
(Edits in progress. Not final.)
Getting Started
Estimator basics
Transformers & preprocessors
Model evaluation
Automatic parameter searches
Linear Models
Ordinary Least Squares (OLS)
Ridge regression
Lasso regression
Akaike & Bayes info criteria
Elastic Net regression
Least Angle (LARS) regression
OrthogonalMatchingPursuit (OMP)
BayesianRidge regression
Generalized Linear Regression (GLR)
GLR with Tweedie
Stochastic Gradient Descent (SGD) regressor & classifier
Passive Aggressive algos
RANSAC, Huber, Theil-Sen robustness algos
Polynomial regression
Logistic Regression (LR)
Logistic function (Wikipedia)
Binary, One-vs-Rest, Multinomial options
Solvers:
liblinear
lbfgs, sag, newton-cg: l2 penalty support
sag: uses Stochastic Average Gradient (SAG) descent
saga: l1 penalty support
Discriminant Analysis (LDA, QDA)
Linear DA
Quadratic DA
Shrinkage
Estimators
Kernel Ridge Regression (KRR)
example: KRR vs SVR
example: execution time
Support Vector Machines (SVMs)
Classification
Classification (multiclass)
Scoring
Weights
Regression
Complexity & kernel options
Gram matrix
Stochastic Gradient Descent (SGD)
Classification (std, multiclass, weighted, averaged)
Regression (std, sparse data)
Tips
Nearest Neighbors (NNs)
Options
KNN vs Radius-based
Ball tree vs KD tree vs Brute Force
NearestCentroid
NeighborhoodComponentsAnalysis (NCA)
Gaussian Processes
Gaussian Process Regression (GPR)
Gaussian vs Kernel Ridge
Cross Decomposition / Partial Least Squares (PLS)
Canonical PLS
SVD PLS
PLS regression
Canonical Correlation Analysis (CCA)
Naive Bayes (NB) classifiers
Gaussian NB
Multinomial NB
Complement NB
Bernoulli NB
Categorical NB
Decision Trees (DTs)
DT classifier
Graphviz
DT regressor
Multiple outputs
Complexity
ID3, C5.0, CART
Impurity functions (Gini, Entropy, Misclassification, MSE, MAE)
Minimal cost-complexity pruning
Decision Trees / Bagging
Methods
Random Forest
Extra Trees
Feature importance
Random Tree Embedding
Decision Trees / Boosting
AdaBoost
Gradient Boosted DTs
Shrinkage vs Learning Rate
Subsampling
Histogram-based Gradient Boosting
Stacked Generalization
Voting Classifiers
Hard & soft voting classifiers
Voting regressor
Multiclass & Multioutput Algorithms
Label Binarizer
One-vs-Rest classifier
Multilabel classifier
One-vs-One classifier
Output Code classifier
Multioutput classifier
Classifier Chains
Multi Output regressor
Regressor Chains
Feature Selection (FS)
Variance-based
Univariate
Recursive
Model-based
Impurity-based
Sequential
FS & pipelines
Semi-supervised Algorithms
SelfTrainingClassifier
LabelSpreading
LabelPropagation
Calibration Curves
Using cross validation
Performance scores
Regressors
Multiclass support
Multilayer Perceptrons (MLPs)
MLP classifier
Multilabel & Multiclass classification
MLP regressor
Regularization
Tips
Gaussian Mixtures
Expectation Maximization (EM)
Variational Bayes GM
Manifolds
Isomap
Locally Linear Embedding (LLE)
Modified LLE
Hessian LLE
Local Tangent Space Alignment (LTSA)
Multi Dimensional Scaling (MDS)
Random Tree Embedding
Spectral Embedding
t-distributed Stochastic Neighbor Embedding (t-SNE)
Neighborhood Components Analysis (NCA)
Clustering
K-Means
Affinity Propagation
Mean Shift
Spectral
Agglomerative
Dendrograms
DBSCAN
OPTICS
Birch
Clustering Metrics
rand_score
mutual_info_score
Homogeneity, completeness & v-measure
Fowlkes-Mallows score
Silhouette coefficient
Calinski-Harabasz index
Davies-Bouldin index
Contingency matrix
Pair confusion matrix
Biclustering
Spectral co-clustering
Spectral bi-clustering
metrics
Component Analysis / Matrix Factorization
Principal Component Analysis (PCA)
Incremental PCA
PCA with random SVD
PCA & sparse data
Kernel PCA
Truncated SVD (aka Latent Semantic Analysis, LSA)
Dictionary Learning
Factor Analysis (FA)
Independent Component Analysis (ICA)
Non-Negative Matrix Factorization (NNMF)
Latent Dirichlet Allocation (LDA)
Covariance
Empirical (observed) covariance
Shrunk covariance
Ledoit-Wolf (LW) shrinkage
Oracle approx shrinkage (OAS)
Precision matrix
Min covariance determinant (MCD) estimators
Mahalanobis distances
Novelty & Outlier Detection
Intro
One-class SVM vs Elliptic Envelope vs Isolation Forest vs Local Outlier Factor
Novelties
Outliers
Density Analysis
Histograms
Kernel density estimation (KDE)
Restricted Boltzmann Machines (RBMs)
Learning methods
Cross Validation (CV)
Intro
cross_val_score
cross_validate
cross_val_predict
Kfold, stratified Kfold
Leave One Out (LOO)
Leave P Out (LPO)
CV on grouped data
Time series splits
Permutation testing
Visualizations
Hyperparameter Settings
Grid search
Randomized search
Successive Halving (SH)
Alternatives to brute-force search
Info criteria (AIC,BIC) regularization
Classifier Metrics
Accuracy
Top K accuracy
Balanced accuracy
Cohen's kappa
Confusion matrix
Classification report
Hamming loss
Precision, recall, F-measure
Precision-recall curve
Average precision
Jaccard similarity
Hinge loss
Log loss
Matthews correlation coefficient
Receiver operating characteristic (ROC)
Detection error tradeoff (DET)
Zero-one loss
Brier score
Multi-label Ranking Metrics
Coverage error
Label ranking avg precision (LRAP)
Label ranking loss
Discounted cumulative gain (DCG)
Regression Metrics
Explained variance
Max error
Mean absolute error (MAE)
Mean squared error (MSE)
Mean squared log error (MSLE)
Mean absolute percentage error (MAPE)
R2 (coefficient of determination)
Tweedie deviance error
"Dummy" Metrics
Dummy classifier
Dummy regressor
Metrics Overview
make_scorer
Learning Curves
validation_curve
learning_curve
Partial Dependence Plots (PDPs)
PDP - 2D example
PDP - 3D example
Individual conditional expectation (ICE) plot example
Permutation Feature Importance (PFI) plots
Tree-based models: impurity vs permutation
Visualization: ROC curves
Example using svc_disp
Customized Partial Dependence plots
Multiple examples
Visualization Examples
Confusion matrix display
ROC curve display
Precision recall display
Composite Transformers
Pipelines
Regression target transformers
Feature unions
Column transformers
Feature Extraction (Text)
Bag of Words (BoW)
Count Vectorizer
TfIdf Transformer
TfIdf Vectorizer
Decoding text files
The Hashing Trick
Hashing Vectorizer
Custom vectorizers
Feature Extraction (Image Patches)
extract_patches_2d
reconstruct_from_patches_2d
Connectivity graphs
Preprocessing Techniques
Standard scaler
MinMax scaler
MaxAbs scaler
Robust scaler
Kernel centerer
Quantile transform
Power Map

Normalizer
Ordinal encoder
One Hot encoder
K Bins discretizer (aka binning)
Polynomial feature generation
Imputation Techniques
Simple (univariate)
Iterative (multivariate)
Nearest Neighbors
Missing Indicator
Dimensionality Reduction: Random Projections (RP)
random_projection
Johnson-Lindenstrauss lemma
Gaussian RP
Sparse data RP
Kernel Approximations
Nystroem approximation
RBF sampler
Additive Chi-squared sampler
Skewed Chi-squared sampler
Polynomial sampler
Pairwise Operations
pairwise_distances
pairwise_kernels
Cosine similarity
Kernels: linear, polynomial, sigmoid, RBF, laplacian, chi-squared
Transforming Prediction Targets
Label binarizer
Multi-label binarizer
Label encoding
Simple Datasets
Boston house prices (regression)
Iris (classification)
Diabetes (regression)
Digits (classification)
Linnerud (regression)
Wine (classification)
Breast cancer (classification)

fetch_olivetti_faces
fetch_20newsgroups
fetch_lfw_people (Labeled faces in the wild)
fetch_covtype (Forest covertype)
fetch_rcv1 (Reuters Newswire corpus)
fetch_kddcup99 (KDD CUP - intrusion detection)
fetch_california_housing
Artificial Data Generators
(classifications)
make_blobs
make_classification
make_gaussian_quantiles
make_circles
make_moons
(multilabel classifications)
make_multilabel_classification
make_hastie_10_2
make_biclusters
make_checkerboard
(regression)
make_regression
make_sparse_uncorrelated
make_friedman(1,2,3)
(manifolds)
make_s_curve
make_swiss_roll
(decompositions)
make_low_rank_matrix
make_sparse_coded_signal
make_spd_matrix (symmetric positive definite)
Other Example Datasets
load_sample_images
fetch_openml
Other API tools - pandas, scipy, numpy, scikit-image, imageio
Performance / Scaling
Out-of-core operations example
Performance / Latency
Bulk vs Atomic mode
Validation overhead
Number of features
Input datatypes
Feature extraction
Linear algebra - BLAS, LAPACK usage
Memory limits
Model reshaping
Performance / Parallel Ops Tools
Joblib
OpenMP
NumPy/SciPy
sklearn.set_config
Persistence (File I/O)
Pickle
Joblib dump, load