Obviously Awesome

The Elements of Statistical Learning - book notes

This post is in progress. It will be fleshed out as time permits.

  • Cover
  • Preface
  • Contents
  • Intro

  • Supervised Learning

    Variable types & terminology, Basic approaches (least squares & nearest neighbors), Decision theory, Local methods in high dimensions, Statistical models, supervised learning & function approximation, Structured regression models, Classes of restricted estimators, Model selection & bias-variance tradeoff

  • Linear Regression

    Intro, Least squares, Subset selection, Shrinkage, Derived input directions, Multiple-outcome shrinkage & selection, Lasso & related path algorithms, Computational factors

  • Linear Classification

    Intro, Indicator matrix, Linear discriminant analysis, Logistic regression, Separating hyperplanes

  • Basis Expansion & Regularization

    Intro, Piecewise polynomials & splines, Filtering & feature extraction, Smoothing splines, Auto selection of smoothing parameters, Non-parametric logistic regression, Multi-dimensional splines, Regularization & reproducing kernel Hilbert spaces, Wavelet smoothing

  • Kernel Smoothing

    1-D kernel smoothers, Kernel widths, Local regression, Structured local regression, Local likelihood & other models, Kernel density estimation (KDE) & classification, Radial basis functions & kernels, Mixture models, Computational factors

  • Model Assessment & Selection

    Intro, Bias, variance, model complexity, Bias-variance decomposition, Training error rate optimism, In-sample prediction error estimates, Effective number of parameters, Bayesian approach & BIC, Minimum description length, Vapnik-Chervonenkis dimension, Cross validation, Bootstrap methods

  • Model Inference & Averaging

    Intro, Bootstrap & max likelihood methods, Bayes methods, Bootstrap-Bayes relationship, EM algorithm, MCMC for posterior sampling, Bagging, Model averaging & stacking, Stochastic search: bumping

  • Additive Models, Trees, Related Methods

    Generalized additive models, Tree methods, PRIM: bump hunting, MARS: multivariate adaptive regression splines, Hierarchical mixtures of experts, Missing data, Computational factors

  • Boosting & Additive Trees

    Boosting, Boosting Fits an Additive Model, Forward Stagewise Additive Modeling, Exponential Loss & AdaBoost, Why exponential loss, Loss functions, 'Off-the-shelf' data mining procedures, Example: Spam data, Boosting trees, Gradient boosting, Right-sized trees for boosting, Regularization, Interpretation, Illustrations

  • Neural Nets

    Intro, Projection pursuit regression, Neural nets, Fitting, Training issues, Examples & discussion, Bayes NNs, Computational factors

  • Support Vector Machines & Flexible Discriminants

    Intro, Support Vector Classifier, Support Vector Machines & Kernels, Generalizing Linear Discriminant Analysis (LDA), Flexible Discriminant Analysis, Penalized Discriminant Analysis, Mixture Discriminant Analysis

  • Prototype Methods & Nearest Neighbors

    Intro, Prototype methods, KNN classifiers, Adaptive nearest-neighbor methods, Computational factors

  • Unsupervised Learning

    Intro, Association Rules, Cluster Analysis, Self-Organizing Maps, Principal Components/Curves/Surfaces, Non-Negative Matrix Factorization (NNMF), Independent Component Analysis (ICA), Multidimensional Scaling (MDS), Nonlinear dimension reduction, Google PageRank

  • Random Forests

    Intro, Definition, Details, Analysis

  • Ensembles

    Intro, Boosting & regularization paths, Learning ensembles

  • Undirected graphical models

    Intro, Markov graphs, UGMs for continuous variables, UGMs for discrete variables

  • High-dimensional problems

    When p ≫ N, Diagonal LDA & nearest shrunken centroids, Linear classifiers & quadratic regularization, L1 regularization, Classification when features are unavailable, High-D regression, Feature assessment