ESL (The Elements of Statistical Learning) is one of the most widely accepted introductory texts on machine learning. Let's just leave it at that. Each chapter link points to a PDF of the relevant section of the book.


Supervised Learning
Variable Types
Least Squares & Nearest Neighbors
Decision Theory
Statistical Models
Regression Models
Estimator Classes
Model Selection & Bias-Variance

Linear Regression
Least Squares
Subsets
Shrinkage
Derived Input Directions
Comparisons
Multiple Outcomes
Lasso & Related
Computational Factors

Linear Classification
Intro
Indicator Matrix
Discriminant Analysis
Logistic Regression
Separating Hyperplanes

Basis Expansions & Regularization
Intro
Piecewise Polynomials & Splines
Filtering & Feature Extraction
Smoothing Splines
Auto-Selection of Smoothing Parameters
Non-parametric Logistic Regression
Multi-dimensional Splines
Regularization & Reproducing Kernel Hilbert Spaces
Wavelet Smoothing

Kernel Smoothing
1D Smoothers
Kernel Width
Local Regression
Structured Local Regression
Local Likelihood
Kernel Density Estimation
Radial Basis Functions & Kernels
Mixture Models
Computational Factors

Model Assessment
Intro
Bias, Variance, Model Complexity
Bias-Variance Decomposition
Training Error Rates & Optimism
Effective # of Parameters
Bayesian Approach & BIC
Minimum Description Length
Vapnik-Chervonenkis Dimension
Cross Validation
Bootstrap Methods

Model Inference & Averaging
Intro
Bootstrap & Max Likelihood
Bayesian Methods
Bootstrap-Bayesian Relation
EM Algorithm
MCMC for Posterior Sampling
Bagging
Model Averaging & Stacking
Bumping

Additive Models
Generalized Additive Models
Tree-Based Methods
PRIM
MARS
Hierarchical Expert Mixtures
Missing Data
Computational Factors

Boosting & Additive Trees
Boosting Methods
Forward Stagewise Additive Modeling
AdaBoost
Why Exponential Loss?
Loss Functions
"Off the Shelf" Procedures
Example: Spam Data
Boosting Trees
Right-Sized Trees
Regularization
Interpretation
Examples

Neural Nets
Intro
Projection Pursuit Regression
Neural Nets
Fitting
Training Issues
Examples
Discussion
Bayesian NNs
Computational Factors

SVMs & Flexible Discriminants
Intro
Support Vector Classifier
Support Vector Machines & Kernels
Generalizing Linear Discriminant Analysis
Flexible Discriminant Analysis
Penalized Discriminant Analysis
Mixture Discriminant Analysis

Prototypes & Nearest Neighbors
Intro
Prototypes (K-Means, LVQ, Gaussian Mixtures)
k-Nearest-Neighbor Classifiers
Adaptive Nearest-Neighbor Methods
Computational Factors

Unsupervised Learning
Intro
Association Rules
Cluster Analysis
Self-Organizing Maps
Principal Components
Non-Negative Matrix Factorization
Independent Component Analysis
Multidimensional Scaling (MDS)
Non-Linear Dimension Reduction
Google PageRank

Random Forests
Intro
Definition
Out-of-Bag, Variable Importance, Proximity, Overfitting
Analysis

Ensembles
Intro
Boosting & Regularization
Learning & Ensembles

Graphs (Undirected)
Intro
Markov Graphs
Continuous-Variable Graphs
Discrete-Variable Graphs

Dimensionality
When P >> N
Diagonal LDA
Linear Classifiers - Quadratic Regularization
Linear Classifiers - L1 Regularization
Classification when Features aren't Available
High-Dimensional Regression
Feature Assessment