Source abbreviations:
ADM: algorithm design manual (skiena)
AJE: algorithms (Jeff Erickson)
BA: bandit algorithms
BJP: (me)
CI: collective intelligence (segaran)
CO: convex optimization
DIDL: dive into deep learning
DLG: deep learning (goodfellow, et al)
DMMD: data mining of massive datasets
DSA: data structures & algorithms
DSCL: data science at the command line
EA: elementary algorithms
ESL: elements of statistical learning
FDS: foundations of data science
ITA: intro to algorithms (cormen, et al)
JE: algorithms (Jeff Erickson)
LAY: linear algebra and its applications (lay)
NP: numeric python
SKL: scikit-learn
SM:
RL: reinforcement learning

Math book notes

(multiple)
  • unsupervised learning (ESL)
    association rules (market baskets, apriori)
    clustering
    self-organizing maps
    principal components
    non-negative matrix factorization (NMF)
    independent component analysis (ICA)
    multidimensional scaling (MDS)
    nonlinear dimension reduction
    pagerank
approximations
arithmetic
  • complex-numbers (LAY)
    examples; geometric representation; powers; R^2
  • computation (DLG)
    underflow, overflow
    poor conditioning
    gradient-based optimization
    jacobian & hessian matrices
    constrained optimization
    linear least squares
  • eigenvectors & eigenvalues (LAY)
    intro; eigenvectors & difference equations
    determinants & characteristic equations
    similarity
    diagonalization
    eigenvectors & linear transforms
    complex eigenvalues
    discrete dynamical systems
    differential equations
    iterative estimates
  • factoring & primality testing (ADM)
    is n a prime number? if not, what are its factors?
  • linear algebra (DLG)
    scalars, vectors, matrices, tensors
    vector|matrix multiplication
    identity matrix
    inverse matrix
    linear dependence
    span
    norms
    diagonal matrix
    symmetric matrix
    orthogonal matrix
    eigen decomposition
    singular value decomposition (svd)
    moore-penrose pseudoinverse matrix
    trace operator
    determinant
    example - principal components analysis (PCA)
  • linear algebra (LAY)
  • linear equation solvers (ADM)
    given an m×m matrix A and an m×1 vector b, what vector x satisfies Ax = b?
  • math basics (DIDL)
    linear & matrix ops
    eigen decompositions
    single-variable calculus
    multi-variable calculus
    integrals
    random variables
  • number theory (ITA)
    basics (divisors, primes/composites)
    greatest common divisor (Euclid)
    modular math (group theory?)
    linear equations
    the chinese remainder theorem
    powers
    RSA public-key crypto
    prime testing
    factorization (integer)
  • numericals (ADM)
    linear equations
    bandwidth reduction
    matrix multiplication
    determinants & permanents
    optimization (constrained, unconstrained)
    linear programming
    random number gen
    factors & prime testing
    arbitrary-precision math
    the knapsack problem
    discrete fourier transforms (DFTs)
  • random numbers (ADM)
    (also part of "numericals" chapter of ADM.)
  • recursion (JE)
    reductions
    simplify & delegate
    tower of hanoi
    mergesort
    quicksort
    design pattern
    recursion trees
    linear-time selection
    fast multiplication
    exponentiation
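
sketch for the two ADM problem statements in the arithmetic entries (is n prime / what are its factors; solve Ax = b). a minimal illustration only, not the method from the book: trial division is assumed for factoring, and gauss-jordan elimination over exact fractions for the solver. the names prime_factors, is_prime and solve_linear are made up for this note.

```python
from fractions import Fraction


def prime_factors(n):
    """Return the prime factors of n (n >= 2), ascending, by trial division."""
    if n < 2:
        raise ValueError("n must be >= 2")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # divide out each prime as often as it occurs
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)      # whatever remains is itself prime
    return factors


def is_prime(n):
    """n is prime exactly when its only prime factor is n itself."""
    return n >= 2 and prime_factors(n) == [n]


def solve_linear(A, b):
    """Solve Ax = b for a square, non-singular A (Gauss-Jordan elimination).

    Works in exact Fraction arithmetic, so there is no rounding error.
    """
    m = len(A)
    # Build the augmented matrix [A | b].
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(m):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, m) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]          # scale the pivot row to 1
        for r in range(m):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [row[m] for row in M]


if __name__ == "__main__":
    print(prime_factors(84))
    print(is_prime(97))
    print(solve_linear([[2, 1], [1, 3]], [3, 5]))
```

trial division takes on the order of sqrt(n) steps, so it only suits small n; exact fractions trade speed for not having to think about conditioning.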
bayes
books
cheatsheets
classification
classification & regression
clustering
combinationals
  • job scheduling (ADM)
    given a directed acyclic graph (vertices = jobs, edges = task dependencies), what schedule completes the job in minimum time/effort?
  • partitions (ADM)
    given integer n, generate the integer partitions of n (multisets of positive integers that sum to n).
  • permutations (ADM)
    given n, generate the permutations (orderings) of n items.
  • satisfiability (ADM)
    given a set of logical constraints, is there a configuration that satisfies the set?
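
sketch for the four combinationals problem statements above. brute-force illustrations under stated assumptions, not the algorithms from ADM. all function names are made up for this note. schedule_order only returns one valid dependency-respecting order (it does not minimize time), and satisfying_assignment assumes the constraints are given in CNF as lists of signed integers.

```python
from itertools import permutations, product


def partitions(n, largest=None):
    """Yield each integer partition of n as a non-increasing list of parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest


def all_permutations(n):
    """Return every ordering of the items 1..n as a list of tuples."""
    return list(permutations(range(1, n + 1)))


def satisfying_assignment(num_vars, clauses):
    """Brute-force SAT over CNF clauses; tries all 2**num_vars assignments.

    A literal k means variable k is true, -k means variable k is false.
    Returns a tuple of booleans (index 0 = variable 1), or None if unsatisfiable.
    """
    for values in product([False, True], repeat=num_vars):
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return values
    return None


def schedule_order(deps):
    """Return one valid job order for a DAG given as {job: [prerequisites]}."""
    jobs = set(deps) | {p for pre in deps.values() for p in pre}
    remaining = {j: set(deps.get(j, ())) for j in jobs}
    order = []
    while remaining:
        # Jobs with no unfinished prerequisites can run now.
        ready = sorted(j for j, pre in remaining.items() if not pre)
        if not ready:
            raise ValueError("dependency cycle")
        for j in ready:
            order.append(j)
            del remaining[j]
        for pre in remaining.values():
            pre.difference_update(ready)
    return order


if __name__ == "__main__":
    print(list(partitions(4)))
    print(all_permutations(3))
    print(satisfying_assignment(3, [[1, 2], [-1], [-2, 3]]))
    print(schedule_order({"c": ["a", "b"], "b": ["a"]}))
```

the permutation and SAT sketches enumerate n! orderings and 2^n assignments respectively, so they only work for small inputs.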
density estimation
dynamic programming
glossaries
graphs
inference
information theory
  • info theory tutorial (stone, USheffield)
    finding a route
    bits are not binary digits
    entropy
    entropy - continuous variables
    max-entropy distributions
    channel capacity
    shannon's source coding theorem
    noise reduces channel capacity
    mutual info
    shannon's noisy channel coding theorem
    gaussian channels
    fourier analysis
    history
    key equations
latent variables
  • linear factor models (DLG)
    probabilistic PCA + factor analysis
    independent component analysis
    sparse coding
    manifold representation of PCA
max likelihood estimation (MLE)
mixtures
  • latent linear models (SM)
    factor analysis
    principal components analysis (PCA)
    choosing number of dimensions
    PCA for categories
    PCA for paired & multiview data
    independent component analysis (ICA)
monte carlo
natural language processing
planning
planning / capacity
svms
time series
tools
vision
wavelets