Source abbreviations:
AJE: Algorithms
BA: Bandit Algorithms
BJP: (me)
CI: Collective Intelligence
CO: Convex Estimation
DIDL: Dive into Deep Learning
DLG: Deep Learning (Goodfellow, et al)
DMMD: Data Mining of Massive Datasets
DSA: Data structures & Algorithms
DSCL: Data Science at the Command Line
EA: Elementary Algorithms
ESL: Elements of Statistical Learning
FDS: Foundations of Data Science
GT: Geometric Topology
ITA: Intro to Algorithms
JE: Algorithms
NP: Numeric Python
SKL: Scikit-learn
SM: ML cheatsheet
RL: Reinforcement Learning

book chapter summaries - deep learning, machine learning, various math

Tags:
(multiple); approximations; arithmetic; association rules; autoencoders; bandit algorithms; bash; bayes; cheatsheets; classification; clustering; combinationals; computation - complexity - performance - benchmarking; data structures; datasets; deep learning architectures; density estimation; design; dimensional reduction; dynamic programming; ensembles; evaluation; feature engineering; file I/O; gaussians; generative models; geometry; graphs; greedy algos; inference; information theory; interviewing; kernels; label spreading, label propagation; latent variables; learning; linear models; linear programming; make; markov chains; matrix math; max likelihood estimation (MLE); methods; mixtures; monte carlo; multilabel; natural language processing; novelties-outliers; numerical analysis; numpy; pandas; parametric models; performance; planning; planning / capacity; probabilistic analysis; probability & statistics; pycaret; recommenders; recurrent NNs; recursion; regression; reinforcement learning; restricted boltzmann machines; robotics; searching & sorting; set theory; streams; strings; survival analysis; svd; svms; sympy; tbd; tensorflow; time series; tools; topology; training; use cases; vision; visualization; wavelets

(multiple)

data science cheatsheet 2.0 (aaron wang)

distributions; hypothesis testing; concepts; model evaluation; linear regression; logistic regression; decision trees; naive bayes; svms; knns; clustering; dimensional reduction (PCA, LDA, FA); NLP; neural nets (basics, CNNs, RNNs); boosting; recommenders; reinforcement learning; anomaly detection

other topics (FDS)

ranking & social choice; compressed sensing & sparse vectors; use cases; an uncertainty principle; gradients; linear programming; integer optimization; semi-definite programming

approximations

approximate-inference (DLG)

inference as optimization
expectation maximization (EM)
MAP inference | sparse coding
variational inference
learned approx inference

approximations (algorithm reductions) (ADM)

algo reductions
basic hardness reductions
satisfiability
creative reductions
"proving" hardness
P vs NP hardness
NP-complete problems

approximations (algorithm reductions) (ITA)

the vertex-cover problem
the traveling salesman problem
the set-cover problem
randomization & linear programming
the subset-sum problem

arithmetic

complex-numbers (LAY)

examples; geometric representation; powers; R^2

computation (DLG)

underflow, overflow
poor conditioning
gradient-based optimization
jacobian & hessian matrices
constrained optimization
linear least squares

factoring primes (ADM)

is n a prime number? if not, what are its factors?
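
a minimal sketch of both questions, assuming plain trial division (only practical for small n; the function names are illustrative and not taken from ADM):

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (with multiplicity) by trial division."""
    if n < 2:
        return []
    factors = []
    d = 2
    # Any composite n has a factor no larger than sqrt(n).
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    # Whatever remains above 1 is itself prime.
    if n > 1:
        factors.append(n)
    return factors


def is_prime(n: int) -> bool:
    """n is prime exactly when its only prime factor is n itself."""
    return prime_factors(n) == [n]
```

for example, `prime_factors(360)` gives `[2, 2, 2, 3, 3, 5]` and `is_prime(97)` is `True`.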

linear algebra (LAY)

linear equations
row reductions
vector equations
Ax=b
solution sets of linear systems
applications
linear independence
linear transforms
linear models - business, science, engineering

linear equation solvers (ADM)

given an m×m matrix A and an m×1 vector b, what vector x satisfies Ax=b?
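
a hedged sketch of one standard approach, Gaussian elimination with partial pivoting, in pure Python. The function name and the 1e-12 singularity threshold are illustrative assumptions; in practice a library routine such as `numpy.linalg.solve` is the usual choice.

```python
def solve_linear_system(A, b):
    """Solve Ax = b for a square matrix A (list of rows) and vector b."""
    m = len(A)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(m):
        # Pick the row with the largest pivot magnitude for numerical stability.
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:  # assumed tolerance
            raise ValueError("matrix is singular (or nearly so)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution, from the last row upward.
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    return x
```

for A = [[2, 1], [1, 3]] and b = [3, 5] this returns approximately [0.8, 1.4].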

number theory (ITA)

basics (divisors, primes/composites)
greatest common divisor (Euclid)
modular math (group theory?)
linear equations
the chinese remainder problem
powers
RSA public-key crypto
prime testing
factorization (integer)

random numbers (ADM)

(also part of "numericals" chapter of ADM.)

association rules

association rules | market basket analysis (ESL)
frequent itemsets (DMMD)

market-basket modeling; association rules; a-priori algorithm; large datasets & main memory; limited-pass algorithms; counting items in streams

autoencoders

autoencoders (DLG)

undercomplete AEs; regularized AEs; representational power, layer size & depth; stochastic encoders & decoders; denoising AEs; learning manifolds with AEs; predictive sparse decomposition; applications

autoencoders with Tensorflow (HoML)

bandit algorithms

UCB - asymptotic-optimality (BA)
UCB-algorithm-bernoulli-noise (BA)
UCB-algorithm-minimax-optimality (BA)
bandits-adversarial-vs-stochastic-linear (BA)
bandits-bayes (BA)
bandits-combinatorial (BA)
bandits-concentration-of-measure (BA)
bandwidth-reduction (ADM)
basis-expansions-regularization (ESL)
bayes-empirical-estimation (CSI)
contextual (BA)
convex-analysis (BA)
exp3 (BA)
exp3-IX (BA)
exp3-adversarial-linear (BA)
explore-then-commit (BA)
follow-the-leader-mirror-descent (BA)
index (BA)
info theory (BA)
intro (BA)
least-squares-estimators-confidence-bounds (BA)
least-squares-estimators-optimal-design (BA)
lower-bounds (BA)
lower-bounds-high-probability (BA)
lower-bounds-instance-dependent (BA)
lower-bounds-minimax (BA)
markov-decisions (BA)
non-stationary (BA)
partial-monitoring (BA)
probability (BA)
pure-exploration (BA)
ranking (BA)
stochastic-finite (BA)
stochastic-linear (BA)
stochastic-linear-asymptotic-lower-bounds (BA)
stochastic-linear-finite-many-arms (BA)
stochastic-linear-minimax-lower-bounds (BA)
stochastic-linear-sparsity (BA)
stochastic-markov (BA)
thompson-sampling (BA)
upper-confidence-bound (BA)

bash

common linux/bash commands (Data Science - Command Line)

environment (alias, bash, cols, for, sudo, ...)
files & directories (body, cd, cat, chmod, ...)
pattern matching (awk, sed, grep)
deployment (aws, git)
CSV data
JSON data
online data (curl, scp, scrape, ssh)
integer/date sequences
file extraction/compression (tar, tree, uniq, ...)

bayes

bayes inference (CSI)

two examples
uninformed prior distributions
flaws in frequentist inference
bayes vs frequentist comparison

bayes nets (directed graphs) (SM)
bayes statistics (NP)

intro & model definition
sampling posterior distributions
linear regression

bayesian statistics (SM)

intro
posterior distribution
MAP estimates
bayes model selection
priors
hierarchical bayes
empirical bayes
decision theory

cheatsheets

deep learning cheatsheet (2018) (SCDL)

CNNs, RNNs, tips & tricks

sampling methods (PSC)

inverse transform sampling; the bootstrap; rejection sampling; importance sampling

classification

cal housing market analysis (HoML)
classification basics (HoML)

MNIST, aka hello world
confusion matrix
metrics (precision,recall)
ROC curve
multiclass classification
multilabel classification
multioutput classification

discriminants (LDA, QDA) (SKL)

Linear DA
Quadratic DA

linear classification (ESL)

regression - indicator matrix
linear discriminant analysis (LDA)
logistic regression
hyperplanes

logistic regression (SKL)

solvers - liblinear, newton-cg, lbfgs, sag, saga

metrics (SKL)

accuracy, top-K accuracy, balanced accuracy
cohen's kappa, confusion matrix, classification report
hamming loss, precision, recall, f-measure
precision-recall curve, avg precision
precision-recall curve (multilabel)
jaccard similarity
hinge loss
log loss
matthews correlation coefficient
confusion matrix (multilabel)
ROC curve
detection-error tradeoff (DET)
zero-one loss
brier score

multiclass & multioutput algos (SKL)

intro
multiclass (aka label binarization)
one-vs-rest
multilabel
one-vs-one
output code
multioutput
classifier chains
multiclass-multioutput (aka multitask)

multilayer perceptron (MLP) (SKL)
naive bayes (SKL)

NB classification (gaussian, multinomial, complement, bernoulli)
categorical NB

nearest neighbors (SKL)

basic algos (ball tree, KD tree, ...)
KNNs & radius-based algos
nearest centroids
neighborhood components analysis (NCA)

nearest neighbors (ESL)

prototype methods (kmeans, learning vector quant, gaussian mixtures)
knn classifiers
adaptive NN methods
computational performance

clustering

biclustering methods (SKL)

intro, spectral co/biclustering

clustering (DMMD)

intro (data, strategies, dimensionality)
hierarchical
k-means
CURE (clustering using representatives)
non-euclidean spaces
clustering for streams & parallelism

clustering (FDS)

intro
k-means (lloyds algo, wards algo)
k-center
low-error
spectral
approximation stability
high-density
kernel methods
recursive clustering w/ sparse cuts
dense submatrices & communities
community finding & graph partitions
spectral clustering & social nets

clustering (ESL)
clustering methods (SKL)

Kmeans & Kmeans minibatch
Affinity propagation
Mean shifts
Spectral clustering
Agglomerative clustering
Hierarchical clustering
DBSCAN
Birch
OPTICS

clustering metrics (SKL)

rand index; mutual info score; homogeneity / completeness / v-measure; Fowlkes-Mallows score; silhouette coefficient; Calinski-Harabasz index; Davies-Bouldin index

combinationals

job scheduling (ADM)

given a directed acyclic graph (vertices = jobs, edges = task dependencies), what schedule completes the job in minimum time/effort?
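
a small sketch of the simplest version of this problem, assuming each job has a known duration and there are enough workers to run all ready jobs in parallel (both are assumptions, not part of the statement above). Under those assumptions the minimum completion time is the length of the longest dependency chain; `earliest_schedule` is an illustrative name.

```python
from graphlib import TopologicalSorter


def earliest_schedule(durations, depends_on):
    """Return (start_times, total_time) for jobs on a dependency DAG.

    durations:  {job: time needed}
    depends_on: {job: set of jobs that must finish first}
    """
    graph = {job: set(depends_on.get(job, ())) for job in durations}
    start = {}
    # static_order() yields every job after all of its prerequisites.
    for job in TopologicalSorter(graph).static_order():
        # A job can start once its slowest prerequisite has finished.
        start[job] = max((start[p] + durations[p] for p in graph[job]), default=0)
    total = max((start[j] + durations[j] for j in durations), default=0)
    return start, total
```

with durations a=3, b=2, c=4, d=1, where c depends on a and b and d depends on c, job c starts at time 3, job d at time 7, and everything finishes at time 8. A cyclic input raises `graphlib.CycleError`.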

partitions (ADM)

given integer n, generate partitions that add up to n.
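
a minimal recursive sketch, assuming "partitions" means integer partitions (unordered sums of positive integers); each partition is emitted as a non-increasing tuple so that no partition appears twice:

```python
def partitions(n, largest=None):
    """Yield each partition of n as a non-increasing tuple of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    # Choose the first (largest) part, then partition the remainder
    # using parts no bigger than that choice.
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest
```

`list(partitions(4))` gives `[(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]`.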

permutations (ADM)

given n, generate the permutations (orderings) of a set of n items.
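
a short sketch using the standard library, assuming the goal is to enumerate every ordering of n distinct items (here labeled 0..n-1; `all_orderings` is an illustrative name):

```python
from itertools import permutations


def all_orderings(n):
    """Return every ordering of the items 0..n-1, in lexicographic order."""
    return list(permutations(range(n)))
```

there are n! orderings, so this is only feasible for small n; for larger n, iterating over `permutations(range(n))` lazily avoids holding them all in memory.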

satisfiability (ADM)

given a set of logical constraints, is there a configuration that satisfies the set?
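
a brute-force sketch, assuming the constraints are given in conjunctive normal form with integer literals (a positive i means variable i is true, a negative i means it is false). This encoding and the function name are assumptions for illustration; the search tries all 2^n assignments, so it only suits tiny instances.

```python
from itertools import product


def brute_force_sat(clauses, num_vars):
    """Return a satisfying assignment {var: bool} for a CNF formula, or None."""
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        # Every clause needs at least one literal that evaluates to true.
        if all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return assignment
    return None
```

for the clauses (x1 or x2), (not x1 or x2), (not x2 or x3), written as `[[1, 2], [-1, 2], [-2, 3]]`, the first assignment found is x1=False, x2=True, x3=True.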

computation - complexity - performance - benchmarking

amortization (ITA)
growth (ITA)
multithreading (ITA)
np complete (ITA)
np hardness (JE)
numba cython (NP)
performance (DIDL)
scaling (DMMD)

data structures

b-trees (ITA)
datastructs (ADM)
datastructs (DSA)
datastructs augmenting (ITA)
datastructs disjoint (ITA)
dictionaries (ADM)
dictionaries (DSA)
fibonacci heaps (ITA)
hashes (ITA)
heaps (EA)
intro (ITA)
kd trees (ADM)
lists (EA)
priority queues (ADM)
priority queues (DSA)
queues sequences (EA)
red black trees (ITA)
steiner trees (ADM)
suffix trees (ADM)
trees (EA)
van emde boas trees (ITA)

datasets

6619 datasets as of july 2022 (paperswithcode)
artificial (generated) datasets (SKL)
example datasets (SKL)
other datasets (SKL)

deep learning architectures

CNN cheatsheet (SCDL)
adversarial apps (paperswithcode)
convolutional NNs (DLG)
convolutional NNs (DLG)
deep feedforward NNs (DLG)
deep generative models (DLG)
deep learning (DLG)
gans (DIDL)
intro (ESL)
intro to neural nets (CSI)

intro; fitting; autoencoders; deep learning; learning (dropout, input distortion)

linear NNs (DIDL)
neural network zoo (asimov institute)
perceptrons (DIDL)
representation learning (DLG)

greedy layer-wise unsupervised pretraining
transfer learning | domain adaptation
semi-supervised disentangling of causal factors
distributed representation
exponential gains from depth
providing clues to find underlying causes

structured probabilistic models (DLG)

challenges; using graphs; sampling from graphs; advantages; dependencies; inference & approx inference

density estimation

density estimates (PSC)

density estimates
histograms
kernel density estimator (KDE)

density estimation methods (SKL)

intro, histograms, kernel density estimates (KDE)

design

algo-analysis (ADM)
algorithm design (ADM)
feature selection (SKL)
model assessment (ESL)
model selection & evaluation (arxiv)

dimensional reduction

Nonlinear Dimension Reduction | Local Multidimensional Scaling (ESL)
component analysis & matrix factorization (SKL)
independent component analysis (ICA) (ESL)
manifold learning (SKL)
multi-dimensional scaling (MDS) (ESL)
non-negative matrix transform (NNMF) (ESL)
principal component analysis (PCA) ()
principal components (ESL)
self-organized maps (ESL)

dynamic programming

dynamic programming (ADM)
dynamic programming (ITA)
dynamic programming (JE)

intro; faster fibonacci numbers; smart recursion; greed is stupid; longest increasing subsequence; edit distance; subset sum; binary search trees; dynamic programming on trees

ensembles

bagging (SKL)
boosting (adaboost, gradient tree boosting, histogram boosting) (SKL)
boosting additive trees (ESL)
catboost (catboost.ai)

gradient boosting on decision trees

decision trees (HoML)
decision trees (SKL)
ensemble learning (HoML)
ensembles (ESL)
random forests (ESL)
random forests boosting (CSI)
stacking (general case) (SKL)
voting (SKL)
xgboost (xgboost.ai)

gradient boosting library

evaluation

(hyper)parameters (SKL)
calibration curves (SKL)
covariance (SKL)
cross validation methods (SKL)
cross-validation (CSI)
dummy metrics (SKL)
metrics (SKL)
sparse models & lasso (CSI)

feature engineering

composite transforms (SKL)
data aggregation & grouping (PDA)
data cleaning (PDA)

missing data; transforms; extension data types; string ops; category ops

data imputation (SKL)
data preprocessing (SKL)
feature extraction (SKL)
feature extraction (text) (SKL)
image patch extraction (SKL)
join, combine, reshape (PDA)

hierarchical indexing; combining datasets; reshaping; pivoting

pairwise-data operations (SKL)
prediction target transforms (SKL)
random projections (SKL)
scrubbing data (DSCL)

transforms
plain text
CSV
XML,HTML,JSON

file I/O

data I/O (DSCL)

local data to docker
internet downloads (curl, ...)
decompressions (zip, ...)
excel to CSV
relational DBs
web APIs
authentication
streaming APIs

file I/O (NP)

CSV; HDF5; h5py; Pytables; serialization

file I/O - datatypes (PDA)

text files; JSON; XML/HTML scraping; binary data; web APIs; databases

gaussians

gaussian mixtures (SM)
gaussian mixtures (GMMs) & expectation maximization (EM) (SKL)
gaussian models (SKL)

gaussian regressions

gaussians (SM)

generative models

generative models - discrete data (SM)

generative classifiers; bayesian concept learning; beta-binomial model; dirichlet-multinomial model; naive bayes classifiers

geometry

bin packing (ADM)

given n items and m bins - store all the items using the smallest number of bins.
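
a sketch of the first-fit-decreasing heuristic, assuming each item has a numeric size and every bin has the same capacity (neither is spelled out above). It is a heuristic: it generally packs well but does not guarantee the minimum number of bins.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack item sizes into bins of the given capacity; returns a list of bins."""
    bins = []
    # Placing the largest items first tends to leave less wasted space.
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity {capacity}")
        for b in bins:
            # Put the item in the first bin that still has room.
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([size])
    return bins
```

`first_fit_decreasing([4, 8, 1, 4, 2, 1], 10)` gives `[[8, 2], [4, 4, 1, 1]]`, i.e. two bins.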

convex hulls (ADM)
geometric primitives (ADM)
geometry (ITA)
intersections (ADM)
line arrangements (ADM)
medial axis xforms (ADM)
minkowski sum (ADM)
motion planning (ADM)
nearest neighbors (ADM)
point location (ADM)
polygon partitions (ADM)
polygon simplification (ADM)
range search (ADM)
shape similarity (ADM)
spatial structures (DSA)

multi-dimensional structures; planar straight-line graphs; search trees; quad/octal trees; binary space partitioning trees; r-trees; spatio-temporal data; kinetic structures; online dicts; cuttings; approximate geometric queries

triangulation (ADM)
vector spaces (LAY)

graphs

basic algorithms (JE)

definitions; representations; data structures; whatever-first search; depth-first; breadth-first; best-first; disconnected graphs; directed graphs
reductions (flood fill)

chinese-postman (ADM)

given a graph, find the shortest tour that traverses every edge at least once.

cliques (ADM)

how to find the largest clique (cluster) in a graph?
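
a brute-force sketch for an undirected graph given as a vertex list and an edge list (an assumed representation). It checks vertex subsets from largest to smallest, which takes exponential time and is only meant to make the definition concrete on small graphs.

```python
from itertools import combinations


def largest_clique(vertices, edges):
    """Return one maximum clique of an undirected graph as a set of vertices."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(vertices), 0, -1):
        for group in combinations(vertices, size):
            # A clique requires an edge between every pair in the group.
            if all(frozenset(pair) in edge_set for pair in combinations(group, 2)):
                return set(group)
    return set()
```

for vertices 1..4 with edges (1,2), (2,3), (1,3), (3,4), the result is `{1, 2, 3}`.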

connected components (ADM)

find the pieces of a graph, where vertices x & y are members of different components if no path exists from x to y.
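
a minimal breadth-first-search sketch for an undirected graph given as a vertex list and an edge list (an assumed representation; the function name is illustrative):

```python
from collections import deque


def connected_components(vertices, edges):
    """Return the components of an undirected graph as a list of sets."""
    neighbors = {v: set() for v in vertices}
    for x, y in edges:
        neighbors[x].add(y)
        neighbors[y].add(x)
    seen, components = set(), []
    for v in vertices:
        if v in seen:
            continue
        # Flood outward from v; everything reached shares v's component.
        component, queue = {v}, deque([v])
        seen.add(v)
        while queue:
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components
```

with vertices 1..5 and edges (1,2), (2,3), (4,5), this returns `[{1, 2, 3}, {4, 5}]`.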

edge coloring (ADM)

what's the smallest set of colors needed to color the edges of a graph, such that no two same-color edges share a common vertex?

edge vertex connectivity (ADM)

what's the smallest subset of vertices (edges) whose deletion will disconnect a graph?

feedback edge vertex set (ADM)
flows & cuts applications (JE)

edge-disjoint paths
vertex capacities & vertex-disjoint paths
bipartite matching
tuple selection
disjoint-path covers
baseball elimination
project selection

graph algos (ITA)

representations; breadth-first search; depth-first search; topological sorting; strongly-connected components

graph algos (SOTA) (paperswithcode)
graph datastructs (ADM)

adjacency matrices; adjacency lists

graph drawing (ADM)
graph generation (ADM)
graph isomorphism (ADM)

given two graphs G & H, find a function from G's vertices to H's vertices such that G & H are identical.
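
a brute-force sketch for small undirected graphs, assuming each graph is a vertex list plus an edge list. It tries every vertex bijection (n! candidates), so it only illustrates the definition; the function name is hypothetical.

```python
from itertools import permutations


def find_isomorphism(g_vertices, g_edges, h_vertices, h_edges):
    """Return a dict mapping G's vertices onto H's that preserves edges, or None."""
    g_vertices, h_vertices = list(g_vertices), list(h_vertices)
    g_set = {frozenset(e) for e in g_edges}
    h_set = {frozenset(e) for e in h_edges}
    # Cheap necessary conditions: same vertex count and same edge count.
    if len(g_vertices) != len(h_vertices) or len(g_set) != len(h_set):
        return None
    for image in permutations(h_vertices):
        mapping = dict(zip(g_vertices, image))
        # Relabel G's edges through the mapping and compare with H's edges.
        if {frozenset(mapping[v] for v in e) for e in g_set} == h_set:
            return mapping
    return None
```

for the path 1-2-3 and the path a-c-b, the mapping found sends the middle vertex 2 to c.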

graph link analysis (DMMD)

PageRank; link spam; hubs & authorities

graph partition (ADM)

given a weighted graph G and integers k & m, partition the vertices of G into m equally-sized subsets such that the total edge cost spanning the subsets is at most k.

graph traversal (ADM)
graphs connected components (ADM)
graphs hard (ADM)
graphs polynomial time (ADM)
graphs weighted (ADM)
graphviz (tool) (graphviz)
hamiltonian cycles (ADM)
matching (ADM)
maxflow (ITA)
min spanning trees (JE)
min spanning trees (ITA)
minimum spanning tree (ADM)
network flow (ADM)
planarity detection (ADM)
random graphs (FDS)
social graphs (DMMD)
sparse matrices graphs (NP)
transitive closure (ADM)
traveling salesman (ADM)
tree drawing (ADM)
undirected graphs (ESL)
vertex coloring (ADM)
vertex cover (ADM)

greedy algos

greedy algos (ITA)
greedy algos (JE)

inference

after-model-selection-estimation (CSI)

accuracy after model selection
selection bias
combined bayes-frequentist estimation
notes

inference & max likelihood (ESL)
inference frequentist (CSI)
parametric inference (PSC)

information theory

info theory tutorial (stone, USheffield)

finding a route
bits are not binary digits
entropy
entropy - continuous variables
max-entropy distributions
channel capacity
shannon's source coding theorem
noise reduces channel capacity
mutual info
shannon's noisy channel coding theorem
gaussian channels
fourier analysis
history
key equations

interviewing

101 Interview Questions (BJP)

kernels

kernel smoothing (ESL)
kernels (SM)
kernels (CSI)

label spreading, label propagation

semi-supervised learning (SKL)

latent variables

linear factor models (DLG)

probabilistic PCA + factor analysis
independent component analysis
sparse coding
manifold representation of PCA

learning

knowledge papers (paperswithcode)
reasoning papers (paperswithcode)
stochastic gradient descent (SGD) (SKL)

linear models

generalized linear models (SM)

(incomplete notes in orig PDF)

linear programming

linear programming (ADM)
linear programming (ITA)

make

intro to make (DSCL)

overview|intro; running tasks; building; dependencies; summary

markov chains

markov chains (explained visually)
markov-chain-monte-carlo (CSI)
markov-chain-monte-carlo (SM)
random walks markov chains (FDS)

matrix math

basics (DIDL)

linear & matrix ops
eigen decompositions
single-variable calculus
multi-variable calculus
integrals
random variables

determinants (LAY)
eigenvectors & eigenvalues (LAY)

intro; eigenvectors & difference equations
determinants & characteristic equations
similarity
diagonalization
eigenvectors & linear transforms
complex eigenvalues
discrete dynamical systems
differential equations
iterative estimates

inner-product-length-orthogonality (LAY)
linear algebra overview (DLG)

scalars, vectors, matrices, tensors
vector|matrix multiplication
identity matrix
inverse matrix
linear dependence
span
norms
diagonal matrix
symmetric matrix
orthogonal matrix
eigen decomposition
singular value decomposition (svd)
moore-penrose pseudoinverse matrix
trace operator
determinant
example - principal components analysis (PCA)

matrix cookbook (matrixcookbook.com)

basics
derivatives
inverses
complex matrices
solutions & decompositions
multivariate distributions
gaussians
special matrices
functions & operators
1-D results
proofs

matrix determinants (ADM)
matrix math (LAY)
matrix multiply (ADM)
matrix ops (ITA)
numerical basics (ADM)

linear equations
bandwidth reduction
matrix multiplication
determinants & permanents
optimization (constrained, unconstrained)
linear programming
random number gen
factors & prime testing
arbitrary-precision math
the knapsack problem
discrete fourier transforms (DFTs)

symmetric matrices (LAY)

max likelihood estimation (MLE)

fisherian inference & MLE (CSI)

methods

methodologies (paperswithcode)

representation learning; transfer learning; image classification; reinforcement learning; 2D classification; domain adaptation; data augmentation; ...

mixtures

latent linear models (SM)

factor analysis
principal components analysis (PCA)
choosing number of dimensions
PCA for categories
PCA for paired & multiview data
independent component analysis (ICA)

monte carlo

monte carlo methods (DLG)

sampling; importance sampling; markov chain monte carlo (MCMC); gibbs sampling; mixing challenges

multilabel

metrics (SKL)

natural language processing

Gensim lessons ()
NLP SOTA (paperswithcode)

595 tasks (july2022)

natural language processing (NLP) (DIDL)
spaCy tutorial (spacy.io)
topic models (FDS)

topic models
non-negative matrix factorization (NMF)
hard & soft clustering
latent dirichlet allocation (LDA)
dominant admixtures
math
term-topic matrices
hidden markov models
graph models & belief propagation
bayes|belief nets
markov random fields
factor graphs
tree algorithms
message passing
single-cycle graphs
single-loop belief updates
max weight matching
warning propagation
variable correlation

novelties-outliers

novelty & outlier detection (SKL)

numerical analysis

cross decomposition (SKL)

canonical PLS (partial least squares)
SVD (simplified) PLS
PLS regression

cryptography (ADM)
diffeqs ordinary (NP)
diffeqs partial (NP)
dimensionality (ESL)
dimensionality (FDS)
dimensionality reduction (HoML)
dimensionality reduction (DMMD)
discrete fourier xform (ADM)
equation solving (NP)
integration (NP)
interpolation (NP)
optimization (ADM)
optimization (DIDL)
optimization (DLG)
optimization (NP)
optimization (SM)
polynomials & FFTs (ITA)
signal processing (NP)
splines (ESL)
summations (ITA)

numpy

advanced techniques (PDA)

ndarray internals
array manipulation
broadcasting
ufuncs
structured & record arrays
sorting
numba
advanced array I/O
performance tips

basics (PDA)
numpy basics (PDSH)

arrays; boolean arrays; broadcasting; indexing; sorting; structured data; aggregations; ufuncs; data types

vectors, matrices, ndarrays (NP)

pandas

pandas basics (PDA)

series; data frames; index objects; essential functions; descriptive stats

pandas basics (PDSH)

aggregation/grouping, concat, append, hierarchical indexes, merge, join, missing values, objects, ops, performance, pivot tables, time series ops, vectorized string ops

parametric models

parametric models & exponential families (CSI)

performance

parallelism tips (SKL)
parallelization (pipelines) (DSCL)

serial processing
parallel processing
distributed processing

performance tips (SKL)
scaling tips (SKL)

planning

planning algorithms (LaValle)

intro
motion planning
decision theory
differential-constraint planning

planning / capacity

knapsack problems (ADM)

probabilistic analysis

Probabilistic Analysis and Randomized Algorithms (ITA)

Indicator random variables, Randomized algorithms, Probabilistic analysis and further uses of indicator random variables

probability & statistics

bootstrap-confidence-intervals (CSI)
cheatsheet (WillChen)
cookbook ()
counting-probability (ITA)
distributions (PSC)
distributions multivariate (PSC)
expectation (PSC)
frequentist stats (SM)
hypothesis testing (PSC)
hypothesis testing - false discovery (CSI)
intro ()
medians (ADM)
medians orderstats (ITA)
modeling ()
other math (PSC)
probability (DLG)
probability (SM)
random vars (PSC)
statistics (NP)
stats glossary ()
survival analysis & EM (CSI)
theory (PSC)
variance (PSC)

pycaret

PyCaret intro (BJP)

PyCaret is a high-level, low-code Python library for comparing, training, evaluating, tuning, and deploying machine learning models in a few lines of code. At its core, PyCaret is a large wrapper over data science libraries such as Scikit-learn, Yellowbrick, SHAP, Optuna, and spaCy. You could use those libraries directly for the same tasks, but PyCaret can save a lot of time if you don't want to write much code.

recommenders

recommenders (DIDL)
recommenders (DMMD)

recurrent NNs

RNNs with Tensorflow (HoML)
cheatsheets (SCDL)
recurrent NNs (DIDL)
recursive nets (DLG)

recursion

backtracking (AJE)
backtracking (JE)
recursion (JE)

reductions
simplify & delegate
tower of hanoi
mergesort
quicksort
design pattern
recursion trees
linear-time selection
fast multiplication
exponentiation

regression

additive-models-trees (ESL)
general linear models & regression trees (CSI)
isotonic regression (SKL)
jackknife (CSI)
linear models (OLS, ridge, lasso, AIC/BIC, elastic-net, LARS, OMP, Bayes, GLM, Tweedie) (SKL)
linear regression (ESL)
linear regression (PSC)
linear regression (SM)
logistic regression (SM)
metrics (SKL)
multiclass & multioutput algos (SKL)

multioutput
regressor chains

regularization (DLG)
ridge regression (SKL)
ridge regression (CSI)

reinforcement learning

RL with Tensorflow (HoML)
approximation-off-policy-methods (RL)
approximation-on-policy-control (RL)
approximation-on-policy-prediction (RL)
dynamic programming (RL)
eligibility traces (RL)
frontiers (RL)
markov finite (RL)
monte carlo (RL)
n-step bootstrap (RL)
policy gradients (RL)
reinforcement learning ()
tabular method planning (RL)
temporal distances (RL)

restricted boltzmann machines

restricted boltzmann machines (RBMs) (SKL)

robotics

robotics apps (paperswithcode)

searching & sorting

all-pairs-shortest-paths (ITA)
all-pairs-shortest-paths (JE)
binary-search-trees (ITA)

definition
query
insert, delete
random build

combinational search (ADM)
depth first search (AJE)
depth first search (JE)
heapsort (ITA)
linear time sort (ITA)
quicksort (ITA)
searching (ADM)
searching (ADM)
similarity search (DMMD)
single source shortest paths (ITA)
sort search (EA)
sorting & searching (ADM)
summary (ADM)
topological sort (ADM)

set theory

finite state machine minimization (ADM)
set cover (ADM)
set packing (ADM)
sets (ADM)
sets (ITA)
sets independent (ADM)
sets strings ()
subsets (ADM)

streams

mining data streams (DMMD)

strings

longest common substring (ADM)
shortest common superstring (ADM)
shortest path (ADM)
string matching (ADM)
string matching (ITA)
string matching approx (ADM)
text compression (ADM)

survival analysis

survival analysis ()

svd

singular value decomposition (svd) (FDS)

svms

support vector machines (ESL)
support vector machines (SVMs) (SKL)

classification (SVC, NuSVC, LinearSVC)
multiclass SVM
scoring & metrics
weighted classes/samples
regression (SVR, NuSVR, LinearSVR)
complexity
kernels
precomputed kernels - the Gram matrix

svms (HoML)

sympy

intro (NP)

symbols; expressions; numeric evaluation; calculus (derivatives, integrals, series expansions, limits, sums & products); equation solvers; linear algebra

tbd

statistical inference (PSC)
the partition problem (DLG)

tensorflow

CNNs (HoML)
DNNs (HoML)
neural net definitions (HoML)
setup (HoML)

time series

Prophet (Facebook)
calendar math (ADM)
time series (PSC)
time series applications (SOTA) (paperswithcode)
time series ops (PDA)

date & time datatypes; ranges, frequencies & shifting; periods; frequency conversion; moving windows

tools

common languages (DSCL)

Jupyter, R, Python, RStudio, Spark

creating one-liners (DSCL)

one-liners to scripts
creation using python or R

installation (SKL)
jq (JSON) basics ()
libraries (DSA)
list of tools (expanded) (DSCL)
mapreduce (DMMD)
patsy, statsmodels & scikit-learn (PDA)
python - pandas (NP)
resources (ADM)
resources (ADM)
statsmodels, patsy (NP)

topology

hyperbolic topology (GT)

groups; spaces; manifolds; thick-thin decomposition; sphere at infinity

surfaces (GT)

intro; teichmuller spaces; surface diffeomorphisms

three-manifolds (GT)

topology; seifert manifolds; construction; the "eight geometries"; mostow rigidity problem; hyperbolic 3Ms; hyperbolic dehn filling

training

training (HoML)

use cases

Google PageRank (ESL)
advertising (DMMD)
applications (DLG)
applications (DSA)
applications (RL)
audio algos (paperswithcode)
code generation algos (paperswithcode)
game-playing algos ()
medical applications (SOTA) (paperswithcode)
music papers (paperswithcode)
product embedding - ecommerce (arxiv)
speech algos (paperswithcode)

vision

computer vision SOTA (paperswithcode)

1300 tasks (july2022)

developers tools (scikit-image)
edges & lines (scikit-image)

contour finding
convex hulls (binary images)
canny filters
marching cubes
ridge operators
active contour model
drawing std shapes
random shapes
hough transforms (straight line)
approximating & subdividing polygons
hough transforms (circular, elliptical)
skeletonizing
morphological thinning
edge operations (multiple)

exposures & colors (scikit-image)

RGB-grayscale conversions
RGB-HSV conversions
histogram matching
(ex) immunohistochemical (IHC) staining
adapting grayscale filters to RGB images
regional maxima filtering (bright features)
local histogram equalization (LHE)
gamma & log-contrast adjustments
histogram equalization
tinting grayscale images

filtering & restoration (scikit-image)
image datasets (scikit-image)
longform examples (scikit-image)
numpy basic ops (scikit-image)
object detection (scikit-image)
object segmentation (scikit-image)
transforms & registration (scikit-image)

visualization

catscatter scatterplot demo (myriam barnes)
clifford attractors ()

simple demo using ggplot

data exploration / visualization (DSCL)

headers
descriptive stats
visuals / chart types

display objects (SKL)
hypertools (hypertools.readthedocs.io)
inspection plots (SKL)
matplotlib basics (NP)
partial dependence plots (PDPs) (SKL)
permutation feature importance plots (SKL)
plotting & visualization (PDA)

matplotlib primer; pandas & seaborn; other tools

receiver operating characteristic (ROC) curves (SKL)
seaborn basics (PDSH)

intro and pokemon tutorial

voronoi diagrams (ADM)

wavelets

wavelets (FDS)

Perfectly Awesome
