Python Resources

mostly Jupyter notebooks - hosted on GitHub


NumPy:

basics   (numeric python)
datatypes, typecasting, promoting, complex numbers, memory, arrays, indexes, slices, views, fancy indexing, boolean indexing, reshaping, merging, vectorization, math ops, aggregate ops, boolean arrays, conditionals, logic, set ops, matrix ops
home page   (
intermediate   (python DS handbook)
arrays, boolean arrays, masking, broadcasting, fancy indexes, sorting, structured data, aggregations, ufuncs, datatypes


Pandas:

articles   (towards DS)
TDS article search
basics   (numeric python)
series, DataFrames, time series
home page   (
intermediate   (python DS handbook)
aggregations, groups, concat/append, hierarchical indexes, merge/join, missing values, pivot tables, time series, vectorized objects
tips & tricks   (towards DS (blog))
date ranges, merges, save to excel, file compression, histograms, pdfs, cdfs, least squares, timing, display options, pandas 1.0 features


Statistics:

statistics - Bayes   (numeric python)
normal distribution, dependent variables, posterior distributions, linear regression, multilevel models
statistics - basics   (numeric python)
random numbers, distributions, hypothesis testing, kernel density estimation
statsmodels, patsy   (numeric python)
patsy, categorical variables, linear regression, discrete & logistic regression, poisson distribution, time series

Scientific Computation with SciPy:

ordinary differential equations (ODEs)   (numeric python)
symbolic solutions, directional field graphs, laplace transforms, numerical methods, numerical integration
partial differential equations (PDEs)   (numeric python)
integration   (numeric python)
simpson's rule, multiple integration, scikit-monaco, symbolic/multiprecision quadrature, laplace transforms, fourier transforms
interpolation   (numeric python)
polynomials, splines, multivariates
signal processing   (numeric python)
spectral analysis, fourier transforms, frequency-domain filters, windowing, spectrograms, convolutions, FIRs, IIRs
sparse matrices & graphs   (numeric python)
sparse matrices, sparse linear algebra, eigenvalue problems, graphs & networks

Feature Engineering:

setup, tips, caching, regression target transforms
data imputation basics   (scikit-learn 0.24)
univariate, multivariate, nearest-neighbor, marking imputed values
datasets - simple examples   (scikit-learn 0.24)
iris, digits, California housing, labeled faces, 20 newsgroups, (more)
feature engineering intro   (python DS handbook)
one-hot encoding, word counts, tf-idf, linear-to-polynomial, missing data, pipelines
feature extraction (text)   (scikit-learn 0.24)
bag of words, sparsity, vectorizers, stop words, tf-idf, decoding, applications, limits, the hashing trick, out-of-core ops
file i/o   (numeric python)
CSV, HDF5, h5py, pytables, hdfstore, JSON, serialization, pickle issues
preprocessing basics   (scikit-learn 0.24)
mean removal, variance scaling, sparse scaling, outlier scaling, distribution maps, normalization, category coding, binning, binarization, polynomial features
random projections   (scikit-learn 0.24)

Machine Learning:

README   (scikit-learn)
biclustering   (scikit-learn)
spectral co-clustering, spectral bi-clustering
calibration curves   (scikit-learn)
(ex) classifier confidence
MNIST, metrics, confusion matrix, precision & recall, ROC, multiple classes, error analysis, multiple labels, multiple outputs
label propagation
classification metrics   (scikit-learn)
clustering   (scikit-learn)
overview, k-means, affinity propagation, mean shift, spectral, hierarchical, dbscan, optics, birch, metrics
component analysis   (scikit-learn)
component analysis   (DS handbook)
intro, random projections, feature agglomeration, dimensional reduction, noise filter, eigenfaces
composite transformers   (scikit-learn)
pipeline, feature union
covariance   (scikit-learn)
empirical, shrunk, sparse inverse covariance, robust estimation
cross decomposition   (scikit-learn)
cross validation   (scikit-learn)
user guide, ROC curves, K-fold, leave-one-out (LOO), leave-p-out (LPO), stratified, shuffled, group-K-fold
datasets (toys)   (scikit-learn)
datasets - other sources   (scikit-learn)
decision trees   (scikit-learn)
training, viz, predictions, CART, gini vs entropy, regularization
density estimation   (DS handbook)
histograms, spherical KDEs, custom estimators
density estimation   (scikit-learn)
validation, linear algebra, arrays, random sampling, graphs, testing, multiclass/multilabel, helpers, hashes, warnings, exceptions
curse of dimensionality, projections, manifolds, PCA, explained variance, choosing dimensions, PCA for compression, incremental PCA, randomized PCA, kernel PCA, selecting a kernel, LLE, MDS, isomap, t-SNE, LDA
discriminant analysis   (scikit-learn)
dimensionality reduction, LDA, math, shrinkage, estimators
cosine similarity, kernels (linear, polynomial, sigmoid, RBF, laplacian, chi-squared)
ensembles (bagging)   (scikit-learn)
ensembles (boosting)   (scikit-learn)
ensembles (voting)   (scikit-learn)
feature extraction (text)   (scikit-learn)
feature selection   (scikit-learn)
low-variance features, univariate selection, recursive elimination, selecting from a model, pipeline ops
file IO   (scikit-learn)
gaussian mixtures   (scikit-learn)
expectation maximization (EM), confidence ellipsoids, bayes info criterion & n_clusters, covariance constraints (spherical, diagonal, tied, full), variational bayes (extension of EM)
gaussian processes   (scikit-learn)
regressions, classifiers, kernels
classification, regression, sparse data, complexity, stopping, tips, implementation
hyperparameters   (scikit-learn)
user guide, grid search, random parameters, tips, brute force alternatives
inspection plots   (scikit-learn)
kernel approximations   (scikit-learn)
Nystroem method, std kernels
linear models   (scikit-learn)
user guide, OLS, ridge regression, lasso, elastic net, LARS, OMP, bayes, ARD, passive-aggressive algos, robustness, ransac vs theil-sen vs huber, polynomial regression
logistic regression   (scikit-learn)
manifolds   (scikit-learn)
hello, MDS, non-linear embeddings, tradeoffs, isomap on faces
metrics & scoring basics   (scikit-learn)
multilabel/multiclass   (scikit-learn)
label formats, OvR, OvO, error-correcting output codes (ECOC), multiple outputs, classifier chains, regressor chains
definition, as a classifier, as a regressor, regularization, loss functions, complexity, math, tips, warm_start
naive bayes   (scikit-learn)
gaussian, multinomial, complement, bernoulli, out-of-core
nearest neighbors   (scikit-learn)
unsupervised, KD trees, Ball trees, regressions, nearest centroids, NCA
novelties & outliers   (scikit-learn)
definitions, methods, novelty detection, outlier detection, elliptic envelope, iso forest, local outlier factor, novelties with LOF
python vs cython vs c, code profiling, memory profiling, cython tips, profiling compiled extensions, joblib.Parallel, warm_start
regression (isotonic)   (scikit-learn)
regression (kernel ridge)   (scikit-learn)
regression metrics   (scikit-learn)
parameters, bernoulli RBM, stochastic max likelihood learning
support vector machines   (scikit-learn)
classification, regression, density estimates, novelty detection, complexity, tips, kernel functions, implementation
classification (linear), classification (nonlinear), polynomial features, the kernel trick, similarity functions, gaussian RBF kernels, regression
validation curves, learning curves
viz/display objects   (scikit-learn)

Natural Language Processing (NLP):

GenSim 101   (gensim)
similarity queries, text summaries, distance metrics, LDA, Annoy, PDLN, doc2vec, word mover, fasttext
NLTK 101   (NLTK)
data cleanup, bag of words, classifier fit, metrics, feature pareto, tf-idf, semantic meanings, CNN
SpaCy 101   (spacy)
tokens, POS tags, dependency parsing, lemmas, sentence boundaries, named entities, similarity, text classification, rule-based matches, training, serialization

Deep Learning with Tensorflow:

DNNs   (scikit-and-tensorflow-workbooks)
gradients, activation functions, batch normalization, gradient clipping, model reuse, layer freeze & cache, model zoos, regularization
RNNs   (scikit-and-tensorflow-workbooks)
intro, sequences, unrolling, simplification, training, deep RNNs, LSTMs, GRU cells, NLP basics
autoencoders (AEs)   (scikit-and-tensorflow-workbooks)
intro, stacked AEs, tying weights, reconstructions
convolutional neural nets (CNNs)   (scikit-and-tensorflow-workbooks)
layers, filters, map stacking, padding & pooling, architectures
intro   (scikit-and-tensorflow-workbooks)
installation, graphs, gradient descent, momentum, model save-restore, visualization, tensorboard, sharing variables
neural nets   (scikit-and-tensorflow-workbooks)
perceptrons, MLPs, backprop, training
reinforcement learning (RL)   (scikit-and-tensorflow-workbooks)
openAI gym, policies, markov decision processes, q-learning

Deep Learning with PyTorch:

tensors, numpy arrays, cuda, autograd, gradients, neural net design, loss functions, backprop, weight updates, training, CNN definition, testing, GPU training, parallelism

Visualization Tools:

category scatter plots   (github/category-scatterplot)
matplotlib tutorial   (numeric python)
seaborn gallery   (
lots of plot types

Symbolic Computation (SymPy):

equation solvers   (numeric python)
square vs rectangular, eigenvalues, nonlinear equations, univariate equations
intro   (numeric python)
symbols, numbers, rationals, constants, functions, expressions, simplification, expansion, factor, collect, combine, apart, together, cancel, substitutions, evaluations, calculus, sums, products, equations, linear algebra


Numba & Cython:

intro to Numba   (
installation, will it work?, nopython, performance, under the hood, @decorators, groups
numba, numba.vectorize, cython, tips & tricks, cython & C

Various Utilities:

postgres tutorial   (postgresqltutorial)

Python Standard Library: (v3.8)

posix, pwd, spwd, grp, crypt, termios, tty, pty, fcntl, pipes, resource, nis, syslog
msilib, msvcrt, winreg, winsound
os, io, time, argparse, getopt, logging, getpass, curses, platform, errno, ctypes
struct, codecs
threading, multiprocessing, concurrent, subprocess, sched, queue, _thread, _dummy_thread
hashlib, hmac, secrets
datetime, calendar, collections, heapq, bisect, array, weakref, types, copy, pprint, reprlib, enum
boolean, comparisons, numerics, iterators, sequences, text sequences, binary sequences, sets, maps, context managers, more
bdb, faulthandler, pdb, profilers, timeit, trace, tracemalloc
martin heinz tutorial
typing, pydoc, doctest, unittest, 2to3, test
basics, concrete exceptions, warnings, hierarchy
zlib, gzip, bz2, lzma, zipfile, tarfile
csv, configparser, netrc, xdrlib, plistlib
pickle, copyreg, shelve, marshal, dbm, sqlite3
pathlib, os.path, fileinput, stat, filecmp, tempfile, glob, fnmatch, linecache, shutil
turtle, cmd, shlex
itertools, functools, operator
gettext, locale
webbrowser, cgi, cgitb, wsgiref, urllib, http, ftplib, poplib, imaplib, nntplib, smtplib, smtpd, telnetlib, uuid, socketserver, http.server, http.cookies, xmlrpc, ipaddress
parser, ast, symtable, symbol, token, keyword, tokenize, tabnanny, pyclbr, py_compile, compileall, dis, pickletools
html, xml
zipimport, pkgutil, modulefinder, runpy, importlib
audioop, aifc, sunau, wave, chunk, colorsys, imghdr, sndhdr, ossaudiodev
email, json, mailcap, mailbox, mimetypes, base64, binhex, binascii, quopri, uu
asyncio, socket, ssl, select, selectors, asyncore, asynchat, signal, mmap
numbers, math, cmath, decimal, fractions, random, statistics
distutils, ensurepip, venv, zipapp
sys, sysconfig, builtins, __main__, warnings, dataclasses, contextlib, abc, atexit, traceback, __future__, gc, inspect, site
optparse, imp
tkinter, more...