1. decision making
2. applications / use cases
3. methods
4. history
5. societal impact
6. overview
1. degrees of belief
2. probability distributions
1. bayesian networks
2. naive bayes
3. sum-product variable elimination
4. belief propagation
5. computational complexity
6. direct sampling
7. likelihood weighted sampling
8. gibbs sampling
9. gaussian
10. summary
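As an illustrative sketch of direct sampling (item 6 above): draw each node in topological order from its conditional distribution, then estimate a posterior by counting samples consistent with the evidence. The two-node rain/wet-grass network and its probabilities here are hypothetical, chosen only to make the idea concrete.

```python
import random

# Hypothetical two-node Bayesian network: Rain -> WetGrass.
# P(rain) = 0.2; P(wet | rain) = 0.9; P(wet | no rain) = 0.1.
def sample_joint(rng):
    rain = rng.random() < 0.2
    wet = rng.random() < (0.9 if rain else 0.1)
    return rain, wet

def direct_sample_estimate(n=100_000, seed=0):
    """Estimate P(rain | wet) by direct (ancestral) sampling:
    sample the full joint repeatedly, keep samples matching the
    evidence (wet), and count how often the query (rain) holds."""
    rng = random.Random(seed)
    wet_count = rain_and_wet = 0
    for _ in range(n):
        rain, wet = sample_joint(rng)
        if wet:
            wet_count += 1
            rain_and_wet += rain
    return rain_and_wet / wet_count

estimate = direct_sample_estimate()
# Exact answer by Bayes' rule: 0.2*0.9 / (0.2*0.9 + 0.8*0.1) = 0.18/0.26
```

Note that samples inconsistent with the evidence are simply discarded, which is why likelihood weighted sampling (item 7) exists: it weights every sample instead of throwing most away when the evidence is rare.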
1. max likelihood (categorical, gaussian, bayes)
2. bayes
3. non-parametric
4. missing data
- imputation
- expectation-maximization (EM)
1. bayesian network scoring
2. directed graph search
3. markov equivalence classes
4. partially-directed graph search
1. constraints on rational preferences
2. utility functions
3. utility elicitation
4. max expected utility
5. decision networks
6. value of information
7. irrationality
1. markov decision processes (MDPs)
2. policy evaluation
3. value function policies
4. policy iteration
5. value iteration
6. value iteration (async)
7. linear program formulation
8. linear systems with quadratic rewards
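A minimal sketch of value iteration (item 5 above) on a tabular MDP: repeatedly apply the Bellman backup until the value function stops changing. The 2-state, 2-action MDP here is a made-up toy, not an example from the text.

```python
import numpy as np

# Toy MDP: T[s, a, s2] = transition probability, R[s, a] = reward.
T = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transitions from state 0
    [[0.0, 1.0], [0.5, 0.5]],   # transitions from state 1
])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9                      # discount factor

def value_iteration(T, R, gamma, tol=1e-8):
    """Iterate U <- max_a [R(s,a) + gamma * sum_s2 T(s,a,s2) U(s2)]."""
    U = np.zeros(T.shape[0])
    while True:
        Q = R + gamma * np.einsum("sat,t->sa", T, U)  # action-value backup
        U_new = Q.max(axis=1)
        if np.max(np.abs(U_new - U)) < tol:
            return U_new, Q.argmax(axis=1)            # values, greedy policy
        U = U_new

U, policy = value_iteration(T, R, gamma)
```

Because the Bellman backup is a contraction (by the factor gamma), the loop is guaranteed to converge to the unique optimal value function; the asynchronous variant (item 6) updates states one at a time instead of sweeping them all.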
1. parametric representations
2. nearest neighbors
3. kernel smoothing
4. linear interpolation
5. simplex interpolation
6. linear regression
7. neural net regression
1. receding-horizon plans
2. lookahead with rollouts
3. forward search
4. branch & bound
5. sparse sampling
6. monte carlo tree search
7. heuristic search
8. labeled heuristic search
9. open-loop planning
1. approx policy evaluation
2. local search
3. genetic algorithms
4. cross entropy methods
5. evolution strategies
6. evolution strategies (isotropic)
1. finite difference methods
2. regression gradients
3. likelihood ratios
4. reward-to-go
5. baseline subtraction
1. gradient ascent updates
2. gradient ascent updates (restricted)
3. gradient ascent updates (natural)
4. trust region updates
5. clamped surrogate objective
1. intro
2. generalized advantage estimation
3. deterministic policy gradient
4. actor-critic with monte carlo tree search
1. performance evaluation
2. rare-event simulation
3. robustness
4. tradeoff analysis
5. adversarial analysis
1. bandit problems
2. bayesian model estimation
3. undirected exploration
4. directed exploration
5. optimal exploration
6. multiple-state exploration
1. max likelihood
2. update schemes
3. bayesian methods
4. bayes-adaptive MDPs
5. posterior sampling
1. incremental mean estimates
2. Q-learning
3. sarsa
4. eligibility traces
5. reward shaping
6. action value function approximation
7. experience replay
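A sketch of Q-learning (item 2 above) combining the incremental mean estimate idea (item 1) with the TD update Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The 1-D corridor environment is hypothetical, invented just to have something to learn on.

```python
import random

# Hypothetical 1-D corridor: states 0..4, actions 0 = left, 1 = right.
# Reaching state 4 gives reward 1 and ends the episode.
N_STATES, GOAL = 5, 4

def step(s, a):
    s2 = min(GOAL, max(0, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:                # epsilon-greedy: explore
                a = rng.randrange(2)
            else:                                 # exploit (ties go right)
                a = 1 if Q[s][1] >= Q[s][0] else 0
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])  # incremental TD update
            s = s2
    return Q

Q = q_learning()
```

Sarsa (item 3) differs only in the target: it uses the action actually taken next rather than the greedy max, making it on-policy; experience replay (item 7) would store the (s, a, r, s2) tuples and resample them for extra updates.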
1. behavioral cloning
2. dataset aggregation
3. stochastic mixing iterative learning
4. max margin inverse reinforcement learning
5. max entropy inverse reinforcement learning
6. generative adversarial imitation learning
1. initialization
2. discrete state filter
3. linear gaussian filter
4. extended kalman filter
5. unscented kalman filter
6. particle filter
7. particle injection
8. summary
1. belief-state MDPs
2. conditional plans
3. alpha vectors
4. pruning
5. value iteration
6. linear policies
1. fully-observable value approximation
2. fast informed bound
3. fast lower bounds
4. point-based value iteration
5. point-based value iteration (randomized)
6. upper bound (sawtooth)
7. point selection
8. sawtooth heuristic search
9. triangulated value functions
1. lookahead with rollouts
2. forward search
3. branch & bound
4. sparse sampling
5. monte carlo tree search
6. determinized sparse tree search
7. gap heuristic search
1. controllers
2. policy iteration
3. nonlinear programming
4. gradient ascent
1. simple games
2. response models
3. nash equilibrium
4. correlated equilibrium
5. iterated best response
6. hierarchical softmax
7. fictitious play
1. markov game (MG)
2. response models
3. nash equilibrium
4. opponent modeling
5. nash q-learning
1. partially observable markov game (POMG)
2. policy evaluation
3. nash equilibrium
4. dynamic programming
1. decentralized partially observable markov decision processes (Dec-POMDPs)
2. subclasses
3. dynamic programming
4. iterated best response
5. heuristic search
6. nonlinear programming
1. measure spaces
2. probability spaces
3. metric spaces
4. normed vector spaces
5. positive definiteness
6. convexity
7. information content
8. entropy
9. cross entropy
10. relative entropy
11. gradient ascent
12. taylor expansions
13. monte carlo estimation
14. importance sampling
15. contraction mappings
1. uniform
2. gaussian (univariate)
3. beta
4. gaussian (multivariate)
5. dirichlet
1. asymptotic notation
2. P, NP, NP-hard, NP-complete (time complexity)
3. space complexity
4. decidability (e.g., the halting problem)
1. neural nets (NNs)
2. feedforward nets
3. regularization
4. convolutional NNs
5. recurrent NNs
6. autoencoder nets
7. adversarial nets
1. search problems
2. graphs
3. forward search
4. branch & bound
5. dynamic programming
6. heuristic search
1. test world
2. 2048 (a tile game)
3. cart-pole (pole balancing)
4. mountain car
5. linear quadratic regulator
6. aircraft collision avoidance
7. crying baby
8. machine replacement
9. catch
10. prisoner's dilemma
11. rock-paper-scissors
12. traveler's dilemma
13. predator-prey
14. crying baby (multi-caregiver)
15. predator-prey (collaborative)
1. data types
2. functions
3. control flow
4. packages