Statistical Inference Bayes Impact 2014
arranged by Daniel Korenblum
The Inference Problem: estimation of an unknown quantity
Inference/Estimation Subject Areas (Problem -> Solution / Method -> Algorithm / Statistic):

Point Estimation
- Maximum Likelihood (ML): Gradient Descent
- Minimum-Variance Unbiased (MVU) Estimator: Least Squares
- Maximum a Posteriori (MAP/GMLE): Gradient Descent
- Posterior Mean (PM): Markov-chain Monte Carlo (MCMC)

Error Bars / Confidence (Estimator Error)
- Confidence Interval / Region: Covariance/Information, Resampling
- Credible Interval / Region: Evidentiary Credible Region (2014)

Classification and Clustering (Pattern Recognition)
- Unsupervised Learning: Cluster Analysis
- (Semi-)Supervised Learning: Discriminant, Generative, SVM, kNN, trees
- Feature Selection: Ranking, Filtering, Greedy, Sparse

Model Selection
- Hypothesis Testing: Significance Tests (Holy Trinity)
- Model Evidence: Marginal Likelihood
- Frequentist inference as an optimization problem: maximize the likelihood over all observations
- Bayesian inference as distribution estimation: the posterior distribution estimate is "the inference"
- Decision theory can be used to derive estimates from posteriors by minimizing decision risk/loss
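These three viewpoints can be illustrated on a single coin-flip dataset. This is an assumed Bernoulli/Beta example, not from the slides: the MLE is the point maximizer of the likelihood, the Bayesian answer is a full posterior, and decision theory recovers a point estimate (here the posterior mean, which minimizes expected squared-error loss).

```python
# Assumed example: 7 heads in 10 flips of a coin with unknown bias theta.
import numpy as np
from scipy.stats import beta

k, n = 7, 10

# Frequentist: maximize the likelihood L(theta) = theta^k (1-theta)^(n-k)
theta_mle = k / n                         # closed-form maximizer

# Bayesian: with a Beta(1, 1) prior the posterior is Beta(1 + k, 1 + n - k);
# the posterior distribution itself is "the inference"
posterior = beta(1 + k, 1 + n - k)

# Decision theory: under squared-error loss, the estimate minimizing the
# posterior expected loss is the posterior mean (checked here on a grid)
grid = np.linspace(0.001, 0.999, 999)
pdf = posterior.pdf(grid)
expected_loss = [np.sum((grid - t) ** 2 * pdf) * 0.001 for t in grid]
theta_bayes = grid[int(np.argmin(expected_loss))]

print(theta_mle, posterior.mean(), theta_bayes)
```

The grid search confirms numerically that the risk-minimizing point estimate coincides with the posterior mean.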
Scope and Outline

Topics covered
1. Likelihood models and model comparison
2. Frequentist and Bayesian approaches
   2.1. Frequentist Inference
      2.1.1. Analytic: set the derivative of the sample log-likelihood equal to zero and solve
      2.1.2. Numerical: use local or global optimization algorithms (e.g. steepest descent)
   2.2. Bayesian Inference
      2.2.1. Choose a prior distribution
      2.2.2. The product of the likelihood and the prior yields the unnormalized posterior distribution
      2.2.3. Select an objective / risk / loss and minimize its expected value over the posterior
3. Statistics and algorithms
   3.1. Regression: using the noise distribution to choose an appropriate objective / risk / loss
   3.2. Estimator error: the bias-variance trade-off; a small bias can reduce variance and MSE
   3.3. Classification: choosing between generative, discriminative, and discriminant approaches
Topics not covered
1. Stochastic process models / methods (e.g. Markov models)
2. Time series analysis / 1-D signal processing; multidimensional signal processing
3. Black/gray box models (e.g. artificial neural networks, decision trees, ensembles)
4. Information-theoretic approaches (maximum entropy, mutual information, K-L divergence)
5. Control theory, duality theory, convex analysis, global optimization
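Step 2.1.1 (analytic maximum likelihood) can be made concrete with the standard Gaussian-mean derivation, a worked example added here for concreteness:

```latex
% Analytic MLE for the mean of a Gaussian sample x_1,\dots,x_n (known \sigma^2):
\log L(\mu) = -\frac{n}{2}\log(2\pi\sigma^2)
              - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2
\qquad
\frac{d \log L}{d\mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu) = 0
\;\Rightarrow\;
\hat{\mu}_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}
```

Setting the derivative of the sample log-likelihood to zero recovers the sample mean.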
Introduction to Statistical Inference
Frequentist Inference: likelihood theory (Fisher ~1920)
Likelihood Theory

Likelihood functions are not probability density functions: the integral of a likelihood function is not, in general, equal to 1. The distinction is one of which argument varies: in the density p(x; θ) the parameter θ is fixed and the data x are variable, while in the likelihood L(θ; x) the data x are fixed and the parameter θ is variable.
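A quick numerical check, using an assumed Bernoulli example: the likelihood of k heads in n flips, viewed as a function of θ, integrates to B(k+1, n-k+1), not to 1.

```python
# Assumed example: likelihood L(theta) = theta^k (1 - theta)^(n - k)
# does not integrate to 1 over theta.
import numpy as np
from scipy.special import beta as beta_fn

k, n = 3, 10
theta = np.linspace(0.0, 1.0, 100001)
likelihood = theta ** k * (1.0 - theta) ** (n - k)

integral = np.sum(likelihood) * (theta[1] - theta[0])   # Riemann sum
exact = beta_fn(k + 1, n - k + 1)                       # = k!(n-k)!/(n+1)!

print(integral, exact)   # both ~7.58e-4, far from 1
```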
Frequentist Inference & Decision Theory
Frequentist Risk/Loss Function:
Frequentist Risk Example: Squared Error
Frequentist Decision Theoretic Objective
Bayesian Inference: posterior distribution & minimum risk/loss
Bayesian Conditional Distributions
Bayesian Update, Inverse Problems
Prior Function and Regularization Term
Bayesian Posterior Loss
When the prior is improper, an estimator that minimizes the posterior expected loss is called a generalized Bayes estimator.
Risk/Loss and Regularization Functions
Risk/Loss Functions and Derivatives
http://dl.acm.org.oca.ucsc.edu/citation.cfm?id=1281270
Point Estimation: maximum likelihood, least squares
Maximum Likelihood Estimation (MLE)
Linear Regression / Least Squares
Orthogonal Projections & Least Squares
http://en.wikipedia.org/wiki/Linear_least_squares_(mathematics)#Properties_of_the_least-
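The orthogonal-projection property can be verified directly: at the least-squares solution, the residual is orthogonal to the column space of the design matrix. A minimal numpy sketch with assumed synthetic data:

```python
# Least squares as orthogonal projection (assumed synthetic data).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                        # design matrix
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

# Solve the normal equations X^T X w = X^T y (via lstsq for stability)
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# The residual is orthogonal to the columns of X: X^T (y - X w) = 0
residual = y - X @ w
print(np.abs(X.T @ residual).max())                 # ~0 up to floating point
```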
Nonlinear Regression / Least Squares
Generalized Linear Models
Maximum Likelihood Noise Dependence
MLE Estimator: Gamma Distribution
MVU Estimator: Mean of Uniform Noise
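One classic setup for this slide (assumed here, since the exact example isn't shown): estimating the mean θ/2 of Uniform(0, θ) noise. Both the sample mean and the rescaled maximum (n+1)/(2n)·max(x) are unbiased, but the maximum-based estimator, the MVU estimator for this problem, has much smaller variance.

```python
# Assumed example: mean of Uniform(0, theta) noise, true mean = theta/2.
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 1.0, 10, 20000
x = rng.uniform(0.0, theta, size=(reps, n))

sample_mean = x.mean(axis=1)                 # unbiased, var = theta^2 / (12 n)
mvu = (n + 1) / (2 * n) * x.max(axis=1)      # unbiased, var = theta^2 / (4 n (n+2))

print(sample_mean.mean(), mvu.mean())        # both ~0.5
print(sample_mean.var(), mvu.var())          # the MVU variance is smaller
```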
Posterior Mean and Maximum a Posteriori
Median Posterior Density
Example: Changepoint Detection
Example: Changepoint Detection
PM Example: Bayesian Prediction
Error Bars / Uncertainty: Fisher information, confidence regions
Negative Log-Likelihood & Uncertainty
Likelihood Geometry and Contours
Score Function & Fisher Information
Fisher Information / Precision
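A numeric sanity check on an assumed Bernoulli example: the observed information is the negative second derivative of the log-likelihood, and its expectation is the Fisher information I(θ) = n / (θ(1-θ)) for n trials. With k = nθ the two coincide exactly.

```python
# Assumed Bernoulli example: observed vs. Fisher information.
import numpy as np

def loglik(theta, k, n):
    return k * np.log(theta) + (n - k) * np.log(1.0 - theta)

k, n, theta, h = 6, 20, 0.3, 1e-5          # k = n * theta here

# central finite-difference second derivative of the log-likelihood
d2 = (loglik(theta + h, k, n) - 2 * loglik(theta, k, n)
      + loglik(theta - h, k, n)) / h**2
observed_info = -d2
analytic = k / theta**2 + (n - k) / (1.0 - theta) ** 2   # = n/(theta(1-theta))

print(observed_info, analytic)
```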
Estimator Error
Proof of the Cramér-Rao lower bound: http://ens.ewi.tudelft.nl/Education/courses/et4386/Slides/01.estimation.pdf
Bayesian Mean Squared Error
Bayesian Minimum Mean Squared Error
Classification: cluster analysis, supervised learning
Bayesian Classification
Bayes Classifier Risk/Loss
Bayesian Classifier Decision Error
Bayesian Classifier Posterior Density
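The Bayes classifier picks the class with the largest posterior, p(c | x) ∝ π_c · p(x | c). A minimal sketch on an assumed two-class Gaussian example (equal priors and variances, so the decision boundary sits at x = 0):

```python
# Assumed two-class Gaussian example for the Bayes classifier.
import numpy as np
from scipy.stats import norm

priors = {0: 0.5, 1: 0.5}
likelihoods = {0: norm(loc=-1.0, scale=1.0), 1: norm(loc=+1.0, scale=1.0)}

def posterior(x):
    # p(c | x) = prior * likelihood, normalized over classes
    joint = {c: priors[c] * likelihoods[c].pdf(x) for c in priors}
    z = sum(joint.values())
    return {c: joint[c] / z for c in joint}

def classify(x):
    post = posterior(x)
    return max(post, key=post.get)

print(classify(-0.5), classify(0.5))   # -> 0 1
```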
Example: Support Vector Machine
Classifier Comparison Example
Feature Selection: ranking, filtering, greedy, sparse, hybrid
Introduction to Feature Selection
Feature Selection Approaches
Filtering / Subset Selection Algorithms
Exhaustive Search & Zero-norm Penalty
Basis Pursuit / LASSO / Elastic Net
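The LASSO objective can be minimized with ISTA (proximal gradient descent with soft-thresholding), one standard solver among several; a sketch on assumed synthetic data with a sparse ground truth:

```python
# Sketch: solve  min_w 0.5 * ||y - X w||^2 + lam * ||w||_1  by ISTA
# (assumed synthetic data; lam chosen by hand, not tuned).
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 8
X = rng.normal(size=(n, p))
w_true = np.array([3.0, -2.0, 0, 0, 0, 0, 0, 0])    # sparse ground truth
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 5.0
step = 1.0 / np.linalg.eigvalsh(X.T @ X).max()      # 1/L, L = Lipschitz constant

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

w = np.zeros(p)
for _ in range(2000):
    grad = X.T @ (X @ w - y)                        # gradient of the smooth part
    w = soft_threshold(w - step * grad, step * lam) # proximal step for the L1 part

print(np.round(w, 2))   # nonzero only in the first two coordinates
```

The soft-threshold step is what produces exact zeros, which is the "sparse" behavior the slide groups under basis pursuit / LASSO.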
Cluster Analysis: also known as unsupervised learning
Introduction to Cluster Analysis
Cluster Analysis Algorithm Categories

Hierarchical
- Crisp: agglomerative clustering
- Fuzzy: hierarchical unsupervised fuzzy clustering (Geva 1999)

Non-hierarchical
- Crisp: k-means
- Fuzzy: spectral clustering, fuzzy k-means
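The non-hierarchical crisp case can be sketched with Lloyd's k-means algorithm on assumed toy data (two well-separated blobs; initial centers taken deterministically, one from each half, to keep the sketch reproducible):

```python
# Lloyd's k-means on assumed toy data (numpy only).
import numpy as np

rng = np.random.default_rng(3)
data = np.vstack([rng.normal(-3.0, 0.5, size=(50, 2)),
                  rng.normal(+3.0, 0.5, size=(50, 2))])

k = 2
centers = data[[0, 50]].astype(float)   # one seed point from each blob
for _ in range(20):
    # assignment step: each point joins its nearest center
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # update step: each center moves to its cluster mean
    centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])

print(np.sort(centers[:, 0]))   # x-coordinates near -3 and +3
```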
Hierarchical Agglomerative Clustering
Clustering Algorithm Comparisons
Model Selection: cross-validation, LR, ICs, model evidence
Parsimony and Occam’s Razor
Cross-Validation
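A minimal k-fold cross-validation sketch on assumed data: the true signal is linear, so the held-out error should penalize an overly flexible polynomial.

```python
# 5-fold cross-validation for polynomial degree (assumed synthetic data).
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, size=60)
y = 2.0 * x + 1.0 + 0.3 * rng.normal(size=60)   # truly linear signal

def cv_mse(degree, folds=5):
    idx = np.arange(len(x))
    errors = []
    for f in range(folds):
        test = idx % folds == f                 # held-out fold
        coeffs = np.polyfit(x[~test], y[~test], degree)
        errors.append(np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2))
    return float(np.mean(errors))

scores = {d: cv_mse(d) for d in (1, 3, 9)}
print(scores)   # the linear model should generalize best here
```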
Likelihood Ratio Test for Nested Models
Akaike & Bayesian Information Criteria
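Both criteria penalize the maximized log-likelihood by the parameter count: AIC = 2k - 2 ln L̂ and BIC = k ln n - 2 ln L̂. A sketch on an assumed Gaussian example comparing a fixed N(0, 1) model against one with a fitted mean and standard deviation:

```python
# Assumed Gaussian example for AIC/BIC (lower scores are better).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, size=200)       # data really are N(0, 1)
n = len(x)

def gauss_loglik(x, mu, sigma):
    return float(np.sum(norm.logpdf(x, mu, sigma)))

# model A: fixed N(0, 1), zero fitted parameters
llA, kA = gauss_loglik(x, 0.0, 1.0), 0
# model B: MLE-fitted mean and sd, two parameters
llB, kB = gauss_loglik(x, x.mean(), x.std()), 2

aic = {"A": 2 * kA - 2 * llA, "B": 2 * kB - 2 * llB}
bic = {"A": kA * np.log(n) - 2 * llA, "B": kB * np.log(n) - 2 * llB}
print(aic, bic)
```

Since the data truly come from the simpler model, the penalty terms usually make both criteria prefer model A, and BIC's ln n penalty is the harsher of the two.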
Deviance Information Criterion
Model Selection Discussion
Bayesian Model Selection
Bayes Factors & Bias-Variance Tradeoffs
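The model evidence integrates the likelihood over the prior, and a Bayes factor is the ratio of two evidences. A worked sketch on an assumed beta-binomial example, where the marginal likelihood has the closed form m(k) = C(n, k) · B(a+k, b+n-k) / B(a, b):

```python
# Assumed beta-binomial example: evidence and Bayes factor.
import numpy as np
from scipy.special import betaln, comb

def log_evidence(k, n, a, b):
    # log of C(n, k) * B(a + k, b + n - k) / B(a, b)
    return np.log(comb(n, k)) + betaln(a + k, b + n - k) - betaln(a, b)

k, n = 8, 10
m_fair = n * np.log(0.5) + np.log(comb(n, k))   # point model: theta = 0.5
m_beta = log_evidence(k, n, 1.0, 1.0)           # uniform Beta(1, 1) prior

bayes_factor = np.exp(m_beta - m_fair)
print(bayes_factor)   # ~2.07: 8/10 heads mildly favor the flexible model
```

Under the uniform prior the evidence works out to 1/(n+1) for any k, a small sanity check on the formula.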
Bayesian Model Selection Example
Philosophy: interpretations, debates, and paradoxes
Bertrand Paradox
Bertrand Paradox: Jaynes’ Solution
Bertrand Paradox: Disambiguation
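The paradox is easy to reproduce by simulation: the probability that a random chord of the unit circle is longer than √3 (the side of the inscribed equilateral triangle) depends on how "random chord" is sampled, which is exactly the ambiguity Jaynes' argument resolves.

```python
# Monte Carlo sketch of the Bertrand paradox (three chord-sampling rules).
import numpy as np

rng = np.random.default_rng(6)
N = 200000
L = np.sqrt(3.0)   # side length of the inscribed equilateral triangle

# 1) random endpoints: chord length 2 sin(delta/2)  ->  P = 1/3
delta = np.abs(rng.uniform(0, 2 * np.pi, N) - rng.uniform(0, 2 * np.pi, N))
p1 = np.mean(2 * np.sin(delta / 2) > L)

# 2) random radius: chord midpoint uniform along a radius  ->  P = 1/2
d = rng.uniform(0, 1, N)
p2 = np.mean(2 * np.sqrt(1 - d ** 2) > L)

# 3) chord midpoint uniform over the disk  ->  P = 1/4
r = np.sqrt(rng.uniform(0, 1, N))   # radial CDF for uniform area
p3 = np.mean(2 * np.sqrt(1 - r ** 2) > L)

print(p1, p2, p3)   # ~0.333, ~0.500, ~0.250
```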
References

Lecture Notes
- Figueiredo (2004), Lecture Notes on Bayesian Estimation and Classification
- Martinez et al., Estimation and Detection

Books
- Bishop (2006), Pattern Recognition and Machine Learning
- Hastie et al. (2009), The Elements of Statistical Learning
- MacKay (2012), Information Theory, Inference, and Learning Algorithms

Wiki
- http://wikipedia.org