Mathematical Statistics Lesson of the Day – Sufficient Statistics

The Chemical Statistician

Suppose that you collected data

$latex \mathbf{X} = X_1, X_2, \ldots, X_n$

in order to estimate a parameter $latex \theta$.  Let $latex f_\theta(x)$ be the joint probability density function (PDF)* for $latex X_1, X_2, \ldots, X_n$.


Let

$latex t = T(\mathbf{X})$

be a statistic based on $latex \mathbf{X}$.  Let $latex g_\theta(t)$ be the PDF for $latex T(\mathbf{X})$.

If the conditional PDF

$latex h_\theta(\mathbf{X}) = f_\theta(x) \div g_\theta[T(\mathbf{X})]$

is independent of $latex \theta$, then $latex T(\mathbf{X})$ is a sufficient statistic for $latex \theta$.  In other words,

$latex h_\theta(\mathbf{X}) = h(\mathbf{X})$,

and $latex \theta$ does not appear in $latex h(\mathbf{X})$.

Intuitively, this means that $latex T(\mathbf{X})$ contains everything you need to estimate $latex \theta$, so knowing $latex T(\mathbf{X})$ (i.e., conditioning $latex f_\theta(x)$ on $latex T(\mathbf{X})$) is sufficient for estimating $latex \theta$.
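
As a worked illustration (an assumed example, not taken from the original post), suppose that $latex X_1, X_2, \ldots, X_n$ are independent Bernoulli random variables with success probability $latex \theta$, and take $latex T(\mathbf{X}) = X_1 + X_2 + \ldots + X_n$, so that $latex t$ is the number of successes. The data are discrete here, so probability mass functions play the role of the PDFs. The joint probability of the observed sequence is

$latex f_\theta(x) = \theta^{t} (1 - \theta)^{n - t}$,

and, since $latex T(\mathbf{X})$ follows a binomial distribution,

$latex g_\theta(t) = \binom{n}{t} \theta^{t} (1 - \theta)^{n - t}$.

Dividing the first by the second gives

$latex h_\theta(\mathbf{X}) = f_\theta(x) \div g_\theta(t) = 1 \div \binom{n}{t}$.

The factors involving $latex \theta$ cancel, so under these assumptions the number of successes is a sufficient statistic for $latex \theta$.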

Often, $latex T(\mathbf{X})$ is a summary statistic of $latex X_1, X_2, \ldots, X_n$, such as their

  • sample mean
  • sample median
  • sample minimum
  • sample maximum
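
To see the definition at work for one of these summaries, here is a minimal numerical sketch in Python. It assumes independent Poisson data with rate $latex \theta$, which is an illustrative choice and not something stated in the post. For a fixed sample size, the sample mean is a one-to-one function of the sample sum, so the sketch conditions on the sum. The function names `conditional_ratio` and `ratio_given_first_only`, the sample, and the candidate rates are all hypothetical.

```python
import math


def poisson_pmf(k, rate):
    """P(X = k) for X ~ Poisson(rate)."""
    return math.exp(-rate) * rate**k / math.factorial(k)


def conditional_ratio(sample, theta):
    """f_theta(x) / g_theta(T(x)) with T(x) = sum(x), assuming iid Poisson(theta) data."""
    n = len(sample)
    t = sum(sample)
    joint = math.prod(poisson_pmf(x, theta) for x in sample)
    # The sum of n iid Poisson(theta) variables is Poisson(n * theta).
    pmf_of_t = poisson_pmf(t, n * theta)
    return joint / pmf_of_t


def ratio_given_first_only(sample, theta):
    """Same ratio, but with T(x) = x_1 alone, which discards the other observations."""
    joint = math.prod(poisson_pmf(x, theta) for x in sample)
    return joint / poisson_pmf(sample[0], theta)


sample = [2, 0, 3, 1]      # made-up observations
thetas = [0.5, 1.0, 2.5]   # made-up candidate values of theta

ratios_sum = [conditional_ratio(sample, th) for th in thetas]
ratios_first = [ratio_given_first_only(sample, th) for th in thetas]

print(ratios_sum)    # one ratio per theta, conditioning on the sum
print(ratios_first)  # one ratio per theta, conditioning on x_1 only
```

The three values in the first list agree up to floating-point rounding, because $latex \theta$ cancels when the joint probability is divided by the probability of the sum. The values in the second list change with $latex \theta$, so the first observation alone is not sufficient in this setup.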

If such a summary…

