Odds vs. Probabilities

In my last post I talked about the shrinking demand for lawyers and how bitter JDs are filing class action lawsuits alleging that their alma maters manipulated the employment statistics of their graduates.  It seems like an appropriate time to clarify two commonly misinterpreted statistical metrics that deal with the likelihood of an event: probability and odds.

Probability is a concept that assigns a number between 0 and 1 (inclusive) to the likelihood of an event occurring.  Since any such number can be written as a decimal or fraction, probability can be read as a percentage, which makes it an easy measure to interpret.  Probability can be rigorously defined using axioms, or facts that are accepted as a basis for reasoning and analysis; the Russian mathematician Andrey Kolmogorov was the first to do so.  The Probability Axioms, sometimes called Kolmogorov’s Axioms, require some familiarity with set theory to understand, but the basic ideas are summarized below:

The Sample Space of an experiment is the collection of all possible outcomes that could take place as a result of the experiment.  If we’re interested in the number of dots facing up when a die is rolled, then the sample space consists of one, two, three, four, five, and six.

In math a set is just a collection of objects (sometimes called elements).  There are multiple ways to represent a set, but the simplest is to list the objects in the set in curly brackets {}.  Using this notation, we can represent the set of all positive integers as {1, 2, 3, …} where the ellipsis shows that the set continues indefinitely because there are infinitely many positive integers.  A set X is a subset of set Y if every member of X is also a member of Y.  For example, X = {1, 2} is a subset of Y = {1, 2, 3} because every object in X is also in Y; the extra element in Y is irrelevant.  We typically denote this relationship X ⊆ Y.  Set-Builder Notation is typically used for more complicated sets that are defined by more than a simple phrase, but it’s not necessary here.
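The subset example above can be tried out directly.  Here is a minimal sketch using Python's built-in set type; the variable names `x_in_y` and `y_in_x` are just labels chosen for this illustration.

```python
# The two sets from the example, written as Python sets.
X = {1, 2}
Y = {1, 2, 3}

# X is a subset of Y because every member of X is also a member of Y.
x_in_y = X.issubset(Y)   # equivalent to X <= Y

# The reverse fails: 3 is in Y but not in X.
y_in_x = Y.issubset(X)

print(x_in_y)
print(y_in_x)
```

The first line printed is True and the second is False, which matches the definition: the extra element in Y does not stop X from being a subset, but it does stop Y from being a subset of X.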

So we know that the sample space is all possible outcomes of an experiment, and a mathematical set is a collection of objects.  It follows that we can represent the sample space of an experiment as the mathematical set of possible outcomes.  The possible outcomes of the experiment are the objects or elements of the set.  Mathematically, we can write S = {E1, E2, … , En} where E1, E2, …, En are all possible outcomes of the experiment.  An event is a subset of the sample space, i.e. any collection of possible outcomes; each single outcome, taken on its own, is the simplest kind of event.

Now we can actually talk about Kolmogorov’s axioms.  Let P(E) be the probability of event E.

  1. For any event E, 0 ≤ P(E) ≤ 1; i.e. the probability of any event is between 0 and 1 inclusive.
  2. P(S) = 1; i.e. the sample space includes all possible outcomes of the experiment, so the probability that the outcome is in the sample space is 100%.
  3. If events E1 and E2 are mutually exclusive (there is a way to write this using set notation, but I’m not going to go into it because it opens a huge can of worms with Venn diagrams etc.) then the probability of either event occurring is additive.  Mathematically, P(E1 or E2) = P(E1) + P(E2).
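For a concrete check of the three axioms, here is a rough sketch using the die from earlier.  It assumes a fair six-sided die, so every outcome is equally likely; the helper `prob` and the events `evens` and `one` are made up for this illustration and are not part of the axioms themselves.

```python
from fractions import Fraction

# Sample space for one roll of a six-sided die (assumed fair).
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    return Fraction(len(event & S), len(S))

evens = {2, 4, 6}   # event: an even number comes up
one = {1}           # event: a one comes up

# Axiom 1: every probability lies between 0 and 1 inclusive.
assert 0 <= prob(evens) <= 1

# Axiom 2: some outcome in the sample space is certain to occur.
assert prob(S) == 1

# Axiom 3: the two events share no outcomes (mutually exclusive),
# so the probability of one OR the other is the sum of the two.
assert evens.isdisjoint(one)
assert prob(evens | one) == prob(evens) + prob(one)

print(prob(evens), prob(one), prob(evens | one))
```

Note that the union `evens | one` plays the role of "E1 or E2".  The two events can never both happen on the same roll, which is exactly what mutually exclusive means.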

Now, when every outcome in the sample space is equally likely, the probability of an event is the number of outcomes in which the event happens divided by the total number of possible outcomes (i.e. the size of the sample space).  Simply put, P(x) = (Chances of Event) / (Total Chances).  So the probability of randomly drawing an ace from a deck of cards is 4/52 = 1/13, or just less than 8%.  There are 4 aces in a standard deck of cards, and 52 is the total number of cards.
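The ace calculation is easy to reproduce.  This is a small sketch using Python's `fractions` module so the result stays an exact fraction; the variable names are only for illustration.

```python
from fractions import Fraction

aces = 4           # chances of the event: aces in a standard deck
total_cards = 52   # total chances: cards in the deck

# Fraction reduces 4/52 to lowest terms automatically.
p_ace = Fraction(aces, total_cards)

print(p_ace)                    # 1/13
print(f"{float(p_ace):.2%}")    # the same probability as a percentage
```

The percentage printed comes out a little under 8%, in line with the figure above.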

On the other hand, the odds of an event occurring are represented differently.  We still start with the same thing, the chances for the event occurring.  But instead of comparing it to the total chances, we compare it to the chances that the event doesn’t occur.  Odds are expressed as a ratio.  1:1 should be read as “one to one,” but it is commonly read as “one out of one,” which is incorrect because “out of” suggests we’re comparing the number 1 to the total number of outcomes, when we’re really comparing it to the chances that the event doesn’t occur.

The total chances are equal to [chances for] + [chances against]; in our cards example, 52 = 4 + 48.  Rewriting this, we see that [chances against] = [total chances] – [chances for].  Since the odds of an event are written as [chances for]:[chances against], and we just showed that [chances against] = [total chances] – [chances for], we can write the odds of drawing an ace from a deck of cards as [4 chances to draw an ace] : [52 total cards – 4 aces].  Simplifying, 4:(52 – 4) = 4:48 = 1:12.
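The same arithmetic can be sketched in a few lines of Python.  The function name `odds` and its arguments are hypothetical, chosen only to mirror the bracketed terms above.

```python
from math import gcd

def odds(chances_for, total_chances):
    """Return the odds as a reduced (for, against) pair."""
    # [chances against] = [total chances] - [chances for]
    chances_against = total_chances - chances_for
    # Divide both sides of the ratio by their greatest common divisor.
    g = gcd(chances_for, chances_against)
    return chances_for // g, chances_against // g

ace_odds = odds(4, 52)
print(f"{ace_odds[0]}:{ace_odds[1]}")   # 1:12
```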

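Here is the discrepancy: if we write the probability of drawing an ace as a ratio, we have 1 out of 13 (1:13 if it is written like odds); however, the odds of drawing an ace are 1:12.  The two differ because probability compares the 4 aces to all 52 cards, while odds compare them to the 48 cards that are not aces.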
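Because both measures are built from the same counts, one can be converted into the other.  The sketch below shows one way to do that; the helper names `prob_to_odds` and `odds_to_prob` are made up for this illustration, and the first one assumes the probability is strictly less than 1 (otherwise there are no chances against and the ratio is undefined).

```python
from fractions import Fraction

def prob_to_odds(p):
    """Convert a probability p (with p < 1) to odds as a (for, against) pair."""
    p = Fraction(p)
    ratio = p / (1 - p)          # chances for, relative to chances against
    return ratio.numerator, ratio.denominator

def odds_to_prob(chances_for, chances_against):
    """Convert odds for:against back to a probability."""
    return Fraction(chances_for, chances_for + chances_against)

print(prob_to_odds(Fraction(1, 13)))   # (1, 12)
print(odds_to_prob(1, 12))             # 1/13
```

Going from odds back to probability just adds the chances against back into the denominator, which is why 1:12 turns back into 1/13.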

About schapshow

Math & Statistics graduate who likes gymnastics, 90s alternative music, and statistical modeling.
