Uncertainty Wednesday: Random Variables

A quick reminder of where we are in Uncertainty Wednesdays: I introduced the idea of measuring uncertainty, then we defined what a probability distribution is and learned about entropy, a measure of uncertainty that is based solely on the probabilities of different states. We examined entropy for a simple distribution and learned about the relationship of entropy to communication.

Now consider again our super simple world with two states A and B. Suppose that P(A) = 0.99 and P(B) = 0.01. We will keep this fixed, meaning we will not change the entropy of the probability distribution. Furthermore, you know from our analysis that the entropy of this distribution is quite low as the states have very unequal probabilities.
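
As a quick check on that claim, here is a minimal Python sketch that computes the entropy of this two-state distribution. The function name `entropy_bits` is my own choice for illustration, and I am assuming base-2 logarithms, so the result is in bits.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits; states with probability 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

skewed = entropy_bits([0.99, 0.01])   # our world: P(A) = 0.99, P(B) = 0.01
uniform = entropy_bits([0.5, 0.5])    # for comparison: two equally likely states

print(f"skewed:  {skewed:.4f} bits")
print(f"uniform: {uniform:.4f} bits")
```

The skewed distribution comes out at roughly 0.08 bits, against 1 bit for the 50/50 case.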

Suppose that these states represent the success or failure of an investment and you are faced with the following payouts:

Investment 1: A -$1, B $99
Investment 2: A -$100, B $9,900
Investment 3: A -$10,000, B $990,000    

The first thing to notice is that all three investments have the same 100x return in state B. Wait, why 100x and not 99x? Because I have given you the net payouts. So in Investment 1 you put up $1 and in state A you get back $0 (meaning you have lost $1, hence -$1), whereas in state B you get back $100 (which means you now have 99 new dollars).
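
The net versus gross arithmetic can be sketched in a few lines of Python. Note one assumption: only the $1 stake for Investment 1 is stated above, so the stakes of $100 and $10,000 for the other two are inferred from the losses in state A. The variable names are just for illustration.

```python
# (stake, net payout in state A, net payout in state B) for each investment.
# Stakes for Investments 2 and 3 are inferred from the state A losses.
investments = {
    "Investment 1": (1, -1, 99),
    "Investment 2": (100, -100, 9_900),
    "Investment 3": (10_000, -10_000, 990_000),
}

multiples = {}
for name, (stake, net_a, net_b) in investments.items():
    gross_a = stake + net_a   # what you get back in state A
    gross_b = stake + net_b   # what you get back in state B
    multiples[name] = gross_b / stake
    print(f"{name}: back ${gross_a:,} in A, ${gross_b:,} in B, {multiples[name]:.0f}x")
```

Each investment returns 100 times the stake in state B and nothing in state A.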

Intuitively there appears to be a big difference in uncertainty between these three investments, despite the fact that they have the same returns and the same entropy. To start to measure this difference, we need to introduce a new concept, that of a random variable.

A random variable X is simply a variable that takes on different values in different states of the world, with a defined probability distribution across those states. So for Investment 1:

X = -1 with probability 0.99 (state A occurs)
X = 99 with probability 0.01 (state B occurs)

Often we will write this in shorthand as P(-1) = 0.99 and P(99) = 0.01 (in an upcoming post I will talk about why this shorthand obscures something important).
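
One way to make the definition concrete is to keep the two ingredients separate in code: the probabilities of the states, and the value X takes in each state. This is only a sketch, and the dictionary names are my own. The shorthand probabilities over values are then derived from the two.

```python
from collections import defaultdict

# Probability distribution over the states of the world
state_probs = {"A": 0.99, "B": 0.01}

# The random variable X for Investment 1: one value per state
x = {"A": -1, "B": 99}

# Distribution over values: add up the probability of every state
# that maps to a given value
value_probs = defaultdict(float)
for state, p in state_probs.items():
    value_probs[x[state]] += p

print(dict(value_probs))  # {-1: 0.99, 99: 0.01}
```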

We can now define measures such as the mean (or expected value), the variance and more to summarize the behavior of the random variable. If you already know what the expected value is, you can quickly convince yourself that it is the same for each of the investments above (and what is it?).
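
If you would rather check by running code, here is a minimal sketch. It assumes the usual textbook definitions, which I have not formally introduced yet: the mean is the probability-weighted sum of the values, and the variance is the probability-weighted sum of squared deviations from the mean. The function names are mine. I will leave the output for you to run, since it answers the question above.

```python
state_probs = {"A": 0.99, "B": 0.01}

# Net payouts of each investment in each state
investments = {
    "Investment 1": {"A": -1, "B": 99},
    "Investment 2": {"A": -100, "B": 9_900},
    "Investment 3": {"A": -10_000, "B": 990_000},
}

def mean(x, probs):
    """Probability-weighted sum of the values of x."""
    return sum(probs[s] * x[s] for s in probs)

def variance(x, probs):
    """Probability-weighted sum of squared deviations from the mean."""
    m = mean(x, probs)
    return sum(probs[s] * (x[s] - m) ** 2 for s in probs)

summary = {
    name: (mean(x, state_probs), variance(x, state_probs))
    for name, x in investments.items()
}

for name, (m, v) in summary.items():
    print(f"{name}: mean = {m:.2f}, variance = {v:,.2f}")
```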