Sunday, January 10, 2016

Powerball Madness

The Powerball jackpot is heading towards 1 billion dollars. I was watching a program on TV last night, and a guest claimed that you can increase your odds of winning the lottery by not choosing numbers that have already come up. Let's put this theory to the test. Historical Powerball results can be found here:

http://www.powerball.com/powerball/pb_nbr_history.asp

I downloaded the results to a flat text file. Unfortunately, numerical analysis of all 1900 drawings is not easily done: the rules, including the number of balls used for the drawing, have changed quite frequently.

https://en.wikipedia.org/wiki/Powerball

If you wanted to use the full sample for analysis, you would have to go through all these rule changes and weight the balls accordingly throughout the history of the game, which is something I am not going to do on a Sunday morning. The last rule change was October 7th of 2015. This gives me a meager sample of 28 drawings. Nonetheless, I generated the numbers based on this set of drawings. I whacked together a C++ program to count the number of times any given ball was drawn.

// powerball.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <cstdio>   // printf, sscanf_s
#include <cstdlib>  // _MAX_PATH
#include <cstring>  // strlen
#include <fstream>
#include <map>


int _tmain(int argc, _TCHAR* argv[])
{
   std::ifstream lStream( _T("c:\\temp\\powerball2.txt"), std::ios::in );
   // Check for successful open.
   if ( ! lStream.is_open () )
   {
      printf ("open failed\n");
      return 1;
   }
   // Occurrence counts keyed by ball number: base balls 1-69, Powerball 1-26.
   std::map<int, int> lFreq;
   std::map<int, int> lPowerFreq;
   for (int i = 1; i < 70; i++)
   {
      lFreq[i] = 0;
   }
   for (int i = 1; i < 27; i++)
   {
      lPowerFreq[i] = 0;
   }
   char lLineBuf[_MAX_PATH];
   // Read one line per pass; stop at end of file or on any read failure.
   while (lStream.getline (lLineBuf, sizeof(lLineBuf)))
   {
      if (strlen(lLineBuf))
      {
         int lBall1, lBall2, lBall3, lBall4, lBall5, lBall6;
         char lDateBuff[_MAX_PATH];
         // The header line fails this parse and is skipped; the trailing PP column is ignored.
         if (sscanf_s(lLineBuf, "%s %d %d %d %d %d %d", lDateBuff, _MAX_PATH, &lBall1, &lBall2, &lBall3, &lBall4, &lBall5, &lBall6) == 7)
         {
            lFreq[lBall1]++;
            lFreq[lBall2]++;
            lFreq[lBall3]++;
            lFreq[lBall4]++;
            lFreq[lBall5]++;
            lPowerFreq[lBall6]++;
         }
      }
   }
   for (int i = 1; i < 70; i++)
   {
      printf ("freq %d %d\n", i, lFreq[i]);
   }
   for (int i = 1; i < 27; i++)
   {
      printf ("powerfreq %d %d\n", i, lPowerFreq[i]);
   }
   return 0;
}
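
The program above leans on Visual C++ specifics (stdafx.h, _tmain, sscanf_s). As a rough sketch only, and not the code used for this post, the same tally could be written in standard C++ along these lines. The names Tally and tallyDraws are made up for illustration, and the sketch assumes the input layout of powerball2.txt (date, five base balls, Powerball, then an optional extra column).

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>

// Tally of how often each ball appears. Index 0 is unused so that
// the array index equals the ball number.
struct Tally
{
    std::array<int, 70> white{};   // base balls 1..69
    std::array<int, 27> power{};   // Powerball 1..26
    int drawings = 0;              // number of lines parsed as drawings
};

// Reads lines of the form "MM/DD/YYYY WB1 WB2 WB3 WB4 WB5 PB [PP]".
// Lines that do not parse (such as the header) or that contain
// out-of-range numbers are skipped.
Tally tallyDraws(std::istream& in)
{
    Tally t;
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string date;
        int wb[5];
        int pb;
        if (!(fields >> date >> wb[0] >> wb[1] >> wb[2] >> wb[3] >> wb[4] >> pb))
            continue;

        bool ok = pb >= 1 && pb <= 26;
        for (int b : wb)
            ok = ok && b >= 1 && b <= 69;
        if (!ok)
            continue;

        for (int b : wb)
            ++t.white[b];
        ++t.power[pb];
        ++t.drawings;
    }
    return t;
}
```

Passing a std::ifstream opened on powerball2.txt to tallyDraws should fill the same counts that the program above prints, with drawings holding the number of rows that parsed.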

Input for the program in powerball2.txt was as follows:

Draw Date   WB1 WB2 WB3 WB4 WB5 PB  PP
01/09/2016  32  16  19  57  34  13  3
01/06/2016  47  02  63  62  11  17  3
01/02/2016  42  15  06  05  29  10  2
12/30/2015  12  61  54  38  36  22  3
12/26/2015  65  40  44  59  27  20  2
12/23/2015  67  16  63  38  55  25  4
12/19/2015  30  68  59  41  28  10  2
12/16/2015  09  42  10  55  32  06  2
12/12/2015  62  02  30  19  14  22  2
12/09/2015  16  46  10  56  07  01  2
12/05/2015  47  33  68  27  13  13  2
12/02/2015  14  18  19  64  32  09  2
11/28/2015  47  02  66  67  06  02  3
11/25/2015  53  16  69  58  29  21  2
11/21/2015  37  57  47  50  52  21  3
11/18/2015  40  17  46  69  41  06  2
11/14/2015  66  37  22  14  45  05  3
11/11/2015  26  04  32  55  64  18  3
11/07/2015  50  53  07  16  25  15  2
11/04/2015  12  02  17  20  65  17  4
10/31/2015  09  47  20  25  68  07  2
10/28/2015  56  62  54  63  04  10  2
10/24/2015  20  31  56  64  60  02  3
10/21/2015  57  32  30  42  56  11  4
10/17/2015  57  62  69  49  48  19  3
10/14/2015  20  15  31  40  29  01  2
10/10/2015  27  68  12  43  29  01  2
10/07/2015  52  40  48  18  30  09  3

I wrote the program's output to a text file and loaded it into Excel. Here is the graph of the balls and the number of times each has been picked.




Here is the graph of the Powerball picks.



With this sample of 28 drawings, each of the 69 base balls should have been picked about 2.03 times on average (28 × 5 / 69). Each of the 26 Powerball numbers should have been picked about 1.08 times (28 / 26).
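
The averages come from simple arithmetic: drawings times balls per drawing, divided by the number of balls in the pool. A minimal sketch of that calculation follows; the function name expectedCount is hypothetical and not part of the program above.

```cpp
#include <cassert>

// Expected number of times one particular ball appears over `drawings`
// drawings, when `ballsPerDraw` distinct balls are drawn each time from
// a pool of `poolSize` equally likely balls.
double expectedCount(int drawings, int ballsPerDraw, int poolSize)
{
    assert(poolSize > 0);
    return static_cast<double>(drawings) * ballsPerDraw / poolSize;
}
```

For this sample, expectedCount(28, 5, 69) is 140/69, roughly 2.029, for the base balls, and expectedCount(28, 1, 26) is 28/26, roughly 1.077, for the Powerball.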

Here are the counts for the numbers in yesterday's drawing (the counts include that drawing itself):

16 - picked 5 times. Well above average.
19 - picked 3 times. Above average.
32 - picked 5 times. Well above average.
34 - picked 1 time. Finally, one below average.
57 - picked 4 times. Above average.
57 - picked 4 times.

Powerball 13 has been picked 2 times. Above average.
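
To judge the "avoid numbers that already came up" strategy, what matters is how often each of yesterday's balls had appeared before yesterday. Here is a small sketch of that check, assuming the drawings are held newest first as in powerball2.txt. The Draw alias and the priorCounts function are illustrative names, not part of the program above, and only the five base balls are considered.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

// One drawing: the five base balls (the Powerball is left out here).
using Draw = std::array<int, 5>;

// For each ball in the newest drawing (draws[0]), counts how many of the
// older drawings (draws[1] onward) contained that ball. A result of 0
// means the ball had not come up before within the sample.
std::array<int, 5> priorCounts(const std::vector<Draw>& draws)
{
    std::array<int, 5> counts{};
    if (draws.empty())
        return counts;

    for (std::size_t i = 0; i < 5; ++i)
    {
        const int ball = draws[0][i];
        for (std::size_t d = 1; d < draws.size(); ++d)
        {
            const Draw& older = draws[d];
            if (std::find(older.begin(), older.end(), ball) != older.end())
                ++counts[i];
        }
    }
    return counts;
}
```

Each prior count is one less than the figure listed above, because the listed figures include yesterday's drawing. That leaves 16 and 32 with 4 earlier appearances, 57 with 3, 19 with 2, and 34 as the only ball with none.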

The strategy of picking numbers that haven't been picked before clearly would not have worked yesterday. That said, it would be interesting to analyze the full data set and see if any real trends can be observed. This simple exercise opens the door for all sorts of numerical analysis. I am not sure if I want to go down that hole.