Bootstrapping is a resampling method that lets us infer population statistics from a sample. It is also easy to perform and understand, which is what makes it so darn cool. Practitioners who use the bootstrap and appreciate its potential know that it can estimate many different population statistics, yet nearly all of the examples I could find online use it only to estimate the population’s mean. I think it’s time to change that.

In this short article, I will review the bootstrap method and how to execute it in Python. Then we’ll use it to estimate a confidence interval for the population’s standard deviation, which should clear up any confusion about how to bootstrap population statistics other than the mean. We’ll also do a little visualization to better understand what we learned, and experiment with drawing a larger number of samples to see how this affects the outcome.

Let’s dive in.

If you prefer, you can follow along in the Jupyter Notebook here.

Start by importing all the packages we will need.

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import scipy.stats as st
```

Now, let’s generate a fictitious “population.” I made up the mean and standard deviation. You can make up your own if you wish.

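```python
# generate a fictitious population with 1 million values
pop_mean = 53.21
pop_std = 4.23
population = np.random.normal(pop_mean, pop_std, 10**6)

# plot the population
# (sns.distplot is deprecated as of seaborn 0.11; in newer versions,
# sns.histplot(population, kde=True, stat='density') draws a similar plot)
sns.distplot(population)
plt.title('Population Distribution')
```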

We’ve now created a “population” with one million values, a mean of 53.21, and a standard deviation of 4.23.
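Because the values are drawn at random, the realized mean and standard deviation of the array will sit very close to, but not exactly on, the numbers we passed in. Here is a minimal sketch of how you might check that. The seeded generator (`np.random.default_rng(42)`) and the names `realized_mean` and `realized_std` are my own assumptions for repeatability; the code above is unseeded.

```python
import numpy as np

# Assumption: a fixed seed (42) so the check is repeatable.
rng = np.random.default_rng(42)

pop_mean = 53.21
pop_std = 4.23
population = rng.normal(pop_mean, pop_std, 10**6)

# ddof=0 treats the array as the whole population rather than a sample.
realized_mean = population.mean()
realized_std = population.std(ddof=0)

print(f"realized mean: {realized_mean:.2f}")
print(f"realized std:  {realized_std:.2f}")
```

With a million values, both numbers should match the targets to about two decimal places.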

## Draw a sample

Next, we draw a small sample from the population and use it to bootstrap estimates of the population parameters. In practice, the sample is all we would have.

```python
# Draw 30 random values from the population
sample = np.random.choice(population, size=30, replace=False)
```

The variable `sample` now contains 30 randomly drawn values from the population.
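Before bootstrapping anything, it can help to look at the point estimates the sample gives on its own. The sketch below is a self-contained illustration, not part of the original notebook: the seed and the names `sample_mean` and `sample_std` are assumptions I introduced. It uses `ddof=1` so that the standard deviation is the usual sample estimate.

```python
import numpy as np

# Assumption: a fixed seed (0) so the example is repeatable.
rng = np.random.default_rng(0)

# Rebuild a population like the one above, then draw 30 values without replacement.
population = rng.normal(53.21, 4.23, 10**6)
sample = rng.choice(population, size=30, replace=False)

# Point estimates from the sample alone.
sample_mean = sample.mean()
sample_std = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

print(f"sample mean: {sample_mean:.2f}")
print(f"sample std:  {sample_std:.2f}")
```

With only 30 values, these estimates will typically land some distance from 53.21 and 4.23. That gap is the uncertainty a bootstrap confidence interval is meant to quantify.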

I’m going to go quickly here. If you want a more in-depth look at the bootstrap method, check out my previous article Estimating Future Online Event Donation Revenue for Musicians and Nonprofits — Bootstrap estimation of confidence intervals with python.

All of the magic with bootstrapping happens as a result of sampling with replacement. Replacement means that when we draw a sample, we record that number then return that number to the source so that it has…