# Tag: sampling

Cluster sampling is a method of obtaining a representative sample from a population that researchers have divided into groups. An individual cluster is a subgroup that mirrors the diversity of the whole population, while the clusters themselves are similar to one another. Researchers typically use this approach when studying large, geographically dispersed populations because it controls costs: they do not need to obtain samples from all clusters, because each one reflects the population as a whole.
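The idea above can be sketched in a few lines. The population, cluster names, and sizes below are hypothetical; this shows one-stage cluster sampling, where whole clusters are drawn at random and everyone inside them is surveyed:

```python
import random

random.seed(42)

# Hypothetical population: 20 clusters (say, city blocks), each a list of
# resident IDs. Each cluster is assumed to mirror the whole population.
clusters = {block: [f"block{block}-resident{i}" for i in range(50)]
            for block in range(20)}

# One-stage cluster sampling: randomly pick whole clusters, then include
# every individual inside the chosen clusters.
chosen_blocks = random.sample(sorted(clusters), k=4)
sample = [person for block in chosen_blocks for person in clusters[block]]

print(len(sample))  # 4 clusters x 50 residents = 200 individuals
```

The cost saving comes from the structure: a field team visits only 4 blocks instead of all 20.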

Stratified sampling is a method of obtaining a representative sample from a population that researchers have divided into relatively similar subpopulations (strata). Researchers use stratified sampling to ensure specific subgroups are present in their sample. It also helps them obtain precise estimates of each group’s characteristics. Many surveys use this method to understand differences between subpopulations better. Stratified sampling is also known as stratified random sampling.

Sometimes you know the best-fitting distribution, or probability density function, of your data prior to analysis; more often, you do not. Approaches to data sampling, modeling, and analysis can vary based on the distribution of your data, so determining the best-fitting theoretical distribution can be an essential step in your data exploration process.

This is where distfit comes in.

distfit is a Python package for probability density fitting: it fits 89 univariate distributions to non-censored data, ranks the candidates by residual sum of squares (RSS), and supports hypothesis testing. Probability density fitting is the fitting of a probability distribution to a series of data concerning the repeated measurement of a variable phenomenon.
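To make the RSS idea concrete, here is a standard-library-only sketch of the principle, not distfit's actual implementation: estimate an empirical density with a histogram, evaluate candidate PDFs (here just normal and exponential, with simple moment estimates), and pick the candidate with the smallest residual sum of squares. All names and parameter choices below are illustrative:

```python
import math
import random

random.seed(1)

# Simulated data from a normal distribution; in practice the generating
# distribution is unknown, which is the whole point of the exercise.
data = [random.gauss(10.0, 2.0) for _ in range(5000)]

# Empirical density via a histogram.
lo, hi, bins = min(data), max(data), 40
width = (hi - lo) / bins
centers = [lo + (i + 0.5) * width for i in range(bins)]
counts = [0] * bins
for x in data:
    counts[min(int((x - lo) / width), bins - 1)] += 1
density = [c / (len(data) * width) for c in counts]

# Candidate PDFs with parameters estimated from the data.
mu = sum(data) / len(data)
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))

def norm_pdf(x):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def expon_pdf(x):
    # Exponential shifted to start at the data minimum.
    scale = mu - lo
    return math.exp(-(x - lo) / scale) / scale if x >= lo else 0.0

# Residual sum of squares between empirical and theoretical densities.
def rss(pdf):
    return sum((d - pdf(c)) ** 2 for c, d in zip(centers, density))

scores = {"norm": rss(norm_pdf), "expon": rss(expon_pdf)}
best = min(scores, key=scores.get)
print(best)  # the normal candidate should win on normal data
```

distfit automates this across its full catalogue of distributions (its entry point is roughly `distfit().fit_transform(X)`), but the scoring logic is the same comparison of empirical and theoretical densities.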

Markov Chain Monte Carlo is a group of algorithms used to map out the posterior distribution by drawing samples from it. The reason we use this method instead of the quadratic approximation is that when we encounter distributions with multiple peaks, the quadratic approximation can converge to a local maximum and fail to give us a true picture of the posterior distribution. Monte Carlo algorithms instead use randomness to solve problems that would otherwise be difficult, if not impossible, to solve analytically.
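A minimal sketch of one MCMC algorithm, random-walk Metropolis, on a hypothetical bimodal target (a mixture of two normals at −2 and +2, exactly the kind of two-peaked posterior where a quadratic approximation centred on one peak would miss the other). The proposal scale and step count are arbitrary choices for illustration:

```python
import math
import random

random.seed(7)

# Unnormalised bimodal target: equal-weight mixture of N(-2, 1) and N(2, 1).
def log_post(x):
    p = math.exp(-((x + 2) ** 2) / 2) + math.exp(-((x - 2) ** 2) / 2)
    return math.log(p)

# Random-walk Metropolis: propose a nearby point, accept with
# probability min(1, posterior ratio); otherwise stay put.
x, samples = 0.0, []
for step in range(30000):
    proposal = x + random.gauss(0.0, 2.0)
    if math.log(random.random()) < log_post(proposal) - log_post(x):
        x = proposal
    samples.append(x)

# If the chain mixes across both modes, the sample mean lands near 0,
# unlike a single-peak approximation stuck at -2 or +2.
mean = sum(samples) / len(samples)
print(round(mean, 2))
```

The key behaviour to notice is that the chain repeatedly crosses the low-density valley between the peaks, so the samples cover the whole posterior rather than one mode.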

You and your friend have been using bandit algorithms to optimise which restaurants and movies to choose for your weekly movie night. So far, you have tried different bandit algorithms like Epsilon-Greedy, Optimistic Initial Values and Upper Confidence Bounds (UCB). You’ve found the UCB1-Tuned algorithm to work slightly better than the rest, for both Bernoulli and Normal rewards, and have ended up using it for the last few months.

Even though your movie nights have been going great with the choices made by UCB1-Tuned, you miss the thrill of trying a new algorithm out.

“Have you heard of Thompson Sampling?”
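For a taste of what that question leads to, here is a sketch of Thompson Sampling for Bernoulli rewards, in the spirit of the movie-night setup. The arms and their success probabilities are made up for illustration; each arm keeps a Beta posterior, and on every round the algorithm plays the arm whose posterior draw is highest:

```python
import random

random.seed(3)

# Three hypothetical restaurants with unknown "the night was a hit"
# probabilities; the algorithm never sees these directly.
true_probs = [0.45, 0.55, 0.70]
n_arms = len(true_probs)

# Beta(1, 1) prior per arm, updated with observed successes/failures.
alpha = [1] * n_arms
beta = [1] * n_arms

for night in range(2000):
    # Thompson Sampling: draw once from each arm's posterior and play
    # the arm with the highest draw. Exploration happens automatically
    # because uncertain arms produce more variable draws.
    draws = [random.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
    arm = max(range(n_arms), key=draws.__getitem__)
    if random.random() < true_probs[arm]:
        alpha[arm] += 1
    else:
        beta[arm] += 1

# The most-played arm has the largest posterior update count.
best = max(range(n_arms), key=lambda i: alpha[i] + beta[i])
print(best)  # the 0.70 arm should dominate after enough nights
```

Unlike UCB1-Tuned, which adds an explicit confidence bonus to each arm's mean, Thompson Sampling explores by sampling from the posterior itself.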