When approaching combinatorics there are two terms that may often cause confusion. Those are combinations and permutations.
Let’s try to clarify them better.
Informally, the term “combination” is used in natural language (e.g., English) when we want to refer to a set of “objects” whose order doesn’t matter, whereas we use the word “permutation” when we want to explicitly emphasize the order in which things appear.
For instance, we may say the following:

“My ice cream is a combination of banana, strawberry, and lemon”.
Here, we don’t care about the order of the ice cream flavours! We simply state that our ice cream is made of “banana, strawberry, and lemon”. In fact, we could also state that it is made of “lemon, banana, and strawberry” or “strawberry, lemon, and banana”, and so on.
“My security pin code is 456”.
Now, leaving aside the security issues you would encounter when saying something like the above, here you do care about the order in which the digits appear. The system won’t authenticate you if you enter 654 or 546!
In mathematical language, where statements have to be rigorous and precise (unlike natural language sentences), a combination is any collection of objects where order doesn’t matter, while a permutation is any ordered arrangement of objects or, equivalently, a particular (i.e., ordered) combination.
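The distinction can be illustrated with a minimal Python sketch, assuming we model the ice cream as a set (order ignored) and the pin code as a tuple (order kept). The variable names are purely illustrative.

```python
# A set ignores the order of its elements, much like a combination.
flavours_a = {"banana", "strawberry", "lemon"}
flavours_b = {"lemon", "banana", "strawberry"}
same_ice_cream = flavours_a == flavours_b   # True: same flavours, any order

# A tuple keeps the order of its elements, much like a permutation.
pin_expected = (4, 5, 6)
pin_entered = (6, 5, 4)
same_pin = pin_expected == pin_entered      # False: same digits, wrong order

print(same_ice_cream, same_pin)
```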
Let’s now explore those two concepts separately, first starting from permutations.
PERMUTATIONS
Roughly speaking, there are two types of permutations: with or without repetition allowed.
Permutations with Repetitions
These are the easiest to compute.
Suppose we have a set of $n$ items to choose from, i.e., $S = \{x_1, x_2, \ldots, x_n\}$. In addition, suppose we’d like to choose $k$ elements from $S$.
We denote by $s_1$ the very first element picked from the entire set $S$ at the first stage (i.e., stage $1$).
More generally, at each later stage $i$ (with $2 \leq i \leq k$) we pick an item $s_i$ from the same set $S$.
That is, each element removed from the original set is immediately put back before the next pick takes place. In this way, at each stage we can always choose from the whole set of elements: we have exactly $n$ options to select from at stage $1$, $n$ at stage $2$, …, and $n$ at the last stage $k$:
$$\underbrace{n \times n \times \cdots \times n}_{k \text{ times}} = n^k$$
A typical example of this kind of permutation is counting how many numbers you can represent using $k$ digits, each in the range $[0, n-1]$. For instance, a decimal lock with $k$ digits (i.e., $n = 10$) admits $10^k$ permutations; with $k = 3$, as in the pin code 456 above, that is $10^3 = 1000$.
Similarly, you can count how many numbers you are able to represent with $k$ binary digits (i.e., $n = 2$), which is $2^k$.
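As a quick check of the $n^k$ rule, here is a small Python sketch that enumerates every permutation with repetition using `itertools.product` and compares the count with the closed formula. The helper names are hypothetical, and the 3-digit lock is an assumed example size.

```python
from itertools import product

def count_permutations_with_repetition(n: int, k: int) -> int:
    """Closed formula: n options at each of the k stages."""
    return n ** k

def enumerate_permutations_with_repetition(items, k):
    """Explicitly list every ordered selection of k items, repeats allowed."""
    return list(product(items, repeat=k))

# Assumed example: a 3-digit decimal lock (n = 10, k = 3).
three_digit_codes = enumerate_permutations_with_repetition(range(10), 3)
print(len(three_digit_codes), count_permutations_with_repetition(10, 3))
```

Enumeration is only practical for small $n$ and $k$; the closed formula is what you would use in general.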
Permutations without Repetitions
Unlike permutations with repetitions allowed, here we have to reduce the number of available choices at each stage.
We use the same notation as above, with $S$ being a set of $n$ items. In addition, we want to select $k \leq n$ items, but this time we remove from the original set the element we picked at each stage. Therefore, if we denote by $s_1$ the very first element picked from the entire set $S$ at the first stage (i.e., stage $1$), the item we pick at the generic stage $i$ (with $2 \leq i \leq k$) is $s_i$, which is chosen from the set $S \setminus \{s_1, \ldots, s_{i-1}\}$.
For instance, in how many different orders can $n$ objects be arranged?
Well, intuitively we have $n$ options for the first choice, $n-1$ for the second, $n-2$ for the third, etc.
In the end, the total number of all possible orderings for the set of $n$ objects (i.e., $k = n$) is:
$$n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1 = n!$$
where the “$!$” symbol is the factorial function.
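It turns out that if you want to select only $k < n$ items then we have to compute the following:
$$n \times (n-1) \times \cdots \times (n-k+1)$$
Now, is there a way to use the factorial to express the equation above? Well, there is, and it can be shown as follows. By convention, $0! = 1$. Though it may appear counterintuitive or wrong that multiplying no numbers together gets us $1$, this helps simplify a lot of equations (for example, it lets the formula below cover the case $k = n$ as well).
To derive the second equation above from the first one, which already uses the factorial, notice that the second equation is exactly the first one where the last $n-k$ terms have been canceled out, that is:
$$\frac{n!}{(n-k)!} = \frac{n \times (n-1) \times \cdots \times (n-k+1) \times (n-k) \times \cdots \times 2 \times 1}{(n-k) \times \cdots \times 2 \times 1} = n \times (n-1) \times \cdots \times (n-k+1)$$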
Instead of writing the whole formula, you may find several notations to succinctly refer to permutations without repetitions, such as the following:
$$P(n, k) = {}^{n}P_{k} = {}_{n}P_{k} = \frac{n!}{(n-k)!}$$
Now, suppose you are interested in knowing in how many possible ways a group of $16$ pool balls can be arranged (i.e., $n = k = 16$).
According to our definition of permutation, the whole set of pool balls can be arranged in $16! = 20{,}922{,}789{,}888{,}000$ ways, which is about 21 trillion: a very large number indeed!
Instead, if we are only interested in finding the number of ways a group of $3$ pool balls can be arranged out of all the pool balls that are available, we are actually referring to $P(n, k)$, where $k = 3$ and still $n = 16$. It turns out that:
$$P(16, 3) = \frac{16!}{(16-3)!} = \frac{16!}{13!} = 16 \times 15 \times 14 = 3360$$
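The pool ball numbers can be double-checked with a short Python sketch. It computes the count both as a product of decreasing factors and as a ratio of factorials, and compares them with the standard-library `math.perm` (available from Python 3.8). The two helper functions are hypothetical names introduced only for illustration.

```python
import math
from itertools import permutations

def falling_product(n: int, k: int) -> int:
    """n * (n-1) * ... * (n-k+1): one fewer option at each stage."""
    result = 1
    for factor in range(n, n - k, -1):
        result *= factor
    return result

def permutations_via_factorials(n: int, k: int) -> int:
    """Same quantity written as n! / (n-k)!."""
    return math.factorial(n) // math.factorial(n - k)

print(falling_product(16, 3), permutations_via_factorials(16, 3), math.perm(16, 3))
print(math.factorial(16))  # all orderings of the 16 balls

# Brute-force check on a small case: ordered picks of 3 items out of 5.
small_case = list(permutations(range(5), 3))
print(len(small_case), falling_product(5, 3))
```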
COMBINATIONS
Differently from permutations, with combinations the order of elements does not matter anymore.
Anyway, we can still classify combinations in the same way we classified permutations, that is with or without repetition allowed.
Let’s start with combinations without repetitions, which are far easier to understand.
Combinations without Repetitions
This is exactly how lotteries work. The numbers are picked one at a time, and if you are lucky enough to have all the numbers that have been drawn (no matter in what order), you win!
The easiest way to explain combinations without repetition could be the following: first, assume that the order matters and treat it as permutations, and then alter it so that the order does not matter.
Going back to the pool ball example, let’s say we just want to know which $3$ pool balls were chosen, regardless of their order, assuming each pool ball is identified by an integer in the range $[1, 16]$.
We already found that there exist $3360$ possible ways of arranging $3$ pool balls out of $16$. However, many of those permutations can be considered the same now that we don’t care about the order. For instance, assume that balls $1$, $2$, $3$ were chosen. Then, we have the following correspondence:
Permutations (order does matter): $(1, 2, 3)$, $(1, 3, 2)$, $(2, 1, 3)$, $(2, 3, 1)$, $(3, 1, 2)$, $(3, 2, 1)$
Combinations (order does not matter): $\{1, 2, 3\}$
As it turns out, every permutation listed above (each of which is actually a sequence) corresponds to the same unique combination, namely the set $\{1, 2, 3\}$. Thereby, permutations have $6$ times as many possibilities.
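To make the correspondence concrete, the following Python sketch collapses ordered selections into unordered ones by turning each tuple into a `frozenset`. It is only an illustration under the pool ball numbering assumed here (balls labelled 1 to 16), and the variable names are hypothetical.

```python
from itertools import permutations

# All orderings of the three chosen balls collapse to one single set.
orderings = list(permutations([1, 2, 3]))
distinct_sets = {frozenset(ordering) for ordering in orderings}
print(len(orderings), len(distinct_sets))

# Same idea on the full example: ordered triples out of 16 balls.
ordered_triples = list(permutations(range(1, 17), 3))
unordered_triples = {frozenset(triple) for triple in ordered_triples}
print(len(ordered_triples), len(unordered_triples))
```

Each unordered triple is hit by the same number of ordered triples, which is why a plain division is enough to go from one count to the other.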
Indeed, there is an easy way to work out in how many ways the items “$1\ 2\ 3$” can be placed in order, and we have already seen it. The answer is simply: $3! = 3 \times 2 \times 1 = 6$.
So, we can adjust our permutations formula, dividing it by the number of ways the $k$ chosen items could be ordered (because, again, we are not interested in their order anymore!), as follows:
$$\frac{n!}{(n-k)!} \times \frac{1}{k!} = \frac{n!}{k!\,(n-k)!}$$
The formula above is often referred to as the binomial coefficient or, more informally, as “$n$ choose $k$”, and it can be denoted by the following expressions:
$$C(n, k) = {}^{n}C_{k} = {}_{n}C_{k} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
Therefore, to answer our question about the possible sets of $3$ pool balls out of the total $16$ pool balls, we simply need to write down the formula above, where $n = 16$ and $k = 3$:
$$\binom{16}{3} = \frac{16!}{3!\,(16-3)!} = \frac{16 \times 15 \times 14}{3 \times 2 \times 1} = \frac{3360}{6} = 560$$
Interestingly enough, the formula of the binomial coefficient is symmetrical:
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} = \binom{n}{n-k}$$
In other words, choosing $3$ out of $16$ pool balls or $13$ out of $16$ leads to exactly the same number of combinations.
Another way to figure out the combinations of $k$ elements out of $n$ is to use Pascal’s triangle. Indeed, you simply have to locate the “cell” of the triangle corresponding to row number $n$ and slot number $k$ (remember that both rows and slots of the triangle are numbered starting from $0$).
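The binomial coefficient, its symmetry, and the Pascal’s triangle lookup can all be checked with a small Python sketch. `n_choose_k` and `pascal_row` are hypothetical helper names; `math.comb` is the standard-library equivalent (Python 3.8+).

```python
import math

def n_choose_k(n: int, k: int) -> int:
    """Binomial coefficient n! / (k! * (n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

def pascal_row(n: int) -> list:
    """Row n of Pascal's triangle (rows and slots both start from 0).

    Each new row is obtained by summing adjacent entries of the previous one.
    """
    row = [1]
    for _ in range(n):
        row = [left + right for left, right in zip([0] + row, row + [0])]
    return row

print(n_choose_k(16, 3), math.comb(16, 3))   # 3 balls out of 16
print(n_choose_k(16, 13))                    # symmetry: 13 balls out of 16
print(pascal_row(16)[3])                     # row 16, slot 3 of the triangle
```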
Combinations with Repetitions
This is the hardest type of combinations you might be asked to compute. Let’s try to explain it by using an example.
Suppose there are 5 flavours of ice cream: banana, chocolate, lemon, strawberry and vanilla but we can have just 3 scoops. How many variations will there be?
Let’s use the letters $\{b, c, l, s, v\}$ to represent the set of five flavours. Some possible ice cream selections could be the following:
 $\{c, c, c\}$: three scoops of chocolate;
 $\{b, l, v\}$: one scoop of banana, one of lemon, and one of vanilla;
 $\{b, v, v\}$: one scoop of banana and two of vanilla.
First of all, note that the usage of set notation is not totally precise, since some items appear multiple times within the same set, as in $\{c, c, c\}$ and $\{b, v, v\}$. In fact, multiset notation would be preferable (e.g., $\{(c, 3)\}$ and $\{(b, 1), (v, 2)\}$, pairing each element with its multiplicity), but this would also lead to a hard-to-read notation.
So, just to be clear, there are $n = 5$ items we can choose from and we want to select $k = 3$ of those: order does not matter and we can select the same item multiple times.
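Imagine the ice cream flavours to be located in five boxes, one next to the other, in this order: $b, c, l, s, v$.
In addition, suppose you encode the action of picking a flavour with a box (i.e., the symbol $\square$), whereas you use a triangle (i.e., the symbol $\triangleright$) to denote the action of skipping to the next flavour.
With this encoding in mind, we can represent the above combinations as follows:
 $\{c, c, c\}$ = $\triangleright \square \square \square \triangleright \triangleright \triangleright$
 $\{b, l, v\}$ = $\square \triangleright \triangleright \square \triangleright \triangleright \square$
 $\{b, v, v\}$ = $\square \triangleright \triangleright \triangleright \triangleright \square \square$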
Therefore, instead of worrying about different flavours, we now have a simpler question: “In how many different ways can we arrange boxes and triangles?”
Notice that there are always $3$ boxes (i.e., $3$ scoops of ice cream) and $4$ triangles (i.e., we need to move $4$ times to go from the first to the fifth flavour).
Generally speaking, there are $k + (n-1)$ total positions, and we want to choose $k$ of them to have boxes ($\square$).
This is like saying “We have $n+k-1$ pool balls and we want to choose $k$ of them”. In other words, it is now like the pool balls question, but with slightly changed numbers, that is, we want to compute $\binom{n+k-1}{k}$, and we know how to handle this:
$$\binom{n+k-1}{k} = \frac{(n+k-1)!}{k!\,(n-1)!}$$
Interestingly, we could have looked at the triangles instead of the boxes, and we would have then been saying “We have $n+k-1$ positions and want to choose $n-1$ of them to have triangles”, and the answer would be the same due to the symmetry of the binomial coefficient:
$$\binom{n+k-1}{n-1} = \frac{(n+k-1)!}{(n-1)!\,k!} = \binom{n+k-1}{k}$$
Finally, coming back to our ice cream example, where $n = 5$ and $k = 3$, the answer to our question is the following:
$$\binom{5+3-1}{3} = \binom{7}{3} = \frac{7!}{3!\,4!} = \frac{7 \times 6 \times 5}{3 \times 2 \times 1} = 35$$
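The box-and-triangle argument can be replayed in a short Python sketch. It enumerates all selections with `itertools.combinations_with_replacement`, encodes each one as a string of boxes and triangles, and compares the count with the formula. The single-letter flavour labels and the helper names `count_multisets` and `encode` are assumptions made for illustration.

```python
import math
from itertools import combinations_with_replacement

FLAVOURS = ["b", "c", "l", "s", "v"]  # banana, chocolate, lemon, strawberry, vanilla

def count_multisets(n: int, k: int) -> int:
    """Number of ways to choose k items out of n, repeats allowed, order ignored."""
    return math.comb(n + k - 1, k)

def encode(selection, flavours=FLAVOURS) -> str:
    """Encode a selection as boxes (scoops taken) and triangles (moves to the next flavour)."""
    symbols = []
    for position, flavour in enumerate(flavours):
        symbols.extend("□" * selection.count(flavour))  # one box per scoop of this flavour
        if position < len(flavours) - 1:
            symbols.append("▷")                          # move on to the next flavour
    return "".join(symbols)

all_selections = list(combinations_with_replacement(FLAVOURS, 3))
print(len(all_selections), count_multisets(5, 3))
print(encode(("c", "c", "c")), encode(("b", "l", "v")), encode(("b", "v", "v")))
```

Every selection maps to a different string of three boxes and four triangles, which is the one-to-one correspondence the counting argument relies on.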
Additional Note
The last result about Combinations with Repetitions is particularly useful to count all the possible resamples of size $n$ which can be generated from an original sample also of size $n$ when performing bootstrapping. At the core of bootstrapping is the generation of many (i.e. thousands or tens of thousands) bootstrap random samples with replacement, each one having the same size as the original sample.
More formally, let us assume we have a sample of $n$ instances from an unknown population, i.e. $S = \{x_1, x_2, \ldots, x_n\}$. Bootstrapping requires generating $B$ random samples (with replacement) $S^*_1, S^*_2, \ldots, S^*_B$, also called resamples, from $S$, so that $|S^*_b| = |S| = n$ for each $b \in \{1, \ldots, B\}$.
Finding the total number $m$ of such possible random samples, for a known $n$, reduces to finding the total number of combinations with repetitions, each of size $n$, which can be obtained from a collection of $n$ objects.
We already know from above that the number of combinations with repetitions, each of size $k$, out of a collection of $n$ elements is:
$$\binom{n+k-1}{k} = \frac{(n+k-1)!}{k!\,(n-1)!}$$
If we substitute $k = n$ into the formula above, we can compute $m$ as follows:
$$m = \binom{2n-1}{n} = \frac{(2n-1)!}{n!\,(n-1)!}$$
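To get a feeling for how quickly the number of distinct resamples grows with the sample size, here is a minimal Python sketch. The function name `count_bootstrap_resamples` is hypothetical, and resamples are treated as unordered collections, as in the counting argument above.

```python
import math
from itertools import combinations_with_replacement

def count_bootstrap_resamples(n: int) -> int:
    """Distinct resamples of size n drawn with replacement from n instances,
    ignoring order: combinations with repetition with k = n."""
    return math.comb(2 * n - 1, n)

# Brute-force check on a tiny sample of n = 3 instances.
tiny_sample = ["x1", "x2", "x3"]
tiny_resamples = list(combinations_with_replacement(tiny_sample, len(tiny_sample)))
print(len(tiny_resamples), count_bootstrap_resamples(3))

# Growth with the sample size.
for n in (5, 10, 20):
    print(n, count_bootstrap_resamples(n))
```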
The bootstrapping process can thus be seen as a repetition of $B$ independent trials (one per resample drawn), each of which leads to a “success” for exactly one of $m$ categories (i.e. the $m$ distinct possible resamples), with each resample $i$ having a given fixed success probability (i.e. the probability of being picked) $p_i$. As a simplifying assumption, we take the resamples to be uniformly distributed, each with probability $p_i = 1/m$, where $i \in \{1, \ldots, m\}$ (in practice, drawing instances with replacement makes some resamples likelier than others, so this is only an approximation).
More formally, the bootstrapping process can be thought of as a Multinomial trials process, namely a sequence of independent and identically distributed random variables collectively represented by the vector $\mathbf{X} = (X_1, X_2, \ldots, X_B)$, where each $X_j$ can take on $m$ possible values. Therefore, the multinomial trials process is a simple generalization of the Bernoulli trials process (which corresponds to $m = 2$). We denote the set of outcomes by $\{1, 2, \ldots, m\}$, and we denote the common probability density function of the trial variables by $p_i = P(X_j = i)$.
As already stated above, $p_i = 1/m$ for each $i \in \{1, \ldots, m\}$ and $\sum_{i=1}^{m} p_i = 1$.
In the special case of Bernoulli trial processes (i.e. when $m = 2$), each random variable $X_j$ is actually a (discrete) binary random variable. This is distributed according to a Bernoulli distribution parametrized by $p_1$, which is the probability of observing the outcome $1$ out of the set of two possible outcomes $\{1, 2\}$. This can be written as $X_j \sim \text{Bernoulli}(p_1)$. Note also that the probability $p_2$ can be directly inferred from $p_1$, as it always holds that $p_1 + p_2 = 1$.
More generally, in the case of a Multinomial trial process (i.e. when $m > 2$), each random variable $X_j$ is a (discrete) $m$-ary random variable, which is distributed according to a Categorical distribution parametrized by the following vector of parameters: $\mathbf{p} = (p_1, p_2, \ldots, p_m)$. This can be written as $X_j \sim \text{Categorical}(\mathbf{p})$. Note however that we don’t need to specify the full set of parameters, as the last one can be easily derived once the other $m-1$ are known, since $\sum_{i=1}^{m} p_i = 1$.
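Usually, we are interested in counting the number of times each outcome occurs in the $B$ trials. For each outcome $i \in \{1, \ldots, m\}$, this number is represented by the following random variable:
$$Y_i = \sum_{j=1}^{B} \mathbf{1}(X_j = i)$$
The resulting vector of random variables $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_m)$ is known to be distributed according to a Multinomial distribution, with parameters $B$ and $\mathbf{p} = (p_1, \ldots, p_m)$. This is typically written as $\mathbf{Y} \sim \text{Multinomial}(B, \mathbf{p})$.
In the special case of a Bernoulli trials process we just have two random variables, $Y_1$ and $Y_2 = B - Y_1$, as there are two possible outcomes of each trial; therefore $Y_1$ is said to be distributed according to a Binomial distribution, whose parameters are $B$ and $p_1$. This is usually written as $Y_1 \sim \text{Binomial}(B, p_1)$.
At this point, we can compute the following joint probability distribution:
$$P(Y_1 = y_1, \ldots, Y_m = y_m) = \frac{B!}{y_1!\, y_2! \cdots y_m!}\; p_1^{y_1}\, p_2^{y_2} \cdots p_m^{y_m}, \qquad \text{where } \sum_{i=1}^{m} y_i = B$$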
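The joint probability of a Multinomial distribution can be evaluated directly from its definition (the multinomial coefficient times the product of the outcome probabilities raised to their counts). The sketch below does this in plain Python, assuming the counts sum to the number of trials. The function name `multinomial_pmf` is a hypothetical helper, not a library API.

```python
import math

def multinomial_pmf(counts, probs) -> float:
    """P(Y_1 = counts[0], ..., Y_m = counts[m-1]) for sum(counts) trials.

    counts: how many times each outcome occurred.
    probs:  probability of each outcome (assumed to sum to 1).
    """
    trials = sum(counts)
    coefficient = math.factorial(trials)
    for count in counts:
        coefficient //= math.factorial(count)   # exact: partial quotients stay integers
    probability = float(coefficient)
    for count, p in zip(counts, probs):
        probability *= p ** count
    return probability

# With two outcomes this reduces to the Binomial case.
print(multinomial_pmf([2, 1], [0.5, 0.5]))

# Three equiprobable outcomes, three trials, each outcome observed once.
print(multinomial_pmf([1, 1, 1], [1 / 3, 1 / 3, 1 / 3]))
```

For larger problems a numerically safer route is to work with logarithms of the factorials, which this sketch deliberately leaves out.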
In the specific context of bootstrapping, we might be asked what the probability is of having any one of the $m$ resamples selected more than, say, $t$ times during the $B$ bootstraps. This basically means computing the probability that there exists (at least) one $i \in \{1, \ldots, m\}$ such that $Y_i > t$.
Therefore, as $n$ gets larger (for a fixed number $B$ of bootstraps), and if we assume each resample to be equiprobable, the probability of selecting the same resample more than once rapidly decreases to $0$.
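Under the equiprobability assumption, the probability that some resample shows up more than once among the bootstraps can be computed with a birthday-problem style argument: one minus the probability that all draws are distinct. The Python sketch below covers only this “more than once” case, not a general threshold; the function name and the choice of 1000 bootstraps are illustrative assumptions.

```python
import math

def prob_some_resample_repeated(n: int, num_bootstraps: int) -> float:
    """Probability that at least one resample is drawn more than once,
    assuming the m = C(2n-1, n) possible resamples are equiprobable."""
    m = math.comb(2 * n - 1, n)
    prob_all_distinct = 1.0
    for already_drawn in range(num_bootstraps):
        # The next draw must avoid every resample drawn so far.
        prob_all_distinct *= max(0.0, 1.0 - already_drawn / m)
    return 1.0 - prob_all_distinct

for n in (5, 10, 20, 30):
    print(n, prob_some_resample_repeated(n, num_bootstraps=1000))
```

With a fixed number of bootstraps, the printed probabilities shrink as $n$ grows, because the number of possible resamples grows much faster than the number of draws.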