There is an interesting question —

There are 5 different types of toys at a KFC restaurant. If you go there, you will get one toy randomly. How many times do you need to go to KFC in order to get all 5 toys?

The question is about probabilistic analysis. Different professionals, such as a business analyst, a statistical programmer, a mathematician and a software developer, will have different thinking pathway to solve this problem. Let's see what they would think.

#### 1. Business Analyst

A business analyst will tend to do scenario analysis at the first step.

##### Best-case scenario:

Assume I am so lucky that each time I visit KFC and get a different toy, then I only need 5 times to have all the five toys. The minimum number is 5.

##### Worst-case scenario:

To get a different toy I need to go to KFC five times. If there is not the toy I want, I will come back with empty hand. Thus, I will have to go to KFC 5*5 = 25 times. Of course, this scenario never happens.

OK. Then the mean times I need to go to KFC seems to be in a range [5, 25). Let's give the simplest try — use the (5+25)/2 to get

**15**times. The number is not accurate but at least we have an estimate.#### 2. Statistical Programmer

As a brute-force tool, simulation is the instant thought for a statistical programmer. Let the computer randomly create 10,000 trials — say a person plays the game 10,000 times. After averaging the results, the computer eventually tells the expected times to get the 5 toys.

I modified the SAS code from a post by Rick Wicklin, and set the maximum number of visits to KFC as 32. After 10,000 runs, the mean is

**11.37**. The hunch tells that this number should be quite close.```
************(1)Simulate 10000 trials**************************;
proc iml;
K = 5; /* number of toys */
L = 32; /* max visits per trial */
/* generate NSim trials of L visits */
NSim = 10000;
x = j(Nsim, L);
call randseed(12345);
call randgen(x, "Uniform");
x = ceil(K*x); /* integers in [1,K] */
/* record the visit when 5 toys are taken */
c = j(NSim,1,0); /** allocate */
do i = 1 to NSim;
do j = 5 to L;
rowUnique = countunique(x[i, 1:j]);
if rowUnique = 5 then do;
c[i, 1] = j;
goto skip;
end;
end;
skip:
end;
/* output the result */
create _1 from c;
append from c;
close _1;
;quit;
data _2;
set _1;
/* remove the trials that didn't get 5 toys in 32 visits */
where col1 >= 5;
run;
************(2)Show the results*******************************;
proc sgplot;
density col1;
density col1 / type = kernel;
run;
proc means;
run;
```

#### 3. Mathematician

Actually this question is a variant of Coupon collector's problem. The mean/expectation and standard deviation can be derived directly by the formulas. The final expectation should be 5*(1+1/2+1/3+1/4+1/5) =

**11.41**. This is the answer.#### 4. Software Developer

A software developer considers time complexity and space complexity first. When N is approaching infinity, the question is similar to a merge sorting. Given a merge sort is O(nlog(n)), the expected times must be greater than 5*ln(5) =

**8.05**. At least this number will be a lower bound for this question.