Do I need an actual data set?

Kaarlo Tuomi Tue Nov 23, 2021 8:10 am

I am trying to use statistics to solve a problem and believe that my next step is bootstrap hypothesis testing, which I had not heard of until this morning. Statistics101 looks like it might be the right tool for this, but I want to clarify something first.

I have a frequency distribution showing the birth months of 3196 males from one country who did a thing. Call this F1.

I also have census data from their country showing the number of folk born each month, and I have the ratio of male to female births. Together these let me calculate a frequency distribution of male births by month over the year. Call this F2.
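A minimal Python sketch of the F2 calculation, for illustration only (this is not Statistics101 code). The function names and the ratio of 1.05 are hypothetical; the three monthly counts are the ones quoted later in this post, so the list is deliberately incomplete. One thing the sketch makes visible: if a single male-to-female ratio is applied to every month, the monthly shares of male births are the same as the monthly shares of all births.

```python
def male_births_by_month(total_births, male_to_female_ratio):
    """Scale monthly birth counts by the male share implied by the ratio."""
    # A ratio of r males per female means a fraction r / (1 + r) of births are male.
    male_fraction = male_to_female_ratio / (1 + male_to_female_ratio)
    return [count * male_fraction for count in total_births]


def proportions(counts):
    """Convert counts to shares that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


# January to March counts quoted in the post; the other nine months are not given.
first_quarter = [167488, 153146, 165951]
ratio = 1.05  # hypothetical placeholder, not a figure from the post

male = male_births_by_month(first_quarter, ratio)
print(proportions(male))
print(proportions(first_quarter))  # same shares, because one ratio scales every month
```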

I want to compare F1 with F2 to test a theory that doing this thing is influenced by birthdate. Basically, the theory says that folk who do this thing tend to be born in a particular part of the year.

That's where (I think) bootstrapping comes in. As I understand it, I want to take 1000 or so random samples from the census data and compute their statistics, to determine whether or not my original sample (the 3196 folk who did a thing) could just be a random sample from the general population.

If that is correct, do I have to recreate the actual data set for the 2 million folk born each year, or can I just enter the frequency distribution (167488 folk born in January, 153146 born in February, 165951 born in March, etc.) and work from that?

Kaarlo Tuomi

Re: Do I need an actual data set?

Admin Tue Nov 23, 2021 5:51 pm

Drawing from the frequency distribution should be sufficient. But I'm not an expert statistician, so don't take my opinion as gospel.
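To illustrate what drawing from the frequency distribution could look like, here is a minimal Python sketch (not Statistics101 code, and only one possible approach). It treats the monthly counts as weights, repeatedly draws simulated samples the same size as the observed one, and asks how often a simulated sample deviates from the expected counts at least as much as the observed sample does. The chi-square style statistic, the function names, and the example numbers are all assumptions made for illustration and do not come from this thread. The draws are made with replacement, which assumes the population is so much larger than the sample that the difference from sampling without replacement is negligible.

```python
import random
from collections import Counter


def chi_square(observed, expected):
    """Sum of squared deviations from expectation, each scaled by the expectation."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def monte_carlo_p_value(sample_counts, population_counts, n_sims=1000, seed=0):
    """Estimate how often a random sample deviates as much as the observed one."""
    rng = random.Random(seed)
    n = sum(sample_counts)
    total = sum(population_counts)
    # Counts we would expect if the sample simply mirrored the population.
    expected = [n * c / total for c in population_counts]
    observed_stat = chi_square(sample_counts, expected)

    categories = list(range(len(population_counts)))
    hits = 0
    for _ in range(n_sims):
        # Draw n category labels, weighted by the population counts.
        tally = Counter(rng.choices(categories, weights=population_counts, k=n))
        sim_counts = [tally.get(c, 0) for c in categories]
        if chi_square(sim_counts, expected) >= observed_stat:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sims + 1)


# Hypothetical four-category example; real use would pass twelve monthly counts.
population = [400, 300, 200, 100]
sample = [45, 30, 15, 10]
print(monte_carlo_p_value(sample, population))
```

A small returned value would suggest the observed sample is unlikely to be a plain random draw from the population distribution; a large one would mean the data are consistent with it. No individual-level data set is built at any point, only the counts are used.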


