The Professor's Response

Hello Professor, I really don't understand this question

62. If you wanted to determine the number of people needed for a survey to determine the percent of the population without health insurance, how could this be done? Assume you would like to ultimately form a 95% confidence interval with a margin of error of just 2 percentage points (hint: think of the margin of error formula for the proportion interval).

I really don't know how I would actually go about this. I read the answer and I still don't understand what values you would even plug into the equation for p hat and q hat.

Hi Cristina,

I might create a video for this problem, but until I do, here is all you need to know.

As the answer describes, you must first solve the margin of error formula for n. The resulting formula for n is included in this problem's answer. You'll notice it contains four quantities you need to fill in: Za/2, E, p, and q.
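For reference, here is a sketch of that algebra. It assumes the margin of error formula is the usual one for a proportion interval; the version printed in the answer may be arranged slightly differently, but it should be equivalent.

```latex
\begin{align*}
E &= z_{\alpha/2}\sqrt{\frac{p\,q}{n}} \\
E^{2} &= \frac{z_{\alpha/2}^{2}\,p\,q}{n} \\
n &= \frac{z_{\alpha/2}^{2}\,p\,q}{E^{2}}
\end{align*}
```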

The Za/2 value is 1.96, which you have used several times for a 95% confidence interval, so I will assume that isn't giving you trouble. The problem states that the margin of error is just 2 percentage points, so E is 0.02. The tricky part is picking the correct p and q. It turns out that assuming both p and q are 50% (0.50) works best.
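If it helps to see the numbers plugged in, here is a minimal Python sketch of the calculation. The function name sample_size is just an illustration, not something from the textbook, and it assumes the standard sample size formula n = (Za/2)^2 * p * q / E^2. Exact fractions are used so that rounding error in decimal arithmetic cannot push the result up to the next whole number by accident.

```python
from fractions import Fraction
from math import ceil


def sample_size(z, margin, p):
    """Smallest whole n for which z * sqrt(p*q/n) <= margin, with q = 1 - p.

    Arguments are passed as strings (e.g. "1.96") so they convert to
    exact fractions instead of binary floating-point approximations.
    """
    z, margin, p = Fraction(z), Fraction(margin), Fraction(p)
    q = 1 - p
    # n = z^2 * p * q / E^2, rounded up because we cannot survey part of a person
    return ceil(z**2 * p * q / margin**2)


n = sample_size("1.96", "0.02", "0.5")
print(n)  # 2401
```

So with Za/2 = 1.96, E = 0.02, and p = q = 0.50, the survey would need 2401 people.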

The reason that assuming p is 50% and that q is 50% works best is more complicated, and honestly, it is not something you need to understand. If someone asks you to find the sample size for a study designed to estimate the population proportion, you should simply assume that both p and q are 0.50. If you want to know why this is the case, you can continue to read the explanation below.

This is the best choice because 0.50*0.50 produces the largest product of all the possible choices for p and q. You should remember that p and q have to add to 1 (or 100%). For example, p could be 0.75, but then q would need to be 0.25, and their product is only 0.1875 compared to 0.25 for 0.5 and 0.5. For any two numbers that add to one, the pair that produces the maximum product is 0.5 and 0.5. This can be shown by finding the vertex of a particular quadratic equation. I won't go into that detail at this point. However, since assuming p is 0.5 and q is 0.5 gives the largest product in the formula for the sample size, using them ensures our calculated sample size will be large enough regardless of what the real p and q values are. In other words, assuming p and q are both 50% is the safe choice because if they aren't both 50% in reality, you will have a larger sample size than needed, but at least you won't have too little data.
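To see this with numbers, here is a short Python sketch that computes the required sample size for a few possible values of p, keeping Za/2 = 1.96 and E = 0.02 from this problem. The candidate p values and the name required_n are my own choices for illustration.

```python
from fractions import Fraction
from math import ceil

Z = Fraction("1.96")  # Za/2 for a 95% confidence interval
E = Fraction("0.02")  # margin of error of 2 percentage points


def required_n(p):
    """Sample size from n = Z^2 * p * q / E^2 with q = 1 - p, rounded up."""
    p = Fraction(p)
    return ceil(Z**2 * p * (1 - p) / E**2)


# Illustrative guesses for the true proportion (not from the problem)
candidates = ["0.1", "0.25", "0.5", "0.75", "0.9"]
sizes = {p: required_n(p) for p in candidates}

for p, n in sizes.items():
    print(f"p = {p}: n = {n}")
```

The largest sample size in the list comes from p = 0.5 (2401 people), while p = 0.25 or p = 0.75 would call for only 1801. Planning for 2401 therefore covers every other case.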

Hope that makes some sense. If not, just remember: when p and q are unknown, use 50% for both.

Professor McGuckian