When we sample, we are trying to determine something about
            a population from a subset of that population. But
            before we can determine that, we need to know what population we are
            dealing with. In terms of set theory, we need to know what
            universe we are dealing with.
            
                    It may not be easy to determine if a particular entity is in the
                    population or not. Is someone who smokes one cigarette a month when
                    he is out drinking a "smoker"? Is a raccoon who lives at the border of
                    Panama and Colombia a "North American raccoon"? Is a phone with 30% of
                    its components from China "produced in China"? Is someone who claims
                    they are going to vote, but hasn't voted in 20 years, really a
                    "likely" voter? Is someone taking only one class every year or
                    two a "student at St. Joseph's College"?
                    
                    
                    Note that these decisions can be made in a biased way: if we
                    want to exaggerate the dangers of smoking, we could count
                    as "smokers" only people who smoke over two packs a day. On the
                    other hand, if we want to minimize the dangers, we could
                    include anyone who has smoked even a single cigarette in the
                    last several decades.
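To make the effect of the definition concrete, here is a minimal sketch in Python. The records are invented purely for illustration (they are not real health data), the function name `illness_rate_among_smokers` is hypothetical, and a simple cutoff on cigarettes per day stands in for the competing definitions of "smoker".

```python
# Hypothetical records, invented for illustration only:
# (cigarettes_per_day, has_illness)
RECORDS = [
    (0, False), (0, False), (0, False), (0, False), (0, True),
    (1, False), (1, False), (1, False), (2, False), (3, True),
    (10, False), (20, True), (20, False),
    (45, True), (50, True), (60, False),
]


def illness_rate_among_smokers(records, min_cigarettes_per_day):
    """Share of 'smokers' who are ill, where a 'smoker' is anyone
    at or above the chosen cutoff. The cutoff IS the population definition."""
    smokers = [ill for cigs, ill in records if cigs >= min_cigarettes_per_day]
    return sum(smokers) / len(smokers)


# Strict definition: more than two packs (over 40 cigarettes) a day.
strict_rate = illness_rate_among_smokers(RECORDS, 41)

# Loose definition: anyone who smokes at all.
loose_rate = illness_rate_among_smokers(RECORDS, 1)

print(f"strict definition: {strict_rate:.2f}")
print(f"loose definition:  {loose_rate:.2f}")
```

The same records yield a much higher illness rate under the strict definition than under the loose one, even though nothing about the underlying people has changed; only the boundary of the population has moved.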
                
At first, it might seem plausible that if we want to learn something about a population from a subset of that population, we should carefully construct that subset to closely mirror the actual population. So if, for instance, we want to sample the American electorate about an upcoming election, we might decide, "Well, we should construct our sample to include 45% Democrat voters, 40% Republican voters, 10% Libertarian voters, and 5% Green Party voters."
This approach is seriously wrong, because it begs the question of what the population is actually like. If we already knew the composition of the population, we would not need to sample: we could simply declare, "The vote will be 45% Democrat, 40% Republican, 10% Libertarian, and 5% Green." The only reason we are sampling is that we do not know how the population as a whole will vote, and we hope that our sample will help us find out.
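Perhaps surprisingly, the best way to learn a population's characteristics from a sample is to make the sample as random as we can. But even that is fraught with difficulties: we need to sample by some means, and that means itself may bias our sample. The classic cautionary example is the 1936 Literary Digest poll, which predicted that Alf Landon would defeat Franklin Roosevelt: its mailing list was drawn largely from telephone directories and automobile registrations, which over-represented wealthier voters.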
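The difference between a random sample and a sample biased by its means of collection can be sketched with a small simulation. Everything here is an assumption made for illustration: the function name `simulate`, the 40% phone-ownership rate, and the support levels of 35% and 65% for a hypothetical candidate A are invented, not drawn from any real poll.

```python
import random


def simulate(seed=0, population_size=100_000, sample_size=2_000):
    """Compare a simple random sample with a sample drawn only from
    phone owners, in a made-up population where phone ownership is
    correlated with voting preference."""
    rng = random.Random(seed)

    # Each person is (owns_phone, votes_for_a). Assumed for illustration:
    # 40% own a phone; owners back candidate A 35% of the time,
    # non-owners 65% of the time.
    population = []
    for _ in range(population_size):
        owns_phone = rng.random() < 0.4
        p_a = 0.35 if owns_phone else 0.65
        population.append((owns_phone, rng.random() < p_a))

    true_share = sum(v for _, v in population) / population_size

    # Simple random sample: every person is equally likely to be chosen.
    random_sample = rng.sample(population, sample_size)
    random_estimate = sum(v for _, v in random_sample) / sample_size

    # Biased means of sampling: only phone owners can be reached.
    phone_owners = [p for p in population if p[0]]
    phone_sample = rng.sample(phone_owners, sample_size)
    phone_estimate = sum(v for _, v in phone_sample) / sample_size

    return true_share, random_estimate, phone_estimate


true_share, random_estimate, phone_estimate = simulate()
print(f"true share for A:        {true_share:.3f}")
print(f"random-sample estimate:  {random_estimate:.3f}")
print(f"phone-only estimate:     {phone_estimate:.3f}")
```

Under these assumptions the random sample lands close to the true share, while the phone-only sample is far off no matter how large it is, because the way it was collected excludes most of the population.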