The laboratory sets "normal" ranges for laboratory tests based upon population studies. The farther out of range the test result is, the more likely that the result reflects real disease.

A test may have a single normal range, or there may be different normal ranges based upon age, sex, race, or other factors. Sometimes, more history is needed for interpretation. Maternal serum alpha-fetoprotein (AFP) in pregnancy is one example: it depends upon the gestational age, and the later in gestation, the more AFP is normally present.

The "threshold" value to call a test result positive, or the utility of a reference range, may be determined by clinical importance. A routine pregnancy test threshold is set to yield as few false positives as possible. A screening HIV test is set to yield as few false negatives as possible.

Standard "normal" ranges for tests with numeric values are based upon the assumption that results from healthy people follow a bell-shaped (normal) curve. **"Normal" is usually defined as those test values that fall within 2 standard deviations of the mean**, which includes about 95% of all results from the healthy reference population. The standard deviation is a measure of dispersion: how widely the values are spread around the mean.
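As a minimal sketch of this definition, the range can be computed as the mean plus or minus 2 standard deviations of results from a healthy reference group. The function name `reference_range` and the data below are hypothetical and invented purely for illustration; they are not real laboratory values, and real reference studies use far larger groups.

```python
import statistics

def reference_range(values, k=2.0):
    """Return (low, high) as the mean minus/plus k sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return mean - k * sd, mean + k * sd

# Made-up results from a hypothetical healthy reference group (arbitrary units).
healthy_results = [88, 92, 95, 97, 100, 101, 103, 105, 108, 111]

low, high = reference_range(healthy_results)
print(f"reference range: {low:.1f} to {high:.1f}")
```

This approach only makes sense when the healthy results are roughly bell-shaped; for skewed data it would give a misleading range.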

For a bell-shaped (normal) curve, about 68% of the values fall within 1 standard deviation (SD) of the mean, 95% within 2 SDs, and 99.7% within 3 SDs. For most laboratory tests, the "normal range" is defined as values falling within 2 SDs of the mean. This is sometimes called the "95% confidence limits", although "95% reference interval" is the more accurate term.
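The percentages for a normal distribution can be checked with the error function from Python's standard library. This is a small sketch; the helper name `fraction_within` is chosen here for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The 2 SD figure is slightly above 95%; exactly 95% corresponds to about 1.96 SDs, which is conventionally rounded to 2.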

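Thus, for any single test, a healthy person has about a 1 in 20 chance of getting a result outside the normal range, so an "abnormal" result may really be normal. For every 100 tests ordered on healthy people, about 5 can be expected to fall outside of the normal range by chance alone.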

If you perform 20 or more independent tests (which is not uncommon on patients admitted to hospital), then there is a greater than 50% likelihood (about 64% for exactly 20 tests) that one or more tests will be "abnormal" just from statistical variation. If you keep ordering more tests just to track these down, you can go on for a long time and spend a lot of money.
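The arithmetic behind this can be sketched as follows, assuming each test is independent and has a 5% chance of falling outside its range in a healthy person. The function name `prob_any_abnormal` is hypothetical, and real test panels are often correlated, so these are rough figures.

```python
def prob_any_abnormal(n_tests, p_outside=0.05):
    """Chance that at least one of n independent tests falls outside its range."""
    return 1 - (1 - p_outside) ** n_tests

for n in (1, 5, 10, 14, 20):
    print(f"{n:2d} tests: {prob_any_abnormal(n):.0%}")
#  1 tests: 5%
#  5 tests: 23%
# 10 tests: 40%
# 14 tests: 51%
# 20 tests: 64%
```

Under these assumptions, the likelihood of at least one chance "abnormal" already passes 50% at 14 tests.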