From 1988 through 1994 the National Center for Health Statistics (NCHS) conducted the third National Health and Nutrition Examination Survey (NHANES III). It was designed to provide national estimates of the health and nutritional status of U.S. residents. The survey interviewed, examined and tested over 50,000 subjects aged 2 to 90, and NCHS has made the raw data available on the internet.
When I was looking at the Bellinger paper on the renal effects of dental amalgam in children (Bellinger et al, Neuropsychological and Renal Effects of Dental Amalgam in Children; A Randomized Clinical Trial: JAMA, April 19, 2006; Vol 295, No 15, 1775-83), I wished they had included a control group with no fillings of any kind to compare with those who had either amalgam or composite fillings. 19% of their subjects, both groups combined, had an ACR greater than 30 – anything above 30 is considered to be a sign of kidney damage. 19% with kidney damage seems like a lot to me. I would have expected that in a control group with good teeth, no more than 10% would have had an ACR over 30. Then it occurred to me that the data for such a control group probably already existed as part of the NHANES data set and all one had to do was extract it.
The children in the Bellinger trial were 6 to 10 years old at the start, so they would have been 11 to 15 at the end of the 5 year trial. In the NHANES data set, in that age range, were 850 children with perfect teeth — none missing, no cavities, no fillings. 11.7% of the boys had an ACR greater than 30; 18.6% of the girls measured over 30. A combination figure (taking into account the greater number of girls in the Bellinger trial) would be 15.4%. This was a surprise. Granted 15.4% is less than 19% so the children with good teeth in the NHANES data set had lower ACR measurements than the children in the Bellinger trial. But they were much closer than I had expected. Too close to draw firm conclusions.
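The weighting behind the combined figure is not spelled out here. As a rough check, and assuming the 15.4% is a plain weighted average of the two sex-specific rates, the share of girls it implies can be backed out like this (the function name is made up for illustration):

```python
def implied_share(rate_a, rate_b, combined):
    """Weight on rate_b that makes a weighted average of the two rates equal `combined`."""
    return (combined - rate_a) / (rate_b - rate_a)

# Percentages quoted in the text for children with perfect teeth.
boys_rate, girls_rate, combined_rate = 11.7, 18.6, 15.4

girl_share = implied_share(boys_rate, girls_rate, combined_rate)
print(f"Implied share of girls: {girl_share:.1%}")
```

Under that assumption the combined figure corresponds to a group that is a little over half girls.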
This was disappointing, but having made the dive into the NHANES sea of information I decided to look around. I was looking, in particular, for information about mercury sensitivity and for evidence that amalgam is (or is not) causing problems in a significant portion of the population. My first discovery was that the word 'amalgam' does not appear anywhere. Despite incredibly detailed dental examinations, where each face of each tooth is examined for fillings and cavities, no distinction is made between composite and amalgam fillings. This was a setback. Nevertheless, at the time the survey was being done, composite fillings were just starting to be widely used. So if I compared adults with perfect teeth to those with fillings, the fillings would be mainly amalgam.
It didn't take long to confirm that the younger the subject, the more likely they were to have perfect teeth; the older the subject, the more likely to have physical problems. The data set has dental examinations of subjects from 5 to 90 but comparing 5 year olds to 90 year olds would not be very revealing. On the other hand if the age range selected for analysis were very narrow, the number of subjects in some categories would be too small to be statistically interesting. In professional sports, an athlete performs near his peak from early twenties to late thirties. In this range the subjects are mature but have not begun to be troubled with the ailments associated with aging. There were lots of subjects with perfect teeth up to age 38 but the number tapered off after that so I settled on an age range of 20 to 38.
Also, males and females test differently in a lot of ways so they were considered separately when looking for correlations between health problems and the state of their teeth.
My biggest surprise came with the computer administered neurobehavioral tests which test reaction time and short term memory. It appears from the NHANES data that people with bad teeth perform better on these tests than people with perfect teeth. At first I thought I had misunderstood the test or that I had made an error somewhere. So I repeated the calculations and came at it from several angles but kept getting the same result. Subjects with bad teeth were scoring better than subjects with good teeth. I declined to believe that cavities and fillings were beneficial. The alternative was that something was suppressing the scores of subjects with perfect teeth. I began to wonder if fluoride could be a factor. On a whim, I googled fluoride,neurobehavioral. When two words are separated by a comma in a google search, you get only documents which contain both words. I thought I was shooting at a small target and wouldn't have been surprised by zero hits. I got over 28000.
What I found was that a lot of mice, rats and rabbits have been testifying to the negative effects of fluoride. Not much testing on humans, but in a comparison of Chinese villages where the only difference was the amount of fluoride in the water, the fluoride appeared to be costing the children about 10 IQ points. Such tests will have to be repeated by different scientists on different subjects before becoming established scientific fact. In the meantime, fluoride is a wild card. My working assumption is that fluoride is probably harmful in ways not yet fully understood.
The Second Look website has an extensive Bibliography of Scientific Literature on Fluoride.
The NHANES data includes blood (and urine) tests for lead, cadmium, iron, selenium and a host of other blood components, good and bad, but nothing on fluoride (or mercury). Faced with the experimental evidence from other sources that fluoride affects neurobehavioral test results I tried to find a way to select a subgroup from the NHANES subjects who did not owe their good teeth to high levels of fluoride. I was unable to find a way to do this.
If there are people who score lower on neurobehavioral tests because of amalgam, I was not going to be able to show it using the NHANES data. If there are amalgam sensitive people, and if their neurobehavior is affected by amalgam, they are either fewer in number than those affected in the same way by fluoride or amalgam has a smaller negative effect on neurobehavior than fluoride.
My primary interest when I first delved into the NHANES data was to look at renal function using ACR (albumin/creatinine ratio) measurements. I did find that subjects with fillings have higher ACR measurements (and therefore less efficient kidneys) than subjects with good teeth but the differences are small and not consistent. I was left wondering if fluoride has some effect on the kidney as well as on neurobehavior.
Turning to other health problems, I found two comparisons where there were clear differences between people with good teeth and people with fillings. One was in the use of over the counter pain medication like Aspirin, Tylenol and Ibuprofen. People with fillings consume more of these pills in a month than do people with perfect teeth. The other was respiratory and allergy problems.
Taking the latter first, this was also noticed in the Bellinger trial where 13 of the amalgam subjects had some sort of respiratory disorder compared with 7 of the composite subjects. This is a big difference from a percentage point of view but the numbers involved are too small to be significant: the P-value of 0.25 is not even close to the 0.05 required for statistical significance. A P-value of 0.25 means that if the two treatments really had no different effect, random variation alone would produce a gap at least this large about 25% of the time. It does not mean there is a 75% chance that the difference is real.
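One way to see where a P-value for a 13 versus 7 comparison comes from is Fisher's exact test, which asks how often a split at least this lopsided would turn up if the group labels made no difference. The sketch below is only an illustration: the group size of 267 per arm is an assumption, not a figure given in this article, and the test used in the Bellinger paper may have been a different one.

```python
from math import comb

def fisher_exact_two_sided(events_1, events_2, n_1, n_2):
    """Two-sided Fisher's exact P-value for events in two groups of size n_1 and n_2."""
    total_events = events_1 + events_2
    denominator = comb(n_1 + n_2, total_events)

    def probability(k):
        # Chance that exactly k of the events fall in group 1 if group labels are irrelevant.
        return comb(n_1, k) * comb(n_2, total_events - k) / denominator

    observed = probability(events_1)
    lowest = max(0, total_events - n_2)
    highest = min(n_1, total_events)
    # Add up every split that is no more likely than the one actually observed.
    return sum(
        probability(k)
        for k in range(lowest, highest + 1)
        if probability(k) <= observed * (1 + 1e-9)
    )

GROUP_SIZE = 267  # assumption for illustration; not stated in this article
p_value = fisher_exact_two_sided(13, 7, GROUP_SIZE, GROUP_SIZE)
print(f"Two-sided P-value: {p_value:.2f}")
```

The printed value can be set beside the 0.25 quoted above. With a different assumed group size the number would shift somewhat.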
In NHANES III, subjects were asked diagnostic questions about persistent coughs, wheezing, shortness of breath. Those questions concerned primarily with respiratory problems were combined with questions about allergy symptoms, like watery itchy eyes, and about hayfever and specific animal allergies.
1500 males aged 20 to 38 with 4 to 12 fillings had on average 23% more such symptoms than did 500 males in the same age range with perfect teeth. Choosing an appropriate test for P-value for such a comparison is beyond my capabilities, but the Excel ttest function compared the two columns of figures and gave a P-value of 0.000004 which is very highly significant. My guess is that even if the Excel function is inappropriate for this data set, a proper test would still produce an interesting P-value.
Females with fillings had 24% more symptoms and a similar P-value (0.000025).
Having found a clear correlation, I wondered if the culprit had to be the fillings or if there was a credible alternative. Smoking, for example, can cause respiratory problems. If there were more smokers among those with dental fillings then smoking could be the culprit and the fillings just incidental. In fact, I found from the NHANES data that smokers are more likely than non-smokers to have bad teeth and, as expected, smokers have more respiratory problems than non-smokers.
So I looked again at respiratory and allergy problems versus dental status, separating smokers from non-smokers. Among smokers who had perfect teeth versus smokers who had dental fillings there was no correlation between respiratory and allergy problems and dental status.
On the other hand, for the same comparison using only non-smoking subjects there was still a very strong correlation for both males and females. What this means is that smoking is a confounding factor and that smokers should be avoided in any comparison where the effect of smoking is not one of the factors being investigated.
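The idea of separating smokers from non-smokers before comparing dental groups can be sketched in a few lines. The records below are invented toy values, not NHANES data, and the tuple layout is an assumption made for the example. They are chosen only to show the pattern described above, where the gap appears among non-smokers and vanishes among smokers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (smoker, has_fillings, symptom_count). Not NHANES data.
records = [
    (False, False, 1), (False, False, 0), (False, True, 3), (False, True, 2),
    (True, False, 4), (True, False, 5), (True, True, 4), (True, True, 5),
]

def stratified_means(rows):
    """Average symptom count for each (smoker, has_fillings) combination."""
    groups = defaultdict(list)
    for smoker, fillings, symptoms in rows:
        groups[(smoker, fillings)].append(symptoms)
    return {key: mean(values) for key, values in groups.items()}

means = stratified_means(records)
for smoker in (False, True):
    # Compare fillings against perfect teeth within one smoking stratum at a time.
    gap = means[(smoker, True)] - means[(smoker, False)]
    label = "smokers" if smoker else "non-smokers"
    print(f"{label}: fillings minus perfect teeth = {gap:+.1f} symptoms")
```

Comparing within each stratum keeps the effect of smoking from being mistaken for an effect of fillings.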
There is overlap between symptoms of respiratory problems and allergy problems which is why they are grouped together in NHANES and why I left them grouped together. But for the smokers I tried separating symptoms into those which are primarily respiratory or allergy problems. It appeared that smoking was definitely affecting respiration but having little effect on allergies.
At this point I did some thinking about the usefulness of the NHANES III data for resolving the amalgam question. I knew when I started this exercise that I might be able to put amalgam in the spotlight but probably not prove it was doing harm. Smoking is just one example of something that could interfere with the comparisons. Fluoride might be another. Perhaps diet. Perhaps socio-economic status. And then there was the fact that not all fillings are amalgam. So even if dental fillings could be implicated beyond any doubt, amalgam might not be the culprit (though it would be quite surprising if it were not).
Those who recommend the continued use of dental amalgam maintain that the amount of mercury absorbed from amalgam fillings is too small to cause physical problems except, perhaps, in very rare individuals who are especially sensitive to mercury. A useful first step in proving harm from amalgam would be to show that amalgam is, in fact, affecting a significant portion of the public, say 5% to 10%, even if it is only proven to be causing minor annoying problems like allergies and mild headaches. Because if it is causing these minor problems it is obviously interfering with some of the body's processes. And if it is capable of doing that, it may be that the only difference between these minor problems and more serious ones like multiple sclerosis (MS), Alzheimer's (AZ) and Parkinson's disease (PD) is that the causes of the more serious problems are harder to pin down.
The NHANES III data set is useful for identifying appropriate targets. It might seem logical, for example, that if some of us are being affected by amalgam then some signs of harm should show up on neurobehavioral tests. The NHANES III data warns us that there are confounding factors we don't understand and that, for the time being, neurobehavioral testing would be a waste of time. Likewise, even though the kidney is known to be affected by mercury, there are so many other things that affect the kidney that pinning the amalgam tail on that particular donkey is messy. On the other hand, again from the NHANES III data, allergy problems look like a sitting duck. A comparison of allergy problems with dental status would have to be done very carefully to eliminate confounding influences but there is a very good chance that the results of the comparison would be unequivocal.
As for the number of people who are sufficiently mercury sensitive to be affected by amalgam, if the number were much less than 1% they wouldn't have enough collective weight to show up on the NHANES data comparisons.
Comparing the average number of respiratory and allergy symptoms for good versus filled teeth demonstrates that there is a significant correlation and this makes amalgam a prime suspect. Another way to look at the same group is to list the subjects by the number of respiratory and allergy symptoms they have. This provides ballpark figures for the number of subjects whose dental fillings are causing harm.
In the table below, for comparison purposes, the number of each group is adjusted proportionally to what it would be if there were exactly 1000 subjects in each group.
Allergy and Respiration Symptoms in Non-Smoking Subjects with Perfect Teeth Versus Non-Smoking Subjects with 4 to 12 Fillings (each group scaled to 1000 subjects)

| Number of symptoms | Males: perfect teeth | Males: 4 to 12 fillings | Males: difference | Females: perfect teeth | Females: 4 to 12 fillings | Females: difference |
|---|---|---|---|---|---|---|
| 0 | 181 | 129 | -52 | 232 | 132 | -100 |
| 1 | 275 | 217 | -58 | 258 | 207 | -51 |
| 2 | 112 | 137 | 25 | 77 | 125 | 48 |
| 3 | 100 | 114 | 14 | 90 | 125 | 35 |
| 4 | 163 | 137 | -26 | 77 | 107 | 30 |
| 5 | 56 | 80 | 24 | 90 | 99 | 9 |
| 6 | 62 | 78 | 16 | 103 | 80 | -23 |
| 7 | 31 | 40 | 9 | 32 | 52 | 20 |
| 8 | 6 | 26 | 20 | 19 | 29 | 10 |
| 9 | 0 | 22 | 22 | 6 | 18 | 12 |
| 10 | 6 | 12 | 6 | 0 | 15 | 15 |
| 11 | 0 | 4 | 4 | 6 | 6 | 0 |
| 12 | 6 | 2 | -4 | 6 | 3 | -3 |
| 13 | 0 | 0 | 0 | 0 | 1 | 1 |
| 14 | 0 | 2 | 2 | 0 | 0 | 0 |
When we go from the group with perfect teeth to the group with 4-12 fillings, the number of subjects with zero symptoms drops by a substantial amount. No doubt some of the people with good teeth who started with, say, 4 symptoms would get even more symptoms if they suddenly acquired a few fillings. But the table above doesn't give any indication of how many such would be so affected because the 4-symptom category would lose some and gain some. The advantage of looking at the zero-symptom category is that it just loses. And the percent of subjects dropping to a lower level gives us some indication of how many subjects are harmed by fillings. This is unlikely to be an accurate indicator but it provides an order of magnitude.
There are 52 fewer males with zero symptoms in the fillings group compared with the 181 subjects in the perfect teeth group. That is a decrease of 29%. If the decrease were due to amalgam alone, it would suggest that 29% of the male population is sufficiently mercury sensitive to be harmed by amalgam. The decrease for females is 43%. Note also that for both male and female, the number of subjects in the double figures goes two or three lines further down the list for the fillings group than for the perfect teeth group.
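The percentages quoted here can be reproduced from the table. The sketch below copies the per-1000 counts from the table above and recomputes the drop in the zero-symptom category, along with the average number of symptoms in each group. The function names are my own, and because the counts are rescaled to 1000 per group they are not suitable for a significance test.

```python
# Per-1000 counts copied from the table above, indexed by number of symptoms (0 to 14).
male_perfect    = [181, 275, 112, 100, 163, 56, 62, 31, 6, 0, 6, 0, 6, 0, 0]
male_fillings   = [129, 217, 137, 114, 137, 80, 78, 40, 26, 22, 12, 4, 2, 0, 2]
female_perfect  = [232, 258, 77, 90, 77, 90, 103, 32, 19, 6, 0, 6, 6, 0, 0]
female_fillings = [132, 207, 125, 125, 107, 99, 80, 52, 29, 18, 15, 6, 3, 1, 0]

def zero_symptom_drop(perfect, fillings):
    """Fractional decrease in the zero-symptom category between the two groups."""
    return (perfect[0] - fillings[0]) / perfect[0]

def mean_symptoms(counts):
    """Average number of symptoms per subject, weighting each row by its count."""
    return sum(k * n for k, n in enumerate(counts)) / sum(counts)

for label, perfect, fillings in [
    ("males", male_perfect, male_fillings),
    ("females", female_perfect, female_fillings),
]:
    drop = zero_symptom_drop(perfect, fillings)
    print(f"{label}: zero-symptom drop {drop:.0%}, "
          f"mean symptoms {mean_symptoms(perfect):.2f} vs {mean_symptoms(fillings):.2f}")
```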
Of course, if we were dealing with identical twins, one of whom had perfect teeth, the other fillings, we could take these numbers more seriously. As it is, subjects with perfect teeth must have different diets or habits or body chemistry (in order to avoid the cavities that sent the other subjects to the dentist) and some of those differences could be affecting the numbers. But 29% is a huge margin to explain away. Even if confounding factors were found to chop that figure in half, that would leave an estimate of 5% to 10% looking very conservative.
Turning to over-the-counter pain medication, this comparison has a personal note. I started getting headaches when I was a child and by the time I was 30 was getting 20 to 30 headaches per month. These were not migraines. They were uncomfortable but not debilitating. And they responded well to Aspirin and even better to 222's. From age 30 to 60 I took three Aspirins or two 222's just about every day. Then I got all my fillings changed from amalgam to composite. Six months after the last of the amalgam was removed, my headaches started to become less severe and less frequent. After another year, I stopped getting headaches.
Needless to say, I looked in the NHANES data set for questions about headaches. One question in the youth questionnaire; none in the adult questionnaire. Disappointing. Somewhat later, I realized I could get to the same place by tracking over-the-counter analgesics (which are covered in NHANES). Aspirin has many other uses but in the USA where 222's are a prescription drug, Aspirin, Tylenol and Ibuprofen (and similar drugs by other names) would be the drugs of choice for the kind of headache I used to get. Migraines would be a different proposition – over-the-counter analgesics are pretty much useless for that kind of headache.
Both males and females with 4 to 12 fillings took 40% more pills than subjects with perfect teeth. The Excel ttest function gave a P-value of 0.028 for the males and 0.060 for the females. Not nearly as impressive as the Excel P-value numbers for the respiratory/allergy comparison but still attention grabbing.
I had to downgrade these figures a little when smokers were considered. Smokers tend to consume more aspirins than non-smokers. When comparing non-smokers only (a somewhat smaller group), the males with fillings still used 40% more over-the-counter pain killers than males with perfect teeth but the Excel P-value was now 0.13 – no longer statistically significant but still interesting. I assume that the rise in the P-value was due in large part to the reduction in sample size (to less than half).
Among the male subjects in the perfect teeth group, half used no over-the-counter pain killers. Going from the perfect teeth group to those with 4-12 fillings, the number of subjects who used no pain killers decreased by 27%.
Females were less consistent. Among non-smokers there was just a 19% increase in over-the-counter pain killer use, giving an Excel P-value of 0.23. Going from the perfect teeth group to those with 4-12 fillings, the number of subjects who used no pain killers decreased by 13%.
The indications here are that investigating mild headaches and dental amalgam would not produce as clear a correlation as allergies and dental amalgam. So if choosing between them, the easy first choice would be allergies.
One of the advantages of doing epidemiological studies on allergies and mild headaches is that they appear to be associated with dental fillings in a substantial number of subjects. An interesting characteristic of both problems is they are generally treated with over-the-counter medicine with the main advisor being a pharmacist; so even though they are quite common problems, they stay below the radar of the medical fraternity. This means that a study could not be done using existing medical records.
The advantage of choosing an easy target is that definitive results would be possible in a modest study. If a correlation can be established beyond doubt, then it would be easier to justify the major sort of study necessary to show whether there is also a correlation between amalgam and less common but more serious health problems like Alzheimer's, MS and PD.
A medical research professional was kind enough to read this article but was not impressed:
"All observational data (like NHANES) can do is suggest possible associations. Unfortunately, there are always known and unknown biases in how subjects select themselves into the comparison groups. Only random assignment to comparison groups (randomized trials) can remove these biases. There are so many potential confounders that observational data are almost useless for this kind of investigation. The answers to the amalgam question will never be found in observational data."
I think this is a just criticism, especially as regards my hopes when I first looked into the NHANES data. I do know the reason for randomized trials but did not recognise at first how far removed the NHANES data is from the data that would be produced in a proper trial.
Designing a study for this question would be difficult, though. One of the dilemmas faced by experimenters is that some problems do not lend themselves to randomized trials. If one wanted to set up a trial comparing people with fillings to people with perfect teeth, the only way that random assignment would work is to start with subjects, all of whom had perfect teeth. This sort of experiment has been done with sheep and other animals but there are ethical concerns with giving people fillings they don't need.
I do still think that looking at something like the NHANES data is a useful first step in preparing to do a randomized trial — it might suggest a comparison that had not occurred to the experimenter and it might warn of difficulties that could be avoided.
For example: my analysis suggests that the easiest amalgam association to prove would be amalgam and allergies. But designing a trial would still be tricky — in the Bellinger trial, no association was found between amalgam and allergies. There are any number of possible explanations for this, including random variation. But possibly the most important explanation is that I was looking at data for adults and the Bellinger subjects were children. When I looked at NHANES data for children, I found no correlation between fillings and allergies. In fact, if anything, fillings seem to protect against allergies. The indication here is that the body chemistry of children and adults, which is quite different, may be important in determining whether the mercury sensitive among us get allergies from amalgam. And this should be taken into account in any trial that compares amalgam and allergies.
Drawing conclusions from the NHANES data is a bit like the farmer/hunter who looks out the kitchen window and sees a moose in the pasture. This is an unusual occurrence. There are a number of possibilities: (1) It really is a moose. (2) It's a brown cow which, through tricks of early morning light and shadow and an imperfect pane of glass in the window, looks just like a moose. (3) It's a mechanical moose of the sort that game wardens use to catch people hunting out of season. (4) It's the Berkowitzes in a moose costume.
It's probably a moose. But the farmer knows there are other explanations for what he is seeing. So does he ignore whatever it is and return to breakfast? Not bloody likely.
Medical researchers are a little more cautious.