Seven Rules for Social Research
Glenn Firebaugh, Pennsylvania State University
October 31, 2007
// For Princeton University Press Website //
Rule 1.
1a. It does not make sense to speak of abortion attitudes influencing one’s gender, so the relevant comparison is 41.5 percent of men agree vs. 39.4 percent of women agree.
1b. Men are more likely than women to approve of abortion for any reason at all, but the difference is small.
3. Women are consistently less likely than men to approve of abortion for the six specific reasons, although the gender difference varies by the question asked. The gap is largest for abnomore (4.5 percentage points) and smallest for abdefect (0.9 percentage points).
5a. Race differences are statistically significant (chi-square = 22.86; p < .005). Whites are the most supportive of abortion rights (40.9 percent approve of abortion for any reason); blacks are the least supportive (36.8 percent).
5b. Answers will vary according to what you expected. One possible reason for the lower support of nonwhites is the role of religion. African-Americans, although economically liberal, tend to hold fairly conservative social attitudes. Hispanics (who fall into the “other” category) tend to be members of the Catholic church, which is firmly opposed to abortion.
7a. The association between hunting and frequency of intercourse is strongly statistically significant (chi-square = 350.16; p < .005).
7b. Respondents who hunt (or who have a spouse who hunts) tend to have sex more frequently than respondents in a marriage where neither partner hunts. Of the couples where neither partner hunts, almost one quarter (24.9 percent) have not had sex at all in the past year. Among couples where both hunt, by contrast, only 3.5 percent have not had sex at all in the past year. Indeed, among couples where both hunt, nearly half report having sex two or more times a week (35.2 + 11.9 = 47.1 percent). Among couples where neither hunts, fewer than one in four report having sex two or more times each week (18.7 + 5.5 = 24.2 percent).
Rule 2.
1a. The association is both positive (the healthier you are, the happier you tend to be) and statistically significant (chi-square = 2,780.6; p < .005).
1b. Although there are exceptions – some people in poor health are happy and some people in excellent health are not – the association between health and happiness is strong. Of those who report being in excellent health, almost half (46.4 percent) say they are “very happy,” while only 17.6 percent of those in poor health say they are very happy. And over one third of respondents in poor health report that they are “not too happy” whereas only 6.5 percent of those in excellent health report being not too happy.
1c. No, the probability of being “very happy” as opposed to “pretty happy” or “not too happy” does not change linearly across the four health categories. Being in fair health as compared to poor health does not substantially change the likelihood that one will be “very happy” (21.2 percent of those in fair health versus 17.6 percent of those in poor health report being “very happy”). However, being in excellent health as opposed to good health makes one far more likely to be very happy (46.4 percent versus 28.9 percent).
1d. Again the change is not linear. Here, though, the differences are greatest at the bottom end of the health scale. Among those reporting fair health, 20.7 percent report being “not too happy.” Among those reporting poor health, this percentage balloons to 35.1. The differences are smaller for the other adjacent categories (fair versus good health, and good versus excellent health).
It appears, then, that the effect of health on happiness is particularly pronounced at the extremes: Being in excellent health greatly boosts your chances of being very happy, while being in poor health greatly boosts your chances of being unhappy.
3a. Male housekeepers are more likely than men in the paid labor force to be “not too happy” (20.8 percent versus 9.7 percent) and less likely to be “very happy” (22.6 percent versus 31.7 percent). This differs from the pattern for women in two ways. First, among women, housekeepers are somewhat more likely (than women in the paid labor force) to say that they are “very happy”; among men, housekeepers are less likely to report being very happy. Second, the difference in happiness between housekeepers and paid workers is relatively small for women, but large for men.
3b. A possible explanation is that men are more likely to derive their social status from their jobs, and housekeeping is traditionally not considered a prestigious position. This sense of relatively low status may make male housekeepers less happy. Relatedly, American men are less likely (than American women) to choose to be housekeepers: Perhaps they could not find work, or were laid off from their jobs. Or perhaps their spouse has higher earnings potential, so the couple has decided that it makes economic sense for the husband to keep house. This violation of the traditional division of labor may stigmatize such men and lower their levels of happiness.
5a. The differences are much bigger for health than for gender. There is a 0.574-unit difference between the mean value of happiness for respondents in excellent health (2.399) and those in poor health (1.825). This gap of 0.574 is 90 percent of the overall standard deviation (0.638) of happiness. For gender, in contrast, the gap is only 0.009, which is less than 2 percent of the overall standard deviation.
5b. In comparing averages of the happiness scale, we assume that the intervals between the categories of happiness are equally large (i.e., that the distance between “not too happy” and “pretty happy” is the same as the distance between “pretty happy” and “very happy”).
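The comparison in answer 5a can be checked with a few lines of arithmetic. The sketch below uses only the means and the standard deviation quoted in 5a; the function and variable names are illustrative and are not part of the exercise.

```python
# Figures quoted in answer 5a (happiness scored on a three-point scale).
MEAN_EXCELLENT_HEALTH = 2.399
MEAN_POOR_HEALTH = 1.825
GENDER_GAP = 0.009
OVERALL_SD = 0.638


def gap_in_sd_units(gap: float, sd: float) -> float:
    """Express a difference in means as a fraction of the overall standard deviation."""
    return gap / sd


health_gap = MEAN_EXCELLENT_HEALTH - MEAN_POOR_HEALTH
health_ratio = gap_in_sd_units(health_gap, OVERALL_SD)
gender_ratio = gap_in_sd_units(GENDER_GAP, OVERALL_SD)

print(f"health gap: {health_gap:.3f} ({health_ratio:.0%} of the SD)")
print(f"gender gap: {GENDER_GAP:.3f} ({gender_ratio:.1%} of the SD)")
```

Note that dividing by the standard deviation relies on the same equal-interval assumption described in 5b.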
7. Overall, older adults are somewhat more likely than younger respondents to be “very happy” as compared to “pretty happy” or “not too happy.” For example, 35.5 percent of respondents 70 and over are “very happy,” while only 28.8 percent of respondents under 30 are. On the other hand, older respondents are not substantially less likely to be “not too happy” as compared to “pretty happy” or “very happy” (12.9 percent of those who are 70 and older, and 12.1 percent of those in their 60s, report being “not too happy,” which is about the same as those under 30).
9a.-b. For 9a, your answers will vary depending on your predictions. You should have found that, averaged across the 1980-2004 period, Americans are least tolerant of extramarital sex (77 percent believe it is “always wrong”), followed by sex between unmarried teens (70 percent), homosexuality (68 percent), and premarital sex (27 percent). These averages obscure substantial differences in trends over time, however. Disapproval of homosexuality declined dramatically from 1980 to 2004. By contrast, disapproval of sex between unmarried teens increased somewhat, and disapproval of extramarital sex rose even more. Attitudes toward premarital sex, on the other hand, have not changed noticeably.
Rule 3.
1a. This provides some support for the argument, but not strong support. The negative sign of the correlation coefficient signifies that higher levels of government intervention are associated with lower levels of income inequality, but the correlation is fairly weak, especially for aggregate data such as these. The big story here is how much this correlation shrinks when you remove just eight cases (the eight nonmarket economies). With the nonmarket economies included, government intervention explained over half of the variance in the level of income inequality within nations. Without those economies, government involvement accounts for only 2.6 percent of the variation in nations’ income inequality (.16² = .0256).
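The "variance explained" figure in this answer is the square of the correlation coefficient. A minimal sketch of that step follows; the function name is illustrative, and the correlation is entered as -0.16 because the answer gives its size as .16 and its sign as negative.

```python
def variance_explained(r: float) -> float:
    """Share of the variance in one variable accounted for by the other (r squared)."""
    return r * r


# Correlation for the market economies only, as described in the answer.
r_market = -0.16
share = variance_explained(r_market)

print(f"r squared = {share:.4f}, or about {share * 100:.1f} percent of the variance")
```

Squaring removes the sign, so a correlation of -0.16 and one of +0.16 account for the same share of variance.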
3. Your plot should look something like the following, where the top dashed line represents the mean value of gini for the 24 market economies, the bottom dashed line represents the mean value of gini for the 8 non-market economies, and the solid line represents the regression line for the effect of dgi on gini among the 24 market economies.

3a. --
3b. Your answers will vary, but the value may well be larger than you had guessed.
3c. Your findings should cast doubt on the conclusions of the original study. The study tests a theory that applies only to market economies with data that includes non-market economies. Once we analyze only the market economies, the association between government intervention and income inequality becomes much weaker (as shown in Question 1).
5a. In general, you should find that the mean of the Newgini25-DGI correlations is closer to zero than the original Gini-DGI correlation (see question 7 in this set of student exercises).
5b. In general, you should find that the variance of the Newgini25-DGI correlations is larger than the variance of the Newgini01-DGI correlations. That is, the correlations vary more with greater measurement error (see question 7 in this set of student exercises).
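The two patterns in 5a and 5b can be illustrated with a small simulation. The sketch below does not use the exercise data: the 24 cases, the noise levels, and all names are hypothetical, chosen only to show that adding more random error to one variable pulls the average correlation toward zero and makes the correlations more variable.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical stand-ins for DGI (xs) and Gini (ys) in 24 cases; NOT the exercise data.
xs = [float(i) for i in range(24)]
ys = [40.0 - 0.5 * x + (4.0 if i % 2 == 0 else -4.0) for i, x in enumerate(xs)]


def simulate(noise_sd, n_reps=2000, seed=1):
    """Add random error with the given SD to ys n_reps times.

    Returns the mean and the variance of the resulting correlations.
    """
    rng = random.Random(seed)
    rs = []
    for _ in range(n_reps):
        noisy = [y + rng.gauss(0.0, noise_sd) for y in ys]
        rs.append(pearson_r(xs, noisy))
    return statistics.fmean(rs), statistics.pvariance(rs)


true_r = pearson_r(xs, ys)
mean_small, var_small = simulate(noise_sd=1.0)    # little measurement error
mean_large, var_large = simulate(noise_sd=10.0)   # much measurement error

print(f"error-free r: {true_r:.3f}")
print(f"small error:  mean r = {mean_small:.3f}, variance = {var_small:.5f}")
print(f"large error:  mean r = {mean_large:.3f}, variance = {var_large:.5f}")
```

Your own simulated values will differ, as noted in answer 7, but the ordering of the means and variances should follow the same pattern.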
7. Your particular results will differ from those obtained by others in the course, depending on your simulations. But the general patterns should be along the lines stated in the text.
Rule 4.
1. Question wording is likely to be a major factor; calling attention to “responsibilities” might reduce support for the ERA, and this tendency might have increased over time as interest groups opposed to the ERA called attention to the potential unintended consequences of “equal rights.” It is nevertheless hard to know if this is the decisive difference between the polls, since there are other differences as well. Note in particular that the Gallup polls excluded those respondents who said they had not “heard or read about” the ERA. Over time presumably more and more people would hear about the ERA. It is important to know, then, whether those who learned about the ERA later differed from those who knew about it in 1976. One possibility is that less educated Americans tend to be less politically informed on issues such as the ERA. If so, and if the less educated also tend to be more concerned about the erosion of traditional values, then the apparent decline in support for the ERA in the Gallup polls could be due, at least in part, to the shrinking proportion of those who are uninformed about the ERA. (This possibility seems less likely when we compare the GSS and ANES results, however: See question 2.)
3a. The RDD sample reports higher voting rates (79.2 percent) than does the FTF sample (73.5 percent). Because the difference is statistically significant (chi-square = 6.7; p = .01), it should not be dismissed merely as sampling error. Although this 5.7 percentage-point difference might appear to be relatively unimportant, the implications are huge when we extrapolate the difference to the entire electorate, since it represents millions of voters. Both of these samples are intended to generalize to the American electorate, so our estimates of voting could vary dramatically based on the sample used.
3b. Your answers will vary. Question 4 addresses potential advantages and drawbacks of each method. Perhaps you will reconsider your answer after you have completed questions 4 and 5.
5a. Respondents from the RDD sample are more likely to have a college degree than those from the FTF sample (33.6 percent versus 28.7 percent). This difference is statistically significant at the .05 level (chi-square = 5.0; p = .03).
5b. Differences in the frame populations: As you saw in Question 4, the frame population of the RDD sample is likely to be slightly wealthier and less transient than the frame population of the FTF sample. These are precisely the sorts of people who would be more likely to hold a college degree.
Differences in response rates: The ease of refusing a telephone interview relative to rejecting a personal interview might depress the representation of the less-educated in the RDD sample even more than it does in the FTF sample.
The arguments just given imply that the higher proportion of college graduates in the RDD sample reflects a real difference in the two samples. The measurement error argument adds a twist: Perhaps the two samples don’t differ with respect to education after all, or the difference is exaggerated.
Measurement error: As just noted, the issue here is somewhat different. The question is not why RDD respondents tend to be more educated, but how much more educated they actually are (that is, whether the observed educational differences are exaggerated because people are more likely to overstate their educational attainment when they are interviewed over the telephone). Because respondents may be more likely to exaggerate their educational credentials to a phone interviewer who cannot use the respondent’s neighborhood, home, and furnishings as clues about the truth of the respondent’s claims about a college degree, there may be some merit to this argument. Nevertheless, most of the educational disparity between the FTF and RDD samples probably is real, and is due to differences in the respective frame populations and response rates.
Rule 5.
1a. Yes, marital status and health both appear to affect happiness. The regression coefficients for both variables are statistically significant at p <.0005. Controlling for health status, married respondents reported being happy for an average of 0.379 more days over the last week than did unmarried respondents. However, the effect of health seems to be stronger. Controlling for marital status, each one-unit increase on the self-reported health scale raises the number of days of happiness in the past week by 0.335 days, on average. This means that the healthiest respondents (a 5 on the health scale) were happy for an average of 1.34 more days than were the least healthy respondents (a 1 on the health scale) (4 x 0.335 = 1.34).
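The comparison of the two effects in 1a is a matter of multiplying each coefficient by the relevant difference in the predictor. The sketch below assumes, as the answer implies, that marital status is coded 0/1 and that the health scale runs from 1 to 5; the names are illustrative.

```python
# Coefficients quoted in answer 1a (outcome: days happy in the past week).
B_MARRIED = 0.379   # married (1) versus unmarried (0), controlling for health
B_HEALTH = 0.335    # per one-unit increase on the health scale, controlling for marital status


def predicted_gap(coefficient: float, difference_in_x: float) -> float:
    """Predicted difference in the outcome for a given difference in one predictor."""
    return coefficient * difference_in_x


health_gap = predicted_gap(B_HEALTH, 5 - 1)      # healthiest versus least healthy
marriage_gap = predicted_gap(B_MARRIED, 1 - 0)   # married versus unmarried

print(f"health, 5 versus 1: {health_gap:.2f} days")
print(f"married versus not: {marriage_gap:.3f} days")
```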
1b. Marital status: There could be omitted variables that affect both the likelihood of marriage and the likelihood of happiness. There are a number of possibilities: financial security, physical attractiveness, a positive outlook on life, etc. In other words, it is not marriage per se that makes one happy, but other individual traits that affect both one’s marriage prospects and one’s happiness.
Note that “health” is not an appropriate answer here, since the effect of health is controlled for in the model. Note also that our estimate of the effect of marital status could be biased because of an endogeneity problem. Our model assumes that marital status causes happiness, when it could be the other way around: happy people may be more attractive mates and thus more likely to marry and stay married.
Nor is it appropriate here to cite factors that mediate the effect of marital status on happiness. If you say, for example, that married people are happier because they are more likely to have children, and children bring happiness, then you are suggesting the causal chain marriage → children → happiness. By suggesting a mechanism (in some fields this is called a “pathway”) by which marriage affects happiness, you are giving a reason for believing that the estimates in our equation are causal, rather than a reason for believing that they are inflated estimates of the actual causal effect.
Health status. Our estimate of the effect of health on happiness would also be biased if there are prior individual traits that affect both one’s health and one’s happiness. (Again, it won’t do to cite mediating factors M in the causal chain health → M → happiness.) Socioeconomic status is one possibility: Because people with more education and higher incomes tend to be both healthier and happier, some of the estimated effect of health in our model may be due instead to differences in socioeconomic status (which we don’t control for). Social networks are another possibility. Socializing often with good friends is likely to be good for one’s health as well as for one’s happiness. A lonely or overly-busy life with little relaxation and socialization, by contrast, is likely to impair one’s health as well as one’s happiness. Finally, as in the case of marital status, durable temperament traits, such as outlook on life, may play a role. A positive outlook on life might boost both health and happiness.
Finally, note that there is the possibility of endogeneity bias here, just as in the case of marital status: Perhaps happiness affects health, as well as the other way around.
3. We explain less variance with the fixed-effects model (R² = .018 versus .039 for the cross-section model). Typically FE models explain less variance. We presume that this is because they are more likely to avoid omitted-variables bias and capture only actual causal effects. We are more likely to avoid (or at least reduce) omitted-variables bias because we focus on change in individuals over time, thus filtering out the effects of unmeasured causes that vary across individuals but are stable for a given individual over time. In other words, the FE model uses only the components of marital status and health that change over time to explain change in happiness. In our cross-section analysis, by contrast, the variance in happiness that we explained by marital status and health included whatever component of that variance was due to the (unanalyzed) association of marital status and health with stable unmeasured characteristics of the respondents. Although the FE strategy eliminates part of the variance in happiness that had previously been labeled “explained variance” in the cross-section model, we expect our results to better reflect the actual causal effect of health and marital status on happiness.
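The filtering logic described in this answer can be illustrated with a toy two-wave panel. Everything in the sketch below is hypothetical (six invented respondents, an invented stable trait, and an assumed true effect of 0.3); it is not the exercise data or the model estimated in the text. With two waves, regressing change on change (the first-difference estimator) is equivalent to the fixed-effects estimator, so that is what the sketch uses.

```python
def ols_slope(xs, ys):
    """Slope from a bivariate least-squares regression of ys on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


TRUE_EFFECT = 0.3  # assumed causal effect of health on happiness (hypothetical)

# Hypothetical data: a stable, unmeasured trait that raises both health and happiness.
trait = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
health_w1 = [0.0, 2.0, 2.0, 4.0, 4.0, 6.0]
health_w2 = [1.0, 2.0, 4.0, 4.0, 5.0, 9.0]


def happiness(health, stable_trait):
    """Happiness depends on health and, separately, on the stable trait."""
    return TRUE_EFFECT * health + 2.0 * stable_trait


happy_w1 = [happiness(h, u) for h, u in zip(health_w1, trait)]
happy_w2 = [happiness(h, u) for h, u in zip(health_w2, trait)]

# Cross-section estimate: confounded by the unmeasured trait.
cross_section_slope = ols_slope(health_w1, happy_w1)

# Change-on-change estimate: the stable trait cancels out of each difference.
d_health = [b - a for a, b in zip(health_w1, health_w2)]
d_happy = [b - a for a, b in zip(happy_w1, happy_w2)]
first_difference_slope = ols_slope(d_health, d_happy)

print(f"cross-section slope:    {cross_section_slope:.3f}")
print(f"first-difference slope: {first_difference_slope:.3f}")
```

In this constructed example the cross-section slope is far larger than the assumed effect, while the change-on-change slope recovers it, because the stable trait drops out when each person is compared with himself or herself.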
Rule 6.
1. The coefficient for year, with a value of -0.318, represents within-cohort change: Controlling for one’s birth cohort, each one-year increase in the year of the survey represents a 0.318 percentage point decline in the percentage of people favoring laws that forbid the marriage of whites to blacks (anti-miscegenation laws). That is, over the 30-year span of these GSS surveys, the percentage of people within each cohort favoring anti-miscegenation laws declined by an average of 9.54 percentage points (30 x -0.318 = - 9.54).
The coefficient for birthcohort, with a value of -0.722, represents cross-cohort change: Controlling for the year of survey, those who have been born more recently – that is, those with higher values on birthcohort – are less likely to support such laws. On average, each one-year increase in the respondent’s birth year represents a 0.722 percentage point decline in the percentage of people favoring anti-miscegenation laws. For example, the percentage of respondents from a given birth cohort who favor anti-miscegenation laws is expected to be an average of 7.22 percentage points lower than the percentage of respondents born ten years earlier who favor such laws (10 x -0.722 = - 7.22).
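The two worked figures in this answer come from multiplying each coefficient by a span of years. A minimal sketch, using only the coefficients quoted above (the variable names are illustrative):

```python
# Coefficients quoted in answer 1 (outcome: percent favoring anti-miscegenation laws).
B_YEAR = -0.318         # within-cohort change per survey year
B_BIRTHCOHORT = -0.722  # cross-cohort difference per year of birth

SURVEY_SPAN_YEARS = 30  # span of the GSS surveys, as stated in the answer
COHORT_GAP_YEARS = 10   # cohorts born ten years apart

within_cohort_change = B_YEAR * SURVEY_SPAN_YEARS
cross_cohort_gap = B_BIRTHCOHORT * COHORT_GAP_YEARS

print(f"within-cohort change over {SURVEY_SPAN_YEARS} years: {within_cohort_change:.2f} points")
print(f"gap between cohorts {COHORT_GAP_YEARS} years apart: {cross_cohort_gap:.2f} points")
```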
3a. You should find a coefficient for retiree of 0.919 (effect on toomuchspend) and -0.613 (effect on toolittlespend). Both are highly statistically significant (p < .0005). In 1973, retirees were more likely than workers to believe that we spend too much on public education (compared to spending about the right amount or too little), and less likely than workers to believe that we spend too little on public education (compared to spending about the right amount or too much).
3b. You should find a coefficient for trend of -0.028 (effect on toomuchspend) and 0.038 (effect on toolittlespend). Both are highly statistically significant (p < .0005). Since 1973, workers have become less likely to believe that we spend too much on public education and more likely to believe that we spend too little on public education.
3c. You should find a coefficient for retireextrend of -0.012 (effect on toomuchspend) and -0.001 (effect on toolittlespend). In the sample, then, there is a slight difference in the trends for retirees and workers. Neither is statistically significant at the .05 level (p = .059 for toomuchspend and p = .815 for toolittlespend), however, so we fail to reject the null hypothesis of no difference in trends between retirees and workers. We conclude, then, that support for spending on education increased between 1973 and 2004 in the United States, and that the increase in support was the same for retirees as for those in the labor force. There is no apparent generational divergence on this issue.
3d. You should find a constant of -2.292 for toomuchspend and 0.127 for toolittlespend. These are logit coefficients (the logged odds, using natural logarithms). Taking the antilogarithm of each coefficient, we find that the predicted odds that a worker in 1973 thought that we spent too much money on the educational system was 0.10, and the predicted odds that a worker thought we spent too little money was 1.14 – predicted probabilities of 0.09 and 0.53, respectively. In short, the results of our model imply that in 1973 about 9 percent of American workers believed that too much money was being spent on the nation’s education system, as opposed to 53 percent who believed that too little money was being spent on the education system.
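The conversion in 3d goes from a logit (logged odds) to odds by taking the antilogarithm, and from odds to a probability by dividing the odds by one plus the odds. The sketch below applies those two steps to the constants quoted above; the function names are illustrative.

```python
import math


def logit_to_odds(logit: float) -> float:
    """Antilogarithm (natural base) of a logit coefficient."""
    return math.exp(logit)


def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)


# Constants quoted in answer 3d (workers in 1973).
CONSTANT_TOO_MUCH = -2.292
CONSTANT_TOO_LITTLE = 0.127

odds_too_much = logit_to_odds(CONSTANT_TOO_MUCH)
odds_too_little = logit_to_odds(CONSTANT_TOO_LITTLE)
p_too_much = odds_to_probability(odds_too_much)
p_too_little = odds_to_probability(odds_too_little)

print(f"too much:   odds = {odds_too_much:.2f}, probability = {p_too_much:.2f}")
print(f"too little: odds = {odds_too_little:.2f}, probability = {p_too_little:.2f}")
```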
Rule 7.
1. According to the ANES data, Blacks were much less likely than others to vote for Wallace. Indeed, as noted in the text, not a single black respondent in the ANES voted for Wallace. Among non-black respondents, by contrast, about one in eight (12.4 percent) said they voted for George Wallace. The association between race and vote for Wallace is fairly strong – there is a 12.4 percentage point difference in the vote for Wallace for blacks versus others – and statistically significant (chi-square = 12.1, p < .005).
3a. The values of R are similar at the individual level: -0.109 for the ANES data and -0.126 for the GSS data. At the aggregate level, the values are 0.235 for the ANES data and 0.190 for the GSS data. The point that stands out is that, for both data sets, the relationships are in different directions at the individual and aggregate levels. Blacks themselves were much less likely to vote for George Wallace, yet voters in more heavily black regions were more likely to vote for Wallace. Both datasets exhibit this pattern.
3b. The individual versus region discrepancies are similar for the GSS and the ANES. For the ANES, the discrepancy is 0.235 – (-0.109) = 0.34. For the GSS, the discrepancy is 0.186 – (-0.126) = 0.31. Given the different sampling frames and so on used by the ANES and the GSS, this similarity in results is reassuring.
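The discrepancies in 3b are simple differences between the aggregate-level and individual-level correlations. The sketch below reproduces them using the figures exactly as printed in 3b; the function name is illustrative.

```python
def discrepancy(aggregate_r: float, individual_r: float) -> float:
    """Gap between the region-level and the individual-level correlation."""
    return aggregate_r - individual_r


# Correlations as printed in answer 3b.
anes_discrepancy = discrepancy(0.235, -0.109)
gss_discrepancy = discrepancy(0.186, -0.126)

print(f"ANES: {anes_discrepancy:.2f}")
print(f"GSS:  {gss_discrepancy:.2f}")
```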