How Are Data Patterns Related?

We observed earlier that historians search for patterns in surviving evidence from the past and that descriptive statistics can help in this process. But historians are not always happy just locating a pattern. They frequently want to explain the pattern; they want to know why the pattern emerged and took the shape that it did. In common parlance, they want to know the causes of the historical patterns they identify. And here again quantitative methods can be useful so long as we are careful not to treat statistical measures of association as the equivalent of historical proof. With the help of statistics we can infer the existence of a relationship between two variables or factors, but that does not mean that one factor caused the behavior of the other. It remains possible that a third variable was the "prime mover"; by just looking at two variables, we run the risk of mistaking correlation for causation. Even complex procedures that statisticians call "multivariate analysis," which can handle several variables at once, are not powerful enough to prove historical causation to the satisfaction of most scholars. With few exceptions, historians do not believe that historical causation can be reduced to a formula, however complicated and sophisticated the mathematical manipulation of the data may be.

Yet if we cannot prove historical causation by means of statistics alone, we may still be able to use quantitative methods to help substantiate or challenge assertions about historical causation that are commonly expressed in qualitative terms. Although not all qualitative statements about historical causation can be tested quantitatively, some can. The key is to employ statistical tools in a reasonable, restrained, and responsible manner. And do not be surprised if you find that you can undermine a hypothesis more readily than you can come up with an alternative explanation for the data. In history as in many other disciplines, undercutting an argument or an interpretation is often easier than constructing one.

One of the first areas that attracted quantitative historians was the study of voting behavior. The basic story of how Americans had voted in the past was well known, but historians disagreed among themselves about why people voted the way they did. Some historians argued that Americans voted mainly according to their economic interests pure and simple. Other historians contended that economic interests mattered less than cultural factors, such as ethnic identity, religion, and philosophical outlook. The "new political historians" sought to resolve this debate by using quantitative methods. In particular, they matched various quantifiable variables against the election results and measured the level of correlation.

A coefficient of correlation indicates the strength of the relationship between two variables. The most commonly used correlation coefficient is the Pearson product-moment coefficient, or Pearson r. While it is difficult to calculate by hand, it is rather easy to interpret. Pearson r ranges from -1 to +1. The sign (+ or -) of Pearson r indicates the kind of relationship between the two variables. If Pearson r is positive, then the two variables move in the same direction: that is, as one goes up, the other tends to go up, and as one goes down, the other tends to go down. On the other hand, if Pearson r is negative, then the two variables move in opposite directions: as one goes up, the other tends to go down. The closer the coefficient of correlation is to +1 or -1, the stronger the association is between the two variables. If Pearson r is 0, then there is no linear relationship between the two variables.
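For readers who want to see what lies behind the number, here is a minimal sketch of how Pearson r can be computed, written in Python. The function name pearson_r and the figures in the usage lines are our own inventions for illustration; they are not taken from any study discussed here. The calculation divides the sum of the paired deviations from the two means by the square root of the product of each variable's summed squared deviations.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson product-moment correlation of two equal-length lists.

    Illustrative helper (hypothetical name), not code from any study cited here.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of the products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations for each variable on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)

    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable never changes")

    return co_deviation / sqrt(spread_x * spread_y)


# Invented figures for four imaginary counties (NOT historical data):
# percentage voting for one party, and percentage voting "yes" on a referendum.
party_share = [40, 50, 60, 70]
yes_share = [35, 48, 59, 72]

r = pearson_r(party_share, yes_share)
print(round(r, 3))
```

With these made-up numbers the two lists rise together, so r comes out close to +1. Reversing one of the lists would flip the sign, and a set of values with no straight-line trend would push r toward 0.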

In the years following the end of the Civil War and emancipation, the question of extending civil rights to African Americans dominated national life in both northern and southern states. The use of quantitative evidence, and particularly the technique of correlation, can shed light on how some Americans felt about the pressing question of African-American voting rights. Almost every northern state allowed voters (then only white males) to vote on whether to extend voting rights to African-American men. Robert R. Dykstra and Harlan Hahn studied one such referendum held in Iowa in 1868. Using election returns and census data, they looked for associations (correlations) between votes on the referendum to extend voting rights and several political, social, and economic variables such as party affiliation (Democrat or Republican), land values, and population density. In 1868, 66.5% of Iowa voters chose to extend suffrage to African Americans (up from the 14.6% who voted that way in a similar referendum held in 1857).

First they looked at party preference and economic factors: