People are poor intuitive statisticians. We propose a reason. When people judge the correlation between two continuous variables, they mentally sort the data into a 2×2 contingency table: high X high Y, high X low Y, low X high Y, low X low Y. People pay more attention to some cells (high-high) than to others (low-low), resulting in predictable interpretation errors. Specifically, people overweight high-high evidence and underweight low-low evidence. As a result, people judge datasets differently when they have identical correlations but different distributions of points. Visualizing data in scatterplots does not improve judgments. However, overlaying simple visual cues does. For instance, drawing a circle around the points of a scatterplot inhibits categorical thinking about the data, which improves people’s judgments by reducing their tendency to overweight high-high evidence and underweight low-low evidence.
de Langhe, Bart, Philip M. Fernbach, and Julie L. Schiro (2016), “Two-By-Two: Categorical Thinking in the Interpretation of Continuous Bivariate Data,” Manuscript in Preparation.
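To make the 2×2 categorization concrete, the following minimal sketch shows how two datasets can share the same correlation while placing their points in different cells. It is an illustration only, not the authors' procedure: the mean split used as the high/low cut point, the simulated skewed data, and the helper names `pearson` and `two_by_two` are all assumptions introduced here.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def two_by_two(xs, ys):
    """Count points per cell, splitting each variable at its mean (an assumed rule)."""
    cut_x, cut_y = mean(xs), mean(ys)
    counts = {"high-high": 0, "high-low": 0, "low-high": 0, "low-low": 0}
    for x, y in zip(xs, ys):
        cell_x = "high" if x > cut_x else "low"
        cell_y = "high" if y > cut_y else "low"
        counts[cell_x + "-" + cell_y] += 1
    return counts


# Simulated, right-skewed data (hypothetical; not taken from the paper).
rng = random.Random(0)
xs = [rng.expovariate(1.0) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]

# Negating both variables leaves the correlation unchanged but swaps
# the high-high and low-low cells.
flipped_xs = [-x for x in xs]
flipped_ys = [-y for y in ys]

print(pearson(xs, ys), two_by_two(xs, ys))
print(pearson(flipped_xs, flipped_ys), two_by_two(flipped_xs, flipped_ys))
```

Both printed correlations match, while the high-high and low-low counts trade places. A judge who attends mostly to the high-high cell would therefore see different amounts of supporting evidence in the two datasets.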