
Best Practice: Interrogating Frequency Claims




We have based our Best Practice series on Dr. Beth Morling's Research Methods in Psychology (5th ed.). We encourage you to purchase it for your bookshelf. If you teach research methods, consider adopting this best-in-class text for your classes.


Dr. Beth Morling is a distinguished Fulbright scholar and was honored as the 2014 Professor of the Year by the Carnegie Foundation for the Advancement of Teaching.



With more than two decades of experience as a researcher and professor of research methods, she is an internationally recognized expert and a passionate advocate for the Research Methods course. Morling's primary objective is to empower students to become discerning critical thinkers, capable of evaluating research and claims presented in the media.




In this post, we will explore a question addressed in Chapter 3: "How do researchers interrogate frequency claims?"



Frequency claims—statements about the rate or proportion of a single measured variable—are common in psychological science and the media. But how can we tell whether such claims are trustworthy? To answer this, we draw on three of the four big validities emphasized by Morling: construct validity, external validity, and statistical validity. Each offers a critical lens through which to examine the methods and meaning of frequency-based conclusions.


We will explain how researchers define and measure variables clearly (construct validity), ensure their findings generalize to the broader population (external validity), and assess whether their numerical estimates are accurate and precise (statistical validity). Understanding how to interrogate these claims equips students to become critical consumers of research, capable of separating credible evidence from flawed or overstated findings.


The Four Big Validities


Once you’ve identified the kind of claim a study makes—frequency, association, or causal—the next step is to evaluate how well the evidence supports it. To do this, psychologists use a framework known as the four big validities: construct validity, statistical validity, external validity, and internal validity.


These four lenses allow us to interrogate a study systematically and ask: Did the researchers measure or manipulate their variables appropriately (construct)? Can the results be generalized beyond the study’s sample (external)? Are the findings statistically sound and precise (statistical)? And, in the case of causal claims, did the study rule out alternative explanations (internal)? Each validity asks a different but essential question, and together, they help us judge the credibility of scientific evidence.


Construct validity concerns how well a study measures or manipulates the variables it intends to. If a researcher claims that “screen time increases anxiety,” we must ask how “screen time” and “anxiety” were operationalized. Was screen time measured using app data or self-report? Was anxiety assessed using a validated scale? If the measures are imprecise or loosely defined, then the results might not reflect what they claim to. Construct validity applies to all three types of claims, because every claim depends on accurate and meaningful measurement. Poor construct validity can undermine a study’s conclusions before we even look at its numbers.


External validity addresses generalizability—whether the study’s findings apply to people and settings beyond those in the original sample. For example, if a study finds that 73% of a sample laughed yesterday, we have to ask: Who were these people? Were they from different countries, ages, and backgrounds, or just from one specific demographic? Was the sample randomly selected, or was it based on convenience? Without external validity, a study’s findings might not extend to the broader population. External validity is especially important for frequency claims, which often aim to describe rates or averages across large groups.


Statistical validity focuses on whether the data support the conclusions. This involves looking at effect sizes, confidence intervals, and statistical significance. For instance, if a study claims a correlation between sleep and mood, how strong is that relationship? Is the effect statistically significant, or could it have occurred by chance? Are the confidence intervals narrow enough to suggest precision, or so wide they could include no effect at all? Statistical validity also includes the issue of replication—have the findings been repeated in other studies? Without statistical validity, we can’t trust the reliability or generalizability of the reported results.


Internal validity is specific to causal claims and concerns whether the study’s design rules out alternative explanations. If we want to claim that one variable causes another, we need to ensure that nothing else could have produced the observed effect. This is where experiments shine, using random assignment and controlled procedures to eliminate confounds. If a study finds that students who received a new teaching method performed better, internal validity demands that we rule out other causes—like differences in student motivation or prior knowledge. Without internal validity, even a statistically significant result may be misleading.


To summarize, the four big validities offer a comprehensive toolkit for evaluating psychological research. Construct validity ensures that variables are accurately measured or manipulated. External validity tells us whether the results generalize. Statistical validity ensures the numbers are trustworthy. Internal validity confirms that cause-and-effect relationships are legitimate.

Not every study will maximize all four—sometimes researchers prioritize one over the others depending on their goals. But understanding which validities matter most for each type of claim helps you interpret research more critically and make informed judgments about what to believe.



Interrogating Frequency Claims


When researchers make frequency claims, they are typically reporting the rate or level of a single measured variable in a particular population. Examples include “30% of adults experience sleep problems,” or “1 in 4 college students reports symptoms of anxiety.” These claims are descriptive and do not make any statements about relationships between variables or about cause and effect.


To evaluate how well a study supports a frequency claim, we focus primarily on three of the four big validities: construct validity, external validity, and statistical validity. Internal validity, which deals with cause-and-effect relationships, is not relevant for frequency claims because they do not assert causation.

That said, frequency claims still need to be interrogated carefully to ensure that the conclusions are trustworthy, meaningful, and generalizable.


Construct validity is the first priority when evaluating a frequency claim. It involves asking whether the researchers accurately measured the variable of interest. For example, if a report says that “39% of teens text while driving,” we need to know how the researchers defined and measured “texting while driving.” Did they use self-reports from a survey? Did they observe drivers directly or examine cell phone records? Each method has advantages and drawbacks. Observational data might be more objective but harder to obtain. Self-reports are easier to collect but may be biased by social desirability or faulty memory. The better the operational definition and measurement strategy, the higher the construct validity of the study.


Next, external validity determines whether the findings from the study can be generalized to the broader population. This involves looking at how the sample was selected and whether it reflects the population the researchers aim to describe. If researchers want to make a claim about "teens," did they sample teens from different regions, socio-economic backgrounds, and ethnic groups? Was the sample randomly selected or was it a convenience sample from one school or online community? Strong external validity supports the idea that the estimate from the sample applies to the population as a whole. If external validity is weak, the result may only apply to that specific sample and not to other groups.


Statistical validity addresses the precision and accuracy of the estimate in a frequency claim. This includes considering the sample size, the margin of error (also called the confidence interval), and whether the estimate was replicated in other studies. For example, if a survey reports that 25% of college students binge drink, but the margin of error is ±6%, the true percentage could be as low as 19% or as high as 31%. Narrower margins of error provide more precise estimates, especially in studies with large, representative samples. If a study’s statistical conclusions are based on small or biased samples, the results might not be reliable.
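The arithmetic behind such an estimate can be sketched in a few lines. This is a simplified illustration, assuming simple random sampling, a large sample, and a 95% confidence level; the sample size of 500 is invented for the example.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a sample proportion
    (assumes simple random sampling and a reasonably large n)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical survey: 25% of 500 students report binge drinking
p, n = 0.25, 500
moe = margin_of_error(p, n)
print(f"{p:.0%} +/- {moe:.1%}")  # 25% +/- 3.8%
```

Under the same assumptions, a +/-6% margin at p = 0.25 would correspond to a sample of only about 200 people, which is one reason to check sample sizes as well as the headline percentage.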


It’s also important to consider whether the frequency claim was replicated. If other researchers have studied the same topic and found similar results, that boosts confidence in the claim. On the other hand, if findings vary widely across studies, it might mean that the measurement tools are inconsistent, the samples are not comparable, or the phenomenon itself is highly context-dependent. Researchers and informed readers should look at the total body of evidence before accepting a frequency claim as fact.


In conclusion, interrogating a frequency claim means asking three big questions: How well was the variable measured? Can the result be generalized beyond the sample? And how precise and trustworthy are the numbers? While frequency claims may seem simple, they rely on strong methods and transparent reporting to be meaningful. As a student of psychology, learning to critically evaluate these claims will help you separate solid evidence from weak or misleading reports. Whether you're reading about health statistics, education trends, or consumer behavior, understanding how to interrogate frequency claims helps you become a more scientifically literate thinker.



Construct Validity of Frequency Claims


Construct validity is all about making sure the researchers measured what they intended to measure and did so in a way that is both accurate and meaningful. When evaluating a frequency claim, the first step is to ask: How well did the researchers define and measure their variable? For instance, if a claim says “39% of teens text while driving,” you need to know how “texting while driving” was operationalized. Did they ask teens directly with a survey question like, “Have you ever texted while driving?” or did they observe drivers at intersections? Perhaps they used cell phone records to match timestamps with GPS movement. Each method has trade-offs. Surveys are easy and scalable, but responses may be affected by social desirability bias—people may underreport risky behavior to appear more responsible.


In contrast, direct observation can improve accuracy but is time-consuming and may be impractical for large samples. If researchers observed only a few intersections or regions, their data might not capture national trends. Technological methods, like app data or cell phone usage records, offer objectivity but raise privacy concerns and may not be accessible for all researchers. Regardless of the method, it’s crucial that the operational definition matches the conceptual variable. If the concept is “risky driving,” but the measurement only looks at “texting,” important aspects of risky driving behavior could be overlooked.


Good construct validity also depends on the reliability of the measurement. Reliability refers to the consistency of a measurement over time or across different observers. If two different researchers observe the same teen drivers, do they record similar results? If a teen takes the same questionnaire a week apart, do their answers stay the same? Consistent results are a good sign that the measurement is reliable. However, even a reliable measure can be invalid. For example, using shoe size as a measure of intelligence might produce consistent scores, but it’s clearly not measuring intelligence. That’s why both reliability and validity are necessary for strong construct validity.
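Test-retest reliability is commonly summarized with a correlation coefficient. A minimal sketch, with invented questionnaire scores for illustration: a Pearson r near 1.0 indicates that respondents' scores stayed consistent across the two administrations.

```python
def pearson_r(x: list, y: list) -> float:
    """Pearson correlation, a standard index of test-retest reliability."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical scores from the same stress questionnaire, one week apart
week1 = [12, 18, 9, 22, 15, 11]
week2 = [13, 17, 10, 21, 16, 10]
print(round(pearson_r(week1, week2), 2))  # 0.98
```

A high r like this speaks only to consistency; as the shoe-size example shows, it says nothing by itself about whether the right construct is being measured.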


Another component of construct validity is whether the measurement captures different levels of the variable accurately. For example, if “stress” is measured using a one-item self-report scale ranging from “not stressed” to “very stressed,” is that detailed enough to reflect real differences among people? A better approach might be to use a multi-item questionnaire that evaluates emotional, cognitive, and physical symptoms of stress. This gives a more nuanced picture of the variable and supports more accurate conclusions. Without fine-grained measurement, researchers risk oversimplifying complex phenomena.


Researchers can also improve construct validity through pilot testing and using previously validated instruments. Before launching a large-scale study, they might test the questions on a smaller sample to ensure clarity and effectiveness. Using well-established tools like the Beck Depression Inventory or the Perceived Stress Scale can also strengthen construct validity because these instruments have been vetted through years of research. It’s always worth checking whether a study used original, untested measures or relied on proven tools.


Construct validity is the foundation for any frequency claim. Without it, even a study with a large, representative sample and rigorous statistics can lead to flawed conclusions. When you see a claim about the prevalence of a behavior, belief, or condition, start by asking: How was the variable defined? How was it measured? And do those measures reliably capture the intended concept?

Only when these questions are answered satisfactorily can you trust the numbers that follow. As a critical consumer of psychological research, developing a keen eye for construct validity will help you separate high-quality evidence from studies with shaky foundations.



External Validity of Frequency Claims


External validity refers to how well the results of a study generalize to populations, settings, and times beyond those examined in the original study. For frequency claims—statements about how common a particular behavior, opinion, or experience is—external validity plays a vital role in determining whether we can trust that the reported rate applies to people outside the sample. If a study reports that 60% of adults experience daily stress, we should ask: Who are these adults? Were they selected randomly from the general population, or were they volunteers from a stress reduction class? A frequency claim becomes more meaningful if the sample is representative, and less so if it reflects a narrow, self-selected group. This is why researchers strive for random sampling when generalizing their findings to a broader population.


A strong example of external validity comes from the Gallup Global Emotions Report, which claims that “73% of the world laughed yesterday.” At first glance, this claim sounds suspiciously precise, given the vast diversity of human experience. But Gallup backed it up by collecting data from over 140 countries and using random sampling methods within each. They also adjusted for demographic differences, such as age and gender, and conducted the survey in local languages. This makes the findings more likely to generalize to the global population. In contrast, if a study only surveyed people from a single U.S. city using an online poll, its results would not generalize nearly as well—even if the methodology was otherwise solid.


To assess external validity, always look at the sampling method. Was it a probability sample, in which every person in the population had a known chance of being selected? Or was it a convenience sample, which draws from readily available participants? The latter is easier and cheaper but often results in bias. Convenience samples can still be valuable, especially in early-stage research or when studying specific subgroups, but they limit generalizability. If researchers don’t report how their sample was drawn, or if the sample seems too narrow or unrepresentative, it’s a red flag for external validity.


External validity isn’t just about who was studied—it also includes when and where the study took place. Findings from a study on teen vaping conducted in 2015 might not apply today due to cultural and policy changes. Similarly, data collected during the COVID-19 pandemic might not generalize to post-pandemic conditions. Researchers and readers alike must consider whether the context of a study affects the relevance of its findings elsewhere or in the future. Just because a result holds in one setting doesn’t mean it will in another.


Another key point is that improving external validity often involves trade-offs. Random sampling is ideal, but it can be expensive and logistically complex. Sometimes researchers have to balance feasibility with representativeness. For example, a national health survey might collect fewer variables from a large sample to keep costs down. A smaller, more detailed study might offer deeper insights but lack the broad generalizability of the larger survey. Understanding these trade-offs helps you appreciate the strengths and limitations of different research designs.


External validity helps determine whether a frequency claim’s results extend beyond the studied sample. To evaluate this, ask: Who participated in the study? How were they selected? Does the timing or setting limit generalizability? Was the sample size adequate to capture meaningful variation?

Answering these questions will guide you in deciding whether to trust and apply a frequency claim to your own population or situation. As you become more skilled at identifying these issues, you’ll be better equipped to separate robust, widely applicable findings from those that are too narrow to be useful beyond the original context.



Statistical Validity of Frequency Claims


Statistical validity focuses on how well the study’s numerical estimates support the claim being made. For frequency claims, which often present a percentage, average, or proportion (e.g., “39% of teens text while driving”), the central question is: How accurate and precise is that number?


Researchers typically use statistics to generate point estimates, such as a single percentage, and accompany them with confidence intervals or margins of error that show the range within which the true population value likely falls. For example, if a report finds that 25% of college students binge drink and the margin of error is ±4%, we can reasonably estimate that the true rate falls between 21% and 29%. The narrower the confidence interval, the more precise the estimate—and the more trustworthy the claim.


Another key factor in statistical validity is the sample size. Larger samples generally produce more precise estimates because they reduce the influence of outliers and random variability. If a frequency claim is based on a survey of 1,500 people, its margin of error will typically be smaller than that of a study based on just 100 participants. That’s why well-conducted national polls usually sample a few thousand people—they need to balance cost with precision. You should always check whether the study reports confidence intervals or other indicators of estimate precision. If the study doesn't, or if the margin of error is very large, it weakens the credibility of the frequency claim.
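The square-root relationship between sample size and precision can be shown directly. A minimal sketch, assuming simple random sampling and the worst-case proportion p = 0.5, which maximizes the margin of error:

```python
import math

def worst_case_moe(n: int, z: float = 1.96) -> float:
    """95% margin of error at p = 0.5, the most conservative case."""
    return z * math.sqrt(0.25 / n)

for n in (100, 1500):
    print(f"n = {n:>4}: +/-{worst_case_moe(n):.1%}")
# n =  100: +/-9.8%
# n = 1500: +/-2.5%
```

Quadrupling the sample size only halves the margin of error, which is why precision gains flatten out beyond a few thousand respondents.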


Statistical validity also involves evaluating how the data were analyzed and whether the statistical procedures used were appropriate for the type of data collected. For example, were the right statistical models used to adjust for sampling biases or stratification? Did the researchers apply weighting to account for underrepresented groups in their sample? Sophisticated analysis techniques can enhance statistical validity, but they must be transparently reported and correctly applied. Without this information, it's difficult to know whether the statistics presented truly support the study’s conclusions.
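A toy example of weighting, with entirely hypothetical groups, rates, and population shares, shows how re-weighting group rates by their population shares shifts the overall estimate when some groups are over- or underrepresented in the sample:

```python
# Hypothetical post-stratification sketch: each group's sample rate is
# re-weighted by that group's share of the population. All numbers invented.
groups = {
    # group: (sample rate of "yes", population share)
    "18-29": (0.40, 0.20),
    "30-59": (0.25, 0.50),
    "60+":   (0.10, 0.30),
}

# Naive average treats every group as equally represented in the sample
naive = sum(rate for rate, _ in groups.values()) / len(groups)
# Weighted estimate counts each group by its population share
weighted = sum(rate * share for rate, share in groups.values())
print(f"naive: {naive:.1%}, weighted: {weighted:.1%}")  # naive: 25.0%, weighted: 23.5%
```

The gap between the two numbers is exactly the kind of detail that transparent reporting should let a reader check.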


In addition to precision, consistency across studies is another sign of strong statistical validity. If multiple independent studies report similar frequencies for the same variable, our confidence in the result increases. On the other hand, if one study finds that 60% of people support a policy and another finds only 35%, we need to examine differences in sample selection, measurement methods, and timing. Replication is a cornerstone of psychological science, and statistically valid frequency claims should hold up under repeated testing.


Another consideration is how outliers or unusual data points are handled. Extreme responses can distort averages and percentages, especially in small samples. Did the researchers report how they dealt with missing data or influential cases? Did they conduct sensitivity analyses to see how the results changed under different assumptions? These kinds of checks help ensure that the findings are robust, not artifacts of chance or poor data handling. Transparent reporting builds trust in the statistical validity of the study.


Ultimately, evaluating statistical validity means looking beyond the headline number. Ask: What is the confidence interval? How large was the sample? Were the statistics appropriate for the data? Has the finding been replicated? How were missing data handled?

By asking these questions, you develop a more nuanced understanding of how well the numbers in a frequency claim truly reflect reality. Statistical validity is not just about mathematics—it’s about whether the data can be trusted to guide real-world understanding and decision-making.



Summary


Evaluating frequency claims requires careful attention to how variables are defined and measured, who is included in the study, and how the data are analyzed and presented. Strong construct validity ensures that researchers are accurately capturing what they intend to measure, while strong external validity allows those findings to generalize beyond the sample. Statistical validity guarantees that the numbers used to support a claim are precise, reliable, and properly interpreted. While frequency claims do not make causal statements, they are foundational to public understanding and policy—appearing in news headlines, health advisories, and academic literature.


Your ability to interrogate these claims reflects not only your methodological knowledge but also your scientific literacy. By applying the principles described in this post, you take an essential step toward becoming an informed reader of research, a thoughtful evaluator of evidence, and a responsible participant in evidence-based decision-making.



Key Takeaways


  1. Frequency claims describe the rate or level of a single measured variable, such as how many people experience a specific behavior or condition.


  2. To evaluate a frequency claim, focus on construct validity, external validity, and statistical validity—each reveals whether the claim is methodologically sound.


  3. Construct validity asks whether the variable was measured clearly and reliably, using well-defined operationalizations and valid instruments.


  4. External validity examines whether the sample represents the broader population, considering sampling methods, demographics, and timing.


  5. Statistical validity evaluates the precision and accuracy of estimates through sample size, confidence intervals, appropriate analyses, and replication.




Glossary


association claim: a statement indicating a relationship or correlation between two measured variables, without implying causation.


causal claim: a statement asserting that one variable is responsible for changes in another, requiring experimental control and internal validity.


confidence interval: a statistical range, often expressed with a ± margin of error, within which the true population value is likely to fall.


construct validity: the degree to which a variable has been accurately defined and measured in a way that reflects the intended conceptual meaning.


convenience sample: a non-random sample drawn from easily accessible participants, often limiting generalizability.


external validity: the extent to which findings can be generalized to other people, settings, times, or contexts beyond the study sample.

four big validities: a foundational framework in psychological research used to evaluate the quality of evidence in a study. The four validities (construct, external, statistical, and internal) each address a distinct question: whether variables were measured appropriately (construct), whether findings generalize (external), whether the numerical conclusions are accurate (statistical), and whether causal claims are justified (internal).

frequency claim: a claim that describes how common or prevalent a single measured variable is within a population.


internal validity: the extent to which a study rules out alternative explanations for a causal relationship, primarily relevant to experiments.


margin of error: the range within which a population estimate likely falls, given sampling variability; used to express confidence in statistical estimates.


operational definition: a precise description of how a variable is measured or manipulated in a study.


point estimate: a single value (e.g., a percentage or average) that serves as a best estimate of a population parameter.


probability sampling: a sampling method in which each member of the population has a known, nonzero chance of being selected, supporting external validity.


replication: the process of repeating a study to see whether its results hold under similar or new conditions, reinforcing reliability and validity.


self-report: a data collection method where participants provide information about themselves, typically through surveys or interviews.


statistical validity: the extent to which a study’s numerical conclusions are accurate, reliable, and based on appropriate analysis methods.


validity: the overall soundness of a study’s design and conclusions, assessed through construct, external, statistical, and internal dimensions.



About the Authors


Zachary Meehan earned his PhD in Clinical Psychology from the University of Delaware and serves as the Clinic Director for the university's Institute for Community Mental Health (ICMH). His clinical research focuses on improving access to high-quality, evidence-based mental health services, bridging gaps between research and practice to benefit underserved communities. Zachary is actively engaged in professional networks, holding membership affiliations with the Association for Behavioral and Cognitive Therapies (ABCT) Dissemination and Implementation Science Special Interest Group (DIS-SIG), the BRIDGE Psychology Network, and the Delaware Project. Zachary joined the staff at Biosource Software to disseminate cutting-edge clinical research to mental health practitioners, furthering his commitment to the accessibility and application of psychological science.





Fred Shaffer earned his PhD in Psychology from Oklahoma State University. He is a biological psychologist and professor of Psychology, as well as a former Department Chair at Truman State University, where he has taught since 1975 and has served as Director of Truman’s Center for Applied Psychophysiology since 1977. In 2008, he received the Walker and Doris Allen Fellowship for Faculty Excellence. In 2013, he received the Truman State University Outstanding Research Mentor of the Year award. In 2019, he received the Association for Applied Psychophysiology and Biofeedback (AAPB) Distinguished Scientist award. He teaches Experimental Psychology every semester and loves Beth Morling's 5th edition.




© 2025 BioSource Software
