# Significance Tests Harm Progress in Forecasting

@article{Armstrong2007SignificanceTH, title={Significance Tests Harm Progress in Forecasting}, author={J. Scott Armstrong}, journal={Forecasting Models eJournal}, year={2007} }

Based on a summary of prior literature, I conclude that tests of statistical significance harm scientific progress. Efforts to find exceptions to this conclusion have, to date, turned up none. Even when done correctly, significance tests are dangerous. I show that summaries of scientific research do not require tests of statistical significance. I illustrate the dangers of significance tests by examining an application to the M3-Competition. Although the authors of that reanalysis conducted a…

#### 159 Citations

Caveats for using statistical significance tests in research assessments

- Psychology, Computer Science
- J. Informetrics
- 2013

This article raises concerns about the advantages of using statistical significance tests in research assessments as has recently been suggested in the debate about proper normalization procedures…

Pitfalls of significance testing and $p$-value variability: An econometrics perspective

- Mathematics
- 2018

Data on how many scientific findings are reproducible are generally bleak and a wealth of papers have warned against misuses of the p-value and resulting false findings in recent years. This paper…

PERSPECTIVE - Researchers Should Make Thoughtful Assessments Instead of Null-Hypothesis Significance Tests

- Psychology, Computer Science
- Organ. Sci.
- 2011

Instead of making NHSTs, researchers should adapt their research assessments to specific contexts and specific research goals, and then explain their rationales for selecting assessment indicators.

The Illusion of Predictability: How Regression Statistics Mislead Experts

- Computer Science
- 2011

The implications of the study suggest the need to reconsider the way in which empirical results are presented, and the possible provision of easy-to-use simulation tools that would enable readers of empirical papers to make accurate inferences.

Response to Commentaries on “The illusion of predictability: How regression statistics mislead experts”

- Economics
- 2012

Does the manner in which results are presented in empirical studies affect perceptions of the predictability of the outcomes? Noting the predominant role of linear regression analysis in empirical…

Null-hypothesis significance tests in behavioral and management research: We can do better

- Psychology
- 2009

Null-hypothesis significance tests (NHST) are a very troublesome methodology that dominates the quantitative empirical research in strategy and management. Inherent limitations and inappropriate…

Testing University Rankings Statistically: Why this Perhaps is not such a Good Idea after All. Some Reflections on Statistical Power, Effect Size, Random Sampling and Imaginary Populations

- 2012

In this paper we discuss and question the use of statistical significance tests in relation to university rankings as recently suggested. We outline the assumptions behind and interpretations of…

Inferential misconceptions and replication crisis

- Medicine
- 2016

The most serious flaws related to the p value are systematized and suggestions of how to prevent them and reduce the rate of false discoveries in the future are discussed.

How Can Significance Tests Be Deinstitutionalized?

- Psychology
- 2012

The purpose of this article is to propose possible solutions to the methodological problem of null hypothesis significance testing (NHST), which is framed as deeply embedded in the institutional…

The Ombudsman: Verification of Citations: Fawlty Towers of Knowledge?

- Psychology, Computer Science
- Interfaces
- 2008

Citations to “Estimating nonresponse bias in mail surveys,” one of the most frequently cited papers from the Journal of Marketing Research, are examined to illustrate faulty citations and recommend that journals include a section on their websites to list all relevant papers that have been overlooked and how the omitted paper relates to the published paper.

#### References


Needed: A Ban on the Significance Test

- Psychology
- 1997

The significance test as currently used is a disaster. Whereas most researchers falsely believe that the significance test has an error rate of 5%, empirical studies show the average error rate across…

Should Significance Tests be Banned? Introduction to a Special Section Exploring the Pros and Cons

- Psychology
- 1997

Significance testing of null hypotheses is the standard epistemological method for advancing scientific knowledge in psychology, even though it has drawbacks and it leads to common inferential…

Verification of Citations: Fawlty Towers of Knowledge?

- Computer Science
- 2007

This paper examines citations to “Estimating nonresponse bias in mail surveys,” one of the most frequently cited papers from the Journal of Marketing Research, as an exploratory study to illustrate the prevalence of faulty citations, and provides specific operational recommendations on adjusting for nonresponse bias.

The Ombudsman: Verification of Citations: Fawlty Towers of Knowledge?

- Psychology, Computer Science
- Interfaces
- 2008

Citations to “Estimating nonresponse bias in mail surveys,” one of the most frequently cited papers from the Journal of Marketing Research, are examined to illustrate faulty citations and recommend that journals include a section on their websites to list all relevant papers that have been overlooked and how the omitted paper relates to the published paper.

Statistical Significance with Comments by Editors of Marketing Journals

- Psychology
- 2000

The historical growth in the popularity of statistical significance testing is examined using a random sample of annual data from 12 American Psychological Association (APA) journals. The results…

Debiasing forecasts: how useful is the unbiasedness test?

- Economics
- 2003

A number of studies have demonstrated the improvements in accuracy that can result from correcting judgmental forecasts to remove systematic bias. It has been suggested that the…

The Historical Growth of Statistical Significance Testing in Psychology--and Its Future Prospects.

- Psychology
- 2000

The historical growth in the popularity of statistical significance testing is examined using a random sample of annual data from 12 American Psychological Association (APA) journals. The results…

Confusion Over Measures of Evidence (p's) Versus Errors (α's) in Classical Statistical Testing

- Mathematics
- 2003

Confusion surrounding the reporting and interpretation of results of classical statistical tests is widespread among applied researchers, most of whom erroneously believe that such tests are…

Findings from Evidence-Based Forecasting: Methods for Reducing Forecast Error

- Mathematics
- 2006

Empirical comparisons of reasonable approaches provide evidence on the best forecasting procedures to use under given conditions. Based on this evidence, I summarize the progress made over the past…

The M3 competition: Statistical tests of the results

- Mathematics
- 2005

The main conclusions of the M3 competition were derived from the analyses of descriptive statistics with no formal statistical testing. One of the commentaries noted that the results had not been…