

Jerry Brunner and I developed a method to estimate the average power of studies while taking selection for significance into account. We validated our method with simulation studies. We also show that other methods that are already in use for effect size estimation, like p-curve, produce biased (inflated) estimates. You might think that an article that relies on validated simulations to show that a new method (z-curve) improves on an existing one (p-curve) would be published, especially by a journal that was created to improve / advance psychological science. However, this blog post shows that AMPPS works like any other traditional, for-profit, behind-paywall journal. Normally, you would not be able to see the editorial decision letter or know that an author of the inferior p-curve method provided a biased review. But here you can see how traditional publishing works (or doesn’t work). Meanwhile, the article has been published in the journal Meta-Psychology, a journal with no fees for authors, open access to articles, and transparent peer reviews. The peer-review report can be found on OSF:

Also, three years after the p-curve authors were alerted to the fact that their method can provide biased estimates, they have not modified their app or posted a statement that alerts readers to this problem. This is how even meta-scientists operate.
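To give a concrete picture of the estimand, here is a minimal simulation sketch in Python (my own illustration with made-up parameters; it is not the z-curve or p-curve code). It generates studies with heterogeneous true power, keeps only the significant ones, and contrasts the true average power of those selected studies with a naive “observed power” calculation that ignores selection and is therefore inflated.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical setup (not the simulations from the paper): many two-group
# studies with n = 50 per group and a two-sided test at alpha = .05.
n_studies, n_per_group, alpha = 100_000, 50, 0.05
z_crit = norm.ppf(1 - alpha / 2)

# Heterogeneous true effects: half null, half drawn from a normal distribution.
d = np.where(rng.random(n_studies) < 0.5, 0.0,
             rng.normal(loc=0.4, scale=0.2, size=n_studies))

# Approximate non-centrality of the z-test for a standardized difference d
# with n subjects per group.
ncp = d * np.sqrt(n_per_group / 2)

# True two-sided power of each study (normal approximation).
true_power = norm.sf(z_crit - ncp) + norm.cdf(-z_crit - ncp)

# Simulate observed z-statistics and apply selection for significance.
z_obs = rng.normal(loc=ncp, scale=1.0)
significant = np.abs(z_obs) > z_crit

# The estimand: mean true power of the studies that ended up significant.
print("true average power of significant studies:",
      true_power[significant].mean().round(3))

# A naive calculation that ignores selection: "observed power" computed from
# the published (significant) z-statistics comes out much too high.
observed_power = norm.sf(z_crit - np.abs(z_obs[significant]))
print("naive mean observed power:", observed_power.mean().round(3))
```

Both z-curve and p-curve aim to recover something close to the first number from the significant test statistics alone; the simulation studies in our paper compare how close each method gets.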

Thank you for submitting your manuscript (AMPPS-17-0114) entitled “Z-Curve: A Method for the Estimating Replicability Based on Test Statistics in Original Studies” to Advances in Methods and Practices in Psychological Science (AMPPS). First, my apologies for the overly long review process. I initially struggled to find reviewers for the paper and I also had to wait for the final review. In the end, I received guidance from three expert reviewers whose comments appear at the end of this message. Reviewers 1 and 2 chose to remain anonymous and Reviewer 3 is Leif Nelson (signed review). Reviewers 1 and 2 were both strongly negative and recommended rejection. Nelson was more positive about the goals of the paper and approach, although he wasn’t entirely convinced by the approach and evidence. I read the paper independently of the reviews, both before sending it out and again before reading the reviews (given that it had been a while). My take was largely consistent with that of the reviewers.

Although the issue of estimating replicability from published results is an important one, I was less convinced about the method and felt that the paper does not do enough to define the approach precisely, and it did not adequately demonstrate its benefits and limits relative to other meta-analytic bias correction techniques. Based on the comments of the reviewers and my independent evaluation, I found these issues to be substantial enough that I have decided to decline the manuscript. The reviews are extensive and thoughtful, and I won’t rehash all of the details in my letter. I would like to highlight what I see as the key issues, but many of the other comments are important and substantive. I hope you will find the comments useful as you continue to develop this approach (which I do think is a worthwhile enterprise).

All three reviews raised concerns about the clarity of the paper and the figures as well as the lack of grounding for a number of strong claims and conclusions (they each quote examples). They also note the lack of specificity for some of the simulations and question the datasets used for the analyses. I agreed that the use of some of the existing data sets (e.g., the scraped data, the Cuddy data, perhaps the Motyl data) are not ideal ways to demonstrate the usefulness of this tool. Simulations in which you know and can specify the ground truth seem more helpful in demonstrating the advantages and constraints of this approach.

Reviewers 1 and 2 both questioned the goal of estimating average power. Reviewer 2 presents the strongest case against doing so. Namely, average power is a weird quantity to estimate in light of a) decades of research on meta-analytic approaches to estimating the average effect size in the face of selection, and b) the fact that average power is a transformation of effect size. To demonstrate that Z-curve is a valid measure and an improvement over existing approaches, it seems critical to test it against other established meta-analytic models. P-curve is relatively new, and as reviewer 2 notes, it has not been firmly established as superior to other more formal meta-analytic approaches (it might well be better in some contexts and worse in others).
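The remark that average power is a transformation of effect size refers to the standard power relation. For a two-sided, two-sample z-test with standardized effect δ, n subjects per group, and significance level α, power is approximately (a textbook approximation, stated here only to unpack the reviewer’s point):

$$
\mathrm{Power}(\delta, n) \;=\; \Phi\!\left(\delta\sqrt{n/2} - z_{1-\alpha/2}\right) + \Phi\!\left(-\delta\sqrt{n/2} - z_{1-\alpha/2}\right)
$$

Average power over a set of studies is therefore determined by the joint distribution of their effect sizes and sample sizes, which is why the letter ties the estimation problem so closely to effect-size meta-analysis under selection.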
