Science; Institution design for

“Scientist, falsify thyself”. Peer review, academic incentives, credentials, evidence and funding

May 16, 2020 — December 16, 2024

academe
agents
collective knowledge
economics
faster pussycat
game theory
how do science
incentive mechanisms
information provenance
institutions
mind
networks
sociology
wonk
Figure 1

Upon the thing that I presume academic publishing is supposed to do: further science. Reputation systems, collective decision making, groupthink management and other mechanisms for trustworthy science, a.k.a. our collective knowledge of reality itself.

I would like to consider the system of peer review, networking, conferencing, publishing and acclaim, see how closely it approximates an ideal system for uncovering truth, and further, imagine how we could make a better one. But I do not do that right now; I just collect some provocative links on that theme, in the hope of time for more thought later.

Figure 2: Vesalius pioneers scientific review by peering. Credit: the University of Basel

1 Some open review processes

PubPeer (who are behind Peeriodicals) produces a peer-review overlay for web browsers to spread commentary and peer critique more widely. The site concept is itself brusquely confusing, but well blogged; you’ll get the idea. They are not afraid of invective, and I initially thought they looked more amateurish than effective. I have since revised that opinion; they are quite selective and effective. This system has been implicated in topical high-profile retractions (e.g. 1, 2).

There are many others that have less institutional backing; e.g. Uri Simonsohn.

Figure 3: xkcd 2304

2 Mathematical models of the reviewing process

e.g. Cole, Jr, and Simon (1981); Lindsey (1988); Ragone et al. (2013); Nihar B. Shah et al. (2016); Whitehurst (1984).

The experimental data from the NeurIPS experiments might be useful: see e.g. Nihar B. Shah et al. (2016) or a blog post on the 2014 experiment (1, 2).
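
The kind of question these models address can be illustrated with a deliberately crude simulation. The sketch below is not any of the cited models; it is a toy under stated assumptions: each paper has a latent Gaussian "quality", each of two independent committees sees that quality plus Gaussian noise, and each accepts its top-scoring fraction. The function name, the noise model and all parameter values are hypothetical choices for illustration.

```python
import random


def simulate_committees(n_papers=1000, accept_rate=0.25, noise_sd=1.0, seed=0):
    """Toy model (an assumption, not a fitted model of any real venue).

    Each paper has a latent quality drawn from N(0, 1). Two independent
    committees each observe quality + N(0, noise_sd^2) and accept their
    top `accept_rate` fraction. Returns the fraction of committee A's
    accepted papers that committee B rejects.
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0.0, 1.0) for _ in range(n_papers)]
    n_accept = int(n_papers * accept_rate)

    def accepted_set():
        # One committee's noisy scores, then keep the top n_accept papers.
        scores = [q + rng.gauss(0.0, noise_sd) for q in quality]
        ranked = sorted(range(n_papers), key=lambda i: scores[i], reverse=True)
        return set(ranked[:n_accept])

    committee_a = accepted_set()
    committee_b = accepted_set()
    return len(committee_a - committee_b) / n_accept


if __name__ == "__main__":
    for sd in (0.0, 0.5, 1.0, 2.0):
        print(f"noise_sd={sd}: disagreement={simulate_committees(noise_sd=sd):.2f}")
```

With `noise_sd=0.0` the two committees agree exactly; as the noise grows relative to the spread in quality, a larger share of one committee's accepted papers is rejected by the other. Comparing such a curve against observed committee disagreement is one rough way to think about how noisy reviewing is, subject to all the assumptions above.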

3 Economics of publishing

See academic publishing.

Figure 4

4 Assignment for peer review process

There is some fun mechanism design and algorithmic work involved in peer review, e.g.
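
To give a flavour of the algorithmic side, reviewer assignment can be posed as an optimisation problem: given an affinity score for every paper-reviewer pair, choose an assignment that maximises total affinity. The sketch below is a minimal toy, assuming one reviewer per paper, each reviewer used at most once, and a made-up integer score matrix; `best_assignment` is a hypothetical helper, not the method of any system in the references. It uses brute force, which is only feasible for tiny instances; the literature in the references deals with realistic sizes and with extra constraints such as reviewer load, fairness and resistance to manipulation.

```python
from itertools import permutations


def best_assignment(similarity):
    """Exhaustively find the one-reviewer-per-paper assignment with the
    highest total affinity.

    similarity[p][r] is a (hypothetical) affinity score between paper p
    and reviewer r. Each reviewer is assigned to at most one paper.
    Returns (total_score, assignment) where assignment[p] is the reviewer
    index chosen for paper p. Cost grows factorially: toy sizes only.
    """
    n_papers = len(similarity)
    n_reviewers = len(similarity[0])
    best_total, best = float("-inf"), None
    for reviewers in permutations(range(n_reviewers), n_papers):
        total = sum(similarity[p][r] for p, r in enumerate(reviewers))
        if total > best_total:
            best_total, best = total, reviewers
    return best_total, list(best)


# Made-up scores on an arbitrary 0-10 scale: rows are papers, columns reviewers.
similarity = [
    [9, 8, 1],
    [9, 2, 1],
    [3, 4, 5],
]
total, assignment = best_assignment(similarity)
print(total, assignment)  # 22 [1, 0, 2]
```

Note that handing each paper its favourite remaining reviewer in turn would give paper 0 to reviewer 0 and reach a total of only 16; the optimum sends paper 0 to its second choice so that paper 1 can have reviewer 0. That gap between greedy and globally optimal choices is part of what makes the problem interesting.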

5 Gatekeeping

Almost by definition we will be dissatisfied with how much gatekeeping academia does; we all have skin in this game.

Baldwin (2018):

This essay traces the history of refereeing at specialist scientific journals and at funding bodies and shows that it was only in the late twentieth century that peer review came to be seen as a process central to scientific practice. Throughout the nineteenth century and into much of the twentieth, external referee reports were considered an optional part of journal editing or grant making. The idea that refereeing is a requirement for scientific legitimacy seems to have arisen first in the Cold War United States. In the 1970s, in the wake of a series of attacks on scientific funding, American scientists faced a dilemma: there was increasing pressure for science to be accountable to those who funded it, but scientists wanted to ensure their continuing influence over funding decisions. Scientists and their supporters cast expert refereeing—or “peer review,” as it was increasingly called—as the crucial process that ensured the credibility of science as a whole. Taking funding decisions out of expert hands, they argued, would be a corruption of science itself. This public elevation of peer review both reinforced and spread the belief that only peer-reviewed science was scientifically legitimate.

Thomas Basbøll says:

It is commonplace today to talk about “knowledge production” and the university as a site of innovation. But the institution was never designed to “produce” something nor even to be especially innovative. Its function was to conserve what we know. It just happens to be in the nature of knowledge that it cannot be conserved if it does not grow.

Andrew Marzoni, Academia is a cult.

Adam Becker on the assumptions and pathologies revealed by Wolfram’s latest branding and positioning:

So why did Wolfram announce his ideas this way? Why not go the traditional route? “I don’t really believe in anonymous peer review,” he says. “I think it’s corrupt. It’s all a giant story of somewhat corrupt gaming, I would say. I think it’s sort of inevitable that happens with these very large systems. It’s a pity.”

So what are Wolfram’s goals? He says he wants the attention and feedback of the physics community. But his unconventional approach—soliciting public comments on an exceedingly long paper—almost ensures it shall remain obscure. Wolfram says he wants physicists’ respect. The ones consulted for this story said gaining it would require him to recognise and engage with the prior work of others in the scientific community.

And when provided with some of the responses from other physicists regarding his work, Wolfram is singularly unenthused. “I’m disappointed by the naivete of the questions that you’re communicating,” he grumbles. “I deserve better.”

An interesting edge case in peer review and scientific reputation: Adam Becker, Junk Science or the Real Thing? ‘Inference’ Publishes Both. As far as I’m concerned, publishing crap is not in itself catastrophic. What would be bad is a process that fails to discourage crap by eventually identifying it.

6 Style guide for reviews and rebuttals

See scientific writing.

7 Reformers

8 Grants

TBD

9 Incoming

Andrew Gelman in conversation with Noah Smith:

Anyway, one other thing I wanted to get your thoughts on was the publication system and the quality of published research. The replication crisis and other skeptical reviews of empirical work have got lots of people thinking about ways to systematically improve the quality of what gets published in journals. Apart from things you’ve already mentioned, do you have any suggestions for doing that?

I wrote about some potential solutions in pages 19–21 of Gelman (2018) from a few years ago. But it’s hard to give more than my personal impression. As statisticians or methodologists we rake people over the coals for jumping to causal conclusions based on uncontrolled data, but when it comes to science reform, we’re all too quick to say, Do this or Do that. Fair enough: policy exists already and we shouldn’t wait on definitive evidence before moving forward to reform science publication, any more than journals waited on such evidence before growing to become what they are today. But we should just be aware of the role of theory and assumptions in making such recommendations. Eric Loken and I made this point several years ago in the context of statistics teaching (Gelman and Loken 2012), and Berna Devezer et al. published an article (Devezer et al. 2020) last year critically examining some of the assumptions that have at times been taken for granted in science reform. When talking about reform, there are so many useful directions to go, I don’t know where to start. There’s post-publication review (which, among other things, should be much more efficient than the current system for reasons discussed here), there are all sorts of things having to do with incentives and norms (for example, I’ve argued that one reason that scientists act so defensive when their work is criticised is because of how they’re trained to react to referee reports in the journal review process), and various ideas adapted to specific fields. One idea I saw recently that I liked was from the psychology researcher Gerd Gigerenzer, who wrote that we should consider stimuli in an experiment as being a sample from a population rather than thinking of them as fixed rules (Gigerenzer n.d.), which is an interesting idea in part because of its connection to issues of external validity or out-of-sample generalisation that are so important when trying to make statements about the outside world. 
* The Black Spatula Project - by Steve Newman

A 10 page paper caused a panic because of a math error. I was curious if AI would spot the error by just prompting: “carefully check the math in this paper” especially as the info is not in training data.

o1 gets it in a single shot. Should AI checks be standard in science?

Repo here: nick-gibb/black-spatula-project: Verifying scientific papers using LLMs

10 References

a Literal Banana. 2020. “Extended Sniff Test.”
Aczel, Szaszi, and Holcombe. 2021. “A Billion-Dollar Donation: Estimating the Cost of Researchers’ Time Spent on Peer Review.” Research Integrity and Peer Review.
Afonso. 2014. “How Academia Resembles a Drug Gang.” SSRN Scholarly Paper.
Agassi. 1974. “The Logic of Scientific Inquiry.” Synthese.
al-Gharbi. 2020. “Race and the Race for the White House: On Social Research in the Age of Trump.” Preprint.
Alon. 2009. “How to Choose a Good Scientific Problem.” Molecular Cell.
Arbesman, and Christakis. 2011. “Eurekometrics: Analyzing the Nature of Discovery.” PLoS Comput Biol.
Arbilly, and Laland. 2017. “The Magnitude of Innovation and Its Evolution in Social Animals.” Proceedings of the Royal Society B: Biological Sciences.
Arvan, Bright, and Heesen. 2022. “Jury Theorems for Peer Review.” The British Journal for the Philosophy of Science.
Azoulay, Fons-Rosen, and Zivin. 2015. “Does Science Advance One Funeral at a Time?” Working Paper 21788.
Baldwin. 2018. “Scientific Autonomy, Public Accountability, and the Rise of ‘Peer Review’ in the Cold War United States.” Isis.
Björk, and Solomon. 2013. “The Publishing Delay in Scholarly Peer-Reviewed Journals.” Journal of Informetrics.
Bogich, Ballesteros, Berjon, et al. n.d. “On the Marginal Cost of Scholarly Communication.”
Budish, Che, Kojima, et al. 2009. “Implementing Random Assignments: A Generalization of the Birkhoff-von Neumann Theorem.” In Cowles Summer Conference.
Charlin, and Zemel. 2013. “The Toronto Paper Matching System: An Automated Paper-Reviewer Assignment System.”
Charlin, Zemel, and Boutilier. 2011. “A Framework for Optimizing Paper Matching.” In UAI2011.
Cole, Jr, and Simon. 1981. “Chance and Consensus in Peer Review.” Science.
Coscia, and Rossi. 2020. “Distortions of Political Bias in Crowdsourced Misinformation Flagging.” Journal of The Royal Society Interface.
Couzin-Frankel. 2015. “PubPeer Co-Founder Reveals Identity—and New Plans.” Science.
Dang, and Bright. 2021. “Scientific Conclusions Need Not Be Accurate, Justified, or Believed by Their Authors.” Synthese.
Devezer, Navarro, Vandekerckhove, et al. 2020. “The Case for Formal Methodology in Scientific Reform.” Royal Society Open Science.
Flach, Spiegler, Golénia, et al. 2010. “Novel Tools to Streamline the Conference Review Process: Experiences from SIGKDD’09.” SIGKDD Explor. Newsl.
Gasparyan, Gerasimov, Voronov, et al. 2015. “Rewarding Peer Reviewers: Maintaining the Integrity of Science Communication.” Journal of Korean Medical Science.
Gelman. 2011. “Experimental Reasoning in Social Science.” In Field Experiments and Their Critics.
———. 2018. “The Failure of Null Hypothesis Significance Testing When Studying Incremental Changes, and What to Do About It.” Personality and Social Psychology Bulletin.
Gelman, and Loken. 2012. “Statisticians: When We Teach, We Don’t Practice What We Preach.” Chance.
Gigerenzer. n.d. “We Need to Think More about How We Conduct Research.” Behavioral and Brain Sciences.
“Go Forth and Replicate!” 2016. Nature News.
Goldsmith, and Sloan. 2007. “The AI Conference Paper Assignment Problem.” In Proc. AAAI Workshop on Preference Handling for Artificial Intelligence.
Greenberg. 2009. “How Citation Distortions Create Unfounded Authority: Analysis of a Citation Network.” BMJ.
Gurobi Optimization, LLC. 2023. “Gurobi Optimizer Reference Manual.” Manual.
Hallsson, and Kappel. 2020. “Disagreement and the Division of Epistemic Labor.” Synthese.
Heesen, and Bright. 2021. “Is Peer Review a Good Idea?” The British Journal for the Philosophy of Science.
Hodges. 2019. “Statistical Methods Research Done as Science Rather Than Mathematics.” arXiv:1905.08381 [Stat].
Ioannidis. 2005. “Why Most Published Research Findings Are False.” PLoS Medicine.
Jan. 2018. “Recognition and Reward System for Peer-Reviewers.” In CEUR Workshop Proceedings.
Jecmen, Zhang, Liu, et al. 2020. “Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments.” In Advances in Neural Information Processing Systems.
Jecmen, Zhang, Liu, et al. 2022. “Near-Optimal Reviewer Splitting in Two-Phase Paper Reviewing and Conference Experiment Design.” In Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems. AAMAS ’22.
Jiménez, and Mesoudi. 2019. “Prestige-Biased Social Learning: Current Evidence and Outstanding Questions.” Palgrave Communications.
Kirman. 1992. “Whom or What Does the Representative Individual Represent?” The Journal of Economic Perspectives.
———. 2010. “Learning in Agent Based Models.”
Krikorian, and Kapczynski. 2010. Access to knowledge in the age of intellectual property.
Lakatos. 1980. The Methodology of Scientific Research Programmes: Volume 1 : Philosophical Papers.
Laland. 2004. “Social Learning Strategies.” Animal Learning & Behavior.
Leyton-Brown, Mausam, Nandwani, et al. 2022. “Matching Papers and Reviewers at Large Conferences.”
Lindsey. 1988. “Assessing Precision in the Manuscript Review Process: A Little Better Than a Dice Roll.” Scientometrics.
Littman. 2021. “Collusion Rings Threaten the Integrity of Computer Science Research.” Communications of the ACM.
Liu, Suel, and Memon. 2014. “A Robust Model for Paper Reviewer Assignment.” In Proceedings of the 8th ACM Conference on Recommender Systems.
Maxwell. 2009. What’s wrong with science? Edited by L. S. D. Santamaria. Sublime.
McCook. 2017. “Meet PubPeer 2.0: New Version of Post-Publication Peer Review Site Launches Today.” Retraction Watch (blog).
Medawar. 1969. Induction and Intuition in Scientific Thought.
———. 1982. Pluto’s Republic.
———. 1984. The Limits of Science.
Merali. 2010. “Computational Science: Error.” Nature.
Merrifield, and Saari. 2009. “Telescope Time Without Tears: A Distributed Approach to Peer Review.” Astronomy & Geophysics.
Mimno, and McCallum. 2007. “Expertise Modeling for Matching Papers with Reviewers.” In Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’07.
“Nature Editors: All Hat and No Cattle.” 2016.
Nguyen. 2020. “Cognitive Islands and Runaway Echo Chambers: Problems for Epistemic Dependence on Experts.” Synthese.
“Post-Publication Criticism Is Crucial, but Should Be Constructive.” 2016. Nature News.
Potts, Hartley, Montgomery, et al. 2016. “A Journal Is a Club: A New Economic Model for Scholarly Publishing.” SSRN Scholarly Paper ID 2763975.
Ragone, Mirylenka, Casati, et al. 2013. “On Peer Review in Computer Science: Analysis of Its Effectiveness and Suggestions for Improvement.” Scientometrics.
Rekdal. 2014. “Academic Urban Legends.” Social Studies of Science.
Ridley, Kolm, Freckelton, et al. 2007. “An Unexpected Influence of Widely Used Significance Thresholds on the Distribution of Reported P-Values.” Journal of Evolutionary Biology.
Ritchie. 2020. Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth.
Rodriguez, and Bollen. 2008. “An Algorithm to Determine Peer-Reviewers.” In Proceedings of the 17th ACM Conference on Information and Knowledge Management. CIKM ’08.
Rzhetsky, Foster, Foster, et al. 2015. “Choosing Experiments to Accelerate Collective Discovery.” Proceedings of the National Academy of Sciences.
Schimmer, Ralf, Geschuhn, Kai Karin, and Vogler, Andreas. 2015. “Disrupting the subscription journals’ business model for the necessary large-scale transformation to open access.”
Sekara, Deville, Ahnert, et al. 2018. “The Chaperone Effect in Scientific Publishing.” Proceedings of the National Academy of Sciences.
Sen. 1977. “Rational Fools: A Critique of the Behavioral Foundations of Economic Theory.” Philosophy and Public Affairs.
Shah, Nihar B. 2022. “Challenges, Experiments, and Computational Solutions in Peer Review.” Communications of the ACM.
Shah, Nihar B, Tabibian, Muandet, et al. 2016. “Design and Analysis of the NIPS 2016 Review Process.”
Smith. 2006. “Peer Review: A Flawed Process at the Heart of Science and Journals.” Journal of the Royal Society of Medicine.
Solomon. 2007. “The Role of Peer Review for Scholarly Journals in the Information Age.” Journal of Electronic Publishing.
Spranzi. 2004. “Galileo and the Mountains of the Moon: Analogical Reasoning, Models and Metaphors in Scientific Discovery.” Journal of Cognition and Culture.
Stelmakh, Shah, and Singh. 2021. “PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review.” Journal of Machine Learning Research.
Stove. 1982. Popper and After: Four Modern Irrationalists.
Suppes. 2002. Representation and Invariance of Scientific Structures.
Tang, Tang, and Tan. 2010. “Expertise Matching via Constraint-Based Optimization.” In 2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology.
Taylor. 2008. “On the Optimal Assignment of Conference Papers to Reviewers.”
Thagard. 1993. “Societies of Minds: Science as Distributed Computing.” Studies in History and Philosophy of Modern Physics.
———. 1994. “Mind, Society, and the Growth of Knowledge.” Philosophy of Science.
———. 1997. “Collaborative Knowledge.” Noûs.
———. 2005. “How to Be a Successful Scientist.” Scientific and Technological Thinking.
———. 2007. “Coherence, Truth, and the Development of Scientific Knowledge.” Philosophy of Science.
Thagard, and Litt. 2008. “Models of Scientific Explanation.” In The Cambridge Handbook of Computational Psychology.
Thagard, and Zhu. 2003. “Acupuncture, Incommensurability, and Conceptual Change.” Intentional Conceptual Change.
Thurner, and Hanel. 2010. “Peer-Review in a World with Rational Scientists: Toward Selection of the Average.”
Tran, Cabanac, and Hubert. 2017. “Expert Suggestion for Conference Program Committees.” In 2017 11th International Conference on Research Challenges in Information Science (RCIS).
van der Post, Franz, and Laland. 2016. “Skill Learning and the Evolution of Social Learning Mechanisms.” BMC Evolutionary Biology.
van Noorden. 2013. “Open Access: The True Cost of Science Publishing.” Nature.
Vazire. 2017. “Our Obsession with Eminence Warps Research.” Nature News.
Vijaykumar. 2020. “Potential Organized Fraud in ACM.”
Whitehurst. 1984. “Interrater Agreement for Journal Manuscript Reviews.” American Psychologist.
Wible. 1998. Economics of Science.
Xiao, Dörfler, and van der Schaar. 2014. “Incentive Design in Peer Review: Rating and Repeated Endogenous Matching.” arXiv:1411.2139 [Cs].
Xu, Yixuan Even, Jecmen, Song, et al. 2024. “A One-Size-Fits-All Approach to Improving Randomness in Paper Assignment.” In Proceedings of the 37th International Conference on Neural Information Processing Systems. NIPS ’23.
Xu, Yichong, Zhao, and Shi. 2017. “Mechanism Design for Paper Review.”
Yarkoni. 2019. “The Generalizability Crisis.” Preprint.