Preregistration and library support
Fred Breese – Research Services Co-ordinator, University of Manchester Library.
In the UK, open research is not just a thriving grassroots movement – there are 33 UKRN Institutional Leads representing support and engagement from senior university leadership. The current consensus is that open research requirements are likely to have a place in REF 2028, accelerating a trend towards engagement. University libraries should expect to play a key role in this future through the development of open research services, framed around policy mandates at either the university or the national level.
Support models for open access publishing and research data management are already present in libraries and have played a significant role in the growth in adoption of these practices. Below, I explore preregistration, which is a likely candidate for future library-based support.
Briefly, preregistration can be understood as follows: prior to data collection, a study plan is shared, commonly on a repository such as OSF. The plan will contain information such as hypotheses and plans for data analysis. OSF’s guidance on how to create a preregistration using their platform gives an illustration of the process. As with data archiving in a repository, the deposit workflow is simple but comes with hidden complexities around the content of study plans, embargo periods, etc.
Purpose
Preregistration is an intervention designed to mitigate a number of factors contributing to irreproducibility. As Brian Nosek summarises in Preregistration is Hard and Worthwhile, it does so in two main ways:
- Improved transparency allows for detection of questionable research practices, particularly p-hacking and HARKing.
- Unpublished analyses are still findable through preregistration, reducing the impact of publication bias.
It’s worth noting that recent literature on preregistration has questioned its effectiveness (a well-maintained Wikipedia page gives an extensive bibliography). This particularly relates to how preregistration is applied in practice: are registrations actually followed, and are they specific enough to leave no room for p-hacking?
Though preregistration’s adoption by open research has been fairly recent, clinical trial registration has been common practice since the mid-2000s, giving an impression of what might happen were preregistration broadly mandated. Clinical trial registries have a poor track record in terms of both registration coverage and accuracy. Many of these issues played out during the Covid-19 pandemic, with registries delivering little of their promised value for research co-ordination.
One response would be to frame preregistration as a more targeted intervention, showing when a finding has been HARKed. While preregistration is certainly effective at revealing HARKing, the status of this “questionable research practice” is complex. Kerr’s classic paper on the subject gives a valuable introduction to the issue of when post hoc theorising can be problematic. Indeed, discussion of HARKing leads to methodological questions about the appropriate use of hypothesis testing itself.
Further considerations
Even in this very brief summary, it’s clear there is a lot for librarians to take on board if tasked with designing a service around preregistration. This should be expected given experiences with open access and open data, both of which require specific expertise. Research data management provides a fitting comparison and, combined with preregistration, offers a gold standard for research transparency. Discipline-specific concerns should inform how this standard can appropriately be applied to any individual study.
As with open data practices, the best implementation of preregistration depends on the right infrastructure and on the support of experts to ensure that study plans are as useful as possible. Indeed, some of the shortcomings described above should encourage the development of research support. Such support can account for the nuances of the practice, as we see with research data management and its effective handling of concerns around sensitive material, IPR and potential commercial exploitation.