Reproducibility Policy

Over the last decade, we have witnessed a crisis in science in which many admired research studies have been overturned or found non-replicable. Researchers increasingly recognize that publication itself does not imply that findings are robust, and the public has questioned the credibility of social science research. In order to advance the credibility of sociological research, Sociological Science has adopted a reproducibility policy.

Starting with submissions received after April 1, 2023, authors of articles relying on statistical or computational methods will be required to deposit replication packages as a condition of publication in Sociological Science. Replication packages must contain both the statistical code and — when legally and ethically possible — the data required to fully reproduce the reported results. With this policy, Sociological Science hopes that other high-impact journals in sociology will follow suit in setting standards for the reproducibility of published work.

In addition to depositing replication packages, papers relying on experimental methods must adhere to the disclosure and pre-registration requirements outlined in the journal’s Policy on Findings from Experimental Data below.

Under many legitimate circumstances, data cannot legally or ethically be made available to readers. When authors cannot make their data available, they must explain why in the main text of the paper. In such cases, making code and other materials available is still required, unless doing so would violate legal or ethical constraints.

Researchers using qualitative data, such as interviews or participant observation data, are not required to submit a replication package. We encourage authors to make qualitative data available when possible, and urge them to consider whether materials such as interview protocols or coding schemes can be shared.

Replication packages are not required at the time of submission and are not part of the editorial review. The packages are only required if a paper is accepted, and shall be deposited when authors return their publication agreement. However, authors must clearly state in their submission that they either meet this standard or are claiming an exemption. Sociological Science considers this transparency statement to be an important element of the article.

Frequently Asked Questions

Why is transparency important?

We believe transparency and replication packages are necessary for two reasons: 1) Scientific integrity and 2) Knowledge dissemination.

Scientific integrity: Empirical findings should not be accepted on faith, nor on reputation, nor on seniority. As the motto of the Royal Society states, “Nullius in verba” – take nobody’s word for it. The role of a scientific community is to evaluate collectively the quality of the research it produces. The editorial review process of a journal is one means of peer evaluation, but it is imperfect. For scholars to see for themselves how the evidence and findings were generated, each author must provide their colleagues with as much material as legally and ethically possible to allow results to be reproduced.

Knowledge dissemination: Transparency advances the research frontier. High quality research papers often implement methods in novel ways that other scholars can learn from – this is part of a paper’s contribution, and part of why it deserves to be published. However, methodology sections in articles are necessarily abstract summaries of a research strategy. Many other details often remain in the code. Obscuring the exact details of an analysis treats methodological knowledge as a kind of trade secret, hindering the diffusion of new methods.

Why should a journal require replication packages?

Sociology’s scientific rigor and credibility would be strengthened by the widespread availability of replication packages. Their absence diminishes the quality of the discipline’s collective work product and slows the advancement of knowledge. Yet replication packages are public goods that are subject to collective action problems. In an intellectual community in which other scholars routinely withhold code and data, the incentives for individual scholars to provide replication packages are weak, if not negative. For an individual scholar, transparency reaps the most benefits when other researchers are also doing it – when sharing code and data grants membership to a community of scholars who reciprocally share their own data and code. While a growing number of sociologists today voluntarily provide replication packages, we believe that journals must provide leadership and set a high standard for transparency and credibility.

Why is the replication package only required for statistical and computational research?

Sociological Science recognizes that best practices for transparency in qualitative social science are currently under debate, and that researchers using qualitative data such as interviews and participant observation face challenges in making materials available that differ from those facing quantitative researchers. Notably, researchers may be unable to make their interview or fieldnote data available if, to gain access to their respondents, they were legally, ethically, or administratively required to protect those respondents’ anonymity. For these reasons, results based on qualitative methods of inquiry are exempt from the stipulations of this policy. We encourage authors to make qualitative data available when possible, and urge them to consider whether materials such as interview protocols or coding schemes can be shared. This information can be stored using the same third-party repositories mentioned elsewhere in this document.

What must authors do?

Articles with statistical or computational results must either:

1. Deposit data, code, and other materials necessary to reproduce results in a third-party repository, such as OSF, Dataverse, or openICPSR. This is what we mean by a replication package. The published paper must indicate that a replication package is available and provide its location, but the specific details may be added after the manuscript has been accepted. The publication agreement will require authors to provide the link to their replication package, which must also be included in the final manuscript.

2. When data cannot be made available, indicate in the text of the paper the legal, methodological, ethical, or other constraints that preclude access. In these cases, making code and other materials available is still required if allowed in a way that is consistent with the authors’ legal and ethical obligations. Code and other non-data materials should be made available through a third-party repository, and the availability of the materials should be indicated in the paper.

Before their article is published, authors will be asked to confirm either that one of the above conditions holds for their manuscript or that the manuscript does not contain statistical or computational results. Final determination of whether the requirements of the transparency policy have been met (or whether a paper is exempt from the requirements) rests with the Editors of Sociological Science.

What happens if I refuse to provide a replication package?

If the Editors of the journal determine that the paper is subject to the requirements of the transparency policy, and the author refuses to provide a replication package or specify the legitimate constraints that preclude them from doing so, then the paper will not be published.

Do reviewers or editors check the accuracy of the replication packages?

No. The existence of a replication package does not imply that the package has been vetted. The journal will verify that the paper correctly lists the contents of the replication package, but will not perform replication tests. Replication packages are only required before publication, not at the outset of the review process. If a reader discovers a problem with the contents of a replication package, their first step should be to contact the authors.

Does this policy require researchers to use a specific software package for analyses?

No. We ask authors to provide data and code in whatever formats were used in the analyses. To maximize reproducibility, we do recommend that authors document the version numbers of both the software used to produce the results and any specific packages, libraries, or other dependencies for which version changes might yield different results.
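As an illustrative sketch only — assuming a Python-based analysis, with the helper name and package list made up — version information of this kind can be recorded programmatically and saved alongside the package:

```python
import platform


def environment_report(packages):
    """Return interpreter and package version numbers for inclusion in a
    replication package's documentation (a hypothetical helper, not a
    requirement of this policy)."""
    report = {"python": platform.python_version()}
    for name in packages:
        module = __import__(name)
        # Most scientific packages expose __version__; fall back if absent.
        report[name] = getattr(module, "__version__", "unknown")
    return report
```

Equivalent information can of course be recorded by hand, or with tools such as `pip freeze` (Python), `sessionInfo()` (R), or `version` (Stata).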

Does this policy require researchers to document their results in a specific way?

No. When, as is typically the case, the analysis involves more than one code file and one data file, we urge researchers to include a plain text readme file that briefly describes the contents of the replication package. The readme file should be at the top level of the package’s file structure. An example of guidance for a readme file is here.
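As an illustrative sketch (all file names hypothetical), a minimal readme might look like:

```
README.txt

Replication package for "Paper Title" (Sociological Science, YEAR)

Contents:
  data/analysis_data.csv   - analysis data set (see codebook.pdf)
  code/01_clean_data.py    - builds the analysis data from raw files
  code/02_models.py        - produces Tables 1-3
  code/03_figures.py       - produces Figures 1-2
  run_all.py               - master script; runs the files above in order

Software: versions of the statistical software and all dependencies
```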

The contents of the data file should be transparently intelligible to a user. Best practices include producing a codebook that describes each variable and the meaning of its values; explicitly referring to an existing codebook when the data come from a secondary source (for example, the General Social Survey codebook for analyses of GSS data); and, where the data format allows, attaching clear labels to all variables and values so that the labels convey the information needed to understand the variables.

For code files themselves, we recognize that documentation practices vary considerably. At a bare minimum, we strongly recommend that documentation clearly demarcate which sections of code correspond to which specific tables and figures presented in the paper. When code spans multiple files, a master script file is encouraged: a single code file that calls the other code files to be executed in the analysis. We also expect researchers to follow good practices for commenting code, to improve legibility and clarity about what the code is doing.
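A master script of the kind described above can be very short. A minimal sketch, assuming a Python-based analysis with hypothetical file names:

```python
import subprocess
import sys

# Analysis stages, in the order they must run (file names are illustrative).
STEPS = [
    "01_clean_data.py",    # builds the analysis data set
    "02_descriptives.py",  # produces Table 1
    "03_models.py",        # produces Tables 2-3
    "04_figures.py",       # produces Figures 1-2
]


def run_all(steps=STEPS, dry_run=False):
    """Run each script in sequence; dry_run=True only reports the order."""
    executed = []
    for script in steps:
        if not dry_run:
            # check=True stops the run if any stage fails.
            subprocess.run([sys.executable, script], check=True)
        executed.append(script)
    return executed
```

The same idea carries over directly to do-files in Stata or a top-level R script that `source()`s each stage.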

Does the policy require the posting of questionnaires, experimental stimuli, etc.?

Not presently. We do recognize that such material may be integral for making sense of results, for understanding puzzling aspects of findings, and for other researchers’ ability to build on a study’s findings. Consequently, even though we do not require it, we encourage researchers to include these materials to the maximum extent possible, and note that repositories often support including them as part of the same upload as the replication package.

Does the policy require the posting of complete datasets?

Sociological Science does not require authors to provide data beyond those necessary to reproduce published results. For example, a project may collect data on more variables than those used in the manuscript, and the additional variables need not be included with the data shared under this policy.

What if authors are analyzing publicly available datasets?

We expect authors to include the original data set (i.e., all raw variables used in the analysis) in their package whenever possible. Alternatively, replication packages must include (a) a link to the official website where the data set(s) are stored; (b) the release version and/or release date of the data the researcher used, in case the data producers later make corrections; and (c) the code necessary to extract the variables (since users often rename variables as they extract them).
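As a sketch of point (c), assuming a Python workflow — both the raw release names and the analysis names in the mapping below are made up — extraction code can record exactly how each analysis variable maps back to the official release:

```python
# Raw release names mapped to the analysis names used in the paper
# (both sides of this mapping are hypothetical).
RENAME = {"V1234": "educ", "V5678": "income"}


def extract_variables(rows, rename=RENAME):
    """Keep only the raw variables used in the analysis, renamed on
    extraction, so readers can trace each analysis variable back to the
    official data release."""
    return [{new: row[old] for old, new in rename.items()} for row in rows]
```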

What if data are available to researchers but cannot be posted?

If data are available to other researchers but cannot be redistributed by the investigators, this should be stated in the article. For example, the Panel Study of Income Dynamics requires all users to register before obtaining data, which prevents researchers from providing their data directly. However, full replication packages can still be shared using the PSID’s own repository at openICPSR, which limits access to registered PSID users. In general, if the data can be downloaded from a website, the paper should say so and provide the URL.

Can other exceptions or accommodations be made?

Yes. We recognize there will be unique circumstances not addressed in our current policies. Authors are encouraged to discuss with the editors any outstanding concerns at the time of submission.

Policy on Findings from Experimental Data Published in Sociological Science

These policies for findings from experimental data are in addition to those for the provision of data and code discussed in the general policy for the transparency and credibility of work published in Sociological Science.

(1) Disclosure expectations

For results from experiments, Sociological Science expects that the discussion of methods, results, or an appendix includes mention of (a) all outcome measures collected and analyzed for the article’s target research question; (b) all independent variables or experimental manipulations analyzed for the research questions, whether successful or not; and (c) the total number of observations for which experimental data were collected but which are excluded from analyses, along with the reasons for exclusion.

(2) Pre-registration

Sociological Science requires authors of papers that include originally collected experimental data and present findings from those experiments as testing hypotheses (i.e., as “confirmatory” rather than “exploratory” research) to indicate explicitly whether those experiments were pre-registered.

For an experiment to be pre-registered, it must have been provided to an independent online registry prior to data collection, and it must include specific information about the planned analyses in addition to the experimental design and hypotheses. The Open Science Framework provides both an example of a recommended registry and a recommended template for the information to be included in a pre-registration.

Manuscripts reporting pre-registered experiments must report all pre-registered analyses in the body of the manuscript or in an appendix, note all deviations from the pre-registered analysis plan, and clearly distinguish analyses that were pre-registered from those that were not.

An anonymized version of this pre-registration must be included at the time of submission as additional materials available to reviewers. Apart from anonymization, the pre-registration must be the same as that which was provided to the registry prior to data collection.

After acceptance, papers with pre-registered experiments should include a link to the online pre-registration in the version of the paper to be published.

This policy regarding pre-registration does not apply to secondary analyses of already-existing experimental data, such as analyses of the various experimental items available in different waves of the General Social Survey, so long as it is clear from the text that the data were independently collected and already existed at the time of the study.
