Phase 2: Writing & Analysis

Create an executable script that documents the code required to load the raw data into a tabular format and, if applicable, to de-identify human subjects
- Document this preprocessing (“data wrangling”) procedure in the prepare_data.R file.
- This file is intended to document steps that cannot or should not be replicated by end users, unless they have access to the raw data file.
- These are steps you would run only once, the first time you load data into R.
- Make this file as short as possible; only include steps that are absolutely necessary.
Save the data using open_data() or closed_data()
- WARNING: Once you commit a data file to the ‘Git’ repository, its record will be retained forever (unless the entire repository is deleted). Assume that pushing data to a ‘Git’ remote repository cannot be undone. Follow the mantra: “Never commit something you do not intend to share”.
- When using external data sources (e.g., obtained using an API), it is recommended to store a local copy, to make the project portable and to ensure that end users have access to the same version of the data you used.
- NOTE: The open_data() and closed_data() functions generate a codebook, and possibly additional files, as part of their output. Do not worry about the new files added to your project.
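As an illustration of the two steps above, a minimal prepare_data.R might look like the sketch below. The inline data frame stands in for a raw data file, and the column names (name, email, age, score) are hypothetical; in a real project you would read the raw file instead. The call to open_data() is guarded so that it only runs inside a WORCS project, which we assume here can be recognized by its .worcs file.

```r
# prepare_data.R: run once, by someone with access to the raw data.

# Stand-in for loading the raw file, e.g. read.csv("raw_data.csv")
# (hypothetical file name and columns, for illustration only).
raw <- data.frame(
  name  = c("Ann Smith", "Bob Jones", "Cy Lee"),
  email = c("ann@example.com", "bob@example.com", "cy@example.com"),
  age   = c(34, 51, 27),
  score = c(3.2, 4.1, 2.8),
  stringsAsFactors = FALSE
)

# De-identify: drop direct identifiers and add an arbitrary participant ID.
identifiers <- c("name", "email")
df <- raw[, setdiff(names(raw), identifiers), drop = FALSE]
df$id <- seq_len(nrow(df))

# Save the data; assumed to be meaningful only inside a WORCS project.
if (file.exists(".worcs") && requireNamespace("worcs", quietly = TRUE)) {
  worcs::open_data(df)      # use when the data may be shared
  # worcs::closed_data(df)  # alternative when the data should not be shared
}
```

Only the columns listed in identifiers are removed here; deciding which variables identify a participant remains a judgment call for each dataset.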
Write the manuscript in Manuscript.Rmd
- Use code chunks to perform the analyses. The first code chunk should call load_data().
- Finish each sentence with one carriage return (enter); separate paragraphs with a double carriage return.
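The body of the first code chunk can be very small, as in the sketch below. The guard is an addition for illustration, so that the snippet does nothing outside a WORCS project (assumed to be recognizable by its .worcs file); in an actual manuscript the chunk would simply load the package and call load_data().

```r
# Body of the first code chunk in Manuscript.Rmd (sketch).
in_worcs_project <- file.exists(".worcs") &&
  requireNamespace("worcs", quietly = TRUE)

if (in_worcs_project) {
  library("worcs")
  # Makes the data previously saved with open_data() or closed_data()
  # available to the remaining code chunks.
  load_data()
}
```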
Regularly Commit your progress to the Git repository; ideally, after completing each small and clearly defined task.
- In the top-right panel of ‘RStudio’, select the ‘Git’ tab.
- Select the checkboxes next to all files whose changes you wish to Commit.
- Click the Commit button to open the pop-up window.
- In the pop-up window, write an informative “Commit message”.
- Click the Commit button below the message dialog.
- Click the green arrow labeled “Push” to send your commit to the remote repository.
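If you prefer the R console over the ‘Git’ tab, the same sequence can be sketched with the worcs helper git_update(), which, to our understanding, stages, commits, and pushes in one call. The commit message is a made-up example, and the guard (checking for a .worcs file) is an assumption added so the snippet does nothing outside a WORCS project.

```r
# An informative message describing one small, clearly defined task
# (hypothetical example).
commit_message <- "Add descriptive statistics table to Manuscript.Rmd"

if (file.exists(".worcs") && requireNamespace("worcs", quietly = TRUE)) {
  # By default this includes all changed files in the project, so check
  # beforehand that nothing is included that you do not intend to share.
  worcs::git_update(commit_message)
}
```

Because this call does not offer the per-file checkboxes of the ‘Git’ tab, it is best suited to projects where sensitive files are already excluded from version control.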
While writing, cite essential references with one at-symbol, [@essentialref2020], and non-essential references with a double at-symbol, [@@nonessential2020].

When writing in RMarkdown format, you use Markdown citekeys to refer to references, and these references will be stored in a separate text file known as a .bib file.

To ease this process, we recommend following this procedure for citation:

During writing, maintain a plain-text .bib file with the BibTeX references for all citations.
- You can export a .bib file from most reference manager programs; the free, open-source reference manager Zotero is excellent and user-friendly, and highly interoperable with commercial reference managers. Here is a tutorial for using Zotero with RMarkdown.
- Alternatively, it is possible to make this file by hand, copying and pasting each new reference below the previous one; e.g., Figure ?? shows how to obtain a BibTeX reference from Google Scholar; simply copy-paste each reference into the .bib file.
To cite a reference, use the citekey (the first word in the BibTeX entry for that reference). Insert it in the RMarkdown file like so: @yourcitekey2020. For a parenthesized reference, use [@citekeyone2020; @citekeytwo2020]. For more options, see the RMarkdown cookbook.
To indicate a non-essential citation, mark it with a double at-symbol: @@nonessential2020.
When Knitting the document, adapt the knit command in the YAML header:
- knit: worcs::cite_all renders all citations, and
- knit: worcs::cite_essential removes all non-essential citations.
Optional: To be extremely thorough, you could make a “branch” of the GitHub repository for the print version of the manuscript, and use knit: worcs::cite_essential only in this branch. The procedure is documented in this tutorial.
