Documentation

Why do you need documentation?

  • You want yourself to understand how code written some time ago works
  • You want others to understand how to (re-)use your code

For this you need to

  • Explain parts of your code with comments
  • Explain what to install and how to get started in your readme
  • Explain in-depth use of your code in a notebook

Comments

Comments are annotations you write directly in the source code.

They:

  • are written for users who deal with your source code

  • explain parts that are not intuitive from the code itself

  • do not replace readable or structured code

  • (in a specific structure) can be used to directly generate documentation for users.

Do not use comments…

  • …to repeat in natural language what is written in your code
# Now we check if the age of a patient is greater than 18
if(agePatient > 18)
  • …to turn old code into zombie code (fine for troubleshooting, but do not leave it in!)
# Do not run this!!
# itDoesNotWork <- optimizeMulticoreDeepLearning(myProteins)
# if(itDoesNotWork == 1444){
#    connection <- connectToHPC(currentUser, password)
#}
  • …to replace version control, like git
# removed on August 5
# if() ...

#Now, it connects to the API with o-auth2, updated 05/05/2016
...
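
A comment that only repeats the code can often be dropped once the code names its own intent. A minimal Python sketch of the age check above; the names ADULT_AGE and is_adult are hypothetical and chosen only for illustration:

```python
# Hypothetical names for illustration; not taken from any real project.
ADULT_AGE = 18  # threshold assumed from the age example above


def is_adult(age_patient):
    """Return True if the patient is older than ADULT_AGE."""
    return age_patient > ADULT_AGE


# The call site now reads like the comment it replaces.
if is_adult(42):
    print("adult patient")
```

With a descriptive function name and a named constant, the "Now we check if..." comment has nothing left to add.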

Comment lines: WHY over HOW

Comment lines explain the purpose of a piece of code (the why), not its mechanics (the how).

# Bug fix GH 20601
# If the data frame is too big, the number of unique index combination
# will cause int32 overflow on windows environments.
# We want to check and raise an error before this happens

num_rows = np.max([index_level.size for index_level in self.new_index_levels])
num_columns = self.removed_level.
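
The same pattern in a self-contained Python sketch: the comment states why the guard exists, and the code itself shows how. The function name, the scenario, and the error message are assumptions made for illustration, not the actual pandas implementation:

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def check_table_size(num_rows, num_columns):
    """Raise ValueError if num_rows * num_columns does not fit in int32."""
    # WHY (assumed scenario): the cell count is later stored in a 32-bit
    # integer, where a too-large product would silently wrap around.
    # Failing early gives a clear message instead of corrupted results.
    num_cells = num_rows * num_columns  # Python ints do not overflow
    if num_cells > INT32_MAX:
        raise ValueError(
            f"{num_rows} x {num_columns} cells exceed the int32 limit"
        )
    return num_cells
```

Note that nothing in the comment explains the multiplication or the comparison; those are readable from the code.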

Docstrings

  • Structured comments, associated with segments (rather than lines) of code, can be used to generate documentation for users of your project.

  • These comments are called docstrings.

  • A docstring is written as the first statement of a module, function, class, or method.

  • Docstrings allow you to document a function in a way that is relevant to the user of that function.

  • Writing docstrings makes you generate your documentation as you are generating the code: efficiently, comprehensively!

Generating docstrings

In R you will need a separate package to deal with docstrings:

library(docstring)

multiply <- function(x,y){
#' @title Multiply two numbers
#' @description This function takes two
#' input numbers and multiplies
#' them. It returns the multiplied result.
#' @param x The first value
#' @param y The second value
#' @return The two arguments multiplied.

  return(x*y)
}

?multiply

Generating docstrings

In Python, a docstring is a string literal placed directly after the function definition line:

def multiply(x, y):
    """
    Multiply two numbers

    This function takes two input numbers and multiplies them.
    It returns the multiplied result.

    Keyword arguments:
    x -- the first value
    y -- the second value
    """
    return x * y

NB: triple single quotes (''') also work, but PEP 257 recommends triple double quotes (""") for docstrings.
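
Because Python stores the docstring on the function object, it can be read back at run time. A short sketch using only the standard library (the `__doc__` attribute and `inspect.getdoc`); the function is a shortened copy of the example above:

```python
import inspect


def multiply(x, y):
    """
    Multiply two numbers

    This function takes two input numbers and multiplies them.
    It returns the multiplied result.
    """
    return x * y


# The raw docstring, including its original indentation.
raw_doc = multiply.__doc__

# inspect.getdoc strips the common indentation and surrounding blank lines.
clean_doc = inspect.getdoc(multiply)
print(clean_doc)

# In an interactive session, help(multiply) shows the same text.
```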

A glimpse into documentation generation

Docstrings are formatted so that they can easily be turned into documentation of your package.

You will need additional tools:

  • http://www.doxygen.nl/ : C++ (and many more languages)
  • http://www.sphinx-doc.org/ : Python
  • https://roxygen2.r-lib.org/ : R

We will not do this today, but it is worth checking out if you want to release your code!

For an R package, run the following command in the RStudio console:

roxygen2::roxygenise()

The command above creates documentation files that can be viewed through R's help system, for example with help(package = "yourpackage") once the package is installed.
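
For Python, the idea behind these generators can be sketched with the standard library alone: `pydoc.render_doc` turns a docstring into a plain-text help page. This only illustrates docstring-driven documentation and is not a replacement for Sphinx; the `multiply` function repeats the earlier example:

```python
import pydoc


def multiply(x, y):
    """Multiply two numbers.

    Keyword arguments:
    x -- the first value
    y -- the second value
    """
    return x * y


# Render the help page as plain text (no terminal bold formatting).
page = pydoc.render_doc(multiply, renderer=pydoc.plaintext)
print(page)
```

The rendered page contains the function signature followed by the docstring text, which is the raw material that a full documentation generator formats into HTML or PDF.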

Your turn (choose one!)

Comment lines

  • Do you have superfluous comments? Remove them!
    • Remove your zombie code and version control-like comments
    • See if you can replace a ‘how’ comment with a ‘why’ comment (what is the purpose of this code? rather than this is how this code works)
  • Are there elements without comments that need them? Add them!
    • Have you found yourself staring at a piece of code for too long without understanding it? Perhaps it needs more information!
    • Try to comment on the thought behind the code rather than restating the code in English.

Docstrings

  • Add a docstring to a function, preferably the last function you worked on (so it’s fresh in your memory).
  • Keep in mind: what does my user need to know when they are working with this function?