Robustness

Code robustness

Error management

Protect the user:

Make assumptions and expectations explicit.
- check values before processing them
- identify and manage exceptions
Produce errors when expectations are not met.
Consider error options, and perform error management:
- redirect the program
- log or report the error, to allow the user (or developer) to troubleshoot
- if necessary: abort the run
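
The three options above can be sketched in Python as follows. This is a minimal illustration, not a prescribed pattern: the functions `mean_abundance` and `process_samples` and the sample values are hypothetical and do not come from the course material.

```python
import logging
import sys

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def mean_abundance(values):
    """Return the mean of a non-empty list of numbers (hypothetical example)."""
    # Make the expectation explicit: an empty list has no mean.
    if len(values) == 0:
        raise ValueError("Cannot compute the mean of an empty list.")
    return sum(values) / len(values)


def process_samples(samples, abort_on_error=False):
    """Compute the mean per sample and manage errors for invalid samples."""
    results = {}
    for name, values in samples.items():
        try:
            results[name] = mean_abundance(values)
        except ValueError as error:
            # Log the error so the user (or developer) can troubleshoot.
            logger.error("Sample %s skipped: %s", name, error)
            if abort_on_error:
                # If necessary: abort the run with a non-zero exit status.
                sys.exit(1)
            # Otherwise redirect the program: move on to the next sample.
            continue
    return results


print(process_samples({"a": [10, 5, 4, 12, 25], "b": []}))  # {'a': 11.2}
```

Whether skipping a sample or aborting is the better choice depends on how much the rest of the analysis relies on that sample.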

Advanced robustness: unit tests

Protect the developer (you!)

Test the expected behavior of your functions:
- Confirm a known output given a known input
- Do errors get produced as expected when the input calls for it?
Capture unexpected errors to identify further options for error management
You can automate running tests when pushing to GitHub using Continuous Integration
Tests are definitely worth learning when your project increases in size!
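
As a first impression, the two kinds of checks listed above could look like this in Python. The function `celsius_to_kelvin` is a hypothetical example. The test functions use plain `assert` statements and names starting with `test_`, so a test runner such as pytest can collect them from a test file; here they are simply called directly.

```python
def celsius_to_kelvin(temperature):
    """Convert degrees Celsius to kelvin (hypothetical example function)."""
    if temperature < -273.15:
        raise ValueError("Temperature below absolute zero.")
    return temperature + 273.15


def test_known_input_gives_known_output():
    # Confirm a known output given a known input.
    assert abs(celsius_to_kelvin(0) - 273.15) < 1e-9


def test_invalid_input_raises_error():
    # Confirm that an error is produced when the input calls for it.
    try:
        celsius_to_kelvin(-300)
    except ValueError:
        pass
    else:
        raise AssertionError("Expected a ValueError for -300 degrees Celsius.")


# Run the tests by hand; a test runner would normally do this for you.
test_known_input_gives_known_output()
test_invalid_input_raises_error()
print("All tests passed")
```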

More on tests later…

Throwing an error (Python)

def read_vector_value(index=0, my_vector=[10,5,4,12,25]):
    if index > len(my_vector) - 1:
        raise IndexError('Index higher than vector length.')
    return my_vector[index]

read_vector_value(index=6)  # raises IndexError: Index higher than vector length.

. . .

Why not simply adjust the function output?

def read_vector_value(index=0, my_vector=[10,5,4,12,25]):
    if index > len(my_vector) - 1:
        return None
    return my_vector[index]

print(read_vector_value(index=6))

. . .

Because it is unclear if None is expected behavior or indicative of a problem.
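
A small sketch of the ambiguity, assuming a hypothetical vector in which `None` marks a missing measurement:

```python
def read_vector_value(index=0, my_vector=(10, 5, 4, 12, 25)):
    # Variant from the slide above: return None instead of raising an error.
    if index > len(my_vector) - 1:
        return None
    return my_vector[index]


# Hypothetical data: None is a legitimate value marking a missing measurement.
measurements = [10, None, 4]

missing_value = read_vector_value(index=1, my_vector=measurements)
out_of_range = read_vector_value(index=6, my_vector=measurements)

# Both calls return None, so the caller cannot tell the two situations apart.
print(missing_value is None, out_of_range is None)  # True True
```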

Warning message without breaking (R)

An error breaks code execution

read_vector_value <- function(index = 1, my_vector = c(10, 5, 4, 12, 25)) {
  if (index > length(my_vector)) {
    stop("Index higher than vector length.")
  }
  return(my_vector[index])
}

print(read_vector_value(index=6))

. . .

Issue a warning instead of an error, and return NA so that execution continues

read_vector_value <- function(index = 1, my_vector = c(10, 5, 4, 12, 25)) {
  if (index > length(my_vector)) {
    warning("Index higher than vector length.")
    return(NA)
  }
  return(my_vector[index])
}

print(read_vector_value(index=6))
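
For comparison, a Python counterpart of the R example above could use the standard `warnings` module. This is only a sketch: returning `math.nan` as a stand-in for R's `NA` is an assumption made for illustration.

```python
import math
import warnings


def read_vector_value(index=0, my_vector=(10, 5, 4, 12, 25)):
    if index > len(my_vector) - 1:
        # Emit a warning (a UserWarning by default) without stopping execution.
        warnings.warn("Index higher than vector length.")
        # Assumption: NaN plays the role of R's NA here.
        return math.nan
    return my_vector[index]


value = read_vector_value(index=6)  # warning is reported, the script continues
print(value)  # nan
```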

Redirecting with exceptions (Python)

If you do not want an error to interrupt your script, catch it with try/except (known as try/catch in many other languages). Note that Python lets you distinguish between error types!

# redefine the function so that the version raising an error is in memory
def read_vector_value(index=0, my_vector=[10,5,4,12,25]):
    if index > len(my_vector) - 1:
        raise IndexError('Index higher than vector length.')
    return my_vector[index]

Compare:

try:
  read_vector_value(6)
except IndexError:
  print("This is an exception")  # This will be executed

and

try:
  read_vector_value(6)
except ArithmeticError:
  print("This is an exception")  # Not executed: the IndexError is not caught here and stops the script
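
Several `except` clauses can be combined to respond differently to each error type. A minimal sketch, in which the helper `describe_lookup` is a hypothetical name introduced for illustration:

```python
def read_vector_value(index=0, my_vector=(10, 5, 4, 12, 25)):
    if index > len(my_vector) - 1:
        raise IndexError("Index higher than vector length.")
    return my_vector[index]


def describe_lookup(index):
    """Look up a value and report which kind of problem occurred, if any."""
    try:
        value = read_vector_value(index)
    except IndexError as error:
        # Raised by our own check when the index is too high.
        return f"Index problem: {error}"
    except TypeError as error:
        # Raised by Python when the index cannot be compared with a number.
        return f"Type problem: {error}"
    else:
        # Runs only when no exception was raised.
        return f"Value: {value}"


print(describe_lookup(1))    # Value: 5
print(describe_lookup(6))    # Index problem: Index higher than vector length.
print(describe_lookup("a"))  # Type problem: followed by Python's own message
```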

Redirecting with exceptions (R)

In R you can use tryCatch():

read_vector_value <- function(index = 1, my_vector = c(10, 5, 4, 12, 25)) {
  if (index > length(my_vector)) {
    stop("Index higher than vector length.")
  }
  return(my_vector[index])
}

Catch the exception:

tryCatch({
    read_vector_value(6)
}, error = function(e) {
    print("This is an exception.")  # This will be executed
})

Validating input

Consider adding statements early in the script that validate the (data) input.

With if/else (Python):

if not protein_data:
  raise ValueError("Dataset cannot be empty")
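
A slightly fuller sketch of early validation, assuming that `protein_data` is a list of dictionaries with an `abundance` field. That structure is hypothetical. Note that `if not protein_data` works for lists and dictionaries, but raises an error of its own for a pandas DataFrame, where the `empty` attribute would be the check to use.

```python
def validate_protein_data(protein_data):
    """Raise an error if the dataset does not meet the stated expectations."""
    if not isinstance(protein_data, list):
        raise TypeError("Dataset must be a list of records.")
    if len(protein_data) == 0:
        raise ValueError("Dataset cannot be empty")
    for position, record in enumerate(protein_data):
        if "abundance" not in record:
            raise ValueError(f"Record {position} has no 'abundance' field.")
        if record["abundance"] < 0:
            raise ValueError(f"Record {position} has a negative abundance.")


# Hypothetical dataset used only for this illustration.
protein_data = [
    {"protein_id": "P1", "abundance": 12.5},
    {"protein_id": "P2", "abundance": 3.0},
]

validate_protein_data(protein_data)  # passes silently when all checks succeed
```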

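With tryCatch() (R):

tryCatch({
  # placeholder for a processing step that may fail
  do_something_that_might_go_wrong(protein_data)
}, error = function(e) {
  # report the error message; log() would compute a logarithm in R
  message("Error: ", conditionMessage(e))
}, finally = {
  # placeholder for a cleanup step; runs whether or not an error occurred
  cleanup(protein_data)
})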

Expectations and assumptions

Expect the worst:

use of wrong input values for functions
malformed text input
wrong data types
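
A sketch of guarding against malformed text and wrong data types in Python. The function `parse_count` is a hypothetical example that reads a non-negative whole number from text.

```python
def parse_count(text):
    """Convert text such as ' 42 ' to a non-negative integer."""
    # Wrong data type: only text is accepted.
    if not isinstance(text, str):
        raise TypeError(f"Expected text, got {type(text).__name__}.")
    # Malformed text: int() raises ValueError, which is re-raised with a clearer message.
    try:
        count = int(text.strip())
    except ValueError:
        raise ValueError(f"Cannot read {text!r} as a whole number.") from None
    # Wrong value: a count cannot be negative.
    if count < 0:
        raise ValueError(f"Count must not be negative, got {count}.")
    return count


print(parse_count(" 42 "))  # 42
```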

Your turn: explicit expectations

Identify assumptions in your code

What assumptions/expectations exist on your data or (user) input?
What assumptions/expectations exist on the input of (a) function(s)?

Make the input/data assumptions explicit

Option 1: Explicitly state assumptions on data or input in your README.md.
Option 2: Write a piece of code that tests the validity of data/input, and reports an error if the expectations are not met.

Test the input for a function

Modify the code inside your function to
- check the value of the arguments passed to your function using if/else statements;
- raise an error in case an argument is out of the range of acceptable values.
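
As a starting point, such a check could look like the sketch below. The function `count_significant` and its `threshold` argument are hypothetical; replace them with a function and range that match your own code.

```python
def count_significant(p_values, threshold=0.05):
    """Count the p-values below the threshold (hypothetical example)."""
    # Check the argument before using it.
    if not 0 <= threshold <= 1:
        raise ValueError(f"threshold must be between 0 and 1, got {threshold}.")
    # Check the data as well: every p-value must be a valid probability.
    for p_value in p_values:
        if not 0 <= p_value <= 1:
            raise ValueError(f"p-value out of range: {p_value}.")
    return sum(1 for p_value in p_values if p_value < threshold)


print(count_significant([0.01, 0.2, 0.04]))  # 2
```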