## We can’t avoid mistakes

But we can work as cautiously as possible, to make sure we catch them in time. It is always better to try to run something and fail, than to have the operation keep going and accumulate mistakes.

There are four types of mistakes to look out for: mistakes in the code, a confusing interface, issues with arguments, and lack of integration. Some are caused by the programmer, and some by the user. But in the context of writing code for science, the programmer and the user are often the same person, so passing the blame around ends up being a very frustrating exercise. Even when they are not, user mistakes can come from sub-optimal design. It is crucial to work in a way that protects everyone against mistakes.

In this lesson, you may notice that we will switch perspective frequently, from user to developer. This is because, in our own experience, it is a fair representation of the way we work: we try to write something (developer), then apply it to a specific problem (user), then figure out there is an issue and switch back to developer mode.

## After this lesson, you will be able to …

- … use defensive programming
- … write basic tests to ensure that the program fails when it should
- … think about function design in a way that minimizes confusion

But let’s do something a little bit different. Instead of writing a function, we will start by thinking about its behavior. In this example, we want to write a function that calculates the signal-to-noise ratio in an array of numbers, using the S = μ/σ expression, where μ is the average and σ the standard deviation. That’s it.
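To see what we are aiming for, here is the calculation done by hand (a sketch using the `Statistics` standard library; we use the population standard deviation, i.e. `corrected=false`, to match the μ/σ definition above – note that `std` defaults to the sample version, so this choice is ours):

```
using Statistics

x = [1.0, 2.0, 3.0]
S = mean(x) / std(x; corrected=false)   # μ/σ with the population standard deviation, ≈ 2.449
```

Our function should eventually reproduce this value, but with a lot more safety around it.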

What we will do is *declare* the function, but put nothing in it – in each of
the sections, we will add a few lines to the function, to make it *work*. For
now, we want a function that does *nothing*.

```
function snr(x::Vector{T}) where {T <: Number}
end
```

```
snr (generic function with 1 method)
```

## Using the wrong arguments

The first thing that can go wrong with this function is calling it with the
wrong arguments. In a sense, we have limited this risk because we took advantage
of Julia’s type annotation system, and so we can *only* call our function when
the argument is an array of numbers.

But with the definition of signal-to-noise ratio we picked, it only makes sense
to apply this function when all elements of `x` are *non-negative*. So this is
the first thing our function should check. But instead of changing the function,
we will first *test* its behavior. This practice is called *test-driven
development*. It is not the only way to proceed, but we think it is an
interesting practice to experiment with.

Let’s start by documenting our “normal” behavior: calling the function with only positive or null numbers should not give any warning or error message:

```
using Test
@test_nowarn snr([0.0, 1.0, 2.0])
```

This test is *passing*: running the function with this argument gives no warning
(because the function currently does nothing, but that is beside the point).
And we also want our function to return an error when we call it with negative
arguments:

```
@test_throws DomainError snr([-1.0, 1.0, 2.0])
```

```
Test Failed at /home/runner/work/ScientificComputingForTheRestOfUs/Scientif
icComputingForTheRestOfUs/content/lessons/avoiding_mistakes/_index.Jmd:2
Expression: snr([-1.0, 1.0, 2.0])
Expected: DomainError
No exception thrown
Error: Test.FallbackTestSetException("There was an error during testing")
```

Well, this one, predictably, is failing! So here is our first step: we need to
add something to our function to make it throw a `DomainError` – this is a way
to tell our user that something is not quite right with the arguments.

We can add a line checking the sign of the smallest value of `x` – if it is
lower than 0 (and specifically, lower than the `zero` value of the type `T` of
the elements), then we *throw* an error.

```
function snr(x::Vector{T}) where {T <: Number}
    minimum(x) < zero(T) && throw(DomainError(minimum(x), "all values passed must be positive or null."))
end
```

```
snr (generic function with 1 method)
```

Note that our error is *informative*: it gives a message explaining what went
wrong. Now that we have added this line, we need to make sure that *both* our
tests pass!

```
@test_nowarn snr([0.0, 1.0, 2.0])
@test_throws DomainError snr([-1.0, 1.0, 2.0])
```

```
Test Passed
Thrown: DomainError
```
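To see the message a user would actually get, we can trigger the error ourselves and inspect it (a small sketch using `try`/`catch`; the `val` and `msg` fields are part of `DomainError`):

```
try
    snr([-1.0, 1.0, 2.0])
catch err
    println(err isa DomainError)   # confirms the type of the error
    println(err.msg)               # the informative message we wrote
end
```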

Yeah! Now, because our function involves calculating a standard deviation, we may want to restrict its application to inputs with at least three elements. This is also something we can test for, and throw an error when the condition is not met:

```
function snr(x::Vector{T}) where {T <: Number}
    length(x) < 3 && throw(ArgumentError("A minimum of three values must be provided."))
    minimum(x) < zero(T) && throw(DomainError(minimum(x), "all values passed must be positive or null."))
end
```

```
snr (generic function with 1 method)
```

And now, let’s run *all of the previous tests*, but also a new one.

```
@test_nowarn snr([0.0, 1.0, 2.0])
@test_throws DomainError snr([-1.0, 1.0, 2.0])
@test_throws ArgumentError snr([1.0, 1.0])
```

```
Test Passed
Thrown: ArgumentError
```

At this point, our function still does nothing. In fact, it does *less* than
nothing, since it will refuse to run in some situations. This practice is
generally referred to as *defensive programming*: we want to perform the actual
computation only when we are confident that the conditions to run it are met.
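This pattern generalizes to any function with preconditions. As a sketch (the function name and error message here are ours, purely for illustration), a hypothetical `relativechange` function could guard against dividing by zero before doing any work:

```
function relativechange(old::T, new::T) where {T <: Number}
    # Defensive check: the computation below is meaningless when old is zero
    iszero(old) && throw(DomainError(old, "the reference value cannot be zero."))
    return (new - old) / old
end
```

As in `snr`, the actual computation only happens once every check has passed.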

We have done enough work for now. We will get to the actual code later; instead, let’s improve the “interface” of our function.

## Confusing interface

As it is written, our function does not have a particularly explicit name. Maybe
`snr` makes sense to us because it is fresh in our mind; but will it still be
true in a week? A month? Let’s instead aim for something more explicit:
`signaltonoiseratio`.

```
function signaltonoiseratio(x::Vector{T}) where {T <: Number}
    length(x) < 3 && throw(ArgumentError("A minimum of three values must be provided."))
    minimum(x) < zero(T) && throw(DomainError(minimum(x), "all values passed must be positive or null."))
end
```

```
signaltonoiseratio (generic function with 1 method)
```

It is longer, but it is also more explicit. And in most cases, typing `sign` and
pressing `Tab` will autocomplete to the full function name, so there is minimal
effort involved.

Another source of mistakes is to have non-descriptive names for variables. In
the existing function, the most important variable is called `x`: this means
nothing. We will replace `x` by something more informative, such as
`measurement`:

```
function signaltonoiseratio(measurement::Vector{T}) where {T <: Number}
    length(measurement) < 3 && throw(ArgumentError("A minimum of three values must be provided."))
    minimum(measurement) < zero(T) && throw(DomainError(minimum(measurement), "all values passed must be positive or null."))
end
```

```
signaltonoiseratio (generic function with 1 method)
```

At this point, our function still does *absolutely nothing*, but we can be sure
that it is named in a way that makes it easier to reason about, will refuse to
run when the arguments are wrong, and uses variable names that mean something.
In short, it does nothing, but it does it really well.

## Mistakes in the code

We now need to build the *inside* of our function. In practice, this will
require two parts: a function for the mean, and a function for the standard
deviation (in real life, we would of course use `mean` and `std` from the
`Statistics` standard library).

Let’s get the standard deviation out of the way:

```
σ(x) = sqrt(sum((x .- μ(x)).^2.0) / length(x))
```

```
σ (generic function with 1 method)
```

We will assume that this function is correct, but in practice it would of course
be important to test it. This function calls `μ(x)`, which is our mean function,
and which we still need to implement. Let’s do just that. We will use a rather
straightforward approach, where we sum the numbers in `x`, and then divide by
the length of `x`:

```
function μ(x::Vector{T}) where {T <: Number}
    su = 0.0
    for i in 1:length(x)
        su += x[1]
    end
    return su/length(x)
end
```

```
μ (generic function with 1 method)
```

Now let’s think about tests for this function. A simple one is: a vector of any
length that contains a single value should have this value as its mean (we will
use `≈` instead of `==`, because rounding errors with floating point numbers
happen).
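A classic illustration of why `≈` (the `isapprox` operator) is safer than `==` for floating point values:

```
0.1 + 0.2 == 0.3   # false, because of floating point rounding
0.1 + 0.2 ≈ 0.3    # true, the values are equal within tolerance
```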

```
n = rand()
@test μ(fill(n, 200)) ≈ n
```

```
Test Passed
```

This works! Our function is bug-free, and we can use it. Or can we? The
confidence we have in a function is only as good as the thoroughness of our
tests. So let’s add a second test, like the fact that the average of `[2,1,3]`
is `2`.

```
@test μ([2,1,3]) == 2
```

```
Test Passed
```

That’s a good sign. Or is it? Let’s add one more. The averages of `[2,1,3]` and
`[3,2,1]` should be the same:

```
@test μ([2,1,3]) == μ([3,2,1])
```

```
Test Failed at /home/runner/work/ScientificComputingForTheRestOfUs/Scientif
icComputingForTheRestOfUs/content/lessons/avoiding_mistakes/_index.Jmd:2
Expression: μ([2, 1, 3]) == μ([3, 2, 1])
Evaluated: 2.0 == 3.0
Error: Test.FallbackTestSetException("There was an error during testing")
```

Wait, what? This test is not passing, and this strongly suggests that our function is wrong. Take a moment to read it in detail, and see if you can spot the issue.

When we do the sum step, we have the line `su += x[1]` – this is a simple typo,
but we are adding the first element over and over. This is not what we want: we
want to add the *i*-th element, and so we should correct the function
accordingly:

```
function μ(x::Vector{T}) where {T <: Number}
    su = 0.0
    for i in 1:length(x)
        su += x[i]
    end
    return su/length(x)
end
n = rand()
@test μ(fill(n, 200)) ≈ n
@test μ([2,1,3]) == 2
@test μ([2,1,3]) == μ([3,2,1])
```

```
Test Passed
```

And now, this works!

The whole point of this example was to show that tests can be misleading. Our initial function, although wrong, passed the first tests because we had not written tests that could catch the bug. This does happen in practice. The two usual ways around it are to write more tests, or to use the code and notice that something is wrong. One of these is far riskier…
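A good habit is to test *properties* rather than single values. For instance (these extra tests are our own suggestion, not part of the original test suite), the mean should not depend on the order of the input, and shifting every value by a constant should shift the mean by the same constant:

```
x = rand(10)
@test μ(x) ≈ μ(reverse(x))       # order must not matter
@test μ(x .+ 1.0) ≈ μ(x) + 1.0   # shifting the values shifts the mean
```

Either of these tests would have caught the `x[1]` bug immediately.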

## Putting things back together

Now that we have our internal functions, we can finish the `signaltonoiseratio`
function:

```
function signaltonoiseratio(measurement::Vector{T}) where {T <: Number}
    length(measurement) < 3 && throw(ArgumentError("A minimum of three values must be provided."))
    minimum(measurement) < zero(T) && throw(DomainError(minimum(measurement), "all values passed must be positive or null."))
    return μ(measurement)/σ(measurement)
end
```

```
signaltonoiseratio (generic function with 1 method)
```

Here it is! It is important to notice that *most* of the code we wrote is not
about making calculations. It is about making sure that the conditions to do
these calculations are met, and that the calculations are done correctly. This
is not unusual. In fact, the task of programming is often more about testing,
documenting, and commenting, than it is about writing the actual code; this is
not because these tasks are enjoyable on their own (they mostly are not), but
because they are necessary to get to a correct result.
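As a last sanity check (our own addition, not part of the lesson), we can compare the output of the finished function to the same calculation done with the `Statistics` standard library – remembering that our σ divides by the number of elements, so we need `corrected=false` to match:

```
using Statistics

measurement = [1.0, 2.0, 3.0, 4.0]
signaltonoiseratio(measurement)                         # our function
mean(measurement) / std(measurement; corrected=false)   # should give the same value
```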