# Python decorators and the tf.function


## Python decorators

A decorator is a function that accepts another function as an argument and adds new functionality to it. Decorators are used for all sorts of things, like logging, timing the execution of functions, and caching values. Let’s see a couple of examples!

### The most useless decorator

Arguably, the most useless decorator is the following one that does absolutely nothing! :)
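A minimal sketch of such a decorator could look like the following. Only the name `noop_decorator` comes from the text; the decorated function `greet` is a made-up example.

```python
def noop_decorator(func):
    # Hand back the very same function object, untouched.
    return func


@noop_decorator
def greet(name):  # hypothetical function, used only for illustration
    return f"Hello, {name}!"


print(greet("World"))
```

The `@noop_decorator` line is just shorthand for `greet = noop_decorator(greet)`.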

So, basically, the `noop_decorator` returns whatever function we hand it, without modifying it at all.

### Timing the execution of a function

In the following example, we construct a decorator called `mytimer` that prints the time, in seconds, that a function takes to execute. The decorator accepts the function *func* as input and returns another function called `wrapper`. So, every time we call the `calc_stuff()` function, in reality the `wrapper()` function is executed. The latter first saves the current value of a performance counter. It uses `*args` and `**kwargs` to collect the positional and keyword arguments, and then runs *func* by forwarding *args* and *kwargs* with the unpacking operators (asterisk and double asterisk). Next, it subtracts the old value of the performance counter from the new one and prints the result, which is the time elapsed during the execution of *func*. Finally, it returns the result of *func*, as we would expect if we had called the undecorated function.
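The steps above can be sketched roughly as follows. The names `mytimer`, `wrapper` and `calc_stuff` come from the text, but the body of `calc_stuff` and the exact message printed are assumptions made for illustration.

```python
import time


def mytimer(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()          # old value of the performance counter
        result = func(*args, **kwargs)       # forward the collected arguments
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.4f} seconds")
        return result                        # behave like the undecorated function
    return wrapper


@mytimer
def calc_stuff(n=100_000):  # hypothetical workload, not the original one
    return sum(i * i for i in range(n))


total = calc_stuff(10)
```

Each call to `calc_stuff()` now prints one timing line and still returns the usual result.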

Summing up the individual execution times, we get a total of 4.85 seconds, which is pretty close to the cumulative time that `timeit()` reports. Next, we redefine the function without the decorator, and we notice that the per-call timing output is gone.

### Retrieving the lost metadata

When we decorate a function, we basically replace it with another function. This has the undesired side effect that some of the original function’s metadata are lost since they are replaced by the wrapper’s. See, for instance, the following code:

The same applies for the docstrings:
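Here is a small sketch of the problem, assuming a stripped-down decorator. The names `plain_decorator` and `calc_stuff` and both docstrings are illustrative, not the original code.

```python
def plain_decorator(func):
    def wrapper(*args, **kwargs):
        """Wrapper docstring."""
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def calc_stuff():
    """Calculate stuff."""
    return 42


# Both attributes now describe the wrapper, not the original function.
print(calc_stuff.__name__)  # wrapper
print(calc_stuff.__doc__)   # Wrapper docstring.
```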

To preserve the original function’s metadata, we use `functools.wraps()`, which copies to the wrapper function the metadata that would otherwise be lost!
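A minimal sketch of the fix, again with made-up names (`keeps_metadata`, `calc_stuff`) and docstrings:

```python
import functools


def keeps_metadata(func):
    @functools.wraps(func)  # copy __name__, __doc__, etc. from func
    def wrapper(*args, **kwargs):
        """Wrapper docstring."""
        return func(*args, **kwargs)
    return wrapper


@keeps_metadata
def calc_stuff():
    """Calculate stuff."""
    return 42


print(calc_stuff.__name__)  # calc_stuff
print(calc_stuff.__doc__)   # Calculate stuff.
```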

Neat! The function names were copied over. The same applies to the docstrings:

Notice that although we had a docstring for the `wrapper` function, it was replaced by the original functions’ docstrings. Last, to blow your mind, `functools.wraps` is in itself a decorator! :P

## Eager vs. lazy execution modes in Tensorflow

### Basic computation model

In Tensorflow, computations are modeled as a directed graph. Each node in the graph is a mathematical operation (say, an addition of two scalars or a multiplication of two matrices). Every node has zero or more inputs and outputs. Along the edges of the graph, tensors flow! :) Tensors are multidimensional arrays with a specific type (e.g., float or double) and should not be confused with tensors in mathematical physics. For example, the mathematical operation \(\text{ReLU}\left(\mathbf{W} \mathbf{x} + \mathbf{b}\right)\) is represented as:

Image taken from here.

### Tensorflow 1.0 and lazy execution

In Tensorflow 1.0, one had to construct the computation graph, then set up a *session.run()* with *feed_dict* to populate the graph with actual data. The advantage of working with a computation graph is that it allowed Tensorflow to perform many optimizations (e.g., graph simplifications, inlining function bodies to accommodate interprocedural optimizations, and so on). As of the time of writing, *Grappler* is the default graph optimization engine in the Tensorflow runtime. Grappler rewrites the graphs in order to improve performance, and also provides a plugin interface to register custom-made optimizers. A very basic example of such a simplification is the following algebraic one, which takes into account the properties of commutativity, associativity and distributivity:

Despite the speed benefits, though, the user experience of Tensorflow 1.0 left much to be desired, so eager execution mode was eventually introduced.

### Tensorflow 2.0 and eager execution

In eager execution, we write some code, and we can run it immediately, line by line, examine the output, modify it, re-run it, etc. Everything is evaluated on the spot without constructing a computation graph that will be run later in a session. This is easier to debug and feels like writing regular Python code. Compare the following code:
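As a rough side-by-side sketch, the same addition can be written in both styles. This assumes a Tensorflow 2.x installation and uses the `tf.compat.v1` API to mimic the 1.x workflow; the variable names are made up for illustration.

```python
import tensorflow as tf

# Lazy, 1.x style: first describe the graph, then run it in a session.
graph = tf.Graph()
with graph.as_default():
    a = tf.compat.v1.placeholder(tf.float32, shape=())
    b = tf.compat.v1.placeholder(tf.float32, shape=())
    total = a + b  # nothing is computed yet

with tf.compat.v1.Session(graph=graph) as sess:
    lazy_result = sess.run(total, feed_dict={a: 2.0, b: 3.0})

# Eager, 2.x style: the result is available right away.
eager_result = tf.constant(2.0) + tf.constant(3.0)

print(lazy_result, eager_result.numpy())
```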

However, by running Tensorflow one step at a time, we give up the speed optimizations that were possible in lazy execution mode. In Tensorflow 2.0, the default execution mode was set to eager, presumably after people started to favor Pytorch over TF, since Pytorch was eager from the beginning. So, where does `tf.function` fit in this narrative? By using the `tf.function` decorator, we can convert a function into a Tensorflow graph (`tf.Graph`) and execute it lazily, so we bring back some of the speed we gave up before. The following code uses the `tf.function` decorator to convert `my_func()` into a callable Tensorflow graph that we visualize with Tensorboard.
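A minimal sketch of this workflow might look like the following. Only the name `my_func` comes from the text; its body, the input shapes, the trace name and the temporary log directory are assumptions made for illustration.

```python
import tempfile

import tensorflow as tf


@tf.function
def my_func(x, y):
    # Hypothetical body: a tiny ReLU(x @ y) computation.
    return tf.nn.relu(tf.matmul(x, y))


logdir = tempfile.mkdtemp()  # assumed location for the Tensorboard logs
writer = tf.summary.create_file_writer(logdir)

x = tf.random.uniform((3, 3))
y = tf.random.uniform((3, 3))

tf.summary.trace_on(graph=True)  # start recording before the first call
z = my_func(x, y)                # the first call traces the function into a graph
with writer.as_default():
    tf.summary.trace_export(name="my_func_trace", step=0)
writer.flush()

print("Logs written to", logdir)
```

Pointing Tensorboard at that directory with `tensorboard --logdir <logdir>` should then show the graph in the Graphs tab.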

Fire up the Tensorboard to inspect the computation graph. By the way, if you are using ssh tunneling, you will probably need to add local port forwarding for port 6006.

#### Show me the speedup!

We will borrow some real code from a previous post, where we used trainable probability distributions, and measure the speedup that `tf.function` brings. First, we load the necessary modules and generate some normally distributed training data.

We then define a negative log-likelihood loss function and a function to calculate the loss and the gradients. Naturally, we would call `get_loss_and_grads()` in a custom training loop, and then pass the gradients to the optimizer with `optimizer.apply_gradients()` to update the model’s parameters. Here, we will just call `get_loss_and_grads()` repeatedly.

We run it 1000 times and measure the execution time:

We do the same as before, but this time we decorate the `get_loss_and_grads()` function with `tf.function()`:

So, by decorating `get_loss_and_grads()` with `tf.function`, we reduced the execution time from about 5.66 seconds to 0.70 seconds, which is roughly an 88% relative reduction. Not bad!
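The comparison can be approximated with a self-contained sketch. Note that this is not the code from the previous post: a hand-written Gaussian negative log-likelihood stands in for the trainable distribution, and `loc`, `log_scale`, `nll` and `time_calls` are made-up names. The timings it prints depend on the machine and will not match the numbers quoted above.

```python
import math
import time

import tensorflow as tf

# Normally distributed training data (assumed mean and standard deviation).
x_train = tf.random.normal((10_000,), mean=1.0, stddev=2.0, seed=0)

# Trainable parameters of the Gaussian we fit.
loc = tf.Variable(0.0)
log_scale = tf.Variable(0.0)


def nll(x):
    # Mean negative log-likelihood of x under N(loc, exp(log_scale)^2).
    z = (x - loc) / tf.exp(log_scale)
    return tf.reduce_mean(0.5 * z**2 + log_scale + 0.5 * math.log(2.0 * math.pi))


def get_loss_and_grads(x):
    with tf.GradientTape() as tape:
        loss = nll(x)
    return loss, tape.gradient(loss, [loc, log_scale])


# Same function, compiled into a graph.
get_loss_and_grads_graph = tf.function(get_loss_and_grads)


def time_calls(fn, n_calls=1000):
    fn(x_train)  # warm-up call, so tracing is not included in the timing
    start = time.perf_counter()
    for _ in range(n_calls):
        fn(x_train)
    return time.perf_counter() - start


eager_seconds = time_calls(get_loss_and_grads)
graph_seconds = time_calls(get_loss_and_grads_graph)
print(f"eager: {eager_seconds:.2f}s, tf.function: {graph_seconds:.2f}s")
```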

### Caveats

#### Functions with side-effects

By now, I might have given you the false impression that adding `tf.function` to any existing function whatsoever automatically converts it into a computation graph. We will now discuss some of the caveats of the `tf.function` decorator. First, any Python side-effects will happen only when `func` is traced, not on every call. Such side-effects include, for instance, printing with `print()` or appending to a list:

Similarly, if we modify a Python list:
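Both kinds of side-effects can be seen in one small sketch. The names `add_one` and `call_log` are hypothetical, and `tf.print()` is included only as a contrast, since it is a graph operation and runs on every call.

```python
import tensorflow as tf

call_log = []


@tf.function
def add_one(x):
    print("Python print: tracing add_one")  # runs only while tracing
    call_log.append("called")               # runs only while tracing
    tf.print("tf.print: running add_one")   # part of the graph, runs every call
    return x + 1


# Three calls with the same input signature trigger a single trace.
for value in (1.0, 2.0, 3.0):
    add_one(tf.constant(value))

print(len(call_log))  # 1, not 3
```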

The correct way is to rewrite the list append as Tensorflow operations, e.g., with `tf.TensorArray`:
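A minimal sketch of this pattern, assuming a made-up function `collect_squares` that gathers the squares of the first `n` integers:

```python
import tensorflow as tf


@tf.function
def collect_squares(n):
    # A TensorArray plays the role of the Python list inside the graph.
    squares = tf.TensorArray(tf.int32, size=0, dynamic_size=True)
    for i in tf.range(n):
        # write() returns a new TensorArray, so we must reassign it.
        squares = squares.write(i, i * i)
    return squares.stack()  # pack the elements into a single tensor


result = collect_squares(tf.constant(5))
print(result.numpy())  # [ 0  1  4  9 16]
```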

#### Passing Python scalars to `tf.function`

Probably the most subtle gotcha is this: passing Python scalars or lists as arguments to a function decorated with `tf.function` builds a new graph for every distinct value! So, by repeatedly passing different Python scalars, say in a loop, we thrash the system by creating new computation graphs again and again!
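One way to see the retracing is to count how often the Python body actually runs. In this sketch, `square` and `traced_args` are made-up names, and the Python-side append serves as the trace counter because it runs only during tracing.

```python
import tensorflow as tf

traced_args = []


@tf.function
def square(x):
    traced_args.append(x)  # Python side-effect: executed once per trace
    return x * x


# Five distinct Python scalars: each one builds a new graph.
for i in range(5):
    square(i)
traces_with_python_scalars = len(traced_args)

# Five tensors with the same dtype and shape: a single graph is reused.
for i in range(5):
    square(tf.constant(i))
traces_with_tensors = len(traced_args) - traces_with_python_scalars

print(traces_with_python_scalars, traces_with_tensors)  # 5 1
```

Wrapping the values in `tf.constant()` (or otherwise passing tensors) keeps the number of graphs constant, no matter how many different values flow through.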

Here we measure the performance degradation:

Tensorflow will even warn if it detects such a usage:

WARNING:tensorflow:5 out of the last 10006 calls to <function f at 0x7f68e6f75a60> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for more details.