I have a probability study program that does 10,000,000 iterations. In each iteration you have to apply some heavy formulas that include the calculation of the factorial, and it is taking a long time.
I have been told that I must apply memorization to reduce process times.
Could you tell me what it is and how I use it?
Memoization is an optimization technique that avoids recalculating previously obtained results. For this, the previous results are stored; and when an already calculated result is requested, the remembered value is returned, avoiding recalculation.
Memoization applies to functions that meet two requirements:
funcion(n)
it will always return the same value for an
die.A memoizable function can only call other memoizable functions.
We will use the factorial function to exemplify, starting from the traditional basic implementation that we will use as a reference:
manual storage
Functions, like other objects, can have attributes, which can be dynamically created, queried, and modified. For example, a default attribute is
__doc__
, which returns the docstring of the function, like so:Another predefined attribute, which we will take advantage of here, is
__dict__
, an initially empty dictionary where we will store the previously calculated factorials. The key will be the factorial number and the value, the calculated factorial.This is the manual implementation:
Memoization with decorators
Manual memorization has the drawback of adding code to the function, which can lead to confusion and errors. Of course, Python has a solution for that: the @lru_cache decorator .
The use could not be simpler: it is simply added before the function, which now looks like this:
What we have here is the basic factorial plus the decorator.
@lru_cache
has an optional argument,maxsize
, which is an estimate of the number of results to remember. The default value is 128 results, and can be increased if the problem handles a larger number of possible results.Demonstration
We are going to try all three versions of factorial and profile using
cProfile
, which receives a function, executes it and prints the statistics of time, number of calls, etc.We first define the function that will test a version of factorial. The function
calcular
receives a version of factorial and the number of times to call it:and then we run:
what it produces:
Note: Only the first line of each result is shown for clarity.
As you can see, we've reduced the execution time by a fifth simply by adding a decorator to our function.