Department of Physics and Astronomy, The Forbes Group

Python Performance

Here we compare various approaches to solving some tasks in Python with an eye toward performance. Don't forget Donald Knuth's words:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

In other words, profile and optimize only after making sure your code is correct, and focus on the places where profiling shows you are wasting time.
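As a rough illustration of "profile first", here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The functions `cheap_setup`, `slow_sum`, and `main` are hypothetical stand-ins, not code from this notebook; the point is only that the report ranks functions by time so you can see where optimization effort would pay off.

```python
import cProfile
import io
import pstats


def cheap_setup(n):
    """Hypothetical inexpensive step."""
    return list(range(n))


def slow_sum(n):
    """Hypothetical hot spot: a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main(n=200_000):
    cheap_setup(n)
    return slow_sum(n)


# Run main() under the profiler and keep its return value.
profiler = cProfile.Profile()
result = profiler.runcall(main)

# Collect the report in a string, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In a notebook, the `%prun` magic gives a similar report for a single statement.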

NumPy

where vs piecewise

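Summary: piecewise only wins for large arrays. It evaluates each function only on the portion of the array selected by its condition, whereas where computes every branch on the full array before selecting, so piecewise is a good choice if you have very expensive functions. For this example, the crossover lies between $N \sim 10^{4}$ and $N \sim 10^{5}$. Beware of the gotcha, though: the conditions passed to piecewise must be mutually exclusive, because where they overlap a later condition overrides an earlier one.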
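The gotcha can be seen on a three-element array. The sketch below uses constant pieces (0.0, 0.5, 1.0) rather than the notebook's smooth step, purely for illustration, and relies on the behavior that `np.piecewise` fills in the pieces in order, so a later condition overwrites an earlier one where they overlap. Nested `np.where` calls do not have this problem, because the inner call only supplies values where the outer condition is false.

```python
import numpy as np

t = np.array([-0.5, 0.5, 1.5])

# Overlapping conditions: t < 0 also satisfies t < 1, so the second
# piece overwrites the first at t = -0.5.
overlapping = np.piecewise(t, [t < 0.0, t < 1.0], [0.0, 0.5, 1.0])

# Mutually exclusive conditions give the intended result.
exclusive = np.piecewise(
    t, [t < 0.0, np.logical_and(0.0 <= t, t < 1.0)], [0.0, 0.5, 1.0]
)

# Nested where: the first matching condition wins.
nested = np.where(t < 0.0, 0.0, np.where(t < 1.0, 0.5, 1.0))

print(overlapping)  # [0.5 0.5 1. ]
print(exclusive)    # [0.  0.5 1. ]
print(nested)       # [0.  0.5 1. ]
```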

In [21]:

import numpy as np

Ns = [10, 1000, 10000, 100000, 1000000]


def f_where(t, t1=1.0, alpha=3.0):
    # np.where evaluates the smooth step on the whole array, then selects.
    return np.where(
        t < 0.0,
        0.0,
        np.where(
            t < t1,
            (1 + np.tanh(alpha * np.tan(np.pi * (2 * t / t1 - 1) / 2))) / 2,
            1.0,
        ),
    )


def f_piecewise(t, t1=1.0, alpha=3.0):
    # np.piecewise calls the lambda only on the elements with 0 <= t < t1.
    return np.piecewise(
        t,
        [
            t < 0.0,
            # Gotcha: t < t1 alone would also match t < 0 and overwrite the first piece.
            np.logical_and(0 <= t, t < t1),
        ],
        [0.0, lambda t: (1 + np.tanh(alpha * np.tan(np.pi * (2 * t / t1 - 1) / 2))) / 2, 1.0],
    )


for N in Ns:
    t = np.linspace(-0.5, 1.5, N)
    assert np.allclose(f_where(t), f_piecewise(t))
    prefix = f"N={N}"
    print(f"{prefix} -- where:")
    %timeit f_where(t)
    print(f"{' ' * len(prefix)} -- piecewise:")
    %timeit f_piecewise(t)
    print()

N=10 -- where:
14.4 µs ± 214 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
     -- piecewise:
36.9 µs ± 3.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

N=1000 -- where:
51.8 µs ± 1.72 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
       -- piecewise:
60.5 µs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

N=10000 -- where:
182 µs ± 1.95 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
        -- piecewise:
253 µs ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

N=100000 -- where:
2.07 ms ± 13.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
         -- piecewise:
1.39 ms ± 7.75 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

N=1000000 -- where:
20.5 ms ± 75.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
          -- piecewise:
13.6 ms ± 61 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
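One plausible reading of these timings is that `np.where` pays for the expensive expression on every element, while `np.piecewise` pays only for the elements inside the transition region, at the cost of some fixed overhead that dominates for small arrays. The sketch below checks the first half of that claim by counting how many elements reach the expensive function. The counter dictionary `calls` and the helper `smooth_step` are hypothetical names introduced for this illustration.

```python
import numpy as np

# Number of array elements passed to the expensive function, per approach.
calls = {"where": 0, "piecewise": 0}


def smooth_step(t, key, t1=1.0, alpha=3.0):
    """Same expression as in the benchmark, plus an element counter."""
    calls[key] += t.size
    return (1 + np.tanh(alpha * np.tan(np.pi * (2 * t / t1 - 1) / 2))) / 2


t = np.linspace(-0.5, 1.5, 1000)
inside = np.logical_and(0.0 <= t, t < 1.0)

# where: the expensive expression is computed for all of t, then masked.
w = np.where(t < 0.0, 0.0, np.where(t < 1.0, smooth_step(t, "where"), 1.0))

# piecewise: the callable only receives the elements selected by `inside`.
p = np.piecewise(
    t,
    [t < 0.0, inside],
    [0.0, lambda s: smooth_step(s, "piecewise"), 1.0],
)

print(calls, int(inside.sum()))
```

With this grid, roughly half of the points fall inside the transition region, so `piecewise` does about half of the expensive work.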
