Best practices and anti-patterns
Avoid df.apply() if it's possible to obtain the same result by applying a built-in function.
apply is a slow operation both with pandas and with Terality. It can't be optimized as much as specialized built-in functions.
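As a minimal sketch of the difference, the row-wise apply below can be replaced by a built-in column operation that gives the same result. The DataFrame, its column names a and b, and the values are hypothetical, and the example uses plain pandas on the assumption that the same calls carry over to Terality:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Anti-pattern: calls a Python function once per row.
slow = df.apply(lambda row: row["a"] + row["b"], axis=1)

# Preferred: built-in vectorized addition over whole columns.
fast = df["a"] + df["b"]

# Both approaches yield the same values.
assert slow.tolist() == fast.tolist()
```

The built-in version operates on whole columns at once, which is the kind of operation that can be optimized in a way a per-row Python function cannot.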
Iterating over rows of a structure (for x in struct loops) is inefficient and is strongly discouraged both with pandas and Terality. For example, don't do:
sum([x**2 for x in series])
By doing so, you're not taking advantage of the vectorization (parallelization) of pandas computations.
When running this code with Terality, the iteration is done on your computer instead of running in parallel on the Terality cluster. In this scenario, not only are you not benefiting from pandas vectorization, you are also missing out on Terality's parallelized computations.
Use a built-in function instead. In this example, you could use
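One vectorized way to compute the same sum of squares is sketched here with plain pandas. The sample values and the variable name total are illustrative assumptions, and the same expression is assumed to work on a Terality Series:

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
series = pd.Series([1, 2, 3, 4])

# Square every element in one vectorized operation,
# then reduce with the built-in Series.sum method.
total = (series ** 2).sum()
```

Both the squaring and the reduction are expressed as operations on the whole Series, so no Python-level loop runs over its elements.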
Don't access the elements of a structure in a loop. For instance:
# Don't do this! This is even slower than the previous example.
sum = 0
for i in range(len(series)):
    sum += series[i]
This code will make an API request to the Terality cluster for each iteration. This will either be extremely slow or get you temporarily blocked from the Terality API for sending too many requests. So please avoid this at all costs!
Instead of iterating over the rows of Terality structures, you should use their methods as much as possible. For instance, the above for loop used to compute the sum of a Series should be replaced by the Series.sum method, which will run significantly faster.
sum = series.sum()