Tips and tricks in Python (part 1)
Interesting utilities
divmod(a, b) # built-in function, returns the tuple (a // b, a % b)
np.isclose(a, b) # compare floats within a tolerance instead of using ==
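A minimal sketch of both utilities; the numbers are illustrative only.

```python
import numpy as np

# divmod returns the quotient and the remainder in a single call
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2

# Floating-point rounding makes exact equality unreliable,
# while np.isclose compares within a small tolerance
total = 0.1 + 0.2
print(total == 0.3)            # False
print(np.isclose(total, 0.3))  # True
```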
Number formatting
I read two tricks about number formatting in a Medium article. The first is that underscores can be placed between any two digits of a numeric literal, although grouping every three digits gives the best readability. The second is that an f-string format specifier can insert commas (or underscores) as thousands separators. Here is how the tricks work.
value = 12_34_5678_9
print(f'{value:,}') # 123,456,789
print(f'{value:_}') # 123_456_789
value = 123_456.789_789
print(f'{value:,}') # 123,456.789789
print(f'{value:_}') # 123_456.789789
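The separator can also be combined with a precision in the same format specifier, and int accepts grouping underscores when parsing a string. A small sketch with illustrative values:

```python
amount = 1_234_567.891

# Thousands separator combined with two decimal places
with_commas = f'{amount:,.2f}'
with_underscores = f'{amount:_.2f}'
print(with_commas)       # 1,234,567.89
print(with_underscores)  # 1_234_567.89

# int() accepts the same digit-grouping underscores in strings
parsed = int('1_000_000')
print(parsed)  # 1000000
```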
Virtual environments in Python
See the comparison of tools for managing virtual environments in a Real Python article. Also see a guide to creating a virtual environment.
# env is the name of the folder containing the environment
python3 -m venv env
# activate the environment
source env/bin/activate
# install packages
pip install numpy
pip install numpy==1.18.4
# quote version specifiers so the shell does not treat > as a redirect
pip install 'numpy>=1.0.0'
pip install --upgrade numpy
# deactivate the environment
deactivate
To install packages from a requirements.txt file:
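pip freeze > requirements.txt # save the installed packages and their versions
pip install -r requirements.txt # install everything listed in the file
pip install -r PythonPackages.txt --upgrade # the file can have any name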
Built-in int method in Python
Mathematically, it is easy to see that
\begin{equation} x = \text{sgn}(x) * \text{abs}(x) \end{equation}
In Python, the built-in int function applied to a real number x is defined as
\begin{equation}
\text{int}(x) = \text{sgn}(x) * \text{floor}(\text{abs}(x))
\end{equation}
For example, int(-2.2) = -2, whereas floor(-2.2) = -3. In other words, int truncates toward zero instead of rounding down.
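The definition can be checked numerically. In the sketch below, truncate_by_formula is a hypothetical helper that spells out the formula; it is not part of the standard library.

```python
import math

def truncate_by_formula(x):
    """Compute sgn(x) * floor(abs(x)) explicitly (hypothetical helper)."""
    sign = math.copysign(1, x)
    return int(sign * math.floor(abs(x)))

for x in (2.7, -2.2, -2.7):
    # int and the formula agree; math.floor differs for negative inputs
    print(x, int(x), truncate_by_formula(x), math.floor(x))
# 2.7 2 2 2
# -2.2 -2 -2 -3
# -2.7 -2 -2 -3
```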
Determine the largest value in a numpy array
One question needs answering first: what should happen when the given array contains np.nan values? Most of the time we do not care about them, so the maximum should be chosen from the values in the array excluding np.nan.
However, the naive approach value = arr.max() is not correct, since it returns np.nan whenever np.nan exists in the array. A workaround would be
arr = np.array([1, 2, np.nan])
value = arr[~np.isnan(arr)].max()
This works, but it reinvents the wheel. Simply use np.nanmax(arr) to achieve the same goal; see the nanmax documentation.
One more thing to notice: if a value is suspected to be np.nan, do not compare it with np.nan to find out, because NaN is not equal to anything, itself included
value = np.nan
value == np.nan # False
The correct way is to use the isnan function
np.isnan(value) # True
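The three approaches and the NaN comparison can be run side by side. This is only a sketch on a small illustrative array; math.isnan is the standard-library check for scalars.

```python
import math
import numpy as np

arr = np.array([1.0, 2.0, np.nan])

naive = arr.max()                   # NaN propagates through max
masked = arr[~np.isnan(arr)].max()  # manual filtering
builtin = np.nanmax(arr)            # same result in one call

print(np.isnan(naive))  # True
print(masked, builtin)  # 2.0 2.0

# NaN never compares equal, not even to itself
print(naive == naive)     # False
print(math.isnan(naive))  # True
```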
Accessing elements in a numpy array
Given a numpy array, one needs to return the value of the element at a given coordinate.
import numpy as np
# input
arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
])
coor = (1, 2)
# output 6
A solution to this exercise is quite simple:
def getElement(arr, coor):
    x, y = coor
    return arr[x, y]
But what if the number of dimensions is not known in advance? We then do not know how many indices to unpack in the line x, y = coor. In addition, the approach return arr[*coor] is a syntax error on Python versions before 3.11. A tuple can be passed as the index directly, as in arr[coor] (or arr[tuple(coor)] when coor is a list), and the item method offers another workaround, as follows
def getElement2(arr, coor):
    # coor must be a tuple, not a list
    return arr.item(coor)
One last note concerns the try/except block around the call. IndexError is raised when the given indices are invalid (outside the array's bounds). Negative indices may not be what you intend, yet they are valid in numpy arrays and count from the end, so no error is raised for them. Make sure that you know exactly which index values you are passing to the method.
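A sketch of a dimension-agnostic lookup that uses plain tuple indexing and rejects negative indices explicitly. The helper name get_element_checked and its checks are assumptions made for illustration, not part of numpy.

```python
import numpy as np

def get_element_checked(arr, coor):
    """Return arr at coor, rejecting negative or out-of-bounds indices."""
    coor = tuple(coor)  # a tuple selects one element; a list would trigger fancy indexing
    if len(coor) != arr.ndim:
        raise IndexError(f'expected {arr.ndim} indices, got {len(coor)}')
    if any(i < 0 for i in coor):
        raise IndexError(f'negative index in {coor}')
    return arr[coor]  # numpy itself raises IndexError when an index is too large

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(get_element_checked(arr, (1, 2)))  # 6
print(arr[(-1, -1)])                     # 6, negative indices count from the end

try:
    get_element_checked(arr, (2, 0))
except IndexError as err:
    print('invalid coordinate:', err)
```

Whether negative indices should be rejected depends on the caller; the explicit check only makes that decision visible.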
Negative indices in numpy arrays
Given a 1-dimensional numpy array, suppose you want to extract the last m elements of the array. The most naive approach would be
import numpy as np
m = 2
arr = np.array([10, 20, 30, 40, 50])
print(arr[-m:])
However, when m equals 0, the code above returns the entire array instead of an empty one (… extract the last 0 elements …), because -0 is 0 and arr[0:] is the whole array. Of course, had we known beforehand that m could be any non-negative number, 0 included, we would have come up with something different, or at least been aware of the approach's downsides. The lesson here is to identify as many edge cases as possible.
Back to our problem, the fix is relatively easy:
length = arr.size
assert 0 <= m <= length
print(arr[length-m:])
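The fix can be wrapped in a small function to compare it with the naive slice. The name last_elements is hypothetical, and raising ValueError instead of asserting is a design choice of this sketch.

```python
import numpy as np

def last_elements(arr, m):
    """Return the last m elements of a 1-D array, including the m == 0 case."""
    if not 0 <= m <= arr.size:
        raise ValueError(f'm must be between 0 and {arr.size}, got {m}')
    return arr[arr.size - m:]

arr = np.array([10, 20, 30, 40, 50])
print(arr[-0:])               # [10 20 30 40 50], the naive slice returns everything
print(last_elements(arr, 0))  # []
print(last_elements(arr, 2))  # [40 50]
```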
Coding without if statements
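def withIfStatement(bool_):
    if bool_:
        print('Hello')
    return None

def withoutIfStatements(bool_):
    # filter keeps the single placeholder item only when bool_ is truthy,
    # so the loop body runs once or not at all
    for _ in filter(lambda x: bool_, [None]):
        print('Hello')
    return None

withIfStatement(True)
withoutIfStatements(True)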
Parsing arguments
import argparse
parser = argparse.ArgumentParser()
# parser.add_argument('--nItems', type=int, default=12, help='TBA') # method 1
parser.add_argument('-n', '--nItems', type=int, default=12, help='TBA') # method 2
args = parser.parse_args()
print(args.nItems)
python main.py -n 42
python main.py --nItems 43
python main.py -h # show help message and exit
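The same parser can be exercised without a command line by passing an explicit list of arguments to parse_args, which is convenient in tests or notebooks. A minimal sketch; build_parser is a hypothetical wrapper and the help text is illustrative.

```python
import argparse

def build_parser():
    """Create the parser with both the short and the long option name."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-n', '--nItems', type=int, default=12,
                        help='number of items')
    return parser

parser = build_parser()
print(parser.parse_args([]).nItems)                  # 12, the default
print(parser.parse_args(['-n', '42']).nItems)        # 42
print(parser.parse_args(['--nItems', '43']).nItems)  # 43
```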
Time formatting
import time
foo = time.strftime('%Y%m%d_%Hh%Mm%Ss')
It can also be done in a bash script as below
foo=$(date +'%Y%m%d_%Hh%Mm%Ss')
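To see what the pattern produces, it can be applied to a fixed moment instead of the current time. The date below is an arbitrary illustrative value.

```python
import time
from datetime import datetime

pattern = '%Y%m%d_%Hh%Mm%Ss'
fixed = datetime(2020, 5, 17, 9, 3, 7)

# datetime.strftime and time.strftime accept the same directives
stamp = fixed.strftime(pattern)
print(stamp)                                      # 20200517_09h03m07s
print(time.strftime(pattern, fixed.timetuple()))  # 20200517_09h03m07s

# The stamp can be parsed back with the same pattern
print(datetime.strptime(stamp, pattern) == fixed)  # True
```

Because the fields run from year down to second, stamps in this format sort chronologically as plain strings, which is handy for file names.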