Refactoring machine learning code - namedtuple
This article covers:
Instead of relying on index positions that are easy to mix up, use a namedtuple. It's backwards compatible, so index access still works, but the named fields make your code much more readable.
This is especially helpful when you convert between PIL and numpy based code, where PIL uses a (column, row) notation while numpy uses a (row, column) notation.
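To make the difference concrete, here is a minimal sketch that assumes Pillow and numpy are installed. The tiny in-memory image and the variable names are made up for illustration; the same pixel is read once through PIL and once through numpy.

```python
import numpy as np
from PIL import Image

# Build a tiny grayscale image in memory: 4 pixels wide, 3 pixels high.
img = Image.new("L", (4, 3))      # PIL sizes are (width, height)
img.putpixel((3, 1), 255)         # PIL coordinates are (column, row)

arr = np.asarray(img)             # numpy shape is (rows, columns)

pil_value = img.load()[3, 1]      # column 3, row 1
numpy_value = arr[1, 3]           # row 1, column 3
```

Both lookups address the same pixel, only with the two indices swapped, which is exactly the kind of detail that is easy to get wrong with bare indices.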
Let’s consider this piece of code, where we want to get the pixel values at several points whose locations are given in the numpy (row, column) format:
from PIL import Image

def get_image_values(image_path, locations):
    # PIL pixel access is indexed (column, row); locations are (row, column).
    img = Image.open(image_path).load()
    return [img[point[1], point[0]] for point in locations]

values = get_image_values('my_image.png', [[2, 3], [5, 3], [7, 9]])
To me this code is hard to read, because it’s not clear what point[1] and point[0] refer to, or why point[1] comes before point[0].
Let’s use a namedtuple to make this much easier to read:
from collections import namedtuple
from PIL import Image

# Locations are stored numpy-style: row first, then column.
Point = namedtuple('Point', ['row', 'column'])

def get_image_values(image_path, locations):
    img = Image.open(image_path).load()
    # PIL expects (column, row), and the field names make the swap explicit.
    return [img[point.column, point.row] for point in locations]

values = get_image_values('my_image.png', [
    Point(row=2, column=3),
    Point(row=5, column=3),
    Point(row=7, column=9),
])
Note that index access still works: point[0] is the same as point.row, and point[1] is the same as point.column.
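As a quick check of that backwards compatibility, the sketch below uses only the standard library. The variable names and the `raw_locations` list are illustrative; it shows that a `Point` still behaves like a plain tuple, and one possible way to convert existing [row, column] lists.

```python
from collections import namedtuple

Point = namedtuple('Point', ['row', 'column'])

point = Point(row=2, column=3)

# Index access and attribute access return the same values.
same_row = point[0] == point.row
same_column = point[1] == point.column

# Unpacking and comparison with plain tuples keep working too.
row, column = point
equals_plain_tuple = point == (2, 3)

# Existing [row, column] lists can be converted without rewriting them by hand.
raw_locations = [[2, 3], [5, 3], [7, 9]]
points = [Point(*location) for location in raw_locations]
```

Because a namedtuple is a subclass of tuple, code that still indexes into the points keeps working while you migrate the rest to named fields.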