Justin Duke

Generators are dope

(This is also readable as a gist, if you prefer.)

I love generators. They’re one of my favorite features in Python, and criminally underrated. (Or underused? Either way.)

Let’s explore how they enable great abstractions through a topic near and dear to my heart: paginating through a very important list of sandwiches from Airtable. [1]

First, let’s grab that table:

from airtable import airtable
table = airtable.Airtable('API_KEY', 'OTHER_KEY')

Okay, so we want to fetch all results in a table. Problem is, you know, pagination: this table has hundreds of records but we can only get like fifty per call. Here’s how we abstract that out!

def fetch_all_records(table_name):
    # Grab the first page.  The page has two fields we care about:
    # 1. records — aka the good stuff
    # 2. offset — a cursor to the next page.
    response = table.get(table_name)

    # If there's a cursor to the next page...
    while 'offset' in response:
        # Yield all the records on this page.
        # Records look like this:
        # {'fields': {'Name': 'Salumi', 'City': 'Seattle', 'Rating': 5}, 'id': 'as890ops'}
        for record in response['records']:
            yield record

        # Grab the next page and repeat the process.
        response = table.get(table_name, offset=response['offset'])

    # Otherwise, yield all the records and then we're done!
    for record in response['records']:
        yield record

Now, to access all the records, we don’t have to care about:

  1. How many records are in a page
  2. How to get to the next page
  3. How to traverse the page.

All of that is abstracted away by the generator! So we can just iterate through all of them like this:

for record in fetch_all_records('Sandwiches'):
    print(record)
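If you want to poke at the pattern without an Airtable key, here's a minimal sketch that swaps the real client for a made-up in-memory one. `FakeTable`, its `page_size`, and `fetch_all` are all hypothetical names invented for this example (they are not part of the Airtable library); the only thing borrowed from above is the response shape, a `records` list plus an optional `offset` cursor.

```python
class FakeTable:
    """Hypothetical stand-in for the Airtable client: serves records in pages."""

    def __init__(self, records, page_size=2):
        self.records = records
        self.page_size = page_size
        self.calls = 0  # how many pages have been requested so far

    def get(self, table_name, offset=None):
        self.calls += 1
        start = offset or 0
        end = start + self.page_size
        page = {'records': self.records[start:end]}
        if end < len(self.records):
            page['offset'] = end  # cursor to the next page
        return page


def fetch_all(table, table_name):
    """Same idea as fetch_all_records, with the table passed in explicitly."""
    response = table.get(table_name)
    while True:
        # Hand back every record on the current page...
        yield from response['records']
        # ...and stop once there is no cursor to a next page.
        if 'offset' not in response:
            return
        response = table.get(table_name, offset=response['offset'])


fake = FakeTable([{'id': str(n), 'fields': {'Rating': n}} for n in range(1, 6)])
ratings = [record['fields']['Rating'] for record in fetch_all(fake, 'Sandwiches')]
print(ratings, fake.calls)  # [1, 2, 3, 4, 5] 3
```

Five records at two per page means three calls, and the caller never sees a page boundary.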

And generators are lazy, too, so if we just want the first twenty items we can do so without worrying about premature pagination:

for i, record in enumerate(fetch_all_records('Sandwiches')):
    # Stop after twenty records; later pages are never requested.
    if i >= 20:
        break
    print(record)
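The standard library's `itertools.islice` does the same "first N, then stop" trick without a counter. Here's a small sketch that shows the laziness directly; `fetch_numbers` and the `stats` counter are made up for the demo and stand in for a paginated API that would happily go on forever.

```python
from itertools import islice

stats = {'pages': 0}  # counts pretend network calls


def fetch_numbers(page_size=50):
    """Hypothetical endless paginated source: yields 0, 1, 2, ... page by page."""
    start = 0
    while True:
        stats['pages'] += 1  # imagine an HTTP request happening here
        yield from range(start, start + page_size)
        start += page_size


# Take twenty items and walk away; the generator is never resumed after that.
first_twenty = list(islice(fetch_numbers(), 20))
print(len(first_twenty), stats['pages'])  # 20 1
```

Only one "page" is ever fetched, even though the source itself never ends.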

Generators aren’t great for everything.

For instance, anything that needs the whole dataset more than once is rough: every call to fetch_all_records starts a fresh generator, so this code pages through the entire table twice.

best_rating = max([record['fields']['Rating'] for record in fetch_all_records('Sandwiches')])
worst_rating = min([record['fields']['Rating'] for record in fetch_all_records('Sandwiches')])

In such a case, you’re better off converting the generator to a list once and reusing it.

(But you might be best off with a different approach entirely.)

all_records = list(fetch_all_records('Sandwiches'))
best_rating = max([record['fields']['Rating'] for record in all_records])
worst_rating = min([record['fields']['Rating'] for record in all_records])
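One such different approach, sketched here under the assumption that the ratings are plain comparable numbers: walk the generator a single time and track both extremes as you go, so nothing gets fetched twice and nothing gets held in memory. `rating_range` and `sample_records` are hypothetical helpers for illustration only.

```python
def rating_range(records):
    """Return (worst, best) rating from any iterable of records in one pass."""
    ratings = (record['fields']['Rating'] for record in records)
    try:
        worst = best = next(ratings)
    except StopIteration:
        raise ValueError('no records to rate') from None
    for rating in ratings:
        if rating < worst:
            worst = rating
        elif rating > best:
            best = rating
    return worst, best


def sample_records():
    """Made-up records standing in for fetch_all_records('Sandwiches')."""
    for rating in [3, 5, 1, 4]:
        yield {'fields': {'Rating': rating}}


worst_rating, best_rating = rating_range(sample_records())
print(worst_rating, best_rating)  # 1 5
```

The tradeoff is a bit more code in exchange for constant memory, which only really matters once the table gets big.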

Ultimately, generators are the best kind of Python feature:

  1. They make it easier to understand code.
  2. They make it easier to write code.
  3. They’re just neat.

for record in fetch_all_records('Sandwiches'):
    # Gotta find a great sandwich in Seattle!
    if record['fields']['Rating'] > 4 and record['fields']['City'] == 'Seattle':
        print(record)
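If one great sandwich is enough, a generator expression plus `next` stops at the first match instead of scanning everything. This is just a sketch: `sample_sandwiches` and its records are invented for the example, shaped like the records shown earlier.

```python
def sample_sandwiches():
    """Hypothetical records standing in for fetch_all_records('Sandwiches')."""
    yield {'id': 'a1', 'fields': {'Name': 'Hoagie', 'City': 'Philadelphia', 'Rating': 5}}
    yield {'id': 'b2', 'fields': {'Name': 'Salumi', 'City': 'Seattle', 'Rating': 5}}
    yield {'id': 'c3', 'fields': {'Name': 'Mystery Melt', 'City': 'Seattle', 'Rating': 2}}


# Nothing runs yet: this only describes the filter.
great_in_seattle = (
    record for record in sample_sandwiches()
    if record['fields']['Rating'] > 4 and record['fields']['City'] == 'Seattle'
)

# Pull the first match, or None if no sandwich qualifies.
lunch = next(great_in_seattle, None)
print(lunch['fields']['Name'] if lunch else 'No luck today')  # Salumi
```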

Time for lunch!

  1. Don’t know what Airtable is? It’s basically Excel for developers. It’s super cool. Check it out.