Python 3.x: Test if generator has elements remaining

Question

This is a great question. I’ll try to show you how we can use Python’s introspective abilities and open source to get an answer. We can use the dis module to peek behind the curtain and see how the CPython interpreter implements a for loop over an iterator.

>>> def for_loop(iterable):
...     for item in iterable:
...         pass  # do nothing
...     
>>> import dis
>>> dis.dis(for_loop)
  2           0 SETUP_LOOP              14 (to 17) 
              3 LOAD_FAST                0 (iterable) 
              6 GET_ITER             
        >>    7 FOR_ITER                 6 (to 16) 
             10 STORE_FAST               1 (item) 

  3          13 JUMP_ABSOLUTE            7 
        >>   16 POP_BLOCK            
        >>   17 LOAD_CONST               0 (None) 
             20 RETURN_VALUE

The juicy bit appears to be the FOR_ITER opcode. We can’t dive any deeper using dis, so let’s look up FOR_ITER in the CPython interpreter’s source code. If you poke around, you’ll find it in Python/ceval.c. The exact bytecode and C source vary between CPython versions, but the idea is the same. Here’s the whole thing:

    TARGET(FOR_ITER)
        /* before: [iter]; after: [iter, iter()] *or* [] */
        v = TOP();
        x = (*v->ob_type->tp_iternext)(v);
        if (x != NULL) {
            PUSH(x);
            PREDICT(STORE_FAST);
            PREDICT(UNPACK_SEQUENCE);
            DISPATCH();
        }
        if (PyErr_Occurred()) {
            if (!PyErr_ExceptionMatches(
                            PyExc_StopIteration))
                break;
            PyErr_Clear();
        }
        /* iterator ended normally */
        x = v = POP();
        Py_DECREF(v);
        JUMPBY(oparg);
        DISPATCH();

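Do you see how this works? We try to grab an item from the iterator by calling its tp_iternext slot. If that call returns NULL, we check whether an exception was raised. If it’s StopIteration, we clear it and consider the iterator exhausted; any other exception breaks out of the handler and propagates. A NULL result with no exception set is also treated as normal exhaustion.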
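To make the C code above more concrete, here is a rough pure-Python sketch of the same logic. The function name for_loop_equivalent is made up for illustration, and the list it returns only stands in for a loop body; this is an approximation of what the interpreter does, not its actual implementation.

```python
def for_loop_equivalent(iterable):
    """Collect items the way a for loop would, using explicit next() calls."""
    collected = []
    iterator = iter(iterable)        # roughly what GET_ITER does
    while True:
        try:
            item = next(iterator)    # roughly what FOR_ITER does
        except StopIteration:
            break                    # iterator ended normally; leave the loop
        collected.append(item)       # stands in for the loop body
    return collected


print(for_loop_equivalent(x * x for x in range(4)))  # [0, 1, 4, 9]
```

Note that the only way the loop learns the iterator is finished is by asking for one more item and catching StopIteration.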

So how does a for loop “just know” when an iterator has been exhausted? Answer: it doesn’t — it has to try and grab an element. But why?

Part of the answer is simplicity: to implement an iterator, you only have to define one operation, which is to grab the next element. More importantly, this design keeps iterators lazy: they produce only the values they absolutely have to, and only when asked.
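A small sketch can show the laziness in action. The generator squares and the list produced are hypothetical names used only for this demonstration; the list records when each value is actually computed.

```python
produced = []  # records every value the generator actually computes


def squares(n):
    for i in range(n):
        produced.append(i)  # side effect so we can observe the work being done
        yield i * i


gen = squares(1000)
print(produced)             # [] because creating the generator runs none of its body

first = next(gen)
second = next(gen)
print(first, second)        # 0 1
print(produced)             # [0, 1] because only two of the 1000 values were computed
```

If the generator had to report in advance whether more elements remain, it would need to compute the next value before anyone asked for it.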

Finally, if you are really missing this feature, it’s trivial to implement it yourself. Here’s an example:

class LookaheadIterator:

    def __init__(self, iterable):
        self.iterator = iter(iterable)
        self.buffer = []

    def __iter__(self):
        return self

    def __next__(self):
        if self.buffer:
            return self.buffer.pop()
        else:
            return next(self.iterator)

    def has_next(self):
        if self.buffer:
            return True

        try:
            self.buffer = [next(self.iterator)]
        except StopIteration:
            return False
        else:
            return True


x = LookaheadIterator(range(2))

print(x.has_next())  # True
print(next(x))       # 0
print(x.has_next())  # True
print(next(x))       # 1
print(x.has_next())  # False
print(next(x))       # raises StopIteration: the iterator is exhausted
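If you would rather not wrap the iterator in a class, a similar effect can be sketched with a sentinel default for next() and itertools.chain. The helper name peek and the _MISSING sentinel are assumptions made for this example, not part of the standard library. The caller must use the returned iterator afterwards, because one item has already been pulled from the original.

```python
import itertools

_MISSING = object()  # hypothetical sentinel; cannot collide with a real item


def peek(iterator):
    """Return (has_items, iterator); the returned iterator still yields every item."""
    first = next(iterator, _MISSING)   # the default avoids raising StopIteration
    if first is _MISSING:
        return False, iter(())         # exhausted: hand back an empty iterator
    # Put the consumed item back in front of the remaining ones.
    return True, itertools.chain([first], iterator)


gen = (n for n in range(2))
has_items, gen = peek(gen)
print(has_items, list(gen))  # True [0, 1]

has_items, gen = peek(gen)
print(has_items, list(gen))  # False []
```

Like LookaheadIterator, this still has to fetch an element to find out whether one exists; it only hides the bookkeeping.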
