Update: This is not the best solution I have come up with, but I’ve left it because there are so many comments relating to it.
You have a set of events (birth/death), parental state (no descendants, parent, grandparent, etc) and life state (alive, dead).
I would store my data in structures with the following fields:
mother
father
generations
is_alive
may_have_living_ancestor
Sort your events by date, and then for each event take one of the following two courses of logic:
Birth:
Create new person with a mother, father, 0 generations, who is alive and may
have a living ancestor.
For each parent:
If generations increased, then recursively increase generations for
all living ancestors whose generations increased. While doing that,
set the may_have_living_ancestor flag to false for anyone for whom it is
discovered that they have no living ancestors. (You only iterate into
a person's ancestors if you increased their generations, and if they
still could have living ancestors.)
Death:
Emit the person's name and generations.
Set their is_alive flag to false.
The worst case is O(n*n)
if everyone has a lot of living ancestors. However in general you’ve got the sorting preprocessing step which is O(n log(n))
and then you’re O(n * avg no of living ancestors)
which means that the total time tends to be O(n log(n))
in most populations. (I hadn’t counted the sorting prestep properly, thanks to @Alexey Kukanov for the correction.)