Select iterates once over the entire set and, for each item, performs a conditional branch (checking for validity) and a + operation.
Where+Select creates an iterator that skips invalid elements (doesn’t yield them), performing a + only on the valid items.
So, the cost for a Select is:
t(s) = n * ( cost(check valid) + cost(+) )
And for Where+Select:
t(ws) = n * ( cost(check valid) + p(valid) * (cost(yield) + cost(+)) )
Where:
p(valid)is the probability that an item in the list is valid.cost(check valid)is the cost of the branch that checks for validitycost(yield)is the cost of constructing the new state of thewhereiterator, which is more complex than the simple iterator that theSelectversion uses.
As you can see, for a given n, the Select version is a constant, whereas the Where+Select version is a linear equation with p(valid) as a variable. The actual values of the costs determine the intersection point of the two lines, and since cost(yield) can be different from cost(+), they don’t necessarily intersect at p(valid)=0.5.