> but as far as I can tell the only thing it does is cast the result to a list before returning it
You aren’t missing anything. That is the key difference. Except it isn’t a cast as such: the actual returned object is very different. Basically, there are two ways of reading data:
- in a streaming API each element is yielded individually; this is very memory efficient, but if you do lots of subsequent processing per item, it means that your connection / command could be “active” for an extended time
- in a buffered API all the rows are read before anything is yielded
If you are reading a very large amount of data (thousands to millions of rows), a non-buffered API may be preferable: with buffering, lots of memory is used, and there may be noticeable latency before even the first row is available. However, in most common scenarios the amount of data read is within sensible limits, so it is reasonable to push it into a list before handing it to the caller. That means that the command / reader etc. has completed before it returns.
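To make the difference concrete, here is a rough analogy in Python using the standard `sqlite3` module. It is only a sketch of the two reading styles, not the code of the library being discussed; the function names `query_streaming` and `query_buffered` and the `items` table are made up for illustration.

```python
import sqlite3
from typing import Iterator, List, Tuple


def make_connection() -> sqlite3.Connection:
    """Create an in-memory database with a small hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    return conn


def query_streaming(conn: sqlite3.Connection, sql: str) -> Iterator[Tuple]:
    """Streaming: rows are yielded one at a time.

    Because this is a generator, nothing runs until the caller starts
    iterating, and the cursor stays open until the caller finishes.
    """
    cursor = conn.execute(sql)
    try:
        for row in cursor:
            yield row
    finally:
        cursor.close()


def query_buffered(conn: sqlite3.Connection, sql: str) -> List[Tuple]:
    """Buffered: every row is read into a list before anything is returned.

    The cursor is already closed by the time the caller sees the result.
    """
    cursor = conn.execute(sql)
    try:
        return cursor.fetchall()
    finally:
        cursor.close()
```

The caller of `query_buffered` gets a plain list and the read is over; the caller of `query_streaming` gets a lazy iterator, so any slow per-row work happens while the cursor is still open.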
As a side note, buffered mode also avoids the oh-so-common “there is already an open reader on the connection” error (or whatever the exact phrasing is).