Python Multiprocessing Process or Pool for what I am doing?

Question

The two scenarios you listed accomplish the same thing but in slightly different ways.

The first scenario starts two separate processes (call them P1 and P2) and starts P1 running foo and P2 running bar, and then waits until both processes have finished their respective tasks.

The second scenario starts two processes (call them Q1 and Q2) and first starts foo on either Q1 or Q2, and then starts bar on either Q1 or Q2. Then the code waits until both function calls have returned.

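So the net result is the same, but only the first case guarantees that foo and bar run in different processes. In the pool case, both calls can end up on the same worker if it becomes free before the other one picks up a task.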
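To see which worker handles each call, the pool scenario can be sketched roughly as follows. This assumes the second scenario used apply_async (the original code is not shown here). The bodies of foo and bar are placeholders that just report their process ID, and run_in_pool is a hypothetical helper name.

```python
import os
from multiprocessing import Pool


def foo():
    # Placeholder task: report which worker process ran it.
    return ("foo", os.getpid())


def bar():
    # Placeholder task: report which worker process ran it.
    return ("bar", os.getpid())


def run_in_pool():
    # Hypothetical helper: a pool with two workers, as in the second scenario.
    with Pool(processes=2) as pool:
        foo_result = pool.apply_async(foo)  # returns immediately
        bar_result = pool.apply_async(bar)  # returns immediately
        # .get() blocks until the corresponding call has returned.
        return foo_result.get(), bar_result.get()


if __name__ == "__main__":
    (foo_name, foo_pid), (bar_name, bar_pid) = run_in_pool()
    print(foo_name, "ran in process", foo_pid)
    # The two PIDs may or may not differ; the pool decides.
    print(bar_name, "ran in process", bar_pid)
```

With tasks this short, it would not be surprising to see the same PID printed twice on some runs.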

As for the specific questions you had about concurrency, the .join() method on a Process does indeed block until that process has finished. But because you called .start() on both P1 and P2 (in your first scenario) before joining either of them, both processes run concurrently. The parent will, however, wait until P1 finishes before it starts waiting for P2; if P2 has already finished by then, its .join() returns immediately.
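A minimal sketch of that start-then-join pattern, assuming foo and bar are simple functions that each take about half a second (the sleeps are placeholders, and run_both is a hypothetical helper name):

```python
import time
from multiprocessing import Process


def foo():
    time.sleep(0.5)  # placeholder for real work


def bar():
    time.sleep(0.5)  # placeholder for real work


def run_both():
    # Hypothetical helper mirroring the first scenario.
    p1 = Process(target=foo)
    p2 = Process(target=bar)
    started = time.perf_counter()
    p1.start()  # returns immediately; foo now runs in P1
    p2.start()  # returns immediately; bar now runs in P2
    p1.join()   # blocks until P1 has finished
    p2.join()   # blocks until P2 has finished (it may already be done)
    elapsed = time.perf_counter() - started
    return elapsed, p1.exitcode, p2.exitcode


if __name__ == "__main__":
    elapsed, code1, code2 = run_both()
    print(f"both finished in {elapsed:.2f}s, exit codes {code1} and {code2}")
```

Because both processes are started before either is joined, the elapsed time should be closer to the duration of one task than to the sum of both, although process startup overhead varies by platform. Calling p1.join() before p2.start() would instead run the two tasks one after the other.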

For your questions about the pool scenario, you should call pool.close() once you have finished submitting work, usually followed by pool.join() to wait for the workers to exit. How much it matters depends on what you do with the pool afterwards, but simply letting it go out of scope is not something to rely on: the documentation advises managing the pool explicitly, either with a with statement or by calling close() or terminate() yourself. pool.map() is a different kind of animal. It splits an iterable of arguments across the pool processes, calls the same function on each argument in parallel, and blocks until all of the calls have completed before returning the list of results in input order.
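As a rough illustration of pool.map() together with explicit cleanup, here is a small sketch. The square function and the map_squares helper are made-up names for this example, not anything from the original question.

```python
from multiprocessing import Pool


def square(x):
    # Illustrative worker function; any picklable top-level function works.
    return x * x


def map_squares(values):
    # Hypothetical helper: map one function over many arguments.
    pool = Pool(processes=2)
    try:
        # Blocks until every call has finished; results keep input order.
        results = pool.map(square, values)
    finally:
        pool.close()  # no more tasks will be submitted
        pool.join()   # wait for the worker processes to exit
    return results


if __name__ == "__main__":
    print(map_squares([1, 2, 3, 4]))  # prints [1, 4, 9, 16]
```

Note that join() may only be called after close() or terminate(). Using the pool as a context manager is shorter, but leaving the with block calls terminate() rather than close(), so results should be collected before the block ends.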
