As @pvg said in a comment, a (bounded) queue is the natural way to mediate among a producer and consumers with different speeds, ensuring they all stay as busy as possible but without letting the producer get way ahead.
Here’s a self-contained, executable example. The queue is restricted to a maximum size equal to the number of worker processes. If the consumers run much faster than the producer, it could make good sense to let the queue get bigger than that.
In your specific case, it would probably make sense to pass lines to the consumers and let them do the document = json.loads(line) part in parallel.
import multiprocessing as mp
NCORE = 4
def process(q, iolock):
from time import sleep
while True:
stuff = q.get()
if stuff is None:
break
with iolock:
print("processing", stuff)
sleep(stuff)
if __name__ == '__main__':
q = mp.Queue(maxsize=NCORE)
iolock = mp.Lock()
pool = mp.Pool(NCORE, initializer=process, initargs=(q, iolock))
for stuff in range(20):
q.put(stuff) # blocks until q below its max size
with iolock:
print("queued", stuff)
for _ in range(NCORE): # tell workers we're done
q.put(None)
pool.close()
pool.join()