The os module provides urandom, even on Windows:
import os
bytearray(os.urandom(1000000))
This seems to perform as quickly as you need. In fact, I get better timings than with your numpy version (though our machines could be wildly different):
import timeit
timeit.timeit(lambda: bytearray(os.urandom(1000000)), number=10)
0.0554857286941
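For convenience, the two snippets above can be combined into one self-contained script. This is only a minimal sketch: the helper name `random_bytearray` is my own and not part of any library, and the timing it prints will differ from machine to machine.

```python
import os
import timeit


def random_bytearray(n):
    """Return a mutable bytearray of n bytes read from the OS random source."""
    # os.urandom returns an immutable bytes object; bytearray() copies it
    # into a mutable buffer.
    return bytearray(os.urandom(n))


if __name__ == "__main__":
    buf = random_bytearray(1000000)
    print(len(buf))

    # Total time for 10 calls, as in the measurement above.
    elapsed = timeit.timeit(lambda: random_bytearray(1000000), number=10)
    print("10 runs: %.4f s" % elapsed)
```

Since the result is a bytearray, individual bytes can be changed in place afterwards (for example `buf[0] = 0`), which is not possible with the bytes object that os.urandom returns directly.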