Model Parallelism NumPy Performance