Integer
0, Most machines
1, NUMA machines
This parameter directs FMS to redistribute the work among the threads on a node during matrix multiplication to reduce the number of nonlocal memory references. Where possible, FMS does this by computing blocks that are closer to square in shape.
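The following sketch illustrates why squarer blocks tend to touch less memory. It is not FMS code. It assumes a simple model in which a thread computing a rows x cols block of C = A*B reads rows*inner elements of A and cols*inner elements of B. The function names and the example sizes are hypothetical.

```python
import math


def operand_elements(rows, cols, inner):
    """Elements of A and B read to compute a rows x cols block of C = A*B.

    Simple model (an assumption, not the FMS algorithm): the block needs
    `rows` rows of A and `cols` columns of B, each of length `inner`.
    """
    return (rows + cols) * inner


def squarest_split(area):
    """Return (rows, cols) with rows * cols == area and rows as close
    to cols as an exact integer factorization allows."""
    rows = math.isqrt(area)
    while area % rows != 0:
        rows -= 1
    return rows, area // rows


if __name__ == "__main__":
    inner = 100   # hypothetical inner (summation) dimension
    area = 64     # hypothetical number of C entries assigned to one thread

    thin = (1, area)               # long, thin block
    square = squarest_split(area)  # (8, 8)

    for rows, cols in (thin, square):
        print(rows, cols, operand_elements(rows, cols, inner))
```

Under this model both blocks perform the same number of multiply-adds (rows * cols * inner), but the 1 x 64 block reads 6500 operand elements while the 8 x 8 block reads 1600. On a NUMA machine, where some of those operands are likely to reside in another node's memory, fewer operand reads generally means fewer nonlocal references.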