1. A method of operating an array processor, the method comprising:

organizing an N-row by M-column torus array processor into M clusters of N processing elements (PEs), where M and N are positive integers both greater than 1, said organizing including, for each value of y, where y ∈ {0, 1} for M=2 and y ∈ {0, 1, . . . , M−1} for M>2, establishing a cluster y of N PEs with a unique PE0,y, the remaining PEs of cluster y selected by PEa,(y+M−a) Mod M for each a, where a=1 for N=2, a ∈ {1, 2} for N=3, and a ∈ {1, 2, . . . , N−1} for N>3;

receiving a communication instruction in each PE, the M clusters interconnected by cluster switches, wherein the communication instruction provides software accessible control of the cluster switches, wherein each PE in a first cluster of N PEs controls an associated switching element (SE) of an associated cluster switch coupled to the first cluster of N PEs to select a signal path in response to the received communication instruction, wherein the selected signal path begins from each PE that controls the associated SE, continues through the associated cluster switch, and then continues through an adjacent cluster switch to a torus nearest neighbor adjacent PE in an adjacent cluster of N PEs, and wherein N signal paths are selected by the N PEs in the first cluster of N PEs; and

transmitting N source values, a different source value transmitted from each PE in the first cluster of N PEs, in parallel across the selected N signal paths to N torus nearest neighbor adjacent PEs in response to the communication instruction, wherein the associated cluster switch and the adjacent cluster switch each provide N shared mutually exclusive direction paths between the first cluster of N PEs and the adjacent cluster of N PEs.
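
The cluster selection rule recited above can be illustrated with a minimal Python sketch. It is offered only as an aid to reading the index arithmetic, not as the claimed implementation. The function names (build_clusters, cluster_of, east_neighbor, transmit_east) are hypothetical, PEs are modeled as (row, column) tuples, and the cluster switches and switching elements are not modeled; only the east direction is shown, on the assumption that a PE's east torus neighbor lies in the same row at column (c+1) Mod M.

```python
def build_clusters(n_rows, n_cols):
    """Cluster y holds PE(a, (y + M - a) mod M) for a = 0 .. N-1.

    a = 0 yields the unique PE(0, y); a >= 1 yields the remaining PEs.
    """
    if n_rows < 2 or n_cols < 2:
        raise ValueError("N and M must both be greater than 1")
    return [
        [(a, (y + n_cols - a) % n_cols) for a in range(n_rows)]
        for y in range(n_cols)
    ]


def cluster_of(pe, n_cols):
    """Invert the selection rule: PE(a, c) belongs to cluster (a + c) mod M."""
    row, col = pe
    return (row + col) % n_cols


def east_neighbor(pe, n_cols):
    """East torus nearest neighbor (assumed: same row, next column, wrapping)."""
    row, col = pe
    return (row, (col + 1) % n_cols)


def transmit_east(clusters, y, values, n_cols):
    """Send one distinct value from each PE of cluster y to its east neighbor.

    Returns a dict mapping each destination PE to the value it receives.
    """
    sources = clusters[y]
    if len(values) != len(sources):
        raise ValueError("one source value per PE is required")
    return {east_neighbor(pe, n_cols): v for pe, v in zip(sources, values)}


if __name__ == "__main__":
    N, M = 4, 4
    clusters = build_clusters(N, M)
    received = transmit_east(clusters, 0, ["a", "b", "c", "d"], M)
    # Every destination lies in the single adjacent cluster (0 + 1) mod M.
    print({cluster_of(pe, M) for pe in received})
```

Under these assumptions, all N east neighbors of the PEs in cluster y fall within cluster (y+1) Mod M, which is consistent with N parallel signal paths running between one pair of adjacent clusters.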