Some special-purpose processors could use the following instruction.
MISC  NEXT  TO  FTN  DATA
32    32    32  32   128
(The addresses are only 32 bits long, so the most significant 32 bits, which are not indicated, are taken to be the same as for the last instruction step.)
The 'MISC' bits have the following values.
0...0,1  0...0   FL3,FL2,FL1,FL0  FF1,FF0  T,C  S2,S1,S0  0...0
8        8       4                2        1 1  3         5
OPCODE   UNUSED  FLAG             FLAG     FTN  STEP      UNUSED
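The field layout above can be sketched as a pair of pack and unpack helpers. This is only an illustration: it assumes the fields are packed most significant first in the order listed, and the names UNUSED_HI, UNUSED_LO, FL, FF and S are hypothetical labels for the bit groups.

```python
# Field widths in bits, most significant field first (assumed packing order).
INSTRUCTION_FIELDS = [("MISC", 32), ("NEXT", 32), ("TO", 32), ("FTN", 32), ("DATA", 128)]

# MISC sub-fields; the group names are hypothetical labels for the bit groups.
MISC_FIELDS = [("OPCODE", 8), ("UNUSED_HI", 8), ("FL", 4), ("FF", 2),
               ("T", 1), ("C", 1), ("S", 3), ("UNUSED_LO", 5)]


def pack(values, fields):
    """Build one integer from named field values; missing fields default to 0."""
    word = 0
    for name, width in fields:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word, fields):
    """Split an integer into named fields, most significant field first."""
    total = sum(width for _, width in fields)
    if not 0 <= word < (1 << total):
        raise ValueError(f"word does not fit in {total} bits")
    out = {}
    shift = total
    for name, width in fields:
        shift -= width
        out[name] = (word >> shift) & ((1 << width) - 1)
    return out


# OPCODE = 1 corresponds to the "0...0,1" pattern in the layout.
misc = pack({"OPCODE": 1, "S": 5}, MISC_FIELDS)
instruction = pack({"MISC": misc, "NEXT": 0x100, "TO": 0x200,
                    "FTN": 7, "DATA": 0xDEADBEEF}, INSTRUCTION_FIELDS)
fields = unpack(instruction, INSTRUCTION_FIELDS)
```

With this packing the whole instruction is 256 bits wide, and the rightmost 32 bits of the instruction are the low 32 bits of 'DATA'.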
The first step in executing this instruction is to copy (fetch) it from memory into the state, using the 'NEXT' address of the last instruction. The 128 'DATA' bits are the input to the function. The 32 'FTN' (function) bits indicate the processor address of the special-purpose processor. The second step of executing the instruction is to go to the special-purpose processor with the 'DATA' bits as inputs. When the state comes out of the special-purpose processor, the 128 'DATA' bits hold the output of the special-purpose processor. The third step of the instruction is to store the result (the 128 'DATA' bits) at the 'TO' address. The first step of the next instruction is to copy (fetch) the next instruction from memory into the state, using the 'NEXT' address of this instruction. The 'MISC' bits are the same as for the rotate and mask instruction.
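The three steps can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not a definitive model: memory is a Python dict, an instruction is stored as a dict of its fields, and the special-purpose processors are ordinary functions keyed by their 'FTN' address. The name execute_special and the squaring processor are hypothetical.

```python
MASK128 = (1 << 128) - 1


def execute_special(memory, processors, next_address):
    """Run one special-purpose instruction and return the following 'NEXT' address."""
    # Step 1: fetch the instruction into the state from the 'NEXT' address
    # left by the last instruction.
    state = dict(memory[next_address])

    # Step 2: send the 128 'DATA' bits through the special-purpose processor
    # selected by 'FTN'; the output replaces 'DATA' in the state.
    processor = processors[state["FTN"]]
    state["DATA"] = processor(state["DATA"]) & MASK128

    # Step 3: store the result at the 'TO' address.
    memory[state["TO"]] = state["DATA"]

    # The next instruction is fetched from this instruction's 'NEXT' address.
    return state["NEXT"]


# Hypothetical setup: one instruction at address 0 and a squaring processor at 0x10.
memory = {0: {"MISC": 1 << 24, "NEXT": 1, "TO": 100, "FTN": 0x10, "DATA": 12}}
processors = {0x10: lambda data: data * data}
following = execute_special(memory, processors, 0)
```

After the call, the result sits at address 100 in memory and following holds the address of the next instruction to fetch.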
Because the input data is in the instruction, the 'addressing mode' for the input data is called 'immediate addressing.' The rotate and mask instruction gets its input data from the ADDRESS in the instruction. That is called 'direct addressing.' The rotate and mask instruction also has the 'TO' address to store the data to. That is called 'direct addressing,' too.
A regular rotate and mask instruction could copy 32 bits from somewhere in memory to the rightmost 32 bits of the instruction. This rotate and mask instruction could be eliminated by allowing direct addressing in the special-purpose instruction. The rightmost unused 'MISC' bit in the special-purpose instruction could indicate direct addressing when the bit has value 1. This would cause an extra step in the instruction. This step would use the rightmost 32 bits of the instruction as an address to look up 32 bits of data, which would then replace those 32 bits (of address). Of course, there could be more flags and more direct addressing.
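The extra direct-addressing step might look like the following sketch. It assumes the flag is the least significant bit of 'MISC' and that memory is a dict mapping addresses to 32-bit values; the names DIRECT_FLAG and resolve_direct are hypothetical.

```python
DIRECT_FLAG = 1          # rightmost (least significant) 'MISC' bit, assumed position
MASK32 = 0xFFFFFFFF


def resolve_direct(misc, data, memory):
    """Return the 'DATA' bits after the optional direct-addressing step."""
    if misc & DIRECT_FLAG:
        # Treat the rightmost 32 bits of the instruction as an address.
        address = data & MASK32
        operand = memory[address] & MASK32
        # Replace those 32 address bits with the 32 bits found in memory.
        data = (data & ~MASK32) | operand
    return data


memory = {0x40: 0x12345678}
data = (0xAB << 32) | 0x40
direct = resolve_direct(DIRECT_FLAG, data, memory)   # flag set: operand is looked up
immediate = resolve_direct(0, data, memory)          # flag clear: data used as is
```

With the flag clear the instruction behaves exactly as before, so existing immediate-mode instructions would be unaffected.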
Direct addressing can replace a four-step rotate-and-mask instruction with one step in the special-purpose instruction. This can speed the processor significantly if used often. It also saves the memory taken up by the replaced rotate-and-mask instruction.
One type of special-purpose processor that could be used often, and that could save many instructions each time it is used, is a high-precision multiplier. Another is a programmable logic (gate) array (PLA).
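As an illustration of how a multiplier could fit the 128-bit 'DATA' interface, here is a minimal sketch. The operand layout is an assumption that the text does not specify: the upper 64 bits of 'DATA' hold one unsigned operand, the lower 64 bits hold the other, and the 128-bit product is returned in 'DATA'. The name multiplier is hypothetical.

```python
MASK64 = (1 << 64) - 1


def multiplier(data):
    """Multiply the two 64-bit halves of 'DATA' (assumed layout) into a 128-bit product."""
    a = (data >> 64) & MASK64   # upper half: first operand
    b = data & MASK64           # lower half: second operand
    return a * b                # the product of two 64-bit values fits in 128 bits


small = multiplier((3 << 64) | 5)
largest = multiplier((MASK64 << 64) | MASK64)
```

A single instruction of this kind would stand in for the sequence of shifts, adds and carries otherwise needed to form a wide product.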
The speedup for a modified von Neumann architecture coprocessor with special-purpose processors over a regular von Neumann processor with special-purpose processors should be large, though not nearly as large as for the one-instruction case.