One of the problems with this forum is that you can't hear the murmur
of assent ripple through the hardware design crowd when Larry rants
about this stuff. Larry has had his head out of the box for a long
Look at the ASCI project. The intention was for SGI to build an
Origin with around 1000 CPUs. That Origin had extra cache coherence
directory RAM and special encodings in that RAM so that the hardware
could actually keep the memory across all 1000 CPUs coherent. We
added extra physical address bits to the R10K to make this machine
Last I heard, the machine is mostly programmed with message passing.
I remember having a talk with an O/S guy who was implementing some
sort of message delivery utility inside the O/S. This was when
Cellular IRIX was in development, and they were investigating having
the various O/S images talk to each other with messages across the
shared memory. Then someone found out the O/S images could signal
each other FASTER through the HIPPI connections than they could
through shared memory. That is, this machine had a HIPPI port local
to each O/S image, and all those HIPPI ports were connected together
via a HIPPI switch.
Those HIPPI connections were built with the _same_physical_link_ as
the shared memory - an 800 MB/s source-synchronous channel. But if
you're sending a message, it's better to have the I/O system just
send the bits one way than have the shared memory system do two round
trips, one to invalidate the mailbox buffer for writing and another to
process the remote cache miss to receive the message.
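
To put rough numbers on that, here is a minimal sketch that just counts
link traversals per delivered message. It assumes every crossing of the
link costs the same and ignores software overhead, directory lookups and
bandwidth. The function names and the 100 ns figure are made up for
illustration; they are not measurements from the real machine.

```python
# Hypothetical cost model: count one-way crossings of the physical link
# needed to deliver one message. Not measured data.

ONE_WAY = 1                 # a single crossing of the link
ROUND_TRIP = 2 * ONE_WAY    # request out, response back


def io_send_traversals():
    """I/O-style send: the bits go across the link once, one way."""
    return ONE_WAY


def shared_memory_mailbox_traversals():
    """Mailbox in coherent shared memory: two round trips."""
    invalidate = ROUND_TRIP   # sender invalidates the mailbox buffer so it can write
    remote_miss = ROUND_TRIP  # receiver takes a remote cache miss to read the message
    return invalidate + remote_miss


def latency_ns(traversals, link_latency_ns):
    """Total link latency, assuming each crossing costs link_latency_ns."""
    return traversals * link_latency_ns


if __name__ == "__main__":
    assumed_link_ns = 100  # assumed per-crossing latency, purely illustrative
    print("I/O send:      ", latency_ns(io_send_traversals(), assumed_link_ns), "ns")
    print("shared memory: ", latency_ns(shared_memory_mailbox_traversals(), assumed_link_ns), "ns")
```

Under those assumptions the shared memory mailbox crosses the link four
times for every one crossing of the I/O send, which is the gap described
above even though both paths ride on the same wire.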