From: Arnd Bergmann <arnd@arndb.de>
To: Michael Schnell <mschnell@lumino.de>
Subject: Re: atomic RAM ?
Date: Mon, 12 Apr 2010 17:02:35 +0200
User-Agent: KMail/1.12.2 (Linux/2.6.31-19-generic; KDE/4.3.2; x86_64; ; )
Cc: "linux-kernel" <linux-kernel@vger.kernel.org>,
       "nios2-dev" <nios2-dev@sopc.et.ntust.edu.tw>
References: <4BBD86A5.5030109@lumino.de> <201004091714.04990.arnd@arndb.de> <4BC2EEBD.3070504@lumino.de>
In-Reply-To: <4BC2EEBD.3070504@lumino.de>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201004121702.35237.arnd@arndb.de>
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2586
Lines: 50

On Monday 12 April 2010, Michael Schnell wrote:
> > You already need that with a non-SMP system anyway. As Alan explained,
> > futex is only an optimization for a relatively uninteresting case
> > (multi-threaded user applications), you really need to solve this for
> > kernel space first, because the kernel is inherently multi-threaded.
> >   
> I don't see why optimizing for speed and especially latency is
> uninteresting (with embedded systems like the one I'm planning).
> 
> Multi-threaded user applications is exactly the case that is extremely
> interesting to me and that is why I started this discussion. The
> non-SMP kernel (and non-futex) case is already solved for NIOS
> (supposedly by interrupt disabling). An SMP Linux is not yet crafted
> (and for me it's a lot lower priority than decent user-space
> multithreading, but of course it is a valuable task).

Ok. Your initial post didn't make it clear that this is all you are
looking for. While atomic CPU operations would solve this problem,
on a non-SMP system you don't really need to make the RAM access
itself atomic, only the instruction flow: the read-modify-write
sequence must not be interrupted by another thread.

> > If you want to have atomics in user space, why not go all the way and
> > make a small extension to your cache coherency logic to do load-locked/
> > store-conditional as well.
> Of course doing load-locked, store-conditional custom instructions was
> an option I did consider, but as there is no way to access memory
> through cache and MMU with custom instructions, I don't see how this
> could be done, as the current way FUTEX works, the code will define the
> DWORDs to be handled atomically anywhere in the user space memory. Of
> course disabling the cache completely is not an option for a task that
> is aimed to improve user space performance.

Right. So if you cannot implement a 'test-and-set', 'exchange' or
'store-conditional' instruction, I don't think any custom instructions
will help you.

You can probably implement an atomic function in a VDSO though, without
any CPU extensions. I think this has been discussed for Blackfin
before. The idea is to let the kernel check, while returning to user
space, whether the instruction pointer is in the critical section of
the VDSO. If it is, the kernel can jump back to the caller of that
function instead of the function itself, and indicate failure so the
user can retry.
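
A rough sketch of what that check on the return-to-user path could look
like, just to illustrate the idea. All names here (struct user_regs,
struct vdso_atomic_range, fixup_vdso_atomic, VDSO_ATOMIC_FAILED) are
made up for this example and are not existing kernel interfaces. It
assumes the critical section ends with the single store that commits
the result, and that the return address is still available in a
register at that point:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical saved user register state; field names are illustrative. */
struct user_regs {
	uintptr_t pc;     /* instruction pointer when the task was interrupted */
	uintptr_t ra;     /* return address of the interrupted VDSO function */
	long      retval; /* register that carries the function's return value */
};

/* Hypothetical bounds of the critical section inside the VDSO. */
struct vdso_atomic_range {
	uintptr_t start;  /* first instruction of the critical section */
	uintptr_t end;    /* first instruction after the committing store */
};

#define VDSO_ATOMIC_FAILED (-1L)

/*
 * Called on the way back to user space. If the task was interrupted
 * before the committing store was executed, abandon the sequence:
 * resume in the caller and report failure so that it can retry.
 * Returns true if the registers were modified.
 */
static bool fixup_vdso_atomic(struct user_regs *regs,
			      const struct vdso_atomic_range *range)
{
	if (regs->pc < range->start || regs->pc >= range->end)
		return false;	/* not in the critical section */

	regs->pc = regs->ra;			/* skip the rest of the function */
	regs->retval = VDSO_ATOMIC_FAILED;	/* tell the caller to retry */
	return true;
}
```

The user side would then simply call the VDSO function in a loop until
it no longer returns the failure value.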

	Arnd