2002-09-25 04:55:31

by Pete Zaitcev

[permalink] [raw]
Subject: cmpxchg in 2.5.38

The cmpxchg() is used in kernel/pid.c:next_free_map():

if (cmpxchg(&map->page, NULL, (void *) page))
free_page(page);

It is implemented on alpha, i386, ia64, ppc64, ppc, sparc64,
x86_64, but not on mips, cris, arm, s390, s390x, sparc, sh, parisc.
Do all architectures have to have it?

-- Pete


2002-09-25 04:59:57

by David Miller

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

From: Pete Zaitcev <[email protected]>
Date: Wed, 25 Sep 2002 01:00:44 -0400

The cmpxchg() is used in kernel/pid.c:next_free_map():

if (cmpxchg(&map->page, NULL, (void *) page))
free_page(page);

It is implemented on alpha, i386, ia64, ppc64, ppc, sparc64,
x86_64, but not on mips, cris, arm, s390, s390x, sparc, sh, parisc.
Do all architectures have to have it?

Indeed, we can't really make cmpxchg a requirement, many
cpus lack it.

2002-09-25 07:13:06

by Oleg Drokin

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

Hello!

On Wed, Sep 25, 2002 at 01:00:44AM -0400, Pete Zaitcev wrote:
> The cmpxchg() is used in kernel/pid.c:next_free_map():
> It is implemented on alpha, i386, ia64, ppc64, ppc, sparc64,
> x86_64, but not on mips, cris, arm, s390, s390x, sparc, sh, parisc.
> Do all architectures have to have it?

Ingo said that arches that cannot do cmpxchg in hardware should
provide spinlock-based version.

Bye,
Oleg

2002-09-25 07:52:21

by David Miller

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

From: Oleg Drokin <[email protected]>
Date: Wed, 25 Sep 2002 11:18:20 +0400

Ingo said that arches that cannot do cmpxchg in hardware should
provide spinlock-based version.

That doesn't make any sense.

If cmpxchg cannot work with user bits of memory, like
cmpxchg is supposed to, it's really a crutch more than
anything else.

F.e. people will think that such systems can support DRM
correctly, they absolutely cannot.

2002-09-25 08:02:12

by Oleg Drokin

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

Hello!

On Wed, Sep 25, 2002 at 12:47:19AM -0700, David S. Miller wrote:

> Ingo said that arches that cannot do cmpxchg in hardware should
> provide spinlock-based version.
> That doesn't make any sense.

Yes, I know.

> If cmpxchg cannot work with user bits of memory, like
> cmpxchg is supposed to, it's really a crutch more than
> anything else.

Ingo's argument was that since there is only one place in code that accesses
that variable (map->page), it is safe to rely on such a crippled
cmpxchg implementation.

To not retell you whole story and avoid the role of broken phone,
here's info on prevous discussion on this topic:
Subject: Re: [patch] generic-pidhash-2.5.36-D4, BK-curr
September 20, From Ingo Molnar <[email protected]>, Message-ID: <[email protected]>
September 20, From Ingo Molnar <[email protected]>, Message-ID: <[email protected]>

Bye,
Oleg

2002-09-25 08:13:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38


On Wed, 25 Sep 2002, Oleg Drokin wrote:

> Ingo's argument was that since there is only one place in code that
> accesses that variable (map->page), it is safe to rely on such a
> crippled cmpxchg implementation.

yes. It's only this place in the code that ever modifies that word, and
that happens only once during the lifetime of this address, so i'll rather
add a spinlock to the generic PID allocator code, it's a very very rare
slowpath.

Ingo

2002-09-25 08:34:58

by Ingo Molnar

[permalink] [raw]
Subject: [patch] pidhash-2.5.38-A0


the attached patch removes the cmpxchg from the PID allocator and replaces
it with a spinlock. This spinlock is hit only a couple of times per
bootup, so it's not a performance issue.

Ingo

--- linux/kernel/pid.c.orig Wed Sep 25 10:36:13 2002
+++ linux/kernel/pid.c Wed Sep 25 10:38:55 2002
@@ -53,6 +53,8 @@

static pidmap_t *map_limit = pidmap_array + PIDMAP_ENTRIES;

+static spinlock_t pidmap_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
+
inline void free_pidmap(int pid)
{
pidmap_t *map = pidmap_array + pid / BITS_PER_PAGE;
@@ -77,8 +79,13 @@
* Free the page if someone raced with us
* installing it:
*/
- if (cmpxchg(&map->page, NULL, (void *) page))
+ spin_lock(&pidmap_lock);
+ if (map->page)
free_page(page);
+ else
+ map->page = (void *)page;
+ spin_unlock(&pidmap_lock);
+
if (!map->page)
break;
}

2002-09-25 12:00:20

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

On Wed, 25 Sep 2002, Ingo Molnar wrote:
> On Wed, 25 Sep 2002, Oleg Drokin wrote:
> > Ingo's argument was that since there is only one place in code that
> > accesses that variable (map->page), it is safe to rely on such a
> > crippled cmpxchg implementation.
>
> yes. It's only this place in the code that ever modifies that word, and
> that happens only once during the lifetime of this address, so i'll rather
> add a spinlock to the generic PID allocator code, it's a very very rare
> slowpath.

Furthermore archs that have cmpxchg() define __HAVE_ARCH_CMPXCHG, while
currently no code tests for it...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2002-09-25 17:37:18

by Pete Zaitcev

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

OK, I'll work around the cmpxchg locally.

But for the record, I am concerned with the API inflation
instigated by the single source. The scheduler-specific
bitmaps were bad enough, even though Bill Irvin showed
a better way to do it at the time. Now, the same gentleman
invents one more API. I may be getting senile, though.

-- Pete

2002-09-25 19:04:24

by David Miller

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

From: Ingo Molnar <[email protected]>
Date: Wed, 25 Sep 2002 10:26:34 +0200 (CEST)

It's only this place in the code that ever modifies that word, and
that happens only once during the lifetime of this address, so i'll rather
add a spinlock to the generic PID allocator code, it's a very very rare
slowpath.

You can't define a crippled primitive like this Ingo.

What if someone else starts to use it?

And more importantly, if you can't use it on a datum shared between
userspace and the kernel, you have to name it something other than
cmpxchg or make the DRM use something else.

2002-09-25 19:21:38

by Ingo Molnar

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38


On Wed, 25 Sep 2002, Pete Zaitcev wrote:

> OK, I'll work around the cmpxchg locally.

no need, i already sent a patch that removes the cmpxchg.

Ingo

2002-09-25 19:07:11

by David Miller

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

From: Ingo Molnar <[email protected]>
Date: Wed, 25 Sep 2002 10:26:34 +0200 (CEST)

yes. It's only this place in the code that ever modifies that word

I just realized... how would a crippled spinlock implementation
protect the readers looking at the word?

The operation is decidely non-atomic, because only one side
of the access is being synchronized.

This is another reason you can't use cmpxchg like this and expect
every architecture to be able to do something reasonable.

Use instead some algorithm with xchg() which is supported on every
platform.

2002-09-25 20:08:59

by David Miller

[permalink] [raw]
Subject: Re: cmpxchg in 2.5.38

From: Ingo Molnar <[email protected]>
Date: Wed, 25 Sep 2002 21:34:59 +0200 (CEST)

On Wed, 25 Sep 2002, Pete Zaitcev wrote:

> OK, I'll work around the cmpxchg locally.

no need, i already sent a patch that removes the cmpxchg.

Thanks so much.