by Nick Piggin

[permalink] [raw]

Subject: Re: [PATCH linux-2.6 04/04] brsem: convert cpucontrol to brsem

Tejun Heo wrote:

> +/*
> + * cpucontrol is a brsem used to synchronize cpu hotplug events.
> + * Invoking lock_cpu_hotplug() read-locks cpucontrol and no
> + * hotplugging events will occur until it's released.
> + *
> + * Unfortunately, brsem itself makes use of lock_cpu_hotplug() and
> + * performing brsem write-lock operations on cpucontrol deadlocks.
> + * This is avoided by...
> + *
> + * a. guaranteeing that cpu hotplug events won't occur during the
> + * write-lock operations, and
> + *
> + * b. skipping lock_cpu_hotplug() inside brsem.
> + *
> + * #a is achieved by acquiring and releasing cpucontrol_mutex outside
> + * cpucontrol write-lock. #b is achieved by skipping
> + * lock_cpu_hotplug() inside brsem if the current task is
> + * cpucontrol_mutex holder (is_cpu_hotplug_holder() test).
> + *
> + * Also, note that cpucontrol is first initialized with
> + * BRSEM_BYPASS_INITIALIZER and then initialized again with
> + * __create_brsem() instead of simply using create_brsem(). This is
> + * necessary as cpucontrol brsem gets used way before brsem subsystem
> + * becomes up and running.
> + *
> + * Until brsem is properly initialized, all brsem ops succeed
> + * unconditionally. cpucontrol becomes operational only after
> + * cpucontrol_init() is finished, which should be called after
> + * brsem_init_early().
> + */

Mmm, this is just insane IMO.

Note that I happen to also think the idea (brsems) have merit, and
that cpucontrol may be one of the places where a sane implementation
would actually be useful... but at least when you're introducing
this kind of complexity anywhere, you *really* need to be able to
back it up with numbers.

As far as the VFS race fix goes, I guess Al or someone else will
comment on its correctness. But I think it might be nicer to first
fix it with a regular rwsem and then show some numbers to justify
its conversion to a brsem.

If you need interruptible rwsems, I almost got an implementation
working a while back, and David Howells recently said he was
interested in doing them... so that's not an impossibility.

Nick

--
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com

2005-09-25 08:03:07

Nathan Lynch wrote:

>Nick Piggin wrote:
>
>>
>>Note that I happen to also think the idea (brsems) have merit, and
>>that cpucontrol may be one of the places where a sane implementation
>>would actually be useful... but at least when you're introducing
>>this kind of complexity anywhere, you *really* need to be able to
>>back it up with numbers.
>>
>
>The only performance-related complaint with cpu hotplug of which I'm
>aware -- that taking a cpu down on a large system can be painfully
>slow -- resides in the "write side" of the code, which is not the case
>that the brsem implementation optimizes. I think this patch would
>make that case even worse. So I don't think it's appropriate to use a
>brsem for cpu hotplug, especially without trying rwsem first.
>
>

I'm not sure that a brsem would make a noticable difference.

It isn't that cpu hotplug semaphore is a performance problem
now, but that it isn't being used in as many cases as it could
be due to its unscalable nature. For example, a while back I
wanted to use it in the fork() path in the scheduler but
couldn't.

Anyway, as I said, you need to be able to back it up with
numbers ;)

Nick

Send instant messages to your online friends http://au.messenger.yahoo.com

2005-09-26 04:05:36

by Tejun Heo

[permalink] [raw]

Subject: Re: [PATCH linux-2.6 04/04] brsem: convert cpucontrol to brsem

Hello, Nathan & Nick.

Nick Piggin wrote:
> Nathan Lynch wrote:
>
>> Nick Piggin wrote:
>>
>>>
>>> Note that I happen to also think the idea (brsems) have merit, and
>>> that cpucontrol may be one of the places where a sane implementation
>>> would actually be useful... but at least when you're introducing
>>> this kind of complexity anywhere, you *really* need to be able to
>>> back it up with numbers.
>>>
>>
>> The only performance-related complaint with cpu hotplug of which I'm
>> aware -- that taking a cpu down on a large system can be painfully
>> slow -- resides in the "write side" of the code, which is not the case
>> that the brsem implementation optimizes. I think this patch would
>> make that case even worse. So I don't think it's appropriate to use a
>> brsem for cpu hotplug, especially without trying rwsem first.
>>
>>

Actually, a patch which converts cpucontrol to rwsem was once in -mm.
I don't know what happened to it. I can't see it in 2.6.13-rc2-mm1.

>
> I'm not sure that a brsem would make a noticable difference.
>
> It isn't that cpu hotplug semaphore is a performance problem
> now, but that it isn't being used in as many cases as it could
> be due to its unscalable nature. For example, a while back I
> wanted to use it in the fork() path in the scheduler but
> couldn't.

I couldn't have put it better. As it currently stands, cpucontrol
doesn't have any heavy readers. It's partly because it's not necessary
but at least for some part it's because it cannot be used in such cases
as the overhead is too big. The same is true for super->s_umount rwsem
- read_down'ing per-fs rwsem on every write(2) will hurt performance on
SMP machines. I think we'll have more and more of this class of
synchronization problems as we add hotplug capability to subsystems.

Having spent a week on implementing something very ugly :-), I'm a bit
embarrased here but I still want to point out that we need something to
solve this problem.

> Anyway, as I said, you need to be able to back it up with
> numbers ;)

Right, gotta benchmark before committing full implementation.

Thanks.

--
tejun