Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755981AbYJXOGP (ORCPT ); Fri, 24 Oct 2008 10:06:15 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751670AbYJXOF7 (ORCPT ); Fri, 24 Oct 2008 10:05:59 -0400 Received: from fg-out-1718.google.com ([72.14.220.155]:19618 "EHLO fg-out-1718.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751500AbYJXOF5 (ORCPT ); Fri, 24 Oct 2008 10:05:57 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=tgLY/fcWbGyGpN5D0YeEY/qhjf6UCZ/C0WBarrXVwPhMRsbe0RBXE/VsLN7RP8DqDF rf1vHlwmvcjNoC6HgrAID7Rd1ExAPZRIYye818AIMsf0aytDEQRqzFEIhVCspUL92qrU Rfb7xabUKV/Hs4Aokow7JXG5KJeT91IfIq4I4= Message-ID: <4f3ee3290810240705j3fd107d4v45c38c1e1c229079@mail.gmail.com> Date: Fri, 24 Oct 2008 16:05:55 +0200 From: cboulte@gmail.com To: Nadia.Derbey@bull.net Subject: Re: [PATCH 1/1] (v2) SYSVIPC - Fix the ipc structures initialization Cc: akpm@linux-foundation.org, manfred@colorfullife.com, linux-kernel@vger.kernel.org In-Reply-To: <20081024060040.700343046@bull.net> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20081024060022.897705758@bull.net> <20081024060040.700343046@bull.net> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9742 Lines: 215 On Fri, Oct 24, 2008 at 8:00 AM, wrote: > > This patch is a fix for Bugzilla bug > http://bugzilla.kernel.org/show_bug.cgi?id=11796. > > To summarize, a simple testcase is concurrently running an infinite loop on > msgctl(IPC_STAT) and a call to msgget(): > > while (1) > msgctl(id, IPC_STAT) 1 > | > | > | > 2 id = msgget(key, IPC_CREAT) > | > | > | > > In the interval [1-2], the id doesn't exist yet. > > In that test, the problem is the following: > When we are calling ipc_addid() from msgget() the msq structure is not > completely initialized. So idr_get_new() is inserting a pointer into the > idr tree, and the structure which is pointed to has, among other fields, > its lock uninitialized. > > Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the > pointer as soon as it is inserted into the IDR tree. And ipc_lock() > calls spin_lock(&mqs->lock), while we have not initialized that lock > yet. > > This patch moves the spin_lock_init() before the call to ipc_addid(). > It also sets the "deleted" flag to 1 in the window between ipc structure > allocation and its locking in ipc_addid(). > > Regards, > Nadia > > > Signed-off-by: Nadia Derbey > > --- > ipc/util.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > Index: linux-2.6.27/ipc/util.c > =================================================================== > --- linux-2.6.27.orig/ipc/util.c 2008-10-23 15:20:46.000000000 +0200 > +++ linux-2.6.27/ipc/util.c 2008-10-24 09:03:06.000000000 +0200 > @@ -266,6 +266,17 @@ int ipc_addid(struct ipc_ids* ids, struc > if (ids->in_use >= size) > return -ENOSPC; > > + spin_lock_init(&new->lock); > + > + /* > + * We have a window between the time new is inserted into the idr > + * tree and the time it is actually locked. > + * In order to be safe during that window set the new ipc structure > + * as deleted: a concurrent ipc_lock() will see it as not present > + * until the initialization phase is complete. > + */ > + new->deleted = 1; > + > err = idr_get_new(&ids->ipcs_idr, new, &id); > if (err) > return err; > @@ -280,7 +291,6 @@ int ipc_addid(struct ipc_ids* ids, struc > ids->seq = 0; > > new->id = ipc_buildid(id, new->seq); > - spin_lock_init(&new->lock); > new->deleted = 0; > rcu_read_lock(); > spin_lock(&new->lock); > > -- > I tried the test with your patch (on a 2.6.27), and got a soft lockup again (the test ran during 5 minutes). It was harder to reproduce (it took longer) but still there. Output: [ 1476.099160] INFO: trying to register non-static key. [ 1476.100507] the code is fine but needs lockdep annotation. [ 1476.100507] turning off the locking correctness validator. [ 1476.100507] Pid: 3082, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #6 [ 1476.100507] [ 1476.100507] Call Trace: [ 1476.100507] [] static_obj+0x51/0x68 [ 1476.100507] [] __lock_acquire+0xe0a/0x153a [ 1476.100507] [] restore_args+0x0/0x30 [ 1476.100507] [] sched_clock+0x5/0x7 [ 1476.100507] [] lock_acquire+0x95/0xc2 [ 1476.100507] [] ipc_lock+0x63/0x9b [ 1476.100507] [] _spin_lock+0x2d/0x59 [ 1476.100507] [] ipc_lock+0x63/0x9b [ 1476.100507] [] ipc_lock+0x63/0x9b [ 1476.100507] [] ipc_lock+0x0/0x9b [ 1476.100507] [] ipc_lock_check+0xc/0x57 [ 1476.100507] [] sys_msgctl+0x18e/0x502 [ 1476.100507] [] thread_return+0x3d/0xbe [ 1476.100507] [] tick_dev_program_event+0x2d/0x95 [ 1476.100507] [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1476.100507] [] trace_hardirqs_on_caller+0xff/0x12a [ 1476.100507] [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1476.100507] [] trace_hardirqs_on_caller+0xff/0x12a [ 1476.100507] [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1476.100507] [] sched_clock+0x5/0x7 [ 1476.100507] [] system_call_fastpath+0x16/0x1b [ 1476.100507] [ 1540.820503] BUG: soft lockup - CPU#1 stuck for 61s! [sysv_test2:3082] [ 1540.820506] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcsp snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 rng_core serio_raw i2c_core psmouse iTCO_wdt shpchp pci_hotplug button e752x_edac edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd ide_core tg3 scsi_mod libphy thermal processor fan thermal_sys [ 1540.820506] irq event stamp: 2117257 [ 1540.820506] hardirqs last enabled at (2117257): [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1540.820506] hardirqs last disabled at (2117256): [] trace_hardirqs_off_thunk+0x3a/0x3c [ 1540.820506] softirqs last enabled at (2111724): [] call_softirq+0x1c/0x28 [ 1540.820506] softirqs last disabled at (2111719): [] call_softirq+0x1c/0x28 [ 1540.820506] CPU 1: [ 1540.820506] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcsp snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801 rng_core serio_raw i2c_core psmouse iTCO_wdt shpchp pci_hotplug button e752x_edac edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd ide_core tg3 scsi_mod libphy thermal processor fan thermal_sys [ 1540.820506] Pid: 3082, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #6 [ 1540.820506] RIP: 0010:[] [] native_read_tsc+0x8/0x1d [ 1540.820506] RSP: 0018:ffff8801458c1db8 EFLAGS: 00000286 [ 1540.820506] RAX: 0000000050aadc78 RBX: ffff88014452fb70 RCX: 0000000050aadb8a [ 1540.820506] RDX: 0000000000000446 RSI: ffffffff805425bc RDI: 0000000000000001 [ 1540.820506] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000 [ 1540.820506] R10: ffffffff80250954 R11: ffffffff8033fa39 R12: ffff88002f312160 [ 1540.820506] R13: ffff8800aeae6000 R14: ffff8801458c0000 R15: ffffffff8082c810 [ 1540.820506] FS: 00007ffbced256e0(0000) GS:ffff880147054d18(0000) knlGS:0000000000000000 [ 1540.820506] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1540.820506] CR2: 00007ffbce890c70 CR3: 0000000144d05000 CR4: 00000000000006e0 [ 1540.820506] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1540.820506] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1540.820506] [ 1540.820506] Call Trace: [ 1540.820506] [] delay_tsc+0x18/0x51 [ 1540.820506] [] __ticket_spin_trylock+0x5/0x18 [ 1540.820506] [] _raw_spin_lock+0x96/0x102 [ 1540.820506] [] _spin_lock+0x4e/0x59 [ 1540.820506] [] ipc_lock+0x63/0x9b [ 1540.820506] [] ipc_lock+0x63/0x9b [ 1540.820506] [] ipc_lock+0x0/0x9b [ 1540.820506] [] ipc_lock_check+0xc/0x57 [ 1540.820506] [] sys_msgctl+0x18e/0x502 [ 1540.820506] [] thread_return+0x3d/0xbe [ 1540.820506] [] tick_dev_program_event+0x2d/0x95 [ 1540.820506] [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1540.820506] [] trace_hardirqs_on_caller+0xff/0x12a [ 1540.820506] [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1540.820506] [] trace_hardirqs_on_caller+0xff/0x12a [ 1540.820506] [] trace_hardirqs_on_thunk+0x3a/0x3f [ 1540.820506] [] sched_clock+0x5/0x7 [ 1540.820506] [] system_call_fastpath+0x16/0x1b [ 1540.820506] [ 1606.320502] BUG: soft lockup - CPU#1 stuck for 61s Inspired by your patch I wrote this one on top of yours: --- ipc/util.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) Index: bug-sysv/ipc/util.c =================================================================== --- bug-sysv.orig/ipc/util.c +++ bug-sysv/ipc/util.c @@ -267,6 +267,8 @@ int ipc_addid(struct ipc_ids* ids, struc return -ENOSPC; spin_lock_init(&new->lock); + rcu_read_lock(); + spin_lock(&new->lock); /* * We have a window between the time new is inserted into the idr @@ -292,8 +294,6 @@ int ipc_addid(struct ipc_ids* ids, struc new->id = ipc_buildid(id, new->seq); new->deleted = 0; - rcu_read_lock(); - spin_lock(&new->lock); return id; } And I don't see any soft lockup for the moment (the test runs during 30 minutes). But I'm far from being confidant about this code. I'm just surprise that new is inserted into the list without all its attributes initialized. Besides, assiging new->delete = 0 without locking the ipc seems strange. c. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/