Message-ID: <1370540122.1724.3.camel@buesod1.americas.hpqcorp.net>
Subject: Re: [IPC] INFO: suspicious RCU usage.
From: Davidlohr Bueso
To: Fengguang Wu
Cc: Stephen Rothwell, linux-kernel@vger.kernel.org, Andrew Morton
Date: Thu, 06 Jun 2013 10:35:22 -0700
In-Reply-To: <20130606132522.GA18959@localhost>
References: <20130606132522.GA18959@localhost>

Ccing Andrew.

On Thu, 2013-06-06 at 21:25 +0800, Fengguang Wu wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit 1f6587114a689a5d7fdfb0d4abc818117e3182a5
> Author: Davidlohr Bueso
> Date:   Thu Jun 6 10:41:56 2013 +1000
>
>     ipc: move rcu lock out of ipc_addid
>
>     This patchset continues the work that began in the sysv ipc semaphore
>     scaling series: https://lkml.org/lkml/2013/3/20/546
>
>     Just like semaphores used to be, sysv shared memory and msg queues also
>     abuse the ipc lock, unnecessarily holding it for operations such as
>     permission and security checks. This patchset mostly deals with mqueues,
>     and while shared mem can be done in a very similar way, I want to get
>     these patches out in the open first. It also does some pending cleanups,
>     mostly focused on the two level locking we have in ipc code, taking care
>     of ipc_addid() and ipcctl_pre_down_nolock() - yes there are still
>     functions that need to be updated as well.
>
>     This patch:
>
>     Make all callers explicitly take and release the RCU read lock.
>
>     This addresses the two level locking seen in newary(), newseg() and
>     newqueue(). For the last two, explicitly unlock the ipc object and the
>     rcu lock, instead of calling the custom shm_unlock and msg_unlock
>     functions. The next patch will deal with the open coded locking for
>     ->perm.lock
>
>     Signed-off-by: Davidlohr Bueso
>     Cc: Andi Kleen
>     Cc: Rik van Riel
>     Signed-off-by: Andrew Morton
>
> [ 51.524946]
> [ 51.525983] ===============================
> [ 51.532875] [ INFO: suspicious RCU usage. ]
> [ 51.535385] 3.10.0-rc4-next-20130606 #6 Not tainted
> [ 51.538304] -------------------------------
> [ 51.540937] /c/kernel-tests/src/stable/include/linux/rcupdate.h:471 Illegal context switch in RCU read-side critical section!
> [ 51.548110]
> [ 51.548110] other info that might help us debug this:
> [ 51.548110]
> [ 51.553055]
> [ 51.553055] rcu_scheduler_active = 1, debug_locks = 1
> [ 51.557199] 2 locks held by trinity/1107:
> [ 51.560168] #0: (&ids->rw_mutex){+.+.+.}, at: [] ipcget+0x38/0x2b3
> [ 51.566465] #1: (rcu_read_lock){.+.+..}, at: [] newseg+0x19d/0x3fd
> [ 51.572413]
> [ 51.572413] stack backtrace:
> [ 51.574761] CPU: 0 PID: 1107 Comm: trinity Not tainted 3.10.0-rc4-next-20130606 #6
> [ 51.579331] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [ 51.583068] 0000000000000001 ffff880004a07d88 ffffffff817b1f5c ffff880004a07db8
> [ 51.592119] ffffffff810f2f1d ffffffff81b78569 00000000000001a8 0000000000000000
> [ 51.596726] 0000000000000000 ffff880004a07de8 ffffffff810ded5e ffff880004a07fd8
> [ 51.605189] Call Trace:
> [ 51.606409] [] dump_stack+0x19/0x1b
> [ 51.609632] [] lockdep_rcu_suspicious+0xeb/0xf4
> [ 51.612905] [] __might_sleep+0x59/0x1dc
> [ 51.618614] [] idr_preload+0x9b/0x142
> [ 51.621939] [] ipc_addid+0x3d/0x193
> [ 51.624373] [] newseg+0x221/0x3fd
> [ 51.626596] [] ? newseg+0x19d/0x3fd
> [ 51.630177] [] ipcget+0x1be/0x2b3
> [ 51.633174] [] ? retint_swapgs+0x13/0x1b
> [ 51.636356] [] SyS_shmget+0x59/0x5d
> [ 51.639576] [] ? shm_try_destroy_orphaned+0xbf/0xbf
> [ 51.643673] [] ? shm_get_unmapped_area+0x20/0x20
> [ 51.647321] [] ? shm_security+0xb/0xb
> [ 51.650831] [] system_call_fastpath+0x16/0x1b

I suspect this is caused because now we call idr_preload() in ipc_addid
with the rcu lock held by the caller. So, we can either have a two level
rcu locking or a two level idr_preload/idr_preload_end.

Thanks,
Davidlohr
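
To make the suspected ordering problem concrete, here is a minimal userspace sketch. It is not kernel code: every model_* name is a hypothetical stand-in, and the only things taken from the report are the call order in the trace (newseg -> ipc_addid -> idr_preload -> __might_sleep) and the fact that the RCU read lock was already held at that point. The second caller shows one possible reading of the "two level idr_preload/idr_preload_end" option, where the caller does the sleepable preload before taking the RCU read lock; that reading is an assumption, not the final fix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these are the kernel's functions. */
static int rcu_depth;   /* nesting depth of the modelled RCU read lock */
static int warnings;    /* number of "suspicious usage" reports so far */

static void model_rcu_read_lock(void)
{
    rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
    assert(rcu_depth > 0);  /* unbalanced unlock would be a model bug */
    rcu_depth--;
}

/* Stand-in for the debug check reached via __might_sleep() in the trace. */
static void model_might_sleep(const char *who)
{
    if (rcu_depth > 0) {
        warnings++;
        printf("%s: may sleep inside RCU read-side critical section\n", who);
    }
}

/* Stand-in for a preload step that may sleep while allocating. */
static void model_idr_preload(void)
{
    model_might_sleep("idr_preload");
}

static void model_idr_preload_end(void)
{
}

/* Callee preloads internally, matching ipc_addid -> idr_preload in the trace. */
static void model_ipc_addid(void)
{
    model_idr_preload();
    model_idr_preload_end();
}

/* Caller takes the RCU read lock first: the ordering the report flags. */
static int caller_lock_then_addid(void)
{
    int before = warnings;

    model_rcu_read_lock();
    model_ipc_addid();
    model_rcu_read_unlock();
    return warnings - before;
}

/* Assumed alternative: the caller preloads before taking the RCU read lock. */
static int caller_preload_then_lock(void)
{
    int before = warnings;

    model_idr_preload();
    model_rcu_read_lock();
    /* a non-sleeping insertion would go here */
    model_rcu_read_unlock();
    model_idr_preload_end();
    return warnings - before;
}
```

In this model, caller_lock_then_addid() records one warning and caller_preload_then_lock() records none, which is the whole point of moving the sleepable step outside the read-side critical section.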