From: Robert Richter
To: Ingo Molnar
CC: Peter Zijlstra, LKML, oprofile-list, Robert Richter, Marcin Slusarz, Carl Love
Subject: [PATCH 2/3] oprofile: Fix locking dependency in sync_start()
Date: Tue, 31 May 2011 18:46:45 +0200
Message-ID: <1306860406-6094-3-git-send-email-robert.richter@amd.com>
X-Mailer: git-send-email 1.7.5.rc3
In-Reply-To: <1306860406-6094-1-git-send-email-robert.richter@amd.com>
References: <1306860406-6094-1-git-send-email-robert.richter@amd.com>
MIME-Version: 1.0
Content-Type: text/plain

This fixes an A->B/B->A locking dependency, see the lockdep warning below.

The function task_exit_notify() is called with (task_exit_notifier).rwsem
held and then calls sync_buffer(), which takes buffer_mutex. In
sync_start(), buffer_mutex was taken to keep the notifier functions from
running before sync_start() had finished. But registering the notifier
also takes (task_exit_notifier).rwsem, now in the opposite order to
sync_buffer(). In theory this creates a circular locking dependency; it
does not trigger in practice because task_exit_notify() is only called
after the notifier has been registered, by which time the rwsem has
already been released again.

However, after checking the notifier functions it turned out that the
buffer_mutex in sync_start() is unnecessary: sync_buffer() may be called
from the notifiers even before sync_start() has finished, but at that
point the buffers are already allocated and simply empty, so there is
nothing the mutex needs to protect.

So fix this theoretical locking dependency by removing buffer_mutex from
sync_start(). This is similar to the implementation before commit

 750d857 ("oprofile: fix crash when accessing freed task structs")

which introduced the locking dependency.
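For illustration only (not part of the patch): below is a minimal
userspace sketch of the two paths and their opposite lock order, using
POSIX threads in place of the kernel's rwsem and mutex. The names mirror
the kernel objects, but the program itself is hypothetical and is only
meant to show the A->B/B->A pattern that lockdep reports further down.

#include <pthread.h>
#include <stdio.h>

/* stand-ins for (task_exit_notifier).rwsem and buffer_mutex */
static pthread_rwlock_t task_exit_notifier_rwsem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t buffer_mutex = PTHREAD_MUTEX_INITIALIZER;

/* old sync_start() path: buffer_mutex, then the notifier rwsem */
static void *sync_start_path(void *arg)
{
	pthread_mutex_lock(&buffer_mutex);
	/* profile_event_register() -> blocking_notifier_chain_register() */
	pthread_rwlock_wrlock(&task_exit_notifier_rwsem);
	pthread_rwlock_unlock(&task_exit_notifier_rwsem);
	pthread_mutex_unlock(&buffer_mutex);
	return NULL;
}

/* task-exit path: the notifier rwsem, then buffer_mutex */
static void *task_exit_path(void *arg)
{
	/* __blocking_notifier_call_chain() holds the rwsem ... */
	pthread_rwlock_rdlock(&task_exit_notifier_rwsem);
	/* ... while task_exit_notify() -> sync_buffer() takes buffer_mutex */
	pthread_mutex_lock(&buffer_mutex);
	pthread_mutex_unlock(&buffer_mutex);
	pthread_rwlock_unlock(&task_exit_notifier_rwsem);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, sync_start_path, NULL);
	pthread_create(&b, NULL, task_exit_path, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	/*
	 * Usually completes -- as in the kernel case, the inverted
	 * acquisition order is the hazard, not a deadlock actually
	 * seen in practice.
	 */
	puts("done");
	return 0;
}

Removing buffer_mutex from the sync_start() side, as the patch below
does, eliminates one half of this cycle.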
Lockdep warning:

oprofiled/4447 is trying to acquire lock:
 (buffer_mutex){+.+...}, at: [] sync_buffer+0x31/0x3ec [oprofile]

but task is already holding lock:
 ((task_exit_notifier).rwsem){++++..}, at: [] __blocking_notifier_call_chain+0x39/0x67

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 ((task_exit_notifier).rwsem){++++..}:
       [] lock_acquire+0xf8/0x11e
       [] down_write+0x44/0x67
       [] blocking_notifier_chain_register+0x52/0x8b
       [] profile_event_register+0x2d/0x2f
       [] sync_start+0x47/0xc6 [oprofile]
       [] oprofile_setup+0x60/0xa5 [oprofile]
       [] event_buffer_open+0x59/0x8c [oprofile]
       [] __dentry_open+0x1eb/0x308
       [] nameidata_to_filp+0x60/0x67
       [] do_last+0x5be/0x6b2
       [] path_openat+0xc7/0x360
       [] do_filp_open+0x3d/0x8c
       [] do_sys_open+0x110/0x1a9
       [] sys_open+0x20/0x22
       [] system_call_fastpath+0x16/0x1b

-> #0 (buffer_mutex){+.+...}:
       [] __lock_acquire+0x1085/0x1711
       [] lock_acquire+0xf8/0x11e
       [] mutex_lock_nested+0x63/0x309
       [] sync_buffer+0x31/0x3ec [oprofile]
       [] task_exit_notify+0x16/0x1a [oprofile]
       [] notifier_call_chain+0x37/0x63
       [] __blocking_notifier_call_chain+0x50/0x67
       [] blocking_notifier_call_chain+0x14/0x16
       [] profile_task_exit+0x1a/0x1c
       [] do_exit+0x2a/0x6fc
       [] do_group_exit+0x83/0xae
       [] sys_exit_group+0x17/0x1b
       [] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by oprofiled/4447:
 #0:  ((task_exit_notifier).rwsem){++++..}, at: [] __blocking_notifier_call_chain+0x39/0x67

stack backtrace:
Pid: 4447, comm: oprofiled Not tainted 2.6.39-00007-gcf4d8d4 #10
Call Trace:
 [] print_circular_bug+0xae/0xbc
 [] __lock_acquire+0x1085/0x1711
 [] ? sync_buffer+0x31/0x3ec [oprofile]
 [] lock_acquire+0xf8/0x11e
 [] ? sync_buffer+0x31/0x3ec [oprofile]
 [] ? mark_lock+0x42f/0x552
 [] ? sync_buffer+0x31/0x3ec [oprofile]
 [] mutex_lock_nested+0x63/0x309
 [] ? sync_buffer+0x31/0x3ec [oprofile]
 [] sync_buffer+0x31/0x3ec [oprofile]
 [] ? __blocking_notifier_call_chain+0x39/0x67
 [] ? __blocking_notifier_call_chain+0x39/0x67
 [] task_exit_notify+0x16/0x1a [oprofile]
 [] notifier_call_chain+0x37/0x63
 [] __blocking_notifier_call_chain+0x50/0x67
 [] blocking_notifier_call_chain+0x14/0x16
 [] profile_task_exit+0x1a/0x1c
 [] do_exit+0x2a/0x6fc
 [] ? retint_swapgs+0xe/0x13
 [] do_group_exit+0x83/0xae
 [] sys_exit_group+0x17/0x1b
 [] system_call_fastpath+0x16/0x1b

Reported-by: Marcin Slusarz
Cc: Carl Love
Cc: # .36+
Signed-off-by: Robert Richter
---
 drivers/oprofile/buffer_sync.c |    8 ++------
 1 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/oprofile/buffer_sync.c b/drivers/oprofile/buffer_sync.c
index 04250aa..f34b5b2 100644
--- a/drivers/oprofile/buffer_sync.c
+++ b/drivers/oprofile/buffer_sync.c
@@ -155,8 +155,6 @@ int sync_start(void)
 	if (!zalloc_cpumask_var(&marked_cpus, GFP_KERNEL))
 		return -ENOMEM;
 
-	mutex_lock(&buffer_mutex);
-
 	err = task_handoff_register(&task_free_nb);
 	if (err)
 		goto out1;
@@ -173,7 +171,6 @@ int sync_start(void)
 	start_cpu_work();
 
 out:
-	mutex_unlock(&buffer_mutex);
 	return err;
 out4:
 	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
@@ -190,14 +187,13 @@ out1:
 
 void sync_stop(void)
 {
-	/* flush buffers */
-	mutex_lock(&buffer_mutex);
 	end_cpu_work();
 	unregister_module_notifier(&module_load_nb);
 	profile_event_unregister(PROFILE_MUNMAP, &munmap_nb);
 	profile_event_unregister(PROFILE_TASK_EXIT, &task_exit_nb);
 	task_handoff_unregister(&task_free_nb);
-	mutex_unlock(&buffer_mutex);
+	barrier();			/* do all of the above first */
+
 	flush_cpu_work();
 
 	free_all_tasks();
-- 
1.7.5.rc3