Date: Thu, 27 Aug 2015 21:47:44 +0800
From: "T. Zhou" <t.s.zhou@hotmail.com>
To: Wanpeng Li
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: fix tsk->pi_lock isn't held when do_set_cpus_allowed()

Hi,

On Tue, Aug 25, 2015 at 03:59:54PM +0800, Wanpeng Li wrote:
> [ 15.273708] ------------[ cut here ]------------
> [ 15.274097] WARNING: CPU: 0 PID: 13 at kernel/sched/core.c:1156 do_set_cpus_allowed+0x7e/0x80()
> [ 15.274857] Modules linked in:
> [ 15.275101] CPU: 0 PID: 13 Comm: migration/0 Not tainted 4.2.0-rc1-00049-g25834c7 #2
> [ 15.275674] 00000000 00000000 d21f1d24 c19228b2 00000000 d21f1d58 c1056a3b c1ba00e4
> [ 15.276084] 00000000 0000000d c1ba17d8 00000484 c10838be 00000484 c10838be d21e5000
> [ 15.276084] d2121900 d21e5158 d21f1d68 c1056b12 00000009 00000000 d21f1d7c c10838be
> [ 15.276084] Call Trace:
> [ 15.276084] [] dump_stack+0x4b/0x75
> [ 15.276084] [] warn_slowpath_common+0x8b/0xc0
> [ 15.276084] [] ? do_set_cpus_allowed+0x7e/0x80
> [ 15.276084] [] ? do_set_cpus_allowed+0x7e/0x80
> [ 15.276084] [] warn_slowpath_null+0x22/0x30
> [ 15.276084] [] do_set_cpus_allowed+0x7e/0x80
> [ 15.276084] [] cpuset_cpus_allowed_fallback+0x7c/0x170
> [ 15.276084] [] ? cpuset_cpus_allowed+0x180/0x180
> [ 15.276084] [] select_fallback_rq+0x221/0x280
> [ 15.276084] [] migration_call+0xe3/0x250
> [ 15.276084] [] notifier_call_chain+0x53/0x70
> [ 15.276084] [] __raw_notifier_call_chain+0x1e/0x30
> [ 15.276084] [] cpu_notify+0x28/0x50
> [ 15.276084] [] take_cpu_down+0x22/0x40
> [ 15.276084] [] multi_cpu_stop+0xd5/0x140
> [ 15.276084] [] ? __stop_cpus+0x80/0x80
> [ 15.276084] [] cpu_stopper_thread+0xbc/0x170
> [ 15.276084] [] ? preempt_count_sub+0x9/0x50
> [ 15.276084] [] ? _raw_spin_unlock_irq+0x37/0x50
> [ 15.276084] [] ? _raw_spin_unlock_irqrestore+0x55/0x70
> [ 15.276084] [] ? trace_hardirqs_on_caller+0x144/0x1e0
> [ 15.276084] [] ? cpu_stop_should_run+0x35/0x40
> [ 15.276084] [] ? preempt_count_sub+0x9/0x50
> [ 15.276084] [] ? _raw_spin_unlock_irqrestore+0x41/0x70
> [ 15.276084] [] smpboot_thread_fn+0x174/0x2f0
> [ 15.276084] [] ? sort_range+0x30/0x30
> [ 15.276084] [] kthread+0xc4/0xe0
> [ 15.276084] [] ret_from_kernel_thread+0x21/0x30
> [ 15.276084] [] ? kthread_create_on_node+0x180/0x180
> [ 15.276084] ---[ end trace 15f4c86d404693b0 ]---

I have not experimented with this myself, so the following is only a
guess. The path seems to be:

  take_cpu_down()
    cpu_notify(CPU_DYING)
      migration_call()

In migration_call() there is a CPU_DYING case; something like the
following could be added there:

  raw_spin_lock_irqsave(&p->pi_lock, flags);
  raw_spin_lock(&rq->lock);
  ...
  raw_spin_unlock(&rq->lock);
  raw_spin_unlock_irqrestore(&p->pi_lock, flags);
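To make the ordering concrete, here is a rough, untested sketch of what
I mean. The helper name sketch_move_off_dying_cpu() is made up for
illustration (in reality this would sit in migration_call()'s CPU_DYING
handling, where the task p being pushed off the dying CPU is known); it
only illustrates taking p->pi_lock before rq->lock and releasing in the
reverse order:

	/*
	 * Hypothetical helper, untested -- a sketch of the lock
	 * ordering only. "p" is a task still on the dying CPU's
	 * runqueue for which a fallback CPU must be picked.
	 */
	static void sketch_move_off_dying_cpu(struct rq *rq,
					      struct task_struct *p)
	{
		unsigned long flags;
		int dest_cpu;

		/* p->pi_lock first, then rq->lock. */
		raw_spin_lock_irqsave(&p->pi_lock, flags);
		raw_spin_lock(&rq->lock);

		/*
		 * select_fallback_rq() -> cpuset_cpus_allowed_fallback()
		 * -> do_set_cpus_allowed() now runs with p->pi_lock
		 * held, so the assertion that fired above is satisfied.
		 */
		dest_cpu = select_fallback_rq(cpu_of(rq), p);

		/* Unlock in the reverse order of acquisition. */
		raw_spin_unlock(&rq->lock);
		raw_spin_unlock_irqrestore(&p->pi_lock, flags);

		/* ... actually moving p to dest_cpu is elided ... */
		(void)dest_cpu;
	}

The real fix probably needs to be more careful than this (e.g. about
where rq->lock is already held on that path), but the ordering is the
point.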
This keeps the p->pi_lock and rq->lock ordering uninverted (from
Peter's review), and it does not take p->pi_lock twice (from Peter's
review).

Then in do_set_cpus_allowed(), add the following and delete some of
what is there now (from Peter's suggestion), like what is used in
set_task_cpu():

  WARN_ON_ONCE(debug_locks && !(lockdep_is_held(&p->pi_lock) ||
				(p->on_rq &&
				 lockdep_is_held(&task_rq(p)->lock))));

There may well be a better or more correct solution than this :)

thanks,
-- 
Tao