Reply-To: xlpang@redhat.com
Subject: Re: [PATCH v3 5/6] sched/deadline/rtmutex: Fix unprotected PI access in enqueue_task_dl()
References: <1460633827-345-1-git-send-email-xlpang@redhat.com>
 <1460633827-345-6-git-send-email-xlpang@redhat.com>
 <20160414153111.GC2975@worktop.cust.blueprintrf.com>
 <57104ADB.20402@redhat.com>
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Juri Lelli, Ingo Molnar, Steven Rostedt
From: Xunlei Pang
Message-ID: <57104FA0.4090509@redhat.com>
Date: Fri, 15 Apr 2016 10:19:12 +0800
In-Reply-To: <57104ADB.20402@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2016/04/15 at 09:58, Xunlei Pang wrote:
> On 2016/04/14 at 23:31, Peter Zijlstra wrote:
>> On Thu, Apr 14, 2016 at 07:37:06PM +0800, Xunlei Pang wrote:
>>> We access @pi_task's data without any lock in enqueue_task_dl(), though
>>> checked "dl_prio(pi_task->normal_prio)" condition, that's not enough.
>> The proper fix is to ensure that pi_task is guaranteed to be blocked.
> Even if pi_task was blocked, its parameters are still allowed to be changed,
> so we have to do that. Did I miss something?
>
> Regards,
> Xunlei

Fortunately, I just reproduced it in an overnight test, so it really does
happen in practice, as I thought.

[50697.042391] kernel BUG at kernel/sched/deadline.c:398!
[50697.048212] invalid opcode: 0000 [#1] SMP
[50697.137676] CPU: 1 PID: 10676 Comm: bugon Tainted: G W 4.6.0-rc3+ #19
[50697.146250] Hardware name: Intel Corporation Broadwell Client platform/SawTooth Peak, BIOS BDW-E1R1.86C.0127.R00.150 8062034 08/06/2015
[50697.159942] task: ffff880089d72b80 ti: ffff880074bb4000 task.ti: ffff880074bb4000
[50697.168420] RIP: 0010:[] [] replenish_dl_entity+0xff/0x110
[50697.178292] RSP: 0000:ffff88016ec43d90 EFLAGS: 00010046
[50697.184307] RAX: 0000000000000001 RBX: ffff880089d72d50 RCX: 0000000000000001
[50697.192390] RDX: 0000000000000010 RSI: ffff8800719858d0 RDI: ffff880089d72d50
[50697.200473] RBP: ffff88016ec43da8 R08: 0000000000000001 R09: 0000000000000097
[50697.208556] R10: 0000000057102e72 R11: 000000000f9e6fd7 R12: ffff88016ec56e40
[50697.216638] R13: ffff88016ec56e40 R14: 0000000000016e40 R15: ffff880089d72d50
[50697.224721] FS: 00007f14e788b700(0000) GS:ffff88016ec40000(0000) knlGS:0000000000000000
[50697.233887] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[50697.240396] CR2: 000055be08240c68 CR3: 000000008a5d5000 CR4: 00000000003406e0
[50697.248478] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[50697.256561] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[50697.264643] Stack:
[50697.266917] ffff88016ec56e40 ffff880089d72b80 0000000000000010 ffff88016ec43de8
[50697.275338] ffffffff810cbfe4 ffff88016ec43de8 ffff880089d72b80 ffff88016ec56e40
[50697.283757] 00000000000188c5 ffff880089d72d50 ffff88016ec4f228 ffff88016ec43e18
[50697.292175] Call Trace:
[50697.294943]
[50697.297122] [] enqueue_task_dl+0x264/0x340
[50697.303838] [] update_curr_dl+0x1c3/0x1f0
[50697.310249] [] task_tick_dl+0x1c/0x80
[50697.316265] [] scheduler_tick+0x5c/0xe0
[50697.322480] [] ? tick_sched_do_timer+0x50/0x50
[50697.329383] [] update_process_times+0x51/0x60
[50697.336188] [] tick_sched_handle.isra.17+0x25/0x60
[50697.343486] [] tick_sched_timer+0x3d/0x70
[50697.349895] [] __hrtimer_run_queues+0xf3/0x270
[50697.356797] [] hrtimer_interrupt+0xa8/0x1a0
[50697.363404] [] local_apic_timer_interrupt+0x35/0x60
[50697.370799] [] smp_apic_timer_interrupt+0x3d/0x50
[50697.377996] [] apic_timer_interrupt+0x8c/0xa0
[50697.384798]
[50697.384798]
[50697.386974] Code: a9 48 c7 c7 38 5f a0 81 31 c0 48 89 75 e8 c6 05 5c 48 c8 00 01 e8 74 20 0c 00 49 8b 84 24 28 09 00 00 8b 4b 54 48 8b 75 e8 eb c4 <0f> 0b 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
[50697.409201] RIP [] replenish_dl_entity+0xff/0x110
[50697.416409] RSP
[50697.433683] ---[ end trace da6e1e42babefb7f ]---
[50697.438913] Kernel panic - not syncing: Fatal exception in interrupt
[50698.484088] Shutting down cpus with NMI
[50698.488434] Kernel Offset: disabled
[50698.492383] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
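
For readers following the trace: the BUG fires in replenish_dl_entity(),
reached from enqueue_task_dl(), which is the path where @pi_task's data is
read without a lock. The user-space sketch below is only a single-threaded
model of that check-then-use window, not kernel code. Every name in it
(model_dl_params, model_concurrent_reset, enqueue_unprotected,
enqueue_snapshot) is hypothetical, and it assumes the check that fires is a
sanity test that the donor's runtime is positive. It contrasts reading the
donor's live parameters after the priority check with working from one
snapshot; it does not claim to be what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the PI donor's deadline parameters. */
struct model_dl_params {
    int     prio;        /* < 0 models a deadline-class priority */
    int64_t dl_runtime;
    int64_t dl_period;
};

static bool model_dl_prio(int prio)
{
    return prio < 0;
}

/* Models another context changing the donor's parameters (assumption:
 * the donor leaves the deadline class and its parameters are cleared). */
static void model_concurrent_reset(struct model_dl_params *p)
{
    assert(p != NULL);
    p->prio = 120;
    p->dl_runtime = 0;
    p->dl_period = 0;
}

/* Returns the runtime to replenish with, or -1 where the modelled
 * sanity check (runtime must be positive) would fire. */
static int64_t model_replenish(const struct model_dl_params *pi)
{
    assert(pi != NULL);
    if (pi->dl_runtime <= 0)
        return -1;
    return pi->dl_runtime;
}

/* Check the priority, then read the live parameters afterwards.
 * 'race' injects the parameter change inside that window. */
static int64_t enqueue_unprotected(struct model_dl_params *donor, bool race)
{
    assert(donor != NULL);
    if (!model_dl_prio(donor->prio))
        return 0;               /* model: fall back to the task's own params */
    if (race)
        model_concurrent_reset(donor);
    return model_replenish(donor);
}

/* Take one copy, then check and use only the copy. In real concurrent
 * code the copy itself would need suitable locking; this model is
 * single-threaded and only shows the ordering. */
static int64_t enqueue_snapshot(struct model_dl_params *donor, bool race)
{
    assert(donor != NULL);
    struct model_dl_params snap = *donor;

    if (!model_dl_prio(snap.prio))
        return 0;
    if (race)
        model_concurrent_reset(donor);
    return model_replenish(&snap);
}
```

With a donor initialised as { -1, 10, 100 } and the reset injected,
enqueue_unprotected() returns -1 (the modelled check fires) while
enqueue_snapshot() returns 10, because the priority check and the runtime
it uses come from the same consistent copy.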