Subject: Re: [PATCH cgroup/for-4.4-fixes] cgroup: make css_set pin its css's to avoid use-afer-free
To: Tejun Heo, Li Zefan, Johannes Weiner
CC: Dave Jones
From: Daniel Wagner
Date: Tue, 24 Nov 2015 11:31:18 +0100
Message-ID: <56543C76.2050008@bmw-carit.de>
In-Reply-To: <20151123195541.GA19072@mtj.duckdns.org>
References: <20151123195541.GA19072@mtj.duckdns.org>

Hi Tejun,

On 11/23/2015 08:55 PM, Tejun Heo wrote:
> A css_set represents the relationship between a set of tasks and
> css's. css_set never pinned the associated css's. This was okay
> because tasks used to always disassociate immediately (in RCU sense) -
> either a task is moved to a different css_set or exits and never
> accesses css_set again.
>
> Unfortunately, afcf6c8b7544 ("cgroup: add cgroup_subsys->free() method
> and use it to fix pids controller") and patches leading up to it made
> a zombie hold onto its css_set and deref the associated css's on its
> release. Nothing pins the css's after exit and it might have already
> been freed leading to use-after-free.
>
> general protection fault: 0000 [#1] PREEMPT SMP
> task: ffffffff81bf2500 ti: ffffffff81be4000 task.ti: ffffffff81be4000
> RIP: 0010:[] [] pids_cancel.constprop.4+0x5/0x40
> ...
> Call Trace:
>
> [] ? pids_free+0x3d/0xa0
> [] cgroup_free+0x53/0xe0
> [] __put_task_struct+0x42/0x130
> [] delayed_put_task_struct+0x77/0x130
> [] rcu_process_callbacks+0x2f4/0x820
> [] ? rcu_process_callbacks+0x2b3/0x820
> [] __do_softirq+0xd4/0x460
> [] irq_exit+0x89/0xa0
> [] smp_apic_timer_interrupt+0x42/0x50
> [] apic_timer_interrupt+0x84/0x90
>
> ...
> Code: 5b 5d c3 48 89 df 48 c7 c2 c9 f9 ae 81 48 c7 c6 91 2c ae 81 e8 1d 94 0e 00 31 c0 5b 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 83 87 e0 00 00 00 ff 78 01 c3 80 3d 08 7a c1 00 00 74 02
> RIP [] pids_cancel.constprop.4+0x5/0x40
> RSP
> ---[ end trace 89a4a4b916b90c49 ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Kernel Offset: disabled
> ---[ end Kernel panic - not syncing: Fatal exception in interrupt
>
> Fix it by making css_set pin the associate css's until its release.

I still see this one with the patch applied:

[ 19.369455] ------------[ cut here ]------------
[ 19.369851] WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 pids_cancel.constprop.6+0x31/0x40()
[ 19.370596] Modules linked in:
[ 19.370916] CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #29
[ 19.371418] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 19.372542] ffffffff81f65382 ffff88007c043b90 ffffffff81551ffc 0000000000000000
[ 19.373173] ffff88007c043bc8 ffffffff810de202 ffff88007a752000 ffff88007a29ab00
[ 19.374144] ffff88007c043c80 ffff88007a1d8400 0000000000000001 ffff88007c043bd8
[ 19.375185] Call Trace:
[ 19.375506] [] dump_stack+0x4e/0x82
[ 19.376238] [] warn_slowpath_common+0x82/0xc0
[ 19.376975] [] warn_slowpath_null+0x1a/0x20
[ 19.377765] [] pids_cancel.constprop.6+0x31/0x40
[ 19.378623] [] pids_can_attach+0x6d/0xf0
[ 19.379451] [] cgroup_taskset_migrate+0x6c/0x330
[ 19.380142] [] cgroup_migrate+0xf5/0x190
[ 19.380592] [] ? cgroup_migrate+0x5/0x190
[ 19.381041] [] cgroup_attach_task+0x176/0x200
[ 19.381500] [] ? cgroup_attach_task+0x5/0x200
[ 19.381962] [] __cgroup_procs_write+0x2ad/0x460
[ 19.382482] [] ? __cgroup_procs_write+0x5e/0x460
[ 19.382949] [] cgroup_procs_write+0x14/0x20
[ 19.383432] [] cgroup_file_write+0x35/0x1c0
[ 19.383864] [] kernfs_fop_write+0x141/0x190
[ 19.384367] [] __vfs_write+0x28/0xe0
[ 19.384759] [] ? percpu_down_read+0x57/0xa0
[ 19.385274] [] ? __sb_start_write+0xb4/0xf0
[ 19.385712] [] ? __sb_start_write+0xb4/0xf0
[ 19.386160] [] vfs_write+0xac/0x1a0
[ 19.386563] [] ? __fget_light+0x66/0x90
[ 19.386960] [] SyS_write+0x49/0xb0
[ 19.387373] [] entry_SYSCALL_64_fastpath+0x12/0x76
[ 19.387861] ---[ end trace 46552476f436a20f ]---

cheers,
daniel
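
For reference, the pinning idea in the quoted patch description can be sketched in plain user-space C. This is only a toy model under loud assumptions: the toy_* names are hypothetical and are not the kernel's API, the set holds a single css instead of one per subsystem, and the refcount is a plain int rather than anything atomic or RCU-aware. It only shows why a late uncharge through the set is safe once the set holds its own reference.

```c
#include <stdlib.h>

/* Toy stand-in for a css: a refcounted object carrying a charge counter. */
struct toy_css {
    int refcnt;
    long counter;
};

/* Toy stand-in for a css_set: here it points at exactly one css. */
struct toy_css_set {
    struct toy_css *css;
};

/* Number of toy_css objects freed so far (demo bookkeeping only). */
static int toy_css_freed;

static struct toy_css *toy_css_alloc(void)
{
    struct toy_css *css = calloc(1, sizeof(*css));

    if (css)
        css->refcnt = 1;    /* initial reference, owned by the "cgroup" */
    return css;
}

static void toy_css_get(struct toy_css *css)
{
    css->refcnt++;
}

static void toy_css_put(struct toy_css *css)
{
    if (--css->refcnt == 0) {
        toy_css_freed++;
        free(css);
    }
}

/* Creating the set takes a reference: this is the "pin". */
static struct toy_css_set *toy_css_set_alloc(struct toy_css *css)
{
    struct toy_css_set *set = malloc(sizeof(*set));

    if (set) {
        set->css = css;
        toy_css_get(css);
    }
    return set;
}

/* Releasing the set drops the pin; the css may be freed only now. */
static void toy_css_set_put(struct toy_css_set *set)
{
    toy_css_put(set->css);
    free(set);
}

/* Returns 1 if the css outlived the cgroup's reference and was freed
 * exactly once, when the set was released; 0 otherwise. */
int toy_pin_demo(void)
{
    struct toy_css *css;
    struct toy_css_set *set;
    int alive_after_rmdir;

    toy_css_freed = 0;
    css = toy_css_alloc();
    if (!css)
        return 0;
    set = toy_css_set_alloc(css);
    if (!set) {
        toy_css_put(css);
        return 0;
    }

    css->counter = 1;       /* one task charged */
    toy_css_put(css);       /* the cgroup goes away and drops its reference */
    alive_after_rmdir = (toy_css_freed == 0);

    set->css->counter--;    /* zombie's late uncharge: css is still pinned */
    toy_css_set_put(set);   /* last reference dropped, css freed here */

    return alive_after_rmdir && toy_css_freed == 1;
}
```

Without the toy_css_get() in toy_css_set_alloc(), the toy_css_put() in the middle of toy_pin_demo() would free the css and the later counter decrement would touch freed memory, which is the shape of the use-after-free described in the quoted text. The sketch says nothing about the WARNING reported above.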