Date: Mon, 14 Oct 2013 10:31:14 -0700
Subject: Re: epoll oops.
From: Linus Torvalds
To: Dave Jones, Linux Kernel, Al Viro, Davide Libenzi, Eric Wong, Oleg Nesterov
In-Reply-To: <20131014154627.GA9525@redhat.com>
References: <20131014154627.GA9525@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 14, 2013 at 8:46 AM, Dave Jones wrote:
> Machine is wedged and I can't get to it until tomorrow, but this is what was on serial console.
> kernel running was from some time last Friday, I can get exact info tomorrow, though
> I don't think there's anything epoll related recently that could explain this.

It looks like it is the access to "lock->key" that takes a page fault.
The pointer looks good (%r13=ffff8801654cec98), so I'm pretty sure this
is due to DEBUG_PAGEALLOC and a free'd page.

So it looks like ep_unregister_pollwait() calls remove_wait_queue() on
a wait-queue head that has already been free'd.

I have this dim memory of us having fought this before. But maybe I'm
just remembering some of the old signalfd-vs-epoll races.

Oleg, does this trigger any memory for you? Commit 971316f0503a
("epoll: ep_unregister_pollwait() can use the freed pwq->whead") just
makes me go "Hmm, this is *exactly* what that commit is talking
about.."

Linus

---
> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU: 3 PID: 449 Comm: trinity-main Not tainted 3.12.0-rc4+ #98
> task: ffff88023e239560 ti: ffff880083082000 task.ti: ffff880083082000
> RIP: 0010:[] [] __lock_acquire+0x58/0x1be0
> Call Trace:
> [] lock_acquire+0x93/0x200
> [] _raw_spin_lock_irqsave+0x4b/0x90
> [] remove_wait_queue+0x19/0x40
> [] ep_unregister_pollwait.isra.14+0x5b/0x1e0
> [] ep_remove+0x26/0x140
> [] eventpoll_release_file+0x71/0xa0
> [] __fput+0x2aa/0x2d0
> [] ____fput+0xe/0x10
> [] task_work_run+0xac/0xe0
> [] do_exit+0x2c7/0xcc0
> [] do_group_exit+0x4c/0xc0
> [] SyS_exit_group+0x14/0x20
> [] tracesys+0xdd/0xe2
> Code: 85 c0 8b 05 4b d6 bc 00 45 0f 45 e0 85 c0 0f 84 07 01 00 00 8b 05 31 af 00 01 49 89 fd 41 89 f7 41 89 d3 85 c0 0f 84 08 01 00 00 <49> 8b 45 00 ba 01 00 00 00 48 3d 60 6a 13 82 44 0f 44 e2 41 83
> RIP [] __lock_acquire+0x58/0x1be0
> RSP
> CR2: ffff8801654cec98
> ---[ end trace 044e98c2d3aab216 ]---
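
For readers following the diagnosis, here is a minimal user-space sketch of the ordering problem described in the message: an entry keeps a borrowed pointer to a wait-queue head, the head's owner goes away first, and the later unregister step touches it. Everything in it is hypothetical and for illustration only: the types, the function names, and the `mapped` flag (a stand-in for DEBUG_PAGEALLOC unmapping a freed page) are not the kernel's. The `notify` path, where the owner detaches the entry before releasing the head, is an assumed way of avoiding the problem and is not taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's real types. */
struct wait_head {
    bool mapped;      /* false once the owner has "freed" it (models DEBUG_PAGEALLOC) */
    int  nr_waiters;  /* entries currently queued on this head */
};

struct poll_entry {
    struct wait_head *whead;  /* borrowed pointer, in the spirit of pwq->whead */
};

/* Queue an entry on a wait head that some other object owns. */
static void register_pollwait(struct poll_entry *pe, struct wait_head *wh)
{
    pe->whead = wh;
    wh->nr_waiters++;
}

/* Owner teardown. If notify is true, the owner first detaches the entry
 * (an assumed fix pattern); otherwise the entry keeps a stale pointer. */
static void owner_release(struct wait_head *wh, struct poll_entry *pe, bool notify)
{
    if (notify && pe->whead == wh) {
        wh->nr_waiters--;
        pe->whead = NULL;
    }
    wh->mapped = false;  /* stands in for freeing and unmapping the page */
}

/* Returns 0 on success, -1 where the real code would fault on the freed head. */
static int unregister_pollwait(struct poll_entry *pe)
{
    struct wait_head *wh = pe->whead;

    if (wh == NULL)
        return 0;   /* already detached, nothing to touch */
    if (!wh->mapped)
        return -1;  /* models the page fault while taking the head's lock */
    wh->nr_waiters--;
    pe->whead = NULL;
    return 0;
}

/* Replay the ordering from the trace: the owner goes away first, and the
 * entry-side cleanup runs afterwards. */
static int run_scenario(bool notify)
{
    struct wait_head wh = { .mapped = true, .nr_waiters = 0 };
    struct poll_entry pe = { .whead = NULL };

    register_pollwait(&pe, &wh);
    owner_release(&wh, &pe, notify);
    return unregister_pollwait(&pe);
}
```

Calling `run_scenario(false)` reports the simulated fault, while `run_scenario(true)` completes cleanly because the stale pointer was cleared before the head went away. The sketch is single-threaded and ignores locking entirely, so it shows only the lifetime ordering, not the race itself.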