Date: Fri, 12 Dec 2014 07:54:11 +0100
From: Ingo Molnar
To: Sasha Levin
Cc: Linus Torvalds, Dave Jones, Chris Mason, Mike Galbraith,
    Peter Zijlstra, Dâniel Fraga, "Paul E. McKenney",
    Linux Kernel Mailing List
Subject: Re: frequent lockups in 3.18rc4
Message-ID: <20141212065411.GA27967@gmail.com>
In-Reply-To: <548A2165.9030107@oracle.com>

* Sasha Levin wrote:

> Right, and it reproduces in 3.10 as well, so it's not really a
> new thing.
>
> What's odd is that I don't remember seeing this bug so long in
> the past, I'll try bisecting trinity rather than the kernel -
> it's the only other thing that changed.

I think DaveJ mentioned that Trinity recently changed its test
task count and is now loading the system more aggressively.

Such a change might have made a dormant resource-limit-related
bug or load-dependent race more likely to trigger.

I think at this point it would also be useful to debug the hang
itself directly: using triggered printks and kgdb, and drilling
into all the data structures to figure out why the system isn't
progressing.

If the bug triggers in a VM (which your testing uses), the failed
kernel state ought to be a lot more accessible than on bare metal.

That it triggers in a VM also makes the hardware-bug theory a lot
less likely, if it's the same bug as DaveJ's.

Thanks,

	Ingo