References: <559E72B1.2050903@osg.samsung.com> <559E7AA4.6020602@osg.samsung.com> <20150709144540.GA27723@kroah.com> <559EB218.80404@osg.samsung.com> <559FF99A.30509@osg.samsung.com> <55A03A33.8060405@osg.samsung.com>
Date: Sat, 11 Jul 2015 06:36:24 +0800
Subject: Re: Linux 4.2-rc1
From: Ming Lei
To: Linus Torvalds
Cc: Shuah Khan, vz@mleia.com, Greg Kroah-Hartman, Linux Kernel Mailing List, Tejun Heo, Shuah Khan
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 11, 2015 at 5:47 AM, Linus Torvalds wrote:
> But my patch (which is committed now) solves it all for you?
>
> I'm going to just assume it's timing, and there is no major real reason why
> it started triggering just now...

Now I see it: the issue is triggered when a firmware request times out,
and Shuah's report should be caused by the following commit
0cb64249 (firmware_loader: abort request if wait_for_completion is interrupted).

But your patch is correct for this issue too.

Thanks,
Ming

>
>     Linus
>
> On Jul 10, 2015 2:33 PM, "Shuah Khan" wrote:
>>
>> On 07/10/2015 11:11 AM, Linus Torvalds wrote:
>> > On Fri, Jul 10, 2015 at 9:58 AM, Shuah Khan
>> > wrote:
>> >>
>> >> I am not sure why this patch would cause the problem I am seeing.
>> >> This patch itself looks like a cleanup type patch and doesn't
>> >> really fix a bug. I am building with this patch reverted at the
>> >> moment to confirm.
>> >
>> > Smells to me like it's just a timing issue, and that maybe the bisect
>> > failed because it's not 100% repeatable. Or maybe it *was* 100%
>> > repeatable, but simply because that commit changed the timing of the
>> > bootup scripts etc.
>> >
>> > But yes, trying it with the revert in place is a good idea just to
>> > make sure. And perhaps checking that kernel more than a few times to
>> > verify just how repeatable it is.
>> >
>>
>> Quick update. Reverting didn't help. I think I mentioned I am seeing
>> hangs during poweroff and reboot. I am seeing hangs during boot as well.
>> I think there is a timing problem that manifests into the following
>> 3 variations:
>>
>> 1. NULL pointer dereference alert, boots fine and runs fine - hangs
>>    during poweroff and reboot
>> 2. Hangs during boot. When booted in recovery, it runs into repeated
>>    errors which look very much like the same call trace I see in the
>>    alert.
>>
>> Please see attached images. These two are rolling failures repeated
>> during udev initialization. It is related to firmware loading it looks
>> like.
>>
>> thanks,
>> -- Shuah
>>
>> --
>> Shuah Khan
>> Sr. Linux Kernel Developer
>> Open Source Innovation Group
>> Samsung Research America (Silicon Valley)
>> shuahkh@osg.samsung.com | (970) 217-8978