Date: Wed, 16 Nov 2016 18:01:56 +0000
From: Brian Starkey
To: Eric Dumazet
Cc: LKML, Peter Zijlstra, Ingo Molnar, Andrew Morton, Alexander Potapenko, Steven Rostedt, Sebastian Andrzej Siewior, Thomas Gleixner
Subject: Re: Regression: Failed boots bisected to 4cd13c21b207 "softirq: Let ksoftirqd do its job"
Message-ID: <20161116180156.GA21156@e106950-lin.cambridge.arm.com>
References: <20161116135527.GA5833@e106950-lin.cambridge.arm.com>

Hi Eric,

On Wed, Nov 16, 2016 at 07:52:42AM -0800, Eric Dumazet wrote:
>On Wed, Nov 16, 2016 at 5:55 AM, Brian Starkey wrote:
>> Hi,
>>
>> I'm running an ARM FVP (virtual platform - simulated hardware), which
>> is failing to reach a login prompt due to extremely slow progress
>> during boot. systemd gives up waiting for the ttyAMA0 device to
>> appear, and never starts the getty.
>>
>> I've bisected this to commit 4cd13c21b207 "softirq: Let ksoftirqd do
>> its job".
>>
>> Without this commit, the system boots to a login prompt in 2 minutes.
>> With this commit, the system eventually manages to bring up sshd after
>> 22 minutes, but as mentioned, the dev-ttyAMA0.device unit has timed
>> out and so I don't get a prompt on my console.
>>
>> I only hit the issue when my rootfs is mounted over NFS, and with only
>> a single core enabled. The (simulated) network device is an SMC91C111.
>> With multiple cores enabled or a non-NFS filesystem, everything seems
>> to work OK.
>>
>> I don't have an identical real hardware platform to try, but I
>> could not reproduce it on a real ARM Juno board, which is similar.
>>
>> It looks from the logs that udev's workers are unable to make
>> progress, so the device nodes don't get created. Don't pay too much
>> attention to the timestamps in the logs below, they are "inside" the
>> virtual platform, and don't reflect wall-clock time.
>> Log before 4cd13c21b207:
>> https://drive.google.com/open?id=0B8siaK6ZjvEwMktoa0NUS2hJd1U
>> Log after 4cd13c21b207:
>> https://drive.google.com/open?id=0B8siaK6ZjvEwZXlfeFFSQl9xZTQ
>> Kernel config: arch/arm64/configs/defconfig
>>
>> I'm not sure how to debug this further, so if you have any suggestions
>> I'd be glad to hear them.
>>
>> Many thanks,
>> Brian
>>
>
>Hi Brian.
>
>Thanks a lot for this report.
>
>If issue triggers when/if using one core, it is possible one driver
>has a dependency on softirqs being serviced during an initialization
>loop.
>
>If the thread is not yielding cpu (holding something like a spinlock
>thus disabling preemption), then ksoftirqd might not be able to run
>on the (same) cpu.
>

The smc91x driver does seem to have some trickiness around softirqs.
I'm not familiar with net drivers, but I'll see if I can figure
anything out there.

>I sent a patch for busy polling yesterday, but I am almost certain
>this would not fix your issue
>(assuming you have CONFIG_PREEMPT)
>
>https://patchwork.ozlabs.org/patch/695185/

You're right in saying that this didn't help.

Thanks,
Brian
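
For anyone following the thread, the hypothesis Eric describes above (a loop on a single CPU waiting on work that only a softirq can complete, while ksoftirqd never gets to run) can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code and not a diagnosis of smc91x: the function `boot`, its parameters, and the `ready`/`pending` flags are all hypothetical names invented for the illustration, and the scheduling is reduced to two booleans.

```python
def boot(defer_to_ksoftirqd, yields_cpu, max_polls=1000):
    """Toy single-CPU model of an init loop that waits on softirq work.

    Returns the number of polls the loop needed, or None if it gave up.
    All names here are hypothetical; this does not model real kernel code.
    """
    pending = True   # a softirq has been raised (e.g. by a device interrupt)
    ready = False    # only the softirq handler ever sets this flag

    for polls in range(1, max_polls + 1):
        # Case 1: the softirq is run inline when the interrupt returns.
        if pending and not defer_to_ksoftirqd:
            ready, pending = True, False
        # Case 2: the softirq is left for ksoftirqd, which is an ordinary
        # thread and only runs if the polling thread gives up the CPU.
        if pending and defer_to_ksoftirqd and yields_cpu:
            ready, pending = True, False
        if ready:
            return polls
    return None      # the loop hit its limit without the work being done


if __name__ == "__main__":
    print("inline softirq, no yield:   ", boot(False, False))
    print("deferred, loop yields:      ", boot(True, True))
    print("deferred, loop never yields:", boot(True, False))
```

In this model only the last combination fails to make progress, which matches the shape of the report (single core, work deferred to ksoftirqd), but whether the real slowdown comes from a loop like this is still an open question in the thread.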