Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933514AbcKVP1o (ORCPT ); Tue, 22 Nov 2016 10:27:44 -0500 Received: from foss.arm.com ([217.140.101.70]:58880 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933052AbcKVP1m (ORCPT ); Tue, 22 Nov 2016 10:27:42 -0500 Date: Tue, 22 Nov 2016 15:27:33 +0000 From: Brian Starkey To: Eric Dumazet Cc: Thomas Gleixner , LKML , Peter Zijlstra , Ingo Molnar , Andrew Morton , Alexander Potapenko , Steven Rostedt , Sebastian Andrzej Siewior Subject: Re: Regression: Failed boots bisected to 4cd13c21b207 "softirq: Let ksoftirqd do its job" Message-ID: <20161122152733.GH25080@e106950-lin.cambridge.arm.com> References: <20161116135527.GA5833@e106950-lin.cambridge.arm.com> <20161116180156.GA21156@e106950-lin.cambridge.arm.com> <20161116210139.GB21156@e106950-lin.cambridge.arm.com> <20161117164200.GA24653@e106950-lin.cambridge.arm.com> <20161122103351.GA25080@e106950-lin.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2968 Lines: 88 Hi Eric, On Tue, Nov 22, 2016 at 06:29:33AM -0800, Eric Dumazet wrote: >On Tue, Nov 22, 2016 at 2:33 AM, Brian Starkey wrote: >> >> Hi, >> >> On Fri, Nov 18, 2016 at 01:40:43AM +0100, Thomas Gleixner wrote: >>> >>> Brian, >>> >>> On Thu, 17 Nov 2016, Brian Starkey wrote: >>>> >>>> No joy with this patch :-( >>>> >>>> I had to add an ioaddr argument because apparently that macro depends >>>> on local context (yuck...), but it doesn't help my issue. >>>> >>>> FWIW I don't see any timeouts, either with or without the patch. >>>> (I don't know for sure, but I would guess that the model of the >>>> network card doesn't model whatever stall that loop is checking for. >>>> It probably just completes all MMU operations immediately) >>> >>> >>> Is there a chance that you enable trace points at the kernel command line? >>> >>> trace_event=sched_wakeup,sched_switch,irq_handler_entry,irq_handler_exit,softirq_raise,softirq_entry,softirq_exit >>> >>> should be enough for a start. All we need aside of that is a trigger to >>> stop the trace so we can actually see the events around the time where >>> things go stale. >>> >>> I assume that the whole issue is visible throughout the slow progress of >>> init towards a working system, so for a start it would be sufficient to add >>> something like this into the startup sequence at some point: >>> >>> mount -t debugfs debugfs /sys/kernel/debug >>> echo 0 >/sys/kernel/debug/tracing/tracing_on >>> >>> The only interesting challange is to get the trace data out of the >>> system. The trace is accessible via: >>> >>> cat /sys/kernel/tracing/trace >>> >>> So if your ssh works at some point, that might be an option or you just try >>> to store it over NFS (which will be slow, but better than nothing). Maybe >>> you have a better idea :) >> >> >> I finally managed to pry some traces out this morning. It seems like >> the system struggles to even invoke echo when it's doing badly. >> >> Trace before 4cd13c21b207: https://drive.google.com/open?id=0B8siaK6ZjvEwU21wNTdZS29kVXc >> Trace after 4cd13c21b207: https://drive.google.com/open?id=0B8siaK6ZjvEwbXVzcnpieVkzWFU >> (btw, if there's a preferred way to send the logs let me know. I >> wasn't sure large or non-text attachments would be well received) >> >> I'm not sure how much help the trace is, but it does look like the >> system is spending far too much time in the ethernet device's IRQ >> handler to be healthy. >> > >Thanks a lot Brian > >Can you confirm interrupt handler is smc911x_interrupt() ? > >(ie : is SMC_USE_PXA_DMA / SMC_USE_DMA defined or not ?) > Looks like there's a few similarly named devices and drivers. Mine is an SMSC LAN91C111 using the smc91x driver in drivers/net/ethernet/smsc/smc91x.c, rather than smc911x.c. So the interrupt handler is smc_interrupt() CONFIG_ARCH_PXA is not set, nor is SMC_USE_PXA_DMA or SMC_USE_DMA. -Brian > >> >> Thanks, >> Brian >>> >>> >>> Thanks, >>> >>> tglx >>>