Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933065AbcKWSVl (ORCPT ); Wed, 23 Nov 2016 13:21:41 -0500 Received: from foss.arm.com ([217.140.101.70]:59510 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752227AbcKWSVj (ORCPT ); Wed, 23 Nov 2016 13:21:39 -0500 Date: Wed, 23 Nov 2016 18:21:16 +0000 From: Brian Starkey To: Eric Dumazet Cc: Thomas Gleixner , LKML , Peter Zijlstra , Ingo Molnar , Andrew Morton , Alexander Potapenko , Steven Rostedt , Sebastian Andrzej Siewior Subject: Re: Regression: Failed boots bisected to 4cd13c21b207 "softirq: Let ksoftirqd do its job" Message-ID: <20161123182115.GA28733@e106950-lin.cambridge.arm.com> References: <20161116180156.GA21156@e106950-lin.cambridge.arm.com> <20161116210139.GB21156@e106950-lin.cambridge.arm.com> <20161117164200.GA24653@e106950-lin.cambridge.arm.com> <20161122103351.GA25080@e106950-lin.cambridge.arm.com> <20161122152733.GH25080@e106950-lin.cambridge.arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4008 Lines: 127 Hi Eric, On Tue, Nov 22, 2016 at 08:09:58AM -0800, Eric Dumazet wrote: >> >> Looks like there's a few similarly named devices and drivers. Mine is >> an SMSC LAN91C111 using the smc91x driver in >> drivers/net/ethernet/smsc/smc91x.c, rather than smc911x.c. So the >> interrupt handler is smc_interrupt() >> >> CONFIG_ARCH_PXA is not set, nor is SMC_USE_PXA_DMA or SMC_USE_DMA. > >Oh right. > >static irqreturn_t smc_interrupt(int irq, void *dev_id) >{ >... >mask = SMC_GET_INT_MASK(lp); >SMC_SET_INT_MASK(lp, 0); > >loop() > >... >} else if (status & IM_ALLOC_INT) { > DBG(3, dev, "Allocation irq\n"); > tasklet_hi_schedule(&lp->tx_task); > mask &= ~IM_ALLOC_INT; >} >... > >SMC_SET_INT_MASK(lp, mask); >spin_unlock(&lp->lock); > /* > * We return IRQ_HANDLED unconditionally here even if there was > * nothing to do. There is a possibility that a packet might > * get enqueued into the chip right after TX_EMPTY_INT is raised > * but just before the CPU acknowledges the IRQ. > * Better take an unneeded IRQ in some occasions than complexifying > * the code for all cases. > */ > return IRQ_HANDLED; >} > >Could you trace mask value, it looks we loop and never acknowledge >some interrupt status. > >Maybe driver depends on tasklet_hi_schedule() triggers >smc_hardware_send_pkt() really really soon. > >Oh, but smc_hardware_send_pkt() uses a spin_trylock() at its >beginning, then later a spin_lock() hidden in SMC_ENABLE_INT(), >this looks racy. This patch didn't help. I did get some new traces though - I've attached the diff for the trace_printks I added. Before 4cd13c21b207: https://drive.google.com/open?id=0B8siaK6ZjvEwcEtOeFQzTmY0Nnc After 4cd13c21b207: https://drive.google.com/open?id=0B8siaK6ZjvEwZnQ4MVg1d3d1Tm8 It looks like the difference is that after 4cd13c21b207 the RX softirq isn't running, and RX interrupts don't call softirq_raise anymore - presumably because there's one pending, but I didn't have time to track that down to a code-path. Cheers, -Brian --->8--- diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c index 503a3b6..6598319 100644 --- a/drivers/net/ethernet/smsc/smc91x.c +++ b/drivers/net/ethernet/smsc/smc91x.c @@ -418,7 +418,7 @@ static inline void smc_rcv(struct net_device *dev) packet_number = SMC_GET_RXFIFO(lp); if (unlikely(packet_number & RXFIFO_REMPTY)) { - PRINTK(dev, "smc_rcv with nothing on FIFO.\n"); + trace_printk("smc_rcv with nothing on FIFO.\n"); return; } @@ -475,6 +475,7 @@ static inline void smc_rcv(struct net_device *dev) SMC_WAIT_MMU_BUSY(lp); SMC_SET_MMU_CMD(lp, MC_RELEASE); dev->stats.rx_dropped++; + trace_printk("rx_dropped.\n"); return; } @@ -1216,7 +1217,7 @@ static irqreturn_t smc_interrupt(int irq, void *dev_id) int status, mask, timeout, card_stats; int saved_pointer; - DBG(3, dev, "%s\n", __func__); + trace_printk("%s\n", __func__); spin_lock(&lp->lock); @@ -1235,7 +1236,7 @@ static irqreturn_t smc_interrupt(int irq, void *dev_id) do { status = SMC_GET_INT(lp); - DBG(2, dev, "INT 0x%02x MASK 0x%02x MEM 0x%04x FIFO 0x%04x\n", + trace_printk("INT 0x%02x MASK 0x%02x MEM 0x%04x FIFO 0x%04x\n", status, mask, ({ int meminfo; SMC_SELECT_BANK(lp, 0); meminfo = SMC_GET_MIR(lp); @@ -1254,7 +1255,7 @@ static irqreturn_t smc_interrupt(int irq, void *dev_id) if (THROTTLE_TX_PKTS) netif_wake_queue(dev); } else if (status & IM_RCV_INT) { - DBG(3, dev, "RX irq\n"); + trace_printk("RX irq\n"); smc_rcv(dev); } else if (status & IM_ALLOC_INT) { DBG(3, dev, "Allocation irq\n"); @@ -1304,7 +1305,7 @@ static irqreturn_t smc_interrupt(int irq, void *dev_id) PRINTK(dev, "spurious interrupt (mask = 0x%02x)\n", mask); #endif - DBG(3, dev, "Interrupt done (%d loops)\n", + trace_printk("Interrupt done (%d loops)\n", MAX_IRQ_LOOPS - timeout); /*