From: Eric Dumazet
Date: Tue, 22 Nov 2016 08:09:58 -0800
Subject: Re: Regression: Failed boots bisected to 4cd13c21b207 "softirq: Let ksoftirqd do its job"
To: Brian Starkey
Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar, Andrew Morton, Alexander Potapenko, Steven Rostedt, Sebastian Andrzej Siewior
In-Reply-To: <20161122152733.GH25080@e106950-lin.cambridge.arm.com>
References: <20161116135527.GA5833@e106950-lin.cambridge.arm.com>
 <20161116180156.GA21156@e106950-lin.cambridge.arm.com>
 <20161116210139.GB21156@e106950-lin.cambridge.arm.com>
 <20161117164200.GA24653@e106950-lin.cambridge.arm.com>
 <20161122103351.GA25080@e106950-lin.cambridge.arm.com>
 <20161122152733.GH25080@e106950-lin.cambridge.arm.com>

>
> Looks like there's a few similarly named devices and drivers. Mine is
> an SMSC LAN91C111 using the smc91x driver in
> drivers/net/ethernet/smsc/smc91x.c, rather than smc911x.c. So the
> interrupt handler is smc_interrupt()
>
> CONFIG_ARCH_PXA is not set, nor is SMC_USE_PXA_DMA or SMC_USE_DMA.

Oh right.

static irqreturn_t smc_interrupt(int irq, void *dev_id)
{
...
	mask = SMC_GET_INT_MASK(lp);
	SMC_SET_INT_MASK(lp, 0);

	loop()
...
		} else if (status & IM_ALLOC_INT) {
			DBG(3, dev, "Allocation irq\n");
			tasklet_hi_schedule(&lp->tx_task);
			mask &= ~IM_ALLOC_INT;
		}
...
	SMC_SET_INT_MASK(lp, mask);
	spin_unlock(&lp->lock);

	/*
	 * We return IRQ_HANDLED unconditionally here even if there was
	 * nothing to do.  There is a possibility that a packet might
	 * get enqueued into the chip right after TX_EMPTY_INT is raised
	 * but just before the CPU acknowledges the IRQ.
	 * Better take an unneeded IRQ in some occasions than complexifying
	 * the code for all cases.
	 */
	return IRQ_HANDLED;
}

Could you trace the mask value? It looks like we loop and never
acknowledge some interrupt status.

Maybe the driver depends on tasklet_hi_schedule() triggering
smc_hardware_send_pkt() really, really soon.

Oh, but smc_hardware_send_pkt() uses a spin_trylock() at its beginning,
then later a spin_lock() hidden in SMC_ENABLE_INT(). This looks racy.

diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c
index 73212590d04a..32abb0a084ec 100644
--- a/drivers/net/ethernet/smsc/smc91x.c
+++ b/drivers/net/ethernet/smsc/smc91x.c
@@ -617,14 +617,13 @@ static void smc_hardware_send_pkt(unsigned long data)
 	/* queue the packet for TX */
 	SMC_SET_MMU_CMD(lp, MC_ENQUEUE);
+	SMC_SET_INT_MASK(lp, SMC_GET_INT_MASK(lp) | IM_TX_INT | IM_TX_EMPTY_INT);
 	smc_special_unlock(&lp->lock, flags);
 
 	netif_trans_update(dev);
 	dev->stats.tx_packets++;
 	dev->stats.tx_bytes += len;
 
-	SMC_ENABLE_INT(lp, IM_TX_INT | IM_TX_EMPTY_INT);
-
 done:	if (!THROTTLE_TX_PKTS)
 		netif_wake_queue(dev);
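
To make the ordering change in the patch above easier to see, here is a rough user-space sketch. It is not driver code: struct toy_chip, toy_lock(), toy_unlock(), send_old_order() and send_new_order() are made-up names, the bit values are placeholders rather than the ones in smc91x.h, and the "lock" is a plain flag. It only models which mask value another context could observe right after the first unlock, and it assumes nothing about what actually goes wrong on the hardware.

```c
#include <assert.h>

/* Placeholder bit values for this sketch, not copied from smc91x.h. */
#define IM_TX_INT       0x02u
#define IM_TX_EMPTY_INT 0x04u

/* Hypothetical stand-in for the chip state guarded by lp->lock. */
struct toy_chip {
	unsigned int int_mask;	/* models the interrupt mask register */
	int locked;		/* models lp->lock as a simple flag */
};

static void toy_lock(struct toy_chip *c)
{
	assert(!c->locked);
	c->locked = 1;
}

static void toy_unlock(struct toy_chip *c)
{
	assert(c->locked);
	c->locked = 0;
}

/*
 * Old ordering: queue the packet, drop the lock, then take the lock
 * again to enable the TX interrupts. Returns the mask value visible
 * in the window between the two critical sections.
 */
static unsigned int send_old_order(struct toy_chip *c)
{
	unsigned int seen;

	toy_lock(c);
	/* packet would be queued here */
	toy_unlock(c);

	seen = c->int_mask;	/* what another context sees in the window */

	toy_lock(c);		/* models the lock taken inside SMC_ENABLE_INT() */
	c->int_mask |= IM_TX_INT | IM_TX_EMPTY_INT;
	toy_unlock(c);

	return seen;
}

/*
 * Patched ordering: update the mask before dropping the lock, so the
 * first value visible after unlock already has the TX bits set.
 */
static unsigned int send_new_order(struct toy_chip *c)
{
	toy_lock(c);
	/* packet would be queued here */
	c->int_mask |= IM_TX_INT | IM_TX_EMPTY_INT;
	toy_unlock(c);

	return c->int_mask;
}
```

Calling send_old_order() on a zeroed struct toy_chip returns a mask without the TX bits, while send_new_order() returns one with both bits set. Both leave the same final mask; only the intermediate state differs, which is the window the patch is meant to close.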