Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935042AbcKQRIz (ORCPT ); Thu, 17 Nov 2016 12:08:55 -0500 Received: from mail-it0-f48.google.com ([209.85.214.48]:36758 "EHLO mail-it0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754521AbcKQRIi (ORCPT ); Thu, 17 Nov 2016 12:08:38 -0500 MIME-Version: 1.0 In-Reply-To: <20161116210139.GB21156@e106950-lin.cambridge.arm.com> References: <20161116135527.GA5833@e106950-lin.cambridge.arm.com> <20161116180156.GA21156@e106950-lin.cambridge.arm.com> <20161116210139.GB21156@e106950-lin.cambridge.arm.com> From: Eric Dumazet Date: Thu, 17 Nov 2016 07:29:14 -0800 Message-ID: Subject: Re: Regression: Failed boots bisected to 4cd13c21b207 "softirq: Let ksoftirqd do its job" To: Brian Starkey Cc: LKML , Peter Zijlstra , Ingo Molnar , Andrew Morton , Alexander Potapenko , Steven Rostedt , Sebastian Andrzej Siewior , Thomas Gleixner Content-Type: multipart/mixed; boundary=f403045d9db62809c6054180da1b Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5211 Lines: 118 --f403045d9db62809c6054180da1b Content-Type: text/plain; charset=UTF-8 On Wed, Nov 16, 2016 at 1:01 PM, Brian Starkey wrote: > On Wed, Nov 16, 2016 at 10:49:06AM -0800, Eric Dumazet wrote: >> >> On Wed, Nov 16, 2016 at 10:01 AM, Brian Starkey >> wrote: >> >>> >>> The smc91x driver does seem to have some trickiness around softirqs. >>> I'm not familiar with net drivers, but I'll see if I can figure >>> anything out there. >> >> >> Oh this code looks ugly :( >> >> Do you have CONFIG_SMP=y or not ? > > > Yeah CONFIG_SMP=y (and CONFIG_PREEMPT=y too, fwiw). > > I did try forcing it into the no-op locking (as though config SMP > wasn't set), it didn't help (and it doesn't look like that would be > safe with CONFIG_PREEMPT=y either). > > The bit in smc_hardware_send_pkt looks like skipping softirq > invocation when there's already one running wouldn't give the same > behaviour as before: > > if (!smc_special_trylock(&lp->lock, flags)) { > netif_stop_queue(dev); > tasklet_schedule(&lp->tx_task); > return; > } > > ... that said, I've no idea if that matters. > > Of course I also don't know if the network driver is even to blame :-( > I believe the problem is in SMC_WAIT_MMU_BUSY() Could you try this patch ? (inlined and attached) diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c index 65077c77082a2f042117a0889c2b15099c58eae5..4ef653ae8564c6d2a5120cdaef12db1e1b218b39 100644 --- a/drivers/net/ethernet/smsc/smc91x.c +++ b/drivers/net/ethernet/smsc/smc91x.c @@ -229,19 +229,22 @@ static inline void PRINT_PKT(u_char *buf, int length) { } * if at all, but let's avoid deadlocking the system if the hardware * decides to go south. */ -#define SMC_WAIT_MMU_BUSY(lp) do { \ - if (unlikely(SMC_GET_MMU_CMD(lp) & MC_BUSY)) { \ - unsigned long timeout = jiffies + 2; \ - while (SMC_GET_MMU_CMD(lp) & MC_BUSY) { \ - if (time_after(jiffies, timeout)) { \ - netdev_dbg(dev, "timeout %s line %d\n", \ - __FILE__, __LINE__); \ - break; \ - } \ - cpu_relax(); \ - } \ - } \ -} while (0) +static void SMC_WAIT_MMU_BUSY(struct smc_local *lp) +{ + unsigned long timeout = jiffies + 2; + unsigned int count = 10000; + + while (SMC_GET_MMU_CMD(lp) & MC_BUSY) { + count--; + if (!count || time_after(jiffies, timeout)) { + netdev_dbg(lp->dev, "timeout %s line %d\n", + __FILE__, __LINE__); + break; + } + /* TODO : investigate using cond_resched() from allowed contexts */ + cpu_relax(); + } +} /* --f403045d9db62809c6054180da1b Content-Type: text/plain; charset=US-ASCII; name="patch2250.txt" Content-Disposition: attachment; filename="patch2250.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_ivmihmyk0 ZGlmZiAtLWdpdCBhL2RyaXZlcnMvbmV0L2V0aGVybmV0L3Ntc2Mvc21jOTF4LmMgYi9kcml2ZXJz L25ldC9ldGhlcm5ldC9zbXNjL3NtYzkxeC5jCmluZGV4IDY1MDc3Yzc3MDgyYTJmMDQyMTE3YTA4 ODljMmIxNTA5OWM1OGVhZTUuLjRlZjY1M2FlODU2NGM2ZDJhNTEyMGNkYWVmMTJkYjFlMWIyMThi MzkgMTAwNjQ0Ci0tLSBhL2RyaXZlcnMvbmV0L2V0aGVybmV0L3Ntc2Mvc21jOTF4LmMKKysrIGIv ZHJpdmVycy9uZXQvZXRoZXJuZXQvc21zYy9zbWM5MXguYwpAQCAtMjI5LDE5ICsyMjksMjIgQEAg c3RhdGljIGlubGluZSB2b2lkIFBSSU5UX1BLVCh1X2NoYXIgKmJ1ZiwgaW50IGxlbmd0aCkgeyB9 CiAgKiBpZiBhdCBhbGwsIGJ1dCBsZXQncyBhdm9pZCBkZWFkbG9ja2luZyB0aGUgc3lzdGVtIGlm IHRoZSBoYXJkd2FyZQogICogZGVjaWRlcyB0byBnbyBzb3V0aC4KICAqLwotI2RlZmluZSBTTUNf V0FJVF9NTVVfQlVTWShscCkgZG8gewkJCQkJXAotCWlmICh1bmxpa2VseShTTUNfR0VUX01NVV9D TUQobHApICYgTUNfQlVTWSkpIHsJCVwKLQkJdW5zaWduZWQgbG9uZyB0aW1lb3V0ID0gamlmZmll cyArIDI7CQkJXAotCQl3aGlsZSAoU01DX0dFVF9NTVVfQ01EKGxwKSAmIE1DX0JVU1kpIHsJCVwK LQkJCWlmICh0aW1lX2FmdGVyKGppZmZpZXMsIHRpbWVvdXQpKSB7CQlcCi0JCQkJbmV0ZGV2X2Ri ZyhkZXYsICJ0aW1lb3V0ICVzIGxpbmUgJWRcbiIsCVwKLQkJCQkJICAgX19GSUxFX18sIF9fTElO RV9fKTsJCVwKLQkJCQlicmVhazsJCQkJCVwKLQkJCX0JCQkJCQlcCi0JCQljcHVfcmVsYXgoKTsJ CQkJCVwKLQkJfQkJCQkJCQlcCi0JfQkJCQkJCQkJXAotfSB3aGlsZSAoMCkKK3N0YXRpYyB2b2lk IFNNQ19XQUlUX01NVV9CVVNZKHN0cnVjdCBzbWNfbG9jYWwgKmxwKQoreworCXVuc2lnbmVkIGxv bmcgdGltZW91dCA9IGppZmZpZXMgKyAyOworCXVuc2lnbmVkIGludCBjb3VudCA9IDEwMDAwOwor CisJd2hpbGUgKFNNQ19HRVRfTU1VX0NNRChscCkgJiBNQ19CVVNZKSB7CisJCWNvdW50LS07CisJ CWlmICghY291bnQgfHwgdGltZV9hZnRlcihqaWZmaWVzLCB0aW1lb3V0KSkgeworCQkJbmV0ZGV2 X2RiZyhscC0+ZGV2LCAidGltZW91dCAlcyBsaW5lICVkXG4iLAorCQkJCSAgIF9fRklMRV9fLCBf X0xJTkVfXyk7CisJCQlicmVhazsKKwkJfQorCS8qIFRPRE8gOiBpbnZlc3RpZ2F0ZSB1c2luZyBj b25kX3Jlc2NoZWQoKSBmcm9tIGFsbG93ZWQgY29udGV4dHMgKi8KKwljcHVfcmVsYXgoKTsKKwl9 Cit9CiAKIAogLyoK --f403045d9db62809c6054180da1b--