Received: by 10.223.176.46 with SMTP id f43csp1033026wra; Sat, 20 Jan 2018 09:22:45 -0800 (PST) X-Google-Smtp-Source: AH8x227Sxsb+Tgrgl4rkqSn8hs1rNNxqD58UwvZHO7QHkMouZM+37h9gB/f5Z5YhpiX2SVyi7TAV X-Received: by 10.99.168.76 with SMTP id i12mr2593956pgp.119.1516468965469; Sat, 20 Jan 2018 09:22:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516468965; cv=none; d=google.com; s=arc-20160816; b=ZTonTmbZ9CZWAf/VXLMcypOnUvPDATmvqV7S7e4AWeWlsBpWte/fu7OvP3CjYMiYU4 MQMTiYnw6JMwhGbs74FONr9VZqO3GV2Ifw/5GU5RJ16+DQSEpsLvZcr6tm0xzg179oVU wOch+yNXiJxhNd2JTprN4Z7+Bgm83CIEJTcCjpG3+Wqxus1VhWNBF4R4Ol9NuWgQy5lV NAfVGUcbRnhAPhTycnXPDVOjLdBM+ZgUH4XII2jnh1W45XYiDhtVXQrGjS8d9qaJUT6N uGKz2p/sHN4OIMdVhkbEpb3bf2uo8fDgWvDfVlpdlt2qSkUJK+XOMLNmbNSZ5q+/W0Bm IEQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=TFl/zzBAFNaWK8e12QAjY+MQjpxa66kgTEeCXR6L3+Y=; b=ImhJYoEZGim8Xga2DfVgM0hGA7bCHaDNI2rXwbvuP0M84k2iK2/piZQ9XYdw37OT39 OAs42efcWi1qcwYbsQT263yv/FeFRjDFdIQqBXl6JWf4XNVnA1PBd8u30lKNXHU2cnML I4uynRlP/U8Yo18mRGLwJ0ipsydAzQLvbBRr/Ay0FSeRuVcvNui8Jj+9mUWQIhePWvbn w2wqYQGrZgrhdz7rO1bwtP8EXZyHHVr1oT76DSvopsASYP4Iv5E74kTvD+aIjmebsyAp 6bxf9t6wPfVg3Mf1pYykz957KofzKrFQ8rn2RvYy2wqeYHH+CpolisNIq7V7f4zA/TKs yfjg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=mPpekKgJ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c2si10030635pgf.611.2018.01.20.09.22.31; Sat, 20 Jan 2018 09:22:45 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=mPpekKgJ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756840AbeATRWP (ORCPT + 99 others); Sat, 20 Jan 2018 12:22:15 -0500 Received: from mail-qt0-f170.google.com ([209.85.216.170]:37864 "EHLO mail-qt0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756475AbeATRV6 (ORCPT ); Sat, 20 Jan 2018 12:21:58 -0500 Received: by mail-qt0-f170.google.com with SMTP id d54so11375379qtd.4; Sat, 20 Jan 2018 09:21:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=TFl/zzBAFNaWK8e12QAjY+MQjpxa66kgTEeCXR6L3+Y=; b=mPpekKgJ4hOLd6hVSuA4PpQcI0irwk5KSUzthJ1JKp83dmBswimXdTnBzOPeyzcWlh 07Uqb+W2NCNTs5+UrAjysxGUpSVq48SHhATXJQJI2YC3XP7Cyg1i3X3hq5hjFNXONEUA J/+nAxqaKWUImKJZBC0M7tXJJdPfQ7xOSIDRmXRbop14dIo2SxttZqguw0Q1SF4Xthfk w3xU5zYboSyS5zxDa8NQvdn55pHTkfLQcy//yrvsmrHjp8QuVnK5HDiZ+iGGb3zemZ6N yaghgiaQ2QIixFfvgVQkbg99RmY1fAb9t6PTj6HxzVui34WPawZUPoJd3iKhkjGIKe1y b9+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=TFl/zzBAFNaWK8e12QAjY+MQjpxa66kgTEeCXR6L3+Y=; b=O4YtyRubkUYk4+Sdy/fzYiszE6jnt7aIhQ3j7wANqbsDx6aX41bBWDaRIGFcp0R6rg 60e/6AQXHfVBnUnAnYsQVcufxLX0f9ai54/WNZxVNtpJ/HZeN2QXb7AFezsAkiGLdYq3 EuDMZi4BvfyfbLE+HiADKYLbQ+XUoX5P02Q2XVXkQH7olwF0/CfmeC2yJmUSb9NvLGam XU6iRAEeCt80Jdp1yKgR4vYjliMB2Bt64Uiqfec9YmsxIKDTpCRYyC/RjMDX7H3I20b2 U+EPLO5DzlmWT1mcah0dJUD6Qr6Oze+xXZMCJVzUnO/sgfjFYBpj+aQEo/dFXYEW8krc D7Mw== X-Gm-Message-State: AKwxytdqPFtXeYjx0cOt00INKBHLbaA6ekkrVVE+Mw/69qRLvnWY3VTM wWUnRm73rg7lY4Lj0Uc2bfsslfakCIj1Heg/iDX4/w== X-Received: by 10.237.42.198 with SMTP id t64mr3246257qtd.177.1516468916567; Sat, 20 Jan 2018 09:21:56 -0800 (PST) MIME-Version: 1.0 Received: by 10.140.89.199 with HTTP; Sat, 20 Jan 2018 09:21:55 -0800 (PST) In-Reply-To: <20180119225500.lq2vpnjh5isxiovf@f1.synalogic.ca> References: <20180118065054.29844-1-bpoirier@suse.com> <20180119085952.u63kius4ud34lleq@f1.synalogic.ca> <20180119133648.s5nbm4gvby6c33av@f1.synalogic.ca> <20180119224517.klugizz5n5zznryx@f1.synalogic.ca> <20180119225500.lq2vpnjh5isxiovf@f1.synalogic.ca> From: Alexander Duyck Date: Sat, 20 Jan 2018 09:21:55 -0800 Message-ID: Subject: Re: [Intel-wired-lan] [RFC PATCH] e1000e: Remove Other from EIAC. To: Benjamin Poirier Cc: Shrikrishna Khare , Jeff Kirsher , Netdev , intel-wired-lan , linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jan 19, 2018 at 2:55 PM, Benjamin Poirier wrote: > On 2018/01/20 07:45, Benjamin Poirier wrote: > [...] >> > >> > I'm of the mind that we need to cut down on the code thrash. This >> > driver is supposed to have been in a "maintenance" mode for the last >> > year or so as there aren't being any new parts added is my >> > understanding. As-is I don't see any reason why 16ecba59bc33 ("e1000e: >> > Do not read ICR in Other interrupt", v4.5-rc1) was submitted or >> > accepted in the first place. I don't see any notes about it fixing any >> > bug or addressing any issue and it seems like that is the start of all >> > the issues we have been having recently with RXO triggering more >> > interrupts, various link issues, and this most recent VMware issue. >> >> I'm sorry to say but you're the one who suggested that change: >> >> http://lkml.iu.edu/hypermail/linux/kernel/1510.3/03528.html >> >> On 2015/10/28 23:08, Alexander Duyck wrote: >> > On 10/22/2015 05:32 PM, Benjamin Poirier wrote: >> [...] >> > >> > I would argue your first patch probably didn't go far enough to remove dead >> > code. Specifically you should only ever get into this function if LSC is >> > set. There are no other causes that should trigger this. As such you could >> > probably remove the ICR read, and instead replace it with an ICR write of >> > the LSC bit since OTHER is already cleared via EIAC. >> > > > ... The assumption that "There are no other causes that should trigger > this." turned out to be wrong and that caused the RXO problems. Clearing > OTHER via EIAC is causing the problems with vmware now. I don't think > you foresaw those problems back in 2015 and neither did I. Well that explains why I felt like I was explaining things to a younger version of myself. I was a bit more relaxed in terms of being willing to make arbitrary changes a few years ago. I tend to be a bit more conservative now, at least as far as having justifications as to why I want to do things. With any change you end up taking on risk, and so usually a patch has a justification as to why you are making the change. Unfortunately at the time I didn't have all the information and based my suggestion on a bad assumption. I would guess at the time I was thinking of doing general code cleanup. Other drivers such as igb work this way, but it led us down the path we are on now where we are having to make one fix after another. It is leading in the opposite direction of maintainability as this is all becoming more complex. Suggesting this was a bad decision on my part at the time. I'm only human, I make mistakes.. :-) With further review of the code I am seeing various other issues that could still pop up as I am not certain we should even have the "other" interrupt messing with the NAPI polling or packet accounting logic at all. The question I would have at this point is if we revert 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1) and all the related fixes for it, what do we end up with? It seems like the code is slowly heading back in the direction of where it was originally anyway since there have been a number of partial reverts. I'm wondering what would happen if we were to just short-cut that and audit the patches involved to see what we really need and don't. Your patch as proposed is essentially another step in that direction. I'm thinking we may want to drop my currently proposed fix for now and instead look at going through and figure out what changes after that first one are still really needed. It doesn't look like my fix will make it for 4.15 anyway so we might as well focus on making certain to have things as solid as possible by the time 4.16-rc1 rolls around. Thanks. - Alex