Received: by 10.213.65.68 with SMTP id h4csp1371309imn; Wed, 14 Mar 2018 19:19:39 -0700 (PDT) X-Google-Smtp-Source: AG47ELu5Vthd/0eJJb5nzOoOUfy8iqxFowTVqi6xXwAoovKtPWjrOXHdAlmv+HKTZYmH9VuAdzR/ X-Received: by 2002:a17:902:2b84:: with SMTP id l4-v6mr6348188plb.338.1521080379764; Wed, 14 Mar 2018 19:19:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1521080379; cv=none; d=google.com; s=arc-20160816; b=lj2QsBsgwo9OCXJxXwhdmPvO/4weKq+7VfeXYmF1WPa7ioJ759KBAWcjxPrQm6C2N+ hRPrGCFdDVTqLmFKs8MVc8T7IqaQsn6GD7EmInARjEX5W4QdC/E3m2IKHcV/mwcvZrnR r4LUXXNjhyQhlg/OSJs0SG5F9bfYfcBPL9EHmqMyEaCnqPX/XZ75NsqAMDUWbPeTCEld ZTURwTsSO1z2wYM+6LaeA5VEq1CXgZQfZwoazffFApS7uiHibmW3GfFFNJNsU0p6yLN3 /eDGjrfXH71TfT5ljoGICzPSqtAQ561gBu9JYEDGTua8CqjWS6OBqNNBl7aVsZisjAR4 S8Qg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dmarc-filter :dkim-signature:dkim-signature:arc-authentication-results; bh=0mza/2qA8XHPcCqrLhKRDgjcEx8S1f143xbH9zEZzp8=; b=rN1nNbmITcM8FA9vCH23/YlgbLKmH4Mtu3y3cQ5YyT7TjALZlE2fWzKSuNn/p8LyhB DRrIZoRQh9pY5Dz3uxXTZH2H0juH9v6NVj3sC5VikiW1ejJrXGNgoLE+S4ZrnPowDglV IdgU6q0t4G4e7U9lhlReGZaNgLvRMuCWGk6w58hvxr/VoSYR7hHBBqdu9exMamZqtIC3 C2imblHr+n9iGePySfEtLNkbw69NwBoZMfa0G4PBY2SUSamWm81z8MI4is0u38b+B1A5 bMh+8TQ4lNQQRgZjq0YYI5PDd9lkZZzbER/EG3Dc8619aIaMJHmIUGd2hcTyX+ITQ7jh qZPw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@codeaurora.org header.s=default header.b=WI5dGxi2; dkim=pass header.i=@codeaurora.org header.s=default header.b=Vo6k+Ae3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z4-v6si2997606plo.523.2018.03.14.19.19.17; Wed, 14 Mar 2018 19:19:39 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@codeaurora.org header.s=default header.b=WI5dGxi2; dkim=pass header.i=@codeaurora.org header.s=default header.b=Vo6k+Ae3; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751652AbeCOCRt (ORCPT + 99 others); Wed, 14 Mar 2018 22:17:49 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:50226 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751480AbeCOCRr (ORCPT ); Wed, 14 Mar 2018 22:17:47 -0400 Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 9170060316; Thu, 15 Mar 2018 02:17:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1521080266; bh=uRyc7NkXjBGx3ONeIPT2D6Noird9hsrN6huaj34x/mY=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=WI5dGxi2eSt4M7qgBUXGI1dblv0tPOcGRdmxHGaP6wwdCzL1fWDSCJ/04uNiinQMe I4jnC/QzDWUM8yH0ZXr9J1qY08w00Vt85eD7DbKeDIufk6xIAp9eVagj8sPM++vByo +bFESY1EleC8ZkoAUAc/DqFINP008tSRYogHLEG4= X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on pdx-caf-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.8 required=2.0 tests=ALL_TRUSTED,BAYES_00, DKIM_SIGNED,T_DKIM_INVALID autolearn=no autolearn_force=no version=3.4.0 Received: from [192.168.0.105] (cpe-174-109-247-98.nc.res.rr.com [174.109.247.98]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: okaya@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 358DC60316; Thu, 15 Mar 2018 02:17:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1521080265; bh=uRyc7NkXjBGx3ONeIPT2D6Noird9hsrN6huaj34x/mY=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=Vo6k+Ae3JUMGbX8mrg8v2ALOlRhSS29kAhLHarKkMjVQ4izZ2h/Z6zVCHTdUNfMme rVCmKaJmiYmZzVshMMXk+klsb1rjGQqHUU3tXU6ilg+AbPcQ92N/QxVT0QLHfkCt9P G9ftiKG6MSnluyoDII0FxPn/mIw/Kkv2M+svzXag= DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 358DC60316 Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=okaya@codeaurora.org Subject: Re: [PATCH 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs To: Alexander Duyck Cc: Timur Tabi , Netdev , sulrich@codeaurora.org, linux-arm-msm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Jeff Kirsher , intel-wired-lan , LKML References: <1520997629-17361-1-git-send-email-okaya@codeaurora.org> <1520997629-17361-7-git-send-email-okaya@codeaurora.org> <12150aa0-77ba-878e-31f4-d4f8d6a28ccb@codeaurora.org> <2a4f4dec64b7462ae64152f6c2df9754@codeaurora.org> From: Sinan Kaya Message-ID: <53bf7dfe-32ee-1861-e6ea-81f667590a43@codeaurora.org> Date: Wed, 14 Mar 2018 22:17:43 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3/14/2018 9:44 PM, Alexander Duyck wrote: > On Wed, Mar 14, 2018 at 3:57 PM, Sinan Kaya wrote: >> Hi Alexander, >> >> On 3/14/2018 5:49 PM, Alexander Duyck wrote: >>> On Wed, Mar 14, 2018 at 5:13 AM, wrote: >>>> On 2018-03-14 01:08, Timur Tabi wrote: >>>>> >>>>> On 3/13/18 10:20 PM, Sinan Kaya wrote: >> >>> Actually I would argue this whole patch set is pointless. For starters >>> why is it we are only updating the Intel Ethernet drivers here? >> >> I did a regex search for wmb() followed by writel() in each drivers directory. >> I scrubbed the ones I care about and posted this series. Note also that >> I have one Infiniband patch in the series. > > I didn't see it as it I was only looking at the patches that had ended > up in intel-wired-lan. Also was there a cover page, I couldn't seem to > find that on LKML. Yeah, I didn't have a cover page. These patches were sitting on my branch for a while. I wanted to get them out without putting too much effort into it. I'll add it on the next version. > >> I considered "ease of change", "popular usage" and "performance critical >> path" as the determining criteria for my filtering. > > It might be advisable to break things up to subsystem or family. So > for example if you are going to update the Intel Ethernet drivers I > would focus on that and maybe spin the infiniband patch of into a > separate set that can be applied to a separate tree. This is something > I would consider more of a driver optimization than a fix. In our case > it makes it easier for us to maintain the patches to the Intel drivers > if you could submit just those to Jeff and Intel-wired-lan so that we > can take care of test and review as well as figure out what other > drivers will would still need to update in order to handle all the > cases involved in this. > >>> This >>> seems like something that is going to impact the whole kernel tree >>> since many of us have been writing drivers for some time assuming x86 >>> style behavior. >> >> That's true. We used relaxed API heavily on ARM for a long time but >> it did not exist on other architectures. For this reason, relaxed >> architectures have been paying double penalty in order to use the common >> drivers. >> >> Now that relaxed API is present on all architectures, we can go and scrub >> all drivers to see what needs to change and what can remain. > > My only real objection is that you are going to be having to scrub > pretty much ALL the drivers. It seems a little like trying to fix a > bad tire on your car by paving the road to match the shape of the > tire. Or we start with mostly used ones and hope to increase the coverage over time. It will take a while to cover all drivers. We could do several iterations like you are suggesting for each subsystem. > >>> >>> It doesn't make sense to be optimizing the drivers for one subset of >>> architectures. The scope of the work needed to update the drivers for >>> this would be ridiculous. Also I don't see how this could be expected >>> to work on any other architecture when we pretty much need to have a >>> wmb() before calling the writel on x86 to deal with accesses between >>> coherent and non-coherent memory. It seems to me more like somebody >>> added what they considered to be an optimization somewhere that is a >>> workaround for a poorly written driver. Either that or the barrier is >>> serving a different purpose then the one we were using. >> >> Is there a semantic problem with the definition of wmb() vs. writel() vs. >> writel_relaxed()? I thought everything is well described in barriers >> document about what to expect from these APIs. >> >> AFAIK, writel() is equal to writel_relaxed() on x86 architecture. >> It doesn't really change anything for x86 but it saves barriers on >> other architectures. > > Yeah. I had to go through and do some review since my concerns have > been PowerPC, IA64, and x86 historically. From what I can tell all > those architectures are setup the same way so that shouldn't be an > issue. OK, glad that we are in common understanding. > >>> >>> It would make more sense to put in the effort making writel and >>> writel_relaxed consistent between architectures before we go through >>> and start modifying driver code to support different architectures. >>> >> >> Is there an arch problem that I'm not seeing? >> >> Sinan > > It isn't really an arch problem I have as a logistical one. It just > seems like this is really backwards in terms of how this has been > handled. For the x86 we have historically had to deal with the > barriers for this kind of stuff ourselves, now for ARM and a couple > other architectures they seem to have incorporated the barriers into > writel and are expecting everyone to move over to writel_relaxed. You want to move to writel_relaxed() only if you know that your register accesses won't have any side effects. If you require some memory update to be observable to the HW before doing a register write, right thing is to do wmb() + writel_relaxed() wmb() + writel() is clearly the wrong choice and that's what the goal of this change. If we know that all writel() adaptations in all architectures guarantee observability, another option is to get rid of wmb() and just keep writel(). I'm not so convinced about this and hoping that someone will correct me. wmb() on x86 seems to have an sfence but writel() seems to have a compiler barrier in it. So, the type of barrier wmb() is using is different. we can't say (wmb()+ writel_relaxed()) == writel() for all architectures, but maybe I'm wrong. We really don't want to convert all writel() to writel_relaxed() blindly without giving much thought into it. This was also another reason why I limited the changes to wmb() + writel() combinations only. If there is wmb() + code + writel() and if we convert this to wmb() + code + writel_relaxed(), code will not be observed by the HW and this might break the driver. > It > seems like instead of going that route they should have probably just > looked at pushing the ARM drivers to something like a "writel_strict" > and adopted the behavior of the other architectures for writel. > > I'll go back through and review. It looks like a number of items were missed. > OK, I'll take a look at each one. Saw the code concern on the e1000e suggestion. I wanted to raise it here first before inspecting the rest. -- Sinan Kaya Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.