Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932191AbbDPSyl (ORCPT ); Thu, 16 Apr 2015 14:54:41 -0400 Received: from cantor2.suse.de ([195.135.220.15]:47734 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752073AbbDPSyd (ORCPT ); Thu, 16 Apr 2015 14:54:33 -0400 Date: Thu, 16 Apr 2015 20:54:26 +0200 From: "Luis R. Rodriguez" To: Hyong-Youb Kim Cc: Andy Walls , Hyong-Youb Kim , netdev@vger.kernel.org, Andy Lutomirski , Toshi Kani , "H. Peter Anvin" , Ingo Molnar , linux-kernel@vger.kernel.org, Hal Rosenstock , Sean Hefty , Suresh Siddha , Rickard Strandqvist , Mike Marciniszyn , Roland Dreier , Juergen Gross , Mauro Carvalho Chehab , Borislav Petkov , Mel Gorman , Vlastimil Babka , Davidlohr Bueso , Johannes Berg , Felix Fietkau , Benjamin Poirier , dave.hansen@linux.intel.com, plagnioj@jcrosoft.com, tglx@linutronix.de, Ville =?iso-8859-1?Q?Syrj=E4l=E4?= , linux-fbdev@vger.kernel.org, linux-media@vger.kernel.org, x86@kernel.org Subject: Re: ioremap_uc() followed by set_memory_wc() - burrying MTRR Message-ID: <20150416185426.GH5622@wotan.suse.de> References: <1428695379.6646.69.camel@misato.fc.hp.com> <20150410210538.GB5622@wotan.suse.de> <1428699490.21794.5.camel@misato.fc.hp.com> <20150411012938.GC5622@wotan.suse.de> <20150413174938.GE5622@wotan.suse.de> <1429137531.1899.28.camel@palomino.walls.org> <20150415235816.GG5622@wotan.suse.de> <20150416041837.GA5712@hykim-PC> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150416041837.GA5712@hykim-PC> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2905 Lines: 59 On Thu, Apr 16, 2015 at 01:18:37PM +0900, Hyong-Youb Kim wrote: > On Thu, Apr 16, 2015 at 01:58:16AM +0200, Luis R. Rodriguez wrote: > > > > An alternative... is to just ioremap_wc() the entire region, including > > MMIO registers for these old devices. I see one ethernet driver that does > > this, myri10ge, and am curious how and why they ended up deciding this > > and if they have run into any issues. I wonder if this is a reasonable > > comrpomise for these 2 remaining corner cases. > > > > For myri10ge, it a performance thing. Descriptor rings are in NIC > memory BAR0, not in host memory. Say, to send a packet, the driver > writes the send descriptor to the ioremap'd NIC memory. It is a > multi-word descriptor. So, to send it as one PCIE MWr transaction, > the driver maps the whole BAR0 as WC and does "copy descriptor; wmb". Interesting, so you burst write multi-word descriptor writes using write-combining here for the Ethernet device. > Without WC, descriptors would end up as multiple 4B or 8B MWr packets > to the NIC, which has a pretty big performance impact on this > particular NIC. How big are the descriptors? > Most registers that do not want WC are actually in BAR2, which is not > mapped as WC. For registers that are in BAR0, we do "write to the > register; wmb". If we want to wait till the NIC has seen the write, > we do "write; wmb; read". Interesting, thanks, yeah using this as a work around to the problem sounds plausible however it still would require likely making just as many changes to the ivtv and ipath driver as to just do a proper split. I do wonder however if this sort of work around can be generalized somehow though so that others could use, if this sort of thing is going to become prevalent. If so then this would serve two purposes: work around for the corner cases of MTRR use on Linux and also these sorts of device constraints. In order to determine if this is likely to be generally useful could you elaborate a bit more about the detals of the performance issues of not bursting writes for the descriptor on this device. Even if that is done a conversion over to this work around seems it may require device specific nitpicks. For instance I note in myri10ge_submit_req() for small writes you just do a reverse write and do the first set last, then finally the last 32 bits are rewritten, I guess to trigger something? > This approach has worked for this device for many years. I cannot say > whether it works for other devices, though. I think it should but the more interesting question would be exactly *why* it was needed for this device, who determined that, and why? Luis -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/