Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp5552393imm; Mon, 23 Jul 2018 01:34:13 -0700 (PDT) X-Google-Smtp-Source: AAOMgpeTjMYp94EfzAZemZeku3f8QvqXc9nKFx4mLVMDeThPrw+jKRQ4aidHdysOKO/Xp9NUksXC X-Received: by 2002:a17:902:683:: with SMTP id 3-v6mr12066089plh.291.1532334853598; Mon, 23 Jul 2018 01:34:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1532334853; cv=none; d=google.com; s=arc-20160816; b=lybL9nEXwArYjUoYNt+KGwLCP4jJQL1CMY8Q3DRYsfX/Ej3v55njAUDmt2RbEFI8is 26ISnZaJExRSbpq6xM4w/dGetVcwpA1bEnjbt9g9WFmts18NtGTHZN0yY27MB0M469bx PnapWE6HWvuvxAkWqetdABKoFaH+ZoQrz+A/JY5uKfpIdyjHf4H4M4odji5nVGit+sNm X02Go4joj/I3egFjSskEWQTOHv2H1tTofOnCg4dKfyWLdnx5Cs3Oq0yW4gmDEEzBz0Mr VqOxlJ4F7BzYyhBffADW4txfchH1V19jsTQmBVwKMyzO/LmF2UNSFfqp99Fb4TUPQsmJ q+wA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=vEZ5u5lLuA3wdBrZQHJCIXDcSaJIvAOxhJFHkM2/6YY=; b=XKEWD3o+81I+5KsPfm8BB2gIu8ll7f9AmCwZKqNHiE0to4T7OPHOdPi/HSIroH/gqr rATYf0AT7ebz/Wa5vfAnpsPoXsVKrXOhKPn/Bx6YUe2Fqo2AS2OV+iI4VuMfzZ8K4SLM 7snWfs7/1lsi9w5/tMg7uGveOD9cGvVmY7jnB5LLeEDyz3B3xofY9R0uz+EuBQzeCtG7 KUzCv+5LKeVssO/HbBrnxJsmeyG3uxoDYzwktusmAodOa4Cfa0D8BMZ8adNGlDBzy3zM MyypjcMgo4RuDOR6lRc2w0iPOrp2ngGJWaOFnWLFIQ6m1M2F6EGhJSJrnjFY/afWlhdl PaBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=dvuymzBs; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id l6-v6si8392762plt.497.2018.07.23.01.33.58; Mon, 23 Jul 2018 01:34:13 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=dvuymzBs; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388071AbeGWJdI (ORCPT + 99 others); Mon, 23 Jul 2018 05:33:08 -0400 Received: from mail-oi0-f66.google.com ([209.85.218.66]:34872 "EHLO mail-oi0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387951AbeGWJdI (ORCPT ); Mon, 23 Jul 2018 05:33:08 -0400 Received: by mail-oi0-f66.google.com with SMTP id i12-v6so31828994oik.2 for ; Mon, 23 Jul 2018 01:33:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=vEZ5u5lLuA3wdBrZQHJCIXDcSaJIvAOxhJFHkM2/6YY=; b=dvuymzBslJ9/1UpWL+Z/499EpQAHCdqA2KbaVaOR1xiXCOaakEhhs+//4y4+oVWmel yXASLWMPI2ToLAVLFLqINjOnLg6VsZSIRvtdBq0CDd1aYQZLUkWjUWp2+HOzDpMXV6ua SkCspufhF/wXcGMSb4DOqsmHZibbrSLvz/F8c= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=vEZ5u5lLuA3wdBrZQHJCIXDcSaJIvAOxhJFHkM2/6YY=; b=sFlr/VKYeQKfqrRd/FTGk2VnO4lRtBsKsQ/YtlEMMcBuCHI55vU9bCVhsdedINJMKh gnpcZ0NOThaWr0PvY7LqnLtwn7yvSLEYJOAPdj7I+GHV7TespQH/fSmASDrObNBiPCwq XK1q4r+HXH3LUdozoNwQQCI/gdO/CCA9j/58QTKrkpC1jAGC+teJH2LwIRmAouGmAovs IxQGxWrEs/d4E+UYMb8SCh4cj7PYRzSr2XMsDvNaFBQbymsatqm4QCubc0e6/q6p479j emrWJQUYLokpxU64myYJOmd5l02P8MuHBfxI/6FAF7FcYR/BMs9hfQFVqrV6v0by45oT NBqg== X-Gm-Message-State: AOUpUlGT0aixfd3/KT4735LclN3MTOoqLKwpsXfGH0ETne4LEnejgI6M TY2o0LjqMuBcdGTXYAMf3AxrV87xhKvwUJIybcs6YA== X-Received: by 2002:aca:afcc:: with SMTP id y195-v6mr8035843oie.322.1532334783590; Mon, 23 Jul 2018 01:33:03 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a9d:6043:0:0:0:0:0 with HTTP; Mon, 23 Jul 2018 01:33:02 -0700 (PDT) In-Reply-To: <20180720142728.572dd2d7@t450s.home> References: <1531457777-11825-1-git-send-email-srinath.mannam@broadcom.com> <20180717092207.775e381c@t450s.home> <20180718152531.60f7df6b@t450s.home> <20180719091252.749f2433@t450s.home> <20180720142728.572dd2d7@t450s.home> From: Srinath Mannam Date: Mon, 23 Jul 2018 14:03:02 +0530 Message-ID: Subject: Re: [RFC PATCH] vfio/pci: map prefetchble bars as writecombine To: Alex Williamson Cc: Ray Jui , Vikram Prakash , Scott Branden , kvm@vger.kernel.org, BCM Kernel Feedback , Linux Kernel Mailing List , aik@ozlabs.ru, David Gibson , benh@kernel.crashing.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 21, 2018 at 1:57 AM, Alex Williamson wrote: > On Thu, 19 Jul 2018 21:49:48 +0530 > Srinath Mannam wrote: > >> HI Alex, >> >> On Thu, Jul 19, 2018 at 8:42 PM, Alex Williamson >> wrote: >> > On Thu, 19 Jul 2018 20:17:11 +0530 >> > Srinath Mannam wrote: >> > >> >> HI Alex, >> >> >> >> On Thu, Jul 19, 2018 at 2:55 AM, Alex Williamson >> >> wrote: >> >> > On Thu, 19 Jul 2018 00:05:18 +0530 >> >> > Srinath Mannam wrote: >> >> > >> >> >> Hi Alex, >> >> >> >> >> >> On Tue, Jul 17, 2018 at 8:52 PM, Alex Williamson >> >> >> wrote: >> >> >> > On Fri, 13 Jul 2018 10:26:17 +0530 >> >> >> > Srinath Mannam wrote: >> >> >> > >> >> >> >> By default all BARs map with VMA access permissions >> >> >> >> as pgprot_noncached. >> >> >> >> >> >> >> >> In ARM64 pgprot_noncached is MT_DEVICE_nGnRnE which >> >> >> >> is strongly ordered and allows aligned access. >> >> >> >> This type of mapping works for NON-PREFETCHABLE bars >> >> >> >> containing EP controller registers. >> >> >> >> But it restricts PREFETCHABLE bars from doing >> >> >> >> unaligned access. >> >> >> >> >> >> >> >> In CMB NVMe drives PREFETCHABLE bars are required to >> >> >> >> map as MT_NORMAL_NC to do unaligned access. >> >> >> >> >> >> >> >> Signed-off-by: Srinath Mannam >> >> >> >> Reviewed-by: Ray Jui >> >> >> >> Reviewed-by: Vikram Prakash >> >> >> >> --- >> >> >> > >> >> >> > This has been discussed before: >> >> >> > >> >> >> > https://www.spinics.net/lists/kvm/msg156548.html >> >> >> Thank you for inputs.. I have gone through the long list of mail chain >> >> >> discussion. >> >> >> > >> >> >> > CC'ing the usual suspects from the previous thread. I'm not convinced >> >> >> > that the patch here has considered anything other than the ARM64 >> >> >> > implications and it's not clear that it considers compatibility with >> >> >> > existing users or devices at all. Can we guarantee for all devices and >> >> >> > use cases that WC is semantically equivalent and preferable to UC? If >> >> >> > not then we need to device an extension to the interface that allows >> >> >> > the user to specify WC. Thanks, >> >> >> > >> >> >> To implement with user specified WC flags, many changes need to be done. >> >> >> Suppose In DPDK, prefetcable BARs map using WC flag, then also same >> >> >> question comes >> >> >> that WC may be different for different CPUs. >> >> >> As per functionality, both WC and PREFETCHABLE are same, like merging writes and >> >> >> typically WC is uncached. >> >> >> So, based on prefetchable BARs behavior and usage we need to map bar memory. >> >> >> Is it right to map prefetchable BARs as strongly ordered, aligned >> >> >> access and uncached? >> >> > >> >> > Is it possible to answer that question generically? Whether to map a >> >> > BAR as UC or WC is generally a question for the driver. Does the >> >> > device handle unaligned accesses? Does the device need strong memory >> >> > ordering? If this is a driver level question then the driver that >> >> > needs to make that decision is the userspace driver. VFIO is just a >> >> > pass-through here and since we don't offer the user a choice of >> >> > mappings, we take the safer and more conservative mapping, ie. UC. >> >> > >> >> Yes, you are right, driver should make the decision based on its requirement. >> >> In my case, user space driver is part of SPDK, so SPDK should request DPDK >> >> and DPDK should request VFIO to map BAR for its choice of mapping. >> >> So to implement this we need code changes in VFIO, DPDK and SPDK. >> >> >> >> > You're suggesting that there are many changes to be done if we modify >> >> > the vfio interface to expose WC under the user's control rather than >> >> > simply transparently impose WC for all mappings, but is that really the >> >> > case? Most devices on most platforms seem to work fine now. Perhaps WC >> >> > is a performance optimization, but this is the first instance I've seen >> >> > of it as a functional issue. Does that suggest that the imposed >> >> > alignment on your platform is perhaps unique and the relaxed alignment >> >> > should be implemented at the architecture specific memory flags for UC >> >> > mappings? For instance, does x86 require this change for the same >> >> > device? The chance for regressions of other devices on other platforms >> >> > seems rather high as proposed. Thanks, >> >> This issue is not specific to platform or device. this is the requirement of >> >> CMB enabled NVMe cards. >> >> NVMe kernel driver already has support to map CMB bar as WC using >> >> ioremap_wc function. >> >> File: drivers/nvme/host/pci.c >> >> Function: nvme_map_cmb >> >> code: dev->cmb = ioremap_wc(pci_resource_start(pdev, bar) + offset, size); >> >> It means ioremap_wc is working with all platforms and WC map of >> >> perfetchable BARs does not harm. >> >> Same is required in SPDK NVMe driver also, without WC map it may work >> >> in x86 platform, but it does not work in ARM platforms. >> > >> > Doesn't this contradict your assertion that it's not specific to >> > platform or device? The device requires support for unaligned >> > accesses. The platform chooses to restrict unaligned accesses for >> > non-WC mappings while other platforms do not. The native driver can >> > still clearly have performance considerations for choosing to use WC >> > mappings, but it's still not clear to me that the functionality issue >> > isn't self inflicted by the platform definition of UC vs WC. Thanks, >> > >> Device allows both aligned and unaligned access.. so software have >> flexibility can do unaligned access. >> As per ARM64 platform, with UC map, memory access should be un-cached, >> aligned access and strongly order mapping. >> with WC map, memory access can be aligned/unaligned and un-cached. I >> thought this is the property of platform not issue. >> To allow software to do unaligned access of device memory, we need to >> use WC map of ARM64 platform case. >> In ARM platforms UC mapping is used to map controller registers which >> are 4 byte aligned exposed by non-prefetchable bars. >> Also prefetchable BARs allows write merging so I thought using WC map >> fulfills both write merging (add performance) and unaligned >> access. >> Can I modify the code as below to enable only for ARM platforms. >> vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); >> +#if defined(CONFIG_ARM) || defined(CONFIG_ARM64) >> + if (pci_resource_flags(pdev, index) & IORESOURCE_PREFETCH) >> + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); >> +#endif >> vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff; > > While the risk of regression is smaller by restricting this to ARM, I > don't think it's the right solution. What happens when a device > requires strict ordering? ARM now behaves differently than any other > architecture, that's not acceptable. Thanks, If strict ordering is required for prefetchable bars, driver software has to add barrier instructions. With this we can assume for prefetchable bars WC mapping should be fine? Regards, Srinath. > > Alex