Date: Mon, 03 May 2021 11:17:23 +0100
Message-ID: <87bl9sunnw.wl-maz@kernel.org>
From: Marc Zyngier
To: Vikram Sethi
Cc: Shanker Donthineni, Alex Williamson, Will Deacon, Catalin Marinas,
    Christoffer Dall, "linux-arm-kernel@lists.infradead.org",
    "kvmarm@lists.cs.columbia.edu", "linux-kernel@vger.kernel.org",
    "kvm@vger.kernel.org", Jason Sequeira
Subject: Re: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA

Hi Vikram,

On Sun, 02 May 2021 18:56:31 +0100,
Vikram Sethi wrote:
>
> Hi Marc,
>
> > From: Marc Zyngier
> > Hi Vikram,
> >
>
> > The problem I see is that we have VM and userspace being written in terms
> > of Write-Combine, which is:
> >
> > - loosely defined even on x86
> >
> > - subject to interpretations in the way it maps to PCI
> >
> > - has no direct equivalent in the ARMv8 collection of memory
> >   attributes (and Normal_NC comes with speculation capabilities which
> >   strikes me as extremely undesirable on arbitrary devices)
>
> If speculation with Normal NC to prefetchable BARs in devices was a
> problem, those devices would already be broken in baremetal with
> ioremap_wc on arm64, and we would need quirks there to not do Normal
> NC for them but Device GRE, and if such a quirk was needed on
> baremetal, it could be picked up by vfio/KVM as
> well. But we haven't
> seen any broken devices doing wc on baremetal on ARM64, have we?

The lack of evidence does not equate to a proof, and your devices not
misbehaving doesn't mean it is the right thing, especially when we have
such a wide range of CPU and interconnect implementations. Which is why
I really want an answer at the architecture level. Not an "it works for
me" type of answer.

Furthermore, as I replied to Shanker in a separate email, what
Linux/arm64 does is pretty much irrelevant. KVM/arm64 implements the
ARMv8 architecture, and it is at that level that we need to solve the
problem.

If, by enumerating the properties of Prefetchable, you can show that
they are a strict superset of Normal_NC, I'm on board. I haven't seen
such an enumeration so far.

> I know we have tested NICs write combining on arm64 in baremetal, as
> well as GPU and NVMe CMB without issues.
>
> Further, I don't see why speculation to non-cacheable would be an
> issue if prefetch without side effects is allowed by the device,
> which is what a prefetchable BAR is.
> If it is an issue for a device I would consider that a bug already needing a quirk in
> Baremetal/host kernel already.
> From PCI spec " A prefetchable address range may have write side effects,
> but it may not have read side effects."

Right, so we have made a small step in the direction of mapping
"prefetchable" onto "Normal_NC", thanks for that. What about all the
other properties (unaligned accesses, ordering, gathering)?

> > How do we translate this into something consistent? I'd like to see an actual
> > description of what we *really* expect from WC on prefetchable PCI regions,
> > turn that into a documented definition agreed across architectures, and then
> > we can look at implementing it with one memory type or another on arm64.
> >
> > Because once we expose that memory type at S2 for KVM guests, it
> > becomes ABI and there is no turning back. So I want to get it right once and
> > for all.
> >
>
> I agree that we need a precise definition for the Linux ioremap_wc
> API wrt what drivers (kernel and userspace) can expect and whether
> memset/memcpy is expected to work or not and whether aligned
> accesses are a requirement.
> To the extent ABI is set, I would think that the ABI is also already
> set in the host kernel for arm64 WC = Normal NC, so why should that
> not also be the ABI for same driver in VMs.

KVM is an implementation of the ARM architecture, and doesn't really
care about what WC is. If we come to the conclusion that Normal_NC is
the natural match for Prefetchable attributes, then we're good and we
can have Normal_NC being set by userspace, or even VFIO. But I don't
want to set it only because "it works when bare-metal Linux uses it".
Remember KVM doesn't only run Linux as guests.

	M.

-- 
Without deviation from the norm, progress is not possible.