Date: Wed, 02 Jun 2021 10:37:58 +0100
Message-ID: <878s3s1ua1.wl-maz@kernel.org>
From: Marc Zyngier
To: Shanker R Donthineni
Cc: Catalin Marinas, Will Deacon, Vikram Sethi, Alex Williamson, Mark Kettenis, "christoffer.dall@arm.com", "linux-arm-kernel@lists.infradead.org", "kvmarm@lists.cs.columbia.edu", "linux-kernel@vger.kernel.org", "kvm@vger.kernel.org", Jason Sequeira
Subject: Re: [RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA
In-Reply-To: <273ba1c2-dfe6-7dc1-3e40-03398e82469b@nvidia.com>

Hi Shanker,

On Sat, 08 May 2021 17:33:11 +0100,
Shanker R Donthineni wrote:
>
> Hi Marc,
>
> On 5/5/21 1:02 PM, Catalin Marinas wrote:
> >>> Will/Catalin, perhaps you could explain your thought process on why you chose
> >>> Normal NC for ioremap_wc on the armv8 linux port instead of Device GRE or other
> >>> Device Gxx.
> >> I think a combination of: compatibility with 32-bit Arm, the need to
> >> support unaligned accesses and the potential for higher performance.
> > IIRC the _wc suffix also matches the pgprot_writecombine() used by some
> > drivers to map a video framebuffer into user space. Accesses to the
> > framebuffer are not guaranteed to be aligned (memset/memcpy don't ensure
> > alignment on arm64 and the user doesn't have a memset_io or memcpy_toio).
> >
> >> Furthermore, ioremap() already gives you a Device memory type, and we're
> >> tight on MAIR space.
> > We have MT_DEVICE_GRE currently reserved though no in-kernel user, we
> > might as well remove it.
> @Marc, Could you provide your thoughts/guidance for the next step? The
> proposal of getting hints for prefetchable regions from VFIO/QEMU is not
> recommended, The only option left is to implement ARM64 dependent logic
> in KVM.
>
> Option-1: I think we could take advantage of stage-1/2 combining rules to
> allow NORMAL_NC memory-type for device memory in VM. Always map
> device memory at stage-2 as NORMAL-NC and trust VM's stage-1 MT.
>
> ---------------------------------------------------------------
> Stage-2 MT     Stage-1 MT    Resultant MT (combining-rules/FWB)
> ---------------------------------------------------------------
> Normal-NC      Normal-WT           Normal-NC
>    -           Normal-WB              -
>    -           Normal-NC              -
>    -           Device-       Device-
> ---------------------------------------------------------------

I think this is unwise. Will recently debugged a pretty horrible
situation when doing exactly that: when S1 is off and S2 is on, the
I-side is allowed to generate speculative accesses (see ARMv8 ARM G.a
D5.2.9 for the details). And yes, implementations definitely do that.
Add side-effect reads to the mix, and you're in for a treat.

> We've been using this option internally for testing purpose and
> validated with NVME/Mellanox/GPU pass-through devices on
> Marvell-Thundex2 platform.

See above. It *will* break eventually.

> Option-2: Get resource properties associated with MMIO using lookup_resource()
> and map at stage-2 as Normal-NC if IORESOURCE_PREFETCH is set in flags.

That's a pretty roundabout way of doing exactly the same thing you
initially proposed.
And it suffers from the exact same problem, which is
that you change the semantics of the mapping without knowing what the
guest's intent is.

	M.

-- 
Without deviation from the norm, progress is not possible.
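
To make the Option-1 table quoted above concrete, here is a minimal sketch of
the stage-1/stage-2 combining rule it relies on. It assumes the classic
(non-FWB) behaviour, where the more restrictive of the two attributes wins.
The names MemType, combine and option1_table are hypothetical and are not
kernel or architecture definitions, and all Device variants are collapsed
into a single value for brevity.

```python
from enum import IntEnum


class MemType(IntEnum):
    """Hypothetical labels, ordered from most to least restrictive."""
    DEVICE = 0      # assumption: every Device variant folded into one value
    NORMAL_NC = 1
    NORMAL_WT = 2
    NORMAL_WB = 3


def combine(stage1: MemType, stage2: MemType) -> MemType:
    """Return the more restrictive attribute (sketch of non-FWB combining)."""
    return min(stage1, stage2)


def option1_table():
    """Rows of (stage-2, stage-1, resultant) with stage-2 fixed to Normal-NC."""
    stage2 = MemType.NORMAL_NC
    stage1_types = (MemType.NORMAL_WT, MemType.NORMAL_WB,
                    MemType.NORMAL_NC, MemType.DEVICE)
    return [(stage2.name, s1.name, combine(s1, stage2).name)
            for s1 in stage1_types]


if __name__ == "__main__":
    for row in option1_table():
        print("{:<12}{:<12}{}".format(*row))
```

The sketch only shows which attribute results from the combination. It does
not model the case raised in the reply, where stage 1 is off and stage 2 is
on, so it says nothing about the speculative-access problem described there.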