Subject: Re: [RFC PATCH v3 5/6] dt-bindings: of: Add restricted DMA pool
To: Rob Herring
Cc: Claire Chang, Michael Ellerman, Benjamin Herrenschmidt,
 Paul Mackerras, Joerg Roedel, Will Deacon, Frank Rowand,
 Konrad Rzeszutek Wilk, Boris Ostrovsky, Juergen Gross,
 Stefano Stabellini, Christoph Hellwig, Marek Szyprowski, Grant Likely,
 Heinrich Schuchardt, Thierry Reding, Ingo Molnar,
 Thiago Jung Bauermann, Peter Zijlstra, Greg Kroah-Hartman,
 Saravana Kannan, "Wysocki, Rafael J", Heikki Krogerus, Andy Shevchenko,
 Randy Dunlap, Dan Williams, Bartosz Golaszewski,
 devicetree@vger.kernel.org, "linux-kernel@vger.kernel.org",
 linuxppc-dev, Linux IOMMU, xen-devel@lists.xenproject.org, Tomasz Figa,
 Nicolas Boichat
References:
 <20210106034124.30560-1-tientzu@chromium.org>
 <20210106034124.30560-6-tientzu@chromium.org>
 <20210120165348.GA220770@robh.at.kernel.org>
 <313f8052-a591-75de-c4c2-ee9ea8f02e7f@arm.com>
From: Robin Murphy
Date: Thu, 21 Jan 2021 01:09:56 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021-01-20 21:31, Rob Herring wrote:
> On Wed, Jan 20, 2021 at 11:30 AM Robin Murphy wrote:
>>
>> On 2021-01-20 16:53, Rob Herring wrote:
>>> On Wed, Jan 06, 2021 at 11:41:23AM +0800, Claire Chang wrote:
>>>> Introduce the new compatible string, restricted-dma-pool, for restricted
>>>> DMA. One can specify the address and length of the restricted DMA memory
>>>> region by restricted-dma-pool in the device tree.
>>>
>>> If this goes into DT, I think we should be able to use dma-ranges for
>>> this purpose instead. Normally, 'dma-ranges' is for physical bus
>>> restrictions, but there's no reason it can't be used for policy or to
>>> express restrictions the firmware has enabled.
>>
>> There would still need to be some way to tell SWIOTLB to pick up the
>> corresponding chunk of memory and to prevent the kernel from using it
>> for anything else, though.
>
> Don't we already have that problem if dma-ranges had a very small
> range? We just get lucky because the restriction is generally much
> more RAM than needed.

Not really - if a device has a naturally tiny addressing capability that
doesn't even cover ZONE_DMA32 where the regular SWIOTLB buffer will be
allocated then it's unlikely to work well, but that's just crap system
design. Yes, memory pressure in ZONE_DMA{32} is particularly problematic
for such limited devices, but it's irrelevant to the issue at hand here.

What we have here is a device that's not allowed to see *kernel* memory
at all. It's been artificially constrained to a particular region by a
TZASC or similar, and the only data which should ever be placed in that
region is data intended for that device to see. That way if it tries to
go rogue it physically can't start slurping data intended for other
devices or not mapped for DMA at all. The bouncing is an important part
of this - I forget the title off-hand but there was an interesting paper
a few years ago which demonstrated that even with an IOMMU, streaming
DMA of in-place buffers could reveal enough adjacent data from the same
page to mount an attack on the system. Memory pressure should be
immaterial since the size of each bounce pool carveout will presumably
be tuned for the needs of the given device.

> In any case, wouldn't finding all the dma-ranges do this? We're
> already walking the tree to find the max DMA address now.

If all you can see are two "dma-ranges" properties, how do you propose
to tell that one means "this is the extent of what I can address, please
set my masks and dma-range-map accordingly and try to allocate things
where I can reach them" while the other means "take this output range
away from the page allocator and hook it up as my dedicated bounce pool,
because it is Serious Security Time"? Especially since getting that
choice wrong either way would be a Bad Thing.

Robin.

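[Editorial illustration, not part of the original message.] The bouncing
described in the reply above can be pictured with a small user-space
sketch. It is not the SWIOTLB implementation: the pool size, the slot
size and the names restricted_pool, bounce_map and bounce_unmap are all
hypothetical, chosen only to show that data is copied into the carveout
before the device may see it, and copied back out and scrubbed
afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the carveout: the only memory the device may touch. */
#define POOL_SIZE 4096u
#define SLOT_SIZE 256u
#define NUM_SLOTS (POOL_SIZE / SLOT_SIZE)

uint8_t restricted_pool[POOL_SIZE];
uint8_t slot_in_use[NUM_SLOTS];

/* Copy a kernel-side buffer into a free pool slot; returns the slot index or -1. */
int bounce_map(const void *src, size_t len)
{
    if (len > SLOT_SIZE)
        return -1; /* request does not fit in one slot */
    for (unsigned i = 0; i < NUM_SLOTS; i++) {
        if (!slot_in_use[i]) {
            slot_in_use[i] = 1;
            memcpy(&restricted_pool[i * SLOT_SIZE], src, len);
            return (int)i;
        }
    }
    return -1; /* pool exhausted */
}

/* Copy device-visible data back out, then scrub the slot before it is reused. */
void bounce_unmap(int slot, void *dst, size_t len)
{
    uint8_t *p = &restricted_pool[(size_t)slot * SLOT_SIZE];

    memcpy(dst, p, len);
    memset(p, 0, SLOT_SIZE);
    slot_in_use[slot] = 0;
}
```

In this sketch the original buffer never needs to be reachable by the
device, so whatever sits next to it in the same page stays out of view;
only the slot inside the pool is exposed for the duration of the
transfer.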
>>>> Signed-off-by: Claire Chang
>>>> ---
>>>>  .../reserved-memory/reserved-memory.txt       | 24 +++++++++++++++++++
>>>>  1 file changed, 24 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>>>> index e8d3096d922c..44975e2a1fd2 100644
>>>> --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>>>> +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>>>> @@ -51,6 +51,20 @@ compatible (optional) - standard definition
>>>>            used as a shared pool of DMA buffers for a set of devices. It can
>>>>            be used by an operating system to instantiate the necessary pool
>>>>            management subsystem if necessary.
>>>> +        - restricted-dma-pool: This indicates a region of memory meant to be
>>>> +          used as a pool of restricted DMA buffers for a set of devices. The
>>>> +          memory region would be the only region accessible to those devices.
>>>> +          When using this, the no-map and reusable properties must not be set,
>>>> +          so the operating system can create a virtual mapping that will be used
>>>> +          for synchronization. The main purpose for restricted DMA is to
>>>> +          mitigate the lack of DMA access control on systems without an IOMMU,
>>>> +          which could result in the DMA accessing the system memory at
>>>> +          unexpected times and/or unexpected addresses, possibly leading to data
>>>> +          leakage or corruption. The feature on its own provides a basic level
>>>> +          of protection against the DMA overwriting buffer contents at
>>>> +          unexpected times. However, to protect against general data leakage and
>>>> +          system memory corruption, the system needs to provide way to restrict
>>>> +          the DMA to a predefined memory region.
>>>>          - vendor specific string in the form <vendor>,[<device>-]<usage>
>>>>  no-map (optional) - empty property
>>>>      - Indicates the operating system must not create a virtual mapping
>>>> @@ -120,6 +134,11 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
>>>>  			compatible = "acme,multimedia-memory";
>>>>  			reg = <0x77000000 0x4000000>;
>>>>  		};
>>>> +
>>>> +		restricted_dma_mem_reserved: restricted_dma_mem_reserved {
>>>> +			compatible = "restricted-dma-pool";
>>>> +			reg = <0x50000000 0x400000>;
>>>> +		};
>>>>  	};
>>>>
>>>>  	/* ... */
>>>> @@ -138,4 +157,9 @@ one for multimedia processing (named multimedia-memory@77000000, 64MiB).
>>>>  		memory-region = <&multimedia_reserved>;
>>>>  		/* ... */
>>>>  	};
>>>> +
>>>> +	pcie_device: pcie_device@0,0 {
>>>> +		memory-region = <&restricted_dma_mem_reserved>;
>>>
>>> PCI hosts often have inbound window configurations that limit the
>>> address range and translate PCI to bus addresses. Those windows happen
>>> to be configured by dma-ranges. In any case, wouldn't you want to put
>>> the configuration in the PCI host node? Is there a usecase of
>>> restricting one PCIe device and not another?
>>
>> The general design seems to accommodate devices having their own pools
>> such that they can't even snoop on each others' transient DMA data. If
>> the interconnect had a way of wiring up, say, PCI RIDs to AMBA NSAIDs,
>> then in principle you could certainly apply that to PCI endpoints too
>> (presumably you'd also disallow them from peer-to-peer transactions at
>> the PCI level too).
>
> At least for PCI, I think we can handle this. We have the BDF in the
> 3rd address cell in dma-ranges. The Openfirmware spec says those are 0
> in the case of ranges. It doesn't talk about dma-ranges though. But I
> think we could extend it to allow for BDF. Though typically with PCIe
> every device is behind its own bridge and each bridge node can have a
> dma-ranges.
>
> Rob
>
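
[Editorial illustration, not part of the original thread.] For readers
unfamiliar with how a BDF fits into a PCI address in the device tree: in
the PCI bus binding to IEEE 1275, the phys.hi cell of a three-cell PCI
address carries the bus, device and function numbers alongside the
address-space code. The sketch below decodes those fields. The struct
and function names are hypothetical, and whether dma-ranges would ever
carry a non-zero BDF is only a proposal in the quoted discussion, not an
existing binding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct pci_bdf {
    uint8_t bus;
    uint8_t dev;
    uint8_t fn;
};

/*
 * phys.hi layout per the PCI bus binding: npt000ss bbbbbbbb dddddfff rrrrrrrr
 * (bits 23:16 = bus, 15:11 = device, 10:8 = function, 7:0 = register).
 */
struct pci_bdf decode_phys_hi(uint32_t phys_hi)
{
    struct pci_bdf bdf = {
        .bus = (uint8_t)((phys_hi >> 16) & 0xff),
        .dev = (uint8_t)((phys_hi >> 11) & 0x1f),
        .fn  = (uint8_t)((phys_hi >> 8) & 0x07),
    };
    return bdf;
}
```

A ranges entry as used today has all three fields zero, so a decoder
like this would return 0/0/0 for it; a per-device entry of the kind
suggested above would be told apart by a non-zero result.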