References: <20220729170744.1301044-1-robdclark@gmail.com>
 <20220729170744.1301044-2-robdclark@gmail.com>
 <3d2083aa-fc6c-6875-3daf-e5abe45fb762@gmail.com>
 <973de2f8-75e4-d4c7-a13a-c541a6cf7c77@amd.com>
 <2fc74efe-220f-b57a-e804-7d2b3880d14f@gmail.com>
 <4e7448d2-7b26-e260-3d6c-7aa263a75250@amd.com>
In-Reply-To: <4e7448d2-7b26-e260-3d6c-7aa263a75250@amd.com>
From: Rob Clark
Date: Mon, 8 Aug 2022 09:29:52 -0700
Subject: Re: [Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info
To: Christian König
Cc: Rob Clark, Christian König, dri-devel@lists.freedesktop.org,
 Daniel Vetter, freedreno@lists.freedesktop.org, Sumit Semwal,
 Jérôme Pouiller, "open list:DMA BUFFER SHARING FRAMEWORK",
 "moderated list:DMA BUFFER SHARING FRAMEWORK", open list
X-Mailing-List:
 linux-kernel@vger.kernel.org

On Mon, Aug 8, 2022 at 7:56 AM Christian König wrote:
>
> On 08.08.22 at 15:26, Rob Clark wrote:
> > On Mon, Aug 8, 2022 at 4:22 AM Christian König wrote:
> >
> > [SNIP]
> >>>> If the virtio/virtgpu UAPI was built around the idea that this is
> >>>> possible then it is most likely fundamentally broken.
> >>> How else can you envision mmap'ing to guest userspace working?
> >> Well long story short: You can't.
> >>
> >> See userspace mappings are not persistent, but rather faulted in on
> >> demand. The exporter is responsible for setting those up to be able to
> >> add reverse tracking and so can invalidate those mappings when the
> >> backing store changes.
> > I think that is not actually a problem. At least for how it works on
> > arm64 but I'm almost positive x86 is similar.. I'm not sure how else
> > you could virtualize mmu/iommu/etc in a way that didn't have horrible
> > performance.
> >
> > There are two levels of pagetable translation, the first controlled by
> > the host kernel, the second by the guest. From the PoV of host
> > kernel, it is just memory mapped to userspace, getting faulted in on
> > demand, just as normal. First the guest controlled translation
> > triggers a fault in the guest which sets up guest mapping. And then
> > the second level of translation to translate from what guest sees as
> > PA (but host sees as VA) to actual PA triggers a fault in the host.
>
> Ok, that's calming.
>
> At least that's not the approach talked about the last time this came up
> and it doesn't rip a massive security hole somewhere.

Hmm, tbh I'm not sure which thread/discussion this was.. it could have
been before I was paying much attention to the vm use-case

> The question is why is the guest then not using the caching attributes
> setup by the host page tables when the translation is forwarded anyway?

The guest kernel itself doesn't know.
AFAICT, at least on arm, the hw will combine the attributes of the
mapping in S1 and S2 pagetables and use the most restrictive. So if S1
(host) is cached but S2 (guest) is WC, you'll end up w/ WC.

That said, at least on aarch64, it seems like we could always tell the
guest it is cached, and if mapped WC in S1 you'll end up with WC
access. But this seems to depend on an optional feature, FWB, which
allows S2 to override S1 attributes, not being enabled. And I'm not
entirely sure how it works on x86.

BR,
-R

> > [SNIP]
> > This is basically what happens, although via the two levels of pgtable
> > translation. This patch provides the missing piece, the caching
> > attributes.
>
> Yeah, but that won't work like this. See the backing store migrates all
> the time and when it is backed by PCIe/VRAM/local memory you need to use
> write combine while system memory is usually cached.
>
> >> Because otherwise you can't accommodate that the exporter is
> >> changing those caching attributes.
> > Changing the attributes dynamically isn't going to work.. or at least
> > not easily. If you had some sort of synchronous notification to host
> > userspace, it could trigger an irq to the guest, I suppose. But it
> > would mean host kernel has to block waiting for host userspace to
> > interrupt the guest, then wait for guest vgpu process to be scheduled
> > and handle the irq.
>
> We basically change that on every page flip on APUs and that doesn't
> sound like something fast.
>
> Thanks for the explanation of how this works,
> Christian.
>
> >
> > At least in the case of msm, the cache attributes are static for the
> > life of the buffer, so this scenario isn't a problem. AFAICT this
> > should work fine for at least all UMA hw.. I'm a bit less sure when it
> > comes to TTM, but shouldn't you at least be able to use worst-case
> > cache attributes for buffers that are allowed to be mapped to guest?
> >
> > BR,
> > -R
> >
> >>> But more seriously, let's take a step back here..
> >>> what scenarios are you seeing this being problematic for? Then we
> >>> can see how to come up with solutions. The current situation of
> >>> host userspace VMM just guessing isn't great.
> >> Well "isn't great" is a complete understatement. When KVM/virtio/virtgpu
> >> is doing what I guess they are doing here then that is a really major
> >> security hole.
> >>
> >>> And sticking our heads in the sand and
> >>> pretending VMs don't exist isn't great. So what can we do? I can
> >>> instead add a msm ioctl to return this info and solve the problem even
> >>> more narrowly for a single platform. But then the problem still
> >>> remains on other platforms.
> >> Well once more: This is *not* MSM specific, you just absolutely *can't
> >> do that* for any driver!
> >>
> >> I'm just really wondering what the heck is going on here, because all of
> >> this was discussed at length before on the mailing list and very
> >> bluntly rejected.
> >>
> >> Either I'm missing something (that's certainly possible) or we have a
> >> strong case of somebody implementing something without thinking about
> >> all the consequences.
> >>
> >> Regards,
> >> Christian.
> >>
> >>
> >>> Slightly implicit in this is that mapping dma-bufs to the guest won't
> >>> work for anything that requires DMA_BUF_IOCTL_SYNC for coherency.. we
> >>> could add a possible return value for DMA_BUF_INFO_VM_PROT indicating
> >>> that the buffer does not support mapping to guest or CPU access
> >>> without DMA_BUF_IOCTL_SYNC. Then at least the VMM can fail gracefully
> >>> instead of subtly.
> >>>
> >>> BR,
> >>> -R
>
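
For illustration only, the "hw combines the attributes of both stages
and uses the most restrictive" behaviour described in the reply could
be modelled roughly as below. This is a simplified sketch, not how any
real hw or KVM encodes memory attributes: the three-value enum,
combine_attrs() and attr_name() are made-up names for this example, and
the FWB override case is deliberately not modelled.

```c
#include <assert.h>

/*
 * Hypothetical, simplified memory types, ordered from least to most
 * restrictive. Real hardware encodes many more combinations.
 */
enum mem_attr {
	ATTR_CACHED = 0,	/* normal, write-back cacheable */
	ATTR_WC     = 1,	/* write-combine (normal, non-cacheable) */
	ATTR_DEVICE = 2,	/* device / uncached */
};

/* combine_attrs() relies on the enum being ordered by restrictiveness. */
static_assert(ATTR_CACHED < ATTR_WC && ATTR_WC < ATTR_DEVICE,
	      "attributes must be ordered by restrictiveness");

/*
 * Effective attribute when both translation stages are combined and the
 * most restrictive one wins (assumes no FWB-style override is active).
 */
enum mem_attr combine_attrs(enum mem_attr guest_attr, enum mem_attr host_attr)
{
	return guest_attr > host_attr ? guest_attr : host_attr;
}

/* Human-readable name, e.g. for debug logging. */
const char *attr_name(enum mem_attr attr)
{
	switch (attr) {
	case ATTR_CACHED:
		return "cached";
	case ATTR_WC:
		return "WC";
	case ATTR_DEVICE:
		return "device";
	}
	return "unknown";
}
```

With this model, combine_attrs(ATTR_CACHED, ATTR_WC) gives ATTR_WC,
which is the "always tell the guest it is cached, map WC on the host
side" case: the guest-side setting never loosens what the host mapped.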