From: Rob Clark
Date: Wed, 9 Mar 2022 13:51:42 -0800
Subject: Re: [PATCH v1 0/5] Add memory shrinker to VirtIO-GPU DRM driver
To: Dmitry Osipenko
Cc: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
 Daniel Vetter, Daniel Almeida, Gert Wollny, Tomeu Vizoso,
 Linux Kernel Mailing List, "open list:VIRTIO GPU DRIVER",
 Gustavo Padovan, dri-devel, Dmitry Osipenko, Rob Clark
References: <20220308131725.60607-1-dmitry.osipenko@collabora.com>
 <42facae5-8f2c-9c1f-5144-4ebb99c798bd@collabora.com>
 <05e1fe61-1c29-152f-414b-cd6a44525af0@collabora.com>
In-Reply-To: <05e1fe61-1c29-152f-414b-cd6a44525af0@collabora.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 9, 2022 at 12:06 PM Dmitry Osipenko wrote:
>
> On 3/9/22 03:56, Rob Clark wrote:
> >> If we really can't track
> >> madvise state in the guest for dealing with
> >> host memory pressure, I think the better option is to introduce
> >> MADV:WILLNEED_REPLACE, i.e. something to tell the host kernel that the
> >> buffer is needed but the previous contents are not (as long as the GPU
> >> VA remains the same). With this the host could allocate new pages if
> >> needed, and the guest would not need to wait for a reply from host.
> > If the variant with memory ballooning works, then it will be
> > possible to track the state within the guest only. Let's consider the
> > simplest variant for now.
> >
> > I'll try to implement the balloon driver support in the v2 and will get
> > back to you.
> >
>
> I looked at the generic balloon driver, and it looks like this is not
> what we want because:
>
> 1. Memory ballooning is primarily about handling memory overcommit
> situations, i.e. when there are multiple VMs consuming more memory than
> is available in the system. Ballooning allows the host to ask a guest to
> give unused pages back, so that the host can give them to other VMs.
>
> 2. Memory ballooning operates on guest memory pages only, i.e. each
> ballooned page is reported to/from the host in the form of the page's
> DMA address.
>
> 3. There is no direct connection between the host's OOM events and the
> balloon manager. I guess the host could watch the system's memory
> pressure and inflate the VMs' balloons on low memory, releasing the
> guests' memory to the system, but apparently this use case is not
> supported by anyone today; at least I don't see QEMU supporting it.
>

hmm, on CrOS I do see balloon getting used to balance host vs guest
memory.. but admittedly I've not yet looked closely at how that works,
and it does seem like we have some things that are not yet upstream
all over the place (not to mention crosvm vs qemu)

>
> So the virtio-balloon driver isn't very useful for us as-is.
>
> One possible solution could be to create something like a new
> virtio-shrinker device or add shrinker functionality to the virtio-gpu
> device, allowing the host to ask guests to drop shared caches. The host
> should then become a PSI handler. I think this should be doable in the
> case of crosvm. In the case of the GNU world, it could take a lot of
> effort to get everything into an upstreamable state; first there is a
> need to demonstrate a real problem being solved by this solution.

I guess with 4GB chromebooks running one or more VMs in addition to
lots of browser tabs in the host, it shouldn't be too hard to
demonstrate a problem ;-)

(but also, however we end up solving that, it certainly shouldn't block
this series)

> The other minor issue is that only integrated GPUs may use the system's
> memory, and even then they could use a dedicated memory carveout, i.e.
> releasing VRAM BOs may not help with the host's OOM. In the case of a
> virgl context we have no clue about where buffers are physically
> located. On the other hand, in the worst case dropping host caches just
> won't help with OOM.

Userspace should know whether the BO has CPU storage, so I don't think
this should be a problem virtio_gpu needs to worry about

> It's now unclear how we should proceed with the host-side shrinker
> support. Thoughts?
>
> We may start easy and, instead of thinking about a host-side shrinker,
> we could make the VirtIO-GPU driver expire cached BOs after a certain
> timeout. Mesa already uses timeout-based BO caching, but it doesn't have
> an alarm timer and simply checks expiration when a BO is allocated. It
> would be too much trouble to handle timers within Mesa since it's
> executed in application context; it's easier to do it in the kernel, as
> the VC4 driver does, for example. This is not as good as a proper memory
> shrinker, but could be good enough in practice.

I think that, given virgl uses host storage, a guest shrinker should
still be useful.. so I think we should continue with this series.

For the host shrinker, I'll have to look more at when crosvm triggers
balloon inflation. I could still go with the MADV:WILLNEED_REPLACE
approach instead, which does have the advantage of the host kernel not
relying on host userspace or the VM having a chance to run in order to
release pages. The downside (perhaps?) is it would be more specific
to virtgpu-native-context and less so to virgl or venus (but I guess
there doesn't currently exist a way for madvise to be useful for vk
drivers).

BR,
-R
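
For illustration only, the difference between the two timeout-based BO cache policies mentioned in the thread (expiry checked only at allocation time, as the thread says Mesa does, versus expiry driven by a timer, as the thread says the VC4 driver does) can be sketched with a toy model. This is a Python simulation under stated assumptions, not Mesa, VC4, or VirtIO-GPU code: the names `BOCache`, `CachedBO`, `release`, `expire`, and `alloc`, the 1-second timeout, and the exact-size reuse rule are all hypothetical, and time is passed in explicitly instead of using a real timer.

```python
from dataclasses import dataclass, field


@dataclass
class CachedBO:
    size: int        # buffer size in bytes
    freed_at: float  # time (seconds) at which the BO entered the cache


@dataclass
class BOCache:
    timeout: float                           # hypothetical expiry timeout, seconds
    entries: list = field(default_factory=list)

    def release(self, size, now):
        """Park a no-longer-used BO in the cache instead of freeing it."""
        self.entries.append(CachedBO(size, now))

    def expire(self, now):
        """Drop every cached BO older than the timeout; return bytes freed."""
        keep = [bo for bo in self.entries if now - bo.freed_at < self.timeout]
        freed = sum(bo.size for bo in self.entries) - sum(bo.size for bo in keep)
        self.entries = keep
        return freed

    def alloc(self, size, now):
        """Allocate a BO, reusing a cached one of the same size if possible.

        Stale entries are reclaimed here first, which models the lazy
        "check expiration when a BO is allocated" policy.
        """
        self.expire(now)
        for i, bo in enumerate(self.entries):
            if bo.size == size:
                del self.entries[i]
                return "reused"
        return "new"

    def cached_bytes(self):
        return sum(bo.size for bo in self.entries)


# Lazy policy: an idle application never calls alloc(), so the stale BO
# stays cached no matter how much time passes.
lazy = BOCache(timeout=1.0)
lazy.release(4096, now=0.0)
idle_bytes_lazy = lazy.cached_bytes()

# Timer policy: something outside the allocation path (a timer tick in
# this model) calls expire() even while the application is idle.
timed = BOCache(timeout=1.0)
timed.release(4096, now=0.0)
freed_by_timer = timed.expire(now=10.0)
idle_bytes_timed = timed.cached_bytes()
```

In this model the lazy cache still holds the 4096-byte buffer while the application is idle, whereas the timer-driven cache has released it. Neither policy reacts to actual memory pressure, which is the gap a proper shrinker would fill.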