Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp3784185rdb; Thu, 14 Sep 2023 02:27:56 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEVDgV7ZW3O5G2vG4bbJN3m/wBPhBB9he7iJAi6CHWmfhKnFOkBl9+ZysQELg45cnVs9Vu3 X-Received: by 2002:a05:6870:a54a:b0:1b0:5bf7:3bb6 with SMTP id p10-20020a056870a54a00b001b05bf73bb6mr5179864oal.28.1694683676462; Thu, 14 Sep 2023 02:27:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694683676; cv=none; d=google.com; s=arc-20160816; b=Tqy90NhhzKmpwy/tcwGNjIgn6lzwEKViUO2m4BtrvGAwB2O//36E7muO7sMj+3Xu5r 1wrrmHWi4Ig32sm8M87+ZXemDBuPQ6OZL5yzviawrTT0hn58KmfkWgXMjx5PpWFO6PNU bshh//vNcqUmk6oSepygpZS1qpF7Zb+bQt/fWPZ/slJBLrBtZegBV3l6xe5+puNue7UI iQotMSU0dOXYY2iFvhnuKKSrmHOh2cMeK+0UoMp+pErXEo3jcQRCVGFkmyaDIWwFI1bA V47UqAp1b3bkjbt/VIxteW9G1CwYlgC7PD+yBSUbcKSQ9nQ/bwlJWwo/+XROhTLNQSr5 NA5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :organization:references:in-reply-to:message-id:subject:cc:to:from :date:dkim-signature; bh=aocE0RWXAM7Bs/LfajRBHN3chE7EV9rAtURENd7rey8=; fh=uxLNaPoPRUzmkUc1KOgGwCEU/bTSjK8Yed2Up6ovXLY=; b=vbiS5sDys32YHS78OhotBcAymg2bhvupooJPikDo0BBQYm9+lquh53Z9HhIOms8CFG xHsJtlEVYl2HFV0W8Ns5B95oiUjXRMhbzUbo9hGinVi4wT/VcAAbzdx0BbcBx3pxdFmJ luZCYilGOMUSq6Ju9UaEfwa5B+V9MCeQrB6iLI9jYG8tRpNJVjHcyzehYwpqguH8kZKc Yw5py691L6Ile653YOdmaf2m85JyttoQplquoa3Gzy+37lFqfAetzQD2l9L6txlWZxNQ FGfG730ge1S3Zv2gMrIC+DkskwxRO81KmEK82sWyL+tn9sbvS1NuVrPfBs5Y2KqH1WK0 B0nw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=kjGSuN3C; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Return-Path: Received: from fry.vger.email (fry.vger.email. [2620:137:e000::3:8]) by mx.google.com with ESMTPS id bm18-20020a656e92000000b00573efae0ee1si1126613pgb.515.2023.09.14.02.27.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 02:27:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) client-ip=2620:137:e000::3:8; Authentication-Results: mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=kjGSuN3C; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id D98B2808F004; Thu, 14 Sep 2023 01:27:51 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236339AbjINI1s (ORCPT + 99 others); Thu, 14 Sep 2023 04:27:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235947AbjINI1q (ORCPT ); Thu, 14 Sep 2023 04:27:46 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6D4AA1FC0 for ; Thu, 14 Sep 2023 01:27:42 -0700 (PDT) Received: from localhost (unknown [IPv6:2a01:e0a:2c:6930:5cf4:84a1:2763:fe0d]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: bbrezillon) by madras.collabora.co.uk (Postfix) with ESMTPSA id 2E11D6607341; Thu, 14 Sep 2023 09:27:40 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1694680060; bh=fZX5Tn85U8haS5fhfXILHpsZMtnYGSKKCN9SMPHQfEM=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=kjGSuN3CnP+8AWLel1wPkVnVKRxAJQbYKZcWFnPVVEGur4p5ae3vyLRHQ51uJiVWj 8edB1aW26JCx7PGhyQYQsiVp/qTOO+sIKgGzzgHPD3JJ805LAknCQ2SrSziR+CZ1Tb ede+NvZTVYcSQBVdCHAy6+C7FcVeBSxTbCy6GHAFZRhQyIJtzblcGkDKQwLYW6ptQ+ luFDIXzk3YOw7LVAtukFgEwRqnAhTJ4fLOboGcFhYkEp13Ns9A4GQyhQ02TsCF5nYH lybP283KJbJJb+0edRyhfkMK6uWZn0BfPP0/RQXcxIyJL4vVME75rzdKr0JhH6VrrZ 6VpeSa/p2nd+w== Date: Thu, 14 Sep 2023 10:27:37 +0200 From: Boris Brezillon To: Dmitry Osipenko Cc: David Airlie , Gerd Hoffmann , Gurchetan Singh , Chia-I Wu , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Christian =?UTF-8?B?S8O2bmln?= , Qiang Yu , Steven Price , Emma Anholt , Melissa Wen , dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, kernel@collabora.com, virtualization@lists.linux-foundation.org Subject: Re: [PATCH v16 15/20] drm/shmem-helper: Add memory shrinker Message-ID: <20230914102737.08e61498@collabora.com> In-Reply-To: <21dda0bd-4264-b480-dbbc-29a7744bc96c@collabora.com> References: <20230903170736.513347-1-dmitry.osipenko@collabora.com> <20230903170736.513347-16-dmitry.osipenko@collabora.com> <20230905100306.3564e729@collabora.com> <26f7ba6d-3520-0311-35e2-ef5706a98232@collabora.com> <20230913094832.3317c2df@collabora.com> <20230914093626.19692c24@collabora.com> <21dda0bd-4264-b480-dbbc-29a7744bc96c@collabora.com> Organization: Collabora X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Thu, 14 Sep 2023 01:27:52 -0700 (PDT) X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email On Thu, 14 Sep 2023 10:50:32 +0300 Dmitry Osipenko wrote: > On 9/14/23 10:36, Boris Brezillon wrote: > > On Thu, 14 Sep 2023 07:02:52 +0300 > > Dmitry Osipenko wrote: > > > >> On 9/13/23 10:48, Boris Brezillon wrote: > >>> On Wed, 13 Sep 2023 03:56:14 +0300 > >>> Dmitry Osipenko wrote: > >>> > >>>> On 9/5/23 11:03, Boris Brezillon wrote: > >>>>>> * But > >>>>>> + * acquiring the obj lock in drm_gem_shmem_release_pages_locked() can > >>>>>> + * cause a locking order inversion between reservation_ww_class_mutex > >>>>>> + * and fs_reclaim. > >>>>>> + * > >>>>>> + * This deadlock is not actually possible, because no one should > >>>>>> + * be already holding the lock when drm_gem_shmem_free() is called. > >>>>>> + * Unfortunately lockdep is not aware of this detail. So when the > >>>>>> + * refcount drops to zero, don't touch the reservation lock. > >>>>>> + */ > >>>>>> + if (shmem->got_pages_sgt && > >>>>>> + refcount_dec_and_test(&shmem->pages_use_count)) { > >>>>>> + drm_gem_shmem_do_release_pages_locked(shmem); > >>>>>> + shmem->got_pages_sgt = false; > >>>>>> } > >>>>> Leaking memory is the right thing to do if pages_use_count > 1 (it's > >>>>> better to leak than having someone access memory it no longer owns), but > >>>>> I think it's worth mentioning in the above comment. > >>>> > >>>> It's unlikely that it will be only a leak without a following up > >>>> use-after-free. Neither is acceptable. > >>> > >>> Not necessarily, if you have a page leak, it could be that the GPU has > >>> access to those pages, but doesn't need the GEM object anymore > >>> (pages are mapped by the iommu, which doesn't need shmem->sgt or > >>> shmem->pages after the mapping is created). Without a WARN_ON(), this > >>> can go unnoticed and lead to memory corruptions/information leaks. > >>> > >>>> > >>>> The drm_gem_shmem_free() could be changed such that kernel won't blow up > >>>> on a refcnt bug, but that's not worthwhile doing because drivers > >>>> shouldn't have silly bugs. > >>> > >>> We definitely don't want to fix that, but we want to complain loudly > >>> (WARN_ON()), and make sure the risk is limited (preventing memory from > >>> being re-assigned to someone else by not freeing it). > >> > >> That's what the code did and continues to do here. Not exactly sure what > >> you're trying to say. I'm going to relocate the comment in v17 to > >> put_pages(), we can continue discussing it there if I'm missing yours point. > >> > > > > I'm just saying it would be worth mentioning that we're intentionally > > leaking memory if shmem->pages_use_count > 1. Something like: > > > > /** > > * shmem->pages_use_count should be 1 when ->sgt != NULL and > > * zero otherwise. If some users still hold a pages reference > > * that's a bug, and we intentionally leak the pages so they > > * can't be re-allocated to someone else while the GPU/CPU > > * still have access to it. > > */ > > drm_WARN_ON(drm, > > refcount_read(&shmem->pages_use_count) == (shmem->sgt ? 1 : 0)); > > if (shmem->sgt && refcount_dec_and_test(&shmem->pages_use_count)) > > drm_gem_shmem_free_pages(shmem); > > That may be acceptable, but only once there will a driver using this > feature. Which feature? That's not related to a specific feature, that's just how drm_gem_shmem_get_pages_sgt() works, it takes a pages ref that can only be released in drm_gem_shmem_free(), because sgt users are not refcounted and the sgt stays around until the GEM object is freed or its pages are evicted. The only valid cases we have at the moment are: - pages_use_count == 1 && sgt != NULL - pages_use_count == 0 any other situations are buggy.