From: Dmitry Osipenko
To: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Daniel Almeida, Gustavo Padovan, Daniel Stone,
	Tomeu Vizoso, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Rob Clark, Sumit Semwal, Christian König, Qiang Yu, Steven Price,
	Alyssa Rosenzweig, Rob Herring, Sean Paul, Dmitry Baryshkov,
	Abhinav Kumar
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	kernel@collabora.com, virtualization@lists.linux-foundation.org
Subject: [PATCH v10 01/11] drm/msm/gem: Prevent blocking within shrinker loop
Date: Mon, 9 Jan 2023 00:04:35 +0300
Message-Id: <20230108210445.3948344-2-dmitry.osipenko@collabora.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20230108210445.3948344-1-dmitry.osipenko@collabora.com>
References: <20230108210445.3948344-1-dmitry.osipenko@collabora.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Consider this scenario:

1. APP1 continuously creates lots of small GEMs
2. APP2 triggers `drop_caches`
3. Shrinker starts to evict APP1 GEMs, while APP1 produces new purgeable
   GEMs
4. msm_gem_shrinker_scan() returns a non-zero number of freed pages
   and causes the shrinker to try to shrink more
5. msm_gem_shrinker_scan() returns a non-zero number of freed pages again,
   goto 4
6. APP2 is blocked in `drop_caches` until APP1 stops producing
   purgeable GEMs

To prevent this blocking scenario, check the number of remaining pages
that the GPU shrinker couldn't release due to GEM locking contention
or shrinking rejection. If there are no remaining pages left to shrink,
then there is no need to free up more pages and the shrinker may break
out of the loop.

This problem was found during shrinker/madvise IOCTL testing of the
virtio-gpu driver. The MSM driver is affected in the same way.

Reviewed-by: Rob Clark
Fixes: b352ba54a820 ("drm/msm/gem: Convert to using drm_gem_lru")
Signed-off-by: Dmitry Osipenko
---
 drivers/gpu/drm/drm_gem.c              | 9 +++++++--
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 8 ++++++--
 include/drm/drm_gem.h                  | 4 +++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 59a0bb5ebd85..c6bca5ac6e0f 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1388,10 +1388,13 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail);
  *
  * @lru: The LRU to scan
  * @nr_to_scan: The number of pages to try to reclaim
+ * @remaining: The number of pages left to reclaim
  * @shrink: Callback to try to shrink/reclaim the object.
  */
 unsigned long
-drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+drm_gem_lru_scan(struct drm_gem_lru *lru,
+		 unsigned int nr_to_scan,
+		 unsigned long *remaining,
 		 bool (*shrink)(struct drm_gem_object *obj))
 {
 	struct drm_gem_lru still_in_lru;
@@ -1430,8 +1433,10 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
 		 * hit shrinker in response to trying to get backing pages
 		 * for this obj (ie. while it's lock is already held)
 		 */
-		if (!dma_resv_trylock(obj->resv))
+		if (!dma_resv_trylock(obj->resv)) {
+			*remaining += obj->size >> PAGE_SHIFT;
 			goto tail;
+		}
 
 		if (shrink(obj)) {
 			freed += obj->size >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 051bdbc093cf..b7c1242014ec 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -116,12 +116,14 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	};
 	long nr = sc->nr_to_scan;
 	unsigned long freed = 0;
+	unsigned long remaining = 0;
 
 	for (unsigned i = 0; (nr > 0) && (i < ARRAY_SIZE(stages)); i++) {
 		if (!stages[i].cond)
 			continue;
 		stages[i].freed =
-			drm_gem_lru_scan(stages[i].lru, nr, stages[i].shrink);
+			drm_gem_lru_scan(stages[i].lru, nr, &remaining,
+					 stages[i].shrink);
 		nr -= stages[i].freed;
 		freed += stages[i].freed;
 	}
@@ -132,7 +134,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 			     stages[3].freed);
 	}
 
-	return (freed > 0) ? freed : SHRINK_STOP;
+	return (freed > 0 && remaining > 0) ? freed : SHRINK_STOP;
 }
 
 #ifdef CONFIG_DEBUG_FS
@@ -182,10 +184,12 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
 		NULL,
 	};
 	unsigned idx, unmapped = 0;
+	unsigned long remaining = 0;
 
 	for (idx = 0; lrus[idx] && unmapped < vmap_shrink_limit; idx++) {
 		unmapped += drm_gem_lru_scan(lrus[idx],
 					     vmap_shrink_limit - unmapped,
+					     &remaining,
 					     vmap_shrink);
 	}
 
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 772a4adf5287..f1f00fc2dba6 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -476,7 +476,9 @@ int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
 void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock);
 void drm_gem_lru_remove(struct drm_gem_object *obj);
 void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj);
-unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
+			       unsigned int nr_to_scan,
+			       unsigned long *remaining,
 			       bool (*shrink)(struct drm_gem_object *obj));
 
 #endif /* __DRM_GEM_H__ */
-- 
2.38.1
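
For readers who want to see the scenario from the commit message outside
the kernel, here is a minimal userspace sketch. It is only an illustration
under stated assumptions, not the driver code: struct toy_lru and the
toy_*() functions are hypothetical, the LRU is reduced to two page
counters, and the caller loop is a simplified stand-in for the reclaim
path that keeps calling the scan callback until it sees SHRINK_STOP. The
check_remaining flag switches between the old and the new return
condition of msm_gem_shrinker_scan().

```c
#include <stdbool.h>

#define SHRINK_STOP (~0UL)	/* same sentinel value the kernel defines */

/* Hypothetical toy LRU: page counters only, no real GEM objects. */
struct toy_lru {
	unsigned long purgeable;	/* pages the scan can free right now */
	unsigned long contended;	/* pages skipped because trylock failed */
};

/* Frees up to nr_to_scan pages and adds the skipped pages to *remaining. */
unsigned long toy_lru_scan(struct toy_lru *lru, unsigned long nr_to_scan,
			   unsigned long *remaining)
{
	unsigned long freed = lru->purgeable < nr_to_scan ?
			      lru->purgeable : nr_to_scan;

	lru->purgeable -= freed;
	*remaining += lru->contended;
	return freed;
}

/* Scan callback with either the old or the new return condition. */
unsigned long toy_shrinker_scan(struct toy_lru *lru, unsigned long nr_to_scan,
				bool check_remaining)
{
	unsigned long remaining = 0;
	unsigned long freed = toy_lru_scan(lru, nr_to_scan, &remaining);

	if (check_remaining)
		return (freed > 0 && remaining > 0) ? freed : SHRINK_STOP;
	return (freed > 0) ? freed : SHRINK_STOP;
}

/*
 * Simplified caller (an assumption, not the real reclaim code): calls the
 * scan callback until it returns SHRINK_STOP. After each of the first
 * producer_passes calls, "APP1" adds produce_per_pass purgeable pages.
 * Returns the number of scan calls that were needed.
 */
unsigned int toy_drop_caches(struct toy_lru *lru,
			     unsigned long produce_per_pass,
			     unsigned int producer_passes,
			     bool check_remaining)
{
	unsigned int calls = 0;

	for (;;) {
		unsigned long ret = toy_shrinker_scan(lru, 128, check_remaining);

		calls++;
		if (ret == SHRINK_STOP)
			break;
		if (calls <= producer_passes)
			lru->purgeable += produce_per_pass;
	}
	return calls;
}
```

Starting from { .purgeable = 64, .contended = 0 } with APP1 producing 64
pages for 10 passes, toy_drop_caches() needs 12 calls with the old
condition and a single call with the new one. If contended is non-zero,
the new condition keeps scanning just like the old one, which matches the
intent of retrying only while there are pages the scan could not release.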