From: Dmitry Osipenko
To: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Daniel Almeida, Gustavo Padovan, Daniel Stone,
	Tomeu Vizoso, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Rob Clark, Sumit Semwal, Christian König, Qiang Yu, Steven Price,
	Alyssa Rosenzweig, Rob Herring, Sean Paul, Dmitry Baryshkov,
	Abhinav Kumar
Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org,
	Dmitry Osipenko, kernel@collabora.com,
	virtualization@lists.linux-foundation.org
Subject: [PATCH v8 1/7] drm/msm/gem: Prevent blocking within shrinker loop
Date: Sun, 6 Nov 2022 02:27:13 +0300
Message-Id: <20221105232719.302619-2-dmitry.osipenko@collabora.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221105232719.302619-1-dmitry.osipenko@collabora.com>
References: <20221105232719.302619-1-dmitry.osipenko@collabora.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

Consider this scenario:

1. APP1 continuously creates lots of small GEMs
2. APP2 triggers `drop_caches`
3. Shrinker starts to evict APP1 GEMs, while APP1 produces new purgeable
   GEMs
4. msm_gem_shrinker_scan() returns a non-zero number of freed pages
   and causes the shrinker to try to shrink more
5. msm_gem_shrinker_scan() returns a non-zero number of freed pages
   again, goto 4
6. APP2 is blocked in `drop_caches` until APP1 stops producing
   purgeable GEMs

To prevent this blocking scenario, check the number of remaining pages
that the GPU shrinker couldn't release due to GEM locking contention.
If there are no remaining pages left to shrink, then there is no need
to free up more pages and the shrinker may break out of the loop.

This problem was uncovered during shrinker/madvise IOCTL testing of the
virtio-gpu driver. The MSM driver is affected in the same way.

Signed-off-by: Dmitry Osipenko
---
 drivers/gpu/drm/drm_gem.c              | 9 +++++++--
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 8 ++++++--
 include/drm/drm_gem.h                  | 4 +++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index b8db675e7fb5..299bca1390aa 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1375,10 +1375,13 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail);
  *
  * @lru: The LRU to scan
  * @nr_to_scan: The number of pages to try to reclaim
+ * @remaining: The number of pages left to reclaim
  * @shrink: Callback to try to shrink/reclaim the object.
  */
 unsigned long
-drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+drm_gem_lru_scan(struct drm_gem_lru *lru,
+		 unsigned int nr_to_scan,
+		 unsigned long *remaining,
 		 bool (*shrink)(struct drm_gem_object *obj))
 {
 	struct drm_gem_lru still_in_lru;
@@ -1417,8 +1420,10 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
 		 * hit shrinker in response to trying to get backing pages
 		 * for this obj (ie. while it's lock is already held)
 		 */
-		if (!dma_resv_trylock(obj->resv))
+		if (!dma_resv_trylock(obj->resv)) {
+			*remaining += obj->size >> PAGE_SHIFT;
 			goto tail;
+		}
 
 		if (shrink(obj)) {
 			freed += obj->size >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c
index 1de14e67f96b..4c8b0ab61ce4 100644
--- a/drivers/gpu/drm/msm/msm_gem_shrinker.c
+++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c
@@ -116,12 +116,14 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 	};
 	long nr = sc->nr_to_scan;
 	unsigned long freed = 0;
+	unsigned long remaining = 0;
 
 	for (unsigned i = 0; (nr > 0) && (i < ARRAY_SIZE(stages)); i++) {
 		if (!stages[i].cond)
 			continue;
 		stages[i].freed =
-			drm_gem_lru_scan(stages[i].lru, nr, stages[i].shrink);
+			drm_gem_lru_scan(stages[i].lru, nr, &remaining,
+					 stages[i].shrink);
 		nr -= stages[i].freed;
 		freed += stages[i].freed;
 	}
@@ -132,7 +134,7 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc)
 				     stages[3].freed);
 	}
 
-	return (freed > 0) ? freed : SHRINK_STOP;
+	return (freed > 0 && remaining > 0) ? freed : SHRINK_STOP;
 }
 
 #ifdef CONFIG_DEBUG_FS
@@ -182,10 +184,12 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr)
 		NULL,
 	};
 	unsigned idx, unmapped = 0;
+	unsigned long remaining = 0;
 
 	for (idx = 0; lrus[idx] && unmapped < vmap_shrink_limit; idx++) {
 		unmapped += drm_gem_lru_scan(lrus[idx],
 					     vmap_shrink_limit - unmapped,
+					     &remaining,
 					     vmap_shrink);
 	}
 
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index a17c2f903f81..b46ade812443 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -475,7 +475,9 @@ int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
 void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock);
 void drm_gem_lru_remove(struct drm_gem_object *obj);
 void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj);
-unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru, unsigned nr_to_scan,
+unsigned long drm_gem_lru_scan(struct drm_gem_lru *lru,
+			       unsigned int nr_to_scan,
+			       unsigned long *remaining,
 			       bool (*shrink)(struct drm_gem_object *obj));
 
 #endif /* __DRM_GEM_H__ */
-- 
2.37.3
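
For readers who want to experiment with the return-value rule outside the kernel, the following is a minimal userspace sketch, not part of the patch. It assumes a toy model: struct toy_obj, toy_lru_scan() and toy_shrinker_scan() are hypothetical names invented for illustration, a boolean flag stands in for a failed dma_resv_trylock(), and SHRINK_STOP is redefined locally with the same value the kernel uses. It only mirrors the accounting shown in the diff above (pages of contended objects are added to *remaining, and the scan callback reports progress only when both freed and remaining are non-zero).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as the kernel's sentinel; redefined here for the sketch. */
#define SHRINK_STOP (~0UL)

/* Hypothetical stand-in for a GEM object on an LRU. */
struct toy_obj {
	unsigned long pages;	/* object size in pages */
	bool locked;		/* true models a failed dma_resv_trylock() */
	bool freed;		/* set once the object has been reclaimed */
};

/*
 * Toy model of the patched drm_gem_lru_scan(): reclaim unlocked objects
 * until nr_to_scan pages are freed, and account pages of contended
 * objects in *remaining.
 */
static unsigned long
toy_lru_scan(struct toy_obj *objs, size_t count, unsigned long nr_to_scan,
	     unsigned long *remaining)
{
	unsigned long freed = 0;

	assert(remaining != NULL);

	for (size_t i = 0; i < count && freed < nr_to_scan; i++) {
		if (objs[i].freed)
			continue;

		if (objs[i].locked) {
			/* Could not take the lock: pages are left to reclaim. */
			*remaining += objs[i].pages;
			continue;
		}

		objs[i].freed = true;
		freed += objs[i].pages;
	}

	return freed;
}

/*
 * Toy model of the patched msm_gem_shrinker_scan() return rule: report
 * freed pages only if pages were freed and contended pages remain,
 * otherwise tell the caller to stop.
 */
static unsigned long
toy_shrinker_scan(struct toy_obj *objs, size_t count, unsigned long nr_to_scan)
{
	unsigned long remaining = 0;
	unsigned long freed = toy_lru_scan(objs, count, nr_to_scan, &remaining);

	return (freed > 0 && remaining > 0) ? freed : SHRINK_STOP;
}
```

In this model, a pass that frees pages and skips no contended object returns SHRINK_STOP. That is the case where the old `(freed > 0)` check alone would have asked the caller to try again.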