Date: Thu, 23 Mar 2023 04:00:30 +0000
Message-ID: <20230323040037.2389095-1-yosryahmed@google.com>
Subject: [RFC PATCH 0/7] Make rstat flushing IRQ and sleep friendly
From: Yosry Ahmed
To: Tejun Heo, Josef Bacik, Jens Axboe, Zefan Li, Johannes Weiner,
    Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton
Cc: Vasily Averin, cgroups@vger.kernel.org, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org,
    Yosry Ahmed

Currently, if rstat flushing is invoked
using the irqsafe variant cgroup_rstat_flush_irqsafe(), we keep
interrupts disabled and do not sleep for the entire flush operation,
which is O(# cpus * # cgroups). This can be rather dangerous.

Some contexts that use cgroup_rstat_flush_irqsafe() can in fact sleep,
and among those that cannot, not all require interrupts to be disabled.
This patch series breaks down the O(# cpus * # cgroups) duration that we
disable interrupts for into a series of O(# cgroups) durations.
Disabling interrupts is deferred to the caller if needed.

Patch 1 mainly addresses this by not requiring interrupts to be disabled
for the global rstat lock to be acquired. As a side effect, we disable
rstat flushing in interrupt context. See patch 1 for more details. One
thing I am not sure about is whether the only caller of
cgroup_rstat_flush_hold(), cgroup_base_stat_cputime_show(), currently
has any dependency on that call disabling interrupts.

Patch 2 follows suit for stats_flush_lock in the memcg code, allowing it
to be acquired without disabling interrupts.

Patch 3 removes cgroup_rstat_flush_irqsafe() and updates
cgroup_rstat_flush() to be more explicit about sleeping.

Patch 4 changes memcg code paths that invoke rstat flushing to sleep
where possible. The patch changes code paths where it is naturally safe
to sleep: userspace reads and the background periodic flusher.

Patches 5 & 6 allow sleeping during rstat flushing in reclaim and
refault contexts. I am not sure if this is okay, especially the latter,
so I placed them in separate patches for ease of revert/drop.

Patch 7 is a slightly tangential optimization that limits the work done
by rstat flushing in some scenarios.
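
As a rough illustration of the interrupt-disabled durations described
above, here is a userspace toy model. It is not kernel code and is not
taken from the patches: every name in it (struct flush_stats,
toy_irq_disable(), toy_irq_enable(), toy_flush_one() and the two
toy_flush_*() functions) is hypothetical, and the "interrupts off" flag
is simulated. It only counts how many (cpu, cgroup) pairs are visited
inside a single interrupts-off section, assuming one unit of work per
pair.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace toy model only; none of these names are kernel APIs. */
struct flush_stats {
	size_t total_work;      /* (cpu, cgroup) pairs visited overall */
	size_t longest_irq_off; /* most pairs visited in one irq-off section */
};

static int irqs_off;        /* simulated "interrupts disabled" flag */
static size_t irq_off_work; /* pairs visited in the current irq-off section */

static void toy_irq_disable(void)
{
	assert(!irqs_off);
	irqs_off = 1;
	irq_off_work = 0;
}

static void toy_irq_enable(struct flush_stats *st)
{
	assert(irqs_off);
	if (irq_off_work > st->longest_irq_off)
		st->longest_irq_off = irq_off_work;
	irqs_off = 0;
}

/* Stand-in for flushing one cgroup's updates on one CPU. */
static void toy_flush_one(struct flush_stats *st)
{
	st->total_work++;
	if (irqs_off)
		irq_off_work++;
}

/* Shape 1: interrupts stay off across all CPUs and all cgroups. */
struct flush_stats toy_flush_irqs_off_throughout(size_t ncpus, size_t ncgroups)
{
	struct flush_stats st = { 0, 0 };

	toy_irq_disable();
	for (size_t cpu = 0; cpu < ncpus; cpu++)
		for (size_t cg = 0; cg < ncgroups; cg++)
			toy_flush_one(&st);
	toy_irq_enable(&st);
	return st;
}

/* Shape 2: interrupts are off only while one CPU's updates are flushed. */
struct flush_stats toy_flush_irqs_off_per_cpu(size_t ncpus, size_t ncgroups)
{
	struct flush_stats st = { 0, 0 };

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		toy_irq_disable();
		for (size_t cg = 0; cg < ncgroups; cg++)
			toy_flush_one(&st);
		toy_irq_enable(&st);
	}
	return st;
}
```

With 4 CPUs and 10 cgroups, both shapes visit 40 pairs in total, but the
longest interrupts-off section in this model shrinks from 40 pairs to 10.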

Yosry Ahmed (7):
  cgroup: rstat: only disable interrupts for the percpu lock
  memcg: do not disable interrupts when holding stats_flush_lock
  cgroup: rstat: remove cgroup_rstat_flush_irqsafe()
  memcg: sleep during flushing stats in safe contexts
  vmscan: memcg: sleep when flushing stats during reclaim
  workingset: memcg: sleep when flushing stats in workingset_refault()
  memcg: do not modify rstat tree for zero updates

 block/blk-cgroup.c         |  2 +-
 include/linux/cgroup.h     |  3 +--
 include/linux/memcontrol.h |  8 +++---
 kernel/cgroup/cgroup.c     |  4 +--
 kernel/cgroup/rstat.c      | 54 ++++++++++++++++++++------------------
 mm/memcontrol.c            | 52 ++++++++++++++++++++++--------------
 mm/vmscan.c                |  2 +-
 mm/workingset.c            |  4 +--
 8 files changed, 73 insertions(+), 56 deletions(-)

-- 
2.40.0.rc1.284.g88254d51c5-goog