Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp1786644pxb; Mon, 22 Feb 2021 10:49:32 -0800 (PST) X-Google-Smtp-Source: ABdhPJy7a+3Cv3fTgmuaf+dQd7bNGy946JX42aTYNxA2iwbPETf8UOL7TRY9U4UtJfU23LbY+lGq X-Received: by 2002:a05:6402:451:: with SMTP id p17mr24799257edw.248.1614019772671; Mon, 22 Feb 2021 10:49:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1614019772; cv=none; d=google.com; s=arc-20160816; b=JJ4ymi9xM6GLko6v3/WxVA94oKP6GGIQMkcAlvH+oUjedPG/inyuGKkIdTYT3e0k8H hpbMNzYLiO0pBBdLYedydefLY20Y2lCZxIMTje3P5UeHfezAjhx1heWuv1exCc/Bz8TZ cO3X95ezMfw62pO27068Df/MUnZmkXyeSAhAL78rd4A1S9H0OmNN6jgTmwhBCQv+7BtP aFFkV5v3+nreo07NSkyUJEomPx8i1wuD0VX9YaECv9MTdAv4BL5ALD7r3jg8WRaQcSlI nedsd7BLwGo+KvpzJ6s4xAXfXY7Rd7SM1SD3oYqVWYfBM3kK10UMm8rpewqLleg5ZwiK irQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=EF4DZ8/YPHLgiqcKgmmLocSOb1RhL6xlS59EKbvTOMk=; b=k2+U5bS7RfacEkAWprioScwT8UfQgmP5Hcu0uFKJYtmbAidU6ogW2yzNc6dgEZiyPG 6DFBLZIYI7KuFfzTwaFM0I/nH+AUGfkJ4gNLQM9grUXUL//GFFni2D1Q8jI0OsjOkqUK pWkuNZVz1x3DJOEAKZegkhtdbZUPn+gbUBCy3Xu8B2cWGS9e62D/xoOnkMMkRI/Rs3ue lUeZEOnMxFKNRC9gjG/bemy/DyLjmSVy/x83eeO7dTYnI8YoT3bIdj6+9USiF1kaK4Xb x/nUi+FTa0TGD7T1zYTAzdJQnuD0QBJevo9XY23O4cI+Lp+OvafmNCaO4bjV6qisZc5Q 9GiA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="l7F/ZAQ0"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id v5si12528729edc.346.2021.02.22.10.49.09; Mon, 22 Feb 2021 10:49:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b="l7F/ZAQ0"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231156AbhBVSsb (ORCPT + 99 others); Mon, 22 Feb 2021 13:48:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231424AbhBVSqs (ORCPT ); Mon, 22 Feb 2021 13:46:48 -0500 Received: from mail-lj1-x234.google.com (mail-lj1-x234.google.com [IPv6:2a00:1450:4864:20::234]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CE617C06178B for ; Mon, 22 Feb 2021 10:46:07 -0800 (PST) Received: by mail-lj1-x234.google.com with SMTP id u4so60190395ljh.6 for ; Mon, 22 Feb 2021 10:46:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=EF4DZ8/YPHLgiqcKgmmLocSOb1RhL6xlS59EKbvTOMk=; b=l7F/ZAQ0Ut25vN6NuV1Mgnps+4oWPHrywIpIgaIMkipU6kKTEaaav9qFISBOq1DdyE NxV5B0YqGvSpNTeBTU4M8gA9rVt3DhfzqEvukbwx9tXDmt6zQ2n3ccGgf5qIKe49GnOK +MG8d7OArOJULgO1WVyGHqga7pkUYoKEF0jYlWUT7KupD2fnJOKOBEvX+wG9rctXY6A0 huZx9EPyD30ZfcsfqJCpapDIv4dTgJ9lGV8+dWQVJX0mNpKcf+5Bv88WBKvJOqVaTD7s YZ1Umhgva/38fQ8BD83N9bpYvAW5BKOdVqDYTmQ9f7UVVfHRMuI58kMKCF4Rq1tWhOEZ 1ufg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=EF4DZ8/YPHLgiqcKgmmLocSOb1RhL6xlS59EKbvTOMk=; b=TjrrOhokH1izlxLpgBuyOFzuNyoqHFwICOPkQPCfUGHRrJ4GJrCq2w37rRVsDYkZ8E 5otAzNkRTdnCvtkfxxHLg4FtCz77W8DXtksgNtIlefA8bG4Wu5INuU5AyJvxpUMZ5zMj 6/V3CazOl7HcCh6HGtwpRMwDEq2PJFm5YCAr5mNOdkRuKOMlIEWttUlUidQzD4UnktmK dNsMrWkUayuj1XM1Jc2uC8/BNaYrBxzQDrnVn7wlWeNvCDyhprWUt9mDNStFxk6FAkIf NOOI4DlsJJ6UGiiWvur7brqqs/ANO5tLtVO6IkY0awBh2Al6uXA58HsBmxowrTvVxzcv aZ+A== X-Gm-Message-State: AOAM5312WkjkJ11mUyv6OoxpxK0OPwS2WYxieZrGse0wqcWEUnuC2hcl G44O08aUYhI3ltQ7z/r8dgWigRT0ZB/sunJ8h7Be2w== X-Received: by 2002:a2e:2f05:: with SMTP id v5mr13605682ljv.279.1614019566052; Mon, 22 Feb 2021 10:46:06 -0800 (PST) MIME-Version: 1.0 References: <20210219224405.1544597-1-shakeelb@google.com> In-Reply-To: From: Shakeel Butt Date: Mon, 22 Feb 2021 10:45:54 -0800 Message-ID: Subject: Re: [PATCH] memcg: charge before adding to swapcache on swapin To: Johannes Weiner Cc: Hugh Dickins , Roman Gushchin , Michal Hocko , Andrew Morton , Cgroups , Linux MM , LKML Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 19, 2021 at 4:34 PM Johannes Weiner wrote: > > On Fri, Feb 19, 2021 at 02:44:05PM -0800, Shakeel Butt wrote: > > Currently the kernel adds the page, allocated for swapin, to the > > swapcache before charging the page. This is fine but now we want a > > per-memcg swapcache stat which is essential for folks who wants to > > transparently migrate from cgroup v1's memsw to cgroup v2's memory and > > swap counters. > > > > To correctly maintain the per-memcg swapcache stat, one option which > > this patch has adopted is to charge the page before adding it to > > swapcache. One challenge in this option is the failure case of > > add_to_swap_cache() on which we need to undo the mem_cgroup_charge(). > > Specifically undoing mem_cgroup_uncharge_swap() is not simple. > > > > This patch circumvent this specific issue by removing the failure path > > of add_to_swap_cache() by providing __GFP_NOFAIL. Please note that in > > this specific situation ENOMEM was the only possible failure of > > add_to_swap_cache() which is removed by using __GFP_NOFAIL. > > > > Another option was to use __mod_memcg_lruvec_state(NR_SWAPCACHE) in > > mem_cgroup_charge() but then we need to take of the do_swap_page() case > > where synchronous swap devices bypass the swapcache. The do_swap_page() > > already does hackery to set and reset PageSwapCache bit to make > > mem_cgroup_charge() execute the swap accounting code and then we would > > need to add additional parameter to tell to not touch NR_SWAPCACHE stat > > as that code patch bypass swapcache. > > > > This patch added memcg charging API explicitly foe swapin pages and > > cleaned up do_swap_page() to not set and reset PageSwapCache bit. > > > > Signed-off-by: Shakeel Butt > > The patch makes sense to me. While it extends the charge interface, I > actually quite like that it charges the page earlier - before putting > it into wider circulation. It's a step in the right direction. > > But IMO the semantics of mem_cgroup_charge_swapin_page() are a bit too > fickle: the __GFP_NOFAIL in add_to_swap_cache() works around it, but > having a must-not-fail-after-this line makes the code tricky to work > on and error prone. > > It would be nicer to do a proper transaction sequence. > > > @@ -497,16 +497,15 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask, > > __SetPageLocked(page); > > __SetPageSwapBacked(page); > > > > - /* May fail (-ENOMEM) if XArray node allocation failed. */ > > - if (add_to_swap_cache(page, entry, gfp_mask & GFP_RECLAIM_MASK, &shadow)) { > > - put_swap_page(page, entry); > > + if (mem_cgroup_charge_swapin_page(page, NULL, gfp_mask, entry)) > > goto fail_unlock; > > - } > > > > - if (mem_cgroup_charge(page, NULL, gfp_mask)) { > > - delete_from_swap_cache(page); > > - goto fail_unlock; > > - } > > + /* > > + * Use __GFP_NOFAIL to not worry about undoing the changes done by > > + * mem_cgroup_charge_swapin_page() on failure of add_to_swap_cache(). > > + */ > > + add_to_swap_cache(page, entry, > > + (gfp_mask|__GFP_NOFAIL) & GFP_RECLAIM_MASK, &shadow); > > How about: > > mem_cgroup_charge_swapin_page() > add_to_swap_cache() > mem_cgroup_finish_swapin_page() > > where finish_swapin_page() only uncharges the swap entry (on cgroup1) > once the swap->memory transition is complete? > > Otherwise the patch looks good to me. Thanks for the review and yes this makes the code much more clear and maintainable.