Date: Tue, 6 Dec 2022 23:10:49 +0000
Message-ID: <20221206231049.g35ltbxbk54izrie@google.com>
Subject: Re: Low TCP throughput due to vmpressure with swap enabled
From: Shakeel Butt
To: Johannes Weiner
Cc: Eric Dumazet, Ivan Babrou, Linux MM, Linux Kernel Network Developers, linux-kernel, Michal Hocko, Roman Gushchin, Muchun Song, Andrew Morton, "David S. Miller", Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski, Paolo Abeni, cgroups@vger.kernel.org, kernel-team
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 06, 2022 at 09:51:01PM +0100, Johannes Weiner wrote:
> On Tue, Dec 06, 2022 at 08:13:50PM +0100, Eric Dumazet wrote:
> > On Tue, Dec 6, 2022 at 8:00 PM Johannes Weiner wrote:
> > > @@ -1701,10 +1701,10 @@ void mem_cgroup_sk_alloc(struct sock *sk);
> > >  void mem_cgroup_sk_free(struct sock *sk);
> > >  static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
> > >  {
> > > -	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
> > > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->socket_pressure)
> >
> > && READ_ONCE(memcg->socket_pressure))
> >
> > >  		return true;
> > >  	do {
> > > -		if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
> > > +		if (memcg->socket_pressure)
> >
> > if (READ_ONCE(...))
>
> Good point, I'll add those.
>
> > > @@ -7195,10 +7194,10 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
> > >  		struct page_counter *fail;
> > >
> > >  		if (page_counter_try_charge(&memcg->tcpmem, nr_pages, &fail)) {
> > > -			memcg->tcpmem_pressure = 0;
> >
> > Orthogonal to your patch, but:
> >
> > Maybe avoid touching this cache line too often and use READ/WRITE_ONCE() ?
> >
> >     if (READ_ONCE(memcg->socket_pressure))
> >         WRITE_ONCE(memcg->socket_pressure, false);
>
> Ah, that's a good idea.
>
> I think it'll be fine in the failure case, since that's associated
> with OOM and total performance breakdown anyway.
>
> But certainly, in the common case of the charge succeeding, we should
> not keep hammering false into that variable over and over.
>
> How about the delta below? I also flipped the branches around to keep
> the common path at the first indentation level, hopefully making that
> a bit clearer too.
>
> Thanks for taking a look, Eric!
>

I still think we should not set a persistent socket-pressure state on an
unsuccessful charge that is only reset by a later successful charge. I
think the better approach would be to limit the pressure state to a time
window, the same as today, but set it on the charge path. Something like
below:


diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d3c8203cab6c..7bd88d443c42 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -287,7 +287,6 @@ struct mem_cgroup {

 	/* Legacy tcp memory accounting */
 	bool			tcpmem_active;
-	int			tcpmem_pressure;

 #ifdef CONFIG_MEMCG_KMEM
 	int kmemcg_id;
@@ -1712,8 +1711,6 @@ void mem_cgroup_sk_alloc(struct sock *sk);
 void mem_cgroup_sk_free(struct sock *sk);
 static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 {
-	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && memcg->tcpmem_pressure)
-		return true;
 	do {
 		if (time_before(jiffies, READ_ONCE(memcg->socket_pressure)))
 			return true;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 48c44229cf47..290444bcab84 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5286,7 +5286,6 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 	vmpressure_init(&memcg->vmpressure);
 	INIT_LIST_HEAD(&memcg->event_list);
 	spin_lock_init(&memcg->event_list_lock);
-	memcg->socket_pressure = jiffies;
 #ifdef CONFIG_MEMCG_KMEM
 	memcg->kmemcg_id = -1;
 	INIT_LIST_HEAD(&memcg->objcg_list);
@@ -7252,10 +7251,12 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 		struct page_counter *fail;

 		if (page_counter_try_charge(&memcg->tcpmem, nr_pages, &fail)) {
-			memcg->tcpmem_pressure = 0;
+			if (READ_ONCE(memcg->socket_pressure))
+				WRITE_ONCE(memcg->socket_pressure, 0);
 			return true;
 		}
-		memcg->tcpmem_pressure = 1;
+		if (READ_ONCE(memcg->socket_pressure) < jiffies + HZ)
+			WRITE_ONCE(memcg->socket_pressure, jiffies + HZ);
 		if (gfp_mask & __GFP_NOFAIL) {
 			page_counter_charge(&memcg->tcpmem, nr_pages);
 			return true;
@@ -7263,12 +7264,21 @@ bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages,
 		return false;
 	}

-	if (try_charge(memcg, gfp_mask, nr_pages) == 0) {
-		mod_memcg_state(memcg, MEMCG_SOCK, nr_pages);
-		return true;
+	if (try_charge(memcg, gfp_mask & ~__GFP_NOFAIL, nr_pages) < 0) {
+		if (READ_ONCE(memcg->socket_pressure) < jiffies + HZ)
+			WRITE_ONCE(memcg->socket_pressure, jiffies + HZ);
+		if (gfp_mask & __GFP_NOFAIL) {
+			try_charge(memcg, gfp_mask, nr_pages);
+			goto out;
+		}
+		return false;
 	}

-	return false;
+	if (READ_ONCE(memcg->socket_pressure))
+		WRITE_ONCE(memcg->socket_pressure, 0);
+out:
+	mod_memcg_state(memcg, MEMCG_SOCK, nr_pages);
+	return true;
 }

 /**
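
For readers who want to poke at the time-window semantics outside the
kernel, here is a minimal userspace sketch of the idea in the diff above.
It is not kernel code: sketch_jiffies, SKETCH_HZ, struct sketch_memcg and
the sketch_* helpers are hypothetical stand-ins for jiffies, HZ, struct
mem_cgroup and the charge path, and it has none of the READ_ONCE() /
WRITE_ONCE() annotations, so it only models the single-threaded logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's HZ and jiffies (assumptions). */
#define SKETCH_HZ 1000UL
static unsigned long sketch_jiffies;

/* Hypothetical, stripped-down stand-in for struct mem_cgroup. */
struct sketch_memcg {
	unsigned long socket_pressure;	/* tick at which pressure expires */
};

/* Wraparound-safe "a is before b", in the spirit of time_before(). */
static bool sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* Reader side: pressure holds only while the deadline is in the future. */
static bool sketch_under_pressure(const struct sketch_memcg *m)
{
	return sketch_time_before(sketch_jiffies, m->socket_pressure);
}

/*
 * Failed charge: open or extend a one-second window. The plain '<'
 * mirrors the comparison used in the diff, and the store is skipped
 * when the deadline is already far enough out.
 */
static void sketch_charge_failed(struct sketch_memcg *m)
{
	unsigned long deadline = sketch_jiffies + SKETCH_HZ;

	if (m->socket_pressure < deadline)
		m->socket_pressure = deadline;
}

/* Successful charge: clear the state, writing only if it is non-zero. */
static void sketch_charge_succeeded(struct sketch_memcg *m)
{
	if (m->socket_pressure)
		m->socket_pressure = 0;
}
```

In this model a failed charge at tick T makes sketch_under_pressure()
return true until tick T + SKETCH_HZ, after which the state lapses on its
own even if no successful charge ever arrives, which is the property
argued for above.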