From: Yu Zhao
Date: Tue, 22 Nov 2022 13:05:32 -0700
Subject: Re: Low TCP throughput due to vmpressure with swap enabled
To: Ivan Babrou
Cc: Linux MM, Linux Kernel Network Developers, linux-kernel,
 Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
 Muchun Song, Andrew Morton, Eric Dumazet, "David S. Miller",
 Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski, Paolo Abeni,
 cgroups@vger.kernel.org, kernel-team
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 22, 2022 at 12:46 PM Yu Zhao wrote:
>
> On Mon, Nov 21, 2022 at 5:53 PM Ivan Babrou wrote:
> >
> > Hello,
> >
> > We have observed a negative TCP throughput behavior from the following commit:
> >
> > * 8e8ae645249b mm: memcontrol: hook up vmpressure to socket pressure
> >
> > It landed back in 2016 in v4.5, so it's not exactly a new issue.
> >
> > The crux of the issue is that in some cases with swap present the
> > workload can be unfairly throttled in terms of TCP throughput.
> >
> > I am able to reproduce this issue in a VM locally on v6.1-rc6 with 8
> > GiB of RAM with zram enabled.
> >
> > The setup is fairly simple:
> >
> > 1. Run the following go proxy in one cgroup (it has some memory
> > ballast to simulate useful memory usage):
> >
> > * https://gist.github.com/bobrik/2c1a8a19b921fefe22caac21fda1be82
> >
> > sudo systemd-run --scope -p MemoryLimit=6G go run main.go
> >
> > 2. Run the following fio config in another cgroup to simulate mmapped
> > page cache usage:
> >
> > [global]
> > size=8g
> > bs=256k
> > iodepth=256
> > direct=0
> > ioengine=mmap
> > group_reporting
> > time_based
> > runtime=86400
> > numjobs=8
> > name=randread
> > rw=randread
>
> Is it practical for your workload to apply some madvise/fadvise hint?
> For the above repro, it would be fadvise_hint=1 which is mapped into
> MADV_RANDOM automatically.
> The kernel also supports MADV_SEQUENTIAL,
> but not POSIX_FADV_NOREUSE at the moment.

Actually fadvise_hint already defaults to 1. At least with MGLRU, the
page cache should be thrown away without causing you any problem. It
might be mapped to POSIX_FADV_RANDOM rather than MADV_RANDOM.
POSIX_FADV_RANDOM is ignored at the moment.

Sorry for all the noise. Let me dig into this and get back to you later today.

> We actually have similar issues but unfortunately I haven't been able
> to come up with any solution beyond recommending the above flags.
> The problem is that harvesting the accessed bit from mmapped memory is
> costly, and when random accesses happen fast enough, the cost of doing
> that prevents LRU from collecting more information to make better
> decisions. In a nutshell, LRU can't tell whether there is genuine
> memory locality with your test case.
>
> It's a very difficult problem to solve from LRU's POV. I'd like to
> hear more about your workloads and see whether there are workarounds
> other than tackling the problem head-on, if applying hints is not
> practical or preferable.
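
For reference, here is a rough sketch of what applying the two hints
explicitly from an application could look like, rather than relying on
whichever one fio's fadvise_hint ends up selecting. It assumes Linux and
Python 3.8 or later; map_with_random_hint is a made-up helper name, and
the sketch only shows how the hints are issued. Whether the kernel acts
on either of them for reclaim is the open question above.

```python
import mmap
import os
import tempfile


def map_with_random_hint(path):
    """Map `path` read-only and request random-access treatment.

    Hypothetical helper for illustration. Returns (mapping, applied),
    where `applied` lists the hints that were actually issued.
    """
    applied = []
    fd = os.open(path, os.O_RDONLY)
    try:
        # File-descriptor-level hint (posix_fadvise); length 0 means
        # "from offset to end of file".
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
            applied.append("POSIX_FADV_RANDOM")
        mapping = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    finally:
        # The mapping keeps its own reference to the file.
        os.close(fd)
    # Mapping-level hint (madvise) over the whole mapped range.
    if hasattr(mapping, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        mapping.madvise(mmap.MADV_RANDOM)
        applied.append("MADV_RANDOM")
    return mapping, applied


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"abcd" * 4096)
        f.flush()
        m, applied = map_with_random_hint(f.name)
        print("hints issued:", applied)
        m.close()
```

Issuing both keeps the experiment independent of how fio translates
fadvise_hint for ioengine=mmap.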