Date: Fri, 25 Mar 2022 11:31:18 +0100
From: Michal Koutný
To: Roman Gushchin
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Richard Palethorpe, Andrew Morton, Shakeel Butt, Michal Hocko,
    Vlastimil Babka, "Matthew Wilcox (Oracle)", Muchun Song, Johannes Weiner,
    Yang Shi, Suren Baghdasaryan, Tejun Heo, Chris Down
Subject: Re: [RFC PATCH] mm: memcg: Do not count memory.low reclaim if it does not happen
Message-ID: <20220325103118.GC2828@blackbody.suse.cz>
References: <20220324095157.GA16685@blackbody.suse.cz> <5049EBC3-5BAE-4509-BA63-1F4A7D913517@linux.dev>
In-Reply-To: <5049EBC3-5BAE-4509-BA63-1F4A7D913517@linux.dev>
User-Agent: Mutt/1.10.1 (2018-07-13)

On Thu, Mar 24, 2022 at 11:17:14AM -0700, Roman Gushchin wrote:
> Ok, so it’s not really about the implementation details of the reclaim
> mechanism (I mean rounding up to the batch size etc),

Actually, that was what I deemed more serious first. It's the point 2 of
RFCness:

| 2) The observed behavior slightly impacts distribution of parent's memory.low.
|    Constructed example is a passive protected workload in s1 and active in s2
|    (active ~ counteracts the reclaim with allocations). It could strip
|    protection from s1 one by one (one:=SWAP_CLUSTER_MAX/2^sc.priority).
|    That may be considered both wrong (s1 should have been more protected) or
|    correct s2 deserves protection due to its activity.
|    I don't have (didn't collect) data for this, so I think just masking the
|    false events is sufficient (or independent).

> Idk, I don’t have a strong argument against this change (except that
> it changes the existing behavior), but I also don’t see why such
> events are harmful. Do you mind elaborating a bit more?

So I've collected some demo data now.

    systemd-run \
        -u precious.service --slice=test-protected.slice \
        -p MemoryLow=50M \
        /root/memeater 50 # allocates 50M anon, doesn't use it

    systemd-run \
        -u victim.service --slice=test-protected.slice \
        -p MemoryLow=0M \
        /root/memeater -m 50 50 # allocates 50M anon, uses it

    echo "Started workloads"

    systemctl set-property --runtime test.slice MemoryMax=200M
    systemctl set-property --runtime test-protected.slice MemoryLow=50M

    sleep 5

    # MemorySwapMax=0M: to push test-protected.slice to swap
    systemd-run \
        -u pressure.service --slice=test.slice \
        -p MemorySwapMax=0M \
        /root/memeater -m 170 170

    sleep 5

    systemd-cgtop -b -1 -m test.slice

Result with memory_recursiveprot

> Control Group                                     Tasks   %CPU   Memory  Input/s Output/s
> test.slice                                            3      -   199.9M        -        -
> test.slice/pressure.service                           1      -   170.5M        -        -
> test.slice/test-protected.slice                       2      -    29.4M        -        -
> test.slice/test-protected.slice/victim.service        1      -    29.1M        -        -
> test.slice/test-protected.slice/precious.service      1      -   292.0K        -        -

Result without memory_recursiveprot

> Control Group                                     Tasks   %CPU   Memory  Input/s Output/s
> test.slice                                            3      -   199.8M        -        -
> test.slice/pressure.service                           1      -   170.5M        -        -
> test.slice/test-protected.slice                       2      -    29.3M        -        -
> test.slice/test-protected.slice/precious.service      1      -    28.7M        -        -
> test.slice/test-protected.slice/victim.service        1      -   560.0K        -        -

(kernel 5.17.0, systemd 249.10)

So with this result, I'd say the event reporting is an independent change
(admittedly, it was thanks to the current implementation, not my proposal,
that I noticed this issue).

/me scratches head, let me review my other approaches...

Michal
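
For reference, the memory.low events discussed in this thread are exposed as the "low" field of each cgroup's memory.events file. A minimal sketch of reading them for the demo cgroups, assuming cgroup v2 is mounted at /sys/fs/cgroup and the unit names from the demo above; the low_events helper and the CGROOT variable are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Print the "low" counter from a cgroup v2 memory.events file.
# Usage: low_events <path-to-memory.events>
low_events() {
    awk '$1 == "low" { print $2 }' "$1"
}

# Assumed cgroup v2 mount point; override CGROOT if it differs.
CGROOT=${CGROOT:-/sys/fs/cgroup}

# Cgroup paths as they appear in the systemd-cgtop output of the demo.
for cg in test.slice/test-protected.slice \
          test.slice/test-protected.slice/precious.service \
          test.slice/test-protected.slice/victim.service; do
    f="$CGROOT/$cg/memory.events"
    # Skip cgroups that do not exist (e.g. the demo is not running).
    if [ -r "$f" ]; then
        printf '%s low=%s\n' "$cg" "$(low_events "$f")"
    fi
done
```

Sampling these counters before and after starting pressure.service would show which cgroups had low events counted during the run.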