Received: by 2002:a25:ef43:0:0:0:0:0 with SMTP id w3csp824668ybm; Wed, 27 May 2020 08:53:15 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwGAm307digt5Y3n9kAbvQfsno9ntsjE1uejkDdGM+tIowETCIrJnS+y3d0HHrZes2pRh+H X-Received: by 2002:a17:906:ae8f:: with SMTP id md15mr420415ejb.149.1590594795836; Wed, 27 May 2020 08:53:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1590594795; cv=none; d=google.com; s=arc-20160816; b=sg+hWoRItf/0Pz1xT68n5xjxfvfntpgjNExF39WjUKeHiSRWaIwuf/WAQuWj4RG7NF Bk1FVGVjmfQGr58u42P8AokP01anV0SCV4gSpm1wkxmV50rtI+WNuP+1tWDP7LaGYPi6 Do8yccgq+KAgIDxHjpD+N8XYVm8mZQfXBfQ2/ZaMbEfbonlfMyXaT/06ESsgywyLA3LN J53CsIBmfzZXznS9X3hBQ4ZXccTs8GWXNsF7gq+78MFeKH9rzsd1ewGRNu8veyGB+YSv 80aZG3UJpm9WtNVijYRL87WZoc9TyotkBPgjaoiJe2vXhtCo+JpM453mf6RrDDjPz8Hm H6xg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:ironport-sdr:ironport-sdr; bh=Y24T6Wq7DLvRP3sQ6L7wdLsCVtm4irQfQQDd4zxetHE=; b=snhtv9+zvRvfu+LYf2BMPq5gJ8BgoKFFVajwpLN61+eiVhqU9Mh0zGG1Tr1kn4LqVm DS6bn1Xm44Zfh26KiyJXwp4HA1/aWur5Y64HBV/5AC+emSgPRWOYOj0ulfhnlQ9Bo9AI ecVGY3Gehr/iinrzTgX2JigUlOnHbfTvFPm3TbVTwXQ3mlG+LbM12UjYW4qcUTmlaj3v /4PbvKEanDLmO3DSZ3rXkuHeah/yAarwcZLvmOfXcA0tE0Y1wKmmcifIyRKCK/KAX6I3 d3tZmJhQ81vFU2iycLBuVtNOm0rcSskfQ11RXvCEozKiPbM6I/1urtzOyl2Rz2VbbuJG 1BOA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ly12si2235621ejb.623.2020.05.27.08.52.52; Wed, 27 May 2020 08:53:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388215AbgE0Ndj (ORCPT + 99 others); Wed, 27 May 2020 09:33:39 -0400 Received: from mga18.intel.com ([134.134.136.126]:10478 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387650AbgE0Ndh (ORCPT ); Wed, 27 May 2020 09:33:37 -0400 IronPort-SDR: 3QBb63cQmxayhPe23+Yhhp8Dstw/DvlHo7FHXlH5uA3wlsf0aT2cJ4SUEtaTJeERzIC8Rpai/N mPPdP8mVejww== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 May 2020 06:33:37 -0700 IronPort-SDR: Sp85A4i2kD1hIJeERk2+fDoPnHvu1ieNLHpHlYnUQoRo+UJZ+SVOZlsY8FKXyMSUVQx+2yYVDX C2fHseRFb0VA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,441,1583222400"; d="scan'208";a="291594465" Received: from shbuild999.sh.intel.com (HELO localhost) ([10.239.146.107]) by fmsmga004.fm.intel.com with ESMTP; 27 May 2020 06:33:34 -0700 Date: Wed, 27 May 2020 21:33:32 +0800 From: Feng Tang To: Qian Cai Cc: Andrew Morton , Michal Hocko , Johannes Weiner , Stephen Rothwell , Matthew Wilcox , Mel Gorman , Kees Cook , Luis Chamberlain , Iurii Zaikin , andi.kleen@intel.com, tim.c.chen@intel.com, dave.hansen@intel.com, ying.huang@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 0/3] make vm_committed_as_batch aware of vm overcommit policy Message-ID: <20200527133332.GA20232@shbuild999.sh.intel.com> References: <1588922717-63697-1-git-send-email-feng.tang@intel.com> <20200521212726.GC6367@ovpn-112-192.phx2.redhat.com> <20200526181459.GD991@lca.pw> <20200527014647.GB93879@shbuild999.sh.intel.com> <20200527022539.GK991@lca.pw> <20200527104606.GE93879@shbuild999.sh.intel.com> <20200527120549.GA741@lca.pw> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200527120549.GA741@lca.pw> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Qian, On Wed, May 27, 2020 at 08:05:49AM -0400, Qian Cai wrote: > On Wed, May 27, 2020 at 06:46:06PM +0800, Feng Tang wrote: > > Hi Qian, > > > > On Tue, May 26, 2020 at 10:25:39PM -0400, Qian Cai wrote: > > > > > > > [1] https://lkml.org/lkml/2020/3/5/57 > > > > > > > > > > > > Reverted this series fixed a warning under memory pressue. > > > > > > > > > > Andrew, Stephen, can you drop this series? > > > > > > > > > > > > > > > > > [ 3319.257898] LTP: starting oom01 > > > > > > [ 3319.284417] ------------[ cut here ]------------ > > > > > > [ 3319.284439] memory commitment underflow > > > > > > > > Thanks for the catch! > > > > > > > > Could you share the info about the platform, like the CPU numbers > > > > and RAM size, and what's the mmap test size of your test program. > > > > It would be great if you can point me the link to the test program. > > > > > > I have been reproduced this on both AMD and Intel. The test just > > > allocating memory and swapping. > > > > > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/oom/oom01.c > > > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/mem/tunable/overcommit_memory.c > > > > > > It might be better to run the whole LTP mm tests if none of the above > > > triggers it for you which has quite a few memory pressurers. > > > > > > /opt/ltp/runltp -f mm > > > > Thanks for sharing. I tried to reproduce this on 2 server plaforms, > > but can't reproduce it, and they are still under testing. > > > > Meanwhile, could you help to try the below patch, which is based on > > Andi's suggestion and have some debug info. The warning is a little > > strange, as the condition is > > > > (percpu_counter_read(&vm_committed_as) < > > -(s64)vm_committed_as_batch * num_online_cpus()) > > > > while for your platform (48 CPU + 128 GB RAM), the > > '-(s64)vm_committed_as_batch * num_online_cpus()' > > is a s64 value: '-32G', which makes the condition hard to be true, > > and when it is, it could be triggered by some magic for s32/s64 > > operations around the percpu-counter. > > Here is the information on AMD and powerpc below affected by this. It > could need a bit patient to reproduce, but our usual daily CI would > trigger it eventually after a few tries. > > # git clone https://github.com/cailca/linux-mm.git > # cd linux-mm > # ./compile.sh > # systemctl reboot > # ./test.sh I just downloaded it, and it failed on my desktop machine as it failed in 'yum' and 'grub2' setup. The difficulty for me to reproduce is the test platforms are behind the 0day framework, and I can hardly setup external test suits, though I have been trying for all day today :) So if possible, please help to try the patch in my last email. thanks! - Feng