Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp1738917ybn; Thu, 26 Sep 2019 01:14:12 -0700 (PDT) X-Google-Smtp-Source: APXvYqze60mp4gSS9OreRVR4UtHyLpd3KdCxbWnG5efUTYWfvoF7JGRLyLSLzXHNLgBeIasHcxvp X-Received: by 2002:a17:906:2782:: with SMTP id j2mr1917952ejc.203.1569485652874; Thu, 26 Sep 2019 01:14:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569485652; cv=none; d=google.com; s=arc-20160816; b=WCYmrULOpCvZeto/Aw/2LRrMwQz4tsnPrlpZnk05MFWCXKZ0YwNpwvOTTvkmEv/Qqn EFPz7wfhbyvPu1x2ZLgZ0kqg26T2CFpUdnPjnuk7OXEL97DgxxlX455EhzkyAMMomsIs ReYXrwW0cOCytdf7fVsALGJQ7s0aozSLIHhBSqp0cfBKWjCp5zPY9t1hr6Oe3SswvSVM 1HZQf46KcKy/Hl04mkHqzWO9VPR7Jq6LWtkgzVwdWMpQfuy7LrJElYamjR4GT82tN8KX 0Z0QrMagqNq3esLSXFZjCWT8pxZi9jPIwik/SH000AgOW/oLU9P6YLfFFGgGaigIDLhY cMpg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:organization:from:references:cc:to:subject :dkim-signature; bh=AkqHICb3Rv9Y4nE27Lsc42IJ4riOSFUvurUNqZZ440Y=; b=QEHm267ADAt0bofI8frvqKe8wN2xa0RiQNurMYILcV5RDeXujA9v8cNUvfJFd00Hz/ kv2epceKKONO/TE5a5nPiXXUMkk03G/u1x7BDmGFp8W5lCkeaL4MC2FpcUNEhN1gui3z 2eFL/qyvT0ecY+buZJx8YoBcOvt5Y4CFxvXZPxjhAEOxI8Z15pmNxa85XstUPJMSxZQS yW0vnscf3qkNcs3I89pI0gQ7UG/iD1pZXlQ5+j2Y2dDrPis3B7M/ELTS1WzIg04zIH5z /38S4E/d6Yma/zzOoO1c5Kwm4KI8gqyU3H7CcmNK7a7PGDlHku4NePII+SB6tn3u4BEb MEfg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2019-08-05 header.b=Xedxqpcm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p28si730875ejn.88.2019.09.26.01.13.49; Thu, 26 Sep 2019 01:14:12 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2019-08-05 header.b=Xedxqpcm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2441253AbfIXOMA (ORCPT + 99 others); Tue, 24 Sep 2019 10:12:00 -0400 Received: from aserp2120.oracle.com ([141.146.126.78]:45938 "EHLO aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730491AbfIXOMA (ORCPT ); Tue, 24 Sep 2019 10:12:00 -0400 Received: from pps.filterd (aserp2120.oracle.com [127.0.0.1]) by aserp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x8OE9thY048021; Tue, 24 Sep 2019 14:11:23 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : to : cc : references : from : message-id : date : mime-version : in-reply-to : content-type : content-transfer-encoding; s=corp-2019-08-05; bh=AkqHICb3Rv9Y4nE27Lsc42IJ4riOSFUvurUNqZZ440Y=; b=XedxqpcmhIQo9c73WDaxFGaxa0Bp0wmxoyhW99yubHUWE8nNv1MStzP8D0+7Q6FPsxpH noBDVZw2bRaxzHk616czB4vPFgRxOEPGMHk9rEYLlVk33f6z0Y/HOnFXSu2+8MnHel2W Hx6Leu3at1OyPw7gDQb/cxOk05tonHy5u3bwPRq9fxj7qoEQlBsafzw+riz4Yj42ZVZC kj8Y+n7I9JNb3OuoLBW1RFb027bUH/Gfroq9Ve6cGlht/jX620dTZCXs2TH+NXkObaNK RpOAIdUg4RiXeP8en+RPu5/1jsEl8tXl5vB5ub0kRnHcpa1XedfD3ABM8k5nDDUIHXGI rw== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by aserp2120.oracle.com with ESMTP id 2v5btpxcb2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 24 Sep 2019 14:11:23 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x8OEBBLq022074; Tue, 24 Sep 2019 14:11:22 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3020.oracle.com with ESMTP id 2v6yvns3du-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 24 Sep 2019 14:11:22 +0000 Received: from abhmp0011.oracle.com (abhmp0011.oracle.com [141.146.116.17]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x8OEBBVj026601; Tue, 24 Sep 2019 14:11:11 GMT Received: from [10.154.189.242] (/10.154.189.242) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 24 Sep 2019 07:11:11 -0700 Subject: Re: [RFC] mm: Proactive compaction To: Vlastimil Babka , Nitin Gupta , "dan.j.williams@intel.com" , "mhocko@suse.com" , "mgorman@techsingularity.net" , "akpm@linux-foundation.org" Cc: "cai@lca.pw" , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "gregkh@linuxfoundation.org" , "aryabinin@virtuozzo.com" , "jannh@google.com" , "guro@fb.com" , "hannes@cmpxchg.org" , "keescook@chromium.org" , "yuzhao@google.com" , "arunks@codeaurora.org" , "willy@infradead.org" , "janne.huttunen@nokia.com" , "khlebnikov@yandex-team.ru" References: <20190816214413.15006-1-nigupta@nvidia.com> <87634ddc-8bfd-8311-46c4-35f7dc32d42f@suse.cz> <7bbd5322ed7a7fcb349c83952f8fc17448cd07d8.camel@nvidia.com> <71d7fba0-bd6f-3ac5-1fd8-9a8ff6fc6b8b@suse.cz> From: Khalid Aziz Organization: Oracle Corp Message-ID: Date: Tue, 24 Sep 2019 08:11:09 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <71d7fba0-bd6f-3ac5-1fd8-9a8ff6fc6b8b@suse.cz> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9389 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=2 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1908290000 definitions=main-1909240140 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9389 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1908290000 definitions=main-1909240140 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 9/24/19 7:39 AM, Vlastimil Babka wrote: > On 9/20/19 1:37 AM, Nitin Gupta wrote: >> On Tue, 2019-08-20 at 10:46 +0200, Vlastimil Babka wrote: >>> >>> That's a lot of control knobs - how is an admin supposed to tune them= to >>> their >>> needs? >> >> >> Yes, it's difficult for an admin to get so many tunable right unless >> targeting a very specific workload. >> >> How about a simpler solution where we exposed just one tunable per-nod= e: >> /sys/.../node-x/compaction_effort >> which accepts [0, 100] >> >> This parallels /proc/sys/vm/swappiness but for compaction. With this >> single number, we can estimate per-order [low, high] watermarks for ex= ternal >> fragmentation like this: >> - For now, map this range to [low, medium, high] which correponds to = specific >> low, high thresholds for extfrag. >> - Apply more relaxed thresholds for higher-order than for lower order= s. >> >> With this single tunable we remove the burden of setting per-order exp= licit >> [low, high] thresholds and it should be easier to experiment with. >=20 > What about instead autotuning by the numbers of allocations hitting > direct compaction recently? IIRC there were attempts in the past (mysel= f > included) and recently Khalid's that was quite elaborated. >=20 I do think the right way forward with this longstanding problem is to take the burden of managing free memory away from end user and let the kernel autotune itself to the demands of workload. We can start with a simpler algorithm in the kernel that adapts to workload and refine it as we move forward. As long as initial implementation performs at least as well as current free page management, we have a workable path for improvements. I am moving the implementation I put together in kernel to a userspace daemon just to test it out on larger variety of workloads. It is more limited in userspace with limited access to statistics the algorithm needs to perform trend analysis so I would rather be doing this in the kernel. -- Khalid