From: Ivan Teterevkov
To: David Rientjes
CC: "corbet@lwn.net", "akpm@linux-foundation.org", "mchehab+samsung@kernel.org", "tglx@linutronix.de", "jpoimboe@redhat.com", "pawan.kumar.gupta@linux.intel.com", "jgross@suse.com", "oneukum@suse.com", "linux-doc@vger.kernel.org", "linux-kernel@vger.kernel.org", "linux-mm@kvack.org"
Subject: RE: [PATCH] mm/vmscan: add vm_swappiness configuration knobs
Date: Thu, 12 Mar 2020 12:48:22 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 11 Mar 2020, David Rientjes wrote:
> On Wed, 11 Mar 2020, Ivan Teterevkov wrote:
>
> > This patch adds a couple of knobs:
> >
> > - The configuration option (CONFIG_VM_SWAPPINESS).
> > - The command line parameter (vm_swappiness).
> >
> > The default value is preserved, but now defined by CONFIG_VM_SWAPPINESS.
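
For readers who want to try the range check outside the kernel, the command line knob amounts to a bounded integer parse. Below is a minimal user-space sketch of that logic. It assumes strtol() in place of the kernel's kstrtoint(), so corner cases such as leading whitespace may be treated differently, and parse_swappiness() is a hypothetical name, not something the patch defines.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-in for the proposed "vm_swappiness="
 * handler: accept a decimal value in 0..100, reject everything else.
 * On failure *out is left untouched, so the previous default survives.
 */
static int parse_swappiness(const char *str, int *out)
{
	char *end;
	long val;

	assert(out != NULL);

	if (!str || *str == '\0')
		return -EINVAL;

	errno = 0;
	val = strtol(str, &end, 10);	/* assumption: strtol, not kstrtoint */
	if (errno || *end != '\0')
		return -EINVAL;		/* overflow or trailing garbage */

	if (val < 0 || val > 100)
		return -EINVAL;		/* outside the documented range */

	*out = (int)val;
	return 0;
}
```

With a variable initialised to 60, passing "10" updates it to 10, while "101" or "abc" returns -EINVAL and leaves the value as it was.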
> >
> > Historically, the default swappiness is set to the well-known value 60,
> > and this works well for the majority of cases. The vm_swappiness is also
> > exposed as the kernel parameter that can be changed at runtime too, e.g.
> > with sysctl.
> >
> > This approach might not suit well some configurations, e.g. systemd-based
> > distros, where systemd is put in charge of the cgroup controllers,
> > including the memory one. In such cases, the default swappiness 60
> > is copied across the cgroup subtrees early at startup, when systemd
> > is arranging the slices for its services, before the sysctl.conf
> > or tmpfiles.d/*.conf changes are applied.
> >
>
> Seems like something that can be fully handled by an initscript that would
> set the sysctl and then iterate the memcg hierarchy propagating the
> non-default value. I don't think that's too much of an ask if userspace
> wants to manipulate the swappiness value.
>

This is exactly what I'm trying to avoid: in some distros there is no way
to tackle the configuration early enough. For example, in systemd-based
systems, systemd is the process that starts first and arranges the memcg
hierarchy the way it is configured, but unfortunately it doesn't offer
a swappiness knob.

There could be a script to iterate the memcg hierarchy later, but it would
race with the system entity that's put in charge of the memcg, because the
configuration can't be changed atomically. For example, such a script could
be iterating the memcg tree and updating each memory.swappiness while
systemd is creating another slice or scope subtree.

> Or maybe we can be more clever: have memcg->swappiness store -1 by default
> unless it is changed by the user explicitly and then have
> mem_cgroup_swappiness() return vm_swappiness for this value. If the user
> overwrites it, it's intended.
>

Does it mean that -1 would become a reference to the vm_swappiness or
the parent's memory.swappiness?
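
To make the question concrete, here is a minimal user-space sketch of one possible reading of the suggestion, in which -1 refers to the global value rather than to the parent's. All names in it (SWAPPINESS_UNSET, struct memcg_sketch, vm_swappiness_sketch, effective_swappiness) are hypothetical stand-ins for illustration, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel meaning "never written by the user". */
#define SWAPPINESS_UNSET (-1)

/* Stand-in for the global vm_swappiness. */
static int vm_swappiness_sketch = 60;

/* Stand-in for the per-cgroup state; only the field of interest. */
struct memcg_sketch {
	int swappiness;	/* SWAPPINESS_UNSET or an explicit 0..100 value */
};

/* Resolve the value a group would use under this reading. */
static int effective_swappiness(const struct memcg_sketch *memcg)
{
	assert(memcg != NULL);

	if (memcg->swappiness == SWAPPINESS_UNSET)
		return vm_swappiness_sketch;	/* follow the global value */

	return memcg->swappiness;		/* explicit override wins */
}
```

In this sketch a group left at -1 picks up any later change to the global value, while a group with an explicit value keeps it.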
It sounds interesting, and if so it would address my issues with
the swappiness, but it would also change the existing memcg behaviour:
if the referred-to value changed, would the memory.swappiness backed
by -1 also change?

> So there are a couple options here but I don't think one of them is to add
> a new config option or kernel command line option.
>

The vm_swappiness starts its lifespan in the kernel, so why not
provide a simple "constructor" for it there?

> > One could run a script to traverse the cgroup trees later and set the
> > desired memory.swappiness individually in each occurrence when the runtime
> > is set up, but this would require some amount of work to implement
> > properly. Instead, why not set the default swappiness as early as possible?
> >
> > Signed-off-by: Ivan Teterevkov
> > ---
> >  .../admin-guide/kernel-parameters.txt | 4 ++++
> >  mm/Kconfig                            | 10 ++++++++
> >  mm/vmscan.c                           | 24 ++++++++++++++++++-
> >  3 files changed, 37 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index c07815d230bc..5d54a4303522 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -5317,6 +5317,10 @@
> >  			P	Enable page structure init time poisoning
> >  			-	Disable all of the above options
> >
> > +	vm_swappiness=	[KNL]
> > +			Sets the default vm_swappiness.
> > +			Ranges from 0 to 100, the default value is 60.
> > +
> >  	vmalloc=nn[KMG]	[KNL,BOOT] Forces the vmalloc area to have an exact
> >  			size of <nn>. This can be used to increase the
> >  			minimum size (128MB on x86). It can also be used to
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index ab80933be65f..ec59c19e578e 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -739,4 +739,14 @@ config ARCH_HAS_HUGEPD
> >  config MAPPING_DIRTY_HELPERS
> >  	bool
> >
> > +config VM_SWAPPINESS
> > +	int "Default memory swappiness"
> > +	default 60
> > +	range 0 100
> > +	help
> > +	  Sets the default vm_swappiness, that could be changed later
> > +	  in the runtime, e.g. kernel command line, sysctl, etc.
> > +
> > +	  Higher value means more swappy. Historically, defaults to 60.
> > +
> >  endmenu
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 876370565455..7d2d3550f698 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -163,7 +163,29 @@ struct scan_control {
> >  /*
> >   * From 0 .. 100. Higher means more swappy.
> >   */
> > -int vm_swappiness = 60;
> > +int vm_swappiness = CONFIG_VM_SWAPPINESS;
> > +
> > +static int __init swappiness_cmdline(char *str) {
> > +	int val, err;
> > +
> > +	if (!str)
> > +		return -EINVAL;
> > +
> > +	err = kstrtoint(str, 10, &val);
> > +	if (err)
> > +		return -EINVAL;
> > +
> > +	if (val < 0 || val > 100)
> > +		return -EINVAL;
> > +
> > +	vm_swappiness = val;
> > +
> > +	return 0;
> > +}
> > +
> > +early_param("vm_swappiness", swappiness_cmdline);
> > +
> >  /*
> >   * The total number of pages which are beyond the high watermark within all
> >   * zones.