Subject: Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2
From: Kamezawa Hiroyuki
To: Michal Hocko, Vladimir Davydov
Cc: Andrew Morton, Johannes Weiner, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Date: Wed, 16 Dec 2015 12:57:59 +0900
Message-ID: <5670E147.8060203@jp.fujitsu.com>
In-Reply-To: <20151215172127.GC27880@dhcp22.suse.cz>

On 2015/12/16 2:21, Michal Hocko wrote:
> I completely agree that malicious/untrusted users absolutely have to
> be capped by the hard limit. Then the separate swap limit would work
> for sure. But I am less convinced about usefulness of the rigid (to
> the global memory pressure) swap limit without the hard limit. All the
> memory that could have been swapped out will make a memory pressure to
> the rest of the system without being punished for it too much.
> Memcg is allowed to grow over the high limit (in the current implementation)
> without any way to shrink back in other words.
>
> My understanding was that the primary use case for the swap limit is to
> handle potential (not only malicious but also unexpectedly misbehaving
> application) anon memory consumption runaways more gracefully without
> the massive disruption on the global level. I simply didn't see swap
> space partitioning as important enough because an alternative to swap
> usage is to consume primary memory which is a more precious resource
> IMO. Swap storage is really cheap and runtime expandable resource which
> is not the case for the primary memory in general. Maybe there are other
> use cases I am not aware of, though. Do you want to guarantee the swap
> availability?
>

At the time of the first implementation, people from NEC explained their
use case in the HPC area. At that time, there was no swap support.

Consider two workloads partitioned into groups A and B, with 100GB of
total swap:

  A: memory.limit = 40G
  B: memory.limit = 40G

The job scheduler runs applications in A and B in turn. Apps in A are
stopped while apps in B are running.

If App-A requires 120GB of anonymous memory, it uses 80GB of swap.
So App-B can use only 20GB of swap. This causes trouble if App-B needs
100GB of anonymous memory, because with a 40G limit it would need 60GB
of swap.

They need some knob to control the amount of swap per cgroup.

The point is that, at least for their customer, swap is a "resource"
which should be under control. In their use case, memory usage and swap
usage have the same meaning, so a mem+swap limit doesn't cause trouble.

Thanks,
-Kame
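
The swap arithmetic in the A/B example above can be sketched as follows.
This is only an illustration of the numbers, not kernel code. The helper
name swap_needed_gb and the simplifying assumption that every gigabyte of
anonymous memory beyond memory.limit ends up in swap are introduced here
for the example.

```python
# Illustrative arithmetic for the A/B example; not kernel code.
TOTAL_SWAP_GB = 100      # total swap on the machine
MEMORY_LIMIT_GB = 40     # memory.limit for both group A and group B


def swap_needed_gb(anon_gb, limit_gb=MEMORY_LIMIT_GB):
    """Swap a group needs, assuming (hypothetically) that all anonymous
    memory above the group's memory limit is swapped out."""
    return max(0, anon_gb - limit_gb)


swap_a = swap_needed_gb(120)               # App-A: 120GB anon, 40GB resident
swap_left = TOTAL_SWAP_GB - swap_a         # swap remaining for group B
swap_b = swap_needed_gb(100)               # App-B: 100GB anon, 40GB resident
shortfall = max(0, swap_b - swap_left)     # swap App-B cannot get

print(f"A uses {swap_a}GB swap, leaving {swap_left}GB")
print(f"B needs {swap_b}GB swap, short by {shortfall}GB")
```

With no per-cgroup swap control, nothing stops group A from taking the
80GB, which is why B comes up short in this scenario.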