Date: Wed, 9 Jul 2014 11:52:52 +0400
From: Vladimir Davydov
CC: Andrew Morton, Tejun Heo, Li Zefan, Johannes Weiner, Michal Hocko,
    Mel Gorman, Rik van Riel, "Kirill A. Shutemov", Hugh Dickins,
    David Rientjes, Pavel Emelyanov, Balbir Singh
Subject: Re: [PATCH RFC 0/5] Virtual Memory Resource Controller for cgroups
Message-ID: <20140709075252.GB31067@esperanza>

On Thu, Jul 03, 2014 at 04:48:16PM +0400, Vladimir Davydov wrote:
> Hi,
>
> Typically, when a process calls mmap, it isn't given all the memory pages it
> requested immediately. Instead, only its address space is grown, while the
> memory pages are actually allocated on first use. If the system fails to
> allocate a page, it has no choice but to invoke the OOM killer, which may
> kill this or any other process. Obviously, this isn't the best way of
> telling the user that the system is unable to handle the request. It would
> be much better to fail mmap with ENOMEM instead.
>
> That's why Linux has the memory overcommit control feature, which accounts
> and limits the VM size that may contribute to mem+swap, i.e. private
> writable mappings and shared memory areas. However, currently it's only
> available system-wide, and there's no way of avoiding OOM in cgroups.
>
> This patch set is an attempt to fill the gap. It implements a resource
> controller for cgroups that accounts and limits address space allocations
> that may contribute to mem+swap.
>
> The interface is similar to that of the memory cgroup, except that it
> controls virtual memory usage, not actual memory allocation:
>
>   vm.usage_in_bytes      current vm usage of processes inside cgroup
>                          (read-only)
>
>   vm.max_usage_in_bytes  max vm.usage_in_bytes, can be reset by writing 0
>
>   vm.limit_in_bytes      vm.usage_in_bytes must be <= vm.limit_in_bytes;
>                          allocations that hit the limit fail with ENOMEM
>
>   vm.failcnt             number of times the limit was hit, can be reset
>                          by writing 0
>
> In the future, the controller can be easily extended to account for locked
> pages and shmem.

Any thoughts on this?

Thanks.
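
For illustration, the semantics of the four vm.* files described above can be
modelled in user space roughly as follows. This is only a sketch and not code
from the patch set: the class name VmCounter and its methods are hypothetical,
and the choice to reset vm.max_usage_in_bytes to the current usage (rather
than to zero) is an assumption borrowed from how the memory cgroup counters
behave.

```python
import errno


class VmCounter:
    """Toy model of the vm.* files; hypothetical, not the kernel code."""

    def __init__(self, limit_in_bytes):
        self.limit_in_bytes = limit_in_bytes  # vm.limit_in_bytes
        self.usage_in_bytes = 0               # vm.usage_in_bytes
        self.max_usage_in_bytes = 0           # vm.max_usage_in_bytes
        self.failcnt = 0                      # vm.failcnt

    def charge(self, nbytes):
        """Account an address space allocation; return 0 or -ENOMEM."""
        if self.usage_in_bytes + nbytes > self.limit_in_bytes:
            # The limit was hit: count the failure and charge nothing.
            self.failcnt += 1
            return -errno.ENOMEM
        self.usage_in_bytes += nbytes
        self.max_usage_in_bytes = max(self.max_usage_in_bytes,
                                      self.usage_in_bytes)
        return 0

    def uncharge(self, nbytes):
        """Release a previously charged range (e.g. on munmap)."""
        self.usage_in_bytes -= min(nbytes, self.usage_in_bytes)

    def reset_max_usage(self):
        """Model writing 0 to vm.max_usage_in_bytes (assumed semantics)."""
        self.max_usage_in_bytes = self.usage_in_bytes

    def reset_failcnt(self):
        """Model writing 0 to vm.failcnt."""
        self.failcnt = 0
```

The point of the model is that a request that would exceed the limit is
rejected up front with ENOMEM at charge time, instead of succeeding and
leading to an OOM kill later when the pages are first touched.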