Subject: Re: Cgroups "pids" controller does not update "pids.current" count immediately
From: Ivan Zahariev
To: Tejun Heo
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Fri, 15 Jun 2018 20:40:02 +0300
Message-ID: <6c2c9bfb-3175-b9ec-cf39-c9d4ebf654b2@icdsoft.com>
In-Reply-To: <20180615161647.GW1351649@devbig577.frc2.facebook.com>
References: <77af3805-e912-2664-f347-e30c0919d0c4@icdsoft.com> <20180614150650.GU1351649@devbig577.frc2.facebook.com> <7860105c-553a-534b-57fc-222d931cb972@icdsoft.com> <20180615154140.GV1351649@devbig577.frc2.facebook.com> <1d635d1d-6152-ecfc-d235-147ff1fe7c95@icdsoft.com> <20180615161647.GW1351649@devbig577.frc2.facebook.com>
Sender: 
linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On 15.6.2018 at 19:16, Tejun Heo wrote:
> On Fri, Jun 15, 2018 at 07:07:27PM +0300, Ivan Zahariev wrote:
>> I understand all concerns and design decisions. However, having
>> RLIMIT_NPROC support combined with "cgroups" hierarchy would be very
>> handy.
>>
>> Does it make sense that you introduce "nproc.current" and
>> "nproc.max" metrics which work in the same atomic, real-time way
>> like RLIMIT_NPROC? Or make this in a new "nproc" controller?
> I'm skeptical for two reasons.
>
> 1. That doesn't sound much like a resource control problem but more of
>    a policy enforcement problem.
>
> 2. and it's difficult to see why such policies would need to be that
>    strict. Where is the requirement coming from?
>

The combination of lazy pids accounting and modern fast CPUs makes the "pids.current" metric practically unusable for resource limiting in our case. In a test where we started and ended a single process very quickly, we saw "pids.current" reach values as high as 185, while the correct value at any time is either 0 or 1. If we want a "cgroup" to be able to spawn at most 50 processes, we have to set "pids.max" to some high value like 300 in order to compensate for the pids uncharge lag, and the right value depends on the speed of the CPU and how busy the system is.

Our use case is a shared web hosting service. Our customers start a CGI process for each PHP web request, so processes start and end at a very high rate. We don't want customers to be able to launch too many CGI processes (NPROC limit) because this exhausts the web and database servers, and probably strains Linux kernel resources (like the total number of open files per user). Furthermore, some users are malicious and launch fork bombs and other resource-exhaustion attacks.

You may be right that we enforce a policy rather than resource control. This has worked for us for 15+ years now.
The motivation is that a global RLIMIT_NPROC easily lets us limit all system and Linux kernel resources "per customer" ("cgroups" allows us to limit only certain system resources). Additionally, not all user-space daemons allow for a granular "per user" limit or proper grouping (for example, MySQL has only users and no support for "per customer" groups).

Now we want to have different "cgroups" hierarchies for a customer (SSH, CGI, Crond), each with its own RLIMIT_NPROC, and a total RLIMIT_NPROC for the parent "per customer" cgroup.

Excuse me for the lengthy post :-)

-- 
Ivan
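
For anyone who wants to reproduce the "pids.current" lag described above, here is a minimal sketch in Python. It is not the exact test we ran. The function names and the cgroup path in the usage note are hypothetical, and it assumes the "pids" controller is mounted and the cgroup directory already exists. It spawns short-lived processes one at a time and records the highest "pids.current" value read in between.

```python
import subprocess
from pathlib import Path


def read_pids_current(cgroup_dir):
    """Return the integer stored in <cgroup_dir>/pids.current."""
    return int((Path(cgroup_dir) / "pids.current").read_text().strip())


def peak_pids_current(cgroup_dir, spawn, iterations=1000):
    """Call spawn() repeatedly; return the highest pids.current observed.

    spawn() is expected to start one process charged to the cgroup and
    return only after that process has exited, so any value above the
    baseline reflects pids that have not been uncharged yet.
    """
    peak = 0
    for _ in range(iterations):
        spawn()
        peak = max(peak, read_pids_current(cgroup_dir))
    return peak


def spawn_short_lived():
    """Start one trivial process and wait for it to exit."""
    subprocess.run(["/bin/true"], check=True)
```

To use it, first move the calling process into the cgroup (for example by writing its PID to "cgroup.procs"), then call peak_pids_current("/sys/fs/cgroup/pids/test", spawn_short_lived), where the path is only a placeholder. Note that the calling process itself is counted in that case, so the baseline is 1 rather than 0.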