From: Waiman Long
To: Andrew Morton, Johannes Weiner, Michal Hocko, Vladimir Davydov,
    Jonathan Corbet, Alexey Dobriyan, Ingo Molnar, Peter Zijlstra,
    Juri Lelli, Vincent Guittot
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org, Waiman Long
Subject: [RFC PATCH 0/8] memcg: Enable fine-grained per process memory control
Date: Mon, 17 Aug 2020 10:08:23 -0400
Message-Id: <20200817140831.30260-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The memory controller can be used to control and limit the amount of
physical memory used by a task. When a limit is set in "memory.high" in
a v2 non-root memory cgroup, the memory controller will try to reclaim
memory if the limit has been exceeded. Normally, that will be enough to
keep the physical memory consumption of tasks in the memory cgroup
around or below the "memory.high" limit.

Sometimes, memory reclaim may not be able to recover memory at a rate
that can keep up with the physical memory allocation rate. In this
case, the physical memory consumption will keep on increasing. When it
reaches "memory.max" for memory cgroup v2 or when the system is running
out of free memory, the OOM killer will be invoked to kill some tasks
to free up additional memory. However, one has little control over
which tasks are going to be killed by the OOM killer. Killing tasks
that hold some important resources without freeing them first can
create other system problems down the road.
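
To make the two thresholds above concrete, here is a minimal userspace
sketch in Python that reads "memory.current", "memory.high" and
"memory.max" from a cgroup v2 directory and reports where the group
stands. It only observes the existing cgroup v2 interface files and is
not part of this patchset. The helper names (parse_limit,
read_memory_state, is_over_high, headroom_to_max) are made up for
illustration, and the cgroup path is an assumption.

```python
from pathlib import Path
from typing import Optional


def parse_limit(text: str) -> Optional[int]:
    """Parse a cgroup v2 limit value; the literal "max" means no limit."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_memory_state(cgroup_dir: str) -> dict:
    """Read usage and limits (in bytes) from a cgroup v2 directory."""
    base = Path(cgroup_dir)
    return {
        "current": int((base / "memory.current").read_text()),
        "high": parse_limit((base / "memory.high").read_text()),
        "max": parse_limit((base / "memory.max").read_text()),
    }


def is_over_high(state: dict) -> bool:
    """True if usage exceeds memory.high, i.e. reclaim is being attempted."""
    high = state["high"]
    return high is not None and state["current"] > high


def headroom_to_max(state: dict) -> Optional[int]:
    """Bytes left before memory.max is reached, or None if unlimited."""
    limit = state["max"]
    return None if limit is None else max(limit - state["current"], 0)
```

For example, read_memory_state("/sys/fs/cgroup/mygroup"), where
"mygroup" is a hypothetical non-root cgroup, returns a dict that can be
passed to is_over_high() and headroom_to_max(). A group that stays over
"memory.high" while its headroom keeps shrinking is in the situation
described above, where reclaim is not keeping up with allocation.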

Users who do not want the OOM killer to kill random tasks in an
out-of-memory situation can use the memory control facility provided by
this new patchset via prctl(2) to better manage the mitigation action
that is performed on individual tasks when the specified memory limit
is exceeded, with memory cgroup v2 being used.

The currently supported mitigation actions include the following:

 1) Return ENOMEM for some syscalls that allocate or handle memory
 2) Slow down the process for memory reclaim to catch up
 3) Send a specific signal to the task
 4) Kill the task

Users who want better memory control for their applications can either
modify their applications to call the prctl(2) syscall directly with
the new memory control command code or write the desired action to the
newly provided memctl procfs files of their applications, provided that
those applications run in a non-root v2 memory cgroup.

Waiman Long (8):
  memcg: Enable fine-grained control of over memory.high action
  memcg, mm: Return ENOMEM or delay if memcg_over_limit
  memcg: Allow the use of task RSS memory as over-high action trigger
  fs/proc: Support a new procfs memctl file
  memcg: Allow direct per-task memory limit checking
  memcg: Introduce additional memory control slowdown if needed
  memcg: Enable logging of memory control mitigation action
  memcg: Add over-high action prctl() documentation

 Documentation/userspace-api/index.rst      |   1 +
 Documentation/userspace-api/memcontrol.rst | 174 ++++++++++++++++
 fs/proc/base.c                             | 109 ++++++++++
 include/linux/memcontrol.h                 |   4 +
 include/linux/sched.h                      |  24 +++
 include/uapi/linux/prctl.h                 |  48 +++++
 kernel/fork.c                              |   1 +
 kernel/sys.c                               |  16 ++
 mm/memcontrol.c                            | 227 +++++++++++++++++++++
 mm/mlock.c                                 |   6 +
 mm/mmap.c                                  |  12 ++
 mm/mprotect.c                              |   3 +
 12 files changed, 625 insertions(+)
 create mode 100644 Documentation/userspace-api/memcontrol.rst

-- 
2.18.1