Date: Wed, 10 Apr 2019 18:43:51 -0700
Message-Id: <20190411014353.113252-1-surenb@google.com>
Subject: [RFC 0/2] opportunistic memory reclaim of a killed process
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: mhocko@suse.com, rientjes@google.com, willy@infradead.org,
    yuzhoujian@didichuxing.com, jrdr.linux@gmail.com, guro@fb.com,
    hannes@cmpxchg.org, penguin-kernel@I-love.SAKURA.ne.jp,
    ebiederm@xmission.com, shakeelb@google.com, christian@brauner.io,
    minchan@kernel.org, timmurray@google.com, dancol@google.com,
    joel@joelfernandes.org, jannh@google.com, surenb@google.com,
    linux-mm@kvack.org, lsf-pc@lists.linux-foundation.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com
X-Mailing-List: linux-kernel@vger.kernel.org

The time to kill a process and free its memory can be critical when the
killing was done to prevent memory shortages affecting system
responsiveness.

In the case of Android, where processes can be restarted easily, killing
a less important background process is preferred to delaying or
throttling an interactive foreground process. At the same time
unnecessary kills should be avoided as they cause delays when the killed
process is needed again.
This requires a balanced decision from the system software about how
long a kill can be postponed in the hope that memory usage will decrease
without such drastic measures.

As killing a process and reclaiming its memory is not an instant
operation, a margin of free memory has to be maintained to prevent
system performance deterioration while the memory of the killed process
is being reclaimed. The size of this margin depends on the minimum
reclaim rate, so as to cover the worst-case scenario, and this minimum
rate should be deterministic.

Note that on asymmetric architectures like ARM big.LITTLE the reclaim
rate can vary dramatically depending on which core it is performed on
(see test results). It is common for a non-essential victim process to
be restricted to a less performant or throttled CPU for power saving
purposes. This makes the worst-case reclaim rate scenario very probable.

Cases where the victim’s memory reclaim is delayed further, because the
process is blocked in an uninterruptible sleep or is performing a
time-consuming operation, make the reclaim time even more unpredictable.

Increasing the memory reclaim rate and making it more deterministic
would allow for a smaller free memory margin and would lead to more
opportunities to avoid killing a process.

Note that while other strategies like throttling memory allocations are
viable and can be employed for other non-essential processes, they would
affect user experience if applied to an interactive process.

The proposed solution uses the existing oom-reaper thread to increase
the memory reclaim rate of a killed process and to make this rate more
deterministic. The proposed solution is by no means considered the best;
it was chosen because it was simple to implement and allowed for test
data collection. The downside of this solution is that it requires an
additional “expedite” hint for something which has to be fast in all
cases.
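
To make the relationship between the free memory margin and the minimum
reclaim rate concrete, here is a rough back-of-the-envelope sketch in
Python. It is only an illustration under a deliberately simple model
that is not part of the patches: other tasks keep allocating at a
constant rate while the victim is reclaimed, and none of the victim's
memory is counted as available until reclaim completes. The victim size
(300 MB) and the allocation rate (500 MB/sec) are hypothetical numbers;
the two reclaim rates are the minimum speeds reported in the test
results.

```python
def reclaim_time_s(victim_mb: float, reclaim_rate_mb_s: float) -> float:
    """Seconds needed to reclaim the victim at a constant reclaim rate."""
    if reclaim_rate_mb_s <= 0:
        raise ValueError("reclaim rate must be positive")
    return victim_mb / reclaim_rate_mb_s


def required_margin_mb(victim_mb: float,
                       reclaim_rate_mb_s: float,
                       alloc_rate_mb_s: float) -> float:
    """Free memory consumed by other allocations while reclaim is running.

    Simplified model (an assumption, not the kernel's behaviour): the
    allocation rate is constant and freed memory only becomes usable
    once the whole victim has been reclaimed.
    """
    return alloc_rate_mb_s * reclaim_time_s(victim_mb, reclaim_rate_mb_s)


if __name__ == "__main__":
    VICTIM_MB = 300.0          # hypothetical victim size
    ALLOC_RATE_MB_S = 500.0    # hypothetical allocation rate of other tasks

    # Minimum reclaim speeds taken from the test results in this letter.
    worst_case_rates = {"normal kill": 856.0, "expedited kill": 3236.0}

    for label, rate in worst_case_rates.items():
        margin = required_margin_mb(VICTIM_MB, rate, ALLOC_RATE_MB_S)
        print(f"{label}: reclaim takes "
              f"{reclaim_time_s(VICTIM_MB, rate) * 1000:.0f} ms, "
              f"margin needed is about {margin:.0f} MB")
```

In this model the required margin is inversely proportional to the
worst-case reclaim rate, so raising the minimum rate shrinks the margin
by the same factor regardless of the hypothetical inputs chosen.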
It would be great to find a way that does not require additional hints.
Other possible approaches include:
- Implementing a dedicated syscall to perform opportunistic reclaim in
  the context of the process waiting for the victim’s death. A natural
  boost occurs if the waiting process has high or RT priority and is not
  limited by a cpuset cgroup in its CPU choices.
- Implementing a mechanism that would perform opportunistic reclaim
  unconditionally whenever it is possible (similar to the checks in
  task_will_free_mem()).
- Implementing opportunistic reclaim that uses the shrinker interface,
  PSI or other memory pressure indications as a hint to engage.

Test details:
Tests were performed on a Qualcomm® Snapdragon™ 845 8-core ARM
big.LITTLE system with 4 little cores (0.3-1.6GHz) and 4 big cores
(0.8-2.5GHz) running Android.
Memory reclaim speed was measured using signal/signal_generate,
kmem/rss_stat and sched/sched_process_exit traces.

Test results:
powersave governor, min freq
                        normal kills      expedited kills
        little          856 MB/sec        3236 MB/sec
        big             5084 MB/sec       6144 MB/sec

performance governor, max freq
                        normal kills      expedited kills
        little          5602 MB/sec       8144 MB/sec
        big             14656 MB/sec      12398 MB/sec

schedutil governor (default)
                        normal kills      expedited kills
        little          2386 MB/sec       3908 MB/sec
        big             7282 MB/sec       6820-16386 MB/sec
=================================================================
min reclaim speed:      856 MB/sec        3236 MB/sec

The patches are based on 5.1-rc1.

Suren Baghdasaryan (2):
  mm: oom: expose expedite_reclaim to use oom_reaper outside of
    oom_kill.c
  signal: extend pidfd_send_signal() to allow expedited process killing

 include/linux/oom.h          |  1 +
 include/linux/sched/signal.h |  3 ++-
 include/linux/signal.h       | 11 ++++++++++-
 ipc/mqueue.c                 |  2 +-
 kernel/signal.c              | 37 ++++++++++++++++++++++++++++--------
 kernel/time/itimer.c         |  2 +-
 mm/oom_kill.c                | 15 +++++++++++++++
 7 files changed, 59 insertions(+), 12 deletions(-)

-- 
2.21.0.392.gf8f6787159e-goog