From: Shakeel Butt
Date: Fri, 23 Jul 2021 10:00:26 -0700
Subject: Re: [PATCH v3 1/2] mm: introduce process_mrelease system call
To: Suren Baghdasaryan
Cc: Michal Hocko, Andrew Morton, David Rientjes, Matthew Wilcox,
 Johannes Weiner, Roman Gushchin, Rik van Riel, Minchan Kim,
 Christian Brauner, Christoph Hellwig, Oleg Nesterov, David Hildenbrand,
 Jann Horn, Andy Lutomirski, Christian Brauner, Florian Weimer,
 Jan Engelhardt, Tim Murray, Linux API, Linux MM, LKML, kernel-team
References: <20210723011436.60960-1-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 23, 2021 at 9:09 AM Suren Baghdasaryan wrote:
>
> On Fri, Jul 23, 2021 at 6:46 AM Shakeel Butt wrote:
> >
> > On Fri, Jul 23, 2021 at 1:53 AM Michal Hocko wrote:
> > >
> > [...]
> > > > However
> > > > retrying means issuing another syscall, so additional overhead...
> > > > I guess such "best effort" approach would be unusual for a syscall, so
> > > > maybe we can keep it as it is now and if such "do not block" mode is needed
> > > > we can use flags to implement it later?
> > >
> > > Yeah, an explicit opt-in via flags would be an option if that turns out
> > > to be really necessary.
> > >
> >
> > I am fine with keeping it as it is but we do need the non-blocking
> > option (via flags) to enable userspace to act more aggressively.
>
> I think you want to check memory conditions shortly after issuing
> kill/reap requests irrespective of mmap_sem contention. The reason is
> that even when memory release is not blocked, allocations from other
> processes might consume memory faster than we release it. For example,
> in Android we issue kill and start waiting on pidfd for its death
> notification. As soon as the process is dead we reassess the situation
> and possibly kill again. If the process is not dead within a
> configurable timeout we check conditions again and might issue more
> kill requests (IOW our wait for the process to die has a timeout). If
> process_mrelease() is blocked on mmap_sem, we might timeout like this.
> I imagine that a non-blocking option for process_mrelease() would not
> really change this logic.

On a containerized system, killing a job requires killing multiple
processes and then calling process_mrelease() on each of them. There is
now cgroup.kill to kill all the processes in a cgroup tree, but we would
still need to call process_mrelease() on every process in that tree.
There is a chance that we get stuck reaping an early process. Making
process_mrelease() non-blocking would let userspace move on to the other
processes in the list. An alternative would be a cgroup-specific
interface for reaping, similar to cgroup.kill.

> Adding such an option is trivial but I would like to make sure it's
> indeed useful. Maybe after the syscall is in place you can experiment
> with it and see if such an option would really change the way you use
> it?

SGTM.
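
For illustration only, the containerized case above can be sketched as
a userspace loop that skips a process whose reap would block and comes
back to it later. This is a rough sketch under stated assumptions, not
part of the patch: try_reap() is a hypothetical stand-in for a
non-blocking process_mrelease() (no such flag exists in the series as
posted), and reap_all(), make_fake_reaper() and max_passes are names
made up for the example. The reaper here is simulated so the control
flow can be followed without a kernel that has the syscall.

```python
from collections import deque
from typing import Callable, Iterable, List, Set, Tuple


def reap_all(pidfds: Iterable[int],
             try_reap: Callable[[int], bool],
             max_passes: int = 3) -> List[int]:
    """Reap every pidfd, moving past the ones that would block.

    try_reap(pidfd) is a HYPOTHETICAL non-blocking reap: it returns
    True when the memory was released and False when the call would
    have blocked (for example on mmap_sem contention).
    Returns the pidfds still unreaped after max_passes passes.
    """
    pending = deque(pidfds)
    for _ in range(max_passes):
        if not pending:
            break
        deferred = deque()
        while pending:
            pidfd = pending.popleft()
            if not try_reap(pidfd):
                # Do not get stuck here; retry on the next pass.
                deferred.append(pidfd)
        pending = deferred
    return list(pending)


def make_fake_reaper(blocked_once: Set[int]
                     ) -> Tuple[Callable[[int], bool], List[int]]:
    """Simulated reaper: each pidfd in blocked_once 'would block' once."""
    remaining = set(blocked_once)
    order: List[int] = []  # order in which pidfds were actually reaped

    def try_reap(pidfd: int) -> bool:
        if pidfd in remaining:
            remaining.discard(pidfd)
            return False
        order.append(pidfd)
        return True

    return try_reap, order
```

With a blocking call, the first contended process would hold up the
whole list. In the sketch, a contended entry is deferred and the rest
of the tree is reaped first; whatever reap_all() returns is what the
caller would still have to deal with, for instance after reassessing
memory conditions.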