From: Suren Baghdasaryan
Date: Mon, 30 Nov 2020 11:01:15 -0800
Subject: Re: [PATCH 1/2] mm/madvise: allow process_madvise operations on entire memory range
To: Minchan Kim
Cc: Andrew Morton, Michal Hocko, David Rientjes, Matthew Wilcox, Johannes Weiner, Roman Gushchin, Rik van Riel, Christian Brauner, Oleg Nesterov, Tim Murray, linux-api@vger.kernel.org, linux-mm, LKML, kernel-team
References: <20201124053943.1684874-1-surenb@google.com> <20201124053943.1684874-2-surenb@google.com> <20201125231322.GF1484898@google.com> <20201125234322.GG1484898@google.com>
In-Reply-To: <20201125234322.GG1484898@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 25, 2020 at 3:43 PM Minchan Kim wrote:
>
> On Wed, Nov 25, 2020 at 03:23:40PM -0800, Suren Baghdasaryan wrote:
> > On Wed, Nov 25, 2020 at 3:13 PM Minchan Kim wrote:
> > >
> > > On Mon, Nov 23, 2020 at 09:39:42PM -0800, Suren Baghdasaryan wrote:
> > > > process_madvise requires a vector of address ranges to be provided for
> > > > its operations.
> > > > When an advice should be applied to the entire process,
> > > > the caller process has to obtain the list of VMAs of the target process
> > > > by reading the /proc/pid/maps or some other way. The cost of this
> > > > operation grows linearly with increasing number of VMAs in the target
> > > > process. Even constructing the input vector can be non-trivial when
> > > > target process has several thousands of VMAs and the syscall is being
> > > > issued during high memory pressure period when new allocations for such
> > > > a vector would only worsen the situation.
> > > > In the case when advice is being applied to the entire memory space of
> > > > the target process, this creates an extra overhead.
> > > > Add PMADV_FLAG_RANGE flag for process_madvise enabling the caller to
> > > > advise a memory range of the target process. For now, to keep it simple,
> > > > only the entire process memory range is supported, vec and vlen inputs
> > > > in this mode are ignored and can be NULL and 0.
> > > > Instead of returning the number of bytes that advice was successfully
> > > > applied to, the syscall in this mode returns 0 on success. This is due
> > > > to the fact that the number of bytes would not be useful for the caller
> > > > that does not know the amount of memory the call is supposed to affect.
> > > > Besides, the ssize_t return type can be too small to hold the number of
> > > > bytes affected when the operation is applied to a large memory range.
> > >
> > > Can we just use one element in iovec to indicate the entire address range
> > > rather than using up the reserved flags?
> > >
> > > struct iovec {
> > >     .iov_base = NULL,
> > >     .iov_len = (~(size_t)0),
> > > };
> > >
> > > Furthermore, it would apply to other syscalls that support iovec
> > > if we agree on it.
> > >
> >
> > The flag also changes the return value semantics.
> > If we follow your suggestion we should also agree that in this mode
> > the return value will be 0 on success and negative otherwise instead
> > of the number of bytes madvise was applied to.
>
> Well, the return value will depend on each API. If the operation is
> destructive, it should return the right size affected by the API;
> otherwise, 0 or an error would be okay.

I'm fine with dropping the flag; I just thought with the flag it would
be more explicit that this is a special mode operating on ranges. This
way the patch also becomes simpler.
Andrew, Michal, Christian, what do you think about such an API? Should I
change the API this way / keep the flag / change it in some other way?
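
To make the alternative concrete, here is a minimal sketch of the
single-element iovec convention suggested above: a NULL base with an
all-ones length standing for "the entire address range". The helper
names whole_range_iovec() and is_whole_range() are hypothetical and
exist only for illustration; this shows the proposal under discussion,
not an API that process_madvise is known to implement.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/*
 * Hypothetical helper: build the one-element vector proposed in the
 * thread. iov_base == NULL and iov_len == ~(size_t)0 together act as
 * a sentinel meaning "the whole address space".
 */
static inline struct iovec whole_range_iovec(void)
{
    struct iovec iov = {
        .iov_base = NULL,
        .iov_len  = ~(size_t)0,
    };
    return iov;
}

/*
 * Hypothetical check a callee could use to recognise the sentinel.
 * It only matches a vector of exactly one element, so ordinary
 * multi-range requests are never mistaken for it.
 */
static inline bool is_whole_range(const struct iovec *vec, size_t vlen)
{
    return vlen == 1 &&
           vec[0].iov_base == NULL &&
           vec[0].iov_len == ~(size_t)0;
}
```

With this convention the caller needs no allocation and no walk of
/proc/pid/maps; the return-value question raised above (0 on success
versus a byte count) still has to be settled separately.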