From: Minchan Kim
To: Andrew Morton
Cc: LKML, linux-mm, Michal Hocko, Johannes Weiner, Tim Murray, Joel Fernandes, Suren Baghdasaryan, Daniel Colascione, Shakeel Butt, Sonny Rao, Brian Geffon, Minchan Kim
Subject: [RFC 7/7] mm: madvise support MADV_ANONYMOUS_FILTER and MADV_FILE_FILTER
Date: Mon, 20 May 2019 12:52:54 +0900
Message-Id: <20190520035254.57579-8-minchan@kernel.org>
In-Reply-To: <20190520035254.57579-1-minchan@kernel.org>
References: <20190520035254.57579-1-minchan@kernel.org>
X-Mailer: git-send-email 2.21.0.1020.gf2820cf01a-goog
X-Mailing-List: linux-kernel@vger.kernel.org

A system could have a much faster swap device, such as zRAM. In that
case, swapping is far cheaper than file I/O on low-end storage. In this
configuration, userspace could apply a different strategy to each kind
of vma.
IOW, it wants to reclaim anonymous pages with MADV_COLD while keeping
file-backed pages on the inactive LRU with MADV_COOL, because file I/O
is more expensive in this case and those pages should stay in memory
until memory pressure happens.

To make such a strategy easier to support, this patch introduces the
MADV_ANONYMOUS_FILTER and MADV_FILE_FILTER options in madvise(2), in the
same way that /proc/<pid>/clear_refs already supports the same filters.
They are filters that can be ORed with the other existing hints, using
the top two bits of (int behavior).

Once either of them is set, the hint affects only the vmas of interest,
either anonymous or file-backed.

With that, a user can simply call the process_madvise syscall once with
the entire range (0x0 - 0xFFFFFFFFFFFFFFFF) and either
MADV_ANONYMOUS_FILTER or MADV_FILE_FILTER, so there is no need to call
the syscall range by range.

* from v1r2
  * use consistent check with clear_refs to identify anon/file vma - surenb

* from v1r1
  * use naming "filter" for new madvise option - dancol

Signed-off-by: Minchan Kim
---
 include/uapi/asm-generic/mman-common.h |  5 +++++
 mm/madvise.c                           | 14 ++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index b8e230de84a6..be59a1b90284 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -66,6 +66,11 @@
 #define MADV_WIPEONFORK 18		/* Zero memory on fork, child only */
 #define MADV_KEEPONFORK 19		/* Undo MADV_WIPEONFORK */

+#define MADV_BEHAVIOR_MASK (~(MADV_ANONYMOUS_FILTER|MADV_FILE_FILTER))
+
+#define MADV_ANONYMOUS_FILTER	(1<<31)	/* works for only anonymous vma */
+#define MADV_FILE_FILTER	(1<<30)	/* works for only file-backed vma */
+
 /* compatibility flags */
 #define MAP_FILE	0

diff --git a/mm/madvise.c b/mm/madvise.c
index f4f569dac2bd..116131243540 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1002,7 +1002,15 @@ static int madvise_core(struct task_struct *tsk, unsigned long start,
 	int write;
 	size_t len;
 	struct blk_plug plug;
+	bool anon_only, file_only;

+	anon_only = behavior & MADV_ANONYMOUS_FILTER;
+	file_only = behavior & MADV_FILE_FILTER;
+
+	if (anon_only && file_only)
+		return error;
+
+	behavior = behavior & MADV_BEHAVIOR_MASK;
 	if (!madvise_behavior_valid(behavior))
 		return error;

@@ -1067,12 +1075,18 @@ static int madvise_core(struct task_struct *tsk, unsigned long start,
 		if (end < tmp)
 			tmp = end;

+		if (anon_only && vma->vm_file)
+			goto next;
+		if (file_only && !vma->vm_file)
+			goto next;
+
 		/* Here vma->vm_start <= start < tmp <= (end|vma->vm_end). */
 		error = madvise_vma(tsk, vma, &prev, start, tmp,
 					behavior, &pages);
 		if (error)
 			goto out;
 		*nr_pages += pages;
+next:
 		start = tmp;
 		if (prev && start < prev->vm_end)
 			start = prev->vm_end;
-- 
2.21.0.1020.gf2820cf01a-goog
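
For readers who want to see the filter semantics in isolation, here is a
minimal userspace sketch of the decoding described in the changelog. It
is not kernel code and not part of the patch. The names struct toy_vma,
split_behavior(), vma_passes_filter() and count_targets() are
hypothetical and exist only for this illustration. Two assumptions are
made: the filter bits are written as unsigned constants (1U << 31)
rather than the patch's (1<<31), to keep the shift well defined in
portable C, and the "return error" for the both-filters case is taken to
mean -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit positions copied from the patch; unsigned is an assumption of this sketch. */
#define MADV_ANONYMOUS_FILTER (1U << 31) /* apply the hint to anonymous vmas only */
#define MADV_FILE_FILTER      (1U << 30) /* apply the hint to file-backed vmas only */
#define MADV_BEHAVIOR_MASK    (~(MADV_ANONYMOUS_FILTER | MADV_FILE_FILTER))

static_assert((MADV_ANONYMOUS_FILTER & MADV_FILE_FILTER) == 0,
              "filter bits must not overlap");

/* Hypothetical stand-in for struct vm_area_struct: only what the filter reads. */
struct toy_vma {
    unsigned long start, end;
    bool has_file; /* models vma->vm_file != NULL */
};

/*
 * Split a combined behavior value into the base hint and the two filter
 * flags. Returns 0 on success, or -EINVAL (assumed) when both filters are set.
 */
static int split_behavior(unsigned int behavior, unsigned int *hint,
                          bool *anon_only, bool *file_only)
{
    *anon_only = (behavior & MADV_ANONYMOUS_FILTER) != 0;
    *file_only = (behavior & MADV_FILE_FILTER) != 0;

    if (*anon_only && *file_only)
        return -EINVAL;

    *hint = behavior & MADV_BEHAVIOR_MASK;
    return 0;
}

/* Mirrors the two "goto next" checks added to the vma walk. */
static bool vma_passes_filter(const struct toy_vma *vma,
                              bool anon_only, bool file_only)
{
    if (anon_only && vma->has_file)
        return false;
    if (file_only && !vma->has_file)
        return false;
    return true;
}

/*
 * Count how many vmas a hint would be applied to. A real implementation
 * would apply the base hint to each matching vma instead of counting.
 */
static int count_targets(const struct toy_vma *vmas, size_t n,
                         unsigned int behavior)
{
    unsigned int hint;
    bool anon_only, file_only;
    int hits = 0;
    int err = split_behavior(behavior, &hint, &anon_only, &file_only);

    if (err)
        return err;

    for (size_t i = 0; i < n; i++)
        if (vma_passes_filter(&vmas[i], anon_only, file_only))
            hits++;

    return hits;
}
```

With a table of two anonymous vmas and one file-backed vma,
count_targets() with (hint | MADV_ANONYMOUS_FILTER) selects the two
anonymous entries, (hint | MADV_FILE_FILTER) selects the file-backed
one, and a plain hint selects all three. Setting both filters is
rejected before any vma is visited.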