Date: Fri, 8 Apr 2022 17:56:55 +0000
From: Sean Christopherson
To: Andy Lutomirski
Cc: Chao Peng, kvm list, Linux Kernel Mailing List, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, Linux API, qemu-devel@nongnu.org,
    Paolo Bonzini, Jonathan Corbet, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, the arch/x86 maintainers, "H. Peter Anvin",
    Hugh Dickins, Jeff Layton, "J . Bruce Fields", Andrew Morton,
    Mike Rapoport, Steven Price, "Maciej S . Szmigiero", Vlastimil Babka,
    Vishal Annapurve, Yu Zhang, "Kirill A. Shutemov", "Nakajima, Jun",
    Dave Hansen, Andi Kleen, David Hildenbrand
Subject: Re: [PATCH v5 04/13] mm/shmem: Restrict MFD_INACCESSIBLE memory against RLIMIT_MEMLOCK
References: <20220310140911.50924-1-chao.p.peng@linux.intel.com>
    <20220310140911.50924-5-chao.p.peng@linux.intel.com>
    <02e18c90-196e-409e-b2ac-822aceea8891@www.fastmail.com>
In-Reply-To: <02e18c90-196e-409e-b2ac-822aceea8891@www.fastmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 07, 2022, Andy Lutomirski wrote:
>
> On Thu, Apr 7, 2022, at 9:05 AM, Sean Christopherson wrote:
> > On Thu, Mar 10, 2022, Chao Peng wrote:
> >> Since page migration / swapping is not supported yet, MFD_INACCESSIBLE
> >> memory behave like longterm pinned pages and thus should be accounted to
> >> mm->pinned_vm and be restricted by RLIMIT_MEMLOCK.
> >>
> >> Signed-off-by: Chao Peng
> >> ---
> >>  mm/shmem.c | 25 ++++++++++++++++++++++++-
> >>  1 file changed, 24 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/shmem.c b/mm/shmem.c
> >> index 7b43e274c9a2..ae46fb96494b 100644
> >> --- a/mm/shmem.c
> >> +++ b/mm/shmem.c
> >> @@ -915,14 +915,17 @@ static void notify_fallocate(struct inode *inode, pgoff_t start, pgoff_t end)
> >>  static void notify_invalidate_page(struct inode *inode, struct folio *folio,
> >>                                     pgoff_t start, pgoff_t end)
> >>  {
> >> -#ifdef CONFIG_MEMFILE_NOTIFIER
> >>          struct shmem_inode_info *info = SHMEM_I(inode);
> >>
> >> +#ifdef CONFIG_MEMFILE_NOTIFIER
> >>          start = max(start, folio->index);
> >>          end = min(end, folio->index + folio_nr_pages(folio));
> >>
> >>          memfile_notifier_invalidate(&info->memfile_notifiers, start, end);
> >>  #endif
> >> +
> >> +        if (info->xflags & SHM_F_INACCESSIBLE)
> >> +                atomic64_sub(end - start, &current->mm->pinned_vm);
> >
> > As Vishal's to-be-posted selftest discovered, this is broken as current->mm
> > may be NULL.  Or it may be a completely different mm, e.g. AFAICT there's
> > nothing that prevents a different process from punching hole in the shmem
> > backing.
> >
>
> How about just not charging the mm in the first place?  There’s precedent:
> ramfs and hugetlbfs (at least sometimes — I’ve lost track of the current
> status).
>
> In any case, for an administrator to try to assemble the various rlimits into
> a coherent policy is, and always has been, quite messy.  ISTM cgroup limits,
> which can actually add across processes usefully, are much better.
>
> So, aside from the fact that these fds aren’t in a filesystem and are thus
> available by default, I’m not convinced that this accounting is useful or
> necessary.
>
> Maybe we could just have some switch require to enable creation of private
> memory in the first place, and anyone who flips that switch without
> configuring cgroups is subject to DoS.

I personally have no objection to that, and I'm 99% certain Google doesn't rely on RLIMIT_MEMLOCK.
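
For anyone following along, the accounting problem quoted above can be modeled in userspace.  This is a rough sketch, not kernel code: struct mm_model, current_mm, charge_pages(), uncharge_pages() and cross_process_scenario() are hypothetical stand-ins invented for illustration.  The charge side is also an assumption, since the quoted hunk only shows the uncharge; here it is assumed to charge whichever task allocates the pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct mm_struct; only the counter matters. */
struct mm_model {
    int64_t pinned_vm;  /* pages counted against RLIMIT_MEMLOCK */
};

/* Models current->mm; NULL models a task without an mm (e.g. a kthread). */
struct mm_model *current_mm;

/* Assumed charge path: account to whichever task allocates the pages. */
void charge_pages(int64_t nr_pages)
{
    assert(current_mm != NULL);  /* this model only allocates in process context */
    current_mm->pinned_vm += nr_pages;
}

/*
 * Uncharge path shaped like the quoted hunk: it uses whatever "current"
 * happens to be at invalidation time.  Returns -1 where the kernel code
 * would dereference a NULL mm.
 */
int uncharge_pages(int64_t nr_pages)
{
    if (current_mm == NULL)
        return -1;
    current_mm->pinned_vm -= nr_pages;
    return 0;
}

/* Process A allocates four pages, then process B punches the hole. */
void cross_process_scenario(struct mm_model *a, struct mm_model *b)
{
    current_mm = a;
    charge_pages(4);

    current_mm = b;
    (void)uncharge_pages(4);
}
```

With two zeroed models, cross_process_scenario() leaves A still charged for four pages it no longer holds, while B's counter goes to -4.  Neither count matches reality, which is the "completely different mm" case; the NULL check in uncharge_pages() marks the spot where the real code would oops instead.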