From: Suren Baghdasaryan
Date: Thu, 14 Oct 2021 13:16:47 -0700
Subject: Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting
To: David Hildenbrand
Cc: Michal Hocko, Kees Cook, Pavel Machek, Rasmus Villemoes,
    John Hubbard, Andrew Morton, Colin Cross, Sumit Semwal, Dave Hansen,
    Matthew Wilcox, "Kirill A . Shutemov", Vlastimil Babka,
    Johannes Weiner, Jonathan Corbet, Al Viro, Randy Dunlap,
    Kalesh Singh, Peter Xu, rppt@kernel.org, Peter Zijlstra,
    Catalin Marinas, vincenzo.frascino@arm.com, Chinwen Chang (張錦文),
    Axel Rasmussen, Andrea Arcangeli, Jann Horn, apopple@nvidia.com,
    Yu Zhao, Will Deacon, fenghua.yu@intel.com,
    thunder.leizhen@huawei.com, Hugh Dickins, feng.tang@intel.com,
    Jason Gunthorpe, Roman Gushchin, Thomas Gleixner,
    krisman@collabora.com, Chris Hyser, Peter Collingbourne,
    "Eric W. Biederman", Jens Axboe, legion@kernel.org, Rolf Eike Beer,
    Cyrill Gorcunov, Muchun Song, Viresh Kumar, Thomas Cedeno,
    sashal@kernel.org, cxfcosmos@gmail.com, LKML,
    linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm,
    kernel-team
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 12, 2021 at 10:01 AM Suren Baghdasaryan wrote:
>
> On Tue, Oct 12, 2021 at 12:44 AM David Hildenbrand wrote:
> >
> > > I'm still evaluating the proposal to use memfds but I'm not sure if
> > > the issue that David Hildenbrand mentioned about additional memory
> > > consumed in pagecache (which has to be addressed) is the only one we
> > > will encounter with this approach. If anyone knows of any potential
> > > issues with using memfds as named anonymous memory, I would really
> > > appreciate your feedback before I go too far in that direction.
> >
> > [MAP_PRIVATE memfd only behave that way with 4k, not with huge pages, so
> > I think it just has to be fixed. It doesn't make any sense to allocate a
> > page for the pagecache ("populate the file") when accessing via a
> > private mapping that's supposed to leave the file untouched]
> >
> > My gut feeling is if you really need a string as identifier, then try
> > going with memfds. Yes, we might hit some road blocks to be sorted out,
> > but it just logically makes sense to me: Files have names. These names
> > exist before mapping and after mapping. They "name" the content.
>
> I'm investigating this direction. I don't have much background with
> memfds, so I'll need to digest the code first.

I've done some investigation into the possibility of using memfds to
name anonymous VMAs. Here are my findings:

1. Forking a process with anonymous vmas named using memfd is 5-15%
slower than with prctl (depends on the number of VMAs in the process
being forked). Profiling shows that i_mmap_lock_write() dominates
dup_mmap().
Exit path is also slower by roughly 9% with free_pgtables() and
fput() dominating exit_mmap(). Fork performance is important for
Android because almost all processes are forked from zygote, so this
limitation alone already makes this approach prohibitive.

2. mremap() usage to grow the mapping has an issue when used with memfds:

    fd = memfd_create(name, MFD_ALLOW_SEALING);
    ftruncate(fd, size_bytes);
    ptr = mmap(NULL, size_bytes, prot, MAP_PRIVATE, fd, 0);
    close(fd);
    ptr = mremap(ptr, size_bytes, size_bytes * 2, MREMAP_MAYMOVE);
    touch_mem(ptr, size_bytes * 2);

This would generate a SIGBUS in touch_mem(). I believe it's because
ftruncate() specified the size to be size_bytes and we are accessing
more than that after remapping. prctl() does not have this limitation
and we do have a use case for growing a named VMA.

3. Using memfds leaves an fd exposed, even briefly, which may lead to
unexpected flaws (e.g. anything using mmap MAP_SHARED could allow
exposures or overwrites). Even with MAP_PRIVATE, an attacker who writes
into the file after ftruncate() and before mmap() can cause private
memory to be initialized with unexpected data.

4. There is a use case in the Android userspace where vma naming
happens after memory was allocated. Bionic linker does in-memory
relocations and then names some relocated sections.

In light of these findings, could the current patchset be reconsidered?
Thanks,
Suren.

>
> >
> > Maybe it's just me, but the whole interface, setting the name via a
> > prctl after the mapping was already instantiated doesn't really spark
> > joy at my end. That's not a strong pushback, but if we can avoid it
> > using something that's already there, that would be very much preferred.
>
> Actually that's one of my worries about using memfds. There might be
> cases when we need to name a vma after it was mapped. memfd_create()
> would not allow us to do that AFAIKT. But I need to check all usages
> to say if that's really an issue.
> Thanks!
> >
> > --
> > Thanks,
> >
> > David / dhildenb
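For reference, the mremap() behaviour described in finding 2 can be
checked with a sketch along the following lines. This is not code from
the patchset: the helper name grow_memfd_mapping() and its
grow_file_first flag are hypothetical, and it assumes Linux with glibc
2.27 or later (for the memfd_create() wrapper) and a size_bytes that is
a multiple of the page size. The work runs in a forked child so that a
SIGBUS can be observed by the parent instead of killing the caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write one byte to every page of [ptr, ptr + len). */
static void touch_mem(char *ptr, size_t len, size_t page)
{
    for (size_t off = 0; off < len; off += page)
        ptr[off] = 1;
}

/*
 * Hypothetical helper (not from the patchset). In a child process:
 * create a memfd of size_bytes, map it MAP_PRIVATE, double the mapping
 * with mremap() and touch every page. If grow_file_first is non-zero,
 * the file is extended with ftruncate() before the fd is closed.
 *
 * Returns 0 if the child touched all pages, the number of the signal
 * that killed the child otherwise, or -1 on a setup failure.
 * Assumes size_bytes is a non-zero multiple of the page size.
 */
int grow_memfd_mapping(const char *name, size_t size_bytes, int grow_file_first)
{
    assert(size_bytes > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int fd = memfd_create(name, MFD_ALLOW_SEALING);
        if (fd < 0 || ftruncate(fd, (off_t)size_bytes) != 0)
            _exit(100);

        char *ptr = mmap(NULL, size_bytes, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE, fd, 0);
        if (ptr == MAP_FAILED)
            _exit(101);

        /* Optionally make the backing file as large as the new mapping. */
        if (grow_file_first && ftruncate(fd, (off_t)(size_bytes * 2)) != 0)
            _exit(102);
        close(fd);

        ptr = mremap(ptr, size_bytes, size_bytes * 2, MREMAP_MAYMOVE);
        if (ptr == MAP_FAILED)
            _exit(103);

        /* Pages beyond the end of the backing file fault with SIGBUS. */
        touch_mem(ptr, size_bytes * 2, (size_t)sysconf(_SC_PAGESIZE));
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

With grow_file_first set to 0 the sequence matches the one in finding
2, and the call should return SIGBUS. With it set to 1 the touch should
succeed, but only because the fd was kept open long enough to extend
the file, which is the kind of exposure finding 3 is concerned about.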