From: Linus Torvalds
Date: Fri, 11 Aug 2017 12:42:38 -0700
Subject: Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK
To: Rik van Riel
Cc: Linux Kernel Mailing List, Michal Hocko, Mike Kravetz, linux-mm,
    Florian Weimer, colm@allcosts.net, Andrew Morton, Kees Cook,
    Andy Lutomirski, Will Drewry, Ingo Molnar, "Kirill A. Shutemov",
    Dave Hansen, Linux API, Matthew Wilcox
In-Reply-To: <20170811191942.17487-3-riel@redhat.com>
References: <20170811191942.17487-1-riel@redhat.com>
    <20170811191942.17487-3-riel@redhat.com>

On Fri, Aug 11, 2017 at 12:19 PM, Rik van Riel wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index 0e517be91a89..f9b0ad7feb57 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1134,6 +1134,16 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
>  			!vma->anon_vma)
>  		return 0;
>
> +	/*
> +	 * With VM_WIPEONFORK, the child inherits the VMA from the
> +	 * parent, but not its contents.
> +	 *
> +	 * A child accessing VM_WIPEONFORK memory will see all zeroes;
> +	 * a child accessing VM_DONTCOPY memory receives a segfault.
> +	 */
> +	if (vma->vm_flags & VM_WIPEONFORK)
> +		return 0;
> +

Is this right?

Yes, you don't do the page table copies. Fine. But you leave vma with
the anon_vma pointer - doesn't that mean that it's still connected to
the original anon_vma chain, and we might end up swapping something in?

And even if that ends up not being an issue, I'd expect that you'd
want to break the anon_vma chain just to not make it grow
unnecessarily.

So my gut feel is that doing this in "copy_page_range()" is wrong, and
the logic should be moved up to dup_mmap(), where we can also
short-circuit the anon_vma chain entirely.

No?

The madvise() interface looks fine to me.

              Linus
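
For reference, the user-visible behaviour described in the quoted
comment ("a child accessing VM_WIPEONFORK memory will see all zeroes")
can be checked from user space. The following is only a minimal sketch,
not part of the patch. It assumes a kernel that implements
MADV_WIPEONFORK; the fallback value 18 is the Linux UAPI constant and is
used only when the libc headers do not define it. The helper name
wipeonfork_child_byte() is made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: Linux UAPI value, used only if the libc headers lack it. */
#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18
#endif

/*
 * Hypothetical helper: returns the first byte the child observes in a
 * MADV_WIPEONFORK region (0 if the region was wiped, 0x5a if the
 * contents were inherited), or -1 on any failure, including a kernel
 * that rejects MADV_WIPEONFORK.
 */
int wipeonfork_child_byte(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Fill the page in the parent so a wipe is observable. */
	memset(p, 0x5a, len);

	if (madvise(p, len, MADV_WIPEONFORK) != 0) {
		munmap(p, len);
		return -1;
	}

	pid_t pid = fork();
	if (pid < 0) {
		munmap(p, len);
		return -1;
	}
	if (pid == 0)
		_exit(p[0]);	/* child: report what it sees via exit status */

	int status;
	int result = -1;
	if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
		result = WEXITSTATUS(status);

	/* The parent's copy must be untouched by the fork. */
	if (p[0] != 0x5a)
		result = -1;

	munmap(p, len);
	return result;
}
```

Calling wipeonfork_child_byte() from a small main() and printing the
result is enough to try it. This only exercises the zero-fill
semantics; it says nothing about the anon_vma chain question raised
above, which is not visible from user space.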