Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751852AbdHSACq (ORCPT ); Fri, 18 Aug 2017 20:02:46 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50386 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751350AbdHSACp (ORCPT ); Fri, 18 Aug 2017 20:02:45 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com A814B883D2 Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx02.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=riel@redhat.com Message-ID: <1503100961.6577.71.camel@redhat.com> Subject: Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK From: Rik van Riel To: Andrew Morton Cc: linux-kernel@vger.kernel.org, mhocko@kernel.org, mike.kravetz@oracle.com, linux-mm@kvack.org, fweimer@redhat.com, colm@allcosts.net, keescook@chromium.org, luto@amacapital.net, wad@chromium.org, mingo@kernel.org, kirill@shutemov.name, dave.hansen@intel.com, linux-api@vger.kernel.org, torvalds@linux-foundation.org, willy@infradead.org Date: Fri, 18 Aug 2017 20:02:41 -0400 In-Reply-To: <20170818111545.ab371cfedb71d13d76590030@linux-foundation.org> References: <20170811212829.29186-1-riel@redhat.com> <20170811212829.29186-3-riel@redhat.com> <20170815155114.ff9f4164eed28bf02db48fbb@linux-foundation.org> <1502849899.6577.66.camel@redhat.com> <20170817155033.172cf22ec143713d5cf98b4e@linux-foundation.org> <1503073709.6577.68.camel@redhat.com> <20170818111545.ab371cfedb71d13d76590030@linux-foundation.org> Organization: Red Hat, Inc Content-Type: text/plain; charset="UTF-8" Mime-Version: 1.0 Content-Transfer-Encoding: 8bit X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Sat, 19 Aug 2017 00:02:44 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2321 Lines: 69 On Fri, 2017-08-18 at 11:15 -0700, Andrew Morton wrote: > On Fri, 18 Aug 2017 12:28:29 -0400 Rik van Riel > wrote: > > > On Thu, 2017-08-17 at 15:50 -0700, Andrew Morton wrote: > > > On Tue, 15 Aug 2017 22:18:19 -0400 Rik van Riel > > > wrote: > > > > > > > > > --- a/mm/madvise.c > > > > > > +++ b/mm/madvise.c > > > > > > @@ -80,6 +80,17 @@ static long madvise_behavior(struct > > > > > > vm_area_struct *vma, > > > > > > __ } > > > > > > __ new_flags &= ~VM_DONTCOPY; > > > > > > __ break; > > > > > > + case MADV_WIPEONFORK: > > > > > > + /* MADV_WIPEONFORK is only supported on > > > > > > anonymous > > > > > > memory. */ > > > > > > + if (vma->vm_file || vma->vm_flags & > > > > > > VM_SHARED) > > > > > > { > > > > > > + error = -EINVAL; > > > > > > + goto out; > > > > > > + } > > > > > > + new_flags |= VM_WIPEONFORK; > > > > > > + break; > > > > > > + case MADV_KEEPONFORK: > > > > > > + new_flags &= ~VM_WIPEONFORK; > > > > > > + break; > > > > > > __ case MADV_DONTDUMP: > > > > > > __ new_flags |= VM_DONTDUMP; > > > > > > __ break; > > > > > > > > > > It seems odd to permit MADV_KEEPONFORK against other-than- > > > > > anon > > > > > vmas? > > > > > > > > Given that the only way to set VM_WIPEONFORK is through > > > > MADV_WIPEONFORK, calling MADV_KEEPONFORK on an > > > > other-than-anon vma would be equivalent to a noop. > > > > > > > > If new_flags == vma->vm_flags, madvise_behavior() will > > > > immediately exit. > > > > > > Yes, but calling MADV_WIPEONFORK against an other-than-anon vma > > > is > > > presumably a userspace bug.____A bug which will probably result > > > in > > > userspace having WIPEONFORK memory which it didn't want.____The > > > kernel > > > can trivially tell userspace that it has this bug so why not do > > > so? > > > > Uh, what? > > > > Braino.  I meant MADV_KEEPONFORK.  Calling MADV_KEEPONFORK against an > other-than-anon vma is a presumptive userspace bug and the kernel > should report that. All MADV_KEEPONFORK does is clear the flag set by MADV_WIPEONFORK. Since there is no way to set the WIPEONFORK flag on other-than-anon VMAs, that means MADV_KEEPONFORK is always a noop for those VMAs. You remind me that I should send in a man page addition, though, when this code gets sent to Linus.