Message-ID: <1501941942.6577.7.camel@redhat.com>
Subject: Re: [PATCH 2/2] mm,fork: introduce MADV_WIPEONFORK
From: Rik van Riel
To: Mike Kravetz, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, fweimer@redhat.com, colm@allcosts.net,
    akpm@linux-foundation.org, rppt@linux.vnet.ibm.com,
    keescook@chromium.org, luto@amacapital.net, wad@chromium.org,
    mingo@kernel.org
Date: Sat, 05 Aug 2017 10:05:42 -0400
In-Reply-To: <54eba2da-94ff-bd8a-3405-47577437550a@oracle.com>
References: <20170804190730.17858-1-riel@redhat.com>
    <20170804190730.17858-3-riel@redhat.com>
    <54eba2da-94ff-bd8a-3405-47577437550a@oracle.com>
Organization: Red Hat, Inc

On Fri, 2017-08-04 at 16:09 -0700, Mike Kravetz wrote:
> On 08/04/2017 12:07 PM, riel@redhat.com wrote:
> > From: Rik van Riel
> >
> > Introduce MADV_WIPEONFORK semantics, which result in a VMA being
> > empty in the child process after fork. This differs from MADV_DONTFORK
> > in one important way.
> >
> > If a child process accesses memory that was MADV_WIPEONFORK, it
> > will get zeroes. The address ranges are still valid, they are just
> > empty.
>
> This didn't seem 'quite right' to me for shared mappings and/or file
> backed mappings.  I wasn't exactly sure what it 'should' do in such
> cases.  So, I tried it with a mapping created as follows:
>
> addr = mmap(ADDR, page_size,
>             PROT_READ | PROT_WRITE,
>             MAP_ANONYMOUS|MAP_SHARED, -1, 0);

Your test program is pretty much the same as the one I used, except
that I used MAP_PRIVATE instead of MAP_SHARED.

Let me see how the code paths differ for both cases...

> When setting MADV_WIPEONFORK on the vma/mapping, I got the following
> at task exit time:
>
> [  694.558290] ------------[ cut here ]------------
> [  694.558978] kernel BUG at mm/filemap.c:212!
> [  694.559476] invalid opcode: 0000 [#1] SMP
> [  694.560023] Modules linked in: ip6t_REJECT nf_reject_ipv6
> ip6t_rpfilter xt_conntrack ebtable_broute bridge stp llc ebtable_nat
> ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
> ip6table_raw ip6table_mangle ip6table_security iptable_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
> iptable_raw iptable_mangle 9p iptable_security ebtable_filter
> ebtables ip6table_filter ip6_tables snd_hda_codec_generic
> snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq ppdev
> snd_seq_device joydev crct10dif_pclmul crc32_pclmul crc32c_intel
> snd_pcm ghash_clmulni_intel 9pnet_virtio virtio_balloon snd_timer
> 9pnet parport_pc snd parport i2c_piix4 soundcore nfsd auth_rpcgss
> nfs_acl lockd grace sunrpc virtio_net virtio_blk virtio_console
> 8139too qxl drm_kms_helper ttm drm serio_raw 8139cp
> [  694.571554]  mii virtio_pci ata_generic virtio_ring virtio pata_acpi
> [  694.572608] CPU: 3 PID: 1200 Comm: test_wipe2 Not tainted 4.13.0-rc3+ #8
> [  694.573778] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
> [  694.574917] task: ffff880137178040 task.stack: ffffc900019d4000
> [  694.575650] RIP: 0010:__delete_from_page_cache+0x344/0x410
> [  694.576409] RSP: 0018:ffffc900019d7a88 EFLAGS: 00010082
> [  694.577238] RAX: 0000000000000021 RBX: ffffea00047d0e00 RCX: 0000000000000006
> [  694.578537] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff88023fd0db90
> [  694.579774] RBP: ffffc900019d7ad8 R08: 00000000000882b6 R09: 000000000000028a
> [  694.580754] R10: ffffc900019d7da8 R11: ffffffff8211184d R12: ffffea00047d0e00
> [  694.582040] R13: 0000000000000000 R14: 0000000000000202 R15: ffff8801384439e8
> [  694.583236] FS:  0000000000000000(0000) GS:ffff88023fd00000(0000) knlGS:0000000000000000
> [  694.584607] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  694.585409] CR2: 00007ff77a8da618 CR3: 0000000001e09000 CR4: 00000000001406e0
> [  694.586547] Call Trace:
> [  694.586996]  delete_from_page_cache+0x54/0x110
> [  694.587481]  truncate_inode_page+0xab/0x120
> [  694.588110]  shmem_undo_range+0x498/0xa50
> [  694.588813]  ? save_stack_trace+0x1b/0x20
> [  694.589529]  ? set_track+0x70/0x140
> [  694.590150]  ? init_object+0x69/0xa0
> [  694.590722]  ? __inode_wait_for_writeback+0x73/0xe0
> [  694.591525]  shmem_truncate_range+0x16/0x40
> [  694.592268]  shmem_evict_inode+0xb1/0x190
> [  694.592735]  evict+0xbb/0x1c0
> [  694.593147]  iput+0x1c0/0x210
> [  694.593497]  dentry_unlink_inode+0xb4/0x150
> [  694.593982]  __dentry_kill+0xc1/0x150
> [  694.594400]  dput+0x1c8/0x1e0
> [  694.594745]  __fput+0x172/0x1e0
> [  694.595103]  ____fput+0xe/0x10
> [  694.595463]  task_work_run+0x80/0xa0
> [  694.595886]  do_exit+0x2d6/0xb50
> [  694.596323]  ? __do_page_fault+0x288/0x4a0
> [  694.596818]  do_group_exit+0x47/0xb0
> [  694.597249]  SyS_exit_group+0x14/0x20
> [  694.597682]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> [  694.598198] RIP: 0033:0x7ff77a5e78c8
> [  694.598612] RSP: 002b:00007ffc5aece318 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> [  694.599804] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff77a5e78c8
> [  694.600609] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> [  694.601424] RBP: 00007ff77a8da618 R08: 00000000000000e7 R09: ffffffffffffff98
> [  694.602224] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000001
> [  694.603151] R13: 00007ff77a8dbc60 R14: 0000000000000000 R15: 0000000000000000
> [  694.603984] Code: 60 f3 c5 81 e8 2e 7e 03 00 0f 0b 48 c7 c6 60 f3
> c5 81 4c 89 e7 e8 1d 7e 03 00 0f 0b 48 c7 c6 00 f4 c5 81 4c 89 e7 e8
> 0c 7e 03 00 <0f> 0b 48 c7 c6 38 f3 c5 81 4c 89 e7 e8 fb 7d 03 00 0f
> 0b 48 c7
> [  694.606500] RIP: __delete_from_page_cache+0x344/0x410 RSP: ffffc900019d7a88
> [  694.607426] ---[ end trace 55e6b04ae95d8ce3 ]---
>
> BTW, this was on 4.13.0-rc3 + your patches.  Simple test program is
> below.
>
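
The sketch below is not the test program referred to above, which is not
reproduced here. It is only a minimal illustration of the MAP_PRIVATE case
and the semantics stated in the patch description: after fork, the child
should read zeroes from the advised range while the parent keeps its data.
The helper name wipeonfork_check and the 0xAA fill pattern are made up for
this example, and the fallback value 18 for MADV_WIPEONFORK is an assumption
taken from the generic Linux uapi headers; it may differ on some
architectures.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: older libc headers may lack this constant. 18 is the value
 * in the generic Linux uapi headers; some architectures may differ. */
#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18
#endif

/*
 * Hypothetical helper, not from the thread.
 * Returns 0 if the child saw only zeroes and the parent kept its data,
 * 1 if either check failed, and -1 if setup failed (for example because
 * the running kernel rejects MADV_WIPEONFORK).
 */
int wipeonfork_check(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    /* Private anonymous mapping: the case Rik says he tested. */
    unsigned char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    /* Fill the page so a wiped page is distinguishable from a copied one. */
    memset(addr, 0xAA, len);

    if (madvise(addr, len, MADV_WIPEONFORK) != 0) {
        munmap(addr, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(addr, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: every byte of the advised range should read as zero. */
        for (size_t i = 0; i < len; i++)
            if (addr[i] != 0)
                _exit(1);
        _exit(0);
    }

    int status = 0;
    int waited = waitpid(pid, &status, 0) == pid;
    int child_ok = waited && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    /* Parent: its own copy of the data must be untouched. */
    int parent_ok = addr[0] == 0xAA && addr[len - 1] == 0xAA;

    munmap(addr, len);
    return (child_ok && parent_ok) ? 0 : 1;
}
```

Changing MAP_PRIVATE to MAP_SHARED gives the kind of mapping Mike tried.
With the patches under discussion, that variant hit the BUG shown above at
task exit, so it is best run only on a disposable test machine.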