Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757221AbaGNUN3 (ORCPT ); Mon, 14 Jul 2014 16:13:29 -0400
Received: from mail-pa0-f41.google.com ([209.85.220.41]:57816 "EHLO
	mail-pa0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756975AbaGNUN1 (ORCPT ); Mon, 14 Jul 2014 16:13:27 -0400
MIME-Version: 1.0
In-Reply-To:
References: <748020aaaf5c5c2924a16232313e0175.squirrel@webmail.tu-dortmund.de>
Date: Tue, 15 Jul 2014 00:13:26 +0400
Message-ID:
Subject: Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
From: Konstantin Khlebnikov
To: Hugh Dickins
Cc: Ingo Korb, "Kirill A. Shutemov", Ning Qu, Dave Jones, Sasha Levin,
	Andrew Morton, "linux-mm@kvack.org", Linux Kernel Mailing List
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

It seems the bounding logic in do_fault_around() is wrong:

	start_addr = max(address & fault_around_mask(), vma->vm_start);
	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
	pte -= off;
	pgoff -= off;

Ok, off <= 511, but it might be bigger than the pte's offset within its
pte table. So after "pte -= off", pte points into the previous page.

	/*
	 *  max_pgoff is either end of page table or end of vma
	 *  or fault_around_pages() from pgoff, depending what is nearest.
	 */
	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
		PTRS_PER_PTE - 1;
	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
			pgoff + fault_around_pages() - 1);

On Mon, Jul 14, 2014 at 11:22 PM, Hugh Dickins wrote:
> On Mon, 14 Jul 2014, Ingo Korb wrote:
>
>> Hi,
>>
>> repeated mapping of the same file on tmpfs using remap_file_pages
>> sometimes triggers a "BUG at mm/filemap.c:202" when the process exits, log
>> message below. The system is an x86_64 VirtualBox machine with 2GB of RAM
>> running Debian, but it could also be reproduced on a non-virtualized
>> laptop.
>>
>> The bug can be triggered in Linux 3.16-rc5; bisecting has located d7c17551
>> as the first failing commit (mm: implement ->map_pages for shmem/tmpfs).
>>
>> A test program for this has been attached (I don't trust this webmailer to
>> not mangle it). With the parameters set in the source code, the BUG
>> message should be triggered within a small number of tries (usually the
>> first or second). Changing the size of the memory map sometimes delays the
>> bug ("while true; do ./remap-demo; done" should still trigger it within a
>> few seconds) or avoids it completely - I don't see any patterns yet. Using
>> (at least) two different mappings for the file, each of which has been
>> remapped, seems to be a requirement for triggering it.
>>
>> Implementing the same mappings using mmap() does not appear to cause any
>> problems, but I assume that someone might care about this problem while
>> remap_file_pages() is still in the kernel.
>
> This is very good news :) Thank you so much for going to all this
> trouble over it. If you didn't realize, yours is not the first report
> of an mm/filemap.c:202! BUG_ON(page_mapped(page)), but most of them
> have happened when using the Trinity fuzzer (known to be fond of tmpfs
> and remap_file_pages), and too rare to track down further.
>
> I have several times in recent months eyed the (old) remap_file_pages
> code, and the filemap_map_pages code, hoping to find the answer in one
> or the other; but had no success.
>
> Kirill, Konstantin, would either of you have a moment to try and track
> this down further? I'd love to, but I am _still_ not finished with the
> fallocate hang business, then sealing review, then plenty beyond that.
> Ingo's remap-demo.c inline below.
>
> Of course, one option will be just to revert d7c17551; but I'd much
> rather track down the bug and fix it, if we can in the next couple of
> weeks - even if it does turn out to be in code removed in 3.17.
>
> Thanks!
> Hugh
>
>>
>> -ik
>>
>>
>> ------------[ cut here ]------------
>> kernel BUG at mm/filemap.c:202!
>> invalid opcode: 0000 [#1] SMP
>> Modules linked in: uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
>> fscache sunrpc ext3 jbd loop joydev hid_generic usbhid hid psmouse
>> parport_pc ohci_pci ohci_hcd ehci_hcd usbcore ac i2c_piix4 pcspkr
>> serio_raw evdev parport battery button processor i2c_core usb_common
>> microcode thermal_sys ext4 crc16 jbd2 mbcache sr_mod cdrom sg sd_mod
>> crc_t10dif crct10dif_common ata_generic e1000 ahci libahci ata_piix libata
>> scsi_mod
>> CPU: 3 PID: 2992 Comm: test Not tainted 3.16.0-rc5ik1 #37
>> Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
>> task: ffff88005a9363d0 ti: ffff880037968000 task.ti: ffff880037968000
>> RIP: 0010:[] [] __delete_from_page_cache+0x16f/0x1f6
>> RSP: 0018:ffff88003796bba8 EFLAGS: 00010046
>> RAX: 0000000000000000 RBX: ffffea00012ee220 RCX: 00000000ffffffe2
>> RDX: 0000000000000018 RSI: 0000000000000018 RDI: ffff88005dbeb700
>> RBP: ffff8800378d1c10 R08: ffff88005dbeb700 R09: 0000000000000013
>> R10: 0000000000000013 R11: 0000000000000000 R12: 0000000000000000
>> R13: 0000000000000003 R14: ffff8800378d1c18 R15: 000000000000000f
>> FS: 0000000000000000(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f69ad38fa30 CR3: 0000000001611000 CR4: 00000000000006e0
>> Stack:
>> 0000000000000002 000000000000000f ffff880059899008 ffff8800598990a8
>> ffff8800378d1c10 ffffea00012ee220 ffff8800378d1c28 0000000000000000
>> ffff8800378d1ac0 ffff8800374d0600 0000000000000001 ffffffff810db65b
>> Call Trace:
>> [] ? delete_from_page_cache+0x32/0x56
>> [] ? truncate_inode_page+0x62/0x69
>> [] ? shmem_undo_range+0x13f/0x3f3
>> [] ? get_pfnblock_flags_mask+0x1d/0x4d
>> [] ? free_hot_cold_page+0x76/0x134
>> [] ? release_pages+0x171/0x180
>> [] ? hpage_nr_pages+0x1b/0x1b
>> [] ? __inode_wait_for_writeback+0x67/0xae
>> [] ? shmem_truncate_range+0xb/0x25
>> [] ? shmem_evict_inode+0x4f/0xed
>> [] ? shmem_file_setup+0x7/0x7
>> [] ? evict+0xa3/0x147
>> [] ? __dentry_kill+0x103/0x173
>> [] ? dput+0x133/0x150
>> [] ? __fput+0x163/0x184
>> [] ? task_work_run+0x7b/0x8f
>> [] ? do_exit+0x3f6/0x904
>> [] ? do_group_exit+0x68/0x9a
>> [] ? SyS_exit_group+0x10/0x10
>> [] ? system_call_fastpath+0x16/0x1b
>> Code: be 0a 00 00 00 48 89 df e8 96 5b 01 00 48 8b 03 a9 00 00 08 00 74 0d
>> be 18 00 00 00 48 89 df e8 7f 5b 01 00 8b 43 18 85 c0 78 02 <0f> 0b 48 8b
>> 03 a8 10 74 6f 48 8b 85 88 00 00 00 f6 40 20 01 75
>> RIP [] __delete_from_page_cache+0x16f/0x1f6
>> RSP
>> ---[ end trace 79ae5bd27fcedca9 ]---
>> Fixing recursive fault but reboot is needed!
>> BUG: Bad rss-counter state mm:ffff88005aae60c0 idx:0 val:1
>
> And that "Bad rss-counter" report fits some of the reports too, good.
>
> Here's Ingo's remap-demo.c inline, but I've not tried it:
>
> #define _GNU_SOURCE
> /*
>  * The header names were lost from the archived copy of this mail; the
>  * four below are the ones the code in this file needs, not necessarily
>  * the original list.
>  */
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> #define PAGE_SIZE 4096
> // NOTE: DATA=MAP2=16 seems to trigger in the first few tries
> // NOTE: 9/9 needs a loop and a few seconds to trigger
> // NOTE: DATA=9, MAP2=8 does not trigger
> #define DATA_SIZE 16
> #define MAP2_SIZE 16
>
> int shmfd;
> char shmpath[] = "/dev/shm/mmaptest-XXXXXX";
> unsigned char *map1, *map2;
> unsigned int i;
>
> int main(int argc, char *argv[]) {
>   /* create a data file on tmpfs */
>   shmfd = mkstemp(shmpath);
>   if (shmfd < 0) {
>     perror("mkstemp");
>     exit(2);
>   }
>
>   if (unlink(shmpath)) {
>     perror("unlink");
>     exit(2);
>   }
>
>   if (ftruncate(shmfd, DATA_SIZE * PAGE_SIZE)) {
>     perror("ftruncate");
>     exit(2);
>   }
>
>   /* map a single page from the file */
>   map1 = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
>               MAP_SHARED, shmfd, 0);
>   if (map1 == MAP_FAILED) {
>     perror("mmap 1");
>     exit(2);
>   }
>
>   /* remap it to another page in the file */
>   // NOTE: Does not trigger without remapping
>   // NOTE: Does not trigger for 7, but does trigger for 8 if both sizes are 16
>   // (DATA_SIZE-2 is sufficiently generic here)
>   if (remap_file_pages(map1, PAGE_SIZE, 0, DATA_SIZE - 2, MAP_SHARED)) {
>     perror("remap_file_pages 1");
>     exit(2);
>   }
>
>   /* create a second mapping */
>   map2 = mmap(NULL, MAP2_SIZE * PAGE_SIZE, PROT_READ | PROT_WRITE,
>               MAP_SHARED, shmfd, 0);
>   if (map2 == MAP_FAILED) {
>     perror("mmap 2");
>     exit(2);
>   }
>
>   /* map all of its pages to page 0 */
>   // NOTE: Remapping only the last page does not trigger
>   for (i = 0; i < MAP2_SIZE; i++) {
>     if (remap_file_pages(map2 + PAGE_SIZE * i, PAGE_SIZE, 0, 0, MAP_SHARED)) {
>       perror("remap_file_pages 3");
>       exit(2);
>     }
>   }
>
>   close(shmfd);
>
>   exit(0);
> }
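
For anyone who wants to poke at the bounding arithmetic from the top of
this mail without a kernel build, here is a rough user-space sketch. It only
mirrors the start_addr/off computation; it is not the kernel code. The
helper names (fault_off, pte_index, pte_underflows) and the window_pages
parameter are made up for illustration, window_pages is assumed to be a
power of two, and the constants assume x86_64 with 4 KiB pages.

```c
/* Assumed x86_64 values with 4 KiB pages; not taken from kernel headers. */
#define PAGE_SHIFT   12UL
#define PTRS_PER_PTE 512UL

/*
 * Hypothetical helper: the number of pages that would be subtracted from
 * pte and pgoff, following the start_addr/off arithmetic quoted above.
 * window_pages stands in for fault_around_pages() and must be a power of 2.
 */
static unsigned long fault_off(unsigned long address, unsigned long vm_start,
                               unsigned long window_pages)
{
    unsigned long mask = ~((window_pages << PAGE_SHIFT) - 1);
    unsigned long start_addr = address & mask;

    if (start_addr < vm_start)          /* max(address & mask, vm_start) */
        start_addr = vm_start;
    return ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Index of the pte for 'address' within its page table. */
static unsigned long pte_index(unsigned long address)
{
    return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

/* Nonzero if "pte -= off" would step back past the start of the table. */
static int pte_underflows(unsigned long address, unsigned long vm_start,
                          unsigned long window_pages)
{
    return fault_off(address, vm_start, window_pages) > pte_index(address);
}
```

With address 0x201000 and vm_start 0x100000, a 16-page window gives off = 1
against a pte index of 1, so nothing underflows. A hypothetical 1024-page
window gives off = 257 against the same pte index of 1, which is the
situation described above. Whether a window that large can actually be
reached on the failing kernel is not something this sketch settles.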