Date: Fri, 2 Apr 2010 11:09:14 -0700 (PDT)
From: Linus Torvalds
To: Borislav Petkov, Rik van Riel
Cc: Andrew Morton, Linux Kernel Mailing List, KOSAKI Motohiro,
    Lee Schermerhorn, Minchan Kim, Nick Piggin, Andrea Arcangeli,
    Hugh Dickins
Subject: Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linux 2.6.34-rc3)
In-Reply-To: <20100402175937.GA19690@liondog.tnic>
References: <20100402175937.GA19690@liondog.tnic>

I think this is likely due to the new scalable anon_vma linking by Rik.
Nothing else I can imagine should have introduced anything like it.

Rik: the pictures have the information, but you need to look at several to
see both the oops and the backtrace. Here's a condensed version:

    shrink_all_memory ->
      do_try_to_free_pages ->
        shrink_zone ->
          shrink_inactive_list ->
            shrink_page_list ->
              page_referenced

where page_referenced() oopses due to page_referenced_anon() as per
Borislav's description below.

Added all the usual suspects to the Cc list. Left the full report appended
so that the new people don't have to search for it on lkml.

		Linus

On Fri, 2 Apr 2010, Borislav Petkov wrote:
>
> I've got the following oopsie two times now when hibernating - this
> means, I don't get it every time I hibernate but only sometimes, say once
> in a blue moon.
>
> And yeah, I couldn't catch it over serial console so I had to make ugly
> pictures.
> By the way, the numbers in the filenames increment as I scroll
> down the whole oops (yep, it hadn't completely frozen and I still could
> do Shift->PgUp or Shift->PgDn on the console):
>
> http://www.kernel.org/pub/linux/kernel/people/bp/
>
> So, here's what I could decipher from the oopsie, someone else who's
> more knowledgeable in mm, rmap and anon_vma's list traversal should be
> able to tell what goes wrong there.
>
> EIP is at page_referenced+0xee
>
> which is
>
>     10c4:   41 01 c4        add    %eax,%r12d
>     10c7:   83 7d cc 00     cmpl   $0x0,-0x34(%rbp)
>     10cb:   74 19           je     10e6
>     10cd:   4d 8b 6d 20     mov    0x20(%r13),%r13
>     10d1:   49 83 ed 20     sub    $0x20,%r13
>
>     10d5:   49 8b 45 20     mov    0x20(%r13),%rax    <--------------
>
>     10d9:   0f 18 08        prefetcht0 (%rax)
>     10dc:   49 8d 45 20     lea    0x20(%r13),%rax
>     10e0:   48 39 45 80     cmp    %rax,-0x80(%rbp)
>
> Corresponding asm:
>
>         .loc 1 496 0
>         movq    32(%r13), %r13  # .same_anon_vma.next, __mptr.451
> .LVL295:
>         subq    $32, %r13       #, avc
> .LVL296:
> .L184:
> .LBE1278:
>         movq    32(%r13), %rax  # .same_anon_vma.next, .same_anon_vma.next    <----------------
>         prefetcht0      (%rax)  # .same_anon_vma.next
>         leaq    32(%r13), %rax  #, tmp97
>         cmpq    %rax, -128(%rbp)        # tmp97, %sfp
>         jne     .L187   #,
> .L186:
>         .loc 1 514 0
>         movq    %r14, %rdi      # anon_vma,
>         call    page_unlock_anon_vma    #
>
> and the NULL pointer in question is being written into %r13 and then 32
> is subtracted from it (I'm guessing container_of()). This is consistent
> with the register snapshot - %r13 contains 0xffffffffffffffe0 which is
> -32 and with the code dump in the oops, in CIMG1640.JPG code points to
> opcode 49 8b 45 20.
>
> Which is the following piece of code in .
>
>         mapcount = page_mapcount(page);
>         list_for_each_entry(avc, &anon_vma->head, same_anon_vma) {
>                 struct vm_area_struct *vma = avc->vma;
>                 unsigned long address = vma_address(page, vma);
>                 if (address == -EFAULT)
>                         continue;
>
> which tells us that same_anon_vma.next is NULL. Hmm...
>
> --
> Regards/Gruss,
> Boris.
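
For anyone following along without the pictures, the arithmetic Boris
describes can be reproduced in user space. What follows is only a sketch
under stated assumptions: struct avc_sketch and both helper functions are
made-up names for illustration, and the layout merely mimics a structure
whose same_anon_vma member sits 32 bytes in on a 64-bit build, matching the
"sub $0x20,%r13" in the dump. It is not the kernel's container_of() or
list_for_each_entry() code, just the same pointer math done on integers so
that the NULL case stays well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Hypothetical layout (an assumption for this sketch): two pointers and
 * one list node come before same_anon_vma, which puts it at offset 32
 * when pointers are 8 bytes wide.
 */
struct avc_sketch {
    void *vma;
    void *anon_vma;
    struct list_head same_vma;
    struct list_head same_anon_vma;
};

/*
 * The container_of() step, done as integer arithmetic: given the address
 * of the embedded list node, step back to the start of the entry.
 * Feeding it NULL models a same_anon_vma.next that was NULL.
 */
static uintptr_t entry_from_member(const struct list_head *member)
{
    return (uintptr_t)member - offsetof(struct avc_sketch, same_anon_vma);
}

/*
 * Address the next loop iteration reads from: the same_anon_vma.next
 * field of the entry, which sits at the start of the embedded node.
 */
static uintptr_t next_field_address(uintptr_t entry)
{
    return entry + offsetof(struct avc_sketch, same_anon_vma);
}
```

With 8-byte pointers, entry_from_member(NULL) comes out as
0xffffffffffffffe0, the value reported in %r13, and next_field_address() of
that value wraps back to 0, so the "mov 0x20(%r13),%rax" load is the one
that touches address zero.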