Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751852Ab0G3Eei (ORCPT ); Fri, 30 Jul 2010 00:34:38 -0400 Received: from complete.lackof.org ([198.49.126.79]:39816 "EHLO complete.lackof.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750859Ab0G3Eef (ORCPT ); Fri, 30 Jul 2010 00:34:35 -0400 Date: Thu, 29 Jul 2010 22:34:23 -0600 From: dann frazier To: Hugh Dickins Cc: KAMEZAWA Hiroyuki , linux-ia64@vger.kernel.org, linux-kernel@vger.kernel.org, Rik van Riel , KOSAKI Motohiro , Nick Piggin , Mel Gorman , Minchan Kim , Ralf Baechle Subject: Re: ia64 hang/mca running gdb 'make check' Message-ID: <20100730043423.GH20265@lackof.org> References: <20100720173512.GF26783@ldl.fc.hp.com> <20100721105136.9d4440de.kamezawa.hiroyu@jp.fujitsu.com> <20100721030629.GA9987@lackof.org> <20100727071914.GB22945@lackof.org> <20100727180330.b6ecba7f.kamezawa.hiroyu@jp.fujitsu.com> <20100727144326.GC22945@lackof.org> <20100729192216.GF20265@lackof.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4460 Lines: 118 On Thu, Jul 29, 2010 at 07:01:56PM -0700, Hugh Dickins wrote: > On Thu, 29 Jul 2010, dann frazier wrote: > > On Wed, Jul 28, 2010 at 08:50:18PM -0700, Hugh Dickins wrote: > > > > > > Let's note that gdb's gcore is building up its own version of a > > > coredump, not going through the get_dump_page() code I was wondering > > > about. If I read gcore correctly (possibly not!), it will be reading > > > selected areas from /proc//mem i.e. using access_process_vm(). > > > > This appears to be correct. I was able to collect the following > > stacktrace using INIT: > > > > [ 2535.074197] Backtrace of pid 4605 (gdb) > > [ 2535.074197] > > [ 2535.074197] Call Trace: > > [ 2535.074197] [] ia64_native_leave_kernel+0x0/0x270 > > [ 2535.074197] sp=e000004081c77c40 bsp=e000004081c71018 > > [ 2535.074197] [] __copy_user+0x160/0x960 > > [ 2535.074197] sp=e000004081c77e10 bsp=e000004081c71018 > > [ 2535.074197] [] access_process_vm+0x2c0/0x380 > > [ 2535.074197] sp=e000004081c77e10 bsp=e000004081c70f60 > > Thanks a lot, dann. But it was the [vdso] line in foo's /proc//maps > which you sent me privately, that set me thinking on the right track. > Here's what I believe is the appropriate patch: please give it a try > and let us know... dannf@rx2600:~> gdb foo GNU gdb (GDB) SUSE (7.0-0.4.16) Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "ia64-suse-linux". For bug reporting instructions, please see: ... Reading symbols from /home/dannf/foo...done. (gdb) break leaf Breakpoint 1 at 0x4000000000000401: file foo.c, line 2. (gdb) run Starting program: /home/dannf/foo Breakpoint 1, leaf () at foo.c:2 2 return 0; (gdb) gcore Saved corefile core.3952 (gdb) good work Hugh! -dann > > [PATCH] mm: fix ia64 crash when gcore reads gate area > > Debian's ia64 autobuilders have been seeing kernel freeze or reboot > when running the gdb testsuite (Debian bug 588574): dannf bisected to > 2.6.32 62eede62dafb4a6633eae7ffbeb34c60dba5e7b1 "mm: ZERO_PAGE without > PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target. > > I'd missed updating the gate_vma handling in __get_user_pages(): that > happens to use vm_normal_page() (nowadays failing on the zero page), > yet reported success even when it failed to get a page - boom when > access_process_vm() tried to copy that to its intermediate buffer. > > Fix this, resisting cleanups: in particular, leave it for now reporting > success when not asked to get any pages - very probably safe to change, > but let's not risk it without testing exposure. > > Why did ia64 crash with 16kB pages, but succeed with 64kB pages? > Because setup_gate() pads each 64kB of its gate area with zero pages. > > Reported-by: Andreas Barth > Bisected-by: dann frazier > Signed-off-by: Hugh Dickins > Cc: stable@kernel.org > --- > > mm/memory.c | 16 +++++++++++++--- > 1 file changed, 13 insertions(+), 3 deletions(-) > > --- 2.6.35-rc6/mm/memory.c 2010-05-30 17:58:57.000000000 -0700 > +++ linux/mm/memory.c 2010-07-29 17:57:29.000000000 -0700 > @@ -1394,10 +1394,20 @@ int __get_user_pages(struct task_struct > return i ? : -EFAULT; > } > if (pages) { > - struct page *page = vm_normal_page(gate_vma, start, *pte); > + struct page *page; > + > + page = vm_normal_page(gate_vma, start, *pte); > + if (!page) { > + if (!(gup_flags & FOLL_DUMP) && > + is_zero_pfn(pte_pfn(*pte))) > + page = pte_page(*pte); > + else { > + pte_unmap(pte); > + return i ? : -EFAULT; > + } > + } > pages[i] = page; > - if (page) > - get_page(page); > + get_page(page); > } > pte_unmap(pte); > if (vmas) > -- dann frazier -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/