Date: Fri, 2 Apr 2010 14:04:06 +0900
From: KAMEZAWA Hiroyuki
To: TAO HU
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Ye Yuan.Bo-A22116", Chang Qing-A21550, linux-arm-kernel@lists.infradead.org
Subject: Re: [Question] race condition in mm/page_alloc.c regarding page->lru?
Message-Id: <20100402140406.d3d7f18e.kamezawa.hiroyu@jp.fujitsu.com>
Organization: FUJITSU Co. LTD.

On Fri, 2 Apr 2010 11:51:33 +0800
TAO HU wrote:

> 2 patches related to page_alloc.c were applied.
> Does anyone see a connection between the 2 patches and the panic?
> NOTE: the full patches are attached.
>

I don't think there is any relationship between the patches and your panic.

BTW, there is another way to get this backtrace besides a race in
alloc_pages() itself. If someone calls list_del(&page->lru) on a page that
has already been freed, you'll see the same backtrace later.
So I suspect a use-after-free rather than a complicated race.
Thanks,
-Kame

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a596bfd..34a29e2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2551,6 +2551,20 @@ static inline unsigned long wait_table_bits(unsigned long size)
>  #define LONG_ALIGN(x) (((x)+(sizeof(long))-1)&~((sizeof(long))-1))
>
>  /*
> + * Check if a pageblock contains reserved pages
> + */
> +static int pageblock_is_reserved(unsigned long start_pfn)
> +{
> +	unsigned long end_pfn = start_pfn + pageblock_nr_pages;
> +	unsigned long pfn;
> +
> +	for (pfn = start_pfn; pfn < end_pfn; pfn++)
> +		if (PageReserved(pfn_to_page(pfn)))
> +			return 1;
> +	return 0;
> +}
> +
> +/*
>   * Mark a number of pageblocks as MIGRATE_RESERVE. The number
>   * of blocks reserved is based on zone->pages_min. The memory within the
>   * reserve will tend to store contiguous free pages. Setting min_free_kbytes
> @@ -2579,7 +2593,7 @@ static void setup_zone_migrate_reserve(struct zone *zone)
>  			continue;
>
>  		/* Blocks with reserved pages will never free, skip them. */
> -		if (PageReserved(page))
> +		if (pageblock_is_reserved(pfn))
>  			continue;
>
>  		block_migratetype = get_pageblock_migratetype(page);
> --
> 1.5.4.3
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5c44ed4..a596bfd 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -119,6 +119,7 @@ static char * const zone_names[MAX_NR_ZONES] = {
>  };
>
>  int min_free_kbytes = 1024;
> +int min_free_order_shift = 1;
>
>  unsigned long __meminitdata nr_kernel_pages;
>  unsigned long __meminitdata nr_all_pages;
> @@ -1256,7 +1257,7 @@ int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
>  		free_pages -= z->free_area[o].nr_free << o;
>
>  		/* Require fewer higher order pages to be free */
> -		min >>= 1;
> +		min >>= min_free_order_shift;
>
>  		if (free_pages <= min)
>  			return 0;
> --
>
>
> On Thu, Apr 1, 2010 at 12:05 PM, TAO HU wrote:
> > Hi, all
> >
> > We got a panic on our ARM (OMAP) based HW.
> > Our code is based on the 2.6.29 kernel (last commit for mm/page_alloc.c is
> > cc2559bccc72767cb446f79b071d96c30c26439b)
> >
> > After checking vmlinux, it appears to crash while walking pcp->list in
> > buffered_rmqueue() of mm/page_alloc.c.
> > The fault address "00100100" is LIST_POISON1, which in my view suggests
> > a race condition between list_add() and list_del().
> > However, we have not yet found a locking problem regarding page->lru.
> >
> > Are there any known race conditions in mm/page_alloc.c?
> > Any other hints are highly appreciated.
> >
> >	/* Find a page of the appropriate migrate type */
> >	if (cold) {
> >		... ...
> >	} else {
> >		list_for_each_entry(page, &pcp->list, lru)
> >			if (page_private(page) == migratetype)
> >				break;
> >	}
> >
> > <1>[120898.805267] Unable to handle kernel paging request at virtual address 00100100
> > <1>[120898.805633] pgd = c1560000
> > <1>[120898.805786] [00100100] *pgd=897b3031, *pte=00000000, *ppte=00000000
> > <4>[120898.806457] Internal error: Oops: 17 [#1] PREEMPT
> > ... ...
> > <4>[120898.807861] CPU: 0    Not tainted  (2.6.29-omap1 #1)
> > <4>[120898.808044] PC is at get_page_from_freelist+0x1d0/0x4b0
> > <4>[120898.808227] LR is at get_page_from_freelist+0xc8/0x4b0
> > <4>[120898.808563] pc : []    lr : []    psr: 800000d3
> > <4>[120898.808563] sp : c49fbd18  ip : 00000000  fp : c49fbd74
> > <4>[120898.809020] r10: 00000000  r9 : 001000e8  r8 : 00000002
> > <4>[120898.809204] r7 : 001200d2  r6 : 60000053  r5 : c0507c4c  r4 : c49fa000
> > <4>[120898.809509] r3 : 001000e8  r2 : 00100100  r1 : c0507c6c  r0 : 00000001
> > <4>[120898.809844] Flags: Nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
> > <4>[120898.810028] Control: 10c5387d  Table: 82160019  DAC: 00000017
> > <4>[120898.948425] Backtrace:
> > <4>[120898.948760] [] (get_page_from_freelist+0x0/0x4b0) from [] (__alloc_pages_internal+0xac/0x3e8)
> > <4>[120898.949554] [] (__alloc_pages_internal+0x0/0x3e8) from [] (handle_mm_fault+0x16c/0xbac)
> > <4>[120898.950347] [] (handle_mm_fault+0x0/0xbac) from [] (__get_user_pages+0x174/0x2b4)
> > <4>[120898.951019] [] (__get_user_pages+0x0/0x2b4) from [] (get_user_pages+0x3c/0x44)
> > <4>[120898.951812] [] (get_user_pages+0x0/0x44) from [] (get_arg_page+0x50/0xa4)
> > <4>[120898.952636] [] (get_arg_page+0x0/0xa4) from [] (copy_strings+0x108/0x210)
> > <4>[120898.953430]  r7:beffffe4 r6:00000ffc r5:00000000 r4:00000018
> > <4>[120898.954223] [] (copy_strings+0x0/0x210) from [] (copy_strings_kernel+0x3c/0x74)
> > <4>[120898.955047] [] (copy_strings_kernel+0x0/0x74) from [] (do_execve+0x18c/0x2b0)
> > <4>[120898.955841]  r5:0001e240 r4:0001e224
> > <4>[120898.956329] [] (do_execve+0x0/0x2b0) from [] (sys_execve+0x3c/0x5c)
> > <4>[120898.957153] [] (sys_execve+0x0/0x5c) from [] (ret_fast_syscall+0x0/0x2c)
> > <4>[120898.957946]  r7:0000000b r6:0001e270 r5:00000000 r4:0001d580
> > <4>[120898.958740] Code: e1530008 0a000006 e2429018 e1a03009 (e5b32018)
> >
> > --
> > Best Regards
> > Hu Tao