Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756298AbZJSOJw (ORCPT );
        Mon, 19 Oct 2009 10:09:52 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1756185AbZJSOJv (ORCPT );
        Mon, 19 Oct 2009 10:09:51 -0400
Received: from gir.skynet.ie ([193.1.99.77]:57208 "EHLO gir.skynet.ie"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1756112AbZJSOJu (ORCPT );
        Mon, 19 Oct 2009 10:09:50 -0400
Date: Mon, 19 Oct 2009 15:09:57 +0100
From: Mel Gorman
To: Tobias Oetiker
Cc: Frans Pop, Pekka Enberg, David Rientjes, KOSAKI Motohiro,
        "Rafael J. Wysocki", Linux Kernel Mailing List,
        Kernel Testers List, Reinette Chatre, Bartlomiej Zolnierkiewicz,
        Karol Lewandowski, Mohamed Abbas, "John W. Linville",
        linux-mm@kvack.org, jens.axboe@oracle.com
Subject: Re: [Bug #14141] order 2 page allocation failures (generic)
Message-ID: <20091019140957.GE9036@csn.ul.ie>
References: <3onW63eFtRF.A.xXH.oMTxKB@chimera>
        <200910190133.33183.elendil@planet.nl>
        <1255912562.6824.9.camel@penberg-laptop>
        <200910190444.55867.elendil@planet.nl>
        <20091019133146.GB9036@csn.ul.ie>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.17+20080114 (2008-01-14)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 7774
Lines: 103

On Mon, Oct 19, 2009 at 03:40:05PM +0200, Tobias Oetiker wrote:
> Hi Mel,
>
> Today Mel Gorman wrote:
>
> > On Mon, Oct 19, 2009 at 11:49:08AM +0200, Tobi Oetiker wrote:
> > > Today Frans Pop wrote:
> > >
> > > >
> > > > I'm starting to think that this commit may not be directly related to high
> > > > order allocation failures. The fact that I'm seeing SKB allocation
> > > > failures earlier because of this commit could be just a side effect.
> > > > It could be that instead the main impact of this commit is on encrypted
> > > > file system and/or encrypted swap (kcryptd).
> > > >
> > > > Besides mm the commit also touches dm-crypt (and nfs/write.c, but as I'm
> > > > only reading from NFS that's unlikely).
> > >
> > > I have updated a fileserver to 2.6.31 today and I see page
> > > allocation failures from several parts of the system ... mostly nfs though ... (it is a nfs server).
> > > So I guess the problem must be quite generic:
> > >
> > >
> > > Oct 19 07:10:02 johan kernel: [23565.684110] swapper: page allocation failure. order:5, mode:0x4020 [kern.warning]
> > > Oct 19 07:10:02 johan kernel: [23565.684118] Pid: 0, comm: swapper Not tainted 2.6.31-02063104-generic #02063104 [kern.warning]
> > > Oct 19 07:10:02 johan kernel: [23565.684121] Call Trace: [kern.warning]
> > > Oct 19 07:10:02 johan kernel: [23565.684124] [] __alloc_pages_slowpath+0x3b2/0x4c0 [kern.warning]
> > >
> >
> > What's the rest of the stack trace? I'm wondering where a large number
> > of order-5 GFP_ATOMIC allocations are coming from. It seems different to
> > the e100 problem where there is one GFP_ATOMIC allocation while the
> > firmware is being loaded.
> >
> Oct 19 07:10:02 johan kernel: [23565.684110] swapper: page allocation failure. order:5, mode:0x4020 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684118] Pid: 0, comm: swapper Not tainted 2.6.31-02063104-generic #02063104 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684121] Call Trace: [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684124] [] __alloc_pages_slowpath+0x3b2/0x4c0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684157] [] __alloc_pages_nodemask+0x135/0x140 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684164] [] ? _spin_unlock_bh+0x14/0x20 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684170] [] kmalloc_large_node+0x68/0xc0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684175] [] __kmalloc_node_track_caller+0x11a/0x180 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684181] [] ? skb_copy+0x32/0xa0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684185] [] __alloc_skb+0x76/0x180 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684205] [] skb_copy+0x32/0xa0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684221] [] vboxNetFltLinuxPacketHandler+0x5c/0xd0 [vboxnetflt] [kern.warning]

Is the MTU set very high between the host and virtualised machine?

Can you test please with the patch at http://lkml.org/lkml/2009/10/16/89
applied and with commits 373c0a7e and 8aa7e847 reverted please?

> Oct 19 07:10:02 johan kernel: [23565.684231] [] dev_hard_start_xmit+0x189/0x1c0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684236] [] __qdisc_run+0x1a1/0x230 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684240] [] dev_queue_xmit+0x238/0x310 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684246] [] ip_finish_output+0x11b/0x2f0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684250] [] ip_output+0x89/0xd0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684254] [] ip_local_out+0x20/0x30 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684258] [] ip_queue_xmit+0x22b/0x3f0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684264] [] tcp_transmit_skb+0x345/0x4e0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684269] [] tcp_write_xmit+0xb6/0x2e0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684273] [] __tcp_push_pending_frames+0x2b/0xa0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684277] [] tcp_rcv_established+0x459/0x6d0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684282] [] tcp_v4_do_rcv+0x12d/0x140 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684285] [] tcp_v4_rcv+0x58e/0x7c0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684289] [] ip_local_deliver_finish+0x11d/0x2b0 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684293] [] ip_local_deliver+0x3b/0x90 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684297] [] ip_rcv_finish+0x146/0x420 [kern.warning]
> Oct 19 07:10:02 johan kernel: [23565.684301] [] ip_rcv+0x29b/0x370 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684304] [] netif_receive_skb+0x38a/0x4d0 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684308] [] napi_skb_finish+0x48/0x60 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684311] [] napi_gro_receive+0x34/0x40 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684330] [] tg3_rx+0x373/0x4b0 [tg3] [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684339] [] tg3_poll_work+0x70/0xf0 [tg3] [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684347] [] tg3_poll+0x3e/0xe0 [tg3] [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684350] [] net_rx_action+0x102/0x210 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684357] [] __do_softirq+0xc4/0x1f0 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684362] [] call_softirq+0x1c/0x30 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684365] [] do_softirq+0x55/0x90 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684369] [] irq_exit+0x7b/0x90 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684372] [] do_IRQ+0x73/0xe0 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684378] [] ret_from_intr+0x0/0x11 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684381] [] ? native_safe_halt+0x6/0x10 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684391] [] ? default_idle+0x48/0xe0 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684396] [] ? __atomic_notifier_call_chain+0xd/0x10 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684400] [] ? atomic_notifier_call_chain+0x11/0x20 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684404] [] ? cpu_idle+0x98/0xe0 [kern.warning]
> Oct 19 07:10:04 johan kernel: [23565.684410] [] ? start_secondary+0x95/0xc0 [kern.warning]
>
> if you need more, I can send you a whole bunch of them ...
>

I'm assuming they are all more or less the same.
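
For anyone reading the trace, here is a minimal sketch of what "order:5, mode:0x4020" in the failure message works out to. It assumes 4 KiB pages and the GFP bit values as I recall them from include/linux/gfp.h around 2.6.31, so check them against the tree actually in use. The helper names order_to_bytes and decode_mode are made up for illustration and are not kernel interfaces.

```python
# Sketch only: interpret "order:5, mode:0x4020" from the log line.
# Assumptions (not stated in this thread): 4 KiB pages, and GFP bit
# values as defined in include/linux/gfp.h around 2.6.31.

PAGE_SIZE = 4096  # assumed page size in bytes

# Partial table of GFP bits; only the ones relevant here are listed.
GFP_FLAGS = {
    0x10: "__GFP_WAIT",    # caller may sleep and reclaim
    0x20: "__GFP_HIGH",    # may dip into emergency reserves (GFP_ATOMIC)
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x4000: "__GFP_COMP",  # compound page
}


def order_to_bytes(order, page_size=PAGE_SIZE):
    """An order-N request asks for 2**N physically contiguous pages."""
    return page_size << order


def decode_mode(mode):
    """Return the names of the known flag bits set in mode.

    Any bits not in GFP_FLAGS are reported as one trailing hex string.
    """
    names = []
    remaining = mode
    for bit, name in sorted(GFP_FLAGS.items()):
        if mode & bit:
            names.append(name)
            remaining &= ~bit
    if remaining:
        names.append(hex(remaining))
    return names


if __name__ == "__main__":
    print(order_to_bytes(5))    # 131072 bytes, i.e. 128 KiB contiguous
    print(decode_mode(0x4020))  # ['__GFP_HIGH', '__GFP_COMP']
```

Under those assumptions the request is for 128 KiB of physically contiguous memory with __GFP_WAIT clear, so the allocator can neither sleep nor reclaim to satisfy it. That is why a single skb_copy() of a very large packet from softirq context can fail in this way.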

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab