Message-ID: <6278d2220802240437o1f730bbof65d366d5506d3e6@mail.gmail.com>
Date: Sun, 24 Feb 2008 12:37:36 +0000
From: "Daniel J Blueman"
To: "Kok, Auke"
Subject: Re: [2.6.25-rc2, 2.6.24-rc8] page allocation failure...
Cc: "Andrew Morton", "Linux Kernel", netdev@vger.kernel.org
In-Reply-To: <47BB13DD.1040804@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <6278d2220802141240g6ee2421ew94e57669ef930be6@mail.gmail.com>
 <6278d2220802170520k2ddf9072x386e4a9e3062f4da@mail.gmail.com>
 <20080218045849.59311851.akpm@linux-foundation.org>
 <47BB13DD.1040804@intel.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 19, 2008 at 5:37 PM, Kok, Auke wrote:
> Andrew Morton wrote:
> > On Sun, 17 Feb 2008 13:20:59 +0000 "Daniel J Blueman" wrote:
> >> I'm still hitting this with e1000e on 2.6.25-rc2, 10 times again.
>
> are you sure? I don't think that's the case and you're seeing e1000 dumps here...

Indeed so!
I thought I had moved to e1000e a while ago, but forgot that I had moved
back due to lack of support for the 82566DC, which has been added since.

I'm not seeing any related messages with e1000e after a few days'
uptime, so all looks well...

Thanks again,
  Daniel

> >> It's clearly non-fatal, but then do we expect it to occur?
> >>
> >> Daniel
> >>
> >> --- [dmesg]
> >>
> >> [ 1250.822786] swapper: page allocation failure. order:3, mode:0x4020
> >> [ 1250.822786] Pid: 0, comm: swapper Not tainted 2.6.25-rc2-119 #2
> >> [ 1250.822786]
> >> [ 1250.822786] Call Trace:
> >> [ 1250.822786] [] __alloc_pages+0x34e/0x3a0
> >> [ 1250.822786] [] ? __netdev_alloc_skb+0x1f/0x40
> >> [ 1250.822786] [] __slab_alloc+0x102/0x3d0
> >> [ 1250.822786] [] ? __netdev_alloc_skb+0x1f/0x40
> >> [ 1250.822786] [] __kmalloc_track_caller+0x7b/0xc0
> >> [ 1250.822786] [] __alloc_skb+0x6f/0x160
> >> [ 1250.822786] [] __netdev_alloc_skb+0x1f/0x40
> >> [ 1250.822786] [] e1000_alloc_rx_buffers+0x1ed/0x260
> >> [ 1250.822786] [] e1000_clean_rx_irq+0x22a/0x330
> >> [ 1250.822786] [] e1000_clean+0x1e1/0x540
> >> [ 1250.822786] [] ? tick_program_event+0x45/0x70
> >> [ 1250.822786] [] net_rx_action+0x9a/0x150
> >> [ 1250.822786] [] __do_softirq+0x74/0xf0
> >> [ 1250.822786] [] call_softirq+0x1c/0x30
> >> [ 1250.822786] [] do_softirq+0x3d/0x80
> >> [ 1250.822786] [] irq_exit+0x85/0x90
> >> [ 1250.822786] [] do_IRQ+0x85/0x100
> >> [ 1250.822786] [] ? mwait_idle+0x0/0x50
> >> [ 1250.822786] [] ret_from_intr+0x0/0xa
> >> [ 1250.822786] [] ? mwait_idle+0x45/0x50
> >> [ 1250.822786] [] ? enter_idle+0x22/0x30
> >> [ 1250.822786] [] ? cpu_idle+0x74/0xa0
> >> [ 1250.822786] [] ? rest_init+0x55/0x60
> >
> > They're regularly reported with e1000 too - I don't think anything really
> > changed.
> >
> > e1000 has this crazy problem where because of a cascade of follies (mainly
> > borked hardware) it has to do a 32kb allocation for a 9kb(?) packet. It
> > would be sad if that was carried over into e1000e?
>
> can't be, I personally removed that code.
>
> for MTU > 1500 e1000e uses a plain normal sized SKB. for anything bigger e1000e
> uses pages.
>
> so I don't see how this bug could still be showing up for e1000e at all. The large
> skb receive code is all gone (literally, removed).
>
> *please* rmmod e1000; modprobe e1000e and show the dumps again so we know for sure
> that we're not looking at e1000 dumps.
>
> short fix: increase ring size for e1000 with `modprobe e1000 RxDescriptors=4096`
> (or use ethtool) and `echo -n 8192 > /proc/sys/vm/min_free_kbytes` or something
> like that.
>
> what nic hardware is this on? lspci?
>
> Auke
>
--
Daniel J Blueman
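
For reference, the "order:3" in the dmesg line and the "32kb allocation
for a 9kb(?) packet" remark in the thread can be tied together with a
little arithmetic. The sketch below is only an illustration and is not
kernel code: the helper names are made up, the 4096-byte page size is an
assumption (typical for x86, not stated in the thread), and 9000 bytes
stands in for the "9kb(?)" frame.

```python
# Illustrative only: helper names are hypothetical, not kernel APIs.
PAGE_SIZE = 4096  # assumed page size; the thread does not state it


def allocation_bytes(order: int, page_size: int = PAGE_SIZE) -> int:
    """Size of a page-allocator request of the given order (2**order contiguous pages)."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (1 << order) * page_size


def smallest_order(nbytes: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest order whose block of contiguous pages can hold nbytes."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


if __name__ == "__main__":
    # order:3 from the dmesg line: 8 contiguous pages
    print(allocation_bytes(3))
    # a bare 9000-byte buffer (assumed jumbo frame size) would fit in a lower order
    print(smallest_order(9000))
```

Under those assumptions an order-3 request is 8 contiguous pages, i.e.
32 KiB, while a bare 9000-byte buffer would fit in an order-2 block. That
is consistent with the over-sized receive allocation described for e1000,
and with why such a request is more likely to fail once memory is
fragmented.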