Return-path:
Received: from mail-iy0-f174.google.com ([209.85.210.174]:43791 "EHLO
	mail-iy0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752601Ab1BGIGm (ORCPT );
	Mon, 7 Feb 2011 03:06:42 -0500
Date: Mon, 7 Feb 2011 00:06:35 -0800
From: Dmitry Torokhov
To: Felix Fietkau
Cc: Miklos Szeredi , linux-kernel@vger.kernel.org,
	linux-wireless@vger.kernel.org, "John W. Linville"
Subject: Re: Wireless regression (was 2.6.38-rc3: FUSE (sshfs) hangs under load)
Message-ID: <20110207080634.GA11580@core.coreip.homeip.net>
References: <20110201175452.GB518@core.coreip.homeip.net>
	<20110202165236.GA3178@core.coreip.homeip.net>
	<20110203065541.GB5592@core.coreip.homeip.net>
	<20110203194115.GA14159@core.coreip.homeip.net>
	<20110204064952.GA12914@core.coreip.homeip.net>
	<4D4BE5D7.8090202@openwrt.org>
	<4D4BEB99.4060202@openwrt.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <4D4BEB99.4060202@openwrt.org>
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

On Fri, Feb 04, 2011 at 01:05:45PM +0100, Felix Fietkau wrote:
> On 2011-02-04 12:41 PM, Felix Fietkau wrote:
> > On 2011-02-04 7:49 AM, Dmitry Torokhov wrote:
> >> On Thu, Feb 03, 2011 at 11:41:15AM -0800, Dmitry Torokhov wrote:
> >>> On Thu, Feb 03, 2011 at 12:13:24PM +0100, Miklos Szeredi wrote:
> >>> > On Wed, 2 Feb 2011, Dmitry Torokhov wrote:
> >>> > > On Wed, Feb 02, 2011 at 08:52:36AM -0800, Dmitry Torokhov wrote:
> >>> > > > On Wed, Feb 02, 2011 at 12:52:36PM +0100, Miklos Szeredi wrote:
> >>> > > > > On Tue, 1 Feb 2011, Dmitry Torokhov wrote:
> >>> > > > > > Hi,
> >>> > > > > >
> >>> > > > > > After installing 2.6.38-rc3 (plus a few input patches) sshfs started to
> >>> > > > > > misbehave on me under load. It starts off fine but when I try to compile
> >>> > > > > > a few modules against kernel sources residing on the other box the
> >>> > > > > > processes go into 'D' state and just sit there doing nothing.
> >>> > > > > >
> >>> > > > > Can you please post a stack trace from SysRq-T?
> >>> > > > >
> >>> > > > > >> ...
> >>> > >
> >>> > > OK, so here are the stack traces you requested. First one is snapshot of
> >>> > > when compile got stuck, the 2nd one is when I interrupted make which
> >>> > > caused gcc to go to 'D' state.
> >>> >
> >>> > There doesn't appear anything abnormal there.
> >>> >
> >>> > It's going into D state after it has received an interrupt and sent it
> >>> > along to the userspace filesystem. Then it will go into
> >>> > uninterruptible sleep until the answer is received.
> >>> >
> >>> > So the hang is because the answer to an open request is not being
> >>> > received. I can't tell where it got stuck, apparently not anywhere on
> >>> > the local machine.
> >>> >
> >>> > Can you please get a log from sshfs with "-odebug,sshfs_debug" and
> >>> > redirect stderr to a file? That might tell a bit more about the
> >>> > situation. Or it might not...
> >>>
> >>> Hmm, it might be just the network itself, last night mutt in ssh session
> >>> froze on me as well. I guess I'll just have to finish my bisect
> >>> exercise.
> >>>
> >>
> >> I finished bisecting and it turned out that the problematic commit
> >> happened to be in wireless (I have iwl3945):
> >>
> >> commit 4cd06a344db752f513437138953af191cbe9a691
> >> Author: Felix Fietkau
> >> Date: Sat Dec 18 19:30:49 2010 +0100
> >>
> >>     mac80211: skip unnecessary pskb_expand_head calls
> >>
> >>     If the skb is not cloned and we don't need any extra headroom, there
> >>     is no point in reallocating the skb head.
> >>
> >>     Signed-off-by: Felix Fietkau
> >>     Signed-off-by: John W. Linville
> >>
> >> With this commit reverted from 2.6.38-rc3 I can not reproduce sshfs
> >> getting stuck here.
> > I really don't see how this commit could be causing these issues, and
> > I'm not aware of any similar issues affecting other drivers.
> Could you please try this patch to see if it fixes the issue as well?
>
> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index ffc6749..3168eae 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> @@ -1547,7 +1547,7 @@ static int ieee80211_skb_resize(struct ieee80211_local *local,
>  		skb_orphan(skb);
>  	}
>
> -	if (skb_header_cloned(skb))
> +	if (skb_cloned(skb))
>  		I802_DEBUG_INC(local->tx_expand_skb_head_cloned);
>  	else if (head_need || tail_need)
>  		I802_DEBUG_INC(local->tx_expand_skb_head);

Yes, it does, thank you for fixing it.

-- 
Dmitry
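
The patch quoted in this message replaces skb_header_cloned() with
skb_cloned() in ieee80211_skb_resize(). The thread does not spell out why
the two checks can disagree, so the following is only a rough user-space
sketch of one plausible reading. It assumes that the data reference count
packs "all users" into its low half and "payload-only users" into its high
half. Every name prefixed with model_ or MODEL_ is hypothetical and is not
kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; the split of the counter is an assumption. */
#define MODEL_DATAREF_SHIFT 16
#define MODEL_DATAREF_MASK  ((1u << MODEL_DATAREF_SHIFT) - 1)

struct model_skb {
	bool cloned;      /* buffer has been cloned at least once */
	uint32_t dataref; /* low half: all users; high half: payload-only users */
};

/* True if any other user references the shared data area at all. */
static bool model_skb_cloned(const struct model_skb *skb)
{
	return skb->cloned && (skb->dataref & MODEL_DATAREF_MASK) != 1;
}

/* True only if another user may still touch the header part of the data. */
static bool model_skb_header_cloned(const struct model_skb *skb)
{
	uint32_t users;

	if (!skb->cloned)
		return false;

	/* Subtract payload-only users from the total number of users. */
	users = (skb->dataref & MODEL_DATAREF_MASK) -
		(skb->dataref >> MODEL_DATAREF_SHIFT);
	return users != 1;
}

/* Example state: two users in total, one of which only needs the payload. */
static struct model_skb model_shared_payload(void)
{
	struct model_skb skb = {
		.cloned = true,
		.dataref = (1u << MODEL_DATAREF_SHIFT) + 2,
	};
	return skb;
}
```

In this model the buffer returned by model_shared_payload() counts as
cloned but not as header-cloned. A check based only on the header variant
would therefore treat a buffer whose payload is still shared as safe to
modify in place, which is the kind of case the stricter skb_cloned() test
in the patch would catch.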