Date: Thu, 9 Jul 2009 15:22:56 +0200
From: Jarek Poplawski
To: Thomas Gleixner
Cc: Andres Freund, Joao Correia, Arun R Bharadwaj, Stephen Hemminger,
    netdev@vger.kernel.org, LKML, Patrick McHardy, Peter Zijlstra
Subject: Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly caused by netem)
Message-ID: <20090709132256.GB3651@ami.dom.local>
References: <200907031326.21822.andres@anarazel.de>
    <200907071811.27570.andres@anarazel.de>
    <20090708080852.GC3148@ami.dom.local>
    <200907090023.18040.andres@anarazel.de>
    <20090708224828.GD3666@ami.dom.local>
    <20090709104412.GA3651@ami.dom.local>

On Thu, Jul 09, 2009 at 02:03:50PM +0200, Thomas Gleixner wrote:
> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > >
> > > I have the feeling that the code relies on some implicit cpu
> > > boundness, which is not longer guaranteed with the timer migration
> > > changes, but that's a question for the network experts.
> >
> > As a matter of fact, I've just looked at this __netif_schedule(),
> > which really is cpu bound, so you might be 100% right.
>
> So the watchdog is the one which causes the trouble. The patch below
> should fix this.

I hope so. On the other hand it seems it should work with this
migration yet, so it probably needs additional debugging.

Thanks,
Jarek P.

> ---
>
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 24d17ce..fbe554f 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -485,7 +485,7 @@ void qdisc_watchdog_schedule(struct qdisc_watchdog *wd, psched_time_t expires)
>  	wd->qdisc->flags |= TCQ_F_THROTTLED;
>  	time = ktime_set(0, 0);
>  	time = ktime_add_ns(time, PSCHED_TICKS2NS(expires));
> -	hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS);
> +	hrtimer_start(&wd->timer, time, HRTIMER_MODE_ABS_PINNED);
>  }
>  EXPORT_SYMBOL(qdisc_watchdog_schedule);
>
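
For readers following the thread, the difference between a pinned and a migratable watchdog timer can be sketched as a toy user-space model in C. This is not kernel code and it does not reproduce the lockup. Every `toy_*` name, the two-CPU setup, and the `migrate_to` parameter are assumptions made up for illustration. The only points taken from the discussion are that the reschedule step is CPU bound and that the patch switches the timer to a pinned mode.

```c
#include <assert.h>

#define TOY_NR_CPUS 2

/* Toy stand-ins for the two hrtimer modes named in the patch. */
enum toy_timer_mode { TOY_MODE_ABS, TOY_MODE_ABS_PINNED };

struct toy_timer {
	int armed_on_cpu;          /* CPU that armed the timer */
	enum toy_timer_mode mode;  /* pinned or free to migrate */
};

/* Per-CPU counter standing in for a per-CPU output queue. */
static int toy_output_queue[TOY_NR_CPUS];

/* Arm the timer from the given CPU. */
static void toy_timer_start(struct toy_timer *t, int cpu,
			    enum toy_timer_mode mode)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	t->armed_on_cpu = cpu;
	t->mode = mode;
}

/*
 * Pick the CPU on which the callback runs. 'migrate_to' is the CPU a
 * hypothetical migration policy would choose; a pinned timer ignores it.
 */
static int toy_timer_expiry_cpu(const struct toy_timer *t, int migrate_to)
{
	assert(migrate_to >= 0 && migrate_to < TOY_NR_CPUS);
	return t->mode == TOY_MODE_ABS_PINNED ? t->armed_on_cpu : migrate_to;
}

/* CPU-bound step: the work is queued on whichever CPU runs the callback. */
static void toy_netif_schedule(int running_cpu)
{
	toy_output_queue[running_cpu]++;
}

/* Watchdog expiry: run the callback and report which CPU it ran on. */
static int toy_watchdog_fire(const struct toy_timer *t, int migrate_to)
{
	int cpu = toy_timer_expiry_cpu(t, migrate_to);

	toy_netif_schedule(cpu);
	return cpu;
}
```

In this model, a timer armed on CPU 0 with `TOY_MODE_ABS` and fired with `migrate_to` set to 1 queues its work on CPU 1, while the same timer armed with `TOY_MODE_ABS_PINNED` queues it on CPU 0. Whether queueing on a different CPU is actually harmful is exactly the open question in the thread, so the model only shows where the work lands.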