Date: Wed, 31 Jul 2013 15:13:23 +0200
From: Hannes Frederic Sowa
To: "Paul E. McKenney"
Cc: vinayak menon, linux-kernel@vger.kernel.org, davem@davemloft.net, getarunks@gmail.com, netdev@vger.kernel.org
Subject: Re: ipv4: crash at leaf_walk_rcu
Message-ID: <20130731131323.GB31245@order.stressinduktion.org>
In-Reply-To: <20130731125513.GS26694@linux.vnet.ibm.com>

On Wed, Jul 31, 2013 at 05:55:13AM -0700, Paul E. McKenney wrote:
> On Wed, Jul 31, 2013 at 04:40:47PM +0530, vinayak menon wrote:
> > Hi,
> >
> > A crash was seen on 3.4.5 kernel during some random wlan operations.
> >
> > CPU: Single core ARM Cortex A9.
> >
> > fib_route_seq_next was called with second argument (void *v) as 0xd6e3e360
> > which is a "freed" object of the "ip_fib_trie" cache. I confirmed that the
> > object was freed with crash utility.
> >
> > Sequence: fib_route_seq_next->trie_nextleaf->leaf_walk_rcu
> >
> > As "v" was a freed object, inside trie_nextleaf(), node_parent_rcu()
> > returned an invalid tnode. But as I had enabled slab poisoning and the
> > object was already freed, the tnode was 0x6b6b6b6b. And this was passed to
> > leaf_walk_rcu and resulted in the crash.
> >
> > fib_route_seq_start takes rcu_read_lock(), but free_leaf
> > calls call_rcu_bh(). Can this be the problem?
> > Should rcu_read_lock() in fib_route_seq_start be changed to
> > rcu_read_lock_bh()?
>
> One way or the other, the RCU read-side primitives need to match the RCU
> update-side primitives.  Adding netdev...

Already fixed by:

commit 0c03eca3d995e73d691edea8c787e25929ec156d
Author: Eric Dumazet
Date:   Tue Aug 7 00:47:11 2012 +0000

    net: fib: fix incorrect call_rcu_bh()

    After IP route cache removal, I believe rcu_bh() has very little use and
    we should remove this RCU variant, since it adds some cycles in fast
    path.

    Anyway, the call_rcu_bh() use in fib_trie is obviously wrong, since
    some users only assert rcu_read_lock().