Date: Thu, 2 Apr 2009 13:38:57 -0700
From: Stephen Hemminger
To: Ingo Molnar
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
    netfilter@vger.kernel.org, "Paul E. McKenney", Eric Dumazet,
    "David S. Miller", Patrick McHardy, Rusty Russell,
    coreteam@netfilter.org
Subject: Re: [PATCH] netfilter: iptables: lock free counters, PREEMPT_RCU=y fix
Message-ID: <20090402133857.1e3421b0@nehalam>
In-Reply-To: <20090402201245.GA29904@elte.hu>
References: <20090402200128.GA21805@elte.hu> <20090402201245.GA29904@elte.hu>
Organization: Vyatta

On Thu, 2 Apr 2009 22:12:45 +0200
Ingo Molnar wrote:

>
> Impact: fix log spam under CONFIG_DEBUG_PREEMPT=y
>
> This recent commit:
>
>   7845447: netfilter: iptables: lock free counters
>
> Converted a couple of netfilter codepaths from read_lock() critical
> sections to lockless rcu_read_lock(). What it forgot about is that
> under CONFIG_PREEMPT=y and CONFIG_PREEMPT_RCU=y these sections can
> be preempted.
>
> Under CONFIG_DEBUG_PREEMPT=y this produces such warnings:
>
>  BUG: using smp_processor_id() in preemptible [00000000] code: ssh/9115
>  caller is ipt_do_table+0xc8/0x559
>  Pid: 9115, comm: ssh Tainted: G W 2.6.29-tip-08646-g45ef7c3-dirty #26231
>  Call Trace:
>  [] ? printk+0x14/0x16
>  [] debug_smp_processor_id+0xa6/0xbc
>  [] ipt_do_table+0xc8/0x559
>  [] ? _read_unlock+0x3d/0x49
>  [] ? fn_hash_lookup+0x94/0xa0
>  [] ? __inet_dev_addr_type+0x56/0x8d
>  [] ? neigh_lookup+0xe5/0x108
>  [] ipt_local_hook+0x40/0x50
>  [] nf_iterate+0x34/0x80
>  [] ? dst_output+0x0/0x10
>  [] nf_hook_slow+0x47/0xa4
>  [] ? dst_output+0x0/0x10
>  [] __ip_local_out+0x78/0x7f
>  [] ? dst_output+0x0/0x10
>  [] ip_local_out+0x10/0x20
>  [] ip_queue_xmit+0x2bc/0x332
>  [] ? __ip_route_output_key+0x112/0x77b
>  [] ? local_bh_enable+0x10/0x12
>  [] ? tcp_connect+0x32a/0x3bb
>  [] ? __inet_hash_nolisten+0x97/0xaf
>  [] ? __copy_skb_header+0xe/0x13a
>  [] ? tcp_connect+0x32a/0x3bb
>  [] ? tcp_transmit_skb+0x5a5/0x61c
>  [] tcp_transmit_skb+0x5e5/0x61c
>  [] ? __alloc_skb+0x54/0x120
>  [] ? tcp_connect+0x20f/0x3bb
>  [] tcp_connect+0x32a/0x3bb
>  [] tcp_v4_connect+0x466/0x4be
>  [] inet_stream_connect+0x8f/0x212
>  [] ? might_fault+0x75/0x77
>  [] ? copy_from_user+0x2f/0x117
>  BUG: using smp_processor_id() in preemptible [00000000] code: ssh/9114
>
> Since it appears that the tables are RCU freed, and there are no
> non-preempt assumptions in the code, the using of
> raw_smp_processor_id() is safe.
>
> [ I also audited all of net/netfilter/*.c for smp_processor_id() use,
>   and fixed all places that used them unsafely. ]
>
> Signed-off-by: Ingo Molnar
> ---
>  net/ipv4/netfilter/arp_tables.c |    2 +-
>  net/ipv4/netfilter/ip_tables.c  |    4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
> index 35c5f6a..30baf3e 100644
> --- a/net/ipv4/netfilter/arp_tables.c
> +++ b/net/ipv4/netfilter/arp_tables.c
> @@ -255,7 +255,7 @@ unsigned int arpt_do_table(struct sk_buff *skb,
>
>  	rcu_read_lock();
>  	private = rcu_dereference(table->private);
> -	table_base = rcu_dereference(private->entries[smp_processor_id()]);
> +	table_base = rcu_dereference(private->entries[raw_smp_processor_id()]);
>
>  	e = get_entry(table_base, private->hook_entry[hook]);
>  	back = get_entry(table_base, private->underflow[hook]);
> diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
> index 82ee7c9..eff124e 100644
> --- a/net/ipv4/netfilter/ip_tables.c
> +++ b/net/ipv4/netfilter/ip_tables.c
> @@ -280,7 +280,7 @@ static void trace_packet(struct sk_buff *skb,
>  	char *hookname, *chainname, *comment;
>  	unsigned int rulenum = 0;
>
> -	table_base = (void *)private->entries[smp_processor_id()];
> +	table_base = (void *)private->entries[raw_smp_processor_id()];
>  	root = get_entry(table_base, private->hook_entry[hook]);
>
>  	hookname = chainname = (char *)hooknames[hook];
> @@ -341,7 +341,7 @@ ipt_do_table(struct sk_buff *skb,
>
>  	rcu_read_lock();
>  	private = rcu_dereference(table->private);
> -	table_base = rcu_dereference(private->entries[smp_processor_id()]);
> +	table_base = rcu_dereference(private->entries[raw_smp_processor_id()]);
>
>  	e = get_entry(table_base, private->hook_entry[hook]);
>

NAK. The rcu_read_lock() needs to be rcu_read_lock_bh(); otherwise RCU
could corrupt the references.
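
For readers following the thread, the alternative named in the NAK can be pictured with a small userspace model. This is only a sketch under loud assumptions: every kernel primitive below is a stub written for illustration, `struct table_info`, `lookup_plain()` and `lookup_bh()` are hypothetical names, and the model shows only why the CONFIG_DEBUG_PREEMPT warning would not fire once bottom halves are disabled. It does not model RCU grace periods or the reference corruption mentioned in the reply.

```c
/* Userspace model only: every kernel primitive below is a stub. */
#include <assert.h>

#define NR_CPUS 4

int bh_disable_depth;   /* models the softirq-disable count */
int preempt_warnings;   /* simulated CONFIG_DEBUG_PREEMPT warnings */
int current_cpu = 2;    /* CPU the model pretends to run on */

/* Under preemptible RCU, rcu_read_lock() leaves preemption enabled. */
void rcu_read_lock(void) { }
void rcu_read_unlock(void) { }

/* The _bh variants disable bottom halves, which also blocks preemption. */
void rcu_read_lock_bh(void) { bh_disable_depth++; }
void rcu_read_unlock_bh(void) { bh_disable_depth--; }

/* Stand-in for the debug check: count a warning if the caller is preemptible. */
int smp_processor_id(void)
{
    assert(current_cpu >= 0 && current_cpu < NR_CPUS);
    if (bh_disable_depth == 0)
        preempt_warnings++;
    return current_cpu;
}

/* Hypothetical stand-in for the per-CPU rule table. */
struct table_info {
    const char *entries[NR_CPUS];   /* one rule-set copy per CPU */
};

/* Pattern that triggered the warning: per-CPU index while preemptible. */
const char *lookup_plain(const struct table_info *t)
{
    const char *base;

    rcu_read_lock();
    base = t->entries[smp_processor_id()];
    rcu_read_unlock();
    return base;
}

/* Pattern suggested in the reply: take the _bh read lock first. */
const char *lookup_bh(const struct table_info *t)
{
    const char *base;

    rcu_read_lock_bh();
    base = t->entries[smp_processor_id()];
    rcu_read_unlock_bh();
    return base;
}
```

In this model, calling `lookup_plain()` increments `preempt_warnings` while `lookup_bh()` does not, which mirrors the difference between silencing the check with raw_smp_processor_id() and making the section non-preemptible in the first place.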