Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754254AbaG3Mfa (ORCPT ); Wed, 30 Jul 2014 08:35:30 -0400 Received: from ip4-83-240-18-248.cust.nbox.cz ([83.240.18.248]:36714 "EHLO ip4-83-240-18-248.cust.nbox.cz" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752871AbaG3MPc (ORCPT ); Wed, 30 Jul 2014 08:15:32 -0400 From: Jiri Slaby To: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Eric Dumazet , "David S. Miller" , Jiri Slaby Subject: [PATCH 3.12 38/94] ipv4: fix dst race in sk_dst_get() Date: Wed, 30 Jul 2014 14:14:27 +0200 Message-Id: <591b1e1bb40152e22cee757f493046a0ca946bf8.1406722270.git.jslaby@suse.cz> X-Mailer: git-send-email 2.0.1 In-Reply-To: References: In-Reply-To: References: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Eric Dumazet 3.12-stable review patch. If anyone has any objections, please let me know. =============== [ Upstream commit f88649721268999bdff09777847080a52004f691 ] When IP route cache had been removed in linux-3.6, we broke assumption that dst entries were all freed after rcu grace period. DST_NOCACHE dst were supposed to be freed from dst_release(). But it appears we want to keep such dst around, either in UDP sockets or tunnels. In sk_dst_get() we need to make sure dst refcount is not 0 before incrementing it, or else we might end up freeing a dst twice. DST_NOCACHE set on a dst does not mean this dst can not be attached to a socket or a tunnel. Then, before actual freeing, we need to observe a rcu grace period to make sure all other cpus can catch the fact the dst is no longer usable. Signed-off-by: Eric Dumazet Reported-by: Dormando Signed-off-by: David S. Miller Signed-off-by: Jiri Slaby --- include/net/sock.h | 4 ++-- net/core/dst.c | 16 +++++++++++----- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 4aa873a6267f..dc84ec518ff5 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1749,8 +1749,8 @@ sk_dst_get(struct sock *sk) rcu_read_lock(); dst = rcu_dereference(sk->sk_dst_cache); - if (dst) - dst_hold(dst); + if (dst && !atomic_inc_not_zero(&dst->__refcnt)) + dst = NULL; rcu_read_unlock(); return dst; } diff --git a/net/core/dst.c b/net/core/dst.c index ca4231ec7347..15b6792e6ebb 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -267,6 +267,15 @@ again: } EXPORT_SYMBOL(dst_destroy); +static void dst_destroy_rcu(struct rcu_head *head) +{ + struct dst_entry *dst = container_of(head, struct dst_entry, rcu_head); + + dst = dst_destroy(dst); + if (dst) + __dst_free(dst); +} + void dst_release(struct dst_entry *dst) { if (dst) { @@ -274,11 +283,8 @@ void dst_release(struct dst_entry *dst) newrefcnt = atomic_dec_return(&dst->__refcnt); WARN_ON(newrefcnt < 0); - if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt) { - dst = dst_destroy(dst); - if (dst) - __dst_free(dst); - } + if (unlikely(dst->flags & DST_NOCACHE) && !newrefcnt) + call_rcu(&dst->rcu_head, dst_destroy_rcu); } } EXPORT_SYMBOL(dst_release); -- 2.0.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/