From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Sabrina Dubroca, Stefano Brivio, "David S.
Miller", Sasha Levin
Subject: [PATCH 4.14 247/496] ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
Date: Mon, 28 May 2018 12:00:32 +0200
Message-Id: <20180528100330.302458730@linuxfoundation.org>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180528100319.498712256@linuxfoundation.org>
References: <20180528100319.498712256@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Sabrina Dubroca

[ Upstream commit d52e5a7e7ca49457dd31fc8b42fb7c0d58a31221 ]

Prior to the rework of PMTU information storage in commit
2c8cec5c10bc ("ipv4: Cache learned PMTU information in inetpeer."),
when a PMTU event advertising a PMTU smaller than
net.ipv4.route.min_pmtu was received, we would disable setting the DF
flag on packets by locking the MTU metric, and set the PMTU to
net.ipv4.route.min_pmtu.

Since then, we don't disable DF, and set PMTU to
net.ipv4.route.min_pmtu, so the intermediate router that has this link
with a small MTU will have to drop the packets.

This patch reestablishes pre-2.6.39 behavior by splitting
rtable->rt_pmtu into a bitfield with rt_mtu_locked and rt_pmtu.
rt_mtu_locked indicates that we shouldn't set the DF bit on that path,
and is checked in ip_dont_fragment().

One possible workaround is to set net.ipv4.route.min_pmtu to a value low
enough to accommodate the lowest MTU encountered.

Fixes: 2c8cec5c10bc ("ipv4: Cache learned PMTU information in inetpeer.")
Signed-off-by: Sabrina Dubroca
Reviewed-by: Stefano Brivio
Signed-off-by: David S.
Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 include/net/ip.h        |   11 +++++++++--
 include/net/ip_fib.h    |    1 +
 include/net/route.h     |    3 ++-
 net/ipv4/route.c        |   26 +++++++++++++++++++-------
 net/ipv4/xfrm4_policy.c |    1 +
 5 files changed, 32 insertions(+), 10 deletions(-)

--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -326,6 +326,13 @@ int ip_decrease_ttl(struct iphdr *iph)
 	return --iph->ttl;
 }
 
+static inline int ip_mtu_locked(const struct dst_entry *dst)
+{
+	const struct rtable *rt = (const struct rtable *)dst;
+
+	return rt->rt_mtu_locked || dst_metric_locked(dst, RTAX_MTU);
+}
+
 static inline
 int ip_dont_fragment(const struct sock *sk, const struct dst_entry *dst)
 {
@@ -333,7 +340,7 @@ int ip_dont_fragment(const struct sock *
 
 	return  pmtudisc == IP_PMTUDISC_DO ||
 		(pmtudisc == IP_PMTUDISC_WANT &&
-		 !(dst_metric_locked(dst, RTAX_MTU)));
+		 !ip_mtu_locked(dst));
 }
 
 static inline bool ip_sk_accept_pmtu(const struct sock *sk)
@@ -359,7 +366,7 @@ static inline unsigned int ip_dst_mtu_ma
 	struct net *net = dev_net(dst->dev);
 
 	if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
-	    dst_metric_locked(dst, RTAX_MTU) ||
+	    ip_mtu_locked(dst) ||
 	    !forwarding)
 		return dst_mtu(dst);
 
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -59,6 +59,7 @@ struct fib_nh_exception {
 	int				fnhe_genid;
 	__be32				fnhe_daddr;
 	u32				fnhe_pmtu;
+	bool				fnhe_mtu_locked;
 	__be32				fnhe_gw;
 	unsigned long			fnhe_expires;
 	struct rtable __rcu		*fnhe_rth_input;
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -63,7 +63,8 @@ struct rtable {
 	__be32			rt_gateway;
 
 	/* Miscellaneous cached information */
-	u32			rt_pmtu;
+	u32			rt_mtu_locked:1,
+				rt_pmtu:31;
 
 	u32			rt_table_id;
 
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -639,6 +639,7 @@ static inline u32 fnhe_hashfun(__be32 da
 static void fill_route_from_fnhe(struct rtable *rt, struct fib_nh_exception *fnhe)
 {
 	rt->rt_pmtu = fnhe->fnhe_pmtu;
+	rt->rt_mtu_locked = fnhe->fnhe_mtu_locked;
 	rt->dst.expires = fnhe->fnhe_expires;
 
 	if (fnhe->fnhe_gw) {
@@ -649,7 +650,7 @@ static void fill_route_from_fnhe(struct
 }
 
 static void update_or_create_fnhe(struct fib_nh *nh, __be32 daddr, __be32 gw,
-				  u32 pmtu, unsigned long expires)
+				  u32 pmtu, bool lock, unsigned long expires)
 {
 	struct fnhe_hash_bucket *hash;
 	struct fib_nh_exception *fnhe;
@@ -686,8 +687,10 @@ static void update_or_create_fnhe(struct
 			fnhe->fnhe_genid = genid;
 		if (gw)
 			fnhe->fnhe_gw = gw;
-		if (pmtu)
+		if (pmtu) {
 			fnhe->fnhe_pmtu = pmtu;
+			fnhe->fnhe_mtu_locked = lock;
+		}
 		fnhe->fnhe_expires = max(1UL, expires);
 		/* Update all cached dsts too */
 		rt = rcu_dereference(fnhe->fnhe_rth_input);
@@ -711,6 +714,7 @@ static void update_or_create_fnhe(struct
 		fnhe->fnhe_daddr = daddr;
 		fnhe->fnhe_gw = gw;
 		fnhe->fnhe_pmtu = pmtu;
+		fnhe->fnhe_mtu_locked = lock;
 		fnhe->fnhe_expires = max(1UL, expires);
 
 		/* Exception created; mark the cached routes for the nexthop
@@ -792,7 +796,8 @@ static void __ip_do_redirect(struct rtab
 				struct fib_nh *nh = &FIB_RES_NH(res);
 
 				update_or_create_fnhe(nh, fl4->daddr, new_gw,
-						0, jiffies + ip_rt_gc_timeout);
+						0, false,
+						jiffies + ip_rt_gc_timeout);
 			}
 			if (kill_route)
 				rt->dst.obsolete = DST_OBSOLETE_KILL;
@@ -1005,15 +1010,18 @@ static void __ip_rt_update_pmtu(struct r
 {
 	struct dst_entry *dst = &rt->dst;
 	struct fib_result res;
+	bool lock = false;
 
-	if (dst_metric_locked(dst, RTAX_MTU))
+	if (ip_mtu_locked(dst))
 		return;
 
 	if (ipv4_mtu(dst) < mtu)
 		return;
 
-	if (mtu < ip_rt_min_pmtu)
+	if (mtu < ip_rt_min_pmtu) {
+		lock = true;
 		mtu = ip_rt_min_pmtu;
+	}
 
 	if (rt->rt_pmtu == mtu &&
 	    time_before(jiffies, dst->expires - ip_rt_mtu_expires / 2))
@@ -1023,7 +1031,7 @@ static void __ip_rt_update_pmtu(struct r
 	if (fib_lookup(dev_net(dst->dev), fl4, &res, 0) == 0) {
 		struct fib_nh *nh = &FIB_RES_NH(res);
 
-		update_or_create_fnhe(nh, fl4->daddr, 0, mtu,
+		update_or_create_fnhe(nh, fl4->daddr, 0, mtu, lock,
 				      jiffies + ip_rt_mtu_expires);
 	}
 	rcu_read_unlock();
@@ -1276,7 +1284,7 @@ static unsigned int ipv4_mtu(const struc
 
 	mtu = READ_ONCE(dst->dev->mtu);
 
-	if (unlikely(dst_metric_locked(dst, RTAX_MTU))) {
+	if (unlikely(ip_mtu_locked(dst))) {
 		if (rt->rt_uses_gateway && mtu > 576)
 			mtu = 576;
 	}
@@ -1548,6 +1556,7 @@ struct rtable *rt_dst_alloc(struct net_d
 		rt->rt_is_input = 0;
 		rt->rt_iif = 0;
 		rt->rt_pmtu = 0;
+		rt->rt_mtu_locked = 0;
 		rt->rt_gateway = 0;
 		rt->rt_uses_gateway = 0;
 		rt->rt_table_id = 0;
@@ -2526,6 +2535,7 @@ struct dst_entry *ipv4_blackhole_route(s
 		rt->rt_is_input = ort->rt_is_input;
 		rt->rt_iif = ort->rt_iif;
 		rt->rt_pmtu = ort->rt_pmtu;
+		rt->rt_mtu_locked = ort->rt_mtu_locked;
 
 		rt->rt_genid = rt_genid_ipv4(net);
 		rt->rt_flags = ort->rt_flags;
@@ -2628,6 +2638,8 @@ static int rt_fill_info(struct net *net,
 	memcpy(metrics, dst_metrics_ptr(&rt->dst), sizeof(metrics));
 	if (rt->rt_pmtu && expires)
 		metrics[RTAX_MTU - 1] = rt->rt_pmtu;
+	if (rt->rt_mtu_locked && expires)
+		metrics[RTAX_LOCK - 1] |= BIT(RTAX_MTU);
 	if (rtnetlink_put_metrics(skb, metrics) < 0)
 		goto nla_put_failure;
 
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -100,6 +100,7 @@ static int xfrm4_fill_dst(struct xfrm_ds
 	xdst->u.rt.rt_gateway = rt->rt_gateway;
 	xdst->u.rt.rt_uses_gateway = rt->rt_uses_gateway;
 	xdst->u.rt.rt_pmtu = rt->rt_pmtu;
+	xdst->u.rt.rt_mtu_locked = rt->rt_mtu_locked;
 	xdst->u.rt.rt_table_id = rt->rt_table_id;
 	INIT_LIST_HEAD(&xdst->u.rt.rt_uncached);
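
For readers following the logic rather than the hunks, the clamp-and-lock
behaviour described in the changelog can be modelled in userspace roughly
as below. This is only an illustrative sketch, not kernel code: struct
model_route, model_update_pmtu() and model_dont_fragment() are hypothetical
names, min_pmtu is passed in as a parameter instead of being read from
net.ipv4.route.min_pmtu, and the expiry handling, the FIB lookup, the
"ignore a larger advertised MTU" check and the separate RTAX_MTU metric
lock are all left out.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the two fields the patch packs
 * into one 32-bit word of struct rtable (rt_mtu_locked:1, rt_pmtu:31).
 */
struct model_route {
	unsigned int mtu_locked:1,
		     pmtu:31;
};

/*
 * Simplified model of the clamp in __ip_rt_update_pmtu(): an advertised
 * MTU below min_pmtu is raised to min_pmtu and the route is marked
 * locked. A route that is already locked ignores further updates.
 */
static void model_update_pmtu(struct model_route *rt, unsigned int mtu,
			      unsigned int min_pmtu)
{
	bool lock = false;

	if (rt->mtu_locked)
		return;

	if (mtu < min_pmtu) {
		lock = true;
		mtu = min_pmtu;
	}

	rt->pmtu = mtu;
	rt->mtu_locked = lock;
}

/*
 * Simplified model of the IP_PMTUDISC_WANT branch of ip_dont_fragment():
 * DF is set only while the path MTU is not locked.
 */
static bool model_dont_fragment(const struct model_route *rt)
{
	return !rt->mtu_locked;
}
```

Under these assumptions, feeding an advertised MTU of 300 with min_pmtu set
to 552 leaves the modelled route at pmtu 552 with mtu_locked set, so
model_dont_fragment() returns false and packets on that path may be
fragmented by the small-MTU router instead of being dropped.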