From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Oliver Herms, David Ahern,
	"David S. Miller", Sasha Levin
Subject: [PATCH 5.12 060/178] ipv4: Fix device used for dst_alloc with local routes
Date: Mon, 21 Jun 2021 18:14:34 +0200
Message-Id: <20210621154924.518932557@linuxfoundation.org>
In-Reply-To: <20210621154921.212599475@linuxfoundation.org>
References: <20210621154921.212599475@linuxfoundation.org>

From: David Ahern

[ Upstream commit b87b04f5019e821c8c6c7761f258402e43500a1f ]

Oliver reported a use case where deleting a VRF device can hang
waiting for the refcnt to drop to 0. The root cause is that the dst
is allocated against the VRF device but cached on the loopback
device.

The use case (added to the selftests) has an implicit VRF crossing
due to the ordering of the FIB rules (lookup local is before the
l3mdev rule, but the problem occurs even if the FIB rules are
re-ordered with local after l3mdev because the VRF table does not
have a default route to terminate the lookup). The end result is
that the FIB lookup returns the loopback device as the nexthop,
but the ingress device is in a VRF. The mismatch causes the dst to
be allocated against the VRF device but then cached on the loopback.

The fix is to bring the trick used for IPv6 (see ip6_rt_get_dev_rcu):
pick the dst alloc device based on the fib lookup result but with
checks that the result has a nexthop device (e.g., not an unreachable
or prohibit entry).

Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
Reported-by: Oliver Herms
Signed-off-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
---
 net/ipv4/route.c                         | 15 +++++++++++++-
 tools/testing/selftests/net/fib_tests.sh | 25 ++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d635b4f32d34..09506203156d 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2081,6 +2081,19 @@ martian_source:
 	return err;
 }
 
+/* get device for dst_alloc with local routes */
+static struct net_device *ip_rt_get_dev(struct net *net,
+					const struct fib_result *res)
+{
+	struct fib_nh_common *nhc = res->fi ? res->nhc : NULL;
+	struct net_device *dev = NULL;
+
+	if (nhc)
+		dev = l3mdev_master_dev_rcu(nhc->nhc_dev);
+
+	return dev ? : net->loopback_dev;
+}
+
 /*
  *	NOTE. We drop all the packets that has local source
  *	addresses, because every properly looped back packet
@@ -2237,7 +2250,7 @@ local_input:
 		}
 	}
 
-	rth = rt_dst_alloc(l3mdev_master_dev_rcu(dev) ? : net->loopback_dev,
+	rth = rt_dst_alloc(ip_rt_get_dev(net, res),
 			   flags | RTCF_LOCAL, res->type,
 			   IN_DEV_ORCONF(in_dev, NOPOLICY), false);
 	if (!rth)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 2b5707738609..6fad54c7ecb4 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -1384,12 +1384,37 @@ ipv4_rt_replace()
 	ipv4_rt_replace_mpath
 }
 
+# checks that cached input route on VRF port is deleted
+# when VRF is deleted
+ipv4_local_rt_cache()
+{
+	run_cmd "ip addr add 10.0.0.1/32 dev lo"
+	run_cmd "ip netns add test-ns"
+	run_cmd "ip link add veth-outside type veth peer name veth-inside"
+	run_cmd "ip link add vrf-100 type vrf table 1100"
+	run_cmd "ip link set veth-outside master vrf-100"
+	run_cmd "ip link set veth-inside netns test-ns"
+	run_cmd "ip link set veth-outside up"
+	run_cmd "ip link set vrf-100 up"
+	run_cmd "ip route add 10.1.1.1/32 dev veth-outside table 1100"
+	run_cmd "ip netns exec test-ns ip link set veth-inside up"
+	run_cmd "ip netns exec test-ns ip addr add 10.1.1.1/32 dev veth-inside"
+	run_cmd "ip netns exec test-ns ip route add 10.0.0.1/32 dev veth-inside"
+	run_cmd "ip netns exec test-ns ip route add default via 10.0.0.1"
+	run_cmd "ip netns exec test-ns ping 10.0.0.1 -c 1 -i 1"
+	run_cmd "ip link delete vrf-100"
+
+	# if we do not hang test is a success
+	log_test $? 0 "Cached route removed from VRF port device"
+}
+
 ipv4_route_test()
 {
 	route_setup
 
 	ipv4_rt_add
 	ipv4_rt_replace
+	ipv4_local_rt_cache
 
 	route_cleanup
 }
-- 
2.30.2
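
For readers who want to see the device selection in isolation, the following is a rough userspace sketch of the before/after logic, not kernel code. The struct layouts are simplified stand-ins invented for illustration, and model_l3mdev_master(), pick_dev_old() and pick_dev_new() are hypothetical names. Only the shape of pick_dev_new() is taken from ip_rt_get_dev() in the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption: not the real layouts). */
struct net_device {
	const char *name;
	struct net_device *master;	/* enslaving L3 master (VRF), or NULL */
	int is_l3_master;		/* nonzero if this device is itself a VRF */
};

struct fib_nh_common {
	struct net_device *nhc_dev;
};

struct fib_result {
	const void *fi;			/* non-NULL when the entry carries nexthop info */
	struct fib_nh_common *nhc;
};

struct net {
	struct net_device *loopback_dev;
};

/* Rough model of l3mdev_master_dev_rcu(): a VRF maps to itself, a slave to its master. */
static struct net_device *model_l3mdev_master(struct net_device *dev)
{
	if (!dev)
		return NULL;
	return dev->is_l3_master ? dev : dev->master;
}

/* Before the patch: the dst device is derived from the ingress device. */
static struct net_device *pick_dev_old(const struct net *net,
				       struct net_device *ingress)
{
	struct net_device *dev = model_l3mdev_master(ingress);

	return dev ? dev : net->loopback_dev;
}

/* After the patch: the dst device is derived from the FIB lookup result. */
static struct net_device *pick_dev_new(const struct net *net,
				       const struct fib_result *res)
{
	struct fib_nh_common *nhc = res->fi ? res->nhc : NULL;
	struct net_device *dev = NULL;

	assert(net->loopback_dev);	/* the model assumes a loopback always exists */

	if (nhc)
		dev = model_l3mdev_master(nhc->nhc_dev);

	return dev ? dev : net->loopback_dev;
}
```

In the reported scenario the ingress device (veth-outside) is enslaved to vrf-100 while the FIB lookup returns the loopback device as nexthop. In this model pick_dev_old() returns the VRF device, whereas pick_dev_new() returns the loopback device, which matches where the dst is cached. An entry without nexthop info (fi == NULL, such as unreachable or prohibit) also falls back to the loopback device.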