Received: by 10.223.185.116 with SMTP id b49csp1144024wrg; Fri, 23 Feb 2018 12:43:43 -0800 (PST) X-Google-Smtp-Source: AH8x226h9HBAOecik9JFiUeVivkHP43n9bl5xu+01JRfI5DLi3JAi5uonrcj6eKVOVb63q5l9QM2 X-Received: by 2002:a17:902:7148:: with SMTP id u8-v6mr2745104plm.91.1519418623763; Fri, 23 Feb 2018 12:43:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1519418623; cv=none; d=google.com; s=arc-20160816; b=MwGtB9JNLvyetF96j5EKJIyqpS7/Dws953hfXc3yNpzDkkNExrSWHOIOHGt532DW8a MSRue+fB6mH7FuE6PS1lytsIAPTHEc3sKmUYcMXJJx0ivYReiuo0MIqOQlZNyBQ7nx72 Yef04hvHidCVYTqe2PtH0tmTQ/MG/iyYTohkFcRwMggR4zbZU2fm4cKge0KIsJ9pZ2VU 5H3voPHdrQpeQF3WAf2vja6ujF/JFhlxlRgB3P6R9A63E9/XG+EvVs9E7ZAg73rZGedl vYxr7Q41ilwUBkiHlmaskpV4ssvJZrCF9jQLCl4LlFzlyTHDd3IfS8IHdmIHn1yMu5lC 58Sw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :in-reply-to:message-id:date:subject:cc:to:from :arc-authentication-results; bh=jqws0IwAvc4q3GxGow4bxAeblXFRc49xuwUymZ3E7tA=; b=Gif7fYuMumhSES6+g+eZ6pfjuJ/g1qnoGjmRcyxqOgTERp2xihU024R7D7AZ9ng1Wh 688fB9F3VchKjJDd/ryEvNEGqHLApeDyfNKGBefX45uSlnxrUKcJnJEMzZaxLkj5Ycrb fqp684ML7DKIKCedFkeYqJwkJZPoS13C4rdbJevmDMNG5bHgBtw1HvKdlLUOD+tC5jYv HNX99V60eSm+PKG5ymkXGMt5J265Hl87qcFa+Djq7B5op7XekSUtdmdiuUJ+v0r+SKXA KYtFqmai34N/xBWymjLOiKdYoiQtHjaE4a6KBGL48XmJzodc4yDmMgg2DRmsGbeGCbSV d6yw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b3si1939376pgc.496.2018.02.23.12.43.29; Fri, 23 Feb 2018 12:43:43 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753183AbeBWSdU (ORCPT + 99 others); Fri, 23 Feb 2018 13:33:20 -0500 Received: from mail.linuxfoundation.org ([140.211.169.12]:36482 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753102AbeBWSdR (ORCPT ); Fri, 23 Feb 2018 13:33:17 -0500 Received: from localhost (LFbn-1-12258-90.w90-92.abo.wanadoo.fr [90.92.71.90]) by mail.linuxfoundation.org (Postfix) with ESMTPSA id 898D610EA; Fri, 23 Feb 2018 18:33:16 +0000 (UTC) From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com, Florian Westphal , Paolo Abeni , Pablo Neira Ayuso Subject: [PATCH 4.4 018/193] netfilter: on sockopt() acquire sock lock only in the required scope Date: Fri, 23 Feb 2018 19:24:11 +0100 Message-Id: <20180223170328.917840129@linuxfoundation.org> X-Mailer: git-send-email 2.16.2 In-Reply-To: <20180223170325.997716448@linuxfoundation.org> References: <20180223170325.997716448@linuxfoundation.org> User-Agent: quilt/0.65 X-stable: review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 4.4-stable review patch. If anyone has any objections, please let me know. ------------------ From: Paolo Abeni commit 3f34cfae1238848fd53f25e5c8fd59da57901f4b upstream. Syzbot reported several deadlocks in the netfilter area caused by rtnl lock and socket lock being acquired with a different order on different code paths, leading to backtraces like the following one: ====================================================== WARNING: possible circular locking dependency detected 4.15.0-rc9+ #212 Not tainted ------------------------------------------------------ syzkaller041579/3682 is trying to acquire lock: (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock include/net/sock.h:1463 [inline] (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167 but task is already holding lock: (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (rtnl_mutex){+.+.}: __mutex_lock_common kernel/locking/mutex.c:756 [inline] __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74 register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607 tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106 xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845 check_target net/ipv6/netfilter/ip6_tables.c:538 [inline] find_check_entry.isra.7+0x935/0xcf0 net/ipv6/netfilter/ip6_tables.c:580 translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749 do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline] do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691 nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115 ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928 udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978 SYSC_setsockopt net/socket.c:1849 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1828 entry_SYSCALL_64_fastpath+0x29/0xa0 -> #0 (sk_lock-AF_INET6){+.+.}: lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914 lock_sock_nested+0xc2/0x110 net/core/sock.c:2780 lock_sock include/net/sock.h:1463 [inline] do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167 ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922 udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978 SYSC_setsockopt net/socket.c:1849 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1828 entry_SYSCALL_64_fastpath+0x29/0xa0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(sk_lock-AF_INET6); lock(rtnl_mutex); lock(sk_lock-AF_INET6); *** DEADLOCK *** 1 lock held by syzkaller041579/3682: #0: (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74 The problem, as Florian noted, is that nf_setsockopt() is always called with the socket held, even if the lock itself is required only for very tight scopes and only for some operation. This patch addresses the issues moving the lock_sock() call only where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt() does not need anymore to acquire both locks. Fixes: 22265a5c3c10 ("netfilter: xt_TEE: resolve oif using netdevice notifiers") Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com Suggested-by: Florian Westphal Signed-off-by: Paolo Abeni Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman --- net/ipv4/ip_sockglue.c | 14 ++++---------- net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 6 +++++- net/ipv6/ipv6_sockglue.c | 17 +++++------------ net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 18 ++++++++++++------ 4 files changed, 26 insertions(+), 29 deletions(-) --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -1221,11 +1221,8 @@ int ip_setsockopt(struct sock *sk, int l if (err == -ENOPROTOOPT && optname != IP_HDRINCL && optname != IP_IPSEC_POLICY && optname != IP_XFRM_POLICY && - !ip_mroute_opt(optname)) { - lock_sock(sk); + !ip_mroute_opt(optname)) err = nf_setsockopt(sk, PF_INET, optname, optval, optlen); - release_sock(sk); - } #endif return err; } @@ -1250,12 +1247,9 @@ int compat_ip_setsockopt(struct sock *sk if (err == -ENOPROTOOPT && optname != IP_HDRINCL && optname != IP_IPSEC_POLICY && optname != IP_XFRM_POLICY && - !ip_mroute_opt(optname)) { - lock_sock(sk); - err = compat_nf_setsockopt(sk, PF_INET, optname, - optval, optlen); - release_sock(sk); - } + !ip_mroute_opt(optname)) + err = compat_nf_setsockopt(sk, PF_INET, optname, optval, + optlen); #endif return err; } --- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c +++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c @@ -259,15 +259,19 @@ getorigdst(struct sock *sk, int optval, struct nf_conntrack_tuple tuple; memset(&tuple, 0, sizeof(tuple)); + + lock_sock(sk); tuple.src.u3.ip = inet->inet_rcv_saddr; tuple.src.u.tcp.port = inet->inet_sport; tuple.dst.u3.ip = inet->inet_daddr; tuple.dst.u.tcp.port = inet->inet_dport; tuple.src.l3num = PF_INET; tuple.dst.protonum = sk->sk_protocol; + release_sock(sk); /* We only do TCP and SCTP at the moment: is there a better way? */ - if (sk->sk_protocol != IPPROTO_TCP && sk->sk_protocol != IPPROTO_SCTP) { + if (tuple.dst.protonum != IPPROTO_TCP && + tuple.dst.protonum != IPPROTO_SCTP) { pr_debug("SO_ORIGINAL_DST: Not a TCP/SCTP socket\n"); return -ENOPROTOOPT; } --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -905,12 +905,8 @@ int ipv6_setsockopt(struct sock *sk, int #ifdef CONFIG_NETFILTER /* we need to exclude all possible ENOPROTOOPTs except default case */ if (err == -ENOPROTOOPT && optname != IPV6_IPSEC_POLICY && - optname != IPV6_XFRM_POLICY) { - lock_sock(sk); - err = nf_setsockopt(sk, PF_INET6, optname, optval, - optlen); - release_sock(sk); - } + optname != IPV6_XFRM_POLICY) + err = nf_setsockopt(sk, PF_INET6, optname, optval, optlen); #endif return err; } @@ -940,12 +936,9 @@ int compat_ipv6_setsockopt(struct sock * #ifdef CONFIG_NETFILTER /* we need to exclude all possible ENOPROTOOPTs except default case */ if (err == -ENOPROTOOPT && optname != IPV6_IPSEC_POLICY && - optname != IPV6_XFRM_POLICY) { - lock_sock(sk); - err = compat_nf_setsockopt(sk, PF_INET6, optname, - optval, optlen); - release_sock(sk); - } + optname != IPV6_XFRM_POLICY) + err = compat_nf_setsockopt(sk, PF_INET6, optname, optval, + optlen); #endif return err; } --- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c +++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c @@ -226,20 +226,27 @@ static struct nf_hook_ops ipv6_conntrack static int ipv6_getorigdst(struct sock *sk, int optval, void __user *user, int *len) { - const struct inet_sock *inet = inet_sk(sk); + struct nf_conntrack_tuple tuple = { .src.l3num = NFPROTO_IPV6 }; const struct ipv6_pinfo *inet6 = inet6_sk(sk); + const struct inet_sock *inet = inet_sk(sk); const struct nf_conntrack_tuple_hash *h; struct sockaddr_in6 sin6; - struct nf_conntrack_tuple tuple = { .src.l3num = NFPROTO_IPV6 }; struct nf_conn *ct; + __be32 flow_label; + int bound_dev_if; + lock_sock(sk); tuple.src.u3.in6 = sk->sk_v6_rcv_saddr; tuple.src.u.tcp.port = inet->inet_sport; tuple.dst.u3.in6 = sk->sk_v6_daddr; tuple.dst.u.tcp.port = inet->inet_dport; tuple.dst.protonum = sk->sk_protocol; + bound_dev_if = sk->sk_bound_dev_if; + flow_label = inet6->flow_label; + release_sock(sk); - if (sk->sk_protocol != IPPROTO_TCP && sk->sk_protocol != IPPROTO_SCTP) + if (tuple.dst.protonum != IPPROTO_TCP && + tuple.dst.protonum != IPPROTO_SCTP) return -ENOPROTOOPT; if (*len < 0 || (unsigned int) *len < sizeof(sin6)) @@ -257,14 +264,13 @@ ipv6_getorigdst(struct sock *sk, int opt sin6.sin6_family = AF_INET6; sin6.sin6_port = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u.tcp.port; - sin6.sin6_flowinfo = inet6->flow_label & IPV6_FLOWINFO_MASK; + sin6.sin6_flowinfo = flow_label & IPV6_FLOWINFO_MASK; memcpy(&sin6.sin6_addr, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.in6, sizeof(sin6.sin6_addr)); nf_ct_put(ct); - sin6.sin6_scope_id = ipv6_iface_scope_id(&sin6.sin6_addr, - sk->sk_bound_dev_if); + sin6.sin6_scope_id = ipv6_iface_scope_id(&sin6.sin6_addr, bound_dev_if); return copy_to_user(user, &sin6, sizeof(sin6)) ? -EFAULT : 0; }