Received: by 10.223.148.5 with SMTP id 5csp6788930wrq; Wed, 17 Jan 2018 19:08:55 -0800 (PST) X-Google-Smtp-Source: ACJfBoucEeIEdlAHrMfF3/LnkKEfTuNvJ/IhHxCqhcXAZawSpbyiIjYmsseBOyjoaIvwIzP4NA3c X-Received: by 10.101.98.207 with SMTP id m15mr34700408pgv.67.1516244935452; Wed, 17 Jan 2018 19:08:55 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516244935; cv=none; d=google.com; s=arc-20160816; b=t0gkuZ5uaYl+bzYy0Ipso5wl5rMD6QJ74/Smz3rjigGlvAjsjYGKMyRqFcfS3EoHgS s80HD7Gys6ZPC1PZmXymy9MgS+TaFCvsziJc25ITXxJfkLsGTLuF6NC+90n1kSJvb0IR qc+HgO8grebaQCuwgAuPox6pEqLrtd0JVkIO/Up1smWZxLEkfVe0gYP3WFftDZvkBb7y DKBTbShE0Ip5hNV6sd3GetufeZS1hLSIiYc73ehzHnbdiQcRNlstLngC+9+1sjDcnqFY /MSz1FjYedlx1VPKPmlrhW/APD4YsKBAEgHqFv/emxN7wAKFku6/xeENB+6Bfkh+Sxeg TDiQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=0gTva+UCIyOXwllI5fWubZnhAvXVB+H6kthLfTLpZeM=; b=0Bp0ctMS2g2WyLoTLfEX5YqAbwSna3OCjWEbtHp+xO2lKEcnZzJpBjB4Ak6trJ4CF2 xNxoIYqgkEyxgpsWFQuJDnGgV/sb0i8yQDIaLA9IT/U9pPHTYvA7QHuc0kig1UzNQJFF jQyiJpuciC6lMiwMRwB8sasAXSxHxuLXtafu/3sZeifS9Q4xAKYhHAaC0dWL0LGUe/d8 nRiaicrHC2SD8IxlJ5k36FxIOtc8Fd89WoNainZDYZC17Gcb/ww4WsuOTptH0mj+bEFX 0sMw+OVFwu7Vyjq6iQO9Slsmo5WTDGMWaXK47jadTlxjXEVgNZtqLN7fQTQVnxxYXGQP xwkw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p13si5352195plo.628.2018.01.17.19.08.41; Wed, 17 Jan 2018 19:08:55 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753928AbeARDGu (ORCPT + 99 others); Wed, 17 Jan 2018 22:06:50 -0500 Received: from zeniv.linux.org.uk ([195.92.253.2]:59936 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753107AbeARDGs (ORCPT ); Wed, 17 Jan 2018 22:06:48 -0500 Received: from viro by ZenIV.linux.org.uk with local (Exim 4.87 #1 (Red Hat Linux)) id 1ec0XC-0002oF-6x; Thu, 18 Jan 2018 03:06:34 +0000 Date: Thu, 18 Jan 2018 03:06:34 +0000 From: Al Viro To: netdev@vger.kernel.org Cc: Linus Torvalds , Dan Williams , Linux Kernel Mailing List , linux-arch@vger.kernel.org, Andi Kleen , Kees Cook , kernel-hardening@lists.openwall.com, Greg Kroah-Hartman , the arch/x86 maintainers , Ingo Molnar , "H. Peter Anvin" , Thomas Gleixner , Andrew Morton , Alan Cox , David Miller Subject: [RFC][PATCH] get rid of the use of set_fs() (by way of kernel_recvmsg()) in sunrpc Message-ID: <20180118030634.GY13338@ZenIV.linux.org.uk> References: <151586744180.5820.13215059696964205856.stgit@dwillia2-desk3.amr.corp.intel.com> <151586748981.5820.14559543798744763404.stgit@dwillia2-desk3.amr.corp.intel.com> <1516198646.4184.13.camel@linux.intel.com> <20180117185232.GW13338@ZenIV.linux.org.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180117185232.GW13338@ZenIV.linux.org.uk> User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 17, 2018 at 06:52:32PM +0000, Al Viro wrote: [use of set_fs() by sunrpc] > We are asking for recvmsg() with zero data length; what we really want is > ->msg_control. And _that_ is why we need that set_fs() - we want the damn > thing to go into local variable. > > But note that filling ->msg_control will happen in put_cmsg(), called > from ip_cmsg_recv_pktinfo(), called from ip_cmsg_recv_offset(), > called from udp_recvmsg(), called from sock_recvmsg_nosec(), called > from sock_recvmsg(). Or in another path in case of IPv6. > Sure, we can arrange for propagation of that all way down those > call chains. My preference would be to try and mark that (one and > only) case in ->msg_flags, so that put_cmsg() would be able to > check. ___sys_recvmsg() sets that as > msg_sys->msg_flags = flags & (MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT); > so we ought to be free to use any bit other than those two. Since > put_cmsg() already checks ->msg_flags, that shouldn't put too much > overhead. Folks, does anybody have objections against the following: Get rid of kernel_recvmsg() (and thus set_fs()) use in sunrpc In net/sunrpc/svcsock.c:svc_udp_recvfrom() we want to get IP_PKTINFO; currently we stash the address of local variable into ->msg_control (which normall contains a userland pointer in recepients) and issue zero-length ->recvmsg() under set_fs(KERNEL_DS). Similar to the way put_cmsg() handles 32bit case on biarch targets, introduce a flag telling put_cmsg() to treat ->msg_control as kernel pointer, using memcpy instead of copy_to_user(). That allows to avoid the use of kernel_recvmsg() with its set_fs(). All places that might have non-NULL ->msg_control either pass the msghdr only to ->sendmsg() and its ilk, or have ->msg_flags sanitized before passing msghdr anywhere. IOW, there no way the new flag to reach put_cmsg() in the mainline kernel, and after this change it will only be seen in sunrpc case. Incidentally, all other kernel_recvmsg() users are very easy to convert to sock_recvmsg(), so that would allow to kill kernel_recvmsg() off... Signed-off-by: Al Viro --- diff --git a/include/linux/socket.h b/include/linux/socket.h index 9286a5a8c60c..60947da9c806 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -298,6 +298,7 @@ struct ucred { #else #define MSG_CMSG_COMPAT 0 /* We never have 32 bit fixups */ #endif +#define MSG_CMSG_KERNEL 0x10000000 /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */ diff --git a/net/core/scm.c b/net/core/scm.c index b1ff8a441748..1b73b682e441 100644 --- a/net/core/scm.c +++ b/net/core/scm.c @@ -237,10 +237,17 @@ int put_cmsg(struct msghdr * msg, int level, int type, int len, void *data) cmhdr.cmsg_len = cmlen; err = -EFAULT; - if (copy_to_user(cm, &cmhdr, sizeof cmhdr)) - goto out; - if (copy_to_user(CMSG_DATA(cm), data, cmlen - sizeof(struct cmsghdr))) - goto out; + if (unlikely(MSG_CMSG_KERNEL & msg->msg_flags)) { + struct cmsghdr *p = msg->msg_control; + memcpy(p, &cmhdr, sizeof cmhdr); + memcpy(CMSG_DATA(p), data, cmlen - sizeof(struct cmsghdr)); + } else { + if (copy_to_user(cm, &cmhdr, sizeof cmhdr)) + goto out; + if (copy_to_user(CMSG_DATA(cm), data, + cmlen - sizeof(struct cmsghdr))) + goto out; + } cmlen = CMSG_SPACE(len); if (msg->msg_controllen < cmlen) cmlen = msg->msg_controllen; diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index 5570719e4787..774904f67c93 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -545,7 +545,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp) .msg_name = svc_addr(rqstp), .msg_control = cmh, .msg_controllen = sizeof(buffer), - .msg_flags = MSG_DONTWAIT, + .msg_flags = MSG_DONTWAIT | MSG_CMSG_KERNEL, }; size_t len; int err; @@ -565,8 +565,7 @@ static int svc_udp_recvfrom(struct svc_rqst *rqstp) clear_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); skb = NULL; - err = kernel_recvmsg(svsk->sk_sock, &msg, NULL, - 0, 0, MSG_PEEK | MSG_DONTWAIT); + err = sock_recvmsg(svsk->sk_sock, &msg, MSG_PEEK | MSG_DONTWAIT); if (err >= 0) skb = skb_recv_udp(svsk->sk_sk, 0, 1, &err);