From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Eric Dumazet, John Fastabend, Yonghong Song, Daniel Borkmann, Sasha Levin
Subject: [PATCH 4.18 37/53] bpf: sockmap, fix transition through disconnect without close
Date: Thu, 18 Oct 2018 19:54:30 +0200
Message-Id: <20181018175423.021198321@linuxfoundation.org>
In-Reply-To: <20181018175416.561567978@linuxfoundation.org>
References: <20181018175416.561567978@linuxfoundation.org>

4.18-stable review patch.  If anyone has any objections, please let me know.

------------------

From: John Fastabend

[ Upstream commit b05545e15e1ff1d6a6a8593971275f9cc3e6b92b ]

It is possible (via shutdown()) for TCP socks to go through the TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close(), which
would then call our bpf_tcp_close() callback. Because of this a user
could disconnect a socket and then put it in a LISTEN state, which would
break our assumption that sockets are always in the ESTABLISHED state.

To resolve this, rely on the unhash hook, which is called in the
disconnect case, to remove the sock from the sockmap.

Reported-by: Eric Dumazet
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend
Acked-by: Yonghong Song
Signed-off-by: Daniel Borkmann
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 kernel/bpf/sockmap.c |   60 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 19 deletions(-)

--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -132,6 +132,7 @@ struct smap_psock {
 	struct work_struct gc_work;
 
 	struct proto *sk_proto;
+	void (*save_unhash)(struct sock *sk);
 	void (*save_close)(struct sock *sk, long timeout);
 	void (*save_data_ready)(struct sock *sk);
 	void (*save_write_space)(struct sock *sk);
@@ -143,6 +144,7 @@ static int bpf_tcp_recvmsg(struct sock *
 static int bpf_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 static int bpf_tcp_sendpage(struct sock *sk, struct page *page,
 			    int offset, size_t size, int flags);
+static void bpf_tcp_unhash(struct sock *sk);
 static void bpf_tcp_close(struct sock *sk, long timeout);
 
 static inline struct smap_psock *smap_psock_sk(const struct sock *sk)
@@ -184,6 +186,7 @@ static void build_protos(struct proto pr
 			 struct proto *base)
 {
 	prot[SOCKMAP_BASE] = *base;
+	prot[SOCKMAP_BASE].unhash = bpf_tcp_unhash;
 	prot[SOCKMAP_BASE].close = bpf_tcp_close;
 	prot[SOCKMAP_BASE].recvmsg = bpf_tcp_recvmsg;
 	prot[SOCKMAP_BASE].stream_memory_read = bpf_tcp_stream_read;
@@ -217,6 +220,7 @@ static int bpf_tcp_init(struct sock *sk)
 		return -EBUSY;
 	}
 
+	psock->save_unhash = sk->sk_prot->unhash;
 	psock->save_close = sk->sk_prot->close;
 	psock->sk_proto = sk->sk_prot;
 
@@ -305,30 +309,12 @@ static struct smap_psock_map_entry *psoc
 	return e;
 }
 
-static void bpf_tcp_close(struct sock *sk, long timeout)
+static void bpf_tcp_remove(struct sock *sk, struct smap_psock *psock)
 {
-	void (*close_fun)(struct sock *sk, long timeout);
 	struct smap_psock_map_entry *e;
 	struct sk_msg_buff *md, *mtmp;
-	struct smap_psock *psock;
 	struct sock *osk;
 
-	lock_sock(sk);
-	rcu_read_lock();
-	psock = smap_psock_sk(sk);
-	if (unlikely(!psock)) {
-		rcu_read_unlock();
-		release_sock(sk);
-		return sk->sk_prot->close(sk, timeout);
-	}
-
-	/* The psock may be destroyed anytime after exiting the RCU critial
-	 * section so by the time we use close_fun the psock may no longer
-	 * be valid. However, bpf_tcp_close is called with the sock lock
-	 * held so the close hook and sk are still valid.
-	 */
-	close_fun = psock->save_close;
-
 	if (psock->cork) {
 		free_start_sg(psock->sock, psock->cork, true);
 		kfree(psock->cork);
@@ -379,6 +365,42 @@ static void bpf_tcp_close(struct sock *s
 		kfree(e);
 		e = psock_map_pop(sk, psock);
 	}
+}
+
+static void bpf_tcp_unhash(struct sock *sk)
+{
+	void (*unhash_fun)(struct sock *sk);
+	struct smap_psock *psock;
+
+	rcu_read_lock();
+	psock = smap_psock_sk(sk);
+	if (unlikely(!psock)) {
+		rcu_read_unlock();
+		if (sk->sk_prot->unhash)
+			sk->sk_prot->unhash(sk);
+		return;
+	}
+	unhash_fun = psock->save_unhash;
+	bpf_tcp_remove(sk, psock);
+	rcu_read_unlock();
+	unhash_fun(sk);
+}
+
+static void bpf_tcp_close(struct sock *sk, long timeout)
+{
+	void (*close_fun)(struct sock *sk, long timeout);
+	struct smap_psock *psock;
+
+	lock_sock(sk);
+	rcu_read_lock();
+	psock = smap_psock_sk(sk);
+	if (unlikely(!psock)) {
+		rcu_read_unlock();
+		release_sock(sk);
+		return sk->sk_prot->close(sk, timeout);
+	}
+	close_fun = psock->save_close;
+	bpf_tcp_remove(sk, psock);
 	rcu_read_unlock();
 	release_sock(sk);
 	close_fun(sk, timeout);
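
Illustration only, not part of the patch: the hook layering described in
the changelog can be modelled in user space roughly as below. This is a
minimal sketch under loud assumptions. Every model_* and base_*/hooked_*
name is hypothetical and none of them exist in the kernel; locking, RCU,
the untracked-socket (!psock) path and the real teardown work done by
bpf_tcp_remove() are all left out. It only shows that once both unhash
and close are wrapped and share one removal helper, the socket leaves
the map whether or not close is ever called.

```c
/* User-space model of the unhash/close wrapping; all names hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct model_sock;

/* Stand-in for the two struct proto callbacks that matter here. */
struct model_proto {
	void (*unhash)(struct model_sock *sk);
	void (*close)(struct model_sock *sk);
};

/* Stand-in for smap_psock: remembers the callbacks it displaced. */
struct model_psock {
	void (*save_unhash)(struct model_sock *sk);
	void (*save_close)(struct model_sock *sk);
};

struct model_sock {
	const struct model_proto *prot;
	struct model_psock *psock;
	bool in_map;
	int base_unhash_calls;
	int base_close_calls;
};

/* "Original" protocol callbacks; they only count invocations. */
static void base_unhash(struct model_sock *sk) { sk->base_unhash_calls++; }
static void base_close(struct model_sock *sk) { sk->base_close_calls++; }

static const struct model_proto base_proto = {
	.unhash = base_unhash,
	.close = base_close,
};

/* Shared teardown, playing the role of bpf_tcp_remove(). */
static void model_remove(struct model_sock *sk)
{
	sk->in_map = false;
}

/* Wrapper in the role of bpf_tcp_unhash(): remove, then chain. */
static void hooked_unhash(struct model_sock *sk)
{
	void (*unhash_fun)(struct model_sock *sk) = sk->psock->save_unhash;

	model_remove(sk);
	unhash_fun(sk);
}

/* Wrapper in the role of bpf_tcp_close(): remove, then chain. */
static void hooked_close(struct model_sock *sk)
{
	void (*close_fun)(struct model_sock *sk) = sk->psock->save_close;

	model_remove(sk);
	close_fun(sk);
}

static const struct model_proto hooked_proto = {
	.unhash = hooked_unhash,
	.close = hooked_close,
};

/* Save the current callbacks and install the wrappers. */
static void model_attach(struct model_sock *sk, struct model_psock *psock)
{
	psock->save_unhash = sk->prot->unhash;
	psock->save_close = sk->prot->close;
	sk->psock = psock;
	sk->prot = &hooked_proto;
	sk->in_map = true;
}

/* Disconnect-like path: only unhash runs, close is never called. */
bool removed_after_disconnect(void)
{
	struct model_psock psock = { 0 };
	struct model_sock sk = { .prot = &base_proto };

	model_attach(&sk, &psock);
	sk.prot->unhash(&sk);
	assert(sk.base_unhash_calls == 1 && sk.base_close_calls == 0);
	return !sk.in_map;
}

/* Close path: the wrapper still removes the socket before chaining. */
bool removed_after_close(void)
{
	struct model_psock psock = { 0 };
	struct model_sock sk = { .prot = &base_proto };

	model_attach(&sk, &psock);
	sk.prot->close(&sk);
	assert(sk.base_close_calls == 1);
	return !sk.in_map;
}
```

In this model both removed_after_disconnect() and removed_after_close()
are expected to return true when called from a small test main(). If
hooked_unhash were dropped from hooked_proto, the disconnect path would
leave in_map set, which corresponds to the case the changelog describes.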