Received: by 2002:a05:6a10:6d10:0:0:0:0 with SMTP id gq16csp721131pxb; Thu, 21 Apr 2022 08:57:16 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxQSw8GEVo20owc1/HdcXCpr5ePKcTUFSNto+4fSlRLehUJ1udbeSOo4JFCKCKRVrsDf4tV X-Received: by 2002:a17:902:bcc5:b0:158:d637:8330 with SMTP id o5-20020a170902bcc500b00158d6378330mr200017pls.134.1650556636678; Thu, 21 Apr 2022 08:57:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1650556636; cv=none; d=google.com; s=arc-20160816; b=Y+c/S1tlJnPU414D8Z5m7BTPn5+Ktdltz8B5bD8WMwmHtPX1dfDoAjgAL1qqfC7xBF NDav8qVwRhNxBXEYtyKZnCcb6H5NF4yq18SNM37C3EBm7eIQuh5YBEfpV+STk2beM6q5 28STym4U7taLx5R5CQkoaRZkn50esebAdNSRDOxyOh1Hdi6IYyK8ak5vC8ZGzVPzkCiX Lda06uqaGq8tFlJ+U/JADoXiLc89dNBRqlDG+Act0uwVNctCBGrwsjZE+L66TOF0YRSp e8Qelm4wUGKa2IAPW+AS23wCscjz1luQ/LvL06obZj0zfsrDjz7GE/avS/dILRQByJLA Qfbg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:cc:to:from :subject; bh=cNjoxCSCnIcfYlvKMTrG5SG5OzWg/wu9POVKzaWKdRQ=; b=XsHi+ROzxgXP1MXciBeP/vJGhAJJ066d4kYLJKV+uzJ0ga4MZdHSyhLJ/ys0zOAI5v 40d6LrUtIEMK0pcx8MF0+KFTmzKCI9mzaeNeM56vmJbKuNztfuXrrjsmsOuA82dQe8xN zWnsy+JClt0/AP3C0JUCDWy8deHO5HBu7TBoKRY9/UXhTH7g83An2yv5lNgPyZ1/37FR Mid560XQ5oqrn8IwUDXWkd/Q6ItRsZkFdZ5BCpE1MsJ/M35uqV8/MKVgS28afLCEygKh zZ0m+VYcnVHUmQZfr+BPlj6eTX2HHCrdnsT2YcVk/6H+WeMu27f3A9h3Bvxkx+1wnCki /+2g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j13-20020a17090a840d00b001d2865c095fsi5605743pjn.61.2022.04.21.08.56.46; Thu, 21 Apr 2022 08:57:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-nfs-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237931AbiDRQwN (ORCPT + 99 others); Mon, 18 Apr 2022 12:52:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346425AbiDRQwM (ORCPT ); Mon, 18 Apr 2022 12:52:12 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C34E32EC7; Mon, 18 Apr 2022 09:49:32 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D5D07612DC; Mon, 18 Apr 2022 16:49:31 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B5625C385A7; Mon, 18 Apr 2022 16:49:30 +0000 (UTC) Subject: [PATCH RFC 1/5] net: Add distinct sk_psock field From: Chuck Lever To: netdev@vger.kernel.org, linux-nfs@vger.kernel.org, linux-nvme@lists.infradead.org, linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: ak@tempesta-tech.com, borisp@nvidia.com, simo@redhat.com Date: Mon, 18 Apr 2022 12:49:29 -0400 Message-ID: <165030056960.5073.6664402939918720250.stgit@oracle-102.nfsv4.dev> In-Reply-To: <165030051838.5073.8699008789153780301.stgit@oracle-102.nfsv4.dev> References: <165030051838.5073.8699008789153780301.stgit@oracle-102.nfsv4.dev> User-Agent: StGit/1.5 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-6.7 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org The sk_psock facility populates the sk_user_data field with the address of an extra bit of metadata. User space sockets never populate the sk_user_data field, so this has worked out fine. However, kernel consumers such as the RPC client and server do populate the sk_user_data field. The sk_psock() function cannot tell that the content of sk_user_data does not point to psock metadata, so it will happily return a pointer to something else, cast to a struct sk_psock. Thus kernel consumers and psock currently cannot co-exist. We could educate sk_psock() to return NULL if sk_user_data does not point to a struct sk_psock. However, a more general solution that enables full co-existence psock and other uses of sk_user_data might be more interesting. Move the struct sk_psock address to its own pointer field so that the contents of the sk_user_data field is preserved. Signed-off-by: Chuck Lever --- include/linux/skmsg.h | 2 +- include/net/sock.h | 4 +++- net/core/skmsg.c | 6 +++--- 3 files changed, 7 insertions(+), 5 deletions(-) diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h index c5a2d6f50f25..5ef3a07c5b6c 100644 --- a/include/linux/skmsg.h +++ b/include/linux/skmsg.h @@ -277,7 +277,7 @@ static inline void sk_msg_sg_copy_clear(struct sk_msg *msg, u32 start) static inline struct sk_psock *sk_psock(const struct sock *sk) { - return rcu_dereference_sk_user_data(sk); + return rcu_dereference(sk->sk_psock); } static inline void sk_psock_set_state(struct sk_psock *psock, diff --git a/include/net/sock.h b/include/net/sock.h index c4b91fc19b9c..d2a513169527 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -327,7 +327,8 @@ struct sk_filter; * @sk_tskey: counter to disambiguate concurrent tstamp requests * @sk_zckey: counter to order MSG_ZEROCOPY notifications * @sk_socket: Identd and reporting IO signals - * @sk_user_data: RPC layer private data + * @sk_user_data: Upper layer private data + * @sk_psock: socket policy data (bpf) * @sk_frag: cached page frag * @sk_peek_off: current peek_offset value * @sk_send_head: front of stuff to transmit @@ -519,6 +520,7 @@ struct sock { struct socket *sk_socket; void *sk_user_data; + struct sk_psock __rcu *sk_psock; #ifdef CONFIG_SECURITY void *sk_security; #endif diff --git a/net/core/skmsg.c b/net/core/skmsg.c index cc381165ea08..2b3d01d92790 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -695,7 +695,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node) write_lock_bh(&sk->sk_callback_lock); - if (sk->sk_user_data) { + if (sk->sk_psock) { psock = ERR_PTR(-EBUSY); goto out; } @@ -726,7 +726,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node) sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED); refcount_set(&psock->refcnt, 1); - rcu_assign_sk_user_data_nocopy(sk, psock); + rcu_assign_pointer(sk->sk_psock, psock); sock_hold(sk); out: @@ -825,7 +825,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock) { write_lock_bh(&sk->sk_callback_lock); sk_psock_restore_proto(sk, psock); - rcu_assign_sk_user_data(sk, NULL); + rcu_assign_pointer(sk->sk_psock, NULL); if (psock->progs.stream_parser) sk_psock_stop_strp(sk, psock); else if (psock->progs.stream_verdict || psock->progs.skb_verdict)