2022-07-13 14:43:12

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v1] net: Add distinct sk_psock field

The sk_psock facility populates the sk_user_data field with the
address of an extra bit of metadata. User space sockets never
populate the sk_user_data field, so this has worked out fine.

However, kernel socket consumers such as the RPC client and server
do populate the sk_user_data field. The sk_psock() function cannot
tell that the content of sk_user_data does not point to psock
metadata, so it will happily return a pointer to something else,
cast to a struct sk_psock.

Thus kernel socket consumers and psock currently cannot co-exist.

We could educate sk_psock() to return NULL if sk_user_data does
not point to a struct sk_psock. However, a more general solution
that enables full co-existence psock and other uses of sk_user_data
might be more interesting.

Move the struct sk_psock address to its own pointer field so that
the contents of the sk_user_data field is preserved.

Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
---
include/linux/skmsg.h | 2 +-
include/net/sock.h | 4 +++-
net/core/skmsg.c | 6 +++---
3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index c5a2d6f50f25..5ef3a07c5b6c 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -277,7 +277,7 @@ static inline void sk_msg_sg_copy_clear(struct sk_msg *msg, u32 start)

static inline struct sk_psock *sk_psock(const struct sock *sk)
{
- return rcu_dereference_sk_user_data(sk);
+ return rcu_dereference(sk->sk_psock);
}

static inline void sk_psock_set_state(struct sk_psock *psock,
diff --git a/include/net/sock.h b/include/net/sock.h
index c4b91fc19b9c..d2a513169527 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -327,7 +327,8 @@ struct sk_filter;
* @sk_tskey: counter to disambiguate concurrent tstamp requests
* @sk_zckey: counter to order MSG_ZEROCOPY notifications
* @sk_socket: Identd and reporting IO signals
- * @sk_user_data: RPC layer private data
+ * @sk_user_data: Upper layer private data
+ * @sk_psock: socket policy data (bpf)
* @sk_frag: cached page frag
* @sk_peek_off: current peek_offset value
* @sk_send_head: front of stuff to transmit
@@ -519,6 +520,7 @@ struct sock {

struct socket *sk_socket;
void *sk_user_data;
+ struct sk_psock __rcu *sk_psock;
#ifdef CONFIG_SECURITY
void *sk_security;
#endif
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index cc381165ea08..2b3d01d92790 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -695,7 +695,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)

write_lock_bh(&sk->sk_callback_lock);

- if (sk->sk_user_data) {
+ if (sk->sk_psock) {
psock = ERR_PTR(-EBUSY);
goto out;
}
@@ -726,7 +726,7 @@ struct sk_psock *sk_psock_init(struct sock *sk, int node)
sk_psock_set_state(psock, SK_PSOCK_TX_ENABLED);
refcount_set(&psock->refcnt, 1);

- rcu_assign_sk_user_data_nocopy(sk, psock);
+ rcu_assign_pointer(sk->sk_psock, psock);
sock_hold(sk);

out:
@@ -825,7 +825,7 @@ void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
{
write_lock_bh(&sk->sk_callback_lock);
sk_psock_restore_proto(sk, psock);
- rcu_assign_sk_user_data(sk, NULL);
+ rcu_assign_pointer(sk->sk_psock, NULL);
if (psock->progs.stream_parser)
sk_psock_stop_strp(sk, psock);
else if (psock->progs.stream_verdict || psock->progs.skb_verdict)



2022-07-14 02:34:30

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v1] net: Add distinct sk_psock field

On Wed, 13 Jul 2022 10:26:21 -0400 Chuck Lever wrote:
> The sk_psock facility populates the sk_user_data field with the
> address of an extra bit of metadata. User space sockets never
> populate the sk_user_data field, so this has worked out fine.
>
> However, kernel socket consumers such as the RPC client and server
> do populate the sk_user_data field. The sk_psock() function cannot
> tell that the content of sk_user_data does not point to psock
> metadata, so it will happily return a pointer to something else,
> cast to a struct sk_psock.
>
> Thus kernel socket consumers and psock currently cannot co-exist.
>
> We could educate sk_psock() to return NULL if sk_user_data does
> not point to a struct sk_psock. However, a more general solution
> that enables full co-existence psock and other uses of sk_user_data
> might be more interesting.
>
> Move the struct sk_psock address to its own pointer field so that
> the contents of the sk_user_data field is preserved.
>
> Reviewed-by: Hannes Reinecke <[email protected]>
> Signed-off-by: Chuck Lever <[email protected]>

Thanks for posting separately. We already have the (somewhat
nondescript) SK_USER_DATA_BPF, can we use another bit for psock?
Or add a u8 user_data type and have TCP ULP reject if the type is
anything but psock. I'm not sure why psock is special to deserve
its own pointer.

2022-07-14 05:08:11

by Khalid Masum

[permalink] [raw]
Subject: Re: [PATCH v1] net: Add distinct sk_psock field

On Wed, 13 Jul 2022 10:26:21 -0400 Chuck Lever wrote:
> The sk_psock facility populates the sk_user_data field with the
> address of an extra bit of metadata. User space sockets never
> populate the sk_user_data field, so this has worked out fine.
>
> However, kernel socket consumers such as the RPC client and server
> do populate the sk_user_data field. The sk_psock() function cannot
> tell that the content of sk_user_data does not point to psock
> metadata, so it will happily return a pointer to something else,
> cast to a struct sk_psock.
>
> Thus kernel socket consumers and psock currently cannot co-exist.
>
> We could educate sk_psock() to return NULL if sk_user_data does
> not point to a struct sk_psock. However, a more general solution
> that enables full co-existence psock and other uses of sk_user_data
> might be more interesting.
>
> Move the struct sk_psock address to its own pointer field so that
> the contents of the sk_user_data field is preserved.
>
> Reviewed-by: Hannes Reinecke <[email protected]>
> Signed-off-by: Chuck Lever <[email protected]>

The patch seems to fix the syzbot bug:
[syzbot] KASAN: slab-out-of-bounds Read in sk_psock_get
Reported-by: [email protected]
As the reproducer no longer triggers the warning.

Tested-by: Khalid Masum <[email protected]>

2022-07-14 14:11:19

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v1] net: Add distinct sk_psock field



> On Jul 13, 2022, at 9:51 PM, Jakub Kicinski <[email protected]> wrote:
>
> On Wed, 13 Jul 2022 10:26:21 -0400 Chuck Lever wrote:
>> The sk_psock facility populates the sk_user_data field with the
>> address of an extra bit of metadata. User space sockets never
>> populate the sk_user_data field, so this has worked out fine.
>>
>> However, kernel socket consumers such as the RPC client and server
>> do populate the sk_user_data field. The sk_psock() function cannot
>> tell that the content of sk_user_data does not point to psock
>> metadata, so it will happily return a pointer to something else,
>> cast to a struct sk_psock.
>>
>> Thus kernel socket consumers and psock currently cannot co-exist.
>>
>> We could educate sk_psock() to return NULL if sk_user_data does
>> not point to a struct sk_psock. However, a more general solution
>> that enables full co-existence psock and other uses of sk_user_data
>> might be more interesting.
>>
>> Move the struct sk_psock address to its own pointer field so that
>> the contents of the sk_user_data field is preserved.
>>
>> Reviewed-by: Hannes Reinecke <[email protected]>
>> Signed-off-by: Chuck Lever <[email protected]>
>
> Thanks for posting separately. We already have the (somewhat
> nondescript) SK_USER_DATA_BPF, can we use another bit for psock?
> Or add a u8 user_data type and have TCP ULP reject if the type is
> anything but psock. I'm not sure why psock is special to deserve
> its own pointer.

Hi Jakub, for an informed answer, you will need to ask the folks
who maintain psock. My guess is that kernel consumers might need
to populate both BPF/psock and sk_user_data concurrently for
separate purposes. If concurrent usage is never necessary, then
you can probably get away with a small enumerator that describes
the content of sk_user_data. But after some code auditing it didn't
look to me like that would be adequate.


--
Chuck Lever