2021-08-16 19:04:50

by Jiang Wang .

[permalink] [raw]
Subject: [PATCH bpf-next v7 0/5] sockmap: add sockmap support for unix stream socket

This patch series add support for unix stream type
for sockmap. Sockmap already supports TCP, UDP,
unix dgram types. The unix stream support is similar
to unix dgram.

Also add selftests for unix stream type in sockmap tests.


Jiang Wang (5):
af_unix: add read_sock for stream socket types
af_unix: add unix_stream_proto for sockmap
selftest/bpf: add tests for sockmap with unix stream type.
selftest/bpf: change udp to inet in some function names
selftest/bpf: add new tests in sockmap for unix stream to tcp.

include/net/af_unix.h | 8 +-
net/core/sock_map.c | 1 +
net/unix/af_unix.c | 95 ++++++++++++++++---
net/unix/unix_bpf.c | 93 +++++++++++++-----
.../selftests/bpf/prog_tests/sockmap_listen.c | 48 ++++++----
5 files changed, 191 insertions(+), 54 deletions(-)

v1 -> v2 :
- Call unhash in shutdown.
- Clean up unix_create1 a bit.
- Return -ENOTCONN if socket is not connected.

v2 -> v3 :
- check for stream type in update_proto
- remove intermediate variable in __unix_stream_recvmsg
- fix compile warning in unix_stream_recvmsg

v3 -> v4 :
- remove sk_is_unix_stream, just check TCP_ESTABLISHED for UNIX sockets.
- add READ_ONCE in unix_dgram_recvmsg
- remove type check in unix_stream_bpf_update_proto

v4 -> v5 :
- add two missing READ_ONCE for sk_prot.

v5 -> v6 :
- fix READ_ONCE by reading to a local variable first.

v6 -> v7 :
- fix the following compiler error when CONFIG_UNIX is m.
modpost: "sock_map_unhash" [net/unix/unix.ko] undefined!

For the series:

Acked-by: John Fastabend <[email protected]>
Acked-by: Jakub Sitnicki <[email protected]>
--
2.20.1


2021-08-16 19:04:56

by Jiang Wang .

[permalink] [raw]
Subject: [PATCH bpf-next v7 3/5] selftest/bpf: add tests for sockmap with unix stream type.

Add two tests for unix stream to unix stream redirection
in sockmap tests.

Signed-off-by: Jiang Wang <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
Acked-by: John Fastabend <[email protected]>
---
tools/testing/selftests/bpf/prog_tests/sockmap_listen.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index a9f1bf9d5dff..7a976d43281a 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -2020,11 +2020,13 @@ void test_sockmap_listen(void)
run_tests(skel, skel->maps.sock_map, AF_INET);
run_tests(skel, skel->maps.sock_map, AF_INET6);
test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM);
+ test_unix_redir(skel, skel->maps.sock_map, SOCK_STREAM);

skel->bss->test_sockmap = false;
run_tests(skel, skel->maps.sock_hash, AF_INET);
run_tests(skel, skel->maps.sock_hash, AF_INET6);
test_unix_redir(skel, skel->maps.sock_hash, SOCK_DGRAM);
+ test_unix_redir(skel, skel->maps.sock_hash, SOCK_STREAM);

test_sockmap_listen__destroy(skel);
}
--
2.20.1

2021-08-16 19:05:45

by Jiang Wang .

[permalink] [raw]
Subject: [PATCH bpf-next v7 5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp.

Add two new test cases in sockmap tests, where unix stream is
redirected to tcp and vice versa.

Signed-off-by: Jiang Wang <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
Acked-by: John Fastabend <[email protected]>
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index 07ed8081f9ae..afa14fb66f08 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1884,7 +1884,7 @@ static void inet_unix_redir_to_connected(int family, int type, int sock_mapfd,
xclose(p0);
}

-static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
+static void inet_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
struct bpf_map *inner_map, int family)
{
int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
@@ -1899,9 +1899,13 @@ static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
skel->bss->test_ingress = false;
inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
REDIR_EGRESS);
+ inet_unix_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+ REDIR_EGRESS);
skel->bss->test_ingress = true;
inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
REDIR_INGRESS);
+ inet_unix_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+ REDIR_INGRESS);

xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
}
@@ -1961,7 +1965,7 @@ static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,

}

-static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
+static void unix_inet_skb_redir_to_connected(struct test_sockmap_listen *skel,
struct bpf_map *inner_map, int family)
{
int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
@@ -1976,9 +1980,13 @@ static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
skel->bss->test_ingress = false;
unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
REDIR_EGRESS);
+ unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+ REDIR_EGRESS);
skel->bss->test_ingress = true;
unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
REDIR_INGRESS);
+ unix_inet_redir_to_connected(family, SOCK_STREAM, sock_map, verdict_map,
+ REDIR_INGRESS);

xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
}
@@ -1994,8 +2002,8 @@ static void test_udp_unix_redir(struct test_sockmap_listen *skel, struct bpf_map
snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__);
if (!test__start_subtest(s))
return;
- udp_unix_skb_redir_to_connected(skel, map, family);
- unix_udp_skb_redir_to_connected(skel, map, family);
+ inet_unix_skb_redir_to_connected(skel, map, family);
+ unix_inet_skb_redir_to_connected(skel, map, family);
}

static void run_tests(struct test_sockmap_listen *skel, struct bpf_map *map,
--
2.20.1

2021-08-16 19:05:57

by Jiang Wang .

[permalink] [raw]
Subject: [PATCH bpf-next v7 4/5] selftest/bpf: change udp to inet in some function names

This is to prepare for adding new unix stream tests.
Mostly renames, also pass the socket types as an argument.

Signed-off-by: Jiang Wang <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
Acked-by: John Fastabend <[email protected]>
---
.../selftests/bpf/prog_tests/sockmap_listen.c | 30 +++++++++++--------
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index 7a976d43281a..07ed8081f9ae 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1692,14 +1692,14 @@ static void test_reuseport(struct test_sockmap_listen *skel,
}
}

-static int udp_socketpair(int family, int *s, int *c)
+static int inet_socketpair(int family, int type, int *s, int *c)
{
struct sockaddr_storage addr;
socklen_t len;
int p0, c0;
int err;

- p0 = socket_loopback(family, SOCK_DGRAM | SOCK_NONBLOCK);
+ p0 = socket_loopback(family, type | SOCK_NONBLOCK);
if (p0 < 0)
return p0;

@@ -1708,7 +1708,7 @@ static int udp_socketpair(int family, int *s, int *c)
if (err)
goto close_peer0;

- c0 = xsocket(family, SOCK_DGRAM | SOCK_NONBLOCK, 0);
+ c0 = xsocket(family, type | SOCK_NONBLOCK, 0);
if (c0 < 0) {
err = c0;
goto close_peer0;
@@ -1747,10 +1747,10 @@ static void udp_redir_to_connected(int family, int sock_mapfd, int verd_mapfd,

zero_verdict_count(verd_mapfd);

- err = udp_socketpair(family, &p0, &c0);
+ err = inet_socketpair(family, SOCK_DGRAM, &p0, &c0);
if (err)
return;
- err = udp_socketpair(family, &p1, &c1);
+ err = inet_socketpair(family, SOCK_DGRAM, &p1, &c1);
if (err)
goto close_cli0;

@@ -1825,7 +1825,7 @@ static void test_udp_redir(struct test_sockmap_listen *skel, struct bpf_map *map
udp_skb_redir_to_connected(skel, map, family);
}

-static void udp_unix_redir_to_connected(int family, int sock_mapfd,
+static void inet_unix_redir_to_connected(int family, int type, int sock_mapfd,
int verd_mapfd, enum redir_mode mode)
{
const char *log_prefix = redir_mode_str(mode);
@@ -1843,7 +1843,7 @@ static void udp_unix_redir_to_connected(int family, int sock_mapfd,
return;
c0 = sfd[0], p0 = sfd[1];

- err = udp_socketpair(family, &p1, &c1);
+ err = inet_socketpair(family, SOCK_DGRAM, &p1, &c1);
if (err)
goto close;

@@ -1897,14 +1897,16 @@ static void udp_unix_skb_redir_to_connected(struct test_sockmap_listen *skel,
return;

skel->bss->test_ingress = false;
- udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS);
+ inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+ REDIR_EGRESS);
skel->bss->test_ingress = true;
- udp_unix_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS);
+ inet_unix_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+ REDIR_INGRESS);

xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
}

-static void unix_udp_redir_to_connected(int family, int sock_mapfd,
+static void unix_inet_redir_to_connected(int family, int type, int sock_mapfd,
int verd_mapfd, enum redir_mode mode)
{
const char *log_prefix = redir_mode_str(mode);
@@ -1917,7 +1919,7 @@ static void unix_udp_redir_to_connected(int family, int sock_mapfd,

zero_verdict_count(verd_mapfd);

- err = udp_socketpair(family, &p0, &c0);
+ err = inet_socketpair(family, SOCK_DGRAM, &p0, &c0);
if (err)
return;

@@ -1972,9 +1974,11 @@ static void unix_udp_skb_redir_to_connected(struct test_sockmap_listen *skel,
return;

skel->bss->test_ingress = false;
- unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_EGRESS);
+ unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+ REDIR_EGRESS);
skel->bss->test_ingress = true;
- unix_udp_redir_to_connected(family, sock_map, verdict_map, REDIR_INGRESS);
+ unix_inet_redir_to_connected(family, SOCK_DGRAM, sock_map, verdict_map,
+ REDIR_INGRESS);

xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
}
--
2.20.1

2021-08-16 19:06:43

by Jiang Wang .

[permalink] [raw]
Subject: [PATCH bpf-next v7 1/5] af_unix: add read_sock for stream socket types

To support sockmap for af_unix stream type, implement
read_sock, which is similar to the read_sock for unix
dgram sockets.

Signed-off-by: Jiang Wang <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
---
net/unix/af_unix.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 1c2224f05b51..31061304ccf2 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -678,6 +678,8 @@ static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
static int unix_read_sock(struct sock *sk, read_descriptor_t *desc,
sk_read_actor_t recv_actor);
+static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t recv_actor);
static int unix_dgram_connect(struct socket *, struct sockaddr *,
int, int);
static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
@@ -731,6 +733,7 @@ static const struct proto_ops unix_stream_ops = {
.shutdown = unix_shutdown,
.sendmsg = unix_stream_sendmsg,
.recvmsg = unix_stream_recvmsg,
+ .read_sock = unix_stream_read_sock,
.mmap = sock_no_mmap,
.sendpage = unix_stream_sendpage,
.splice_read = unix_stream_splice_read,
@@ -2490,6 +2493,15 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
}
#endif

+static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
+ sk_read_actor_t recv_actor)
+{
+ if (unlikely(sk->sk_state != TCP_ESTABLISHED))
+ return -ENOTCONN;
+
+ return unix_read_sock(sk, desc, recv_actor);
+}
+
static int unix_stream_read_generic(struct unix_stream_read_state *state,
bool freezable)
{
--
2.20.1

2021-08-16 19:06:56

by Jiang Wang .

[permalink] [raw]
Subject: [PATCH bpf-next v7 2/5] af_unix: add unix_stream_proto for sockmap

Previously, sockmap for AF_UNIX protocol only supports
dgram type. This patch add unix stream type support, which
is similar to unix_dgram_proto. To support sockmap, dgram
and stream cannot share the same unix_proto anymore, because
they have different implementations, such as unhash for stream
type (which will remove closed or disconnected sockets from the map),
so rename unix_proto to unix_dgram_proto and add a new
unix_stream_proto.

Also implement stream related sockmap functions.
And add dgram key words to those dgram specific functions.

Signed-off-by: Jiang Wang <[email protected]>
Reviewed-by: Cong Wang <[email protected]>
---
include/net/af_unix.h | 8 +++-
net/core/sock_map.c | 1 +
net/unix/af_unix.c | 83 ++++++++++++++++++++++++++++++++------
net/unix/unix_bpf.c | 93 +++++++++++++++++++++++++++++++++----------
4 files changed, 148 insertions(+), 37 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index 4757d7f53f13..7d142e8a0550 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -87,6 +87,8 @@ long unix_outq_len(struct sock *sk);

int __unix_dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t size,
int flags);
+int __unix_stream_recvmsg(struct sock *sk, struct msghdr *msg, size_t size,
+ int flags);
#ifdef CONFIG_SYSCTL
int unix_sysctl_register(struct net *net);
void unix_sysctl_unregister(struct net *net);
@@ -96,9 +98,11 @@ static inline void unix_sysctl_unregister(struct net *net) {}
#endif

#ifdef CONFIG_BPF_SYSCALL
-extern struct proto unix_proto;
+extern struct proto unix_dgram_proto;
+extern struct proto unix_stream_proto;

-int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
+int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
+int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore);
void __init unix_bpf_build_proto(void);
#else
static inline void __init unix_bpf_build_proto(void)
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index ae5fa4338d9c..e252b8ec2b85 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1494,6 +1494,7 @@ void sock_map_unhash(struct sock *sk)
rcu_read_unlock();
saved_unhash(sk);
}
+EXPORT_SYMBOL_GPL(sock_map_unhash);

void sock_map_close(struct sock *sk, long timeout)
{
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 31061304ccf2..e7da96c80c71 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -797,17 +797,35 @@ static void unix_close(struct sock *sk, long timeout)
*/
}

-struct proto unix_proto = {
- .name = "UNIX",
+static void unix_unhash(struct sock *sk)
+{
+ /* Nothing to do here, unix socket does not need a ->unhash().
+ * This is merely for sockmap.
+ */
+}
+
+struct proto unix_dgram_proto = {
+ .name = "UNIX-DGRAM",
+ .owner = THIS_MODULE,
+ .obj_size = sizeof(struct unix_sock),
+ .close = unix_close,
+#ifdef CONFIG_BPF_SYSCALL
+ .psock_update_sk_prot = unix_dgram_bpf_update_proto,
+#endif
+};
+
+struct proto unix_stream_proto = {
+ .name = "UNIX-STREAM",
.owner = THIS_MODULE,
.obj_size = sizeof(struct unix_sock),
.close = unix_close,
+ .unhash = unix_unhash,
#ifdef CONFIG_BPF_SYSCALL
- .psock_update_sk_prot = unix_bpf_update_proto,
+ .psock_update_sk_prot = unix_stream_bpf_update_proto,
#endif
};

-static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
+static struct sock *unix_create1(struct net *net, struct socket *sock, int kern, int type)
{
struct sock *sk = NULL;
struct unix_sock *u;
@@ -816,7 +834,11 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
if (atomic_long_read(&unix_nr_socks) > 2 * get_max_files())
goto out;

- sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_proto, kern);
+ if (type == SOCK_STREAM)
+ sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_stream_proto, kern);
+ else /*dgram and seqpacket */
+ sk = sk_alloc(net, PF_UNIX, GFP_KERNEL, &unix_dgram_proto, kern);
+
if (!sk)
goto out;

@@ -878,7 +900,7 @@ static int unix_create(struct net *net, struct socket *sock, int protocol,
return -ESOCKTNOSUPPORT;
}

- return unix_create1(net, sock, kern) ? 0 : -ENOMEM;
+ return unix_create1(net, sock, kern, sock->type) ? 0 : -ENOMEM;
}

static int unix_release(struct socket *sock)
@@ -1292,7 +1314,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
err = -ENOMEM;

/* create new sock for complete connection */
- newsk = unix_create1(sock_net(sk), NULL, 0);
+ newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
if (newsk == NULL)
goto out;

@@ -2322,8 +2344,10 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
struct sock *sk = sock->sk;

#ifdef CONFIG_BPF_SYSCALL
- if (sk->sk_prot != &unix_proto)
- return sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
+ const struct proto *prot = READ_ONCE(sk->sk_prot);
+
+ if (prot != &unix_dgram_proto)
+ return prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, NULL);
#endif
return __unix_dgram_recvmsg(sk, msg, size, flags);
@@ -2727,6 +2751,20 @@ static int unix_stream_read_actor(struct sk_buff *skb,
return ret ?: chunk;
}

+int __unix_stream_recvmsg(struct sock *sk, struct msghdr *msg,
+ size_t size, int flags)
+{
+ struct unix_stream_read_state state = {
+ .recv_actor = unix_stream_read_actor,
+ .socket = sk->sk_socket,
+ .msg = msg,
+ .size = size,
+ .flags = flags
+ };
+
+ return unix_stream_read_generic(&state, true);
+}
+
static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
size_t size, int flags)
{
@@ -2738,6 +2776,14 @@ static int unix_stream_recvmsg(struct socket *sock, struct msghdr *msg,
.flags = flags
};

+#ifdef CONFIG_BPF_SYSCALL
+ struct sock *sk = sock->sk;
+ const struct proto *prot = READ_ONCE(sk->sk_prot);
+
+ if (prot != &unix_stream_proto)
+ return prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
+ flags & ~MSG_DONTWAIT, NULL);
+#endif
return unix_stream_read_generic(&state, true);
}

@@ -2798,7 +2844,9 @@ static int unix_shutdown(struct socket *sock, int mode)
(sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)) {

int peer_mode = 0;
+ const struct proto *prot = READ_ONCE(other->sk_prot);

+ prot->unhash(other);
if (mode&RCV_SHUTDOWN)
peer_mode |= SEND_SHUTDOWN;
if (mode&SEND_SHUTDOWN)
@@ -2807,10 +2855,12 @@ static int unix_shutdown(struct socket *sock, int mode)
other->sk_shutdown |= peer_mode;
unix_state_unlock(other);
other->sk_state_change(other);
- if (peer_mode == SHUTDOWN_MASK)
+ if (peer_mode == SHUTDOWN_MASK) {
sk_wake_async(other, SOCK_WAKE_WAITD, POLL_HUP);
- else if (peer_mode & RCV_SHUTDOWN)
+ other->sk_state = TCP_CLOSE;
+ } else if (peer_mode & RCV_SHUTDOWN) {
sk_wake_async(other, SOCK_WAKE_WAITD, POLL_IN);
+ }
}
if (other)
sock_put(other);
@@ -3201,7 +3251,13 @@ static int __init af_unix_init(void)

BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof_field(struct sk_buff, cb));

- rc = proto_register(&unix_proto, 1);
+ rc = proto_register(&unix_dgram_proto, 1);
+ if (rc != 0) {
+ pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__);
+ goto out;
+ }
+
+ rc = proto_register(&unix_stream_proto, 1);
if (rc != 0) {
pr_crit("%s: Cannot create unix_sock SLAB cache!\n", __func__);
goto out;
@@ -3217,7 +3273,8 @@ static int __init af_unix_init(void)
static void __exit af_unix_exit(void)
{
sock_unregister(PF_UNIX);
- proto_unregister(&unix_proto);
+ proto_unregister(&unix_dgram_proto);
+ proto_unregister(&unix_stream_proto);
unregister_pernet_subsys(&unix_net_ops);
}

diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
index 20f53575b5c9..b927e2baae50 100644
--- a/net/unix/unix_bpf.c
+++ b/net/unix/unix_bpf.c
@@ -38,9 +38,18 @@ static int unix_msg_wait_data(struct sock *sk, struct sk_psock *psock,
return ret;
}

-static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
- size_t len, int nonblock, int flags,
- int *addr_len)
+static int __unix_recvmsg(struct sock *sk, struct msghdr *msg,
+ size_t len, int flags)
+{
+ if (sk->sk_type == SOCK_DGRAM)
+ return __unix_dgram_recvmsg(sk, msg, len, flags);
+ else
+ return __unix_stream_recvmsg(sk, msg, len, flags);
+}
+
+static int unix_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
+ size_t len, int nonblock, int flags,
+ int *addr_len)
{
struct unix_sock *u = unix_sk(sk);
struct sk_psock *psock;
@@ -48,14 +57,14 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,

psock = sk_psock_get(sk);
if (unlikely(!psock))
- return __unix_dgram_recvmsg(sk, msg, len, flags);
+ return __unix_recvmsg(sk, msg, len, flags);

mutex_lock(&u->iolock);
if (!skb_queue_empty(&sk->sk_receive_queue) &&
sk_psock_queue_empty(psock)) {
mutex_unlock(&u->iolock);
sk_psock_put(sk, psock);
- return __unix_dgram_recvmsg(sk, msg, len, flags);
+ return __unix_recvmsg(sk, msg, len, flags);
}

msg_bytes_ready:
@@ -71,7 +80,7 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
goto msg_bytes_ready;
mutex_unlock(&u->iolock);
sk_psock_put(sk, psock);
- return __unix_dgram_recvmsg(sk, msg, len, flags);
+ return __unix_recvmsg(sk, msg, len, flags);
}
copied = -EAGAIN;
}
@@ -80,30 +89,55 @@ static int unix_dgram_bpf_recvmsg(struct sock *sk, struct msghdr *msg,
return copied;
}

-static struct proto *unix_prot_saved __read_mostly;
-static DEFINE_SPINLOCK(unix_prot_lock);
-static struct proto unix_bpf_prot;
+static struct proto *unix_dgram_prot_saved __read_mostly;
+static DEFINE_SPINLOCK(unix_dgram_prot_lock);
+static struct proto unix_dgram_bpf_prot;
+
+static struct proto *unix_stream_prot_saved __read_mostly;
+static DEFINE_SPINLOCK(unix_stream_prot_lock);
+static struct proto unix_stream_bpf_prot;

-static void unix_bpf_rebuild_protos(struct proto *prot, const struct proto *base)
+static void unix_dgram_bpf_rebuild_protos(struct proto *prot, const struct proto *base)
{
*prot = *base;
prot->close = sock_map_close;
- prot->recvmsg = unix_dgram_bpf_recvmsg;
+ prot->recvmsg = unix_bpf_recvmsg;
+}
+
+static void unix_stream_bpf_rebuild_protos(struct proto *prot,
+ const struct proto *base)
+{
+ *prot = *base;
+ prot->close = sock_map_close;
+ prot->recvmsg = unix_bpf_recvmsg;
+ prot->unhash = sock_map_unhash;
+}
+
+static void unix_dgram_bpf_check_needs_rebuild(struct proto *ops)
+{
+ if (unlikely(ops != smp_load_acquire(&unix_dgram_prot_saved))) {
+ spin_lock_bh(&unix_dgram_prot_lock);
+ if (likely(ops != unix_dgram_prot_saved)) {
+ unix_dgram_bpf_rebuild_protos(&unix_dgram_bpf_prot, ops);
+ smp_store_release(&unix_dgram_prot_saved, ops);
+ }
+ spin_unlock_bh(&unix_dgram_prot_lock);
+ }
}

-static void unix_bpf_check_needs_rebuild(struct proto *ops)
+static void unix_stream_bpf_check_needs_rebuild(struct proto *ops)
{
- if (unlikely(ops != smp_load_acquire(&unix_prot_saved))) {
- spin_lock_bh(&unix_prot_lock);
- if (likely(ops != unix_prot_saved)) {
- unix_bpf_rebuild_protos(&unix_bpf_prot, ops);
- smp_store_release(&unix_prot_saved, ops);
+ if (unlikely(ops != smp_load_acquire(&unix_stream_prot_saved))) {
+ spin_lock_bh(&unix_stream_prot_lock);
+ if (likely(ops != unix_stream_prot_saved)) {
+ unix_stream_bpf_rebuild_protos(&unix_stream_bpf_prot, ops);
+ smp_store_release(&unix_stream_prot_saved, ops);
}
- spin_unlock_bh(&unix_prot_lock);
+ spin_unlock_bh(&unix_stream_prot_lock);
}
}

-int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
+int unix_dgram_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
{
if (sk->sk_type != SOCK_DGRAM)
return -EOPNOTSUPP;
@@ -114,12 +148,27 @@ int unix_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
return 0;
}

- unix_bpf_check_needs_rebuild(psock->sk_proto);
- WRITE_ONCE(sk->sk_prot, &unix_bpf_prot);
+ unix_dgram_bpf_check_needs_rebuild(psock->sk_proto);
+ WRITE_ONCE(sk->sk_prot, &unix_dgram_bpf_prot);
+ return 0;
+}
+
+int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
+{
+ if (restore) {
+ sk->sk_write_space = psock->saved_write_space;
+ WRITE_ONCE(sk->sk_prot, psock->sk_proto);
+ return 0;
+ }
+
+ unix_stream_bpf_check_needs_rebuild(psock->sk_proto);
+ WRITE_ONCE(sk->sk_prot, &unix_stream_bpf_prot);
return 0;
}

void __init unix_bpf_build_proto(void)
{
- unix_bpf_rebuild_protos(&unix_bpf_prot, &unix_proto);
+ unix_dgram_bpf_rebuild_protos(&unix_dgram_bpf_prot, &unix_dgram_proto);
+ unix_stream_bpf_rebuild_protos(&unix_stream_bpf_prot, &unix_stream_proto);
+
}
--
2.20.1

2021-08-17 02:01:36

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH bpf-next v7 0/5] sockmap: add sockmap support for unix stream socket

Hello:

This series was applied to bpf/bpf-next.git (refs/heads/master):

On Mon, 16 Aug 2021 19:03:19 +0000 you wrote:
> This patch series add support for unix stream type
> for sockmap. Sockmap already supports TCP, UDP,
> unix dgram types. The unix stream support is similar
> to unix dgram.
>
> Also add selftests for unix stream type in sockmap tests.
>
> [...]

Here is the summary with links:
- [bpf-next,v7,1/5] af_unix: add read_sock for stream socket types
https://git.kernel.org/bpf/bpf-next/c/77462de14a43
- [bpf-next,v7,2/5] af_unix: add unix_stream_proto for sockmap
https://git.kernel.org/bpf/bpf-next/c/94531cfcbe79
- [bpf-next,v7,3/5] selftest/bpf: add tests for sockmap with unix stream type.
https://git.kernel.org/bpf/bpf-next/c/9b03152bd469
- [bpf-next,v7,4/5] selftest/bpf: change udp to inet in some function names
https://git.kernel.org/bpf/bpf-next/c/75e0e27db6cf
- [bpf-next,v7,5/5] selftest/bpf: add new tests in sockmap for unix stream to tcp.
https://git.kernel.org/bpf/bpf-next/c/31c50aeed5a1

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2021-08-20 22:54:05

by Dmitry Osipenko

[permalink] [raw]
Subject: Re: [PATCH bpf-next v7 2/5] af_unix: add unix_stream_proto for sockmap

16.08.2021 22:03, Jiang Wang пишет:
> Previously, sockmap for AF_UNIX protocol only supports
> dgram type. This patch add unix stream type support, which
> is similar to unix_dgram_proto. To support sockmap, dgram
> and stream cannot share the same unix_proto anymore, because
> they have different implementations, such as unhash for stream
> type (which will remove closed or disconnected sockets from the map),
> so rename unix_proto to unix_dgram_proto and add a new
> unix_stream_proto.
>
> Also implement stream related sockmap functions.
> And add dgram key words to those dgram specific functions.
>
> Signed-off-by: Jiang Wang <[email protected]>
> Reviewed-by: Cong Wang <[email protected]>
> ---
> include/net/af_unix.h | 8 +++-
> net/core/sock_map.c | 1 +
> net/unix/af_unix.c | 83 ++++++++++++++++++++++++++++++++------
> net/unix/unix_bpf.c | 93 +++++++++++++++++++++++++++++++++----------
> 4 files changed, 148 insertions(+), 37 deletions(-)

This patch broke Qt WebEngine using recent linux-next (tested on ARM32
only), please fix.

8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address
00000000
pgd = 2fba1ffb
*pgd=00000000
Internal error: Oops: 80000005 [#1] PREEMPT SMP THUMB2
Modules linked in:
CPU: 1 PID: 1999 Comm: falkon Tainted: G W
5.14.0-rc5-01175-g94531cfcbe79-dirty #9240
Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
PC is at 0x0
LR is at unix_shutdown+0x81/0x1a8
pc : [<00000000>] lr : [<c08f3311>] psr: 600f0013
sp : e45aff70 ip : e463a3c0 fp : beb54f04
r10: 00000125 r9 : e45ae000 r8 : c4a56664
r7 : 00000001 r6 : c4a56464 r5 : 00000001 r4 : c4a56400
r3 : 00000000 r2 : c5a6b180 r1 : 00000000 r0 : c4a56400
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 50c5387d Table: 05aa804a DAC: 00000051
Register r0 information: slab PING start c4a56400 pointer offset 0
Register r1 information: NULL pointer
Register r2 information: slab task_struct start c5a6b180 pointer offset 0
Register r3 information: NULL pointer
Register r4 information: slab PING start c4a56400 pointer offset 0
Register r5 information: non-paged memory
Register r6 information: slab PING start c4a56400 pointer offset 100
Register r7 information: non-paged memory
Register r8 information: slab PING start c4a56400 pointer offset 612
Register r9 information: non-slab/vmalloc memory
Register r10 information: non-paged memory
Register r11 information: non-paged memory
Register r12 information: slab filp start e463a3c0 pointer offset 0
Process falkon (pid: 1999, stack limit = 0x9ec48895)
Stack: (0xe45aff70 to 0xe45b0000)
ff60: e45ae000 c5f26a00 00000000
00000125
ff80: c0100264 c07f7fa3 beb54f04 fffffff7 00000001 e6f3fc0e b5e5e9ec
beb54ec4
ffa0: b5da0ccc c010024b b5e5e9ec beb54ec4 0000000f 00000000 00000000
beb54ebc
ffc0: b5e5e9ec beb54ec4 b5da0ccc 00000125 beb54f58 00785238 beb5529c
beb54f04
ffe0: b5da1e24 beb54eac b301385c b62b6ee8 600f0030 0000000f 00000000
00000000
[<c08f3311>] (unix_shutdown) from [<c07f7fa3>] (__sys_shutdown+0x2f/0x50)
[<c07f7fa3>] (__sys_shutdown) from [<c010024b>]
(__sys_trace_return+0x1/0x16)
Exception stack(0xe45affa8 to 0xe45afff0)