Hello,
this patchset fixes appending newly arrived skbuff to the last skbuff of
the socket's queue during rx path. Problem fires when we are trying to
append data to skbuff which was already processed in dequeue callback
at least once. Dequeue callback calls function 'skb_pull()' which changes
'skb->len'. In current implementation 'skb->len' is used to update length
in header of last skbuff after new data was copied to it. This is bug,
because value in header is used to calculate 'rx_bytes'/'fwd_cnt' and
thus must be constant during skbuff lifetime. Here is example, we have
two skbuffs: skb0 with length 10 and skb1 with length 4.
1) skb0 arrives, hdr->len == skb->len == 10, rx_bytes == 10
2) Read 3 bytes from skb0, skb->len == 7, hdr->len == 10, rx_bytes == 10
3) skb1 arrives, hdr->len == skb->len == 4, rx_bytes == 14
4) Append skb1 to skb0, skb0 now has skb->len == 11, hdr->len == 11.
But value of 11 in header is invalid.
5) Read whole skb0, update rx_bytes by 11 from skb0's header.
6) At this moment rx_bytes == 3, but socket's queue is empty.
This bug starts to fire since:
commit
077706165717 ("virtio/vsock: don't use skbuff state to account credit")
In fact, it presents before, but didn't triggered due to a little bit
buggy implementation of credit calculation logic. So i'll use Fixes tag
for it.
I really forgot about this branch in rx path when implemented patch
077706165717.
This patchset contains 3 patches:
1) Fix itself.
2) Patch with WARN_ONCE() to catch such problems in future.
3) Patch with test which triggers skb appending logic. It looks like
simple test with several 'send()' and 'recv()', but it checks, that
skbuff appending works ok.
Link to v1:
https://lore.kernel.org/netdev/[email protected]/
Changelog:
v1 -> v2:
- Replace 'WARN()' with 'WARN_ONCE()'.
- Clean and refactor source code of the reproducer, now it is test for
vsock_test suite.
- Commit messages update.
Arseniy Krasnov (3):
virtio/vsock: fix header length on skb merging
virtio/vsock: WARN_ONCE() for invalid state of socket
test/vsock: new skbuff appending test
net/vmw_vsock/virtio_transport_common.c | 9 ++-
tools/testing/vsock/vsock_test.c | 90 +++++++++++++++++++++++++
2 files changed, 98 insertions(+), 1 deletion(-)
--
2.25.1
This fixes appending newly arrived skbuff to the last skbuff of the
socket's queue. Problem fires when we are trying to append data to skbuff
which was already processed in dequeue callback at least once. Dequeue
callback calls function 'skb_pull()' which changes 'skb->len'. In current
implementation 'skb->len' is used to update length in header of the last
skbuff after new data was copied to it. This is bug, because value in
header is used to calculate 'rx_bytes'/'fwd_cnt' and thus must be not
be changed during skbuff's lifetime.
Bug starts to fire since:
commit 077706165717
("virtio/vsock: don't use skbuff state to account credit")
It presents before, but didn't triggered due to a little bit buggy
implementation of credit calculation logic. So use Fixes tag for it.
Fixes: 077706165717 ("virtio/vsock: don't use skbuff state to account credit")
Signed-off-by: Arseniy Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport_common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 7fc178c3ee07..b9144af71553 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -1101,7 +1101,7 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
memcpy(skb_put(last_skb, skb->len), skb->data, skb->len);
free_pkt = true;
last_hdr->flags |= hdr->flags;
- last_hdr->len = cpu_to_le32(last_skb->len);
+ le32_add_cpu(&last_hdr->len, len);
goto out;
}
}
--
2.25.1
This adds WARN_ONCE() and return from stream dequeue callback when
socket's queue is empty, but 'rx_bytes' still non-zero.
Signed-off-by: Arseniy Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport_common.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index b9144af71553..ad70531de133 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -398,6 +398,13 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
u32 free_space;
spin_lock_bh(&vvs->rx_lock);
+
+ if (WARN_ONCE(skb_queue_empty(&vvs->rx_queue) && vvs->rx_bytes,
+ "No skbuffs with non-zero 'rx_bytes'\n")) {
+ spin_unlock_bh(&vvs->rx_lock);
+ return err;
+ }
+
while (total < len && !skb_queue_empty(&vvs->rx_queue)) {
skb = skb_peek(&vvs->rx_queue);
--
2.25.1
This adds test which checks case when data of newly received skbuff is
appended to the last skbuff in the socket's queue. It looks like simple
test with 'send()' and 'recv()', but internally it triggers logic which
appends one received skbuff to another. Test checks that this feature
works correctly.
This test is actual only for virtio transport.
Signed-off-by: Arseniy Krasnov <[email protected]>
---
tools/testing/vsock/vsock_test.c | 90 ++++++++++++++++++++++++++++++++
1 file changed, 90 insertions(+)
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 3de10dbb50f5..12b97c92fbb2 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -968,6 +968,91 @@ static void test_seqpacket_inv_buf_server(const struct test_opts *opts)
test_inv_buf_server(opts, false);
}
+#define HELLO_STR "HELLO"
+#define WORLD_STR "WORLD"
+
+static void test_stream_virtio_skb_merge_client(const struct test_opts *opts)
+{
+ ssize_t res;
+ int fd;
+
+ fd = vsock_stream_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Send first skbuff. */
+ res = send(fd, HELLO_STR, strlen(HELLO_STR), 0);
+ if (res != strlen(HELLO_STR)) {
+ fprintf(stderr, "unexpected send(2) result %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("SEND0");
+ /* Peer reads part of first skbuff. */
+ control_expectln("REPLY0");
+
+ /* Send second skbuff, it will be appended to the first. */
+ res = send(fd, WORLD_STR, strlen(WORLD_STR), 0);
+ if (res != strlen(WORLD_STR)) {
+ fprintf(stderr, "unexpected send(2) result %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("SEND1");
+ /* Peer reads merged skbuff packet. */
+ control_expectln("REPLY1");
+
+ close(fd);
+}
+
+static void test_stream_virtio_skb_merge_server(const struct test_opts *opts)
+{
+ unsigned char buf[64];
+ ssize_t res;
+ int fd;
+
+ fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ control_expectln("SEND0");
+
+ /* Read skbuff partially. */
+ res = recv(fd, buf, 2, 0);
+ if (res != 2) {
+ fprintf(stderr, "expected recv(2) returns 2 bytes, got %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("REPLY0");
+ control_expectln("SEND1");
+
+ res = recv(fd, buf + 2, sizeof(buf) - 2, 0);
+ if (res != 8) {
+ fprintf(stderr, "expected recv(2) returns 8 bytes, got %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ res = recv(fd, buf, sizeof(buf) - 8 - 2, MSG_DONTWAIT);
+ if (res != -1) {
+ fprintf(stderr, "expected recv(2) failure, got %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ if (memcmp(buf, HELLO_STR WORLD_STR, strlen(HELLO_STR WORLD_STR))) {
+ fprintf(stderr, "pattern mismatch\n");
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("REPLY1");
+
+ close(fd);
+}
+
static struct test_case test_cases[] = {
{
.name = "SOCK_STREAM connection reset",
@@ -1038,6 +1123,11 @@ static struct test_case test_cases[] = {
.run_client = test_seqpacket_inv_buf_client,
.run_server = test_seqpacket_inv_buf_server,
},
+ {
+ .name = "SOCK_STREAM virtio skb merge",
+ .run_client = test_stream_virtio_skb_merge_client,
+ .run_server = test_stream_virtio_skb_merge_server,
+ },
{},
};
--
2.25.1
On Sun, Mar 26, 2023 at 01:08:22AM +0300, Arseniy Krasnov wrote:
>This fixes appending newly arrived skbuff to the last skbuff of the
>socket's queue. Problem fires when we are trying to append data to skbuff
>which was already processed in dequeue callback at least once. Dequeue
>callback calls function 'skb_pull()' which changes 'skb->len'. In current
>implementation 'skb->len' is used to update length in header of the last
>skbuff after new data was copied to it. This is bug, because value in
>header is used to calculate 'rx_bytes'/'fwd_cnt' and thus must be not
>be changed during skbuff's lifetime.
>
>Bug starts to fire since:
>
>commit 077706165717
>("virtio/vsock: don't use skbuff state to account credit")
>
>It presents before, but didn't triggered due to a little bit buggy
>implementation of credit calculation logic. So use Fixes tag for it.
>
>Fixes: 077706165717 ("virtio/vsock: don't use skbuff state to account credit")
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> net/vmw_vsock/virtio_transport_common.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 7fc178c3ee07..b9144af71553 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -1101,7 +1101,7 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
> memcpy(skb_put(last_skb, skb->len), skb->data, skb->len);
> free_pkt = true;
> last_hdr->flags |= hdr->flags;
>- last_hdr->len = cpu_to_le32(last_skb->len);
>+ le32_add_cpu(&last_hdr->len, len);
> goto out;
> }
> }
>--
>2.25.1
>
LGTM!
Reviewed-by: Stefano Garzarella <[email protected]>
On Sun, Mar 26, 2023 at 01:09:25AM +0300, Arseniy Krasnov wrote:
>This adds WARN_ONCE() and return from stream dequeue callback when
>socket's queue is empty, but 'rx_bytes' still non-zero.
Nit: I would explain why we add this, for example:
This allows the detection of potential bugs due to packet merging
(see previous patch).
>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> net/vmw_vsock/virtio_transport_common.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index b9144af71553..ad70531de133 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -398,6 +398,13 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> u32 free_space;
>
> spin_lock_bh(&vvs->rx_lock);
>+
>+ if (WARN_ONCE(skb_queue_empty(&vvs->rx_queue) && vvs->rx_bytes,
>+ "No skbuffs with non-zero 'rx_bytes'\n")) {
Nit: I would rephrase it this way:
"rx_queue is empty, but rx_bytes is non-zero"
>+ spin_unlock_bh(&vvs->rx_lock);
>+ return err;
>+ }
>+
> while (total < len && !skb_queue_empty(&vvs->rx_queue)) {
> skb = skb_peek(&vvs->rx_queue);
>
>--
>2.25.1
>
Anyway the patch LGTM!
Reviewed-by: Stefano Garzarella <[email protected]>
On 28.03.2023 12:29, Stefano Garzarella wrote:
> On Sun, Mar 26, 2023 at 01:09:25AM +0300, Arseniy Krasnov wrote:
>> This adds WARN_ONCE() and return from stream dequeue callback when
>> socket's queue is empty, but 'rx_bytes' still non-zero.
>
> Nit: I would explain why we add this, for example:
>
> This allows the detection of potential bugs due to packet merging
> (see previous patch).
>
>>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> net/vmw_vsock/virtio_transport_common.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>
>>
>> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>> index b9144af71553..ad70531de133 100644
>> --- a/net/vmw_vsock/virtio_transport_common.c
>> +++ b/net/vmw_vsock/virtio_transport_common.c
>> @@ -398,6 +398,13 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
>> u32 free_space;
>>
>> spin_lock_bh(&vvs->rx_lock);
>> +
>> + if (WARN_ONCE(skb_queue_empty(&vvs->rx_queue) && vvs->rx_bytes,
>> + "No skbuffs with non-zero 'rx_bytes'\n")) {
>
> Nit: I would rephrase it this way:
> "rx_queue is empty, but rx_bytes is non-zero"
>
>> + spin_unlock_bh(&vvs->rx_lock);
>> + return err;
>> + }
>> +
>> while (total < len && !skb_queue_empty(&vvs->rx_queue)) {
>> skb = skb_peek(&vvs->rx_queue);
>>
>> --
>> 2.25.1
>>
>
> Anyway the patch LGTM!
Thanks for review! Since only string value and commit message should be
updated, i can resend it with 'net' (as it is fix) and update two thing
above in 'net' version?
Thanks, Arseniy
>
> Reviewed-by: Stefano Garzarella <[email protected]>
>
On Sun, Mar 26, 2023 at 01:10:16AM +0300, Arseniy Krasnov wrote:
>This adds test which checks case when data of newly received skbuff is
>appended to the last skbuff in the socket's queue. It looks like simple
>test with 'send()' and 'recv()', but internally it triggers logic which
>appends one received skbuff to another. Test checks that this feature
>works correctly.
>
>This test is actual only for virtio transport.
>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> tools/testing/vsock/vsock_test.c | 90 ++++++++++++++++++++++++++++++++
> 1 file changed, 90 insertions(+)
Reviewed-by: Stefano Garzarella <[email protected]>
>
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index 3de10dbb50f5..12b97c92fbb2 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -968,6 +968,91 @@ static void test_seqpacket_inv_buf_server(const struct test_opts *opts)
> test_inv_buf_server(opts, false);
> }
>
>+#define HELLO_STR "HELLO"
>+#define WORLD_STR "WORLD"
>+
>+static void test_stream_virtio_skb_merge_client(const struct test_opts *opts)
>+{
>+ ssize_t res;
>+ int fd;
>+
>+ fd = vsock_stream_connect(opts->peer_cid, 1234);
>+ if (fd < 0) {
>+ perror("connect");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ /* Send first skbuff. */
>+ res = send(fd, HELLO_STR, strlen(HELLO_STR), 0);
>+ if (res != strlen(HELLO_STR)) {
>+ fprintf(stderr, "unexpected send(2) result %zi\n", res);
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_writeln("SEND0");
>+ /* Peer reads part of first skbuff. */
>+ control_expectln("REPLY0");
>+
>+ /* Send second skbuff, it will be appended to the first. */
>+ res = send(fd, WORLD_STR, strlen(WORLD_STR), 0);
>+ if (res != strlen(WORLD_STR)) {
>+ fprintf(stderr, "unexpected send(2) result %zi\n", res);
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_writeln("SEND1");
>+ /* Peer reads merged skbuff packet. */
>+ control_expectln("REPLY1");
>+
>+ close(fd);
>+}
>+
>+static void test_stream_virtio_skb_merge_server(const struct test_opts *opts)
>+{
>+ unsigned char buf[64];
>+ ssize_t res;
>+ int fd;
>+
>+ fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL);
>+ if (fd < 0) {
>+ perror("accept");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_expectln("SEND0");
>+
>+ /* Read skbuff partially. */
>+ res = recv(fd, buf, 2, 0);
>+ if (res != 2) {
>+ fprintf(stderr, "expected recv(2) returns 2 bytes, got %zi\n", res);
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_writeln("REPLY0");
>+ control_expectln("SEND1");
>+
>+ res = recv(fd, buf + 2, sizeof(buf) - 2, 0);
>+ if (res != 8) {
>+ fprintf(stderr, "expected recv(2) returns 8 bytes, got %zi\n", res);
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ res = recv(fd, buf, sizeof(buf) - 8 - 2, MSG_DONTWAIT);
>+ if (res != -1) {
>+ fprintf(stderr, "expected recv(2) failure, got %zi\n", res);
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (memcmp(buf, HELLO_STR WORLD_STR, strlen(HELLO_STR WORLD_STR))) {
>+ fprintf(stderr, "pattern mismatch\n");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_writeln("REPLY1");
>+
>+ close(fd);
>+}
>+
> static struct test_case test_cases[] = {
> {
> .name = "SOCK_STREAM connection reset",
>@@ -1038,6 +1123,11 @@ static struct test_case test_cases[] = {
> .run_client = test_seqpacket_inv_buf_client,
> .run_server = test_seqpacket_inv_buf_server,
> },
>+ {
>+ .name = "SOCK_STREAM virtio skb merge",
>+ .run_client = test_stream_virtio_skb_merge_client,
>+ .run_server = test_stream_virtio_skb_merge_server,
>+ },
> {},
> };
>
>--
>2.25.1
>
On Tue, Mar 28, 2023 at 11:35 AM Arseniy Krasnov
<[email protected]> wrote:
>
>
>
> On 28.03.2023 12:29, Stefano Garzarella wrote:
> > On Sun, Mar 26, 2023 at 01:09:25AM +0300, Arseniy Krasnov wrote:
> >> This adds WARN_ONCE() and return from stream dequeue callback when
> >> socket's queue is empty, but 'rx_bytes' still non-zero.
> >
> > Nit: I would explain why we add this, for example:
> >
> > This allows the detection of potential bugs due to packet merging
> > (see previous patch).
> >
> >>
> >> Signed-off-by: Arseniy Krasnov <[email protected]>
> >> ---
> >> net/vmw_vsock/virtio_transport_common.c | 7 +++++++
> >> 1 file changed, 7 insertions(+)
> >
> >>
> >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
> >> index b9144af71553..ad70531de133 100644
> >> --- a/net/vmw_vsock/virtio_transport_common.c
> >> +++ b/net/vmw_vsock/virtio_transport_common.c
> >> @@ -398,6 +398,13 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> >> u32 free_space;
> >>
> >> spin_lock_bh(&vvs->rx_lock);
> >> +
> >> + if (WARN_ONCE(skb_queue_empty(&vvs->rx_queue) && vvs->rx_bytes,
> >> + "No skbuffs with non-zero 'rx_bytes'\n")) {
> >
> > Nit: I would rephrase it this way:
> > "rx_queue is empty, but rx_bytes is non-zero"
> >
> >> + spin_unlock_bh(&vvs->rx_lock);
> >> + return err;
> >> + }
> >> +
> >> while (total < len && !skb_queue_empty(&vvs->rx_queue)) {
> >> skb = skb_peek(&vvs->rx_queue);
> >>
> >> --
> >> 2.25.1
> >>
> >
> > Anyway the patch LGTM!
>
> Thanks for review! Since only string value and commit message should be
> updated, i can resend it with 'net' (as it is fix) and update two thing
> above in 'net' version?
Yep, sure!
And you can already add my R-b ;-)
Thanks,
Stefano
>
> Thanks, Arseniy
> >
> > Reviewed-by: Stefano Garzarella <[email protected]>
> >
>