2023-03-28 13:13:11

by Juergen Gross

[permalink] [raw]
Subject: [PATCH v2 0/3] xen/netback: fix issue introduced recently

The fix for XSA-423 introduced a bug which resulted in loss of network
connection in some configurations.

The first patch is fixing the issue, while the second one is removing
a test which isn't needed. The third patch is making error messages
more uniform.

Changes in V2:
- add patch 3
- comment addressed (patch 1)


Juergen Gross (3):
xen/netback: don't do grant copy across page boundary
xen/netback: remove not needed test in xenvif_tx_build_gops()
xen/netback: use same error messages for same errors

drivers/net/xen-netback/common.h | 2 +-
drivers/net/xen-netback/netback.c | 37 ++++++++++++++++++++++---------
2 files changed, 28 insertions(+), 11 deletions(-)

--
2.35.3


2023-03-28 13:14:41

by Juergen Gross

[permalink] [raw]
Subject: [PATCH v2 1/3] xen/netback: don't do grant copy across page boundary

Fix xenvif_get_requests() not to do grant copy operations across local
page boundaries. This requires to double the maximum number of copy
operations per queue, as each copy could now be split into 2.

Make sure that struct xenvif_tx_cb doesn't grow too large.

Cc: [email protected]
Fixes: ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area")
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Paul Durrant <[email protected]>
---
V2:
- add another BUILD_BUG_ON() (Jan Beulich)
---
drivers/net/xen-netback/common.h | 2 +-
drivers/net/xen-netback/netback.c | 27 +++++++++++++++++++++++++--
2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 3dbfc8a6924e..1fcbd83f7ff2 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -166,7 +166,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
grant_handle_t grant_tx_handle[MAX_PENDING_REQS];

- struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
+ struct gnttab_copy tx_copy_ops[2 * MAX_PENDING_REQS];
struct gnttab_map_grant_ref tx_map_ops[MAX_PENDING_REQS];
struct gnttab_unmap_grant_ref tx_unmap_ops[MAX_PENDING_REQS];
/* passed to gnttab_[un]map_refs with pages under (un)mapping */
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 1b42676ca141..54c76af90233 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -334,6 +334,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
struct xenvif_tx_cb {
u16 copy_pending_idx[XEN_NETBK_LEGACY_SLOTS_MAX + 1];
u8 copy_count;
+ u32 split_mask;
};

#define XENVIF_TX_CB(skb) ((struct xenvif_tx_cb *)(skb)->cb)
@@ -361,6 +362,10 @@ static inline struct sk_buff *xenvif_alloc_skb(unsigned int size)
struct sk_buff *skb =
alloc_skb(size + NET_SKB_PAD + NET_IP_ALIGN,
GFP_ATOMIC | __GFP_NOWARN);
+
+ BUILD_BUG_ON(sizeof(*XENVIF_TX_CB(skb)) > sizeof(skb->cb));
+ BUILD_BUG_ON(sizeof(XENVIF_TX_CB(skb)->split_mask) * 8 <
+ ARRAY_SIZE(XENVIF_TX_CB(skb)->copy_pending_idx));
if (unlikely(skb == NULL))
return NULL;

@@ -396,11 +401,13 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
nr_slots = shinfo->nr_frags + 1;

copy_count(skb) = 0;
+ XENVIF_TX_CB(skb)->split_mask = 0;

/* Create copy ops for exactly data_len bytes into the skb head. */
__skb_put(skb, data_len);
while (data_len > 0) {
int amount = data_len > txp->size ? txp->size : data_len;
+ bool split = false;

cop->source.u.ref = txp->gref;
cop->source.domid = queue->vif->domid;
@@ -413,6 +420,13 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
cop->dest.u.gmfn = virt_to_gfn(skb->data + skb_headlen(skb)
- data_len);

+ /* Don't cross local page boundary! */
+ if (cop->dest.offset + amount > XEN_PAGE_SIZE) {
+ amount = XEN_PAGE_SIZE - cop->dest.offset;
+ XENVIF_TX_CB(skb)->split_mask |= 1U << copy_count(skb);
+ split = true;
+ }
+
cop->len = amount;
cop->flags = GNTCOPY_source_gref;

@@ -420,7 +434,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
pending_idx = queue->pending_ring[index];
callback_param(queue, pending_idx).ctx = NULL;
copy_pending_idx(skb, copy_count(skb)) = pending_idx;
- copy_count(skb)++;
+ if (!split)
+ copy_count(skb)++;

cop++;
data_len -= amount;
@@ -441,7 +456,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
nr_slots--;
} else {
/* The copy op partially covered the tx_request.
- * The remainder will be mapped.
+ * The remainder will be mapped or copied in the next
+ * iteration.
*/
txp->offset += amount;
txp->size -= amount;
@@ -539,6 +555,13 @@ static int xenvif_tx_check_gop(struct xenvif_queue *queue,
pending_idx = copy_pending_idx(skb, i);

newerr = (*gopp_copy)->status;
+
+ /* Split copies need to be handled together. */
+ if (XENVIF_TX_CB(skb)->split_mask & (1U << i)) {
+ (*gopp_copy)++;
+ if (!newerr)
+ newerr = (*gopp_copy)->status;
+ }
if (likely(!newerr)) {
/* The first frag might still have this slot mapped */
if (i < copy_count(skb) - 1 || !sharedslot)
--
2.35.3

2023-03-28 13:14:43

by Juergen Gross

[permalink] [raw]
Subject: [PATCH v2 2/3] xen/netback: remove not needed test in xenvif_tx_build_gops()

The tests for the number of grant mapping or copy operations reaching
the array size of the operations buffer at the end of the main loop in
xenvif_tx_build_gops() isn't needed.

The loop can handle at maximum MAX_PENDING_REQS transfer requests, as
XEN_RING_NR_UNCONSUMED_REQUESTS() is taking unsent responses into
consideration, too.

Remove the tests.

Suggested-by: Jan Beulich <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Paul Durrant <[email protected]>
---
drivers/net/xen-netback/netback.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 54c76af90233..9ca4b69d3b39 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1084,10 +1084,6 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
__skb_queue_tail(&queue->tx_queue, skb);

queue->tx.req_cons = idx;
-
- if ((*map_ops >= ARRAY_SIZE(queue->tx_map_ops)) ||
- (*copy_ops >= ARRAY_SIZE(queue->tx_copy_ops)))
- break;
}

return;
--
2.35.3

2023-03-28 13:14:56

by Juergen Gross

[permalink] [raw]
Subject: [PATCH v2 3/3] xen/netback: use same error messages for same errors

Issue the same error message in case an illegal page boundary crossing
has been detected in both cases where this is tested.

Suggested-by: Jan Beulich <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
---
V2:
- new patch
---
drivers/net/xen-netback/netback.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 9ca4b69d3b39..5dfdec44354a 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -996,10 +996,8 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,

/* No crossing a page as the payload mustn't fragment. */
if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
- netdev_err(queue->vif->dev,
- "txreq.offset: %u, size: %u, end: %lu\n",
- txreq.offset, txreq.size,
- (unsigned long)(txreq.offset&~XEN_PAGE_MASK) + txreq.size);
+ netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %u, size: %u\n",
+ txreq.offset, txreq.size);
xenvif_fatal_tx_err(queue->vif);
break;
}
--
2.35.3

2023-03-28 13:33:42

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] xen/netback: fix issue introduced recently

Hello:

This series was applied to netdev/net.git (main)
by Paolo Abeni <[email protected]>:

On Tue, 28 Mar 2023 15:10:44 +0200 you wrote:
> The fix for XSA-423 introduced a bug which resulted in loss of network
> connection in some configurations.
>
> The first patch is fixing the issue, while the second one is removing
> a test which isn't needed. The third patch is making error messages
> more uniform.
>
> [...]

Here is the summary with links:
- [v2,1/3] xen/netback: don't do grant copy across page boundary
(no matching commit)
- [v2,2/3] xen/netback: remove not needed test in xenvif_tx_build_gops()
https://git.kernel.org/netdev/net/c/8fb8ebf94877
- [v2,3/3] xen/netback: use same error messages for same errors
(no matching commit)

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2023-03-28 13:35:04

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] xen/netback: use same error messages for same errors

On 28.03.2023 15:12, Juergen Gross wrote:
> Issue the same error message in case an illegal page boundary crossing
> has been detected in both cases where this is tested.
>
> Suggested-by: Jan Beulich <[email protected]>
> Signed-off-by: Juergen Gross <[email protected]>
> ---
> V2:
> - new patch
> ---
> drivers/net/xen-netback/netback.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 9ca4b69d3b39..5dfdec44354a 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -996,10 +996,8 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>
> /* No crossing a page as the payload mustn't fragment. */
> if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
> - netdev_err(queue->vif->dev,
> - "txreq.offset: %u, size: %u, end: %lu\n",
> - txreq.offset, txreq.size,
> - (unsigned long)(txreq.offset&~XEN_PAGE_MASK) + txreq.size);
> + netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %u, size: %u\n",
> + txreq.offset, txreq.size);
> xenvif_fatal_tx_err(queue->vif);
> break;
> }

To be honest I'm of the opinion that this goes slightly too far:
Making the two messages more similar is certainly helpful. But in
case of problems I think it wouldn't hurt if they're still
distinguishable - when the one here triggers it may e.g also mean
that the calculation of the residual size is causing an issue. So
maybe stick to txreq.offset in the message text, with everything
else left as you have it?

Jan

2023-03-28 13:54:42

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] xen/netback: fix issue introduced recently

Hi,

On Tue, 2023-03-28 at 15:10 +0200, Juergen Gross wrote:
> The fix for XSA-423 introduced a bug which resulted in loss of network
> connection in some configurations.
>
> The first patch is fixing the issue, while the second one is removing
> a test which isn't needed. The third patch is making error messages
> more uniform.
>
> Changes in V2:
> - add patch 3
> - comment addressed (patch 1)

I misread the thread on v2 as the build_bug_on() was not needed and
applied such revision. 

Please rebase any further change on top of current net.

Thanks,

Paolo

2023-03-28 14:04:17

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] xen/netback: fix issue introduced recently

On 28.03.2023 15:52, Paolo Abeni wrote:
> On Tue, 2023-03-28 at 15:10 +0200, Juergen Gross wrote:
>> The fix for XSA-423 introduced a bug which resulted in loss of network
>> connection in some configurations.
>>
>> The first patch is fixing the issue, while the second one is removing
>> a test which isn't needed. The third patch is making error messages
>> more uniform.
>>
>> Changes in V2:
>> - add patch 3
>> - comment addressed (patch 1)
>
> I misread the thread on v2 as the build_bug_on() was not needed and
> applied such revision. 

I guess it's not the end of the world if we go forward without that extra
check. It's a safeguard for theoretical future work which, as Jürgen says,
isn't very likely to occur anyway.

Jan

2023-03-29 06:31:24

by Juergen Gross

[permalink] [raw]
Subject: Re: [PATCH v2 3/3] xen/netback: use same error messages for same errors

On 28.03.23 15:32, Jan Beulich wrote:
> On 28.03.2023 15:12, Juergen Gross wrote:
>> Issue the same error message in case an illegal page boundary crossing
>> has been detected in both cases where this is tested.
>>
>> Suggested-by: Jan Beulich <[email protected]>
>> Signed-off-by: Juergen Gross <[email protected]>
>> ---
>> V2:
>> - new patch
>> ---
>> drivers/net/xen-netback/netback.c | 6 ++----
>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 9ca4b69d3b39..5dfdec44354a 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -996,10 +996,8 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>>
>> /* No crossing a page as the payload mustn't fragment. */
>> if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
>> - netdev_err(queue->vif->dev,
>> - "txreq.offset: %u, size: %u, end: %lu\n",
>> - txreq.offset, txreq.size,
>> - (unsigned long)(txreq.offset&~XEN_PAGE_MASK) + txreq.size);
>> + netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %u, size: %u\n",
>> + txreq.offset, txreq.size);
>> xenvif_fatal_tx_err(queue->vif);
>> break;
>> }
>
> To be honest I'm of the opinion that this goes slightly too far:
> Making the two messages more similar is certainly helpful. But in
> case of problems I think it wouldn't hurt if they're still
> distinguishable - when the one here triggers it may e.g also mean
> that the calculation of the residual size is causing an issue. So
> maybe stick to txreq.offset in the message text, with everything
> else left as you have it?

Fine with me.


Juergen


Attachments:
OpenPGP_0xB0DE9DD628BF132F.asc (3.08 kB)
OpenPGP public key
OpenPGP_signature (505.00 B)
OpenPGP digital signature
Download all attachments