2021-08-24 10:29:12

by Jürgen Groß

[permalink] [raw]
Subject: [PATCH v2 0/4] xen: harden netfront against malicious backends

Xen backends of para-virtualized devices can live in dom0 kernel, dom0
user land, or in a driver domain. This means that a backend might
reside in a less trusted environment than the Xen core components, so
a backend should not be able to do harm to a Xen guest (it can still
mess up I/O data, but it shouldn't be able to e.g. crash a guest by
other means or cause a privilege escalation in the guest).

Unfortunately netfront in the Linux kernel is fully trusting its
backend. This series is fixing netfront in this regard.

It was discussed to handle this as a security problem, but the topic
was discussed in public before, so it isn't a real secret.

It should be mentioned that a similar series has been posted some years
ago by Marek Marczykowski-Górecki, but this series has not been applied
due to a Xen header not having been available in the Xen git repo at
that time. Additionally my series is fixing some more DoS cases.

Changes in V2:
- put netfront patches into own series
- comments addressed
- new patch 3

Juergen Gross (4):
xen/netfront: read response from backend only once
xen/netfront: don't read data from request on the ring page
xen/netfront: disentangle tx_skb_freelist
xen/netfront: don't trust the backend response data blindly

drivers/net/xen-netfront.c | 272 +++++++++++++++++++++++--------------
1 file changed, 169 insertions(+), 103 deletions(-)

--
2.26.2


2021-08-24 10:30:04

by Jürgen Groß

[permalink] [raw]
Subject: [PATCH v2 1/4] xen/netfront: read response from backend only once

In order to avoid problems in case the backend is modifying a response
on the ring page while the frontend has already seen it, just read the
response into a local buffer in one go and then operate on that buffer
only.

Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Jan Beulich <[email protected]>
---
V2:
- use direct structure assignment instead of memcpy() (Jan Beulich)
---
drivers/net/xen-netfront.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 44275908d61a..003cdf2ffc92 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -399,13 +399,13 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
rmb(); /* Ensure we see responses up to 'rp'. */

for (cons = queue->tx.rsp_cons; cons != prod; cons++) {
- struct xen_netif_tx_response *txrsp;
+ struct xen_netif_tx_response txrsp;

- txrsp = RING_GET_RESPONSE(&queue->tx, cons);
- if (txrsp->status == XEN_NETIF_RSP_NULL)
+ RING_COPY_RESPONSE(&queue->tx, cons, &txrsp);
+ if (txrsp.status == XEN_NETIF_RSP_NULL)
continue;

- id = txrsp->id;
+ id = txrsp.id;
skb = queue->tx_skbs[id].skb;
if (unlikely(gnttab_query_foreign_access(
queue->grant_tx_ref[id]) != 0)) {
@@ -814,7 +814,7 @@ static int xennet_get_extras(struct netfront_queue *queue,
RING_IDX rp)

{
- struct xen_netif_extra_info *extra;
+ struct xen_netif_extra_info extra;
struct device *dev = &queue->info->netdev->dev;
RING_IDX cons = queue->rx.rsp_cons;
int err = 0;
@@ -830,24 +830,22 @@ static int xennet_get_extras(struct netfront_queue *queue,
break;
}

- extra = (struct xen_netif_extra_info *)
- RING_GET_RESPONSE(&queue->rx, ++cons);
+ RING_COPY_RESPONSE(&queue->rx, ++cons, &extra);

- if (unlikely(!extra->type ||
- extra->type >= XEN_NETIF_EXTRA_TYPE_MAX)) {
+ if (unlikely(!extra.type ||
+ extra.type >= XEN_NETIF_EXTRA_TYPE_MAX)) {
if (net_ratelimit())
dev_warn(dev, "Invalid extra type: %d\n",
- extra->type);
+ extra.type);
err = -EINVAL;
} else {
- memcpy(&extras[extra->type - 1], extra,
- sizeof(*extra));
+ extras[extra.type - 1] = extra;
}

skb = xennet_get_rx_skb(queue, cons);
ref = xennet_get_rx_ref(queue, cons);
xennet_move_rx_slot(queue, skb, ref);
- } while (extra->flags & XEN_NETIF_EXTRA_FLAG_MORE);
+ } while (extra.flags & XEN_NETIF_EXTRA_FLAG_MORE);

queue->rx.rsp_cons = cons;
return err;
@@ -905,7 +903,7 @@ static int xennet_get_responses(struct netfront_queue *queue,
struct sk_buff_head *list,
bool *need_xdp_flush)
{
- struct xen_netif_rx_response *rx = &rinfo->rx;
+ struct xen_netif_rx_response *rx = &rinfo->rx, rx_local;
int max = XEN_NETIF_NR_SLOTS_MIN + (rx->status <= RX_COPY_THRESHOLD);
RING_IDX cons = queue->rx.rsp_cons;
struct sk_buff *skb = xennet_get_rx_skb(queue, cons);
@@ -989,7 +987,8 @@ static int xennet_get_responses(struct netfront_queue *queue,
break;
}

- rx = RING_GET_RESPONSE(&queue->rx, cons + slots);
+ RING_COPY_RESPONSE(&queue->rx, cons + slots, &rx_local);
+ rx = &rx_local;
skb = xennet_get_rx_skb(queue, cons + slots);
ref = xennet_get_rx_ref(queue, cons + slots);
slots++;
@@ -1044,10 +1043,11 @@ static int xennet_fill_frags(struct netfront_queue *queue,
struct sk_buff *nskb;

while ((nskb = __skb_dequeue(list))) {
- struct xen_netif_rx_response *rx =
- RING_GET_RESPONSE(&queue->rx, ++cons);
+ struct xen_netif_rx_response rx;
skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];

+ RING_COPY_RESPONSE(&queue->rx, ++cons, &rx);
+
if (skb_shinfo(skb)->nr_frags == MAX_SKB_FRAGS) {
unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;

@@ -1062,7 +1062,7 @@ static int xennet_fill_frags(struct netfront_queue *queue,

skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
skb_frag_page(nfrag),
- rx->offset, rx->status, PAGE_SIZE);
+ rx.offset, rx.status, PAGE_SIZE);

skb_shinfo(nskb)->nr_frags = 0;
kfree_skb(nskb);
@@ -1161,7 +1161,7 @@ static int xennet_poll(struct napi_struct *napi, int budget)
i = queue->rx.rsp_cons;
work_done = 0;
while ((i != rp) && (work_done < budget)) {
- memcpy(rx, RING_GET_RESPONSE(&queue->rx, i), sizeof(*rx));
+ RING_COPY_RESPONSE(&queue->rx, i, rx);
memset(extras, 0, sizeof(rinfo.extras));

err = xennet_get_responses(queue, &rinfo, rp, &tmpq,
--
2.26.2

2021-08-24 10:30:44

by Jürgen Groß

[permalink] [raw]
Subject: [PATCH v2 4/4] xen/netfront: don't trust the backend response data blindly

Today netfront will trust the backend to send only sane response data.
In order to avoid privilege escalations or crashes in case of malicious
backends verify the data to be within expected limits. Especially make
sure that the response always references an outstanding request.

Note that only the tx queue needs special id handling, as for the rx
queue the id is equal to the index in the ring page.

Introduce a new indicator for the device whether it is broken and let
the device stop working when it is set. Set this indicator in case the
backend sets any weird data.

Signed-off-by: Juergen Gross <[email protected]>
---
V2:
- set the pending flag only just before sending the request (Jan Beulich)
- reset broken indicator during connect (Jan Beulich)
---
drivers/net/xen-netfront.c | 89 +++++++++++++++++++++++++++++++++++---
1 file changed, 84 insertions(+), 5 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 956e1266bd1a..e31b98403f31 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -131,10 +131,12 @@ struct netfront_queue {
struct sk_buff *tx_skbs[NET_TX_RING_SIZE];
unsigned short tx_link[NET_TX_RING_SIZE];
#define TX_LINK_NONE 0xffff
+#define TX_PENDING 0xfffe
grant_ref_t gref_tx_head;
grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
struct page *grant_tx_page[NET_TX_RING_SIZE];
unsigned tx_skb_freelist;
+ unsigned int tx_pend_queue;

spinlock_t rx_lock ____cacheline_aligned_in_smp;
struct xen_netif_rx_front_ring rx;
@@ -167,6 +169,9 @@ struct netfront_info {
bool netback_has_xdp_headroom;
bool netfront_xdp_enabled;

+ /* Is device behaving sane? */
+ bool broken;
+
atomic_t rx_gso_checksum_fixup;
};

@@ -349,7 +354,7 @@ static int xennet_open(struct net_device *dev)
unsigned int i = 0;
struct netfront_queue *queue = NULL;

- if (!np->queues)
+ if (!np->queues || np->broken)
return -ENODEV;

for (i = 0; i < num_queues; ++i) {
@@ -377,11 +382,17 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
unsigned short id;
struct sk_buff *skb;
bool more_to_do;
+ const struct device *dev = &queue->info->netdev->dev;

BUG_ON(!netif_carrier_ok(queue->info->netdev));

do {
prod = queue->tx.sring->rsp_prod;
+ if (RING_RESPONSE_PROD_OVERFLOW(&queue->tx, prod)) {
+ dev_alert(dev, "Illegal number of responses %u\n",
+ prod - queue->tx.rsp_cons);
+ goto err;
+ }
rmb(); /* Ensure we see responses up to 'rp'. */

for (cons = queue->tx.rsp_cons; cons != prod; cons++) {
@@ -391,14 +402,27 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
if (txrsp.status == XEN_NETIF_RSP_NULL)
continue;

- id = txrsp.id;
+ id = txrsp.id;
+ if (id >= RING_SIZE(&queue->tx)) {
+ dev_alert(dev,
+ "Response has incorrect id (%u)\n",
+ id);
+ goto err;
+ }
+ if (queue->tx_link[id] != TX_PENDING) {
+ dev_alert(dev,
+ "Response for inactive request\n");
+ goto err;
+ }
+
+ queue->tx_link[id] = TX_LINK_NONE;
skb = queue->tx_skbs[id];
queue->tx_skbs[id] = NULL;
if (unlikely(gnttab_query_foreign_access(
queue->grant_tx_ref[id]) != 0)) {
- pr_alert("%s: warning -- grant still in use by backend domain\n",
- __func__);
- BUG();
+ dev_alert(dev,
+ "Grant still in use by backend domain\n");
+ goto err;
}
gnttab_end_foreign_access_ref(
queue->grant_tx_ref[id], GNTMAP_readonly);
@@ -416,6 +440,12 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
} while (more_to_do);

xennet_maybe_wake_tx(queue);
+
+ return;
+
+ err:
+ queue->info->broken = true;
+ dev_alert(dev, "Disabled for further use\n");
}

struct xennet_gnttab_make_txreq {
@@ -459,6 +489,12 @@ static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset,

*tx = info->tx_local;

+ /*
+ * Put the request in the pending queue, it will be set to be pending
+ * when the producer index is about to be raised.
+ */
+ add_id_to_list(&queue->tx_pend_queue, queue->tx_link, id);
+
info->tx = tx;
info->size += info->tx_local.size;
}
@@ -551,6 +587,15 @@ static u16 xennet_select_queue(struct net_device *dev, struct sk_buff *skb,
return queue_idx;
}

+static void xennet_mark_tx_pending(struct netfront_queue *queue)
+{
+ unsigned int i;
+
+ while ((i = get_id_from_list(&queue->tx_pend_queue, queue->tx_link)) !=
+ TX_LINK_NONE)
+ queue->tx_link[i] = TX_PENDING;
+}
+
static int xennet_xdp_xmit_one(struct net_device *dev,
struct netfront_queue *queue,
struct xdp_frame *xdpf)
@@ -568,6 +613,8 @@ static int xennet_xdp_xmit_one(struct net_device *dev,
offset_in_page(xdpf->data),
xdpf->len);

+ xennet_mark_tx_pending(queue);
+
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&queue->tx, notify);
if (notify)
notify_remote_via_irq(queue->tx_irq);
@@ -592,6 +639,8 @@ static int xennet_xdp_xmit(struct net_device *dev, int n,
int nxmit = 0;
int i;

+ if (unlikely(np->broken))
+ return -ENODEV;
if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
return -EINVAL;

@@ -636,6 +685,8 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
/* Drop the packet if no queues are set up */
if (num_queues < 1)
goto drop;
+ if (unlikely(np->broken))
+ goto drop;
/* Determine which queue to transmit this SKB on */
queue_index = skb_get_queue_mapping(skb);
queue = &np->queues[queue_index];
@@ -742,6 +793,8 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
/* timestamp packet in software */
skb_tx_timestamp(skb);

+ xennet_mark_tx_pending(queue);
+
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(&queue->tx, notify);
if (notify)
notify_remote_via_irq(queue->tx_irq);
@@ -1141,6 +1194,13 @@ static int xennet_poll(struct napi_struct *napi, int budget)
skb_queue_head_init(&tmpq);

rp = queue->rx.sring->rsp_prod;
+ if (RING_RESPONSE_PROD_OVERFLOW(&queue->rx, rp)) {
+ dev_alert(&dev->dev, "Illegal number of responses %u\n",
+ rp - queue->rx.rsp_cons);
+ queue->info->broken = true;
+ spin_unlock(&queue->rx_lock);
+ return 0;
+ }
rmb(); /* Ensure we see queued responses up to 'rp'. */

i = queue->rx.rsp_cons;
@@ -1362,6 +1422,9 @@ static irqreturn_t xennet_tx_interrupt(int irq, void *dev_id)
struct netfront_queue *queue = dev_id;
unsigned long flags;

+ if (queue->info->broken)
+ return IRQ_HANDLED;
+
spin_lock_irqsave(&queue->tx_lock, flags);
xennet_tx_buf_gc(queue);
spin_unlock_irqrestore(&queue->tx_lock, flags);
@@ -1374,6 +1437,9 @@ static irqreturn_t xennet_rx_interrupt(int irq, void *dev_id)
struct netfront_queue *queue = dev_id;
struct net_device *dev = queue->info->netdev;

+ if (queue->info->broken)
+ return IRQ_HANDLED;
+
if (likely(netif_carrier_ok(dev) &&
RING_HAS_UNCONSUMED_RESPONSES(&queue->rx)))
napi_schedule(&queue->napi);
@@ -1395,6 +1461,10 @@ static void xennet_poll_controller(struct net_device *dev)
struct netfront_info *info = netdev_priv(dev);
unsigned int num_queues = dev->real_num_tx_queues;
unsigned int i;
+
+ if (info->broken)
+ return;
+
for (i = 0; i < num_queues; ++i)
xennet_interrupt(0, &info->queues[i]);
}
@@ -1466,6 +1536,11 @@ static int xennet_xdp_set(struct net_device *dev, struct bpf_prog *prog,

static int xennet_xdp(struct net_device *dev, struct netdev_bpf *xdp)
{
+ struct netfront_info *np = netdev_priv(dev);
+
+ if (np->broken)
+ return -ENODEV;
+
switch (xdp->command) {
case XDP_SETUP_PROG:
return xennet_xdp_set(dev, xdp->prog, xdp->extack);
@@ -1841,6 +1916,7 @@ static int xennet_init_queue(struct netfront_queue *queue)

/* Initialise tx_skb_freelist as a free chain containing every entry. */
queue->tx_skb_freelist = 0;
+ queue->tx_pend_queue = TX_LINK_NONE;
for (i = 0; i < NET_TX_RING_SIZE; i++) {
queue->tx_link[i] = i + 1;
queue->grant_tx_ref[i] = GRANT_INVALID_REF;
@@ -2115,6 +2191,9 @@ static int talk_to_netback(struct xenbus_device *dev,
if (info->queues)
xennet_destroy_queues(info);

+ /* For the case of a reconnect reset the "broken" indicator. */
+ info->broken = false;
+
err = xennet_create_queues(info, &num_queues);
if (err < 0) {
xenbus_dev_fatal(dev, err, "creating queues");
--
2.26.2

2021-08-24 10:31:30

by Jürgen Groß

[permalink] [raw]
Subject: [PATCH v2 2/4] xen/netfront: don't read data from request on the ring page

In order to avoid a malicious backend being able to influence the local
processing of a request build the request locally first and then copy
it to the ring page. Any reading from the request influencing the
processing in the frontend needs to be done on the local instance.

Signed-off-by: Juergen Gross <[email protected]>
---
V2:
- drop local tx variable (Jan Beulich)
- fix oversight of reading value from ring page (Jan Beulich)
---
drivers/net/xen-netfront.c | 86 +++++++++++++++++++-------------------
1 file changed, 42 insertions(+), 44 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 003cdf2ffc92..714fe9d2c534 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -435,7 +435,8 @@ struct xennet_gnttab_make_txreq {
struct netfront_queue *queue;
struct sk_buff *skb;
struct page *page;
- struct xen_netif_tx_request *tx; /* Last request */
+ struct xen_netif_tx_request *tx; /* Last request on ring page */
+ struct xen_netif_tx_request tx_local; /* Last request local copy*/
unsigned int size;
};

@@ -463,30 +464,27 @@ static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset,
queue->grant_tx_page[id] = page;
queue->grant_tx_ref[id] = ref;

- tx->id = id;
- tx->gref = ref;
- tx->offset = offset;
- tx->size = len;
- tx->flags = 0;
+ info->tx_local.id = id;
+ info->tx_local.gref = ref;
+ info->tx_local.offset = offset;
+ info->tx_local.size = len;
+ info->tx_local.flags = 0;
+
+ *tx = info->tx_local;

info->tx = tx;
- info->size += tx->size;
+ info->size += info->tx_local.size;
}

static struct xen_netif_tx_request *xennet_make_first_txreq(
- struct netfront_queue *queue, struct sk_buff *skb,
- struct page *page, unsigned int offset, unsigned int len)
+ struct xennet_gnttab_make_txreq *info,
+ unsigned int offset, unsigned int len)
{
- struct xennet_gnttab_make_txreq info = {
- .queue = queue,
- .skb = skb,
- .page = page,
- .size = 0,
- };
+ info->size = 0;

- gnttab_for_one_grant(page, offset, len, xennet_tx_setup_grant, &info);
+ gnttab_for_one_grant(info->page, offset, len, xennet_tx_setup_grant, info);

- return info.tx;
+ return info->tx;
}

static void xennet_make_one_txreq(unsigned long gfn, unsigned int offset,
@@ -499,35 +497,27 @@ static void xennet_make_one_txreq(unsigned long gfn, unsigned int offset,
xennet_tx_setup_grant(gfn, offset, len, data);
}

-static struct xen_netif_tx_request *xennet_make_txreqs(
- struct netfront_queue *queue, struct xen_netif_tx_request *tx,
- struct sk_buff *skb, struct page *page,
+static void xennet_make_txreqs(
+ struct xennet_gnttab_make_txreq *info,
+ struct page *page,
unsigned int offset, unsigned int len)
{
- struct xennet_gnttab_make_txreq info = {
- .queue = queue,
- .skb = skb,
- .tx = tx,
- };
-
/* Skip unused frames from start of page */
page += offset >> PAGE_SHIFT;
offset &= ~PAGE_MASK;

while (len) {
- info.page = page;
- info.size = 0;
+ info->page = page;
+ info->size = 0;

gnttab_foreach_grant_in_range(page, offset, len,
xennet_make_one_txreq,
- &info);
+ info);

page++;
offset = 0;
- len -= info.size;
+ len -= info->size;
}
-
- return info.tx;
}

/*
@@ -580,10 +570,14 @@ static int xennet_xdp_xmit_one(struct net_device *dev,
{
struct netfront_info *np = netdev_priv(dev);
struct netfront_stats *tx_stats = this_cpu_ptr(np->tx_stats);
+ struct xennet_gnttab_make_txreq info = {
+ .queue = queue,
+ .skb = NULL,
+ .page = virt_to_page(xdpf->data),
+ };
int notify;

- xennet_make_first_txreq(queue, NULL,
- virt_to_page(xdpf->data),
+ xennet_make_first_txreq(&info,
offset_in_page(xdpf->data),
xdpf->len);

@@ -638,7 +632,7 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
{
struct netfront_info *np = netdev_priv(dev);
struct netfront_stats *tx_stats = this_cpu_ptr(np->tx_stats);
- struct xen_netif_tx_request *tx, *first_tx;
+ struct xen_netif_tx_request *first_tx;
unsigned int i;
int notify;
int slots;
@@ -647,6 +641,7 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
unsigned int len;
unsigned long flags;
struct netfront_queue *queue = NULL;
+ struct xennet_gnttab_make_txreq info = { };
unsigned int num_queues = dev->real_num_tx_queues;
u16 queue_index;
struct sk_buff *nskb;
@@ -704,21 +699,24 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
}

/* First request for the linear area. */
- first_tx = tx = xennet_make_first_txreq(queue, skb,
- page, offset, len);
- offset += tx->size;
+ info.queue = queue;
+ info.skb = skb;
+ info.page = page;
+ first_tx = xennet_make_first_txreq(&info, offset, len);
+ offset += info.tx_local.size;
if (offset == PAGE_SIZE) {
page++;
offset = 0;
}
- len -= tx->size;
+ len -= info.tx_local.size;

if (skb->ip_summed == CHECKSUM_PARTIAL)
/* local packet? */
- tx->flags |= XEN_NETTXF_csum_blank | XEN_NETTXF_data_validated;
+ first_tx->flags |= XEN_NETTXF_csum_blank |
+ XEN_NETTXF_data_validated;
else if (skb->ip_summed == CHECKSUM_UNNECESSARY)
/* remote but checksummed. */
- tx->flags |= XEN_NETTXF_data_validated;
+ first_tx->flags |= XEN_NETTXF_data_validated;

/* Optional extra info after the first request. */
if (skb_shinfo(skb)->gso_size) {
@@ -727,7 +725,7 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
gso = (struct xen_netif_extra_info *)
RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);

- tx->flags |= XEN_NETTXF_extra_info;
+ first_tx->flags |= XEN_NETTXF_extra_info;

gso->u.gso.size = skb_shinfo(skb)->gso_size;
gso->u.gso.type = (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6) ?
@@ -741,12 +739,12 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
}

/* Requests for the rest of the linear area. */
- tx = xennet_make_txreqs(queue, tx, skb, page, offset, len);
+ xennet_make_txreqs(&info, page, offset, len);

/* Requests for all the frags. */
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- tx = xennet_make_txreqs(queue, tx, skb, skb_frag_page(frag),
+ xennet_make_txreqs(&info, skb_frag_page(frag),
skb_frag_off(frag),
skb_frag_size(frag));
}
--
2.26.2

2021-08-24 10:32:11

by Jürgen Groß

[permalink] [raw]
Subject: [PATCH v2 3/4] xen/netfront: disentangle tx_skb_freelist

The tx_skb_freelist elements are in a single linked list with the
request id used as link reference. The per element link field is in a
union with the skb pointer of an in use request.

Move the link reference out of the union in order to enable a later
reuse of it for requests which need a populated skb pointer.

Rename add_id_to_freelist() and get_id_from_freelist() to
add_id_to_list() and get_id_from_list() in order to prepare using
those for other lists as well. Define ~0 as value to indicate the end
of a list and place that value into the link for a request not being
on the list.

When freeing a skb zero the skb pointer in the request. Use a NULL
value of the skb pointer instead of skb_entry_is_link() for deciding
whether a request has a skb linked to it.

Remove skb_entry_set_link() and open code it instead as it is really
trivial now.

Signed-off-by: Juergen Gross <[email protected]>
---
V2:
- new patch
---
drivers/net/xen-netfront.c | 61 ++++++++++++++++----------------------
1 file changed, 25 insertions(+), 36 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 714fe9d2c534..956e1266bd1a 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -126,17 +126,11 @@ struct netfront_queue {

/*
* {tx,rx}_skbs store outstanding skbuffs. Free tx_skb entries
- * are linked from tx_skb_freelist through skb_entry.link.
- *
- * NB. Freelist index entries are always going to be less than
- * PAGE_OFFSET, whereas pointers to skbs will always be equal or
- * greater than PAGE_OFFSET: we use this property to distinguish
- * them.
+ * are linked from tx_skb_freelist through tx_link.
*/
- union skb_entry {
- struct sk_buff *skb;
- unsigned long link;
- } tx_skbs[NET_TX_RING_SIZE];
+ struct sk_buff *tx_skbs[NET_TX_RING_SIZE];
+ unsigned short tx_link[NET_TX_RING_SIZE];
+#define TX_LINK_NONE 0xffff
grant_ref_t gref_tx_head;
grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
struct page *grant_tx_page[NET_TX_RING_SIZE];
@@ -181,33 +175,25 @@ struct netfront_rx_info {
struct xen_netif_extra_info extras[XEN_NETIF_EXTRA_TYPE_MAX - 1];
};

-static void skb_entry_set_link(union skb_entry *list, unsigned short id)
-{
- list->link = id;
-}
-
-static int skb_entry_is_link(const union skb_entry *list)
-{
- BUILD_BUG_ON(sizeof(list->skb) != sizeof(list->link));
- return (unsigned long)list->skb < PAGE_OFFSET;
-}
-
/*
* Access macros for acquiring freeing slots in tx_skbs[].
*/

-static void add_id_to_freelist(unsigned *head, union skb_entry *list,
- unsigned short id)
+static void add_id_to_list(unsigned *head, unsigned short *list,
+ unsigned short id)
{
- skb_entry_set_link(&list[id], *head);
+ list[id] = *head;
*head = id;
}

-static unsigned short get_id_from_freelist(unsigned *head,
- union skb_entry *list)
+static unsigned short get_id_from_list(unsigned *head, unsigned short *list)
{
unsigned int id = *head;
- *head = list[id].link;
+
+ if (id != TX_LINK_NONE) {
+ *head = list[id];
+ list[id] = TX_LINK_NONE;
+ }
return id;
}

@@ -406,7 +392,8 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
continue;

id = txrsp.id;
- skb = queue->tx_skbs[id].skb;
+ skb = queue->tx_skbs[id];
+ queue->tx_skbs[id] = NULL;
if (unlikely(gnttab_query_foreign_access(
queue->grant_tx_ref[id]) != 0)) {
pr_alert("%s: warning -- grant still in use by backend domain\n",
@@ -419,7 +406,7 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
&queue->gref_tx_head, queue->grant_tx_ref[id]);
queue->grant_tx_ref[id] = GRANT_INVALID_REF;
queue->grant_tx_page[id] = NULL;
- add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, id);
+ add_id_to_list(&queue->tx_skb_freelist, queue->tx_link, id);
dev_kfree_skb_irq(skb);
}

@@ -452,7 +439,7 @@ static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset,
struct netfront_queue *queue = info->queue;
struct sk_buff *skb = info->skb;

- id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
+ id = get_id_from_list(&queue->tx_skb_freelist, queue->tx_link);
tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
ref = gnttab_claim_grant_reference(&queue->gref_tx_head);
WARN_ON_ONCE(IS_ERR_VALUE((unsigned long)(int)ref));
@@ -460,7 +447,7 @@ static void xennet_tx_setup_grant(unsigned long gfn, unsigned int offset,
gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
gfn, GNTMAP_readonly);

- queue->tx_skbs[id].skb = skb;
+ queue->tx_skbs[id] = skb;
queue->grant_tx_page[id] = page;
queue->grant_tx_ref[id] = ref;

@@ -1284,17 +1271,18 @@ static void xennet_release_tx_bufs(struct netfront_queue *queue)

for (i = 0; i < NET_TX_RING_SIZE; i++) {
/* Skip over entries which are actually freelist references */
- if (skb_entry_is_link(&queue->tx_skbs[i]))
+ if (!queue->tx_skbs[i])
continue;

- skb = queue->tx_skbs[i].skb;
+ skb = queue->tx_skbs[i];
+ queue->tx_skbs[i] = NULL;
get_page(queue->grant_tx_page[i]);
gnttab_end_foreign_access(queue->grant_tx_ref[i],
GNTMAP_readonly,
(unsigned long)page_address(queue->grant_tx_page[i]));
queue->grant_tx_page[i] = NULL;
queue->grant_tx_ref[i] = GRANT_INVALID_REF;
- add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, i);
+ add_id_to_list(&queue->tx_skb_freelist, queue->tx_link, i);
dev_kfree_skb_irq(skb);
}
}
@@ -1851,13 +1839,14 @@ static int xennet_init_queue(struct netfront_queue *queue)
snprintf(queue->name, sizeof(queue->name), "vif%s-q%u",
devid, queue->id);

- /* Initialise tx_skbs as a free chain containing every entry. */
+ /* Initialise tx_skb_freelist as a free chain containing every entry. */
queue->tx_skb_freelist = 0;
for (i = 0; i < NET_TX_RING_SIZE; i++) {
- skb_entry_set_link(&queue->tx_skbs[i], i+1);
+ queue->tx_link[i] = i + 1;
queue->grant_tx_ref[i] = GRANT_INVALID_REF;
queue->grant_tx_page[i] = NULL;
}
+ queue->tx_link[NET_TX_RING_SIZE - 1] = TX_LINK_NONE;

/* Clear out rx_skbs */
for (i = 0; i < NET_RX_RING_SIZE; i++) {
--
2.26.2

2021-08-24 15:26:27

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] xen/netfront: don't read data from request on the ring page

On 24.08.2021 12:28, Juergen Gross wrote:
> In order to avoid a malicious backend being able to influence the local
> processing of a request build the request locally first and then copy
> it to the ring page. Any reading from the request influencing the
> processing in the frontend needs to be done on the local instance.
>
> Signed-off-by: Juergen Gross <[email protected]>

Reviewed-by: Jan Beulich <[email protected]>

2021-08-24 15:35:26

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] xen/netfront: don't trust the backend response data blindly

On 24.08.2021 12:28, Juergen Gross wrote:
> Today netfront will trust the backend to send only sane response data.
> In order to avoid privilege escalations or crashes in case of malicious
> backends verify the data to be within expected limits. Especially make
> sure that the response always references an outstanding request.
>
> Note that only the tx queue needs special id handling, as for the rx
> queue the id is equal to the index in the ring page.
>
> Introduce a new indicator for the device whether it is broken and let
> the device stop working when it is set. Set this indicator in case the
> backend sets any weird data.
>
> Signed-off-by: Juergen Gross <[email protected]>

Reviewed-by: Jan Beulich <[email protected]>
with a small suggestion:

> V2:
> - set the pending flag only just before sending the request (Jan Beulich)
> - reset broken indicator during connect (Jan Beulich)

With this latter adjustment, would it make sense to ...

> @@ -416,6 +440,12 @@ static void xennet_tx_buf_gc(struct netfront_queue *queue)
> } while (more_to_do);
>
> xennet_maybe_wake_tx(queue);
> +
> + return;
> +
> + err:
> + queue->info->broken = true;
> + dev_alert(dev, "Disabled for further use\n");

... hint at that behavior here, e.g. by adding "(until reconnect)"?
That way an admin observing this log message will have an immediate
hint at how to get the interface to work again (if so desired).

Jan

2021-08-24 15:36:24

by Jan Beulich

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] xen: harden netfront against malicious backends

On 24.08.2021 12:28, Juergen Gross wrote:
> It should be mentioned that a similar series has been posted some years
> ago by Marek Marczykowski-Górecki, but this series has not been applied
> due to a Xen header not having been available in the Xen git repo at
> that time. Additionally my series is fixing some more DoS cases.

With this, wouldn't it have made sense to Cc Marek, in case he wants to
check against his own version (which over time may have evolved and
hence not match the earlier submission anymore)?

Jan

2021-08-25 10:02:42

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] xen: harden netfront against malicious backends

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Tue, 24 Aug 2021 12:28:05 +0200 you wrote:
> Xen backends of para-virtualized devices can live in dom0 kernel, dom0
> user land, or in a driver domain. This means that a backend might
> reside in a less trusted environment than the Xen core components, so
> a backend should not be able to do harm to a Xen guest (it can still
> mess up I/O data, but it shouldn't be able to e.g. crash a guest by
> other means or cause a privilege escalation in the guest).
>
> [...]

Here is the summary with links:
- [v2,1/4] xen/netfront: read response from backend only once
https://git.kernel.org/netdev/net-next/c/8446066bf8c1
- [v2,2/4] xen/netfront: don't read data from request on the ring page
https://git.kernel.org/netdev/net-next/c/162081ec33c2
- [v2,3/4] xen/netfront: disentangle tx_skb_freelist
https://git.kernel.org/netdev/net-next/c/21631d2d741a
- [v2,4/4] xen/netfront: don't trust the backend response data blindly
https://git.kernel.org/netdev/net-next/c/a884daa61a7d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


Subject: Re: [PATCH v2 0/4] xen: harden netfront against malicious backends

On Tue, Aug 24, 2021 at 05:33:10PM +0200, Jan Beulich wrote:
> On 24.08.2021 12:28, Juergen Gross wrote:
> > It should be mentioned that a similar series has been posted some years
> > ago by Marek Marczykowski-Górecki, but this series has not been applied
> > due to a Xen header not having been available in the Xen git repo at
> > that time. Additionally my series is fixing some more DoS cases.
>
> With this, wouldn't it have made sense to Cc Marek, in case he wants to
> check against his own version (which over time may have evolved and
> hence not match the earlier submission anymore)?

I have compared this, and the blkfront series against my patches and
they seem to cover exactly the same set of issues. Besides one comment I
made separately, I think nothing is missing. Thanks!

BTW, shouldn't those those patches land in stable branches too? In some
threat models, I'd qualify them as security fixes.

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab


Attachments:
(No filename) (0.98 kB)
signature.asc (499.00 B)
Download all attachments

2021-09-10 11:12:25

by Jürgen Groß

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] xen: harden netfront against malicious backends

On 10.09.21 12:19, Marek Marczykowski-Górecki wrote:
> On Tue, Aug 24, 2021 at 05:33:10PM +0200, Jan Beulich wrote:
>> On 24.08.2021 12:28, Juergen Gross wrote:
>>> It should be mentioned that a similar series has been posted some years
>>> ago by Marek Marczykowski-Górecki, but this series has not been applied
>>> due to a Xen header not having been available in the Xen git repo at
>>> that time. Additionally my series is fixing some more DoS cases.
>>
>> With this, wouldn't it have made sense to Cc Marek, in case he wants to
>> check against his own version (which over time may have evolved and
>> hence not match the earlier submission anymore)?
>
> I have compared this, and the blkfront series against my patches and
> they seem to cover exactly the same set of issues. Besides one comment I
> made separately, I think nothing is missing. Thanks!
>
> BTW, shouldn't those those patches land in stable branches too? In some
> threat models, I'd qualify them as security fixes.

Its on my todo list.

Most stable branches will need backports, so this might require some
more time to get it finished.


Juergen


Attachments:
OpenPGP_0xB0DE9DD628BF132F.asc (3.06 kB)
OpenPGP public key
OpenPGP_signature (505.00 B)
OpenPGP digital signature
Download all attachments