2014-11-24 11:52:46

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 00/41] linux: towards virtio-1 guest support

Based on patches by Cornelia Rusty and others, but
with an API that should allow better static checking of code,
and slightly more concervative changes in vring,net and blk.

Based on patches by Cornelia and others, but
with an API that should allow better static checking of code,
slightly more concervative changes in vring and drivers,
and compatibility for existing drivers so that
this series be applied before all drivers are converted.

virtio net,blk and scsi drivers have been converted.
They now pass sparse without warnings.

net and blk patches have been tested on s390.
scsi patches pass sparse so they are most likely
ok too, but haven't been through testing yet -
they can be dropped from patchset if necessary.

Please review, and consider for 3.19

David, assuming patches are acceptable, I'd like them all to be merged through
virtio or vhost trees. Could you please ack doing that for net related
patches?

Changes since v2:
converted tun,macvtap and vhost net
partially converted vhost scsi: it still lacks ANY_LAYOUT support
Nicholas, any comments?

Cornelia Huck (3):
virtio: allow transports to get avail/used addresses
KVM: s390: virtio-ccw revision 1 SET_VQ
KVM: s390: enable virtio-ccw revision 1

Michael S. Tsirkin (35):
virtio: add virtio 1.0 feature bit
virtio: memory access APIs
virtio_ring: switch to new memory access APIs
virtio_config: endian conversion for v1.0
virtio: set FEATURES_OK
virtio: simplify feature bit handling
virtio: add legacy feature table support
virtio_net: v1.0 endianness
virtio_blk: v1.0 support
KVM: s390 allow virtio_ccw status writes to fail
virtio_blk: make serial attribute static
virtio_blk: fix race at module removal
virtio_net: pass vi around
virtio_net: get rid of virtio_net_hdr/skb_vnet_hdr
virtio_net: stricter short buffer length checks
virtio_net: bigger header when VERSION_1 is set
virtio_net: enable v1.0 support
vhost: add memory access wrappers
vhost/net: force len for TX to host endian
vhost: virtio 1.0 endian-ness support
vhost: make features 64 bit
vhost/net: virtio 1.0 byte swap
vhost/net: larger header for virtio 1.0
vhost/net: enable virtio 1.0
vhost/net: suppress compiler warning
tun: move internal flag defines out of uapi
tun: drop most type defines
tun: add VNET_LE flag
tun: TUN_VNET_LE support, fix sparse warnings for virtio headers
macvtap: TUN_VNET_HDR support
virtio_scsi: v1.0 support
virtio_scsi: move to uapi
virtio_scsi: export to userspace
vhost/scsi: partial virtio 1.0 support
af_packet: virtio 1.0 stubs

Rusty Russell (2):
virtio: use u32, not bitmap for struct virtio_device's features
virtio: add support for 64 bit features.

Thomas Huth (1):
KVM: s390: Set virtio-ccw transport revision

drivers/vhost/vhost.h | 35 ++++++-
include/linux/virtio.h | 10 +-
include/linux/virtio_byteorder.h | 59 +++++++++++
include/linux/virtio_config.h | 49 ++++++++--
include/uapi/linux/if_tun.h | 17 +---
include/uapi/linux/virtio_blk.h | 15 +--
include/uapi/linux/virtio_config.h | 9 +-
include/uapi/linux/virtio_net.h | 15 +--
include/uapi/linux/virtio_ring.h | 45 ++++-----
include/{ => uapi}/linux/virtio_scsi.h | 106 ++++++++++----------
include/uapi/linux/virtio_types.h | 48 +++++++++
tools/virtio/linux/virtio.h | 22 +----
tools/virtio/linux/virtio_config.h | 2 +-
drivers/block/virtio_blk.c | 75 ++++++++------
drivers/char/virtio_console.c | 2 +-
drivers/lguest/lguest_device.c | 16 +--
drivers/net/macvtap.c | 68 ++++++++-----
drivers/net/tun.c | 170 ++++++++++++++------------------
drivers/net/virtio_net.c | 159 +++++++++++++++---------------
drivers/remoteproc/remoteproc_virtio.c | 7 +-
drivers/s390/kvm/kvm_virtio.c | 10 +-
drivers/s390/kvm/virtio_ccw.c | 172 +++++++++++++++++++++++++++------
drivers/scsi/virtio_scsi.c | 51 ++++++----
drivers/vhost/net.c | 28 +++---
drivers/vhost/scsi.c | 22 +++--
drivers/vhost/vhost.c | 93 +++++++++++-------
drivers/virtio/virtio.c | 74 ++++++++++----
drivers/virtio/virtio_mmio.c | 20 ++--
drivers/virtio/virtio_pci.c | 8 +-
drivers/virtio/virtio_ring.c | 107 +++++++++++---------
net/packet/af_packet.c | 35 ++++---
tools/virtio/virtio_test.c | 5 +-
tools/virtio/vringh_test.c | 16 +--
include/uapi/linux/Kbuild | 2 +
34 files changed, 980 insertions(+), 592 deletions(-)
create mode 100644 include/linux/virtio_byteorder.h
rename include/{ => uapi}/linux/virtio_scsi.h (73%)
create mode 100644 include/uapi/linux/virtio_types.h

--
MST


2014-11-24 11:53:10

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 03/41] virtio: add virtio 1.0 feature bit

Based on original patches by Rusty Russell, Thomas Huth
and Cornelia Huck.

Note: at this time, we do not negotiate this feature bit
in core, drivers have to declare VERSION_1 support explicitly.

After all drivers are converted, we will be able to
move VERSION_1 to core and drop it from all drivers.

Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
---
include/uapi/linux/virtio_config.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index 3ce768c..f3fe33a 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -41,11 +41,11 @@
/* We've given up on this device. */
#define VIRTIO_CONFIG_S_FAILED 0x80

-/* Some virtio feature bits (currently bits 28 through 31) are reserved for the
+/* Some virtio feature bits (currently bits 28 through 32) are reserved for the
* transport being used (eg. virtio_ring), the rest are per-device feature
* bits. */
#define VIRTIO_TRANSPORT_F_START 28
-#define VIRTIO_TRANSPORT_F_END 32
+#define VIRTIO_TRANSPORT_F_END 33

/* Do we get callbacks when the ring is completely used, even if we've
* suppressed them? */
@@ -54,4 +54,7 @@
/* Can the device handle any descriptor layout? */
#define VIRTIO_F_ANY_LAYOUT 27

+/* v1.0 compliant. */
+#define VIRTIO_F_VERSION_1 32
+
#endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
--
MST

2014-11-24 11:53:21

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 05/41] virtio_ring: switch to new memory access APIs

Use virtioXX_to_cpu and friends for access to
all multibyte structures in memory.

Note: this is intentionally mechanical.
A follow-up patch will split long lines etc.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/virtio/virtio_ring.c | 89 ++++++++++++++++++++++----------------------
1 file changed, 45 insertions(+), 44 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 61a1fe1..b311fa7 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -99,7 +99,8 @@ struct vring_virtqueue

#define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)

-static struct vring_desc *alloc_indirect(unsigned int total_sg, gfp_t gfp)
+static struct vring_desc *alloc_indirect(struct virtqueue *_vq,
+ unsigned int total_sg, gfp_t gfp)
{
struct vring_desc *desc;
unsigned int i;
@@ -116,7 +117,7 @@ static struct vring_desc *alloc_indirect(unsigned int total_sg, gfp_t gfp)
return NULL;

for (i = 0; i < total_sg; i++)
- desc[i].next = i+1;
+ desc[i].next = cpu_to_virtio16(_vq->vdev, i + 1);
return desc;
}

@@ -165,17 +166,17 @@ static inline int virtqueue_add(struct virtqueue *_vq,
/* If the host supports indirect descriptor tables, and we have multiple
* buffers, then go indirect. FIXME: tune this threshold */
if (vq->indirect && total_sg > 1 && vq->vq.num_free)
- desc = alloc_indirect(total_sg, gfp);
+ desc = alloc_indirect(_vq, total_sg, gfp);
else
desc = NULL;

if (desc) {
/* Use a single buffer which doesn't continue */
- vq->vring.desc[head].flags = VRING_DESC_F_INDIRECT;
- vq->vring.desc[head].addr = virt_to_phys(desc);
+ vq->vring.desc[head].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_INDIRECT);
+ vq->vring.desc[head].addr = cpu_to_virtio64(_vq->vdev, virt_to_phys(desc));
/* avoid kmemleak false positive (hidden by virt_to_phys) */
kmemleak_ignore(desc);
- vq->vring.desc[head].len = total_sg * sizeof(struct vring_desc);
+ vq->vring.desc[head].len = cpu_to_virtio32(_vq->vdev, total_sg * sizeof(struct vring_desc));

/* Set up rest to use this indirect table. */
i = 0;
@@ -205,28 +206,28 @@ static inline int virtqueue_add(struct virtqueue *_vq,

for (n = 0; n < out_sgs; n++) {
for (sg = sgs[n]; sg; sg = sg_next(sg)) {
- desc[i].flags = VRING_DESC_F_NEXT;
- desc[i].addr = sg_phys(sg);
- desc[i].len = sg->length;
+ desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT);
+ desc[i].addr = cpu_to_virtio64(_vq->vdev, sg_phys(sg));
+ desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
prev = i;
- i = desc[i].next;
+ i = virtio16_to_cpu(_vq->vdev, desc[i].next);
}
}
for (; n < (out_sgs + in_sgs); n++) {
for (sg = sgs[n]; sg; sg = sg_next(sg)) {
- desc[i].flags = VRING_DESC_F_NEXT|VRING_DESC_F_WRITE;
- desc[i].addr = sg_phys(sg);
- desc[i].len = sg->length;
+ desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT | VRING_DESC_F_WRITE);
+ desc[i].addr = cpu_to_virtio64(_vq->vdev, sg_phys(sg));
+ desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
prev = i;
- i = desc[i].next;
+ i = virtio16_to_cpu(_vq->vdev, desc[i].next);
}
}
/* Last one doesn't continue. */
- desc[prev].flags &= ~VRING_DESC_F_NEXT;
+ desc[prev].flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);

/* Update free pointer */
if (indirect)
- vq->free_head = vq->vring.desc[head].next;
+ vq->free_head = virtio16_to_cpu(_vq->vdev, vq->vring.desc[head].next);
else
vq->free_head = i;

@@ -235,13 +236,13 @@ static inline int virtqueue_add(struct virtqueue *_vq,

/* Put entry in available array (but don't update avail->idx until they
* do sync). */
- avail = (vq->vring.avail->idx & (vq->vring.num-1));
- vq->vring.avail->ring[avail] = head;
+ avail = virtio16_to_cpu(_vq->vdev, vq->vring.avail->idx) & (vq->vring.num - 1);
+ vq->vring.avail->ring[avail] = cpu_to_virtio16(_vq->vdev, head);

/* Descriptors and available array need to be set before we expose the
* new available array entries. */
virtio_wmb(vq->weak_barriers);
- vq->vring.avail->idx++;
+ vq->vring.avail->idx = cpu_to_virtio16(_vq->vdev, virtio16_to_cpu(_vq->vdev, vq->vring.avail->idx) + 1);
vq->num_added++;

/* This is very unlikely, but theoretically possible. Kick
@@ -354,8 +355,8 @@ bool virtqueue_kick_prepare(struct virtqueue *_vq)
* event. */
virtio_mb(vq->weak_barriers);

- old = vq->vring.avail->idx - vq->num_added;
- new = vq->vring.avail->idx;
+ old = virtio16_to_cpu(_vq->vdev, vq->vring.avail->idx) - vq->num_added;
+ new = virtio16_to_cpu(_vq->vdev, vq->vring.avail->idx);
vq->num_added = 0;

#ifdef DEBUG
@@ -367,10 +368,10 @@ bool virtqueue_kick_prepare(struct virtqueue *_vq)
#endif

if (vq->event) {
- needs_kick = vring_need_event(vring_avail_event(&vq->vring),
+ needs_kick = vring_need_event(virtio16_to_cpu(_vq->vdev, vring_avail_event(&vq->vring)),
new, old);
} else {
- needs_kick = !(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY);
+ needs_kick = !(vq->vring.used->flags & cpu_to_virtio16(_vq->vdev, VRING_USED_F_NO_NOTIFY));
}
END_USE(vq);
return needs_kick;
@@ -432,15 +433,15 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)
i = head;

/* Free the indirect table */
- if (vq->vring.desc[i].flags & VRING_DESC_F_INDIRECT)
- kfree(phys_to_virt(vq->vring.desc[i].addr));
+ if (vq->vring.desc[i].flags & cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT))
+ kfree(phys_to_virt(virtio64_to_cpu(vq->vq.vdev, vq->vring.desc[i].addr)));

- while (vq->vring.desc[i].flags & VRING_DESC_F_NEXT) {
- i = vq->vring.desc[i].next;
+ while (vq->vring.desc[i].flags & cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_NEXT)) {
+ i = virtio16_to_cpu(vq->vq.vdev, vq->vring.desc[i].next);
vq->vq.num_free++;
}

- vq->vring.desc[i].next = vq->free_head;
+ vq->vring.desc[i].next = cpu_to_virtio16(vq->vq.vdev, vq->free_head);
vq->free_head = head;
/* Plus final descriptor */
vq->vq.num_free++;
@@ -448,7 +449,7 @@ static void detach_buf(struct vring_virtqueue *vq, unsigned int head)

static inline bool more_used(const struct vring_virtqueue *vq)
{
- return vq->last_used_idx != vq->vring.used->idx;
+ return vq->last_used_idx != virtio16_to_cpu(vq->vq.vdev, vq->vring.used->idx);
}

/**
@@ -491,8 +492,8 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
virtio_rmb(vq->weak_barriers);

last_used = (vq->last_used_idx & (vq->vring.num - 1));
- i = vq->vring.used->ring[last_used].id;
- *len = vq->vring.used->ring[last_used].len;
+ i = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].id);
+ *len = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].len);

if (unlikely(i >= vq->vring.num)) {
BAD_RING(vq, "id %u out of range\n", i);
@@ -510,8 +511,8 @@ void *virtqueue_get_buf(struct virtqueue *_vq, unsigned int *len)
/* If we expect an interrupt for the next entry, tell host
* by writing event index and flush out the write before
* the read in the next get_buf call. */
- if (!(vq->vring.avail->flags & VRING_AVAIL_F_NO_INTERRUPT)) {
- vring_used_event(&vq->vring) = vq->last_used_idx;
+ if (!(vq->vring.avail->flags & cpu_to_virtio16(_vq->vdev, VRING_AVAIL_F_NO_INTERRUPT))) {
+ vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, vq->last_used_idx);
virtio_mb(vq->weak_barriers);
}

@@ -537,7 +538,7 @@ void virtqueue_disable_cb(struct virtqueue *_vq)
{
struct vring_virtqueue *vq = to_vvq(_vq);

- vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
+ vq->vring.avail->flags |= cpu_to_virtio16(_vq->vdev, VRING_AVAIL_F_NO_INTERRUPT);
}
EXPORT_SYMBOL_GPL(virtqueue_disable_cb);

@@ -565,8 +566,8 @@ unsigned virtqueue_enable_cb_prepare(struct virtqueue *_vq)
/* Depending on the VIRTIO_RING_F_EVENT_IDX feature, we need to
* either clear the flags bit or point the event index at the next
* entry. Always do both to keep code simple. */
- vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
- vring_used_event(&vq->vring) = last_used_idx = vq->last_used_idx;
+ vq->vring.avail->flags &= cpu_to_virtio16(_vq->vdev, ~VRING_AVAIL_F_NO_INTERRUPT);
+ vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, last_used_idx = vq->last_used_idx);
END_USE(vq);
return last_used_idx;
}
@@ -586,7 +587,7 @@ bool virtqueue_poll(struct virtqueue *_vq, unsigned last_used_idx)
struct vring_virtqueue *vq = to_vvq(_vq);

virtio_mb(vq->weak_barriers);
- return (u16)last_used_idx != vq->vring.used->idx;
+ return (u16)last_used_idx != virtio16_to_cpu(_vq->vdev, vq->vring.used->idx);
}
EXPORT_SYMBOL_GPL(virtqueue_poll);

@@ -633,12 +634,12 @@ bool virtqueue_enable_cb_delayed(struct virtqueue *_vq)
/* Depending on the VIRTIO_RING_F_USED_EVENT_IDX feature, we need to
* either clear the flags bit or point the event index at the next
* entry. Always do both to keep code simple. */
- vq->vring.avail->flags &= ~VRING_AVAIL_F_NO_INTERRUPT;
+ vq->vring.avail->flags &= cpu_to_virtio16(_vq->vdev, ~VRING_AVAIL_F_NO_INTERRUPT);
/* TODO: tune this threshold */
- bufs = (u16)(vq->vring.avail->idx - vq->last_used_idx) * 3 / 4;
- vring_used_event(&vq->vring) = vq->last_used_idx + bufs;
+ bufs = (u16)(virtio16_to_cpu(_vq->vdev, vq->vring.avail->idx) - vq->last_used_idx) * 3 / 4;
+ vring_used_event(&vq->vring) = cpu_to_virtio16(_vq->vdev, vq->last_used_idx + bufs);
virtio_mb(vq->weak_barriers);
- if (unlikely((u16)(vq->vring.used->idx - vq->last_used_idx) > bufs)) {
+ if (unlikely((u16)(virtio16_to_cpu(_vq->vdev, vq->vring.used->idx) - vq->last_used_idx) > bufs)) {
END_USE(vq);
return false;
}
@@ -670,7 +671,7 @@ void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
/* detach_buf clears data, so grab it now. */
buf = vq->data[i];
detach_buf(vq, i);
- vq->vring.avail->idx--;
+ vq->vring.avail->idx = cpu_to_virtio16(_vq->vdev, virtio16_to_cpu(_vq->vdev, vq->vring.avail->idx) - 1);
END_USE(vq);
return buf;
}
@@ -747,12 +748,12 @@ struct virtqueue *vring_new_virtqueue(unsigned int index,

/* No callback? Tell other side not to bother us. */
if (!callback)
- vq->vring.avail->flags |= VRING_AVAIL_F_NO_INTERRUPT;
+ vq->vring.avail->flags |= cpu_to_virtio16(vdev, VRING_AVAIL_F_NO_INTERRUPT);

/* Put everything in free lists. */
vq->free_head = 0;
for (i = 0; i < num-1; i++) {
- vq->vring.desc[i].next = i+1;
+ vq->vring.desc[i].next = cpu_to_virtio16(vdev, i + 1);
vq->data[i] = NULL;
}
vq->data[i] = NULL;
--
MST

2014-11-24 11:53:29

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 06/41] virtio_config: endian conversion for v1.0

We (ab)use virtio conversion functions for device-specific
config space accesses.

Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio_config.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index b9cd689..b50c4a1 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -271,12 +271,13 @@ static inline u16 virtio_cread16(struct virtio_device *vdev,
{
u16 ret;
vdev->config->get(vdev, offset, &ret, sizeof(ret));
- return ret;
+ return virtio16_to_cpu(vdev, (__force __virtio16)ret);
}

static inline void virtio_cwrite16(struct virtio_device *vdev,
unsigned int offset, u16 val)
{
+ val = (__force u16)cpu_to_virtio16(vdev, val);
vdev->config->set(vdev, offset, &val, sizeof(val));
}

@@ -285,12 +286,13 @@ static inline u32 virtio_cread32(struct virtio_device *vdev,
{
u32 ret;
vdev->config->get(vdev, offset, &ret, sizeof(ret));
- return ret;
+ return virtio32_to_cpu(vdev, (__force __virtio32)ret);
}

static inline void virtio_cwrite32(struct virtio_device *vdev,
unsigned int offset, u32 val)
{
+ val = (__force u32)cpu_to_virtio32(vdev, val);
vdev->config->set(vdev, offset, &val, sizeof(val));
}

@@ -299,12 +301,13 @@ static inline u64 virtio_cread64(struct virtio_device *vdev,
{
u64 ret;
vdev->config->get(vdev, offset, &ret, sizeof(ret));
- return ret;
+ return virtio64_to_cpu(vdev, (__force __virtio64)ret);
}

static inline void virtio_cwrite64(struct virtio_device *vdev,
unsigned int offset, u64 val)
{
+ val = (__force u64)cpu_to_virtio64(vdev, val);
vdev->config->set(vdev, offset, &val, sizeof(val));
}

--
MST

2014-11-24 11:53:24

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 01/41] virtio: use u32, not bitmap for struct virtio_device's features

From: Rusty Russell <[email protected]>

It seemed like a good idea, but it's actually a pain when we get more
than 32 feature bits. Just change it to a u32 for now.

Cc: Brian Swetland <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Acked-by: Pawel Moll <[email protected]>
Acked-by: Ohad Ben-Cohen <[email protected]>

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio.h | 3 +--
include/linux/virtio_config.h | 2 +-
tools/virtio/linux/virtio.h | 22 +---------------------
tools/virtio/linux/virtio_config.h | 2 +-
drivers/char/virtio_console.c | 2 +-
drivers/lguest/lguest_device.c | 8 ++++----
drivers/remoteproc/remoteproc_virtio.c | 2 +-
drivers/s390/kvm/kvm_virtio.c | 2 +-
drivers/s390/kvm/virtio_ccw.c | 23 +++++++++--------------
drivers/virtio/virtio.c | 10 +++++-----
drivers/virtio/virtio_mmio.c | 8 ++------
drivers/virtio/virtio_pci.c | 3 +--
drivers/virtio/virtio_ring.c | 2 +-
tools/virtio/virtio_test.c | 5 ++---
tools/virtio/vringh_test.c | 16 ++++++++--------
15 files changed, 39 insertions(+), 71 deletions(-)

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 65261a7..7828a7f 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -101,8 +101,7 @@ struct virtio_device {
const struct virtio_config_ops *config;
const struct vringh_config_ops *vringh_config;
struct list_head vqs;
- /* Note that this is a Linux set_bit-style bitmap. */
- unsigned long features[1];
+ u32 features;
void *priv;
};

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 7f4ef66..aa84d0e 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -93,7 +93,7 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
if (fbit < VIRTIO_TRANSPORT_F_START)
virtio_check_driver_offered_feature(vdev, fbit);

- return test_bit(fbit, vdev->features);
+ return vdev->features & (1 << fbit);
}

static inline
diff --git a/tools/virtio/linux/virtio.h b/tools/virtio/linux/virtio.h
index 5a2d1f0..72bff70 100644
--- a/tools/virtio/linux/virtio.h
+++ b/tools/virtio/linux/virtio.h
@@ -6,31 +6,11 @@
/* TODO: empty stubs for now. Broken but enough for virtio_ring.c */
#define list_add_tail(a, b) do {} while (0)
#define list_del(a) do {} while (0)
-
-#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
-#define BITS_PER_BYTE 8
-#define BITS_PER_LONG (sizeof(long) * BITS_PER_BYTE)
-#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
-
-/* TODO: Not atomic as it should be:
- * we don't use this for anything important. */
-static inline void clear_bit(int nr, volatile unsigned long *addr)
-{
- unsigned long mask = BIT_MASK(nr);
- unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-
- *p &= ~mask;
-}
-
-static inline int test_bit(int nr, const volatile unsigned long *addr)
-{
- return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
-}
/* end of stubs */

struct virtio_device {
void *dev;
- unsigned long features[1];
+ u32 features;
};

struct virtqueue {
diff --git a/tools/virtio/linux/virtio_config.h b/tools/virtio/linux/virtio_config.h
index 5049967..1f1636b 100644
--- a/tools/virtio/linux/virtio_config.h
+++ b/tools/virtio/linux/virtio_config.h
@@ -2,5 +2,5 @@
#define VIRTIO_TRANSPORT_F_END 32

#define virtio_has_feature(dev, feature) \
- test_bit((feature), (dev)->features)
+ ((dev)->features & (1 << feature))

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index cf7a561..0074f9b 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -355,7 +355,7 @@ static inline bool use_multiport(struct ports_device *portdev)
*/
if (!portdev->vdev)
return 0;
- return portdev->vdev->features[0] & (1 << VIRTIO_CONSOLE_F_MULTIPORT);
+ return portdev->vdev->features & (1 << VIRTIO_CONSOLE_F_MULTIPORT);
}

static DEFINE_SPINLOCK(dma_bufs_lock);
diff --git a/drivers/lguest/lguest_device.c b/drivers/lguest/lguest_device.c
index d0a1d8a..c831c47 100644
--- a/drivers/lguest/lguest_device.c
+++ b/drivers/lguest/lguest_device.c
@@ -137,14 +137,14 @@ static void lg_finalize_features(struct virtio_device *vdev)
vring_transport_features(vdev);

/*
- * The vdev->feature array is a Linux bitmask: this isn't the same as a
- * the simple array of bits used by lguest devices for features. So we
- * do this slow, manual conversion which is completely general.
+ * Since lguest is currently x86-only, we're little-endian. That
+ * means we could just memcpy. But it's not time critical, and in
+ * case someone copies this code, we do it the slow, obvious way.
*/
memset(out_features, 0, desc->feature_len);
bits = min_t(unsigned, desc->feature_len, sizeof(vdev->features)) * 8;
for (i = 0; i < bits; i++) {
- if (test_bit(i, vdev->features))
+ if (vdev->features & (1 << i))
out_features[i / 8] |= (1 << (i % 8));
}

diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c
index a34b506..dafaf38 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -231,7 +231,7 @@ static void rproc_virtio_finalize_features(struct virtio_device *vdev)
* Remember the finalized features of our vdev, and provide it
* to the remote processor once it is powered on.
*/
- rsc->gfeatures = vdev->features[0];
+ rsc->gfeatures = vdev->features;
}

static void rproc_virtio_get(struct virtio_device *vdev, unsigned offset,
diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c
index 6431290..ac79a09 100644
--- a/drivers/s390/kvm/kvm_virtio.c
+++ b/drivers/s390/kvm/kvm_virtio.c
@@ -106,7 +106,7 @@ static void kvm_finalize_features(struct virtio_device *vdev)
memset(out_features, 0, desc->feature_len);
bits = min_t(unsigned, desc->feature_len, sizeof(vdev->features)) * 8;
for (i = 0; i < bits; i++) {
- if (test_bit(i, vdev->features))
+ if (vdev->features & (1 << i))
out_features[i / 8] |= (1 << (i % 8));
}
}
diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index bda52f1..1dbee95 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -701,7 +701,6 @@ static void virtio_ccw_finalize_features(struct virtio_device *vdev)
{
struct virtio_ccw_device *vcdev = to_vc_device(vdev);
struct virtio_feature_desc *features;
- int i;
struct ccw1 *ccw;

ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
@@ -715,19 +714,15 @@ static void virtio_ccw_finalize_features(struct virtio_device *vdev)
/* Give virtio_ring a chance to accept features. */
vring_transport_features(vdev);

- for (i = 0; i < sizeof(*vdev->features) / sizeof(features->features);
- i++) {
- int highbits = i % 2 ? 32 : 0;
- features->index = i;
- features->features = cpu_to_le32(vdev->features[i / 2]
- >> highbits);
- /* Write the feature bits to the host. */
- ccw->cmd_code = CCW_CMD_WRITE_FEAT;
- ccw->flags = 0;
- ccw->count = sizeof(*features);
- ccw->cda = (__u32)(unsigned long)features;
- ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_FEAT);
- }
+ features->index = 0;
+ features->features = cpu_to_le32(vdev->features);
+ /* Write the feature bits to the host. */
+ ccw->cmd_code = CCW_CMD_WRITE_FEAT;
+ ccw->flags = 0;
+ ccw->count = sizeof(*features);
+ ccw->cda = (__u32)(unsigned long)features;
+ ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_FEAT);
+
out_free:
kfree(features);
kfree(ccw);
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index df598dd..7814b1f 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -49,9 +49,9 @@ static ssize_t features_show(struct device *_d,

/* We actually represent this as a bitstring, as it could be
* arbitrary length in future. */
- for (i = 0; i < ARRAY_SIZE(dev->features)*BITS_PER_LONG; i++)
+ for (i = 0; i < sizeof(dev->features)*8; i++)
len += sprintf(buf+len, "%c",
- test_bit(i, dev->features) ? '1' : '0');
+ dev->features & (1ULL << i) ? '1' : '0');
len += sprintf(buf+len, "\n");
return len;
}
@@ -168,18 +168,18 @@ static int virtio_dev_probe(struct device *_d)
device_features = dev->config->get_features(dev);

/* Features supported by both device and driver into dev->features. */
- memset(dev->features, 0, sizeof(dev->features));
+ dev->features = 0;
for (i = 0; i < drv->feature_table_size; i++) {
unsigned int f = drv->feature_table[i];
BUG_ON(f >= 32);
if (device_features & (1 << f))
- set_bit(f, dev->features);
+ dev->features |= (1 << f);
}

/* Transport features always preserved to pass to finalize_features. */
for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++)
if (device_features & (1 << i))
- set_bit(i, dev->features);
+ dev->features |= (1 << i);

dev->config->finalize_features(dev);

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index ef9a165..eb5b0e2 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -155,16 +155,12 @@ static u32 vm_get_features(struct virtio_device *vdev)
static void vm_finalize_features(struct virtio_device *vdev)
{
struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
- int i;

/* Give virtio_ring a chance to accept features. */
vring_transport_features(vdev);

- for (i = 0; i < ARRAY_SIZE(vdev->features); i++) {
- writel(i, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES_SEL);
- writel(vdev->features[i],
- vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES);
- }
+ writel(0, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES_SEL);
+ writel(vdev->features, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES);
}

static void vm_get(struct virtio_device *vdev, unsigned offset,
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index d34ebfa..ab95a4c 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -120,8 +120,7 @@ static void vp_finalize_features(struct virtio_device *vdev)
vring_transport_features(vdev);

/* We only support 32 feature bits. */
- BUILD_BUG_ON(ARRAY_SIZE(vdev->features) != 1);
- iowrite32(vdev->features[0], vp_dev->ioaddr+VIRTIO_PCI_GUEST_FEATURES);
+ iowrite32(vdev->features, vp_dev->ioaddr+VIRTIO_PCI_GUEST_FEATURES);
}

/* virtio config->get() implementation */
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 3b1f89b..15a8a05 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -781,7 +781,7 @@ void vring_transport_features(struct virtio_device *vdev)
break;
default:
/* We don't understand this bit. */
- clear_bit(i, vdev->features);
+ vdev->features &= ~(1 << i);
}
}
}
diff --git a/tools/virtio/virtio_test.c b/tools/virtio/virtio_test.c
index 00ea679..db3437c 100644
--- a/tools/virtio/virtio_test.c
+++ b/tools/virtio/virtio_test.c
@@ -60,7 +60,7 @@ void vhost_vq_setup(struct vdev_info *dev, struct vq_info *info)
{
struct vhost_vring_state state = { .index = info->idx };
struct vhost_vring_file file = { .index = info->idx };
- unsigned long long features = dev->vdev.features[0];
+ unsigned long long features = dev->vdev.features;
struct vhost_vring_addr addr = {
.index = info->idx,
.desc_user_addr = (uint64_t)(unsigned long)info->vring.desc,
@@ -113,8 +113,7 @@ static void vdev_info_init(struct vdev_info* dev, unsigned long long features)
{
int r;
memset(dev, 0, sizeof *dev);
- dev->vdev.features[0] = features;
- dev->vdev.features[1] = features >> 32;
+ dev->vdev.features = features;
dev->buf_size = 1024;
dev->buf = malloc(dev->buf_size);
assert(dev->buf);
diff --git a/tools/virtio/vringh_test.c b/tools/virtio/vringh_test.c
index 14a4f4c..b6c9ee3 100644
--- a/tools/virtio/vringh_test.c
+++ b/tools/virtio/vringh_test.c
@@ -304,7 +304,7 @@ static int parallel_test(unsigned long features,
close(to_guest[1]);
close(to_host[0]);

- gvdev.vdev.features[0] = features;
+ gvdev.vdev.features = features;
gvdev.to_host_fd = to_host[1];
gvdev.notifies = 0;

@@ -449,13 +449,13 @@ int main(int argc, char *argv[])
bool fast_vringh = false, parallel = false;

getrange = getrange_iov;
- vdev.features[0] = 0;
+ vdev.features = 0;

while (argv[1]) {
if (strcmp(argv[1], "--indirect") == 0)
- vdev.features[0] |= (1 << VIRTIO_RING_F_INDIRECT_DESC);
+ vdev.features |= (1 << VIRTIO_RING_F_INDIRECT_DESC);
else if (strcmp(argv[1], "--eventidx") == 0)
- vdev.features[0] |= (1 << VIRTIO_RING_F_EVENT_IDX);
+ vdev.features |= (1 << VIRTIO_RING_F_EVENT_IDX);
else if (strcmp(argv[1], "--slow-range") == 0)
getrange = getrange_slow;
else if (strcmp(argv[1], "--fast-vringh") == 0)
@@ -468,7 +468,7 @@ int main(int argc, char *argv[])
}

if (parallel)
- return parallel_test(vdev.features[0], getrange, fast_vringh);
+ return parallel_test(vdev.features, getrange, fast_vringh);

if (posix_memalign(&__user_addr_min, PAGE_SIZE, USER_MEM) != 0)
abort();
@@ -483,7 +483,7 @@ int main(int argc, char *argv[])

/* Set up host side. */
vring_init(&vrh.vring, RINGSIZE, __user_addr_min, ALIGN);
- vringh_init_user(&vrh, vdev.features[0], RINGSIZE, true,
+ vringh_init_user(&vrh, vdev.features, RINGSIZE, true,
vrh.vring.desc, vrh.vring.avail, vrh.vring.used);

/* No descriptor to get yet... */
@@ -652,13 +652,13 @@ int main(int argc, char *argv[])
}

/* Test weird (but legal!) indirect. */
- if (vdev.features[0] & (1 << VIRTIO_RING_F_INDIRECT_DESC)) {
+ if (vdev.features & (1 << VIRTIO_RING_F_INDIRECT_DESC)) {
char *data = __user_addr_max - USER_MEM/4;
struct vring_desc *d = __user_addr_max - USER_MEM/2;
struct vring vring;

/* Force creation of direct, which we modify. */
- vdev.features[0] &= ~(1 << VIRTIO_RING_F_INDIRECT_DESC);
+ vdev.features &= ~(1 << VIRTIO_RING_F_INDIRECT_DESC);
vq = vring_new_virtqueue(0, RINGSIZE, ALIGN, &vdev, true,
__user_addr_min,
never_notify_host,
--
MST

2014-11-24 11:53:32

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 08/41] virtio: set FEATURES_OK

set FEATURES_OK as per virtio 1.0 spec

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/uapi/linux/virtio_config.h | 2 ++
drivers/virtio/virtio.c | 29 ++++++++++++++++++++++-------
2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index f3fe33a..a6d0cde 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -38,6 +38,8 @@
#define VIRTIO_CONFIG_S_DRIVER 2
/* Driver has used its parts of the config, and is happy */
#define VIRTIO_CONFIG_S_DRIVER_OK 4
+/* Driver has finished configuring features */
+#define VIRTIO_CONFIG_S_FEATURES_OK 8
/* We've given up on this device. */
#define VIRTIO_CONFIG_S_FAILED 0x80

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index d213567..a3df817 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -160,6 +160,7 @@ static int virtio_dev_probe(struct device *_d)
struct virtio_device *dev = dev_to_virtio(_d);
struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
u64 device_features;
+ unsigned status;

/* We have a driver! */
add_status(dev, VIRTIO_CONFIG_S_DRIVER);
@@ -183,18 +184,32 @@ static int virtio_dev_probe(struct device *_d)

dev->config->finalize_features(dev);

+ if (virtio_has_feature(dev, VIRTIO_F_VERSION_1)) {
+ add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK);
+ status = dev->config->get_status(dev);
+ if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) {
+ printk(KERN_ERR "virtio: device refuses features: %x\n",
+ status);
+ err = -ENODEV;
+ goto err;
+ }
+ }
+
err = drv->probe(dev);
if (err)
- add_status(dev, VIRTIO_CONFIG_S_FAILED);
- else {
- add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
- if (drv->scan)
- drv->scan(dev);
+ goto err;

- virtio_config_enable(dev);
- }
+ add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
+ if (drv->scan)
+ drv->scan(dev);
+
+ virtio_config_enable(dev);

+ return 0;
+err:
+ add_status(dev, VIRTIO_CONFIG_S_FAILED);
return err;
+
}

static int virtio_dev_remove(struct device *_d)
--
MST

2014-11-24 11:53:43

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 10/41] virtio: add legacy feature table support

virtio blk has some legacy feature bits that modern drivers
must not negotiate, but are needed for old legacy hosts
(e.g. that dn't support virtio scsi).
Allow a separate legacy feature table for such cases.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio.h | 4 ++++
drivers/virtio/virtio.c | 25 ++++++++++++++++++++++++-
2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index d6359a5..f70411e 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -130,6 +130,8 @@ int virtio_device_restore(struct virtio_device *dev);
* @id_table: the ids serviced by this driver.
* @feature_table: an array of feature numbers supported by this driver.
* @feature_table_size: number of entries in the feature table array.
+ * @feature_table_legacy: same as feature_table but when working in legacy mode.
+ * @feature_table_size_legacy: number of entries in feature table legacy array.
* @probe: the function to call when a device is found. Returns 0 or -errno.
* @remove: the function to call when a device is removed.
* @config_changed: optional function to call when the device configuration
@@ -140,6 +142,8 @@ struct virtio_driver {
const struct virtio_device_id *id_table;
const unsigned int *feature_table;
unsigned int feature_table_size;
+ const unsigned int *feature_table_legacy;
+ unsigned int feature_table_size_legacy;
int (*probe)(struct virtio_device *dev);
void (*scan)(struct virtio_device *dev);
void (*remove)(struct virtio_device *dev);
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 0f44cff..e9018b4 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -113,6 +113,13 @@ void virtio_check_driver_offered_feature(const struct virtio_device *vdev,
for (i = 0; i < drv->feature_table_size; i++)
if (drv->feature_table[i] == fbit)
return;
+
+ if (drv->feature_table_legacy) {
+ for (i = 0; i < drv->feature_table_size_legacy; i++)
+ if (drv->feature_table_legacy[i] == fbit)
+ return;
+ }
+
BUG();
}
EXPORT_SYMBOL_GPL(virtio_check_driver_offered_feature);
@@ -161,6 +168,7 @@ static int virtio_dev_probe(struct device *_d)
struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
u64 device_features;
u64 driver_features;
+ u64 driver_features_legacy;
unsigned status;

/* We have a driver! */
@@ -177,7 +185,22 @@ static int virtio_dev_probe(struct device *_d)
driver_features |= (1ULL << f);
}

- dev->features = driver_features & device_features;
+ /* Some drivers have a separate feature tables for virtio v1.0 */
+ if (drv->feature_table_legacy) {
+ driver_features_legacy = 0;
+ for (i = 0; i < drv->feature_table_size_legacy; i++) {
+ unsigned int f = drv->feature_table_legacy[i];
+ BUG_ON(f >= 64);
+ driver_features_legacy |= (1ULL << f);
+ }
+ } else {
+ driver_features_legacy = driver_features;
+ }
+
+ if (driver_features & device_features & (1ULL << VIRTIO_F_VERSION_1))
+ dev->features = driver_features & device_features;
+ else
+ dev->features = driver_features_legacy & device_features;

/* Transport features always preserved to pass to finalize_features. */
for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++)
--
MST

2014-11-24 11:53:35

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 09/41] virtio: simplify feature bit handling

Now that we use u64 for bits, we can simply & them together.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/virtio/virtio.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index a3df817..0f44cff 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -160,6 +160,7 @@ static int virtio_dev_probe(struct device *_d)
struct virtio_device *dev = dev_to_virtio(_d);
struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
u64 device_features;
+ u64 driver_features;
unsigned status;

/* We have a driver! */
@@ -168,15 +169,16 @@ static int virtio_dev_probe(struct device *_d)
/* Figure out what features the device supports. */
device_features = dev->config->get_features(dev);

- /* Features supported by both device and driver into dev->features. */
- dev->features = 0;
+ /* Figure out what features the driver supports. */
+ driver_features = 0;
for (i = 0; i < drv->feature_table_size; i++) {
unsigned int f = drv->feature_table[i];
BUG_ON(f >= 64);
- if (device_features & (1ULL << f))
- dev->features |= (1ULL << f);
+ driver_features |= (1ULL << f);
}

+ dev->features = driver_features & device_features;
+
/* Transport features always preserved to pass to finalize_features. */
for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++)
if (device_features & (1ULL << i))
--
MST

2014-11-24 11:53:51

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 12/41] virtio_blk: v1.0 support

Based on patch by Cornelia Huck.

Note: for consistency, and to avoid sparse errors,
convert all fields, even those no longer in use
for virtio v1.0.

Reviewed-by: Thomas Huth <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/uapi/linux/virtio_blk.h | 15 ++++-----
drivers/block/virtio_blk.c | 70 ++++++++++++++++++++++++-----------------
2 files changed, 49 insertions(+), 36 deletions(-)

diff --git a/include/uapi/linux/virtio_blk.h b/include/uapi/linux/virtio_blk.h
index 9ad67b2..247c8ba 100644
--- a/include/uapi/linux/virtio_blk.h
+++ b/include/uapi/linux/virtio_blk.h
@@ -28,6 +28,7 @@
#include <linux/types.h>
#include <linux/virtio_ids.h>
#include <linux/virtio_config.h>
+#include <linux/virtio_types.h>

/* Feature bits */
#define VIRTIO_BLK_F_BARRIER 0 /* Does host support barriers? */
@@ -114,18 +115,18 @@ struct virtio_blk_config {
/* This is the first element of the read scatter-gather list. */
struct virtio_blk_outhdr {
/* VIRTIO_BLK_T* */
- __u32 type;
+ __virtio32 type;
/* io priority. */
- __u32 ioprio;
+ __virtio32 ioprio;
/* Sector (ie. 512 byte offset) */
- __u64 sector;
+ __virtio64 sector;
};

struct virtio_scsi_inhdr {
- __u32 errors;
- __u32 data_len;
- __u32 sense_len;
- __u32 residual;
+ __virtio32 errors;
+ __virtio32 data_len;
+ __virtio32 sense_len;
+ __virtio32 residual;
};

/* And this is the final byte of the write scatter-gather list. */
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index c6a27d5..f601f16 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -80,7 +80,7 @@ static int __virtblk_add_req(struct virtqueue *vq,
{
struct scatterlist hdr, status, cmd, sense, inhdr, *sgs[6];
unsigned int num_out = 0, num_in = 0;
- int type = vbr->out_hdr.type & ~VIRTIO_BLK_T_OUT;
+ __virtio32 type = vbr->out_hdr.type & ~cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT);

sg_init_one(&hdr, &vbr->out_hdr, sizeof(vbr->out_hdr));
sgs[num_out++] = &hdr;
@@ -91,19 +91,19 @@ static int __virtblk_add_req(struct virtqueue *vq,
* block, and before the normal inhdr we put the sense data and the
* inhdr with additional status information.
*/
- if (type == VIRTIO_BLK_T_SCSI_CMD) {
+ if (type == cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_SCSI_CMD)) {
sg_init_one(&cmd, vbr->req->cmd, vbr->req->cmd_len);
sgs[num_out++] = &cmd;
}

if (have_data) {
- if (vbr->out_hdr.type & VIRTIO_BLK_T_OUT)
+ if (vbr->out_hdr.type & cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT))
sgs[num_out++] = data_sg;
else
sgs[num_out + num_in++] = data_sg;
}

- if (type == VIRTIO_BLK_T_SCSI_CMD) {
+ if (type == cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_SCSI_CMD)) {
sg_init_one(&sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE);
sgs[num_out + num_in++] = &sense;
sg_init_one(&inhdr, &vbr->in_hdr, sizeof(vbr->in_hdr));
@@ -119,12 +119,13 @@ static int __virtblk_add_req(struct virtqueue *vq,
static inline void virtblk_request_done(struct request *req)
{
struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
+ struct virtio_blk *vblk = req->q->queuedata;
int error = virtblk_result(vbr);

if (req->cmd_type == REQ_TYPE_BLOCK_PC) {
- req->resid_len = vbr->in_hdr.residual;
- req->sense_len = vbr->in_hdr.sense_len;
- req->errors = vbr->in_hdr.errors;
+ req->resid_len = virtio32_to_cpu(vblk->vdev, vbr->in_hdr.residual);
+ req->sense_len = virtio32_to_cpu(vblk->vdev, vbr->in_hdr.sense_len);
+ req->errors = virtio32_to_cpu(vblk->vdev, vbr->in_hdr.errors);
} else if (req->cmd_type == REQ_TYPE_SPECIAL) {
req->errors = (error != 0);
}
@@ -173,25 +174,25 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req,

vbr->req = req;
if (req->cmd_flags & REQ_FLUSH) {
- vbr->out_hdr.type = VIRTIO_BLK_T_FLUSH;
+ vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_FLUSH);
vbr->out_hdr.sector = 0;
- vbr->out_hdr.ioprio = req_get_ioprio(vbr->req);
+ vbr->out_hdr.ioprio = cpu_to_virtio32(vblk->vdev, req_get_ioprio(vbr->req));
} else {
switch (req->cmd_type) {
case REQ_TYPE_FS:
vbr->out_hdr.type = 0;
- vbr->out_hdr.sector = blk_rq_pos(vbr->req);
- vbr->out_hdr.ioprio = req_get_ioprio(vbr->req);
+ vbr->out_hdr.sector = cpu_to_virtio64(vblk->vdev, blk_rq_pos(vbr->req));
+ vbr->out_hdr.ioprio = cpu_to_virtio32(vblk->vdev, req_get_ioprio(vbr->req));
break;
case REQ_TYPE_BLOCK_PC:
- vbr->out_hdr.type = VIRTIO_BLK_T_SCSI_CMD;
+ vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_SCSI_CMD);
vbr->out_hdr.sector = 0;
- vbr->out_hdr.ioprio = req_get_ioprio(vbr->req);
+ vbr->out_hdr.ioprio = cpu_to_virtio32(vblk->vdev, req_get_ioprio(vbr->req));
break;
case REQ_TYPE_SPECIAL:
- vbr->out_hdr.type = VIRTIO_BLK_T_GET_ID;
+ vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_GET_ID);
vbr->out_hdr.sector = 0;
- vbr->out_hdr.ioprio = req_get_ioprio(vbr->req);
+ vbr->out_hdr.ioprio = cpu_to_virtio32(vblk->vdev, req_get_ioprio(vbr->req));
break;
default:
/* We don't put anything else in the queue. */
@@ -204,9 +205,9 @@ static int virtio_queue_rq(struct blk_mq_hw_ctx *hctx, struct request *req,
num = blk_rq_map_sg(hctx->queue, vbr->req, vbr->sg);
if (num) {
if (rq_data_dir(vbr->req) == WRITE)
- vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
+ vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_OUT);
else
- vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
+ vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_IN);
}

spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
@@ -476,7 +477,8 @@ static int virtblk_get_cache_mode(struct virtio_device *vdev)
struct virtio_blk_config, wce,
&writeback);
if (err)
- writeback = virtio_has_feature(vdev, VIRTIO_BLK_F_WCE);
+ writeback = virtio_has_feature(vdev, VIRTIO_BLK_F_WCE) ||
+ virtio_has_feature(vdev, VIRTIO_F_VERSION_1);

return writeback;
}
@@ -821,25 +823,35 @@ static const struct virtio_device_id id_table[] = {
{ 0 },
};

-static unsigned int features[] = {
+static unsigned int features_legacy[] = {
VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_SCSI,
VIRTIO_BLK_F_WCE, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_CONFIG_WCE,
VIRTIO_BLK_F_MQ,
+}
+;
+static unsigned int features[] = {
+ VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
+ VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE,
+ VIRTIO_BLK_F_TOPOLOGY,
+ VIRTIO_BLK_F_MQ,
+ VIRTIO_F_VERSION_1,
};

static struct virtio_driver virtio_blk = {
- .feature_table = features,
- .feature_table_size = ARRAY_SIZE(features),
- .driver.name = KBUILD_MODNAME,
- .driver.owner = THIS_MODULE,
- .id_table = id_table,
- .probe = virtblk_probe,
- .remove = virtblk_remove,
- .config_changed = virtblk_config_changed,
+ .feature_table = features,
+ .feature_table_size = ARRAY_SIZE(features),
+ .feature_table_legacy = features_legacy,
+ .feature_table_size_legacy = ARRAY_SIZE(features_legacy),
+ .driver.name = KBUILD_MODNAME,
+ .driver.owner = THIS_MODULE,
+ .id_table = id_table,
+ .probe = virtblk_probe,
+ .remove = virtblk_remove,
+ .config_changed = virtblk_config_changed,
#ifdef CONFIG_PM_SLEEP
- .freeze = virtblk_freeze,
- .restore = virtblk_restore,
+ .freeze = virtblk_freeze,
+ .restore = virtblk_restore,
#endif
};

--
MST

2014-11-24 11:54:00

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 15/41] KVM: s390 allow virtio_ccw status writes to fail

Gracefully handle failure to write device status.
We really should handle other errors as well, but this one is needed for
virtio 1.0 compliance.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/s390/kvm/virtio_ccw.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index ad06e11..79f6737 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -862,7 +862,9 @@ static u8 virtio_ccw_get_status(struct virtio_device *vdev)
static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
{
struct virtio_ccw_device *vcdev = to_vc_device(vdev);
+ u8 old_status = *vcdev->status;
struct ccw1 *ccw;
+ int ret;

ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
if (!ccw)
@@ -874,7 +876,10 @@ static void virtio_ccw_set_status(struct virtio_device *vdev, u8 status)
ccw->flags = 0;
ccw->count = sizeof(status);
ccw->cda = (__u32)(unsigned long)vcdev->status;
- ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS);
+ ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_STATUS);
+ /* Write failed? We assume status is unchanged. */
+ if (ret)
+ *vcdev->status = old_status;
kfree(ccw);
}

--
MST

2014-11-24 11:53:55

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 14/41] KVM: s390: virtio-ccw revision 1 SET_VQ

From: Cornelia Huck <[email protected]>

The CCW_CMD_SET_VQ command has a different format for revision 1+
devices, allowing to specify a more complex virtqueue layout. For
now, we stay however with the old layout and simply use the new
command format for virtio-1 devices.

Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/s390/kvm/virtio_ccw.c | 54 +++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index 193e8f1..ad06e11 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -68,13 +68,22 @@ struct virtio_ccw_device {
void *airq_info;
};

-struct vq_info_block {
+struct vq_info_block_legacy {
__u64 queue;
__u32 align;
__u16 index;
__u16 num;
} __packed;

+struct vq_info_block {
+ __u64 desc;
+ __u32 res0;
+ __u16 index;
+ __u16 num;
+ __u64 avail;
+ __u64 used;
+} __packed;
+
struct virtio_feature_desc {
__u32 features;
__u8 index;
@@ -100,7 +109,10 @@ struct virtio_ccw_vq_info {
struct virtqueue *vq;
int num;
void *queue;
- struct vq_info_block *info_block;
+ union {
+ struct vq_info_block s;
+ struct vq_info_block_legacy l;
+ } *info_block;
int bit_nr;
struct list_head node;
long cookie;
@@ -411,13 +423,22 @@ static void virtio_ccw_del_vq(struct virtqueue *vq, struct ccw1 *ccw)
spin_unlock_irqrestore(&vcdev->lock, flags);

/* Release from host. */
- info->info_block->queue = 0;
- info->info_block->align = 0;
- info->info_block->index = index;
- info->info_block->num = 0;
+ if (vcdev->revision == 0) {
+ info->info_block->l.queue = 0;
+ info->info_block->l.align = 0;
+ info->info_block->l.index = index;
+ info->info_block->l.num = 0;
+ ccw->count = sizeof(info->info_block->l);
+ } else {
+ info->info_block->s.desc = 0;
+ info->info_block->s.index = index;
+ info->info_block->s.num = 0;
+ info->info_block->s.avail = 0;
+ info->info_block->s.used = 0;
+ ccw->count = sizeof(info->info_block->s);
+ }
ccw->cmd_code = CCW_CMD_SET_VQ;
ccw->flags = 0;
- ccw->count = sizeof(*info->info_block);
ccw->cda = (__u32)(unsigned long)(info->info_block);
ret = ccw_io_helper(vcdev, ccw,
VIRTIO_CCW_DOING_SET_VQ | index);
@@ -500,13 +521,22 @@ static struct virtqueue *virtio_ccw_setup_vq(struct virtio_device *vdev,
}

/* Register it with the host. */
- info->info_block->queue = (__u64)info->queue;
- info->info_block->align = KVM_VIRTIO_CCW_RING_ALIGN;
- info->info_block->index = i;
- info->info_block->num = info->num;
+ if (vcdev->revision == 0) {
+ info->info_block->l.queue = (__u64)info->queue;
+ info->info_block->l.align = KVM_VIRTIO_CCW_RING_ALIGN;
+ info->info_block->l.index = i;
+ info->info_block->l.num = info->num;
+ ccw->count = sizeof(info->info_block->l);
+ } else {
+ info->info_block->s.desc = (__u64)info->queue;
+ info->info_block->s.index = i;
+ info->info_block->s.num = info->num;
+ info->info_block->s.avail = (__u64)virtqueue_get_avail(vq);
+ info->info_block->s.used = (__u64)virtqueue_get_used(vq);
+ ccw->count = sizeof(info->info_block->s);
+ }
ccw->cmd_code = CCW_CMD_SET_VQ;
ccw->flags = 0;
- ccw->count = sizeof(*info->info_block);
ccw->cda = (__u32)(unsigned long)(info->info_block);
err = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_SET_VQ | i);
if (err) {
--
MST

2014-11-24 11:54:07

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 16/41] KVM: s390: enable virtio-ccw revision 1

From: Cornelia Huck <[email protected]>

Now that virtio-ccw has everything needed to support virtio 1.0 in
place, try to enable it if the host supports it.

Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/s390/kvm/virtio_ccw.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index 79f6737..8ec68dd 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -103,7 +103,7 @@ struct virtio_rev_info {
};

/* the highest virtio-ccw revision we support */
-#define VIRTIO_CCW_REV_MAX 0
+#define VIRTIO_CCW_REV_MAX 1

struct virtio_ccw_vq_info {
struct virtqueue *vq;
--
MST

2014-11-24 11:54:29

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 20/41] virtio_net: get rid of virtio_net_hdr/skb_vnet_hdr

virtio 1.0 doesn't use virtio_net_hdr anymore, and in fact, it's not
really useful since virtio_net_hdr_mrg_rxbuf includes that as the first
field anyway.

Let's drop it, precalculate header len and store within vi instead.

This way we can also remove struct skb_vnet_hdr.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/virtio_net.c | 90 ++++++++++++++++++++++--------------------------
1 file changed, 41 insertions(+), 49 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1630c21..516f2cb 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -123,6 +123,9 @@ struct virtnet_info {
/* Host can handle any s/g split between our header and packet data */
bool any_header_sg;

+ /* Packet virtio header size */
+ u8 hdr_len;
+
/* Active statistics */
struct virtnet_stats __percpu *stats;

@@ -139,21 +142,14 @@ struct virtnet_info {
struct notifier_block nb;
};

-struct skb_vnet_hdr {
- union {
- struct virtio_net_hdr hdr;
- struct virtio_net_hdr_mrg_rxbuf mhdr;
- };
-};
-
struct padded_vnet_hdr {
- struct virtio_net_hdr hdr;
+ struct virtio_net_hdr_mrg_rxbuf hdr;
/*
- * virtio_net_hdr should be in a separated sg buffer because of a
- * QEMU bug, and data sg buffer shares same page with this header sg.
- * This padding makes next sg 16 byte aligned after virtio_net_hdr.
+ * hdr is in a separate sg buffer, and data sg buffer shares same page
+ * with this header sg. This padding makes next sg 16 byte aligned
+ * after the header.
*/
- char padding[6];
+ char padding[4];
};

/* Converting between virtqueue no. and kernel tx/rx queue no.
@@ -179,9 +175,9 @@ static int rxq2vq(int rxq)
return rxq * 2;
}

-static inline struct skb_vnet_hdr *skb_vnet_hdr(struct sk_buff *skb)
+static inline struct virtio_net_hdr_mrg_rxbuf *skb_vnet_hdr(struct sk_buff *skb)
{
- return (struct skb_vnet_hdr *)skb->cb;
+ return (struct virtio_net_hdr_mrg_rxbuf *)skb->cb;
}

/*
@@ -247,7 +243,7 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
unsigned int len, unsigned int truesize)
{
struct sk_buff *skb;
- struct skb_vnet_hdr *hdr;
+ struct virtio_net_hdr_mrg_rxbuf *hdr;
unsigned int copy, hdr_len, hdr_padded_len;
char *p;

@@ -260,13 +256,11 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,

hdr = skb_vnet_hdr(skb);

- if (vi->mergeable_rx_bufs) {
- hdr_len = sizeof hdr->mhdr;
- hdr_padded_len = sizeof hdr->mhdr;
- } else {
- hdr_len = sizeof hdr->hdr;
+ hdr_len = vi->hdr_len;
+ if (vi->mergeable_rx_bufs)
+ hdr_padded_len = sizeof *hdr;
+ else
hdr_padded_len = sizeof(struct padded_vnet_hdr);
- }

memcpy(hdr, p, hdr_len);

@@ -317,11 +311,11 @@ static struct sk_buff *page_to_skb(struct virtnet_info *vi,
return skb;
}

-static struct sk_buff *receive_small(void *buf, unsigned int len)
+static struct sk_buff *receive_small(struct virtnet_info *vi, void *buf, unsigned int len)
{
struct sk_buff * skb = buf;

- len -= sizeof(struct virtio_net_hdr);
+ len -= vi->hdr_len;
skb_trim(skb, len);

return skb;
@@ -354,8 +348,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
unsigned int len)
{
void *buf = mergeable_ctx_to_buf_address(ctx);
- struct skb_vnet_hdr *hdr = buf;
- u16 num_buf = virtio16_to_cpu(rq->vq->vdev, hdr->mhdr.num_buffers);
+ struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
+ u16 num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
struct page *page = virt_to_head_page(buf);
int offset = buf - page_address(page);
unsigned int truesize = max(len, mergeable_ctx_to_buf_truesize(ctx));
@@ -373,8 +367,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
if (unlikely(!ctx)) {
pr_debug("%s: rx error: %d buffers out of %d missing\n",
dev->name, num_buf,
- virtio16_to_cpu(rq->vq->vdev,
- hdr->mhdr.num_buffers));
+ virtio16_to_cpu(vi->vdev,
+ hdr->num_buffers));
dev->stats.rx_length_errors++;
goto err_buf;
}
@@ -441,7 +435,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
struct net_device *dev = vi->dev;
struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
struct sk_buff *skb;
- struct skb_vnet_hdr *hdr;
+ struct virtio_net_hdr_mrg_rxbuf *hdr;

if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
pr_debug("%s: short packet %i\n", dev->name, len);
@@ -463,7 +457,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
else if (vi->big_packets)
skb = receive_big(dev, vi, rq, buf, len);
else
- skb = receive_small(buf, len);
+ skb = receive_small(vi, buf, len);

if (unlikely(!skb))
return;
@@ -545,7 +539,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
gfp_t gfp)
{
struct sk_buff *skb;
- struct skb_vnet_hdr *hdr;
+ struct virtio_net_hdr_mrg_rxbuf *hdr;
int err;

skb = __netdev_alloc_skb_ip_align(vi->dev, GOOD_PACKET_LEN, gfp);
@@ -556,7 +550,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,

hdr = skb_vnet_hdr(skb);
sg_init_table(rq->sg, MAX_SKB_FRAGS + 2);
- sg_set_buf(rq->sg, &hdr->hdr, sizeof hdr->hdr);
+ sg_set_buf(rq->sg, hdr, vi->hdr_len);
skb_to_sgvec(skb, rq->sg + 1, 0, skb->len);

err = virtqueue_add_inbuf(rq->vq, rq->sg, 2, skb, gfp);
@@ -566,7 +560,8 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
return err;
}

-static int add_recvbuf_big(struct receive_queue *rq, gfp_t gfp)
+static int add_recvbuf_big(struct virtnet_info *vi, struct receive_queue *rq,
+ gfp_t gfp)
{
struct page *first, *list = NULL;
char *p;
@@ -597,8 +592,8 @@ static int add_recvbuf_big(struct receive_queue *rq, gfp_t gfp)
p = page_address(first);

/* rq->sg[0], rq->sg[1] share the same page */
- /* a separated rq->sg[0] for virtio_net_hdr only due to QEMU bug */
- sg_set_buf(&rq->sg[0], p, sizeof(struct virtio_net_hdr));
+ /* a separated rq->sg[0] for header - required in case !any_header_sg */
+ sg_set_buf(&rq->sg[0], p, vi->hdr_len);

/* rq->sg[1] for data packet, from offset */
offset = sizeof(struct padded_vnet_hdr);
@@ -677,7 +672,7 @@ static bool try_fill_recv(struct virtnet_info *vi, struct receive_queue *rq,
if (vi->mergeable_rx_bufs)
err = add_recvbuf_mergeable(rq, gfp);
else if (vi->big_packets)
- err = add_recvbuf_big(rq, gfp);
+ err = add_recvbuf_big(vi, rq, gfp);
else
err = add_recvbuf_small(vi, rq, gfp);

@@ -857,18 +852,14 @@ static void free_old_xmit_skbs(struct send_queue *sq)

static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
{
- struct skb_vnet_hdr *hdr;
+ struct virtio_net_hdr_mrg_rxbuf *hdr;
const unsigned char *dest = ((struct ethhdr *)skb->data)->h_dest;
struct virtnet_info *vi = sq->vq->vdev->priv;
unsigned num_sg;
- unsigned hdr_len;
+ unsigned hdr_len = vi->hdr_len;
bool can_push;

pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest);
- if (vi->mergeable_rx_bufs)
- hdr_len = sizeof hdr->mhdr;
- else
- hdr_len = sizeof hdr->hdr;

can_push = vi->any_header_sg &&
!((unsigned long)skb->data & (__alignof__(*hdr) - 1)) &&
@@ -876,7 +867,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
/* Even if we can, don't push here yet as this would skew
* csum_start offset below. */
if (can_push)
- hdr = (struct skb_vnet_hdr *)(skb->data - hdr_len);
+ hdr = (struct virtio_net_hdr_mrg_rxbuf *)(skb->data - hdr_len);
else
hdr = skb_vnet_hdr(skb);

@@ -909,7 +900,7 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
}

if (vi->mergeable_rx_bufs)
- hdr->mhdr.num_buffers = 0;
+ hdr->num_buffers = 0;

sg_init_table(sq->sg, MAX_SKB_FRAGS + 2);
if (can_push) {
@@ -1814,18 +1805,19 @@ static int virtnet_probe(struct virtio_device *vdev)
if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
vi->mergeable_rx_bufs = true;

+ if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
+ vi->hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+ else
+ vi->hdr_len = sizeof(struct virtio_net_hdr);
+
if (virtio_has_feature(vdev, VIRTIO_F_ANY_LAYOUT))
vi->any_header_sg = true;

if (virtio_has_feature(vdev, VIRTIO_NET_F_CTRL_VQ))
vi->has_cvq = true;

- if (vi->any_header_sg) {
- if (vi->mergeable_rx_bufs)
- dev->needed_headroom = sizeof(struct virtio_net_hdr_mrg_rxbuf);
- else
- dev->needed_headroom = sizeof(struct virtio_net_hdr);
- }
+ if (vi->any_header_sg)
+ dev->needed_headroom = vi->hdr_len;

/* Use single tx/rx queue pair as default */
vi->curr_queue_pairs = 1;
--
MST

2014-11-24 11:54:35

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 23/41] virtio_net: enable v1.0 support

Now that we have completed 1.0 support, enable it in our driver.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/virtio_net.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a0e64cf..c6a72d3 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2003,6 +2003,7 @@ static unsigned int features[] = {
VIRTIO_NET_F_GUEST_ANNOUNCE, VIRTIO_NET_F_MQ,
VIRTIO_NET_F_CTRL_MAC_ADDR,
VIRTIO_F_ANY_LAYOUT,
+ VIRTIO_F_VERSION_1,
};

static struct virtio_driver virtio_net_driver = {
--
MST

2014-11-24 11:54:41

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 25/41] vhost/net: force len for TX to host endian

We use native endian-ness internally but never
expose it to guest.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/net.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8dae2f7..dce5c58 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -48,15 +48,15 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
* status internally; used for zerocopy tx only.
*/
/* Lower device DMA failed */
-#define VHOST_DMA_FAILED_LEN 3
+#define VHOST_DMA_FAILED_LEN ((__force __virtio32)3)
/* Lower device DMA done */
-#define VHOST_DMA_DONE_LEN 2
+#define VHOST_DMA_DONE_LEN ((__force __virtio32)2)
/* Lower device DMA in progress */
-#define VHOST_DMA_IN_PROGRESS 1
+#define VHOST_DMA_IN_PROGRESS ((__force __virtio32)1)
/* Buffer unused */
-#define VHOST_DMA_CLEAR_LEN 0
+#define VHOST_DMA_CLEAR_LEN ((__force __virtio32)0)

-#define VHOST_DMA_IS_DONE(len) ((len) >= VHOST_DMA_DONE_LEN)
+#define VHOST_DMA_IS_DONE(len) ((__force u32)(len) >= (__force u32)VHOST_DMA_DONE_LEN)

enum {
VHOST_NET_FEATURES = VHOST_FEATURES |
--
MST

2014-11-24 11:54:46

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 26/41] vhost: virtio 1.0 endian-ness support

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/vhost.c | 93 +++++++++++++++++++++++++++++++--------------------
1 file changed, 56 insertions(+), 37 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c90f437..4d379ed 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -33,8 +33,8 @@ enum {
VHOST_MEMORY_F_LOG = 0x1,
};

-#define vhost_used_event(vq) ((u16 __user *)&vq->avail->ring[vq->num])
-#define vhost_avail_event(vq) ((u16 __user *)&vq->used->ring[vq->num])
+#define vhost_used_event(vq) ((__virtio16 __user *)&vq->avail->ring[vq->num])
+#define vhost_avail_event(vq) ((__virtio16 __user *)&vq->used->ring[vq->num])

static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
poll_table *pt)
@@ -1001,7 +1001,7 @@ EXPORT_SYMBOL_GPL(vhost_log_write);
static int vhost_update_used_flags(struct vhost_virtqueue *vq)
{
void __user *used;
- if (__put_user(vq->used_flags, &vq->used->flags) < 0)
+ if (__put_user(cpu_to_vhost16(vq, vq->used_flags), &vq->used->flags) < 0)
return -EFAULT;
if (unlikely(vq->log_used)) {
/* Make sure the flag is seen before log. */
@@ -1019,7 +1019,7 @@ static int vhost_update_used_flags(struct vhost_virtqueue *vq)

static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event)
{
- if (__put_user(vq->avail_idx, vhost_avail_event(vq)))
+ if (__put_user(cpu_to_vhost16(vq, vq->avail_idx), vhost_avail_event(vq)))
return -EFAULT;
if (unlikely(vq->log_used)) {
void __user *used;
@@ -1038,6 +1038,7 @@ static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event)

int vhost_init_used(struct vhost_virtqueue *vq)
{
+ __virtio16 last_used_idx;
int r;
if (!vq->private_data)
return 0;
@@ -1046,7 +1047,13 @@ int vhost_init_used(struct vhost_virtqueue *vq)
if (r)
return r;
vq->signalled_used_valid = false;
- return get_user(vq->last_used_idx, &vq->used->idx);
+ if (!access_ok(VERIFY_READ, &vq->used->idx, sizeof vq->used->idx))
+ return -EFAULT;
+ r = __get_user(last_used_idx, &vq->used->idx);
+ if (r)
+ return r;
+ vq->last_used_idx = vhost16_to_cpu(vq, last_used_idx);
+ return 0;
}
EXPORT_SYMBOL_GPL(vhost_init_used);

@@ -1087,16 +1094,16 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
/* Each buffer in the virtqueues is actually a chain of descriptors. This
* function returns the next descriptor in the chain,
* or -1U if we're at the end. */
-static unsigned next_desc(struct vring_desc *desc)
+static unsigned next_desc(struct vhost_virtqueue *vq, struct vring_desc *desc)
{
unsigned int next;

/* If this descriptor says it doesn't chain, we're done. */
- if (!(desc->flags & VRING_DESC_F_NEXT))
+ if (!(desc->flags & cpu_to_vhost16(vq, VRING_DESC_F_NEXT)))
return -1U;

/* Check they're not leading us off end of descriptors. */
- next = desc->next;
+ next = vhost16_to_cpu(vq, desc->next);
/* Make sure compiler knows to grab that: we don't want it changing! */
/* We will use the result as an index in an array, so most
* architectures only need a compiler barrier here. */
@@ -1113,18 +1120,19 @@ static int get_indirect(struct vhost_virtqueue *vq,
{
struct vring_desc desc;
unsigned int i = 0, count, found = 0;
+ u32 len = vhost32_to_cpu(vq, indirect->len);
int ret;

/* Sanity check */
- if (unlikely(indirect->len % sizeof desc)) {
+ if (unlikely(len % sizeof desc)) {
vq_err(vq, "Invalid length in indirect descriptor: "
"len 0x%llx not multiple of 0x%zx\n",
- (unsigned long long)indirect->len,
+ (unsigned long long)vhost32_to_cpu(vq, indirect->len),
sizeof desc);
return -EINVAL;
}

- ret = translate_desc(vq, indirect->addr, indirect->len, vq->indirect,
+ ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, vq->indirect,
UIO_MAXIOV);
if (unlikely(ret < 0)) {
vq_err(vq, "Translation failure %d in indirect.\n", ret);
@@ -1135,7 +1143,7 @@ static int get_indirect(struct vhost_virtqueue *vq,
* architectures only need a compiler barrier here. */
read_barrier_depends();

- count = indirect->len / sizeof desc;
+ count = len / sizeof desc;
/* Buffers are chained via a 16 bit next field, so
* we can have at most 2^16 of these. */
if (unlikely(count > USHRT_MAX + 1)) {
@@ -1155,16 +1163,17 @@ static int get_indirect(struct vhost_virtqueue *vq,
if (unlikely(memcpy_fromiovec((unsigned char *)&desc,
vq->indirect, sizeof desc))) {
vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n",
- i, (size_t)indirect->addr + i * sizeof desc);
+ i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc);
return -EINVAL;
}
- if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
+ if (unlikely(desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT))) {
vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n",
- i, (size_t)indirect->addr + i * sizeof desc);
+ i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc);
return -EINVAL;
}

- ret = translate_desc(vq, desc.addr, desc.len, iov + iov_count,
+ ret = translate_desc(vq, vhost64_to_cpu(vq, desc.addr),
+ vhost32_to_cpu(vq, desc.len), iov + iov_count,
iov_size - iov_count);
if (unlikely(ret < 0)) {
vq_err(vq, "Translation failure %d indirect idx %d\n",
@@ -1172,11 +1181,11 @@ static int get_indirect(struct vhost_virtqueue *vq,
return ret;
}
/* If this is an input descriptor, increment that count. */
- if (desc.flags & VRING_DESC_F_WRITE) {
+ if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_WRITE)) {
*in_num += ret;
if (unlikely(log)) {
- log[*log_num].addr = desc.addr;
- log[*log_num].len = desc.len;
+ log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
+ log[*log_num].len = vhost32_to_cpu(vq, desc.len);
++*log_num;
}
} else {
@@ -1189,7 +1198,7 @@ static int get_indirect(struct vhost_virtqueue *vq,
}
*out_num += ret;
}
- } while ((i = next_desc(&desc)) != -1);
+ } while ((i = next_desc(vq, &desc)) != -1);
return 0;
}

@@ -1209,15 +1218,18 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
struct vring_desc desc;
unsigned int i, head, found = 0;
u16 last_avail_idx;
+ __virtio16 avail_idx;
+ __virtio16 ring_head;
int ret;

/* Check it isn't doing very strange things with descriptor numbers. */
last_avail_idx = vq->last_avail_idx;
- if (unlikely(__get_user(vq->avail_idx, &vq->avail->idx))) {
+ if (unlikely(__get_user(avail_idx, &vq->avail->idx))) {
vq_err(vq, "Failed to access avail idx at %p\n",
&vq->avail->idx);
return -EFAULT;
}
+ vq->avail_idx = vhost16_to_cpu(vq, avail_idx);

if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
vq_err(vq, "Guest moved used index from %u to %u",
@@ -1234,7 +1246,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,

/* Grab the next descriptor number they're advertising, and increment
* the index we've seen. */
- if (unlikely(__get_user(head,
+ if (unlikely(__get_user(ring_head,
&vq->avail->ring[last_avail_idx % vq->num]))) {
vq_err(vq, "Failed to read head: idx %d address %p\n",
last_avail_idx,
@@ -1242,6 +1254,8 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
return -EFAULT;
}

+ head = vhost16_to_cpu(vq, ring_head);
+
/* If their number is silly, that's an error. */
if (unlikely(head >= vq->num)) {
vq_err(vq, "Guest says index %u > %u is available",
@@ -1274,7 +1288,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
i, vq->desc + i);
return -EFAULT;
}
- if (desc.flags & VRING_DESC_F_INDIRECT) {
+ if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT)) {
ret = get_indirect(vq, iov, iov_size,
out_num, in_num,
log, log_num, &desc);
@@ -1286,20 +1300,21 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
continue;
}

- ret = translate_desc(vq, desc.addr, desc.len, iov + iov_count,
+ ret = translate_desc(vq, vhost64_to_cpu(vq, desc.addr),
+ vhost32_to_cpu(vq, desc.len), iov + iov_count,
iov_size - iov_count);
if (unlikely(ret < 0)) {
vq_err(vq, "Translation failure %d descriptor idx %d\n",
ret, i);
return ret;
}
- if (desc.flags & VRING_DESC_F_WRITE) {
+ if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_WRITE)) {
/* If this is an input descriptor,
* increment that count. */
*in_num += ret;
if (unlikely(log)) {
- log[*log_num].addr = desc.addr;
- log[*log_num].len = desc.len;
+ log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
+ log[*log_num].len = vhost32_to_cpu(vq, desc.len);
++*log_num;
}
} else {
@@ -1312,7 +1327,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
}
*out_num += ret;
}
- } while ((i = next_desc(&desc)) != -1);
+ } while ((i = next_desc(vq, &desc)) != -1);

/* On success, increment avail index. */
vq->last_avail_idx++;
@@ -1335,7 +1350,10 @@ EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
* want to notify the guest, using eventfd. */
int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
{
- struct vring_used_elem heads = { head, len };
+ struct vring_used_elem heads = {
+ cpu_to_vhost32(vq, head),
+ cpu_to_vhost32(vq, len)
+ };

return vhost_add_used_n(vq, &heads, 1);
}
@@ -1404,7 +1422,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,

/* Make sure buffer is written before we update index. */
smp_wmb();
- if (put_user(vq->last_used_idx, &vq->used->idx)) {
+ if (__put_user(cpu_to_vhost16(vq, vq->last_used_idx), &vq->used->idx)) {
vq_err(vq, "Failed to increment used idx");
return -EFAULT;
}
@@ -1422,7 +1440,8 @@ EXPORT_SYMBOL_GPL(vhost_add_used_n);

static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
{
- __u16 old, new, event;
+ __u16 old, new;
+ __virtio16 event;
bool v;
/* Flush out used index updates. This is paired
* with the barrier that the Guest executes when enabling
@@ -1434,12 +1453,12 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
return true;

if (!vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) {
- __u16 flags;
+ __virtio16 flags;
if (__get_user(flags, &vq->avail->flags)) {
vq_err(vq, "Failed to get flags");
return true;
}
- return !(flags & VRING_AVAIL_F_NO_INTERRUPT);
+ return !(flags & cpu_to_vhost16(vq, VRING_AVAIL_F_NO_INTERRUPT));
}
old = vq->signalled_used;
v = vq->signalled_used_valid;
@@ -1449,11 +1468,11 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
if (unlikely(!v))
return true;

- if (get_user(event, vhost_used_event(vq))) {
+ if (__get_user(event, vhost_used_event(vq))) {
vq_err(vq, "Failed to get used event idx");
return true;
}
- return vring_need_event(event, new, old);
+ return vring_need_event(vhost16_to_cpu(vq, event), new, old);
}

/* This actually signals the guest, using eventfd. */
@@ -1488,7 +1507,7 @@ EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n);
/* OK, now we need to know about added descriptors. */
bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
{
- u16 avail_idx;
+ __virtio16 avail_idx;
int r;

if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
@@ -1519,7 +1538,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
return false;
}

- return avail_idx != vq->avail_idx;
+ return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
}
EXPORT_SYMBOL_GPL(vhost_enable_notify);

--
MST

2014-11-24 11:54:58

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 29/41] vhost/net: larger header for virtio 1.0

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index cae22f9..1ac58d0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1027,7 +1027,8 @@ static int vhost_net_set_features(struct vhost_net *n, u64 features)
size_t vhost_hlen, sock_hlen, hdr_len;
int i;

- hdr_len = (features & (1 << VIRTIO_NET_F_MRG_RXBUF)) ?
+ hdr_len = (features & ((1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+ (1ULL << VIRTIO_F_VERSION_1))) ?
sizeof(struct virtio_net_hdr_mrg_rxbuf) :
sizeof(struct virtio_net_hdr);
if (features & (1 << VHOST_NET_F_VIRTIO_NET_HDR)) {
--
MST

2014-11-24 11:55:05

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 31/41] vhost/net: suppress compiler warning

len is always initialized since function is called with size > 0.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/net.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 984242e..54ffbb0 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -501,7 +501,7 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
int headcount = 0;
unsigned d;
int r, nlogs = 0;
- u32 len;
+ u32 uninitialized_var(len);

while (datalen > 0 && headcount < quota) {
if (unlikely(seg >= UIO_MAXIOV)) {
--
MST

2014-11-24 11:54:52

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 28/41] vhost/net: virtio 1.0 byte swap

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/net.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index dce5c58..cae22f9 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -416,7 +416,7 @@ static void handle_tx(struct vhost_net *net)
struct ubuf_info *ubuf;
ubuf = nvq->ubuf_info + nvq->upend_idx;

- vq->heads[nvq->upend_idx].id = head;
+ vq->heads[nvq->upend_idx].id = cpu_to_vhost32(vq, head);
vq->heads[nvq->upend_idx].len = VHOST_DMA_IN_PROGRESS;
ubuf->callback = vhost_zerocopy_callback;
ubuf->ctx = nvq->ubufs;
@@ -500,6 +500,7 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
int headcount = 0;
unsigned d;
int r, nlogs = 0;
+ u32 len;

while (datalen > 0 && headcount < quota) {
if (unlikely(seg >= UIO_MAXIOV)) {
@@ -527,13 +528,14 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
nlogs += *log_num;
log += *log_num;
}
- heads[headcount].id = d;
- heads[headcount].len = iov_length(vq->iov + seg, in);
- datalen -= heads[headcount].len;
+ heads[headcount].id = cpu_to_vhost32(vq, d);
+ len = iov_length(vq->iov + seg, in);
+ heads[headcount].len = cpu_to_vhost32(vq, len);
+ datalen -= len;
++headcount;
seg += in;
}
- heads[headcount - 1].len += datalen;
+ heads[headcount - 1].len = cpu_to_vhost32(vq, len - datalen);
*iovcount = seg;
if (unlikely(log))
*log_num = nlogs;
--
MST

2014-11-24 11:55:21

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 34/41] tun: add VNET_LE flag

virtio 1.0 modified virtio net header format,
making all fields little endian.

Users can tweak header format before submitting it to tun,
but this means more data copies where none were necessary.
And if the iovec is in RO memory, this means we might
need to split iovec also means we might in theory overflow
iovec max size.

This patch adds a simpler way for applications to handle this,
using new "little endian" flag in tun.
As a result, tun simply byte-swaps header fields as appropriate.
This is a NOP on LE architectures.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/uapi/linux/if_tun.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index 277a260..18b2403 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -57,6 +57,7 @@
#define IFF_ONE_QUEUE 0x2000
#define IFF_VNET_HDR 0x4000
#define IFF_TUN_EXCL 0x8000
+#define IFF_VNET_LE 0x10000
#define IFF_MULTI_QUEUE 0x0100
#define IFF_ATTACH_QUEUE 0x0200
#define IFF_DETACH_QUEUE 0x0400
--
MST

2014-11-24 11:55:39

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 35/41] tun: TUN_VNET_LE support, fix sparse warnings for virtio headers

Pretty straight-forward: convert all fields to/from
virtio endian-ness.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/tun.c | 48 +++++++++++++++++++++++++++++-------------------
1 file changed, 29 insertions(+), 19 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 61c000c..f411ffd 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -111,7 +111,7 @@ do { \
#define TUN_FASYNC IFF_ATTACH_QUEUE

#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
- IFF_MULTI_QUEUE)
+ IFF_VNET_LE | IFF_MULTI_QUEUE)
#define GOODCOPY_LEN 128

#define FLT_EXACT_COUNT 8
@@ -205,6 +205,16 @@ struct tun_struct {
u32 flow_count;
};

+static inline u16 tun16_to_cpu(struct tun_struct *tun, __virtio16 val)
+{
+ return __virtio16_to_cpu(tun->flags & IFF_VNET_LE, val);
+}
+
+static inline __virtio16 cpu_to_tun16(struct tun_struct *tun, u16 val)
+{
+ return __cpu_to_virtio16(tun->flags & IFF_VNET_LE, val);
+}
+
static inline u32 tun_hashfn(u32 rxhash)
{
return rxhash & 0x3ff;
@@ -1053,10 +1063,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
return -EFAULT;

if ((gso.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
- gso.csum_start + gso.csum_offset + 2 > gso.hdr_len)
- gso.hdr_len = gso.csum_start + gso.csum_offset + 2;
+ tun16_to_cpu(tun, gso.csum_start) + tun16_to_cpu(tun, gso.csum_offset) + 2 > tun16_to_cpu(tun, gso.hdr_len))
+ gso.hdr_len = cpu_to_tun16(tun, tun16_to_cpu(tun, gso.csum_start) + tun16_to_cpu(tun, gso.csum_offset) + 2);

- if (gso.hdr_len > len)
+ if (tun16_to_cpu(tun, gso.hdr_len) > len)
return -EINVAL;
offset += tun->vnet_hdr_sz;
}
@@ -1064,7 +1074,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
if ((tun->flags & TUN_TYPE_MASK) == IFF_TAP) {
align += NET_IP_ALIGN;
if (unlikely(len < ETH_HLEN ||
- (gso.hdr_len && gso.hdr_len < ETH_HLEN)))
+ (gso.hdr_len && tun16_to_cpu(tun, gso.hdr_len) < ETH_HLEN)))
return -EINVAL;
}

@@ -1075,7 +1085,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
* enough room for skb expand head in case it is used.
* The rest of the buffer is mapped from userspace.
*/
- copylen = gso.hdr_len ? gso.hdr_len : GOODCOPY_LEN;
+ copylen = gso.hdr_len ? tun16_to_cpu(tun, gso.hdr_len) : GOODCOPY_LEN;
if (copylen > good_linear)
copylen = good_linear;
linear = copylen;
@@ -1085,10 +1095,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,

if (!zerocopy) {
copylen = len;
- if (gso.hdr_len > good_linear)
+ if (tun16_to_cpu(tun, gso.hdr_len) > good_linear)
linear = good_linear;
else
- linear = gso.hdr_len;
+ linear = tun16_to_cpu(tun, gso.hdr_len);
}

skb = tun_alloc_skb(tfile, align, copylen, linear, noblock);
@@ -1115,8 +1125,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
}

if (gso.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
- if (!skb_partial_csum_set(skb, gso.csum_start,
- gso.csum_offset)) {
+ if (!skb_partial_csum_set(skb, tun16_to_cpu(tun, gso.csum_start),
+ tun16_to_cpu(tun, gso.csum_offset))) {
tun->dev->stats.rx_frame_errors++;
kfree_skb(skb);
return -EINVAL;
@@ -1184,7 +1194,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
if (gso.gso_type & VIRTIO_NET_HDR_GSO_ECN)
skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;

- skb_shinfo(skb)->gso_size = gso.gso_size;
+ skb_shinfo(skb)->gso_size = tun16_to_cpu(tun, gso.gso_size);
if (skb_shinfo(skb)->gso_size == 0) {
tun->dev->stats.rx_frame_errors++;
kfree_skb(skb);
@@ -1276,8 +1286,8 @@ static ssize_t tun_put_user(struct tun_struct *tun,
struct skb_shared_info *sinfo = skb_shinfo(skb);

/* This is a hint as to how much should be linear. */
- gso.hdr_len = skb_headlen(skb);
- gso.gso_size = sinfo->gso_size;
+ gso.hdr_len = cpu_to_tun16(tun, skb_headlen(skb));
+ gso.gso_size = cpu_to_tun16(tun, sinfo->gso_size);
if (sinfo->gso_type & SKB_GSO_TCPV4)
gso.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (sinfo->gso_type & SKB_GSO_TCPV6)
@@ -1285,12 +1295,12 @@ static ssize_t tun_put_user(struct tun_struct *tun,
else {
pr_err("unexpected GSO type: "
"0x%x, gso_size %d, hdr_len %d\n",
- sinfo->gso_type, gso.gso_size,
- gso.hdr_len);
+ sinfo->gso_type, tun16_to_cpu(tun, gso.gso_size),
+ tun16_to_cpu(tun, gso.hdr_len));
print_hex_dump(KERN_ERR, "tun: ",
DUMP_PREFIX_NONE,
16, 1, skb->head,
- min((int)gso.hdr_len, 64), true);
+ min((int)tun16_to_cpu(tun, gso.hdr_len), 64), true);
WARN_ON_ONCE(1);
return -EINVAL;
}
@@ -1301,9 +1311,9 @@ static ssize_t tun_put_user(struct tun_struct *tun,

if (skb->ip_summed == CHECKSUM_PARTIAL) {
gso.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
- gso.csum_start = skb_checksum_start_offset(skb) +
- vlan_hlen;
- gso.csum_offset = skb->csum_offset;
+ gso.csum_start = cpu_to_tun16(tun, skb_checksum_start_offset(skb) +
+ vlan_hlen);
+ gso.csum_offset = cpu_to_tun16(tun, skb->csum_offset);
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
gso.flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
--
MST

2014-11-24 11:55:48

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 40/41] vhost/scsi: partial virtio 1.0 support

Include all endian conversions as required by virtio 1.0.
Don't set virtio 1.0 yet, since that requires ANY_LAYOUT
which we don't yet support.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/scsi.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index a17f118..01c01cb 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -168,6 +168,7 @@ enum {
VHOST_SCSI_VQ_IO = 2,
};

+/* Note: can't set VIRTIO_F_VERSION_1 yet, since that implies ANY_LAYOUT. */
enum {
VHOST_SCSI_FEATURES = VHOST_FEATURES | (1ULL << VIRTIO_SCSI_F_HOTPLUG) |
(1ULL << VIRTIO_SCSI_F_T10_PI)
@@ -577,8 +578,8 @@ tcm_vhost_allocate_evt(struct vhost_scsi *vs,
return NULL;
}

- evt->event.event = event;
- evt->event.reason = reason;
+ evt->event.event = cpu_to_vhost32(vq, event);
+ evt->event.reason = cpu_to_vhost32(vq, reason);
vs->vs_events_nr++;

return evt;
@@ -636,7 +637,7 @@ again:
}

if (vs->vs_events_missed) {
- event->event |= VIRTIO_SCSI_T_EVENTS_MISSED;
+ event->event |= cpu_to_vhost32(vq, VIRTIO_SCSI_T_EVENTS_MISSED);
vs->vs_events_missed = false;
}

@@ -695,12 +696,13 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work)
cmd, se_cmd->residual_count, se_cmd->scsi_status);

memset(&v_rsp, 0, sizeof(v_rsp));
- v_rsp.resid = se_cmd->residual_count;
+ v_rsp.resid = cpu_to_vhost32(cmd->tvc_vq, se_cmd->residual_count);
/* TODO is status_qualifier field needed? */
v_rsp.status = se_cmd->scsi_status;
- v_rsp.sense_len = se_cmd->scsi_sense_length;
+ v_rsp.sense_len = cpu_to_vhost32(cmd->tvc_vq,
+ se_cmd->scsi_sense_length);
memcpy(v_rsp.sense, cmd->tvc_sense_buf,
- v_rsp.sense_len);
+ se_cmd->scsi_sense_length);
ret = copy_to_user(cmd->tvc_resp, &v_rsp, sizeof(v_rsp));
if (likely(ret == 0)) {
struct vhost_scsi_virtqueue *q;
@@ -1095,14 +1097,14 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
", but wrong data_direction\n");
goto err_cmd;
}
- prot_bytes = v_req_pi.pi_bytesout;
+ prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesout);
} else if (v_req_pi.pi_bytesin) {
if (data_direction != DMA_FROM_DEVICE) {
vq_err(vq, "Received non zero di_pi_niov"
", but wrong data_direction\n");
goto err_cmd;
}
- prot_bytes = v_req_pi.pi_bytesin;
+ prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesin);
}
if (prot_bytes) {
int tmp = 0;
@@ -1117,12 +1119,12 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct vhost_virtqueue *vq)
data_first += prot_niov;
data_niov = data_num - prot_niov;
}
- tag = v_req_pi.tag;
+ tag = vhost64_to_cpu(vq, v_req_pi.tag);
task_attr = v_req_pi.task_attr;
cdb = &v_req_pi.cdb[0];
lun = ((v_req_pi.lun[2] << 8) | v_req_pi.lun[3]) & 0x3FFF;
} else {
- tag = v_req.tag;
+ tag = vhost64_to_cpu(vq, v_req.tag);
task_attr = v_req.task_attr;
cdb = &v_req.cdb[0];
lun = ((v_req.lun[2] << 8) | v_req.lun[3]) & 0x3FFF;
--
MST

2014-11-24 11:55:54

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 41/41] af_packet: virtio 1.0 stubs

This merely fixes sparse warnings, without actually
adding support for the new APIs.

Still working out the best way to enable the new
functionality.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
net/packet/af_packet.c | 35 ++++++++++++++++++++++-------------
1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 87d20f4..d4a877e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2444,13 +2444,15 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
goto out_unlock;

if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
- (vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
- vnet_hdr.hdr_len))
- vnet_hdr.hdr_len = vnet_hdr.csum_start +
- vnet_hdr.csum_offset + 2;
+ (__virtio16_to_cpu(false, vnet_hdr.csum_start) +
+ __virtio16_to_cpu(false, vnet_hdr.csum_offset) + 2 >
+ __virtio16_to_cpu(false, vnet_hdr.hdr_len)))
+ vnet_hdr.hdr_len = __cpu_to_virtio16(false,
+ __virtio16_to_cpu(false, vnet_hdr.csum_start) +
+ __virtio16_to_cpu(false, vnet_hdr.csum_offset) + 2);

err = -EINVAL;
- if (vnet_hdr.hdr_len > len)
+ if (__virtio16_to_cpu(false, vnet_hdr.hdr_len) > len)
goto out_unlock;

if (vnet_hdr.gso_type != VIRTIO_NET_HDR_GSO_NONE) {
@@ -2492,7 +2494,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
err = -ENOBUFS;
hlen = LL_RESERVED_SPACE(dev);
tlen = dev->needed_tailroom;
- skb = packet_alloc_skb(sk, hlen + tlen, hlen, len, vnet_hdr.hdr_len,
+ skb = packet_alloc_skb(sk, hlen + tlen, hlen, len,
+ __virtio16_to_cpu(false, vnet_hdr.hdr_len),
msg->msg_flags & MSG_DONTWAIT, &err);
if (skb == NULL)
goto out_unlock;
@@ -2534,14 +2537,16 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)

if (po->has_vnet_hdr) {
if (vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
- if (!skb_partial_csum_set(skb, vnet_hdr.csum_start,
- vnet_hdr.csum_offset)) {
+ u16 s = __virtio16_to_cpu(false, vnet_hdr.csum_start);
+ u16 o = __virtio16_to_cpu(false, vnet_hdr.csum_offset);
+ if (!skb_partial_csum_set(skb, s, o)) {
err = -EINVAL;
goto out_free;
}
}

- skb_shinfo(skb)->gso_size = vnet_hdr.gso_size;
+ skb_shinfo(skb)->gso_size =
+ __virtio16_to_cpu(false, vnet_hdr.gso_size);
skb_shinfo(skb)->gso_type = gso_type;

/* Header must be checked, and gso_segs computed. */
@@ -2912,8 +2917,10 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
struct skb_shared_info *sinfo = skb_shinfo(skb);

/* This is a hint as to how much should be linear. */
- vnet_hdr.hdr_len = skb_headlen(skb);
- vnet_hdr.gso_size = sinfo->gso_size;
+ vnet_hdr.hdr_len =
+ __cpu_to_virtio16(false, skb_headlen(skb));
+ vnet_hdr.gso_size =
+ __cpu_to_virtio16(false, sinfo->gso_size);
if (sinfo->gso_type & SKB_GSO_TCPV4)
vnet_hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (sinfo->gso_type & SKB_GSO_TCPV6)
@@ -2931,8 +2938,10 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,

if (skb->ip_summed == CHECKSUM_PARTIAL) {
vnet_hdr.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
- vnet_hdr.csum_start = skb_checksum_start_offset(skb);
- vnet_hdr.csum_offset = skb->csum_offset;
+ vnet_hdr.csum_start = __cpu_to_virtio16(false,
+ skb_checksum_start_offset(skb));
+ vnet_hdr.csum_offset = __cpu_to_virtio16(false,
+ skb->csum_offset);
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr.flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
--
MST

2014-11-24 11:56:01

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 39/41] virtio_scsi: export to userspace

Replace uXX by __uXX and _packed by __attribute((packed))
as seems to be the norm for userspace headers.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/uapi/linux/virtio_scsi.h | 74 ++++++++++++++++++++--------------------
include/uapi/linux/Kbuild | 1 +
2 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/include/uapi/linux/virtio_scsi.h b/include/uapi/linux/virtio_scsi.h
index af44864..42b9370 100644
--- a/include/uapi/linux/virtio_scsi.h
+++ b/include/uapi/linux/virtio_scsi.h
@@ -34,78 +34,78 @@

/* SCSI command request, followed by data-out */
struct virtio_scsi_cmd_req {
- u8 lun[8]; /* Logical Unit Number */
+ __u8 lun[8]; /* Logical Unit Number */
__virtio64 tag; /* Command identifier */
- u8 task_attr; /* Task attribute */
- u8 prio; /* SAM command priority field */
- u8 crn;
- u8 cdb[VIRTIO_SCSI_CDB_SIZE];
-} __packed;
+ __u8 task_attr; /* Task attribute */
+ __u8 prio; /* SAM command priority field */
+ __u8 crn;
+ __u8 cdb[VIRTIO_SCSI_CDB_SIZE];
+} __attribute__((packed));

/* SCSI command request, followed by protection information */
struct virtio_scsi_cmd_req_pi {
- u8 lun[8]; /* Logical Unit Number */
+ __u8 lun[8]; /* Logical Unit Number */
__virtio64 tag; /* Command identifier */
- u8 task_attr; /* Task attribute */
- u8 prio; /* SAM command priority field */
- u8 crn;
+ __u8 task_attr; /* Task attribute */
+ __u8 prio; /* SAM command priority field */
+ __u8 crn;
__virtio32 pi_bytesout; /* DataOUT PI Number of bytes */
__virtio32 pi_bytesin; /* DataIN PI Number of bytes */
- u8 cdb[VIRTIO_SCSI_CDB_SIZE];
-} __packed;
+ __u8 cdb[VIRTIO_SCSI_CDB_SIZE];
+} __attribute__((packed));

/* Response, followed by sense data and data-in */
struct virtio_scsi_cmd_resp {
__virtio32 sense_len; /* Sense data length */
__virtio32 resid; /* Residual bytes in data buffer */
__virtio16 status_qualifier; /* Status qualifier */
- u8 status; /* Command completion status */
- u8 response; /* Response values */
- u8 sense[VIRTIO_SCSI_SENSE_SIZE];
-} __packed;
+ __u8 status; /* Command completion status */
+ __u8 response; /* Response values */
+ __u8 sense[VIRTIO_SCSI_SENSE_SIZE];
+} __attribute__((packed));

/* Task Management Request */
struct virtio_scsi_ctrl_tmf_req {
__virtio32 type;
__virtio32 subtype;
- u8 lun[8];
+ __u8 lun[8];
__virtio64 tag;
-} __packed;
+} __attribute__((packed));

struct virtio_scsi_ctrl_tmf_resp {
- u8 response;
-} __packed;
+ __u8 response;
+} __attribute__((packed));

/* Asynchronous notification query/subscription */
struct virtio_scsi_ctrl_an_req {
__virtio32 type;
- u8 lun[8];
+ __u8 lun[8];
__virtio32 event_requested;
-} __packed;
+} __attribute__((packed));

struct virtio_scsi_ctrl_an_resp {
__virtio32 event_actual;
- u8 response;
-} __packed;
+ __u8 response;
+} __attribute__((packed));

struct virtio_scsi_event {
__virtio32 event;
- u8 lun[8];
+ __u8 lun[8];
__virtio32 reason;
-} __packed;
+} __attribute__((packed));

struct virtio_scsi_config {
- u32 num_queues;
- u32 seg_max;
- u32 max_sectors;
- u32 cmd_per_lun;
- u32 event_info_size;
- u32 sense_size;
- u32 cdb_size;
- u16 max_channel;
- u16 max_target;
- u32 max_lun;
-} __packed;
+ __u32 num_queues;
+ __u32 seg_max;
+ __u32 max_sectors;
+ __u32 cmd_per_lun;
+ __u32 event_info_size;
+ __u32 sense_size;
+ __u32 cdb_size;
+ __u16 max_channel;
+ __u16 max_target;
+ __u32 max_lun;
+} __attribute__((packed));

/* Feature Bits */
#define VIRTIO_SCSI_F_INOUT 0
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 44a5581..e61947c 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -428,6 +428,7 @@ header-y += virtio_net.h
header-y += virtio_pci.h
header-y += virtio_ring.h
header-y += virtio_rng.h
+header-y += virtio_scsi.h
header=y += vm_sockets.h
header-y += vt.h
header-y += wait.h
--
MST

2014-11-24 11:55:37

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 36/41] macvtap: TUN_VNET_HDR support

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/macvtap.c | 68 ++++++++++++++++++++++++++++++++-------------------
1 file changed, 43 insertions(+), 25 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 880cc09..af90ab5 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -45,6 +45,18 @@ struct macvtap_queue {
struct list_head next;
};

+#define MACVTAP_FEATURES (IFF_VNET_HDR | IFF_VNET_LE | IFF_MULTI_QUEUE)
+
+static inline u16 macvtap16_to_cpu(struct macvtap_queue *q, __virtio16 val)
+{
+ return __virtio16_to_cpu(q->flags & IFF_VNET_LE, val);
+}
+
+static inline __virtio16 cpu_to_macvtap16(struct macvtap_queue *q, u16 val)
+{
+ return __cpu_to_virtio16(q->flags & IFF_VNET_LE, val);
+}
+
static struct proto macvtap_proto = {
.name = "macvtap",
.owner = THIS_MODULE,
@@ -557,7 +569,8 @@ static inline struct sk_buff *macvtap_alloc_skb(struct sock *sk, size_t prepad,
* macvtap_skb_from_vnet_hdr and macvtap_skb_to_vnet_hdr should
* be shared with the tun/tap driver.
*/
-static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
+static int macvtap_skb_from_vnet_hdr(struct macvtap_queue *q,
+ struct sk_buff *skb,
struct virtio_net_hdr *vnet_hdr)
{
unsigned short gso_type = 0;
@@ -588,13 +601,13 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
}

if (vnet_hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
- if (!skb_partial_csum_set(skb, vnet_hdr->csum_start,
- vnet_hdr->csum_offset))
+ if (!skb_partial_csum_set(skb, macvtap16_to_cpu(q, vnet_hdr->csum_start),
+ macvtap16_to_cpu(q, vnet_hdr->csum_offset)))
return -EINVAL;
}

if (vnet_hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
- skb_shinfo(skb)->gso_size = vnet_hdr->gso_size;
+ skb_shinfo(skb)->gso_size = macvtap16_to_cpu(q, vnet_hdr->gso_size);
skb_shinfo(skb)->gso_type = gso_type;

/* Header must be checked, and gso_segs computed. */
@@ -604,8 +617,9 @@ static int macvtap_skb_from_vnet_hdr(struct sk_buff *skb,
return 0;
}

-static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
- struct virtio_net_hdr *vnet_hdr)
+static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
+ const struct sk_buff *skb,
+ struct virtio_net_hdr *vnet_hdr)
{
memset(vnet_hdr, 0, sizeof(*vnet_hdr));

@@ -613,8 +627,8 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
struct skb_shared_info *sinfo = skb_shinfo(skb);

/* This is a hint as to how much should be linear. */
- vnet_hdr->hdr_len = skb_headlen(skb);
- vnet_hdr->gso_size = sinfo->gso_size;
+ vnet_hdr->hdr_len = cpu_to_macvtap16(q, skb_headlen(skb));
+ vnet_hdr->gso_size = cpu_to_macvtap16(q, sinfo->gso_size);
if (sinfo->gso_type & SKB_GSO_TCPV4)
vnet_hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (sinfo->gso_type & SKB_GSO_TCPV6)
@@ -628,10 +642,13 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,

if (skb->ip_summed == CHECKSUM_PARTIAL) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
- vnet_hdr->csum_start = skb_checksum_start_offset(skb);
if (vlan_tx_tag_present(skb))
- vnet_hdr->csum_start += VLAN_HLEN;
- vnet_hdr->csum_offset = skb->csum_offset;
+ vnet_hdr->csum_start = cpu_to_macvtap16(q,
+ skb_checksum_start_offset(skb) + VLAN_HLEN);
+ else
+ vnet_hdr->csum_start = cpu_to_macvtap16(q,
+ skb_checksum_start_offset(skb));
+ vnet_hdr->csum_offset = cpu_to_macvtap16(q, skb->csum_offset);
} else if (skb->ip_summed == CHECKSUM_UNNECESSARY) {
vnet_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
@@ -666,12 +683,14 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
if (err < 0)
goto err;
if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
- vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
- vnet_hdr.hdr_len)
- vnet_hdr.hdr_len = vnet_hdr.csum_start +
- vnet_hdr.csum_offset + 2;
+ macvtap16_to_cpu(q, vnet_hdr.csum_start) +
+ macvtap16_to_cpu(q, vnet_hdr.csum_offset) + 2 >
+ macvtap16_to_cpu(q, vnet_hdr.hdr_len))
+ vnet_hdr.hdr_len = cpu_to_macvtap16(q,
+ macvtap16_to_cpu(q, vnet_hdr.csum_start) +
+ macvtap16_to_cpu(q, vnet_hdr.csum_offset) + 2);
err = -EINVAL;
- if (vnet_hdr.hdr_len > len)
+ if (macvtap16_to_cpu(q, vnet_hdr.hdr_len) > len)
goto err;
}

@@ -684,7 +703,8 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
goto err;

if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
- copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
+ copylen = vnet_hdr.hdr_len ?
+ macvtap16_to_cpu(q, vnet_hdr.hdr_len) : GOODCOPY_LEN;
if (copylen > good_linear)
copylen = good_linear;
linear = copylen;
@@ -695,10 +715,10 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,

if (!zerocopy) {
copylen = len;
- if (vnet_hdr.hdr_len > good_linear)
+ if (macvtap16_to_cpu(q, vnet_hdr.hdr_len) > good_linear)
linear = good_linear;
else
- linear = vnet_hdr.hdr_len;
+ linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
}

skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
@@ -725,7 +745,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
skb->protocol = eth_hdr(skb)->h_proto;

if (vnet_hdr_len) {
- err = macvtap_skb_from_vnet_hdr(skb, &vnet_hdr);
+ err = macvtap_skb_from_vnet_hdr(q, skb, &vnet_hdr);
if (err)
goto err_kfree;
}
@@ -791,7 +811,7 @@ static ssize_t macvtap_put_user(struct macvtap_queue *q,
if ((len -= vnet_hdr_len) < 0)
return -EINVAL;

- macvtap_skb_to_vnet_hdr(skb, &vnet_hdr);
+ macvtap_skb_to_vnet_hdr(q, skb, &vnet_hdr);

if (memcpy_toiovecend(iv, (void *)&vnet_hdr, 0, sizeof(vnet_hdr)))
return -EFAULT;
@@ -1003,8 +1023,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
return -EFAULT;

ret = 0;
- if ((u & ~(IFF_VNET_HDR | IFF_MULTI_QUEUE)) !=
- (IFF_NO_PI | IFF_TAP))
+ if ((u & ~MACVTAP_FEATURES) != (IFF_NO_PI | IFF_TAP))
ret = -EINVAL;
else
q->flags = u;
@@ -1036,8 +1055,7 @@ static long macvtap_ioctl(struct file *file, unsigned int cmd,
return ret;

case TUNGETFEATURES:
- if (put_user(IFF_TAP | IFF_NO_PI | IFF_VNET_HDR |
- IFF_MULTI_QUEUE, up))
+ if (put_user(IFF_TAP | IFF_NO_PI | MACVTAP_FEATURES, up))
return -EFAULT;
return 0;

--
MST

2014-11-24 11:55:36

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 38/41] virtio_scsi: move to uapi

Guests need to use virtio scsi API, so export it to uapi,
nice to e.g. qemu and will help us remember this file
affects ABI.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/{ => uapi}/linux/virtio_scsi.h | 0
1 file changed, 0 insertions(+), 0 deletions(-)
rename include/{ => uapi}/linux/virtio_scsi.h (100%)

diff --git a/include/linux/virtio_scsi.h b/include/uapi/linux/virtio_scsi.h
similarity index 100%
rename from include/linux/virtio_scsi.h
rename to include/uapi/linux/virtio_scsi.h
--
MST

2014-11-24 11:57:23

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 37/41] virtio_scsi: v1.0 support

Note: for consistency, and to avoid sparse errors,
convert all fields, even those no longer in use
for virtio v1.0.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio_scsi.h | 32 +++++++++++++++-------------
drivers/scsi/virtio_scsi.c | 51 ++++++++++++++++++++++++++++-----------------
2 files changed, 49 insertions(+), 34 deletions(-)

diff --git a/include/linux/virtio_scsi.h b/include/linux/virtio_scsi.h
index de429d1..af44864 100644
--- a/include/linux/virtio_scsi.h
+++ b/include/linux/virtio_scsi.h
@@ -27,13 +27,15 @@
#ifndef _LINUX_VIRTIO_SCSI_H
#define _LINUX_VIRTIO_SCSI_H

+#include <linux/virtio_types.h>
+
#define VIRTIO_SCSI_CDB_SIZE 32
#define VIRTIO_SCSI_SENSE_SIZE 96

/* SCSI command request, followed by data-out */
struct virtio_scsi_cmd_req {
u8 lun[8]; /* Logical Unit Number */
- u64 tag; /* Command identifier */
+ __virtio64 tag; /* Command identifier */
u8 task_attr; /* Task attribute */
u8 prio; /* SAM command priority field */
u8 crn;
@@ -43,20 +45,20 @@ struct virtio_scsi_cmd_req {
/* SCSI command request, followed by protection information */
struct virtio_scsi_cmd_req_pi {
u8 lun[8]; /* Logical Unit Number */
- u64 tag; /* Command identifier */
+ __virtio64 tag; /* Command identifier */
u8 task_attr; /* Task attribute */
u8 prio; /* SAM command priority field */
u8 crn;
- u32 pi_bytesout; /* DataOUT PI Number of bytes */
- u32 pi_bytesin; /* DataIN PI Number of bytes */
+ __virtio32 pi_bytesout; /* DataOUT PI Number of bytes */
+ __virtio32 pi_bytesin; /* DataIN PI Number of bytes */
u8 cdb[VIRTIO_SCSI_CDB_SIZE];
} __packed;

/* Response, followed by sense data and data-in */
struct virtio_scsi_cmd_resp {
- u32 sense_len; /* Sense data length */
- u32 resid; /* Residual bytes in data buffer */
- u16 status_qualifier; /* Status qualifier */
+ __virtio32 sense_len; /* Sense data length */
+ __virtio32 resid; /* Residual bytes in data buffer */
+ __virtio16 status_qualifier; /* Status qualifier */
u8 status; /* Command completion status */
u8 response; /* Response values */
u8 sense[VIRTIO_SCSI_SENSE_SIZE];
@@ -64,10 +66,10 @@ struct virtio_scsi_cmd_resp {

/* Task Management Request */
struct virtio_scsi_ctrl_tmf_req {
- u32 type;
- u32 subtype;
+ __virtio32 type;
+ __virtio32 subtype;
u8 lun[8];
- u64 tag;
+ __virtio64 tag;
} __packed;

struct virtio_scsi_ctrl_tmf_resp {
@@ -76,20 +78,20 @@ struct virtio_scsi_ctrl_tmf_resp {

/* Asynchronous notification query/subscription */
struct virtio_scsi_ctrl_an_req {
- u32 type;
+ __virtio32 type;
u8 lun[8];
- u32 event_requested;
+ __virtio32 event_requested;
} __packed;

struct virtio_scsi_ctrl_an_resp {
- u32 event_actual;
+ __virtio32 event_actual;
u8 response;
} __packed;

struct virtio_scsi_event {
- u32 event;
+ __virtio32 event;
u8 lun[8];
- u32 reason;
+ __virtio32 reason;
} __packed;

struct virtio_scsi_config {
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index b83846f..c2779ea 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -158,7 +158,7 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
sc, resp->response, resp->status, resp->sense_len);

sc->result = resp->status;
- virtscsi_compute_resid(sc, resp->resid);
+ virtscsi_compute_resid(sc, __virtio32_to_cpu(vscsi->vdev, resp->resid));
switch (resp->response) {
case VIRTIO_SCSI_S_OK:
set_host_byte(sc, DID_OK);
@@ -196,10 +196,13 @@ static void virtscsi_complete_cmd(struct virtio_scsi *vscsi, void *buf)
break;
}

- WARN_ON(resp->sense_len > VIRTIO_SCSI_SENSE_SIZE);
+ WARN_ON(__virtio32_to_cpu(vscsi->vdev, resp->sense_len) >
+ VIRTIO_SCSI_SENSE_SIZE);
if (sc->sense_buffer) {
memcpy(sc->sense_buffer, resp->sense,
- min_t(u32, resp->sense_len, VIRTIO_SCSI_SENSE_SIZE));
+ min_t(u32,
+ __virtio32_to_cpu(vscsi->vdev, resp->sense_len),
+ VIRTIO_SCSI_SENSE_SIZE));
if (resp->sense_len)
set_driver_byte(sc, DRIVER_SENSE);
}
@@ -323,7 +326,7 @@ static void virtscsi_handle_transport_reset(struct virtio_scsi *vscsi,
unsigned int target = event->lun[1];
unsigned int lun = (event->lun[2] << 8) | event->lun[3];

- switch (event->reason) {
+ switch (__virtio32_to_cpu(vscsi->vdev, event->reason)) {
case VIRTIO_SCSI_EVT_RESET_RESCAN:
scsi_add_device(shost, 0, target, lun);
break;
@@ -349,8 +352,8 @@ static void virtscsi_handle_param_change(struct virtio_scsi *vscsi,
struct Scsi_Host *shost = virtio_scsi_host(vscsi->vdev);
unsigned int target = event->lun[1];
unsigned int lun = (event->lun[2] << 8) | event->lun[3];
- u8 asc = event->reason & 255;
- u8 ascq = event->reason >> 8;
+ u8 asc = __virtio32_to_cpu(vscsi->vdev, event->reason) & 255;
+ u8 ascq = __virtio32_to_cpu(vscsi->vdev, event->reason) >> 8;

sdev = scsi_device_lookup(shost, 0, target, lun);
if (!sdev) {
@@ -374,12 +377,14 @@ static void virtscsi_handle_event(struct work_struct *work)
struct virtio_scsi *vscsi = event_node->vscsi;
struct virtio_scsi_event *event = &event_node->event;

- if (event->event & VIRTIO_SCSI_T_EVENTS_MISSED) {
- event->event &= ~VIRTIO_SCSI_T_EVENTS_MISSED;
+ if (event->event &
+ __cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
+ event->event &= ~__cpu_to_virtio32(vscsi->vdev,
+ VIRTIO_SCSI_T_EVENTS_MISSED);
scsi_scan_host(virtio_scsi_host(vscsi->vdev));
}

- switch (event->event) {
+ switch (__virtio32_to_cpu(vscsi->vdev, event->event)) {
case VIRTIO_SCSI_T_NO_EVENT:
break;
case VIRTIO_SCSI_T_TRANSPORT_RESET:
@@ -482,26 +487,28 @@ static int virtscsi_kick_cmd(struct virtio_scsi_vq *vq,
return err;
}

-static void virtio_scsi_init_hdr(struct virtio_scsi_cmd_req *cmd,
+static void virtio_scsi_init_hdr(struct virtio_device *vdev,
+ struct virtio_scsi_cmd_req *cmd,
struct scsi_cmnd *sc)
{
cmd->lun[0] = 1;
cmd->lun[1] = sc->device->id;
cmd->lun[2] = (sc->device->lun >> 8) | 0x40;
cmd->lun[3] = sc->device->lun & 0xff;
- cmd->tag = (unsigned long)sc;
+ cmd->tag = __cpu_to_virtio64(vdev, (unsigned long)sc);
cmd->task_attr = VIRTIO_SCSI_S_SIMPLE;
cmd->prio = 0;
cmd->crn = 0;
}

-static void virtio_scsi_init_hdr_pi(struct virtio_scsi_cmd_req_pi *cmd_pi,
+static void virtio_scsi_init_hdr_pi(struct virtio_device *vdev,
+ struct virtio_scsi_cmd_req_pi *cmd_pi,
struct scsi_cmnd *sc)
{
struct request *rq = sc->request;
struct blk_integrity *bi;

- virtio_scsi_init_hdr((struct virtio_scsi_cmd_req *)cmd_pi, sc);
+ virtio_scsi_init_hdr(vdev, (struct virtio_scsi_cmd_req *)cmd_pi, sc);

if (!rq || !scsi_prot_sg_count(sc))
return;
@@ -509,9 +516,13 @@ static void virtio_scsi_init_hdr_pi(struct virtio_scsi_cmd_req_pi *cmd_pi,
bi = blk_get_integrity(rq->rq_disk);

if (sc->sc_data_direction == DMA_TO_DEVICE)
- cmd_pi->pi_bytesout = blk_rq_sectors(rq) * bi->tuple_size;
+ cmd_pi->pi_bytesout = __cpu_to_virtio32(vdev,
+ blk_rq_sectors(rq) *
+ bi->tuple_size);
else if (sc->sc_data_direction == DMA_FROM_DEVICE)
- cmd_pi->pi_bytesin = blk_rq_sectors(rq) * bi->tuple_size;
+ cmd_pi->pi_bytesin = __cpu_to_virtio32(vdev,
+ blk_rq_sectors(rq) *
+ bi->tuple_size);
}

static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
@@ -536,11 +547,11 @@ static int virtscsi_queuecommand(struct virtio_scsi *vscsi,
BUG_ON(sc->cmd_len > VIRTIO_SCSI_CDB_SIZE);

if (virtio_has_feature(vscsi->vdev, VIRTIO_SCSI_F_T10_PI)) {
- virtio_scsi_init_hdr_pi(&cmd->req.cmd_pi, sc);
+ virtio_scsi_init_hdr_pi(vscsi->vdev, &cmd->req.cmd_pi, sc);
memcpy(cmd->req.cmd_pi.cdb, sc->cmnd, sc->cmd_len);
req_size = sizeof(cmd->req.cmd_pi);
} else {
- virtio_scsi_init_hdr(&cmd->req.cmd, sc);
+ virtio_scsi_init_hdr(vscsi->vdev, &cmd->req.cmd, sc);
memcpy(cmd->req.cmd.cdb, sc->cmnd, sc->cmd_len);
req_size = sizeof(cmd->req.cmd);
}
@@ -655,7 +666,8 @@ static int virtscsi_device_reset(struct scsi_cmnd *sc)
cmd->sc = sc;
cmd->req.tmf = (struct virtio_scsi_ctrl_tmf_req){
.type = VIRTIO_SCSI_T_TMF,
- .subtype = VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET,
+ .subtype = __cpu_to_virtio32(vscsi->vdev,
+ VIRTIO_SCSI_T_TMF_LOGICAL_UNIT_RESET),
.lun[0] = 1,
.lun[1] = sc->device->id,
.lun[2] = (sc->device->lun >> 8) | 0x40,
@@ -713,7 +725,7 @@ static int virtscsi_abort(struct scsi_cmnd *sc)
.lun[1] = sc->device->id,
.lun[2] = (sc->device->lun >> 8) | 0x40,
.lun[3] = sc->device->lun & 0xff,
- .tag = (unsigned long)sc,
+ .tag = __cpu_to_virtio64(vscsi->vdev, (unsigned long)sc),
};
return virtscsi_tmf(vscsi, cmd);
}
@@ -1073,6 +1085,7 @@ static unsigned int features[] = {
VIRTIO_SCSI_F_HOTPLUG,
VIRTIO_SCSI_F_CHANGE,
VIRTIO_SCSI_F_T10_PI,
+ VIRTIO_F_VERSION_1,
};

static struct virtio_driver virtio_scsi_driver = {
--
MST

2014-11-24 11:58:15

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 33/41] tun: drop most type defines

It's just as easy to use IFF_ flags directly,
there's no point in adding our own defines.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/tun.c | 62 +++++++++++++++++++++++++------------------------------
1 file changed, 28 insertions(+), 34 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index bc89d07..61c000c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -109,12 +109,6 @@ do { \
* overload it to mean fasync when stored there.
*/
#define TUN_FASYNC IFF_ATTACH_QUEUE
-#define TUN_NO_PI IFF_NO_PI
-/* This flag has no real effect */
-#define TUN_ONE_QUEUE IFF_ONE_QUEUE
-#define TUN_PERSIST IFF_PERSIST
-#define TUN_VNET_HDR IFF_VNET_HDR
-#define TUN_TAP_MQ IFF_MULTI_QUEUE

#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
IFF_MULTI_QUEUE)
@@ -487,7 +481,7 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
if (tun && tun->numqueues == 0 && tun->numdisabled == 0) {
netif_carrier_off(tun->dev);

- if (!(tun->flags & TUN_PERSIST) &&
+ if (!(tun->flags & IFF_PERSIST) &&
tun->dev->reg_state == NETREG_REGISTERED)
unregister_netdevice(tun->dev);
}
@@ -538,7 +532,7 @@ static void tun_detach_all(struct net_device *dev)
}
BUG_ON(tun->numdisabled != 0);

- if (tun->flags & TUN_PERSIST)
+ if (tun->flags & IFF_PERSIST)
module_put(THIS_MODULE);
}

@@ -556,7 +550,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file, bool skip_filte
goto out;

err = -EBUSY;
- if (!(tun->flags & TUN_TAP_MQ) && tun->numqueues == 1)
+ if (!(tun->flags & IFF_MULTI_QUEUE) && tun->numqueues == 1)
goto out;

err = -E2BIG;
@@ -935,7 +929,7 @@ static void tun_net_init(struct net_device *dev)
struct tun_struct *tun = netdev_priv(dev);

switch (tun->flags & TUN_TYPE_MASK) {
- case TUN_TUN_DEV:
+ case IFF_TUN:
dev->netdev_ops = &tun_netdev_ops;

/* Point-to-Point TUN Device */
@@ -949,7 +943,7 @@ static void tun_net_init(struct net_device *dev)
dev->tx_queue_len = TUN_READQ_SIZE; /* We prefer our own queue length */
break;

- case TUN_TAP_DEV:
+ case IFF_TAP:
dev->netdev_ops = &tap_netdev_ops;
/* Ethernet TAP Device */
ether_setup(dev);
@@ -1040,7 +1034,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
int err;
u32 rxhash;

- if (!(tun->flags & TUN_NO_PI)) {
+ if (!(tun->flags & IFF_NO_PI)) {
if (len < sizeof(pi))
return -EINVAL;
len -= sizeof(pi);
@@ -1050,7 +1044,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
offset += sizeof(pi);
}

- if (tun->flags & TUN_VNET_HDR) {
+ if (tun->flags & IFF_VNET_HDR) {
if (len < tun->vnet_hdr_sz)
return -EINVAL;
len -= tun->vnet_hdr_sz;
@@ -1067,7 +1061,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
offset += tun->vnet_hdr_sz;
}

- if ((tun->flags & TUN_TYPE_MASK) == TUN_TAP_DEV) {
+ if ((tun->flags & TUN_TYPE_MASK) == IFF_TAP) {
align += NET_IP_ALIGN;
if (unlikely(len < ETH_HLEN ||
(gso.hdr_len && gso.hdr_len < ETH_HLEN)))
@@ -1130,8 +1124,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
}

switch (tun->flags & TUN_TYPE_MASK) {
- case TUN_TUN_DEV:
- if (tun->flags & TUN_NO_PI) {
+ case IFF_TUN:
+ if (tun->flags & IFF_NO_PI) {
switch (skb->data[0] & 0xf0) {
case 0x40:
pi.proto = htons(ETH_P_IP);
@@ -1150,7 +1144,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
skb->protocol = pi.proto;
skb->dev = tun->dev;
break;
- case TUN_TAP_DEV:
+ case IFF_TAP:
skb->protocol = eth_type_trans(skb, tun->dev);
break;
}
@@ -1256,10 +1250,10 @@ static ssize_t tun_put_user(struct tun_struct *tun,
if (vlan_tx_tag_present(skb))
vlan_hlen = VLAN_HLEN;

- if (tun->flags & TUN_VNET_HDR)
+ if (tun->flags & IFF_VNET_HDR)
vnet_hdr_sz = tun->vnet_hdr_sz;

- if (!(tun->flags & TUN_NO_PI)) {
+ if (!(tun->flags & IFF_NO_PI)) {
if ((len -= sizeof(pi)) < 0)
return -EINVAL;

@@ -1592,7 +1586,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
return -EINVAL;

if (!!(ifr->ifr_flags & IFF_MULTI_QUEUE) !=
- !!(tun->flags & TUN_TAP_MQ))
+ !!(tun->flags & IFF_MULTI_QUEUE))
return -EINVAL;

if (tun_not_capable(tun))
@@ -1605,7 +1599,7 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
if (err < 0)
return err;

- if (tun->flags & TUN_TAP_MQ &&
+ if (tun->flags & IFF_MULTI_QUEUE &&
(tun->numqueues + tun->numdisabled > 1)) {
/* One or more queue has already been attached, no need
* to initialize the device again.
@@ -1628,11 +1622,11 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
/* Set dev type */
if (ifr->ifr_flags & IFF_TUN) {
/* TUN device */
- flags |= TUN_TUN_DEV;
+ flags |= IFF_TUN;
name = "tun%d";
} else if (ifr->ifr_flags & IFF_TAP) {
/* TAP device */
- flags |= TUN_TAP_DEV;
+ flags |= IFF_TAP;
name = "tap%d";
} else
return -EINVAL;
@@ -1825,7 +1819,7 @@ static int tun_set_queue(struct file *file, struct ifreq *ifr)
ret = tun_attach(tun, file, false);
} else if (ifr->ifr_flags & IFF_DETACH_QUEUE) {
tun = rtnl_dereference(tfile->tun);
- if (!tun || !(tun->flags & TUN_TAP_MQ) || tfile->detached)
+ if (!tun || !(tun->flags & IFF_MULTI_QUEUE) || tfile->detached)
ret = -EINVAL;
else
__tun_detach(tfile, false);
@@ -1931,12 +1925,12 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
/* Disable/Enable persist mode. Keep an extra reference to the
* module to prevent the module being unprobed.
*/
- if (arg && !(tun->flags & TUN_PERSIST)) {
- tun->flags |= TUN_PERSIST;
+ if (arg && !(tun->flags & IFF_PERSIST)) {
+ tun->flags |= IFF_PERSIST;
__module_get(THIS_MODULE);
}
- if (!arg && (tun->flags & TUN_PERSIST)) {
- tun->flags &= ~TUN_PERSIST;
+ if (!arg && (tun->flags & IFF_PERSIST)) {
+ tun->flags &= ~IFF_PERSIST;
module_put(THIS_MODULE);
}

@@ -1994,7 +1988,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
case TUNSETTXFILTER:
/* Can be set only for TAPs */
ret = -EINVAL;
- if ((tun->flags & TUN_TYPE_MASK) != TUN_TAP_DEV)
+ if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
break;
ret = update_filter(&tun->txflt, (void __user *)arg);
break;
@@ -2053,7 +2047,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
case TUNATTACHFILTER:
/* Can be set only for TAPs */
ret = -EINVAL;
- if ((tun->flags & TUN_TYPE_MASK) != TUN_TAP_DEV)
+ if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
break;
ret = -EFAULT;
if (copy_from_user(&tun->fprog, argp, sizeof(tun->fprog)))
@@ -2065,7 +2059,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
case TUNDETACHFILTER:
/* Can be set only for TAPs */
ret = -EINVAL;
- if ((tun->flags & TUN_TYPE_MASK) != TUN_TAP_DEV)
+ if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
break;
ret = 0;
tun_detach_filter(tun, tun->numqueues);
@@ -2073,7 +2067,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,

case TUNGETFILTER:
ret = -EINVAL;
- if ((tun->flags & TUN_TYPE_MASK) != TUN_TAP_DEV)
+ if ((tun->flags & TUN_TYPE_MASK) != IFF_TAP)
break;
ret = -EFAULT;
if (copy_to_user(argp, &tun->fprog, sizeof(tun->fprog)))
@@ -2266,10 +2260,10 @@ static void tun_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info
strlcpy(info->version, DRV_VERSION, sizeof(info->version));

switch (tun->flags & TUN_TYPE_MASK) {
- case TUN_TUN_DEV:
+ case IFF_TUN:
strlcpy(info->bus_info, "tun", sizeof(info->bus_info));
break;
- case TUN_TAP_DEV:
+ case IFF_TAP:
strlcpy(info->bus_info, "tap", sizeof(info->bus_info));
break;
}
--
MST

2014-11-24 11:55:19

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 19/41] virtio_net: pass vi around

Too many places poke at [rs]q->vq->vdev->priv just to get
the the vi structure. Let's just pass the pointer around: seems
cleaner, and might even be faster.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/virtio_net.c | 38 ++++++++++++++++++++------------------
1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c07e030..1630c21 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -241,11 +241,11 @@ static unsigned long mergeable_buf_to_ctx(void *buf, unsigned int truesize)
}

/* Called from bottom half context */
-static struct sk_buff *page_to_skb(struct receive_queue *rq,
+static struct sk_buff *page_to_skb(struct virtnet_info *vi,
+ struct receive_queue *rq,
struct page *page, unsigned int offset,
unsigned int len, unsigned int truesize)
{
- struct virtnet_info *vi = rq->vq->vdev->priv;
struct sk_buff *skb;
struct skb_vnet_hdr *hdr;
unsigned int copy, hdr_len, hdr_padded_len;
@@ -328,12 +328,13 @@ static struct sk_buff *receive_small(void *buf, unsigned int len)
}

static struct sk_buff *receive_big(struct net_device *dev,
+ struct virtnet_info *vi,
struct receive_queue *rq,
void *buf,
unsigned int len)
{
struct page *page = buf;
- struct sk_buff *skb = page_to_skb(rq, page, 0, len, PAGE_SIZE);
+ struct sk_buff *skb = page_to_skb(vi, rq, page, 0, len, PAGE_SIZE);

if (unlikely(!skb))
goto err;
@@ -359,7 +360,8 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
int offset = buf - page_address(page);
unsigned int truesize = max(len, mergeable_ctx_to_buf_truesize(ctx));

- struct sk_buff *head_skb = page_to_skb(rq, page, offset, len, truesize);
+ struct sk_buff *head_skb = page_to_skb(vi, rq, page, offset, len,
+ truesize);
struct sk_buff *curr_skb = head_skb;

if (unlikely(!curr_skb))
@@ -433,9 +435,9 @@ err_buf:
return NULL;
}

-static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
+static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
+ void *buf, unsigned int len)
{
- struct virtnet_info *vi = rq->vq->vdev->priv;
struct net_device *dev = vi->dev;
struct virtnet_stats *stats = this_cpu_ptr(vi->stats);
struct sk_buff *skb;
@@ -459,7 +461,7 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
if (vi->mergeable_rx_bufs)
skb = receive_mergeable(dev, vi, rq, (unsigned long)buf, len);
else if (vi->big_packets)
- skb = receive_big(dev, rq, buf, len);
+ skb = receive_big(dev, vi, rq, buf, len);
else
skb = receive_small(buf, len);

@@ -539,9 +541,9 @@ frame_err:
dev_kfree_skb(skb);
}

-static int add_recvbuf_small(struct receive_queue *rq, gfp_t gfp)
+static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
+ gfp_t gfp)
{
- struct virtnet_info *vi = rq->vq->vdev->priv;
struct sk_buff *skb;
struct skb_vnet_hdr *hdr;
int err;
@@ -664,9 +666,9 @@ static int add_recvbuf_mergeable(struct receive_queue *rq, gfp_t gfp)
* before we're receiving packets, or from refill_work which is
* careful to disable receiving (using napi_disable).
*/
-static bool try_fill_recv(struct receive_queue *rq, gfp_t gfp)
+static bool try_fill_recv(struct virtnet_info *vi, struct receive_queue *rq,
+ gfp_t gfp)
{
- struct virtnet_info *vi = rq->vq->vdev->priv;
int err;
bool oom;

@@ -677,7 +679,7 @@ static bool try_fill_recv(struct receive_queue *rq, gfp_t gfp)
else if (vi->big_packets)
err = add_recvbuf_big(rq, gfp);
else
- err = add_recvbuf_small(rq, gfp);
+ err = add_recvbuf_small(vi, rq, gfp);

oom = err == -ENOMEM;
if (err)
@@ -726,7 +728,7 @@ static void refill_work(struct work_struct *work)
struct receive_queue *rq = &vi->rq[i];

napi_disable(&rq->napi);
- still_empty = !try_fill_recv(rq, GFP_KERNEL);
+ still_empty = !try_fill_recv(vi, rq, GFP_KERNEL);
virtnet_napi_enable(rq);

/* In theory, this can happen: if we don't get any buffers in
@@ -745,12 +747,12 @@ static int virtnet_receive(struct receive_queue *rq, int budget)

while (received < budget &&
(buf = virtqueue_get_buf(rq->vq, &len)) != NULL) {
- receive_buf(rq, buf, len);
+ receive_buf(vi, rq, buf, len);
received++;
}

if (rq->vq->num_free > virtqueue_get_vring_size(rq->vq) / 2) {
- if (!try_fill_recv(rq, GFP_ATOMIC))
+ if (!try_fill_recv(vi, rq, GFP_ATOMIC))
schedule_delayed_work(&vi->refill, 0);
}

@@ -826,7 +828,7 @@ static int virtnet_open(struct net_device *dev)
for (i = 0; i < vi->max_queue_pairs; i++) {
if (i < vi->curr_queue_pairs)
/* Make sure we have some buffers: if oom use wq. */
- if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
+ if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
schedule_delayed_work(&vi->refill, 0);
virtnet_napi_enable(&vi->rq[i]);
}
@@ -1851,7 +1853,7 @@ static int virtnet_probe(struct virtio_device *vdev)

/* Last of all, set up some receive buffers. */
for (i = 0; i < vi->curr_queue_pairs; i++) {
- try_fill_recv(&vi->rq[i], GFP_KERNEL);
+ try_fill_recv(vi, &vi->rq[i], GFP_KERNEL);

/* If we didn't even get one input buffer, we're useless. */
if (vi->rq[i].vq->num_free ==
@@ -1971,7 +1973,7 @@ static int virtnet_restore(struct virtio_device *vdev)

if (netif_running(vi->dev)) {
for (i = 0; i < vi->curr_queue_pairs; i++)
- if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
+ if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
schedule_delayed_work(&vi->refill, 0);

for (i = 0; i < vi->max_queue_pairs; i++)
--
MST

2014-11-24 11:59:38

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 32/41] tun: move internal flag defines out of uapi

TUN_ flags are internal and never exposed
to userspace. Any application using it is almost
certainly buggy.

Move them out to tun.c, we'll remove them in follow-up patches.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/uapi/linux/if_tun.h | 16 ++--------
drivers/net/tun.c | 74 ++++++++++++++-------------------------------
2 files changed, 26 insertions(+), 64 deletions(-)

diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h
index e9502dd..277a260 100644
--- a/include/uapi/linux/if_tun.h
+++ b/include/uapi/linux/if_tun.h
@@ -22,21 +22,11 @@

/* Read queue size */
#define TUN_READQ_SIZE 500
-
-/* TUN device flags */
-#define TUN_TUN_DEV 0x0001
-#define TUN_TAP_DEV 0x0002
+/* TUN device type flags: deprecated. Use IFF_TUN/IFF_TAP instead. */
+#define TUN_TUN_DEV IFF_TUN
+#define TUN_TAP_DEV IFF_TAP
#define TUN_TYPE_MASK 0x000f

-#define TUN_FASYNC 0x0010
-#define TUN_NOCHECKSUM 0x0020
-#define TUN_NO_PI 0x0040
-/* This flag has no real effect */
-#define TUN_ONE_QUEUE 0x0080
-#define TUN_PERSIST 0x0100
-#define TUN_VNET_HDR 0x0200
-#define TUN_TAP_MQ 0x0400
-
/* Ioctl defines */
#define TUNSETNOCSUM _IOW('T', 200, int)
#define TUNSETDEBUG _IOW('T', 201, int)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 9dd3746..bc89d07 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -103,6 +103,21 @@ do { \
} while (0)
#endif

+/* TUN device flags */
+
+/* IFF_ATTACH_QUEUE is never stored in device flags,
+ * overload it to mean fasync when stored there.
+ */
+#define TUN_FASYNC IFF_ATTACH_QUEUE
+#define TUN_NO_PI IFF_NO_PI
+/* This flag has no real effect */
+#define TUN_ONE_QUEUE IFF_ONE_QUEUE
+#define TUN_PERSIST IFF_PERSIST
+#define TUN_VNET_HDR IFF_VNET_HDR
+#define TUN_TAP_MQ IFF_MULTI_QUEUE
+
+#define TUN_FEATURES (IFF_NO_PI | IFF_ONE_QUEUE | IFF_VNET_HDR | \
+ IFF_MULTI_QUEUE)
#define GOODCOPY_LEN 128

#define FLT_EXACT_COUNT 8
@@ -1521,32 +1536,7 @@ static struct proto tun_proto = {

static int tun_flags(struct tun_struct *tun)
{
- int flags = 0;
-
- if (tun->flags & TUN_TUN_DEV)
- flags |= IFF_TUN;
- else
- flags |= IFF_TAP;
-
- if (tun->flags & TUN_NO_PI)
- flags |= IFF_NO_PI;
-
- /* This flag has no real effect. We track the value for backwards
- * compatibility.
- */
- if (tun->flags & TUN_ONE_QUEUE)
- flags |= IFF_ONE_QUEUE;
-
- if (tun->flags & TUN_VNET_HDR)
- flags |= IFF_VNET_HDR;
-
- if (tun->flags & TUN_TAP_MQ)
- flags |= IFF_MULTI_QUEUE;
-
- if (tun->flags & TUN_PERSIST)
- flags |= IFF_PERSIST;
-
- return flags;
+ return tun->flags & (TUN_FEATURES | IFF_PERSIST | IFF_TUN | IFF_TAP);
}

static ssize_t tun_show_flags(struct device *dev, struct device_attribute *attr,
@@ -1706,28 +1696,8 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)

tun_debug(KERN_INFO, tun, "tun_set_iff\n");

- if (ifr->ifr_flags & IFF_NO_PI)
- tun->flags |= TUN_NO_PI;
- else
- tun->flags &= ~TUN_NO_PI;
-
- /* This flag has no real effect. We track the value for backwards
- * compatibility.
- */
- if (ifr->ifr_flags & IFF_ONE_QUEUE)
- tun->flags |= TUN_ONE_QUEUE;
- else
- tun->flags &= ~TUN_ONE_QUEUE;
-
- if (ifr->ifr_flags & IFF_VNET_HDR)
- tun->flags |= TUN_VNET_HDR;
- else
- tun->flags &= ~TUN_VNET_HDR;
-
- if (ifr->ifr_flags & IFF_MULTI_QUEUE)
- tun->flags |= TUN_TAP_MQ;
- else
- tun->flags &= ~TUN_TAP_MQ;
+ tun->flags = (tun->flags & ~TUN_FEATURES) |
+ (ifr->ifr_flags & TUN_FEATURES);

/* Make sure persistent devices do not get stuck in
* xoff state.
@@ -1890,9 +1860,11 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
if (cmd == TUNGETFEATURES) {
/* Currently this just means: "what IFF flags are valid?".
* This is needed because we never checked for invalid flags on
- * TUNSETIFF. */
- return put_user(IFF_TUN | IFF_TAP | IFF_NO_PI | IFF_ONE_QUEUE |
- IFF_VNET_HDR | IFF_MULTI_QUEUE,
+ * TUNSETIFF. Why do we report IFF_TUN and IFF_TAP which are
+ * not legal for TUNSETIFF here? It's probably a bug, but it
+ * doesn't seem to be worth fixing.
+ */
+ return put_user(IFF_TUN | IFF_TAP | TUN_FEATURES,
(unsigned int __user*)argp);
} else if (cmd == TUNSETQUEUE)
return tun_set_queue(file, &ifr);
--
MST

2014-11-24 12:00:42

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 30/41] vhost/net: enable virtio 1.0

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1ac58d0..984242e 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -61,7 +61,8 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
enum {
VHOST_NET_FEATURES = VHOST_FEATURES |
(1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
- (1ULL << VIRTIO_NET_F_MRG_RXBUF),
+ (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+ (1ULL << VIRTIO_F_VERSION_1),
};

enum {
--
MST

2014-11-24 12:01:26

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 27/41] vhost: make features 64 bit

We need to use bit 32 for virtio 1.0

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/vhost.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index b9032e8..1f321fd 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -106,7 +106,7 @@ struct vhost_virtqueue {
/* Protected by virtqueue mutex. */
struct vhost_memory *memory;
void *private_data;
- unsigned acked_features;
+ u64 acked_features;
/* Log write descriptors */
void __user *log_base;
struct vhost_log *log;
--
MST

2014-11-24 12:02:29

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 24/41] vhost: add memory access wrappers

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/vhost/vhost.h | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 3eda654..b9032e8 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -174,6 +174,37 @@ enum {

static inline int vhost_has_feature(struct vhost_virtqueue *vq, int bit)
{
- return vq->acked_features & (1 << bit);
+ return vq->acked_features & (1ULL << bit);
+}
+
+/* Memory accessors */
+static inline u16 vhost16_to_cpu(struct vhost_virtqueue *vq, __virtio16 val)
+{
+ return __virtio16_to_cpu(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+}
+
+static inline __virtio16 cpu_to_vhost16(struct vhost_virtqueue *vq, u16 val)
+{
+ return __cpu_to_virtio16(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+}
+
+static inline u32 vhost32_to_cpu(struct vhost_virtqueue *vq, __virtio32 val)
+{
+ return __virtio32_to_cpu(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+}
+
+static inline __virtio32 cpu_to_vhost32(struct vhost_virtqueue *vq, u32 val)
+{
+ return __cpu_to_virtio32(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+}
+
+static inline u64 vhost64_to_cpu(struct vhost_virtqueue *vq, __virtio64 val)
+{
+ return __virtio64_to_cpu(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
+}
+
+static inline __virtio64 cpu_to_vhost64(struct vhost_virtqueue *vq, u64 val)
+{
+ return __cpu_to_virtio64(vhost_has_feature(vq, VIRTIO_F_VERSION_1), val);
}
#endif
--
MST

2014-11-24 12:02:57

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 22/41] virtio_net: bigger header when VERSION_1 is set

With VERSION_1 virtio_net uses same header size
whether mergeable buffers are enabled or not.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/virtio_net.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 098f443..a0e64cf 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1805,7 +1805,8 @@ static int virtnet_probe(struct virtio_device *vdev)
if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
vi->mergeable_rx_bufs = true;

- if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
+ if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
+ virtio_has_feature(vdev, VIRTIO_F_VERSION_1))
vi->hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
else
vi->hdr_len = sizeof(struct virtio_net_hdr);
--
MST

2014-11-24 11:54:27

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 21/41] virtio_net: stricter short buffer length checks

Our buffer length check is not strict enough for mergeable
buffers: buffer can still be shorter that header + address
by 2 bytes.

Fix that up.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/net/virtio_net.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 516f2cb..098f443 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -437,7 +437,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
struct sk_buff *skb;
struct virtio_net_hdr_mrg_rxbuf *hdr;

- if (unlikely(len < sizeof(struct virtio_net_hdr) + ETH_HLEN)) {
+ if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
pr_debug("%s: short packet %i\n", dev->name, len);
dev->stats.rx_length_errors++;
if (vi->mergeable_rx_bufs) {
--
MST

2014-11-24 12:03:29

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v3 04/41] virtio: memory access APIs

On Mon, Nov 24, 2014 at 12:52 PM, Michael S. Tsirkin <[email protected]> wrote:
> virtio 1.0 makes all memory structures LE, so
> we need APIs to conditionally do a byteswap on BE
> architectures.
>
> To make it easier to check code statically,
> add virtio specific types for multi-byte integers
> in memory.
>
> Add low level wrappers that do a byteswap conditionally, these will be
> useful e.g. for vhost. Add high level wrappers that
> query device endian-ness and act accordingly.

> diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
> new file mode 100644
> index 0000000..824ed0b
> --- /dev/null
> +++ b/include/linux/virtio_byteorder.h

> +static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
> +{
> + if (little_endian)
> + return le16_to_cpu((__force __le16)val);
> + else
> + return (__force u16)val;
> +}

What's wrong with just using le16-to_cpu() ...

> --- a/include/uapi/linux/virtio_ring.h
> +++ b/include/uapi/linux/virtio_ring.h

> /* Virtio ring descriptors: 16 bytes. These can chain together via "next". */
> struct vring_desc {
> /* Address (guest-physical). */
> - __u64 addr;
> + __virtio64 addr;

... and __le64?

There's already lots of precedence or this, even in include/uapi/.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2014-11-24 12:03:58

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 18/41] virtio_blk: fix race at module removal

If a device appears while module is being removed,
driver will get a callback after we've given up
on the major number.

In theory this means this major number can get reused
by something else, resulting in a conflict.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/block/virtio_blk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 055f3df..1f8b111 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -884,8 +884,8 @@ out_destroy_workqueue:

static void __exit fini(void)
{
- unregister_blkdev(major, "virtblk");
unregister_virtio_driver(&virtio_blk);
+ unregister_blkdev(major, "virtblk");
destroy_workqueue(virtblk_wq);
}
module_init(init);
--
MST

2014-11-24 12:04:43

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 17/41] virtio_blk: make serial attribute static

It's never declared so no need to make it extern.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/block/virtio_blk.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index f601f16..055f3df 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -332,7 +332,8 @@ static ssize_t virtblk_serial_show(struct device *dev,

return err;
}
-DEVICE_ATTR(serial, S_IRUGO, virtblk_serial_show, NULL);
+
+static DEVICE_ATTR(serial, S_IRUGO, virtblk_serial_show, NULL);

static void virtblk_config_changed_work(struct work_struct *work)
{
--
MST

2014-11-24 12:06:44

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 13/41] KVM: s390: Set virtio-ccw transport revision

From: Thomas Huth <[email protected]>

With the new SET-VIRTIO-REVISION command of the virtio 1.0 standard, we
can now negotiate the virtio-ccw revision after setting a channel online.

Note that we don't negotiate version 1 yet.

[Cornelia Huck: reworked revision loop a bit]
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Thomas Huth <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
drivers/s390/kvm/virtio_ccw.c | 63 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 63 insertions(+)

diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index abba04d..193e8f1 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -55,6 +55,7 @@ struct virtio_ccw_device {
struct ccw_device *cdev;
__u32 curr_io;
int err;
+ unsigned int revision; /* Transport revision */
wait_queue_head_t wait_q;
spinlock_t lock;
struct list_head virtqueues;
@@ -86,6 +87,15 @@ struct virtio_thinint_area {
u8 isc;
} __packed;

+struct virtio_rev_info {
+ __u16 revision;
+ __u16 length;
+ __u8 data[];
+};
+
+/* the highest virtio-ccw revision we support */
+#define VIRTIO_CCW_REV_MAX 0
+
struct virtio_ccw_vq_info {
struct virtqueue *vq;
int num;
@@ -122,6 +132,7 @@ static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
#define CCW_CMD_WRITE_STATUS 0x31
#define CCW_CMD_READ_VQ_CONF 0x32
#define CCW_CMD_SET_IND_ADAPTER 0x73
+#define CCW_CMD_SET_VIRTIO_REV 0x83

#define VIRTIO_CCW_DOING_SET_VQ 0x00010000
#define VIRTIO_CCW_DOING_RESET 0x00040000
@@ -134,6 +145,7 @@ static struct airq_info *airq_areas[MAX_AIRQ_AREAS];
#define VIRTIO_CCW_DOING_READ_VQ_CONF 0x02000000
#define VIRTIO_CCW_DOING_SET_CONF_IND 0x04000000
#define VIRTIO_CCW_DOING_SET_IND_ADAPTER 0x08000000
+#define VIRTIO_CCW_DOING_SET_VIRTIO_REV 0x10000000
#define VIRTIO_CCW_INTPARM_MASK 0xffff0000

static struct virtio_ccw_device *to_vc_device(struct virtio_device *vdev)
@@ -933,6 +945,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
case VIRTIO_CCW_DOING_RESET:
case VIRTIO_CCW_DOING_READ_VQ_CONF:
case VIRTIO_CCW_DOING_SET_IND_ADAPTER:
+ case VIRTIO_CCW_DOING_SET_VIRTIO_REV:
vcdev->curr_io &= ~activity;
wake_up(&vcdev->wait_q);
break;
@@ -1048,6 +1061,51 @@ static int virtio_ccw_offline(struct ccw_device *cdev)
return 0;
}

+static int virtio_ccw_set_transport_rev(struct virtio_ccw_device *vcdev)
+{
+ struct virtio_rev_info *rev;
+ struct ccw1 *ccw;
+ int ret;
+
+ ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
+ if (!ccw)
+ return -ENOMEM;
+ rev = kzalloc(sizeof(*rev), GFP_DMA | GFP_KERNEL);
+ if (!rev) {
+ kfree(ccw);
+ return -ENOMEM;
+ }
+
+ /* Set transport revision */
+ ccw->cmd_code = CCW_CMD_SET_VIRTIO_REV;
+ ccw->flags = 0;
+ ccw->count = sizeof(*rev);
+ ccw->cda = (__u32)(unsigned long)rev;
+
+ vcdev->revision = VIRTIO_CCW_REV_MAX;
+ do {
+ rev->revision = vcdev->revision;
+ /* none of our supported revisions carry payload */
+ rev->length = 0;
+ ret = ccw_io_helper(vcdev, ccw,
+ VIRTIO_CCW_DOING_SET_VIRTIO_REV);
+ if (ret == -EOPNOTSUPP) {
+ if (vcdev->revision == 0)
+ /*
+ * The host device does not support setting
+ * the revision: let's operate it in legacy
+ * mode.
+ */
+ ret = 0;
+ else
+ vcdev->revision--;
+ }
+ } while (ret == -EOPNOTSUPP);
+
+ kfree(ccw);
+ kfree(rev);
+ return ret;
+}

static int virtio_ccw_online(struct ccw_device *cdev)
{
@@ -1088,6 +1146,11 @@ static int virtio_ccw_online(struct ccw_device *cdev)
spin_unlock_irqrestore(get_ccwdev_lock(cdev), flags);
vcdev->vdev.id.vendor = cdev->id.cu_type;
vcdev->vdev.id.device = cdev->id.cu_model;
+
+ ret = virtio_ccw_set_transport_rev(vcdev);
+ if (ret)
+ goto out_free;
+
ret = register_virtio_device(&vcdev->vdev);
if (ret) {
dev_warn(&cdev->dev, "Failed to register virtio device: %d\n",
--
MST

2014-11-24 12:07:21

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 11/41] virtio_net: v1.0 endianness

Based on patches by Rusty Russell, Cornelia Huck.
Note: more code changes are needed for 1.0 support
(due to different header size).
So we don't advertize support for 1.0 yet.

Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/uapi/linux/virtio_net.h | 15 ++++++++-------
drivers/net/virtio_net.c | 33 ++++++++++++++++++++-------------
2 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/virtio_net.h b/include/uapi/linux/virtio_net.h
index 172a7f0..b5f1677 100644
--- a/include/uapi/linux/virtio_net.h
+++ b/include/uapi/linux/virtio_net.h
@@ -28,6 +28,7 @@
#include <linux/types.h>
#include <linux/virtio_ids.h>
#include <linux/virtio_config.h>
+#include <linux/virtio_types.h>
#include <linux/if_ether.h>

/* The feature bitmap for virtio net */
@@ -84,17 +85,17 @@ struct virtio_net_hdr {
#define VIRTIO_NET_HDR_GSO_TCPV6 4 // GSO frame, IPv6 TCP
#define VIRTIO_NET_HDR_GSO_ECN 0x80 // TCP has ECN set
__u8 gso_type;
- __u16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
- __u16 gso_size; /* Bytes to append to hdr_len per frame */
- __u16 csum_start; /* Position to start checksumming from */
- __u16 csum_offset; /* Offset after that to place checksum */
+ __virtio16 hdr_len; /* Ethernet + IP + tcp/udp hdrs */
+ __virtio16 gso_size; /* Bytes to append to hdr_len per frame */
+ __virtio16 csum_start; /* Position to start checksumming from */
+ __virtio16 csum_offset; /* Offset after that to place checksum */
};

/* This is the version of the header to use when the MRG_RXBUF
* feature has been negotiated. */
struct virtio_net_hdr_mrg_rxbuf {
struct virtio_net_hdr hdr;
- __u16 num_buffers; /* Number of merged rx buffers */
+ __virtio16 num_buffers; /* Number of merged rx buffers */
};

/*
@@ -149,7 +150,7 @@ typedef __u8 virtio_net_ctrl_ack;
* VIRTIO_NET_F_CTRL_MAC_ADDR feature is available.
*/
struct virtio_net_ctrl_mac {
- __u32 entries;
+ __virtio32 entries;
__u8 macs[][ETH_ALEN];
} __attribute__((packed));

@@ -193,7 +194,7 @@ struct virtio_net_ctrl_mac {
* specified.
*/
struct virtio_net_ctrl_mq {
- __u16 virtqueue_pairs;
+ __virtio16 virtqueue_pairs;
};

#define VIRTIO_NET_CTRL_MQ 4
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b0bc8ea..c07e030 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -347,13 +347,14 @@ err:
}

static struct sk_buff *receive_mergeable(struct net_device *dev,
+ struct virtnet_info *vi,
struct receive_queue *rq,
unsigned long ctx,
unsigned int len)
{
void *buf = mergeable_ctx_to_buf_address(ctx);
struct skb_vnet_hdr *hdr = buf;
- int num_buf = hdr->mhdr.num_buffers;
+ u16 num_buf = virtio16_to_cpu(rq->vq->vdev, hdr->mhdr.num_buffers);
struct page *page = virt_to_head_page(buf);
int offset = buf - page_address(page);
unsigned int truesize = max(len, mergeable_ctx_to_buf_truesize(ctx));
@@ -369,7 +370,9 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
ctx = (unsigned long)virtqueue_get_buf(rq->vq, &len);
if (unlikely(!ctx)) {
pr_debug("%s: rx error: %d buffers out of %d missing\n",
- dev->name, num_buf, hdr->mhdr.num_buffers);
+ dev->name, num_buf,
+ virtio16_to_cpu(rq->vq->vdev,
+ hdr->mhdr.num_buffers));
dev->stats.rx_length_errors++;
goto err_buf;
}
@@ -454,7 +457,7 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
}

if (vi->mergeable_rx_bufs)
- skb = receive_mergeable(dev, rq, (unsigned long)buf, len);
+ skb = receive_mergeable(dev, vi, rq, (unsigned long)buf, len);
else if (vi->big_packets)
skb = receive_big(dev, rq, buf, len);
else
@@ -473,8 +476,8 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
if (hdr->hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
pr_debug("Needs csum!\n");
if (!skb_partial_csum_set(skb,
- hdr->hdr.csum_start,
- hdr->hdr.csum_offset))
+ virtio16_to_cpu(vi->vdev, hdr->hdr.csum_start),
+ virtio16_to_cpu(vi->vdev, hdr->hdr.csum_offset)))
goto frame_err;
} else if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID) {
skb->ip_summed = CHECKSUM_UNNECESSARY;
@@ -514,7 +517,8 @@ static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
if (hdr->hdr.gso_type & VIRTIO_NET_HDR_GSO_ECN)
skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;

- skb_shinfo(skb)->gso_size = hdr->hdr.gso_size;
+ skb_shinfo(skb)->gso_size = virtio16_to_cpu(vi->vdev,
+ hdr->hdr.gso_size);
if (skb_shinfo(skb)->gso_size == 0) {
net_warn_ratelimited("%s: zero gso size.\n", dev->name);
goto frame_err;
@@ -876,16 +880,19 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)

if (skb->ip_summed == CHECKSUM_PARTIAL) {
hdr->hdr.flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
- hdr->hdr.csum_start = skb_checksum_start_offset(skb);
- hdr->hdr.csum_offset = skb->csum_offset;
+ hdr->hdr.csum_start = cpu_to_virtio16(vi->vdev,
+ skb_checksum_start_offset(skb));
+ hdr->hdr.csum_offset = cpu_to_virtio16(vi->vdev,
+ skb->csum_offset);
} else {
hdr->hdr.flags = 0;
hdr->hdr.csum_offset = hdr->hdr.csum_start = 0;
}

if (skb_is_gso(skb)) {
- hdr->hdr.hdr_len = skb_headlen(skb);
- hdr->hdr.gso_size = skb_shinfo(skb)->gso_size;
+ hdr->hdr.hdr_len = cpu_to_virtio16(vi->vdev, skb_headlen(skb));
+ hdr->hdr.gso_size = cpu_to_virtio16(vi->vdev,
+ skb_shinfo(skb)->gso_size);
if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4)
hdr->hdr.gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
else if (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6)
@@ -1112,7 +1119,7 @@ static int virtnet_set_queues(struct virtnet_info *vi, u16 queue_pairs)
if (!vi->has_cvq || !virtio_has_feature(vi->vdev, VIRTIO_NET_F_MQ))
return 0;

- s.virtqueue_pairs = queue_pairs;
+ s.virtqueue_pairs = cpu_to_virtio16(vi->vdev, queue_pairs);
sg_init_one(&sg, &s, sizeof(s));

if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_MQ,
@@ -1189,7 +1196,7 @@ static void virtnet_set_rx_mode(struct net_device *dev)
sg_init_table(sg, 2);

/* Store the unicast list and count in the front of the buffer */
- mac_data->entries = uc_count;
+ mac_data->entries = cpu_to_virtio32(vi->vdev, uc_count);
i = 0;
netdev_for_each_uc_addr(ha, dev)
memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN);
@@ -1200,7 +1207,7 @@ static void virtnet_set_rx_mode(struct net_device *dev)
/* multicast list and count fill the end */
mac_data = (void *)&mac_data->macs[uc_count][0];

- mac_data->entries = mc_count;
+ mac_data->entries = cpu_to_virtio32(vi->vdev, mc_count);
i = 0;
netdev_for_each_mc_addr(ha, dev)
memcpy(&mac_data->macs[i++][0], ha->addr, ETH_ALEN);
--
MST

2014-11-24 12:08:15

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 07/41] virtio: allow transports to get avail/used addresses

From: Cornelia Huck <[email protected]>

For virtio-1, we can theoretically have a more complex virtqueue
layout with avail and used buffers not on a contiguous memory area
with the descriptor table. For now, it's fine for a transport driver
to stay with the old layout: It needs, however, a way to access
the locations of the avail/used rings so it can register them with
the host.

Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio.h | 3 +++
drivers/virtio/virtio_ring.c | 16 ++++++++++++++++
2 files changed, 19 insertions(+)

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 149284e..d6359a5 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -75,6 +75,9 @@ unsigned int virtqueue_get_vring_size(struct virtqueue *vq);

bool virtqueue_is_broken(struct virtqueue *vq);

+void *virtqueue_get_avail(struct virtqueue *vq);
+void *virtqueue_get_used(struct virtqueue *vq);
+
/**
* virtio_device - representation of a device using virtio
* @index: unique position on the virtio bus
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index b311fa7..5c8aef8 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -827,4 +827,20 @@ void virtio_break_device(struct virtio_device *dev)
}
EXPORT_SYMBOL_GPL(virtio_break_device);

+void *virtqueue_get_avail(struct virtqueue *_vq)
+{
+ struct vring_virtqueue *vq = to_vvq(_vq);
+
+ return vq->vring.avail;
+}
+EXPORT_SYMBOL_GPL(virtqueue_get_avail);
+
+void *virtqueue_get_used(struct virtqueue *_vq)
+{
+ struct vring_virtqueue *vq = to_vvq(_vq);
+
+ return vq->vring.used;
+}
+EXPORT_SYMBOL_GPL(virtqueue_get_used);
+
MODULE_LICENSE("GPL");
--
MST

2014-11-24 12:09:00

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 04/41] virtio: memory access APIs

virtio 1.0 makes all memory structures LE, so
we need APIs to conditionally do a byteswap on BE
architectures.

To make it easier to check code statically,
add virtio specific types for multi-byte integers
in memory.

Add low level wrappers that do a byteswap conditionally, these will be
useful e.g. for vhost. Add high level wrappers that
query device endian-ness and act accordingly.

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio_byteorder.h | 59 +++++++++++++++++++++++++++++++++++++++
include/linux/virtio_config.h | 32 +++++++++++++++++++++
include/uapi/linux/virtio_ring.h | 45 ++++++++++++++---------------
include/uapi/linux/virtio_types.h | 48 +++++++++++++++++++++++++++++++
include/uapi/linux/Kbuild | 1 +
5 files changed, 163 insertions(+), 22 deletions(-)
create mode 100644 include/linux/virtio_byteorder.h
create mode 100644 include/uapi/linux/virtio_types.h

diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
new file mode 100644
index 0000000..824ed0b
--- /dev/null
+++ b/include/linux/virtio_byteorder.h
@@ -0,0 +1,59 @@
+#ifndef _LINUX_VIRTIO_BYTEORDER_H
+#define _LINUX_VIRTIO_BYTEORDER_H
+#include <linux/types.h>
+#include <uapi/linux/virtio_types.h>
+
+/*
+ * Memory accessors for handling virtio in modern little endian and in
+ * compatibility native endian format.
+ */
+
+static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
+{
+ if (little_endian)
+ return le16_to_cpu((__force __le16)val);
+ else
+ return (__force u16)val;
+}
+
+static inline __virtio16 __cpu_to_virtio16(bool little_endian, u16 val)
+{
+ if (little_endian)
+ return (__force __virtio16)cpu_to_le16(val);
+ else
+ return (__force __virtio16)val;
+}
+
+static inline u32 __virtio32_to_cpu(bool little_endian, __virtio32 val)
+{
+ if (little_endian)
+ return le32_to_cpu((__force __le32)val);
+ else
+ return (__force u32)val;
+}
+
+static inline __virtio32 __cpu_to_virtio32(bool little_endian, u32 val)
+{
+ if (little_endian)
+ return (__force __virtio32)cpu_to_le32(val);
+ else
+ return (__force __virtio32)val;
+}
+
+static inline u64 __virtio64_to_cpu(bool little_endian, __virtio64 val)
+{
+ if (little_endian)
+ return le64_to_cpu((__force __le64)val);
+ else
+ return (__force u64)val;
+}
+
+static inline __virtio64 __cpu_to_virtio64(bool little_endian, u64 val)
+{
+ if (little_endian)
+ return (__force __virtio64)cpu_to_le64(val);
+ else
+ return (__force __virtio64)val;
+}
+
+#endif /* _LINUX_VIRTIO_BYTEORDER */
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 022d904..b9cd689 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -4,6 +4,7 @@
#include <linux/err.h>
#include <linux/bug.h>
#include <linux/virtio.h>
+#include <linux/virtio_byteorder.h>
#include <uapi/linux/virtio_config.h>

/**
@@ -152,6 +153,37 @@ int virtqueue_set_affinity(struct virtqueue *vq, int cpu)
return 0;
}

+/* Memory accessors */
+static inline u16 virtio16_to_cpu(struct virtio_device *vdev, __virtio16 val)
+{
+ return __virtio16_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
+}
+
+static inline __virtio16 cpu_to_virtio16(struct virtio_device *vdev, u16 val)
+{
+ return __cpu_to_virtio16(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
+}
+
+static inline u32 virtio32_to_cpu(struct virtio_device *vdev, __virtio32 val)
+{
+ return __virtio32_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
+}
+
+static inline __virtio32 cpu_to_virtio32(struct virtio_device *vdev, u32 val)
+{
+ return __cpu_to_virtio32(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
+}
+
+static inline u64 virtio64_to_cpu(struct virtio_device *vdev, __virtio64 val)
+{
+ return __virtio64_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
+}
+
+static inline __virtio64 cpu_to_virtio64(struct virtio_device *vdev, u64 val)
+{
+ return __cpu_to_virtio64(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), val);
+}
+
/* Config space accessors. */
#define virtio_cread(vdev, structname, member, ptr) \
do { \
diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h
index a99f9b7..61c818a 100644
--- a/include/uapi/linux/virtio_ring.h
+++ b/include/uapi/linux/virtio_ring.h
@@ -32,6 +32,7 @@
*
* Copyright Rusty Russell IBM Corporation 2007. */
#include <linux/types.h>
+#include <linux/virtio_types.h>

/* This marks a buffer as continuing via the next field. */
#define VRING_DESC_F_NEXT 1
@@ -61,32 +62,32 @@
/* Virtio ring descriptors: 16 bytes. These can chain together via "next". */
struct vring_desc {
/* Address (guest-physical). */
- __u64 addr;
+ __virtio64 addr;
/* Length. */
- __u32 len;
+ __virtio32 len;
/* The flags as indicated above. */
- __u16 flags;
+ __virtio16 flags;
/* We chain unused descriptors via this, too */
- __u16 next;
+ __virtio16 next;
};

struct vring_avail {
- __u16 flags;
- __u16 idx;
- __u16 ring[];
+ __virtio16 flags;
+ __virtio16 idx;
+ __virtio16 ring[];
};

/* u32 is used here for ids for padding reasons. */
struct vring_used_elem {
/* Index of start of used descriptor chain. */
- __u32 id;
+ __virtio32 id;
/* Total length of the descriptor chain which was used (written to) */
- __u32 len;
+ __virtio32 len;
};

struct vring_used {
- __u16 flags;
- __u16 idx;
+ __virtio16 flags;
+ __virtio16 idx;
struct vring_used_elem ring[];
};

@@ -109,25 +110,25 @@ struct vring {
* struct vring_desc desc[num];
*
* // A ring of available descriptor heads with free-running index.
- * __u16 avail_flags;
- * __u16 avail_idx;
- * __u16 available[num];
- * __u16 used_event_idx;
+ * __virtio16 avail_flags;
+ * __virtio16 avail_idx;
+ * __virtio16 available[num];
+ * __virtio16 used_event_idx;
*
* // Padding to the next align boundary.
* char pad[];
*
* // A ring of used descriptor heads with free-running index.
- * __u16 used_flags;
- * __u16 used_idx;
+ * __virtio16 used_flags;
+ * __virtio16 used_idx;
* struct vring_used_elem used[num];
- * __u16 avail_event_idx;
+ * __virtio16 avail_event_idx;
* };
*/
/* We publish the used event index at the end of the available ring, and vice
* versa. They are at the end for backwards compatibility. */
#define vring_used_event(vr) ((vr)->avail->ring[(vr)->num])
-#define vring_avail_event(vr) (*(__u16 *)&(vr)->used->ring[(vr)->num])
+#define vring_avail_event(vr) (*(__virtio16 *)&(vr)->used->ring[(vr)->num])

static inline void vring_init(struct vring *vr, unsigned int num, void *p,
unsigned long align)
@@ -135,15 +136,15 @@ static inline void vring_init(struct vring *vr, unsigned int num, void *p,
vr->num = num;
vr->desc = p;
vr->avail = p + num*sizeof(struct vring_desc);
- vr->used = (void *)(((unsigned long)&vr->avail->ring[num] + sizeof(__u16)
+ vr->used = (void *)(((unsigned long)&vr->avail->ring[num] + sizeof(__virtio16)
+ align-1) & ~(align - 1));
}

static inline unsigned vring_size(unsigned int num, unsigned long align)
{
- return ((sizeof(struct vring_desc) * num + sizeof(__u16) * (3 + num)
+ return ((sizeof(struct vring_desc) * num + sizeof(__virtio16) * (3 + num)
+ align - 1) & ~(align - 1))
- + sizeof(__u16) * 3 + sizeof(struct vring_used_elem) * num;
+ + sizeof(__virtio16) * 3 + sizeof(struct vring_used_elem) * num;
}

/* The following is used with USED_EVENT_IDX and AVAIL_EVENT_IDX */
diff --git a/include/uapi/linux/virtio_types.h b/include/uapi/linux/virtio_types.h
new file mode 100644
index 0000000..b90385f
--- /dev/null
+++ b/include/uapi/linux/virtio_types.h
@@ -0,0 +1,48 @@
+#ifndef _UAPI_LINUX_VIRTIO_TYPES_H
+#define _UAPI_LINUX_VIRTIO_TYPES_H
+/* An interface for efficient virtio implementation, currently for use by KVM
+ * and lguest, but hopefully others soon. Do NOT change this since it will
+ * break existing servers and clients.
+ *
+ * This header is BSD licensed so anyone can use the definitions to implement
+ * compatible drivers/servers.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of IBM nor the names of its contributors
+ * may be used to endorse or promote products derived from this software
+ * without specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * Copyright (C) 2014 Red Hat, Inc.
+ * Author: Michael S. Tsirkin <[email protected]>
+ */
+#include <linux/types.h>
+
+/*
+ * __virtio{16,32,64} have the following meaning:
+ * - __u{16,32,64} for virtio devices in legacy mode, accessed in native endian
+ * - __le{16,32,64} for standard-compliant virtio devices
+ */
+
+typedef __u16 __bitwise__ __virtio16;
+typedef __u32 __bitwise__ __virtio32;
+typedef __u64 __bitwise__ __virtio64;
+
+#endif /* _UAPI_LINUX_VIRTIO_TYPES_H */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 4c94f31..44a5581 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -423,6 +423,7 @@ header-y += virtio_blk.h
header-y += virtio_config.h
header-y += virtio_console.h
header-y += virtio_ids.h
+header-y += virtio_types.h
header-y += virtio_net.h
header-y += virtio_pci.h
header-y += virtio_ring.h
--
MST

2014-11-24 12:09:24

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PATCH v3 02/41] virtio: add support for 64 bit features.

From: Rusty Russell <[email protected]>

Change the u32 to a u64, and make sure to use 1ULL everywhere!

Cc: Brian Swetland <[email protected]>
Cc: Christian Borntraeger <[email protected]>
[Thomas Huth: fix up virtio-ccw get_features]
Signed-off-by: Rusty Russell <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
Acked-by: Pawel Moll <[email protected]>
Acked-by: Ohad Ben-Cohen <[email protected]>

Signed-off-by: Michael S. Tsirkin <[email protected]>
---
include/linux/virtio.h | 2 +-
include/linux/virtio_config.h | 8 ++++----
tools/virtio/linux/virtio.h | 2 +-
tools/virtio/linux/virtio_config.h | 2 +-
drivers/char/virtio_console.c | 2 +-
drivers/lguest/lguest_device.c | 10 +++++-----
drivers/remoteproc/remoteproc_virtio.c | 5 ++++-
drivers/s390/kvm/kvm_virtio.c | 10 +++++-----
drivers/s390/kvm/virtio_ccw.c | 29 ++++++++++++++++++++++++-----
drivers/virtio/virtio.c | 12 ++++++------
drivers/virtio/virtio_mmio.c | 14 +++++++++-----
drivers/virtio/virtio_pci.c | 5 ++---
drivers/virtio/virtio_ring.c | 2 +-
13 files changed, 64 insertions(+), 39 deletions(-)

diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 7828a7f..149284e 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -101,7 +101,7 @@ struct virtio_device {
const struct virtio_config_ops *config;
const struct vringh_config_ops *vringh_config;
struct list_head vqs;
- u32 features;
+ u64 features;
void *priv;
};

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index aa84d0e..022d904 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -66,7 +66,7 @@ struct virtio_config_ops {
vq_callback_t *callbacks[],
const char *names[]);
void (*del_vqs)(struct virtio_device *);
- u32 (*get_features)(struct virtio_device *vdev);
+ u64 (*get_features)(struct virtio_device *vdev);
void (*finalize_features)(struct virtio_device *vdev);
const char *(*bus_name)(struct virtio_device *vdev);
int (*set_vq_affinity)(struct virtqueue *vq, int cpu);
@@ -86,14 +86,14 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
{
/* Did you forget to fix assumptions on max features? */
if (__builtin_constant_p(fbit))
- BUILD_BUG_ON(fbit >= 32);
+ BUILD_BUG_ON(fbit >= 64);
else
- BUG_ON(fbit >= 32);
+ BUG_ON(fbit >= 64);

if (fbit < VIRTIO_TRANSPORT_F_START)
virtio_check_driver_offered_feature(vdev, fbit);

- return vdev->features & (1 << fbit);
+ return vdev->features & (1ULL << fbit);
}

static inline
diff --git a/tools/virtio/linux/virtio.h b/tools/virtio/linux/virtio.h
index 72bff70..8eb6421 100644
--- a/tools/virtio/linux/virtio.h
+++ b/tools/virtio/linux/virtio.h
@@ -10,7 +10,7 @@

struct virtio_device {
void *dev;
- u32 features;
+ u64 features;
};

struct virtqueue {
diff --git a/tools/virtio/linux/virtio_config.h b/tools/virtio/linux/virtio_config.h
index 1f1636b..a254c2b 100644
--- a/tools/virtio/linux/virtio_config.h
+++ b/tools/virtio/linux/virtio_config.h
@@ -2,5 +2,5 @@
#define VIRTIO_TRANSPORT_F_END 32

#define virtio_has_feature(dev, feature) \
- ((dev)->features & (1 << feature))
+ ((dev)->features & (1ULL << feature))

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 0074f9b..fda9a75 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -355,7 +355,7 @@ static inline bool use_multiport(struct ports_device *portdev)
*/
if (!portdev->vdev)
return 0;
- return portdev->vdev->features & (1 << VIRTIO_CONSOLE_F_MULTIPORT);
+ return portdev->vdev->features & (1ULL << VIRTIO_CONSOLE_F_MULTIPORT);
}

static DEFINE_SPINLOCK(dma_bufs_lock);
diff --git a/drivers/lguest/lguest_device.c b/drivers/lguest/lguest_device.c
index c831c47..4d29bcd 100644
--- a/drivers/lguest/lguest_device.c
+++ b/drivers/lguest/lguest_device.c
@@ -94,17 +94,17 @@ static unsigned desc_size(const struct lguest_device_desc *desc)
}

/* This gets the device's feature bits. */
-static u32 lg_get_features(struct virtio_device *vdev)
+static u64 lg_get_features(struct virtio_device *vdev)
{
unsigned int i;
- u32 features = 0;
+ u64 features = 0;
struct lguest_device_desc *desc = to_lgdev(vdev)->desc;
u8 *in_features = lg_features(desc);

/* We do this the slow but generic way. */
- for (i = 0; i < min(desc->feature_len * 8, 32); i++)
+ for (i = 0; i < min(desc->feature_len * 8, 64); i++)
if (in_features[i / 8] & (1 << (i % 8)))
- features |= (1 << i);
+ features |= (1ULL << i);

return features;
}
@@ -144,7 +144,7 @@ static void lg_finalize_features(struct virtio_device *vdev)
memset(out_features, 0, desc->feature_len);
bits = min_t(unsigned, desc->feature_len, sizeof(vdev->features)) * 8;
for (i = 0; i < bits; i++) {
- if (vdev->features & (1 << i))
+ if (vdev->features & (1ULL << i))
out_features[i / 8] |= (1 << (i % 8));
}

diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c
index dafaf38..627737e 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -207,7 +207,7 @@ static void rproc_virtio_reset(struct virtio_device *vdev)
}

/* provide the vdev features as retrieved from the firmware */
-static u32 rproc_virtio_get_features(struct virtio_device *vdev)
+static u64 rproc_virtio_get_features(struct virtio_device *vdev)
{
struct rproc_vdev *rvdev = vdev_to_rvdev(vdev);
struct fw_rsc_vdev *rsc;
@@ -227,6 +227,9 @@ static void rproc_virtio_finalize_features(struct virtio_device *vdev)
/* Give virtio_ring a chance to accept features */
vring_transport_features(vdev);

+ /* Make sure we don't have any features > 32 bits! */
+ BUG_ON((u32)vdev->features != vdev->features);
+
/*
* Remember the finalized features of our vdev, and provide it
* to the remote processor once it is powered on.
diff --git a/drivers/s390/kvm/kvm_virtio.c b/drivers/s390/kvm/kvm_virtio.c
index ac79a09..c78251d 100644
--- a/drivers/s390/kvm/kvm_virtio.c
+++ b/drivers/s390/kvm/kvm_virtio.c
@@ -80,16 +80,16 @@ static unsigned desc_size(const struct kvm_device_desc *desc)
}

/* This gets the device's feature bits. */
-static u32 kvm_get_features(struct virtio_device *vdev)
+static u64 kvm_get_features(struct virtio_device *vdev)
{
unsigned int i;
- u32 features = 0;
+ u64 features = 0;
struct kvm_device_desc *desc = to_kvmdev(vdev)->desc;
u8 *in_features = kvm_vq_features(desc);

- for (i = 0; i < min(desc->feature_len * 8, 32); i++)
+ for (i = 0; i < min(desc->feature_len * 8, 64); i++)
if (in_features[i / 8] & (1 << (i % 8)))
- features |= (1 << i);
+ features |= (1ULL << i);
return features;
}

@@ -106,7 +106,7 @@ static void kvm_finalize_features(struct virtio_device *vdev)
memset(out_features, 0, desc->feature_len);
bits = min_t(unsigned, desc->feature_len, sizeof(vdev->features)) * 8;
for (i = 0; i < bits; i++) {
- if (vdev->features & (1 << i))
+ if (vdev->features & (1ULL << i))
out_features[i / 8] |= (1 << (i % 8));
}
}
diff --git a/drivers/s390/kvm/virtio_ccw.c b/drivers/s390/kvm/virtio_ccw.c
index 1dbee95..abba04d 100644
--- a/drivers/s390/kvm/virtio_ccw.c
+++ b/drivers/s390/kvm/virtio_ccw.c
@@ -660,11 +660,12 @@ static void virtio_ccw_reset(struct virtio_device *vdev)
kfree(ccw);
}

-static u32 virtio_ccw_get_features(struct virtio_device *vdev)
+static u64 virtio_ccw_get_features(struct virtio_device *vdev)
{
struct virtio_ccw_device *vcdev = to_vc_device(vdev);
struct virtio_feature_desc *features;
- int ret, rc;
+ int ret;
+ u64 rc;
struct ccw1 *ccw;

ccw = kzalloc(sizeof(*ccw), GFP_DMA | GFP_KERNEL);
@@ -677,7 +678,6 @@ static u32 virtio_ccw_get_features(struct virtio_device *vdev)
goto out_free;
}
/* Read the feature bits from the host. */
- /* TODO: Features > 32 bits */
features->index = 0;
ccw->cmd_code = CCW_CMD_READ_FEAT;
ccw->flags = 0;
@@ -691,6 +691,16 @@ static u32 virtio_ccw_get_features(struct virtio_device *vdev)

rc = le32_to_cpu(features->features);

+ /* Read second half feature bits from the host. */
+ features->index = 1;
+ ccw->cmd_code = CCW_CMD_READ_FEAT;
+ ccw->flags = 0;
+ ccw->count = sizeof(*features);
+ ccw->cda = (__u32)(unsigned long)features;
+ ret = ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_READ_FEAT);
+ if (ret == 0)
+ rc |= (u64)le32_to_cpu(features->features) << 32;
+
out_free:
kfree(features);
kfree(ccw);
@@ -715,8 +725,17 @@ static void virtio_ccw_finalize_features(struct virtio_device *vdev)
vring_transport_features(vdev);

features->index = 0;
- features->features = cpu_to_le32(vdev->features);
- /* Write the feature bits to the host. */
+ features->features = cpu_to_le32((u32)vdev->features);
+ /* Write the first half of the feature bits to the host. */
+ ccw->cmd_code = CCW_CMD_WRITE_FEAT;
+ ccw->flags = 0;
+ ccw->count = sizeof(*features);
+ ccw->cda = (__u32)(unsigned long)features;
+ ccw_io_helper(vcdev, ccw, VIRTIO_CCW_DOING_WRITE_FEAT);
+
+ features->index = 1;
+ features->features = cpu_to_le32(vdev->features >> 32);
+ /* Write the second half of the feature bits to the host. */
ccw->cmd_code = CCW_CMD_WRITE_FEAT;
ccw->flags = 0;
ccw->count = sizeof(*features);
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 7814b1f..d213567 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -159,7 +159,7 @@ static int virtio_dev_probe(struct device *_d)
int err, i;
struct virtio_device *dev = dev_to_virtio(_d);
struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
- u32 device_features;
+ u64 device_features;

/* We have a driver! */
add_status(dev, VIRTIO_CONFIG_S_DRIVER);
@@ -171,15 +171,15 @@ static int virtio_dev_probe(struct device *_d)
dev->features = 0;
for (i = 0; i < drv->feature_table_size; i++) {
unsigned int f = drv->feature_table[i];
- BUG_ON(f >= 32);
- if (device_features & (1 << f))
- dev->features |= (1 << f);
+ BUG_ON(f >= 64);
+ if (device_features & (1ULL << f))
+ dev->features |= (1ULL << f);
}

/* Transport features always preserved to pass to finalize_features. */
for (i = VIRTIO_TRANSPORT_F_START; i < VIRTIO_TRANSPORT_F_END; i++)
- if (device_features & (1 << i))
- dev->features |= (1 << i);
+ if (device_features & (1ULL << i))
+ dev->features |= (1ULL << i);

dev->config->finalize_features(dev);

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index eb5b0e2..fd01c6d 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -142,14 +142,16 @@ struct virtio_mmio_vq_info {

/* Configuration interface */

-static u32 vm_get_features(struct virtio_device *vdev)
+static u64 vm_get_features(struct virtio_device *vdev)
{
struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+ u64 features;

- /* TODO: Features > 32 bits */
writel(0, vm_dev->base + VIRTIO_MMIO_HOST_FEATURES_SEL);
-
- return readl(vm_dev->base + VIRTIO_MMIO_HOST_FEATURES);
+ features = readl(vm_dev->base + VIRTIO_MMIO_HOST_FEATURES);
+ writel(1, vm_dev->base + VIRTIO_MMIO_HOST_FEATURES_SEL);
+ features |= ((u64)readl(vm_dev->base + VIRTIO_MMIO_HOST_FEATURES) << 32);
+ return features;
}

static void vm_finalize_features(struct virtio_device *vdev)
@@ -160,7 +162,9 @@ static void vm_finalize_features(struct virtio_device *vdev)
vring_transport_features(vdev);

writel(0, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES_SEL);
- writel(vdev->features, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES);
+ writel((u32)vdev->features, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES);
+ writel(1, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES_SEL);
+ writel(vdev->features >> 32, vm_dev->base + VIRTIO_MMIO_GUEST_FEATURES);
}

static void vm_get(struct virtio_device *vdev, unsigned offset,
diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index ab95a4c..68c0711 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -102,12 +102,11 @@ static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
}

/* virtio config->get_features() implementation */
-static u32 vp_get_features(struct virtio_device *vdev)
+static u64 vp_get_features(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);

- /* When someone needs more than 32 feature bits, we'll need to
- * steal a bit to indicate that the rest are somewhere else. */
+ /* We only support 32 feature bits. */
return ioread32(vp_dev->ioaddr + VIRTIO_PCI_HOST_FEATURES);
}

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 15a8a05..61a1fe1 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -781,7 +781,7 @@ void vring_transport_features(struct virtio_device *vdev)
break;
default:
/* We don't understand this bit. */
- vdev->features &= ~(1 << i);
+ vdev->features &= ~(1ULL << i);
}
}
}
--
MST

2014-11-24 12:15:22

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v3 00/41] linux: towards virtio-1 guest support



On 24/11/2014 12:52, Michael S. Tsirkin wrote:
> Based on patches by Cornelia Rusty and others, but
> with an API that should allow better static checking of code,
> and slightly more concervative changes in vring,net and blk.
>
> Based on patches by Cornelia and others, but
> with an API that should allow better static checking of code,
> slightly more concervative changes in vring and drivers,
> and compatibility for existing drivers so that
> this series be applied before all drivers are converted.
>
> virtio net,blk and scsi drivers have been converted.
> They now pass sparse without warnings.
>
> net and blk patches have been tested on s390.
> scsi patches pass sparse so they are most likely
> ok too, but haven't been through testing yet -
> they can be dropped from patchset if necessary.
>
> Please review, and consider for 3.19
>
> David, assuming patches are acceptable, I'd like them all to be merged through
> virtio or vhost trees. Could you please ack doing that for net related
> patches?

Patches 37-40:

Acked-by: Paolo Bonzini <[email protected]>

2014-11-24 12:15:34

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v3 04/41] virtio: memory access APIs

On Mon, Nov 24, 2014 at 01:03:24PM +0100, Geert Uytterhoeven wrote:
> On Mon, Nov 24, 2014 at 12:52 PM, Michael S. Tsirkin <[email protected]> wrote:
> > virtio 1.0 makes all memory structures LE, so
> > we need APIs to conditionally do a byteswap on BE
> > architectures.
> >
> > To make it easier to check code statically,
> > add virtio specific types for multi-byte integers
> > in memory.
> >
> > Add low level wrappers that do a byteswap conditionally, these will be
> > useful e.g. for vhost. Add high level wrappers that
> > query device endian-ness and act accordingly.
>
> > diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
> > new file mode 100644
> > index 0000000..824ed0b
> > --- /dev/null
> > +++ b/include/linux/virtio_byteorder.h
>
> > +static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
> > +{
> > + if (little_endian)
> > + return le16_to_cpu((__force __le16)val);
> > + else
> > + return (__force u16)val;
> > +}
>
> What's wrong with just using le16-to_cpu() ...

le16-to_cpu() is simply wrong: virtio needs to be
LE or native endian, depending on whether it's running
in 0.9 or 1.0 mode.

> > --- a/include/uapi/linux/virtio_ring.h
> > +++ b/include/uapi/linux/virtio_ring.h
>
> > /* Virtio ring descriptors: 16 bytes. These can chain together via "next". */
> > struct vring_desc {
> > /* Address (guest-physical). */
> > - __u64 addr;
> > + __virtio64 addr;
>
> ... and __le64?
>
> There's already lots of precedence or this, even in include/uapi/.
>
> Gr{oetje,eeting}s,
>
> Geert

__le would make people think they can use le16-to_cpu() which is wrong.

--
MST

2014-11-24 12:58:46

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH v3 04/41] virtio: memory access APIs

On Mon, Nov 24, 2014 at 1:15 PM, Michael S. Tsirkin <[email protected]> wrote:
> On Mon, Nov 24, 2014 at 01:03:24PM +0100, Geert Uytterhoeven wrote:
>> On Mon, Nov 24, 2014 at 12:52 PM, Michael S. Tsirkin <[email protected]> wrote:
>> > virtio 1.0 makes all memory structures LE, so
>> > we need APIs to conditionally do a byteswap on BE
>> > architectures.
>> >
>> > To make it easier to check code statically,
>> > add virtio specific types for multi-byte integers
>> > in memory.
>> >
>> > Add low level wrappers that do a byteswap conditionally, these will be
>> > useful e.g. for vhost. Add high level wrappers that
>> > query device endian-ness and act accordingly.
>>
>> > diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
>> > new file mode 100644
>> > index 0000000..824ed0b
>> > --- /dev/null
>> > +++ b/include/linux/virtio_byteorder.h
>>
>> > +static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
>> > +{
>> > + if (little_endian)
>> > + return le16_to_cpu((__force __le16)val);
>> > + else
>> > + return (__force u16)val;
>> > +}
>>
>> What's wrong with just using le16-to_cpu() ...
>
> le16-to_cpu() is simply wrong: virtio needs to be
> LE or native endian, depending on whether it's running
> in 0.9 or 1.0 mode.

IC, that was not clear from the description for this patch.
I thought it was dependent on BE architectures.

Nevertheless, any chance you can get rid of the "conditional"?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2014-11-24 14:28:38

by Cédric Le Goater

[permalink] [raw]
Subject: Re: [PATCH v3 26/41] vhost: virtio 1.0 endian-ness support

Hi Michael,

Do you have a tree from where I could pull these patches ?

Thanks,

C.

On 11/24/2014 12:54 PM, Michael S. Tsirkin wrote:
> Signed-off-by: Michael S. Tsirkin <[email protected]>
> ---
> drivers/vhost/vhost.c | 93 +++++++++++++++++++++++++++++++--------------------
> 1 file changed, 56 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index c90f437..4d379ed 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -33,8 +33,8 @@ enum {
> VHOST_MEMORY_F_LOG = 0x1,
> };
>
> -#define vhost_used_event(vq) ((u16 __user *)&vq->avail->ring[vq->num])
> -#define vhost_avail_event(vq) ((u16 __user *)&vq->used->ring[vq->num])
> +#define vhost_used_event(vq) ((__virtio16 __user *)&vq->avail->ring[vq->num])
> +#define vhost_avail_event(vq) ((__virtio16 __user *)&vq->used->ring[vq->num])
>
> static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
> poll_table *pt)
> @@ -1001,7 +1001,7 @@ EXPORT_SYMBOL_GPL(vhost_log_write);
> static int vhost_update_used_flags(struct vhost_virtqueue *vq)
> {
> void __user *used;
> - if (__put_user(vq->used_flags, &vq->used->flags) < 0)
> + if (__put_user(cpu_to_vhost16(vq, vq->used_flags), &vq->used->flags) < 0)
> return -EFAULT;
> if (unlikely(vq->log_used)) {
> /* Make sure the flag is seen before log. */
> @@ -1019,7 +1019,7 @@ static int vhost_update_used_flags(struct vhost_virtqueue *vq)
>
> static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event)
> {
> - if (__put_user(vq->avail_idx, vhost_avail_event(vq)))
> + if (__put_user(cpu_to_vhost16(vq, vq->avail_idx), vhost_avail_event(vq)))
> return -EFAULT;
> if (unlikely(vq->log_used)) {
> void __user *used;
> @@ -1038,6 +1038,7 @@ static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event)
>
> int vhost_init_used(struct vhost_virtqueue *vq)
> {
> + __virtio16 last_used_idx;
> int r;
> if (!vq->private_data)
> return 0;
> @@ -1046,7 +1047,13 @@ int vhost_init_used(struct vhost_virtqueue *vq)
> if (r)
> return r;
> vq->signalled_used_valid = false;
> - return get_user(vq->last_used_idx, &vq->used->idx);
> + if (!access_ok(VERIFY_READ, &vq->used->idx, sizeof vq->used->idx))
> + return -EFAULT;
> + r = __get_user(last_used_idx, &vq->used->idx);
> + if (r)
> + return r;
> + vq->last_used_idx = vhost16_to_cpu(vq, last_used_idx);
> + return 0;
> }
> EXPORT_SYMBOL_GPL(vhost_init_used);
>
> @@ -1087,16 +1094,16 @@ static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
> /* Each buffer in the virtqueues is actually a chain of descriptors. This
> * function returns the next descriptor in the chain,
> * or -1U if we're at the end. */
> -static unsigned next_desc(struct vring_desc *desc)
> +static unsigned next_desc(struct vhost_virtqueue *vq, struct vring_desc *desc)
> {
> unsigned int next;
>
> /* If this descriptor says it doesn't chain, we're done. */
> - if (!(desc->flags & VRING_DESC_F_NEXT))
> + if (!(desc->flags & cpu_to_vhost16(vq, VRING_DESC_F_NEXT)))
> return -1U;
>
> /* Check they're not leading us off end of descriptors. */
> - next = desc->next;
> + next = vhost16_to_cpu(vq, desc->next);
> /* Make sure compiler knows to grab that: we don't want it changing! */
> /* We will use the result as an index in an array, so most
> * architectures only need a compiler barrier here. */
> @@ -1113,18 +1120,19 @@ static int get_indirect(struct vhost_virtqueue *vq,
> {
> struct vring_desc desc;
> unsigned int i = 0, count, found = 0;
> + u32 len = vhost32_to_cpu(vq, indirect->len);
> int ret;
>
> /* Sanity check */
> - if (unlikely(indirect->len % sizeof desc)) {
> + if (unlikely(len % sizeof desc)) {
> vq_err(vq, "Invalid length in indirect descriptor: "
> "len 0x%llx not multiple of 0x%zx\n",
> - (unsigned long long)indirect->len,
> + (unsigned long long)vhost32_to_cpu(vq, indirect->len),
> sizeof desc);
> return -EINVAL;
> }
>
> - ret = translate_desc(vq, indirect->addr, indirect->len, vq->indirect,
> + ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, vq->indirect,
> UIO_MAXIOV);
> if (unlikely(ret < 0)) {
> vq_err(vq, "Translation failure %d in indirect.\n", ret);
> @@ -1135,7 +1143,7 @@ static int get_indirect(struct vhost_virtqueue *vq,
> * architectures only need a compiler barrier here. */
> read_barrier_depends();
>
> - count = indirect->len / sizeof desc;
> + count = len / sizeof desc;
> /* Buffers are chained via a 16 bit next field, so
> * we can have at most 2^16 of these. */
> if (unlikely(count > USHRT_MAX + 1)) {
> @@ -1155,16 +1163,17 @@ static int get_indirect(struct vhost_virtqueue *vq,
> if (unlikely(memcpy_fromiovec((unsigned char *)&desc,
> vq->indirect, sizeof desc))) {
> vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n",
> - i, (size_t)indirect->addr + i * sizeof desc);
> + i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc);
> return -EINVAL;
> }
> - if (unlikely(desc.flags & VRING_DESC_F_INDIRECT)) {
> + if (unlikely(desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT))) {
> vq_err(vq, "Nested indirect descriptor: idx %d, %zx\n",
> - i, (size_t)indirect->addr + i * sizeof desc);
> + i, (size_t)vhost64_to_cpu(vq, indirect->addr) + i * sizeof desc);
> return -EINVAL;
> }
>
> - ret = translate_desc(vq, desc.addr, desc.len, iov + iov_count,
> + ret = translate_desc(vq, vhost64_to_cpu(vq, desc.addr),
> + vhost32_to_cpu(vq, desc.len), iov + iov_count,
> iov_size - iov_count);
> if (unlikely(ret < 0)) {
> vq_err(vq, "Translation failure %d indirect idx %d\n",
> @@ -1172,11 +1181,11 @@ static int get_indirect(struct vhost_virtqueue *vq,
> return ret;
> }
> /* If this is an input descriptor, increment that count. */
> - if (desc.flags & VRING_DESC_F_WRITE) {
> + if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_WRITE)) {
> *in_num += ret;
> if (unlikely(log)) {
> - log[*log_num].addr = desc.addr;
> - log[*log_num].len = desc.len;
> + log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> + log[*log_num].len = vhost32_to_cpu(vq, desc.len);
> ++*log_num;
> }
> } else {
> @@ -1189,7 +1198,7 @@ static int get_indirect(struct vhost_virtqueue *vq,
> }
> *out_num += ret;
> }
> - } while ((i = next_desc(&desc)) != -1);
> + } while ((i = next_desc(vq, &desc)) != -1);
> return 0;
> }
>
> @@ -1209,15 +1218,18 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> struct vring_desc desc;
> unsigned int i, head, found = 0;
> u16 last_avail_idx;
> + __virtio16 avail_idx;
> + __virtio16 ring_head;
> int ret;
>
> /* Check it isn't doing very strange things with descriptor numbers. */
> last_avail_idx = vq->last_avail_idx;
> - if (unlikely(__get_user(vq->avail_idx, &vq->avail->idx))) {
> + if (unlikely(__get_user(avail_idx, &vq->avail->idx))) {
> vq_err(vq, "Failed to access avail idx at %p\n",
> &vq->avail->idx);
> return -EFAULT;
> }
> + vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
>
> if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
> vq_err(vq, "Guest moved used index from %u to %u",
> @@ -1234,7 +1246,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
>
> /* Grab the next descriptor number they're advertising, and increment
> * the index we've seen. */
> - if (unlikely(__get_user(head,
> + if (unlikely(__get_user(ring_head,
> &vq->avail->ring[last_avail_idx % vq->num]))) {
> vq_err(vq, "Failed to read head: idx %d address %p\n",
> last_avail_idx,
> @@ -1242,6 +1254,8 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> return -EFAULT;
> }
>
> + head = vhost16_to_cpu(vq, ring_head);
> +
> /* If their number is silly, that's an error. */
> if (unlikely(head >= vq->num)) {
> vq_err(vq, "Guest says index %u > %u is available",
> @@ -1274,7 +1288,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> i, vq->desc + i);
> return -EFAULT;
> }
> - if (desc.flags & VRING_DESC_F_INDIRECT) {
> + if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_INDIRECT)) {
> ret = get_indirect(vq, iov, iov_size,
> out_num, in_num,
> log, log_num, &desc);
> @@ -1286,20 +1300,21 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> continue;
> }
>
> - ret = translate_desc(vq, desc.addr, desc.len, iov + iov_count,
> + ret = translate_desc(vq, vhost64_to_cpu(vq, desc.addr),
> + vhost32_to_cpu(vq, desc.len), iov + iov_count,
> iov_size - iov_count);
> if (unlikely(ret < 0)) {
> vq_err(vq, "Translation failure %d descriptor idx %d\n",
> ret, i);
> return ret;
> }
> - if (desc.flags & VRING_DESC_F_WRITE) {
> + if (desc.flags & cpu_to_vhost16(vq, VRING_DESC_F_WRITE)) {
> /* If this is an input descriptor,
> * increment that count. */
> *in_num += ret;
> if (unlikely(log)) {
> - log[*log_num].addr = desc.addr;
> - log[*log_num].len = desc.len;
> + log[*log_num].addr = vhost64_to_cpu(vq, desc.addr);
> + log[*log_num].len = vhost32_to_cpu(vq, desc.len);
> ++*log_num;
> }
> } else {
> @@ -1312,7 +1327,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
> }
> *out_num += ret;
> }
> - } while ((i = next_desc(&desc)) != -1);
> + } while ((i = next_desc(vq, &desc)) != -1);
>
> /* On success, increment avail index. */
> vq->last_avail_idx++;
> @@ -1335,7 +1350,10 @@ EXPORT_SYMBOL_GPL(vhost_discard_vq_desc);
> * want to notify the guest, using eventfd. */
> int vhost_add_used(struct vhost_virtqueue *vq, unsigned int head, int len)
> {
> - struct vring_used_elem heads = { head, len };
> + struct vring_used_elem heads = {
> + cpu_to_vhost32(vq, head),
> + cpu_to_vhost32(vq, len)
> + };
>
> return vhost_add_used_n(vq, &heads, 1);
> }
> @@ -1404,7 +1422,7 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, struct vring_used_elem *heads,
>
> /* Make sure buffer is written before we update index. */
> smp_wmb();
> - if (put_user(vq->last_used_idx, &vq->used->idx)) {
> + if (__put_user(cpu_to_vhost16(vq, vq->last_used_idx), &vq->used->idx)) {
> vq_err(vq, "Failed to increment used idx");
> return -EFAULT;
> }
> @@ -1422,7 +1440,8 @@ EXPORT_SYMBOL_GPL(vhost_add_used_n);
>
> static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> {
> - __u16 old, new, event;
> + __u16 old, new;
> + __virtio16 event;
> bool v;
> /* Flush out used index updates. This is paired
> * with the barrier that the Guest executes when enabling
> @@ -1434,12 +1453,12 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> return true;
>
> if (!vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) {
> - __u16 flags;
> + __virtio16 flags;
> if (__get_user(flags, &vq->avail->flags)) {
> vq_err(vq, "Failed to get flags");
> return true;
> }
> - return !(flags & VRING_AVAIL_F_NO_INTERRUPT);
> + return !(flags & cpu_to_vhost16(vq, VRING_AVAIL_F_NO_INTERRUPT));
> }
> old = vq->signalled_used;
> v = vq->signalled_used_valid;
> @@ -1449,11 +1468,11 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> if (unlikely(!v))
> return true;
>
> - if (get_user(event, vhost_used_event(vq))) {
> + if (__get_user(event, vhost_used_event(vq))) {
> vq_err(vq, "Failed to get used event idx");
> return true;
> }
> - return vring_need_event(event, new, old);
> + return vring_need_event(vhost16_to_cpu(vq, event), new, old);
> }
>
> /* This actually signals the guest, using eventfd. */
> @@ -1488,7 +1507,7 @@ EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n);
> /* OK, now we need to know about added descriptors. */
> bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> {
> - u16 avail_idx;
> + __virtio16 avail_idx;
> int r;
>
> if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
> @@ -1519,7 +1538,7 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
> return false;
> }
>
> - return avail_idx != vq->avail_idx;
> + return vhost16_to_cpu(vq, avail_idx) != vq->avail_idx;
> }
> EXPORT_SYMBOL_GPL(vhost_enable_notify);
>

2014-11-24 18:35:58

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v3 00/41] linux: towards virtio-1 guest support

From: "Michael S. Tsirkin" <[email protected]>
Date: Mon, 24 Nov 2014 13:52:32 +0200

> Based on patches by Cornelia Rusty and others, but
> with an API that should allow better static checking of code,
> and slightly more concervative changes in vring,net and blk.
>
> Based on patches by Cornelia and others, but
> with an API that should allow better static checking of code,
> slightly more concervative changes in vring and drivers,

Some cut and paste duplication happening here :)

2014-11-24 18:48:41

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH v3 00/41] linux: towards virtio-1 guest support

On Mon, 24 Nov 2014 13:52:32 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> Based on patches by Cornelia Rusty and others, but
> with an API that should allow better static checking of code,
> and slightly more concervative changes in vring,net and blk.
>
> Based on patches by Cornelia and others, but
> with an API that should allow better static checking of code,
> slightly more concervative changes in vring and drivers,
> and compatibility for existing drivers so that
> this series be applied before all drivers are converted.
>
> virtio net,blk and scsi drivers have been converted.
> They now pass sparse without warnings.
>
> net and blk patches have been tested on s390.

vring_transport_features() seems to knock off the version 1 bit (I
wonder why I did not see that before, maybe VIRTIO_TRANSPORT_F_END had
not been increased before?). If I add VIRTIO_F_VERSION_1 in this
function, virtio-net and virtio-blk seem to work fine with version 1 on
virtio-ccw (tested with my old virtio-1 qemu branch).

I'll try to do a bit of reviewing tomorrow.

2014-11-24 18:49:40

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v3 04/41] virtio: memory access APIs

On Mon, Nov 24, 2014 at 01:58:43PM +0100, Geert Uytterhoeven wrote:
> On Mon, Nov 24, 2014 at 1:15 PM, Michael S. Tsirkin <[email protected]> wrote:
> > On Mon, Nov 24, 2014 at 01:03:24PM +0100, Geert Uytterhoeven wrote:
> >> On Mon, Nov 24, 2014 at 12:52 PM, Michael S. Tsirkin <[email protected]> wrote:
> >> > virtio 1.0 makes all memory structures LE, so
> >> > we need APIs to conditionally do a byteswap on BE
> >> > architectures.
> >> >
> >> > To make it easier to check code statically,
> >> > add virtio specific types for multi-byte integers
> >> > in memory.
> >> >
> >> > Add low level wrappers that do a byteswap conditionally, these will be
> >> > useful e.g. for vhost. Add high level wrappers that
> >> > query device endian-ness and act accordingly.
> >>
> >> > diff --git a/include/linux/virtio_byteorder.h b/include/linux/virtio_byteorder.h
> >> > new file mode 100644
> >> > index 0000000..824ed0b
> >> > --- /dev/null
> >> > +++ b/include/linux/virtio_byteorder.h
> >>
> >> > +static inline u16 __virtio16_to_cpu(bool little_endian, __virtio16 val)
> >> > +{
> >> > + if (little_endian)
> >> > + return le16_to_cpu((__force __le16)val);
> >> > + else
> >> > + return (__force u16)val;
> >> > +}
> >>
> >> What's wrong with just using le16-to_cpu() ...
> >
> > le16-to_cpu() is simply wrong: virtio needs to be
> > LE or native endian, depending on whether it's running
> > in 0.9 or 1.0 mode.
>
> IC, that was not clear from the description for this patch.
> I thought it was dependent on BE architectures.
>
> Nevertheless, any chance you can get rid of the "conditional"?

I don't see how - this is fundamental to any virtio device
that wants to support both 0.9 and 1.0 from the same
codebase.

> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

2014-11-24 19:12:20

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v3 00/41] linux: towards virtio-1 guest support

On Mon, Nov 24, 2014 at 07:48:31PM +0100, Cornelia Huck wrote:
> On Mon, 24 Nov 2014 13:52:32 +0200
> "Michael S. Tsirkin" <[email protected]> wrote:
>
> > Based on patches by Cornelia Rusty and others, but
> > with an API that should allow better static checking of code,
> > and slightly more concervative changes in vring,net and blk.
> >
> > Based on patches by Cornelia and others, but
> > with an API that should allow better static checking of code,
> > slightly more concervative changes in vring and drivers,
> > and compatibility for existing drivers so that
> > this series be applied before all drivers are converted.
> >
> > virtio net,blk and scsi drivers have been converted.
> > They now pass sparse without warnings.
> >
> > net and blk patches have been tested on s390.
>
> vring_transport_features() seems to knock off the version 1 bit (I
> wonder why I did not see that before, maybe VIRTIO_TRANSPORT_F_END had
> not been increased before?). If I add VIRTIO_F_VERSION_1 in this
> function, virtio-net and virtio-blk seem to work fine with version 1 on
> virtio-ccw (tested with my old virtio-1 qemu branch).

What if we just move VIRTIO_TRANSPORT_F_END back for now?
Does patch below fix it for you?

> I'll try to do a bit of reviewing tomorrow.

BTW could you post your latest qemu bits please?



diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index a6d0cde..0071de9 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -47,7 +47,7 @@
* transport being used (eg. virtio_ring), the rest are per-device feature
* bits. */
#define VIRTIO_TRANSPORT_F_START 28
-#define VIRTIO_TRANSPORT_F_END 33
+#define VIRTIO_TRANSPORT_F_END 32

/* Do we get callbacks when the ring is completely used, even if we've
* suppressed them? */
--
MST

2014-11-24 20:28:46

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PATCH v3 26/41] vhost: virtio 1.0 endian-ness support

On Mon, Nov 24, 2014 at 03:28:26PM +0100, Cedric Le Goater wrote:
> Hi Michael,
>
> Do you have a tree from where I could pull these patches ?
>
> Thanks,
>
> C.

Yes - vhost-next that linux-next includes.

--
MST

2014-11-24 21:26:02

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v3 00/41] linux: towards virtio-1 guest support

From: "Michael S. Tsirkin" <[email protected]>
Date: Mon, 24 Nov 2014 13:52:32 +0200

> David, assuming patches are acceptable, I'd like them all to be merged through
> virtio or vhost trees. Could you please ack doing that for net related
> patches?

Sure, for the networking bits:

Acked-by: David S. Miller <[email protected]>

2014-11-25 09:06:00

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH v3 00/41] linux: towards virtio-1 guest support

On Mon, 24 Nov 2014 21:12:05 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> On Mon, Nov 24, 2014 at 07:48:31PM +0100, Cornelia Huck wrote:

> > vring_transport_features() seems to knock off the version 1 bit (I
> > wonder why I did not see that before, maybe VIRTIO_TRANSPORT_F_END had
> > not been increased before?). If I add VIRTIO_F_VERSION_1 in this
> > function, virtio-net and virtio-blk seem to work fine with version 1 on
> > virtio-ccw (tested with my old virtio-1 qemu branch).
>
> What if we just move VIRTIO_TRANSPORT_F_END back for now?
> Does patch below fix it for you?

It does.

>
> > I'll try to do a bit of reviewing tomorrow.
>
> BTW could you post your latest qemu bits please?

I haven't done much work on virtio-1 lately, but I could rebase my
existing branch to current qemu.