2024-02-17 18:04:53

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH 0/6] Low speed Hyper-V devices support

Hyper-V is adding multiple low speed "speciality" synthetic devices.
Instead of writing a new kernel-level VMBus driver for each device,
make the devices accessible to user space through UIO-based
uio_hv_generic driver. Each device can then be supported by a user
space driver. This approach optimizes the development process and
provides flexibility to user space applications to control the key
interactions with the VMBus ring buffer.

The new synthetic devices are low speed devices that don't support
VMBus monitor bits, and so they must use vmbus_setevent() to notify
the host of ring buffer updates. These new devices also have smaller
ring buffer sizes which requires to add support for variable ring buffer
sizes.

Moreover, this patch series adds a new implementation of the fcopy
application that uses the new UIO driver. The older fcopy driver and
application will be phased out gradually. Development of other similar
userspace drivers is still underway.


Efforts have been made previously to implement this solution earlier.
Here are the discussions related to those attempts:
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/
https://lore.kernel.org/lkml/[email protected]/

Saurabh Sengar (6):
Drivers: hv: vmbus: Add utility function for querying ring size
uio_hv_generic: Query the ringbuffer size for device
uio_hv_generic: Enable interrupt for low speed VMBus devices
tools: hv: Add vmbus_bufring
tools: hv: Add new fcopy application based on uio driver
Drivers: hv: Remove fcopy driver

drivers/hv/Makefile | 2 +-
drivers/hv/channel_mgmt.c | 7 +-
drivers/hv/hv_fcopy.c | 427 -----------------------------
drivers/hv/hv_util.c | 12 -
drivers/hv/hyperv_vmbus.h | 5 +
drivers/uio/uio_hv_generic.c | 14 +-
include/linux/hyperv.h | 1 +
tools/hv/Build | 3 +-
tools/hv/Makefile | 10 +-
tools/hv/hv_fcopy_daemon.c | 266 ------------------
tools/hv/hv_fcopy_uio_daemon.c | 488 +++++++++++++++++++++++++++++++++
tools/hv/vmbus_bufring.c | 316 +++++++++++++++++++++
tools/hv/vmbus_bufring.h | 158 +++++++++++
13 files changed, 988 insertions(+), 721 deletions(-)
delete mode 100644 drivers/hv/hv_fcopy.c
delete mode 100644 tools/hv/hv_fcopy_daemon.c
create mode 100644 tools/hv/hv_fcopy_uio_daemon.c
create mode 100644 tools/hv/vmbus_bufring.c
create mode 100644 tools/hv/vmbus_bufring.h

--
2.34.1



2024-02-17 18:05:04

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH 1/6] Drivers: hv: vmbus: Add utility function for querying ring size

Add a function to query for the preferred ring buffer size of VMBus
device.

Signed-off-by: Saurabh Sengar <[email protected]>
---
drivers/hv/channel_mgmt.c | 7 +++++--
drivers/hv/hyperv_vmbus.h | 5 +++++
include/linux/hyperv.h | 1 +
3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 2f4d09ce027a..7ea444d72f9f 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -120,7 +120,8 @@ const struct vmbus_device vmbus_devs[] = {
},

/* File copy */
- { .dev_type = HV_FCOPY,
+ { .pref_ring_size = 0x4000,
+ .dev_type = HV_FCOPY,
HV_FCOPY_GUID,
.perf_device = false,
.allowed_in_isolated = false,
@@ -141,11 +142,13 @@ const struct vmbus_device vmbus_devs[] = {
},

/* Unknown GUID */
- { .dev_type = HV_UNKNOWN,
+ { .pref_ring_size = 0x11000,
+ .dev_type = HV_UNKNOWN,
.perf_device = false,
.allowed_in_isolated = false,
},
};
+EXPORT_SYMBOL_GPL(vmbus_devs);

static const struct {
guid_t guid;
diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
index f6b1e710f805..76ac5185a01a 100644
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -417,6 +417,11 @@ static inline bool hv_is_perf_channel(struct vmbus_channel *channel)
return vmbus_devs[channel->device_id].perf_device;
}

+static inline size_t hv_dev_ring_size(struct vmbus_channel *channel)
+{
+ return vmbus_devs[channel->device_id].pref_ring_size;
+}
+
static inline bool hv_is_allocated_cpu(unsigned int cpu)
{
struct vmbus_channel *channel, *sc;
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2b00faf98017..5951c7bb5712 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -800,6 +800,7 @@ struct vmbus_requestor {
#define VMBUS_RQST_RESET (U64_MAX - 3)

struct vmbus_device {
+ size_t pref_ring_size;
u16 dev_type;
guid_t guid;
bool perf_device;
--
2.34.1


2024-02-17 18:05:06

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH 2/6] uio_hv_generic: Query the ringbuffer size for device

Query the ring buffer size from pre defined table per device.
Keep the size as is if the device doesn't have any preferred
ring size.

Signed-off-by: Saurabh Sengar <[email protected]>
---
drivers/uio/uio_hv_generic.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 20d9762331bd..4bda6b52e49e 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -238,6 +238,7 @@ hv_uio_probe(struct hv_device *dev,
struct hv_uio_private_data *pdata;
void *ring_buffer;
int ret;
+ size_t ring_size = hv_dev_ring_size(channel);

/* Communicating with host has to be via shared memory not hypercall */
if (!channel->offermsg.monitor_allocated) {
@@ -245,12 +246,14 @@ hv_uio_probe(struct hv_device *dev,
return -ENOTSUPP;
}

+ if (!ring_size)
+ ring_size = HV_RING_SIZE * PAGE_SIZE;
+
pdata = devm_kzalloc(&dev->device, sizeof(*pdata), GFP_KERNEL);
if (!pdata)
return -ENOMEM;

- ret = vmbus_alloc_ring(channel, HV_RING_SIZE * PAGE_SIZE,
- HV_RING_SIZE * PAGE_SIZE);
+ ret = vmbus_alloc_ring(channel, ring_size, ring_size);
if (ret)
return ret;

--
2.34.1


2024-02-17 18:05:10

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH 4/6] tools: hv: Add vmbus_bufring

Common userspace interface for read/write from VMBus ringbuffer.
This implementation is open for use by any userspace driver or
application seeking direct control over VMBus ring buffers.
A significant part of this code is borrowed from DPDK.
Link: https://github.com/DPDK/dpdk/

Signed-off-by: Mary Hardy <[email protected]>
Signed-off-by: Saurabh Sengar <[email protected]>
---
tools/hv/vmbus_bufring.c | 316 +++++++++++++++++++++++++++++++++++++++
tools/hv/vmbus_bufring.h | 158 ++++++++++++++++++++
2 files changed, 474 insertions(+)
create mode 100644 tools/hv/vmbus_bufring.c
create mode 100644 tools/hv/vmbus_bufring.h

diff --git a/tools/hv/vmbus_bufring.c b/tools/hv/vmbus_bufring.c
new file mode 100644
index 000000000000..b74b56283bc5
--- /dev/null
+++ b/tools/hv/vmbus_bufring.c
@@ -0,0 +1,316 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2009-2012,2016,2023 Microsoft Corp.
+ * Copyright (c) 2012 NetApp Inc.
+ * Copyright (c) 2012 Citrix Inc.
+ * All rights reserved.
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <emmintrin.h>
+#include <linux/limits.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/uio.h>
+#include <unistd.h>
+#include "vmbus_bufring.h"
+
+#define rte_compiler_barrier() ({ asm volatile ("" : : : "memory"); })
+
+#define rte_smp_rwmb() ({ asm volatile ("" : : : "memory"); })
+
+#define VMBUS_RQST_ERROR 0xFFFFFFFFFFFFFFFF
+#define ALIGN(val, align) ((typeof(val))((val) & (~((typeof(val))((align) - 1)))))
+
+void *vmbus_uio_map(int *fd, int size)
+{
+ void *map;
+
+ map = mmap(NULL, 2 * size, PROT_READ | PROT_WRITE, MAP_SHARED, *fd, 0);
+ if (map == MAP_FAILED)
+ return NULL;
+
+ return map;
+}
+
+/* Increase bufring index by inc with wraparound */
+static inline uint32_t vmbus_br_idxinc(uint32_t idx, uint32_t inc, uint32_t sz)
+{
+ idx += inc;
+ if (idx >= sz)
+ idx -= sz;
+
+ return idx;
+}
+
+void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen)
+{
+ br->vbr = buf;
+ br->windex = br->vbr->windex;
+ br->dsize = blen - sizeof(struct vmbus_bufring);
+}
+
+static inline __always_inline void
+rte_smp_mb(void)
+{
+ asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+}
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+ uint8_t res;
+
+ asm volatile("lock ; "
+ "cmpxchgl %[src], %[dst];"
+ "sete %[res];"
+ : [res] "=a" (res), /* output */
+ [dst] "=m" (*dst)
+ : [src] "r" (src), /* input */
+ "a" (exp),
+ "m" (*dst)
+ : "memory"); /* no-clobber list */
+ return res;
+}
+
+static inline uint32_t
+vmbus_txbr_copyto(const struct vmbus_br *tbr, uint32_t windex,
+ const void *src0, uint32_t cplen)
+{
+ uint8_t *br_data = tbr->vbr->data;
+ uint32_t br_dsize = tbr->dsize;
+ const uint8_t *src = src0;
+
+ /* XXX use double mapping like Linux kernel? */
+ if (cplen > br_dsize - windex) {
+ uint32_t fraglen = br_dsize - windex;
+
+ /* Wrap-around detected */
+ memcpy(br_data + windex, src, fraglen);
+ memcpy(br_data, src + fraglen, cplen - fraglen);
+ } else {
+ memcpy(br_data + windex, src, cplen);
+ }
+
+ return vmbus_br_idxinc(windex, cplen, br_dsize);
+}
+
+/*
+ * Write scattered channel packet to TX bufring.
+ *
+ * The offset of this channel packet is written as a 64bits value
+ * immediately after this channel packet.
+ *
+ * The write goes through three stages:
+ * 1. Reserve space in ring buffer for the new data.
+ * Writer atomically moves priv_write_index.
+ * 2. Copy the new data into the ring.
+ * 3. Update the tail of the ring (visible to host) that indicates
+ * next read location. Writer updates write_index
+ */
+static int
+vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
+ bool *need_sig)
+{
+ struct vmbus_bufring *vbr = tbr->vbr;
+ uint32_t ring_size = tbr->dsize;
+ uint32_t old_windex, next_windex, windex, total;
+ uint64_t save_windex;
+ int i;
+
+ total = 0;
+ for (i = 0; i < iovlen; i++)
+ total += iov[i].iov_len;
+ total += sizeof(save_windex);
+
+ /* Reserve space in ring */
+ do {
+ uint32_t avail;
+
+ /* Get current free location */
+ old_windex = tbr->windex;
+
+ /* Prevent compiler reordering this with calculation */
+ rte_compiler_barrier();
+
+ avail = vmbus_br_availwrite(tbr, old_windex);
+
+ /* If not enough space in ring, then tell caller. */
+ if (avail <= total)
+ return -EAGAIN;
+
+ next_windex = vmbus_br_idxinc(old_windex, total, ring_size);
+
+ /* Atomic update of next write_index for other threads */
+ } while (!rte_atomic32_cmpset(&tbr->windex, old_windex, next_windex));
+
+ /* Space from old..new is now reserved */
+ windex = old_windex;
+ for (i = 0; i < iovlen; i++)
+ windex = vmbus_txbr_copyto(tbr, windex, iov[i].iov_base, iov[i].iov_len);
+
+ /* Set the offset of the current channel packet. */
+ save_windex = ((uint64_t)old_windex) << 32;
+ windex = vmbus_txbr_copyto(tbr, windex, &save_windex,
+ sizeof(save_windex));
+
+ /* The region reserved should match region used */
+ if (windex != next_windex)
+ return -EINVAL;
+
+ /* Ensure that data is available before updating host index */
+ rte_smp_rwmb();
+
+ /* Checkin for our reservation. wait for our turn to update host */
+ while (!rte_atomic32_cmpset(&vbr->windex, old_windex, next_windex))
+ _mm_pause();
+
+ return 0;
+}
+
+int rte_vmbus_chan_send(struct vmbus_br *txbr, uint16_t type, void *data,
+ uint32_t dlen, uint32_t flags)
+{
+ struct vmbus_chanpkt pkt;
+ unsigned int pktlen, pad_pktlen;
+ const uint32_t hlen = sizeof(pkt);
+ bool send_evt = false;
+ uint64_t pad = 0;
+ struct iovec iov[3];
+ int error;
+
+ pktlen = hlen + dlen;
+ pad_pktlen = ALIGN(pktlen, sizeof(uint64_t));
+
+ pkt.hdr.type = type;
+ pkt.hdr.flags = flags;
+ pkt.hdr.hlen = hlen >> VMBUS_CHANPKT_SIZE_SHIFT;
+ pkt.hdr.tlen = pad_pktlen >> VMBUS_CHANPKT_SIZE_SHIFT;
+ pkt.hdr.xactid = VMBUS_RQST_ERROR;
+
+ iov[0].iov_base = &pkt;
+ iov[0].iov_len = hlen;
+ iov[1].iov_base = data;
+ iov[1].iov_len = dlen;
+ iov[2].iov_base = &pad;
+ iov[2].iov_len = pad_pktlen - pktlen;
+
+ error = vmbus_txbr_write(txbr, iov, 3, &send_evt);
+
+ return error;
+}
+
+static inline uint32_t
+vmbus_rxbr_copyfrom(const struct vmbus_br *rbr, uint32_t rindex,
+ void *dst0, size_t cplen)
+{
+ const uint8_t *br_data = rbr->vbr->data;
+ uint32_t br_dsize = rbr->dsize;
+ uint8_t *dst = dst0;
+
+ if (cplen > br_dsize - rindex) {
+ uint32_t fraglen = br_dsize - rindex;
+
+ /* Wrap-around detected. */
+ memcpy(dst, br_data + rindex, fraglen);
+ memcpy(dst + fraglen, br_data, cplen - fraglen);
+ } else {
+ memcpy(dst, br_data + rindex, cplen);
+ }
+
+ return vmbus_br_idxinc(rindex, cplen, br_dsize);
+}
+
+/* Copy data from receive ring but don't change index */
+static int
+vmbus_rxbr_peek(const struct vmbus_br *rbr, void *data, size_t dlen)
+{
+ uint32_t avail;
+
+ /*
+ * The requested data and the 64bits channel packet
+ * offset should be there at least.
+ */
+ avail = vmbus_br_availread(rbr);
+ if (avail < dlen + sizeof(uint64_t))
+ return -EAGAIN;
+
+ vmbus_rxbr_copyfrom(rbr, rbr->vbr->rindex, data, dlen);
+ return 0;
+}
+
+/*
+ * Copy data from receive ring and change index
+ * NOTE:
+ * We assume (dlen + skip) == sizeof(channel packet).
+ */
+static int
+vmbus_rxbr_read(struct vmbus_br *rbr, void *data, size_t dlen, size_t skip)
+{
+ struct vmbus_bufring *vbr = rbr->vbr;
+ uint32_t br_dsize = rbr->dsize;
+ uint32_t rindex;
+
+ if (vmbus_br_availread(rbr) < dlen + skip + sizeof(uint64_t))
+ return -EAGAIN;
+
+ /* Record where host was when we started read (for debug) */
+ rbr->windex = rbr->vbr->windex;
+
+ /*
+ * Copy channel packet from RX bufring.
+ */
+ rindex = vmbus_br_idxinc(rbr->vbr->rindex, skip, br_dsize);
+ rindex = vmbus_rxbr_copyfrom(rbr, rindex, data, dlen);
+
+ /*
+ * Discard this channel packet's 64bits offset, which is useless to us.
+ */
+ rindex = vmbus_br_idxinc(rindex, sizeof(uint64_t), br_dsize);
+
+ /* Update the read index _after_ the channel packet is fetched. */
+ rte_compiler_barrier();
+
+ vbr->rindex = rindex;
+
+ return 0;
+}
+
+int rte_vmbus_chan_recv_raw(struct vmbus_br *rxbr,
+ void *data, uint32_t *len)
+{
+ struct vmbus_chanpkt_hdr pkt;
+ uint32_t dlen, bufferlen = *len;
+ int error;
+
+ error = vmbus_rxbr_peek(rxbr, &pkt, sizeof(pkt));
+ if (error)
+ return error;
+
+ if (unlikely(pkt.hlen < VMBUS_CHANPKT_HLEN_MIN))
+ /* XXX this channel is dead actually. */
+ return -EIO;
+
+ if (unlikely(pkt.hlen > pkt.tlen))
+ return -EIO;
+
+ /* Length are in quad words */
+ dlen = pkt.tlen << VMBUS_CHANPKT_SIZE_SHIFT;
+ *len = dlen;
+
+ /* If caller buffer is not large enough */
+ if (unlikely(dlen > bufferlen))
+ return -ENOBUFS;
+
+ /* Read data and skip packet header */
+ error = vmbus_rxbr_read(rxbr, data, dlen, 0);
+ if (error)
+ return error;
+
+ /* Return the number of bytes read */
+ return dlen + sizeof(uint64_t);
+}
diff --git a/tools/hv/vmbus_bufring.h b/tools/hv/vmbus_bufring.h
new file mode 100644
index 000000000000..6e7caacfff57
--- /dev/null
+++ b/tools/hv/vmbus_bufring.h
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: BSD-3-Clause */
+
+#ifndef _VMBUS_BUF_H_
+#define _VMBUS_BUF_H_
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#define __packed __attribute__((__packed__))
+#define unlikely(x) __builtin_expect(!!(x), 0)
+
+#define ICMSGHDRFLAG_TRANSACTION 1
+#define ICMSGHDRFLAG_REQUEST 2
+#define ICMSGHDRFLAG_RESPONSE 4
+
+#define IC_VERSION_NEGOTIATION_MAX_VER_COUNT 100
+#define ICMSG_HDR (sizeof(struct vmbuspipe_hdr) + sizeof(struct icmsg_hdr))
+#define ICMSG_NEGOTIATE_PKT_SIZE(icframe_vercnt, icmsg_vercnt) \
+ (ICMSG_HDR + sizeof(struct icmsg_negotiate) + \
+ (((icframe_vercnt) + (icmsg_vercnt)) * sizeof(struct ic_version)))
+
+/*
+ * Channel packets
+ */
+
+/* Channel packet flags */
+#define VMBUS_CHANPKT_TYPE_INBAND 0x0006
+#define VMBUS_CHANPKT_TYPE_RXBUF 0x0007
+#define VMBUS_CHANPKT_TYPE_GPA 0x0009
+#define VMBUS_CHANPKT_TYPE_COMP 0x000b
+
+#define VMBUS_CHANPKT_FLAG_NONE 0
+#define VMBUS_CHANPKT_FLAG_RC 0x0001 /* report completion */
+
+#define VMBUS_CHANPKT_SIZE_SHIFT 3
+#define VMBUS_CHANPKT_SIZE_ALIGN BIT(VMBUS_CHANPKT_SIZE_SHIFT)
+#define VMBUS_CHANPKT_HLEN_MIN \
+ (sizeof(struct vmbus_chanpkt_hdr) >> VMBUS_CHANPKT_SIZE_SHIFT)
+
+/*
+ * Buffer ring
+ */
+struct vmbus_bufring {
+ volatile uint32_t windex;
+ volatile uint32_t rindex;
+
+ /*
+ * Interrupt mask {0,1}
+ *
+ * For TX bufring, host set this to 1, when it is processing
+ * the TX bufring, so that we can safely skip the TX event
+ * notification to host.
+ *
+ * For RX bufring, once this is set to 1 by us, host will not
+ * further dispatch interrupts to us, even if there are data
+ * pending on the RX bufring. This effectively disables the
+ * interrupt of the channel to which this RX bufring is attached.
+ */
+ volatile uint32_t imask;
+
+ /*
+ * Win8 uses some of the reserved bits to implement
+ * interrupt driven flow management. On the send side
+ * we can request that the receiver interrupt the sender
+ * when the ring transitions from being full to being able
+ * to handle a message of size "pending_send_sz".
+ *
+ * Add necessary state for this enhancement.
+ */
+ volatile uint32_t pending_send;
+ uint32_t reserved1[12];
+
+ union {
+ struct {
+ uint32_t feat_pending_send_sz:1;
+ };
+ uint32_t value;
+ } feature_bits;
+
+ /* Pad it to rte_mem_page_size() so that data starts on page boundary */
+ uint8_t reserved2[4028];
+
+ /*
+ * Ring data starts here + RingDataStartOffset
+ * !!! DO NOT place any fields below this !!!
+ */
+ uint8_t data[];
+} __packed;
+
+struct vmbus_br {
+ struct vmbus_bufring *vbr;
+ uint32_t dsize;
+ uint32_t windex; /* next available location */
+};
+
+struct vmbus_chanpkt_hdr {
+ uint16_t type; /* VMBUS_CHANPKT_TYPE_ */
+ uint16_t hlen; /* header len, in 8 bytes */
+ uint16_t tlen; /* total len, in 8 bytes */
+ uint16_t flags; /* VMBUS_CHANPKT_FLAG_ */
+ uint64_t xactid;
+} __packed;
+
+struct vmbus_chanpkt {
+ struct vmbus_chanpkt_hdr hdr;
+} __packed;
+
+struct vmbuspipe_hdr {
+ unsigned int flags;
+ unsigned int msgsize;
+} __packed;
+
+struct ic_version {
+ unsigned short major;
+ unsigned short minor;
+} __packed;
+
+struct icmsg_negotiate {
+ unsigned short icframe_vercnt;
+ unsigned short icmsg_vercnt;
+ unsigned int reserved;
+ struct ic_version icversion_data[]; /* any size array */
+} __packed;
+
+struct icmsg_hdr {
+ struct ic_version icverframe;
+ unsigned short icmsgtype;
+ struct ic_version icvermsg;
+ unsigned short icmsgsize;
+ unsigned int status;
+ unsigned char ictransaction_id;
+ unsigned char icflags;
+ unsigned char reserved[2];
+} __packed;
+
+int rte_vmbus_chan_recv_raw(struct vmbus_br *rxbr, void *data, uint32_t *len);
+int rte_vmbus_chan_send(struct vmbus_br *txbr, uint16_t type, void *data,
+ uint32_t dlen, uint32_t flags);
+void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen);
+void *vmbus_uio_map(int *fd, int size);
+
+/* Amount of space available for write */
+static inline uint32_t vmbus_br_availwrite(const struct vmbus_br *br, uint32_t windex)
+{
+ uint32_t rindex = br->vbr->rindex;
+
+ if (windex >= rindex)
+ return br->dsize - (windex - rindex);
+ else
+ return rindex - windex;
+}
+
+static inline uint32_t vmbus_br_availread(const struct vmbus_br *br)
+{
+ return br->dsize - vmbus_br_availwrite(br, br->vbr->windex);
+}
+
+#endif /* !_VMBUS_BUF_H_ */
--
2.34.1


2024-02-17 18:05:49

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH 6/6] Drivers: hv: Remove fcopy driver

As the new fcopy driver using uio is introduced, remove obsolete driver
and application.

Signed-off-by: Saurabh Sengar <[email protected]>
---
drivers/hv/Makefile | 2 +-
drivers/hv/hv_fcopy.c | 427 -------------------------------------
drivers/hv/hv_util.c | 12 --
tools/hv/hv_fcopy_daemon.c | 266 -----------------------
4 files changed, 1 insertion(+), 706 deletions(-)
delete mode 100644 drivers/hv/hv_fcopy.c
delete mode 100644 tools/hv/hv_fcopy_daemon.c

diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index d76df5c8c2a9..b992c0ed182b 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -10,7 +10,7 @@ hv_vmbus-y := vmbus_drv.o \
hv.o connection.o channel.o \
channel_mgmt.o ring_buffer.o hv_trace.o
hv_vmbus-$(CONFIG_HYPERV_TESTING) += hv_debugfs.o
-hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o hv_utils_transport.o
+hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_utils_transport.o

# Code that must be built-in
obj-$(subst m,y,$(CONFIG_HYPERV)) += hv_common.o
diff --git a/drivers/hv/hv_fcopy.c b/drivers/hv/hv_fcopy.c
deleted file mode 100644
index 922d83eb7ddf..000000000000
--- a/drivers/hv/hv_fcopy.c
+++ /dev/null
@@ -1,427 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * An implementation of file copy service.
- *
- * Copyright (C) 2014, Microsoft, Inc.
- *
- * Author : K. Y. Srinivasan <[email protected]>
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/nls.h>
-#include <linux/workqueue.h>
-#include <linux/hyperv.h>
-#include <linux/sched.h>
-#include <asm/hyperv-tlfs.h>
-
-#include "hyperv_vmbus.h"
-#include "hv_utils_transport.h"
-
-#define WIN8_SRV_MAJOR 1
-#define WIN8_SRV_MINOR 1
-#define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR)
-
-#define FCOPY_VER_COUNT 1
-static const int fcopy_versions[] = {
- WIN8_SRV_VERSION
-};
-
-#define FW_VER_COUNT 1
-static const int fw_versions[] = {
- UTIL_FW_VERSION
-};
-
-/*
- * Global state maintained for transaction that is being processed.
- * For a class of integration services, including the "file copy service",
- * the specified protocol is a "request/response" protocol which means that
- * there can only be single outstanding transaction from the host at any
- * given point in time. We use this to simplify memory management in this
- * driver - we cache and process only one message at a time.
- *
- * While the request/response protocol is guaranteed by the host, we further
- * ensure this by serializing packet processing in this driver - we do not
- * read additional packets from the VMBUs until the current packet is fully
- * handled.
- */
-
-static struct {
- int state; /* hvutil_device_state */
- int recv_len; /* number of bytes received. */
- struct hv_fcopy_hdr *fcopy_msg; /* current message */
- struct vmbus_channel *recv_channel; /* chn we got the request */
- u64 recv_req_id; /* request ID. */
-} fcopy_transaction;
-
-static void fcopy_respond_to_host(int error);
-static void fcopy_send_data(struct work_struct *dummy);
-static void fcopy_timeout_func(struct work_struct *dummy);
-static DECLARE_DELAYED_WORK(fcopy_timeout_work, fcopy_timeout_func);
-static DECLARE_WORK(fcopy_send_work, fcopy_send_data);
-static const char fcopy_devname[] = "vmbus/hv_fcopy";
-static u8 *recv_buffer;
-static struct hvutil_transport *hvt;
-/*
- * This state maintains the version number registered by the daemon.
- */
-static int dm_reg_value;
-
-static void fcopy_poll_wrapper(void *channel)
-{
- /* Transaction is finished, reset the state here to avoid races. */
- fcopy_transaction.state = HVUTIL_READY;
- tasklet_schedule(&((struct vmbus_channel *)channel)->callback_event);
-}
-
-static void fcopy_timeout_func(struct work_struct *dummy)
-{
- /*
- * If the timer fires, the user-mode component has not responded;
- * process the pending transaction.
- */
- fcopy_respond_to_host(HV_E_FAIL);
- hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper);
-}
-
-static void fcopy_register_done(void)
-{
- pr_debug("FCP: userspace daemon registered\n");
- hv_poll_channel(fcopy_transaction.recv_channel, fcopy_poll_wrapper);
-}
-
-static int fcopy_handle_handshake(u32 version)
-{
- u32 our_ver = FCOPY_CURRENT_VERSION;
-
- switch (version) {
- case FCOPY_VERSION_0:
- /* Daemon doesn't expect us to reply */
- dm_reg_value = version;
- break;
- case FCOPY_VERSION_1:
- /* Daemon expects us to reply with our own version */
- if (hvutil_transport_send(hvt, &our_ver, sizeof(our_ver),
- fcopy_register_done))
- return -EFAULT;
- dm_reg_value = version;
- break;
- default:
- /*
- * For now we will fail the registration.
- * If and when we have multiple versions to
- * deal with, we will be backward compatible.
- * We will add this code when needed.
- */
- return -EINVAL;
- }
- pr_debug("FCP: userspace daemon ver. %d connected\n", version);
- return 0;
-}
-
-static void fcopy_send_data(struct work_struct *dummy)
-{
- struct hv_start_fcopy *smsg_out = NULL;
- int operation = fcopy_transaction.fcopy_msg->operation;
- struct hv_start_fcopy *smsg_in;
- void *out_src;
- int rc, out_len;
-
- /*
- * The strings sent from the host are encoded in
- * utf16; convert it to utf8 strings.
- * The host assures us that the utf16 strings will not exceed
- * the max lengths specified. We will however, reserve room
- * for the string terminating character - in the utf16s_utf8s()
- * function we limit the size of the buffer where the converted
- * string is placed to W_MAX_PATH -1 to guarantee
- * that the strings can be properly terminated!
- */
-
- switch (operation) {
- case START_FILE_COPY:
- out_len = sizeof(struct hv_start_fcopy);
- smsg_out = kzalloc(sizeof(*smsg_out), GFP_KERNEL);
- if (!smsg_out)
- return;
-
- smsg_out->hdr.operation = operation;
- smsg_in = (struct hv_start_fcopy *)fcopy_transaction.fcopy_msg;
-
- utf16s_to_utf8s((wchar_t *)smsg_in->file_name, W_MAX_PATH,
- UTF16_LITTLE_ENDIAN,
- (__u8 *)&smsg_out->file_name, W_MAX_PATH - 1);
-
- utf16s_to_utf8s((wchar_t *)smsg_in->path_name, W_MAX_PATH,
- UTF16_LITTLE_ENDIAN,
- (__u8 *)&smsg_out->path_name, W_MAX_PATH - 1);
-
- smsg_out->copy_flags = smsg_in->copy_flags;
- smsg_out->file_size = smsg_in->file_size;
- out_src = smsg_out;
- break;
-
- case WRITE_TO_FILE:
- out_src = fcopy_transaction.fcopy_msg;
- out_len = sizeof(struct hv_do_fcopy);
- break;
- default:
- out_src = fcopy_transaction.fcopy_msg;
- out_len = fcopy_transaction.recv_len;
- break;
- }
-
- fcopy_transaction.state = HVUTIL_USERSPACE_REQ;
- rc = hvutil_transport_send(hvt, out_src, out_len, NULL);
- if (rc) {
- pr_debug("FCP: failed to communicate to the daemon: %d\n", rc);
- if (cancel_delayed_work_sync(&fcopy_timeout_work)) {
- fcopy_respond_to_host(HV_E_FAIL);
- fcopy_transaction.state = HVUTIL_READY;
- }
- }
- kfree(smsg_out);
-}
-
-/*
- * Send a response back to the host.
- */
-
-static void
-fcopy_respond_to_host(int error)
-{
- struct icmsg_hdr *icmsghdr;
- u32 buf_len;
- struct vmbus_channel *channel;
- u64 req_id;
-
- /*
- * Copy the global state for completing the transaction. Note that
- * only one transaction can be active at a time. This is guaranteed
- * by the file copy protocol implemented by the host. Furthermore,
- * the "transaction active" state we maintain ensures that there can
- * only be one active transaction at a time.
- */
-
- buf_len = fcopy_transaction.recv_len;
- channel = fcopy_transaction.recv_channel;
- req_id = fcopy_transaction.recv_req_id;
-
- icmsghdr = (struct icmsg_hdr *)
- &recv_buffer[sizeof(struct vmbuspipe_hdr)];
-
- if (channel->onchannel_callback == NULL)
- /*
- * We have raced with util driver being unloaded;
- * silently return.
- */
- return;
-
- icmsghdr->status = error;
- icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE;
- vmbus_sendpacket(channel, recv_buffer, buf_len, req_id,
- VM_PKT_DATA_INBAND, 0);
-}
-
-void hv_fcopy_onchannelcallback(void *context)
-{
- struct vmbus_channel *channel = context;
- u32 recvlen;
- u64 requestid;
- struct hv_fcopy_hdr *fcopy_msg;
- struct icmsg_hdr *icmsghdr;
- int fcopy_srv_version;
-
- if (fcopy_transaction.state > HVUTIL_READY)
- return;
-
- if (vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 2, &recvlen, &requestid)) {
- pr_err_ratelimited("Fcopy request received. Could not read into recv buf\n");
- return;
- }
-
- if (!recvlen)
- return;
-
- /* Ensure recvlen is big enough to read header data */
- if (recvlen < ICMSG_HDR) {
- pr_err_ratelimited("Fcopy request received. Packet length too small: %d\n",
- recvlen);
- return;
- }
-
- icmsghdr = (struct icmsg_hdr *)&recv_buffer[
- sizeof(struct vmbuspipe_hdr)];
-
- if (icmsghdr->icmsgtype == ICMSGTYPE_NEGOTIATE) {
- if (vmbus_prep_negotiate_resp(icmsghdr,
- recv_buffer, recvlen,
- fw_versions, FW_VER_COUNT,
- fcopy_versions, FCOPY_VER_COUNT,
- NULL, &fcopy_srv_version)) {
-
- pr_info("FCopy IC version %d.%d\n",
- fcopy_srv_version >> 16,
- fcopy_srv_version & 0xFFFF);
- }
- } else if (icmsghdr->icmsgtype == ICMSGTYPE_FCOPY) {
- /* Ensure recvlen is big enough to contain hv_fcopy_hdr */
- if (recvlen < ICMSG_HDR + sizeof(struct hv_fcopy_hdr)) {
- pr_err_ratelimited("Invalid Fcopy hdr. Packet length too small: %u\n",
- recvlen);
- return;
- }
- fcopy_msg = (struct hv_fcopy_hdr *)&recv_buffer[ICMSG_HDR];
-
- /*
- * Stash away this global state for completing the
- * transaction; note transactions are serialized.
- */
-
- fcopy_transaction.recv_len = recvlen;
- fcopy_transaction.recv_req_id = requestid;
- fcopy_transaction.fcopy_msg = fcopy_msg;
-
- if (fcopy_transaction.state < HVUTIL_READY) {
- /* Userspace is not registered yet */
- fcopy_respond_to_host(HV_E_FAIL);
- return;
- }
- fcopy_transaction.state = HVUTIL_HOSTMSG_RECEIVED;
-
- /*
- * Send the information to the user-level daemon.
- */
- schedule_work(&fcopy_send_work);
- schedule_delayed_work(&fcopy_timeout_work,
- HV_UTIL_TIMEOUT * HZ);
- return;
- } else {
- pr_err_ratelimited("Fcopy request received. Invalid msg type: %d\n",
- icmsghdr->icmsgtype);
- return;
- }
- icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE;
- vmbus_sendpacket(channel, recv_buffer, recvlen, requestid,
- VM_PKT_DATA_INBAND, 0);
-}
-
-/* Callback when data is received from userspace */
-static int fcopy_on_msg(void *msg, int len)
-{
- int *val = (int *)msg;
-
- if (len != sizeof(int))
- return -EINVAL;
-
- if (fcopy_transaction.state == HVUTIL_DEVICE_INIT)
- return fcopy_handle_handshake(*val);
-
- if (fcopy_transaction.state != HVUTIL_USERSPACE_REQ)
- return -EINVAL;
-
- /*
- * Complete the transaction by forwarding the result
- * to the host. But first, cancel the timeout.
- */
- if (cancel_delayed_work_sync(&fcopy_timeout_work)) {
- fcopy_transaction.state = HVUTIL_USERSPACE_RECV;
- fcopy_respond_to_host(*val);
- hv_poll_channel(fcopy_transaction.recv_channel,
- fcopy_poll_wrapper);
- }
-
- return 0;
-}
-
-static void fcopy_on_reset(void)
-{
- /*
- * The daemon has exited; reset the state.
- */
- fcopy_transaction.state = HVUTIL_DEVICE_INIT;
-
- if (cancel_delayed_work_sync(&fcopy_timeout_work))
- fcopy_respond_to_host(HV_E_FAIL);
-}
-
-int hv_fcopy_init(struct hv_util_service *srv)
-{
- recv_buffer = srv->recv_buffer;
- fcopy_transaction.recv_channel = srv->channel;
- fcopy_transaction.recv_channel->max_pkt_size = HV_HYP_PAGE_SIZE * 2;
-
- /*
- * When this driver loads, the user level daemon that
- * processes the host requests may not yet be running.
- * Defer processing channel callbacks until the daemon
- * has registered.
- */
- fcopy_transaction.state = HVUTIL_DEVICE_INIT;
-
- hvt = hvutil_transport_init(fcopy_devname, 0, 0,
- fcopy_on_msg, fcopy_on_reset);
- if (!hvt)
- return -EFAULT;
-
- return 0;
-}
-
-static void hv_fcopy_cancel_work(void)
-{
- cancel_delayed_work_sync(&fcopy_timeout_work);
- cancel_work_sync(&fcopy_send_work);
-}
-
-int hv_fcopy_pre_suspend(void)
-{
- struct vmbus_channel *channel = fcopy_transaction.recv_channel;
- struct hv_fcopy_hdr *fcopy_msg;
-
- /*
- * Fake a CANCEL_FCOPY message for the user space daemon in case the
- * daemon is in the middle of copying some file. It doesn't matter if
- * there is already a message pending to be delivered to the user
- * space since we force fcopy_transaction.state to be HVUTIL_READY, so
- * the user space daemon's write() will fail with EINVAL (see
- * fcopy_on_msg()), and the daemon will reset the device by closing
- * and re-opening it.
- */
- fcopy_msg = kzalloc(sizeof(*fcopy_msg), GFP_KERNEL);
- if (!fcopy_msg)
- return -ENOMEM;
-
- tasklet_disable(&channel->callback_event);
-
- fcopy_msg->operation = CANCEL_FCOPY;
-
- hv_fcopy_cancel_work();
-
- /* We don't care about the return value. */
- hvutil_transport_send(hvt, fcopy_msg, sizeof(*fcopy_msg), NULL);
-
- kfree(fcopy_msg);
-
- fcopy_transaction.state = HVUTIL_READY;
-
- /* tasklet_enable() will be called in hv_fcopy_pre_resume(). */
- return 0;
-}
-
-int hv_fcopy_pre_resume(void)
-{
- struct vmbus_channel *channel = fcopy_transaction.recv_channel;
-
- tasklet_enable(&channel->callback_event);
-
- return 0;
-}
-
-void hv_fcopy_deinit(void)
-{
- fcopy_transaction.state = HVUTIL_DEVICE_DYING;
-
- hv_fcopy_cancel_work();
-
- hvutil_transport_destroy(hvt);
-}
diff --git a/drivers/hv/hv_util.c b/drivers/hv/hv_util.c
index 9c97c4065fe7..c4f525325790 100644
--- a/drivers/hv/hv_util.c
+++ b/drivers/hv/hv_util.c
@@ -154,14 +154,6 @@ static struct hv_util_service util_vss = {
.util_deinit = hv_vss_deinit,
};

-static struct hv_util_service util_fcopy = {
- .util_cb = hv_fcopy_onchannelcallback,
- .util_init = hv_fcopy_init,
- .util_pre_suspend = hv_fcopy_pre_suspend,
- .util_pre_resume = hv_fcopy_pre_resume,
- .util_deinit = hv_fcopy_deinit,
-};
-
static void perform_shutdown(struct work_struct *dummy)
{
orderly_poweroff(true);
@@ -700,10 +692,6 @@ static const struct hv_vmbus_device_id id_table[] = {
{ HV_VSS_GUID,
.driver_data = (unsigned long)&util_vss
},
- /* File copy GUID */
- { HV_FCOPY_GUID,
- .driver_data = (unsigned long)&util_fcopy
- },
{ },
};

diff --git a/tools/hv/hv_fcopy_daemon.c b/tools/hv/hv_fcopy_daemon.c
deleted file mode 100644
index 16d629b22c25..000000000000
--- a/tools/hv/hv_fcopy_daemon.c
+++ /dev/null
@@ -1,266 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * An implementation of host to guest copy functionality for Linux.
- *
- * Copyright (C) 2014, Microsoft, Inc.
- *
- * Author : K. Y. Srinivasan <[email protected]>
- */
-
-
-#include <sys/types.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-#include <string.h>
-#include <errno.h>
-#include <linux/hyperv.h>
-#include <linux/limits.h>
-#include <syslog.h>
-#include <sys/stat.h>
-#include <fcntl.h>
-#include <getopt.h>
-
-static int target_fd;
-static char target_fname[PATH_MAX];
-static unsigned long long filesize;
-
-static int hv_start_fcopy(struct hv_start_fcopy *smsg)
-{
- int error = HV_E_FAIL;
- char *q, *p;
-
- filesize = 0;
- p = (char *)smsg->path_name;
- snprintf(target_fname, sizeof(target_fname), "%s/%s",
- (char *)smsg->path_name, (char *)smsg->file_name);
-
- syslog(LOG_INFO, "Target file name: %s", target_fname);
- /*
- * Check to see if the path is already in place; if not,
- * create if required.
- */
- while ((q = strchr(p, '/')) != NULL) {
- if (q == p) {
- p++;
- continue;
- }
- *q = '\0';
- if (access((char *)smsg->path_name, F_OK)) {
- if (smsg->copy_flags & CREATE_PATH) {
- if (mkdir((char *)smsg->path_name, 0755)) {
- syslog(LOG_ERR, "Failed to create %s",
- (char *)smsg->path_name);
- goto done;
- }
- } else {
- syslog(LOG_ERR, "Invalid path: %s",
- (char *)smsg->path_name);
- goto done;
- }
- }
- p = q + 1;
- *q = '/';
- }
-
- if (!access(target_fname, F_OK)) {
- syslog(LOG_INFO, "File: %s exists", target_fname);
- if (!(smsg->copy_flags & OVER_WRITE)) {
- error = HV_ERROR_ALREADY_EXISTS;
- goto done;
- }
- }
-
- target_fd = open(target_fname,
- O_RDWR | O_CREAT | O_TRUNC | O_CLOEXEC, 0744);
- if (target_fd == -1) {
- syslog(LOG_INFO, "Open Failed: %s", strerror(errno));
- goto done;
- }
-
- error = 0;
-done:
- if (error)
- target_fname[0] = '\0';
- return error;
-}
-
-static int hv_copy_data(struct hv_do_fcopy *cpmsg)
-{
- ssize_t bytes_written;
- int ret = 0;
-
- bytes_written = pwrite(target_fd, cpmsg->data, cpmsg->size,
- cpmsg->offset);
-
- filesize += cpmsg->size;
- if (bytes_written != cpmsg->size) {
- switch (errno) {
- case ENOSPC:
- ret = HV_ERROR_DISK_FULL;
- break;
- default:
- ret = HV_E_FAIL;
- break;
- }
- syslog(LOG_ERR, "pwrite failed to write %llu bytes: %ld (%s)",
- filesize, (long)bytes_written, strerror(errno));
- }
-
- return ret;
-}
-
-/*
- * Reset target_fname to "" in the two below functions for hibernation: if
- * the fcopy operation is aborted by hibernation, the daemon should remove the
- * partially-copied file; to achieve this, the hv_utils driver always fakes a
- * CANCEL_FCOPY message upon suspend, and later when the VM resumes back,
- * the daemon calls hv_copy_cancel() to remove the file; if a file is copied
- * successfully before suspend, hv_copy_finished() must reset target_fname to
- * avoid that the file can be incorrectly removed upon resume, since the faked
- * CANCEL_FCOPY message is spurious in this case.
- */
-static int hv_copy_finished(void)
-{
- close(target_fd);
- target_fname[0] = '\0';
- return 0;
-}
-static int hv_copy_cancel(void)
-{
- close(target_fd);
- if (strlen(target_fname) > 0) {
- unlink(target_fname);
- target_fname[0] = '\0';
- }
- return 0;
-
-}
-
-void print_usage(char *argv[])
-{
- fprintf(stderr, "Usage: %s [options]\n"
- "Options are:\n"
- " -n, --no-daemon stay in foreground, don't daemonize\n"
- " -h, --help print this help\n", argv[0]);
-}
-
-int main(int argc, char *argv[])
-{
- int fcopy_fd = -1;
- int error;
- int daemonize = 1, long_index = 0, opt;
- int version = FCOPY_CURRENT_VERSION;
- union {
- struct hv_fcopy_hdr hdr;
- struct hv_start_fcopy start;
- struct hv_do_fcopy copy;
- __u32 kernel_modver;
- } buffer = { };
- int in_handshake;
-
- static struct option long_options[] = {
- {"help", no_argument, 0, 'h' },
- {"no-daemon", no_argument, 0, 'n' },
- {0, 0, 0, 0 }
- };
-
- while ((opt = getopt_long(argc, argv, "hn", long_options,
- &long_index)) != -1) {
- switch (opt) {
- case 'n':
- daemonize = 0;
- break;
- case 'h':
- default:
- print_usage(argv);
- exit(EXIT_FAILURE);
- }
- }
-
- if (daemonize && daemon(1, 0)) {
- syslog(LOG_ERR, "daemon() failed; error: %s", strerror(errno));
- exit(EXIT_FAILURE);
- }
-
- openlog("HV_FCOPY", 0, LOG_USER);
- syslog(LOG_INFO, "starting; pid is:%d", getpid());
-
-reopen_fcopy_fd:
- if (fcopy_fd != -1)
- close(fcopy_fd);
- /* Remove any possible partially-copied file on error */
- hv_copy_cancel();
- in_handshake = 1;
- fcopy_fd = open("/dev/vmbus/hv_fcopy", O_RDWR);
-
- if (fcopy_fd < 0) {
- syslog(LOG_ERR, "open /dev/vmbus/hv_fcopy failed; error: %d %s",
- errno, strerror(errno));
- exit(EXIT_FAILURE);
- }
-
- /*
- * Register with the kernel.
- */
- if ((write(fcopy_fd, &version, sizeof(int))) != sizeof(int)) {
- syslog(LOG_ERR, "Registration failed: %s", strerror(errno));
- exit(EXIT_FAILURE);
- }
-
- while (1) {
- /*
- * In this loop we process fcopy messages after the
- * handshake is complete.
- */
- ssize_t len;
-
- len = pread(fcopy_fd, &buffer, sizeof(buffer), 0);
- if (len < 0) {
- syslog(LOG_ERR, "pread failed: %s", strerror(errno));
- goto reopen_fcopy_fd;
- }
-
- if (in_handshake) {
- if (len != sizeof(buffer.kernel_modver)) {
- syslog(LOG_ERR, "invalid version negotiation");
- exit(EXIT_FAILURE);
- }
- in_handshake = 0;
- syslog(LOG_INFO, "kernel module version: %u",
- buffer.kernel_modver);
- continue;
- }
-
- switch (buffer.hdr.operation) {
- case START_FILE_COPY:
- error = hv_start_fcopy(&buffer.start);
- break;
- case WRITE_TO_FILE:
- error = hv_copy_data(&buffer.copy);
- break;
- case COMPLETE_FCOPY:
- error = hv_copy_finished();
- break;
- case CANCEL_FCOPY:
- error = hv_copy_cancel();
- break;
-
- default:
- error = HV_E_FAIL;
- syslog(LOG_ERR, "Unknown operation: %d",
- buffer.hdr.operation);
-
- }
-
- /*
- * pwrite() may return an error due to the faked CANCEL_FCOPY
- * message upon hibernation. Ignore the error by resetting the
- * dev file, i.e. closing and re-opening it.
- */
- if (pwrite(fcopy_fd, &error, sizeof(int), 0) != sizeof(int)) {
- syslog(LOG_ERR, "pwrite failed: %s", strerror(errno));
- goto reopen_fcopy_fd;
- }
- }
-}
--
2.34.1


2024-02-17 18:06:03

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH 5/6] tools: hv: Add new fcopy application based on uio driver

New fcopy application which utilizes uio_hv_vmbus_client driver

Signed-off-by: Saurabh Sengar <[email protected]>
---
tools/hv/Build | 3 +-
tools/hv/Makefile | 10 +-
tools/hv/hv_fcopy_uio_daemon.c | 488 +++++++++++++++++++++++++++++++++
3 files changed, 495 insertions(+), 6 deletions(-)
create mode 100644 tools/hv/hv_fcopy_uio_daemon.c

diff --git a/tools/hv/Build b/tools/hv/Build
index 6cf51fa4b306..7d1f1698069b 100644
--- a/tools/hv/Build
+++ b/tools/hv/Build
@@ -1,3 +1,4 @@
hv_kvp_daemon-y += hv_kvp_daemon.o
hv_vss_daemon-y += hv_vss_daemon.o
-hv_fcopy_daemon-y += hv_fcopy_daemon.o
+hv_fcopy_uio_daemon-y += hv_fcopy_uio_daemon.o
+hv_fcopy_uio_daemon-y += vmbus_bufring.o
diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index fe770e679ae8..944180cf916e 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -17,7 +17,7 @@ MAKEFLAGS += -r

override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include

-ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
+ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_uio_daemon
ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))

ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
@@ -39,10 +39,10 @@ $(HV_VSS_DAEMON_IN): FORCE
$(OUTPUT)hv_vss_daemon: $(HV_VSS_DAEMON_IN)
$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@

-HV_FCOPY_DAEMON_IN := $(OUTPUT)hv_fcopy_daemon-in.o
-$(HV_FCOPY_DAEMON_IN): FORCE
- $(Q)$(MAKE) $(build)=hv_fcopy_daemon
-$(OUTPUT)hv_fcopy_daemon: $(HV_FCOPY_DAEMON_IN)
+HV_FCOPY_UIO_DAEMON_IN := $(OUTPUT)hv_fcopy_uio_daemon-in.o
+$(HV_FCOPY_UIO_DAEMON_IN): FORCE
+ $(Q)$(MAKE) $(build)=hv_fcopy_uio_daemon
+$(OUTPUT)hv_fcopy_uio_daemon: $(HV_FCOPY_UIO_DAEMON_IN)
$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@

clean:
diff --git a/tools/hv/hv_fcopy_uio_daemon.c b/tools/hv/hv_fcopy_uio_daemon.c
new file mode 100644
index 000000000000..f72c899328fc
--- /dev/null
+++ b/tools/hv/hv_fcopy_uio_daemon.c
@@ -0,0 +1,488 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * An implementation of host to guest copy functionality for Linux.
+ *
+ * Copyright (C) 2023, Microsoft, Inc.
+ *
+ * Author : K. Y. Srinivasan <[email protected]>
+ * Author : Saurabh Sengar <[email protected]>
+ *
+ */
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <getopt.h>
+#include <locale.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <syslog.h>
+#include <unistd.h>
+#include <wchar.h>
+#include <sys/stat.h>
+#include <linux/hyperv.h>
+#include <linux/limits.h>
+#include "vmbus_bufring.h"
+
+#define ICMSGTYPE_NEGOTIATE 0
+#define ICMSGTYPE_FCOPY 7
+
+#define WIN8_SRV_MAJOR 1
+#define WIN8_SRV_MINOR 1
+#define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR)
+
+#define MAX_FOLDER_NAME 15
+#define MAX_PATH_LEN 15
+#define FCOPY_UIO "/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/uio"
+
+#define FCOPY_VER_COUNT 1
+static const int fcopy_versions[] = {
+ WIN8_SRV_VERSION
+};
+
+#define FW_VER_COUNT 1
+static const int fw_versions[] = {
+ UTIL_FW_VERSION
+};
+
+#define HV_RING_SIZE (4 * 4096)
+
+unsigned char desc[HV_RING_SIZE];
+
+static int target_fd;
+static char target_fname[PATH_MAX];
+static unsigned long long filesize;
+
+static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags)
+{
+ int error = HV_E_FAIL;
+ char *q, *p;
+
+ filesize = 0;
+ p = (char *)path_name;
+ snprintf(target_fname, sizeof(target_fname), "%s/%s",
+ (char *)path_name, (char *)file_name);
+
+ /*
+ * Check to see if the path is already in place; if not,
+ * create if required.
+ */
+ while ((q = strchr(p, '/')) != NULL) {
+ if (q == p) {
+ p++;
+ continue;
+ }
+ *q = '\0';
+ if (access(path_name, F_OK)) {
+ if (flags & CREATE_PATH) {
+ if (mkdir(path_name, 0755)) {
+ syslog(LOG_ERR, "Failed to create %s",
+ path_name);
+ goto done;
+ }
+ } else {
+ syslog(LOG_ERR, "Invalid path: %s", path_name);
+ goto done;
+ }
+ }
+ p = q + 1;
+ *q = '/';
+ }
+
+ if (!access(target_fname, F_OK)) {
+ syslog(LOG_INFO, "File: %s exists", target_fname);
+ if (!(flags & OVER_WRITE)) {
+ error = HV_ERROR_ALREADY_EXISTS;
+ goto done;
+ }
+ }
+
+ target_fd = open(target_fname,
+ O_RDWR | O_CREAT | O_TRUNC | O_CLOEXEC, 0744);
+ if (target_fd == -1) {
+ syslog(LOG_INFO, "Open Failed: %s", strerror(errno));
+ goto done;
+ }
+
+ error = 0;
+done:
+ if (error)
+ target_fname[0] = '\0';
+ return error;
+}
+
+/* copy the data into the file */
+static int hv_copy_data(struct hv_do_fcopy *cpmsg)
+{
+ ssize_t len;
+ int ret = 0;
+
+ len = pwrite(target_fd, cpmsg->data, cpmsg->size, cpmsg->offset);
+
+ filesize += cpmsg->size;
+ if (len != cpmsg->size) {
+ switch (errno) {
+ case ENOSPC:
+ ret = HV_ERROR_DISK_FULL;
+ break;
+ default:
+ ret = HV_E_FAIL;
+ break;
+ }
+ syslog(LOG_ERR, "pwrite failed to write %llu bytes: %ld (%s)",
+ filesize, (long)len, strerror(errno));
+ }
+
+ return ret;
+}
+
+static int hv_copy_finished(void)
+{
+ close(target_fd);
+ target_fname[0] = '\0';
+
+ return 0;
+}
+
+static void print_usage(char *argv[])
+{
+ fprintf(stderr, "Usage: %s [options]\n"
+ "Options are:\n"
+ " -n, --no-daemon stay in foreground, don't daemonize\n"
+ " -h, --help print this help\n", argv[0]);
+}
+
+static bool vmbus_prep_negotiate_resp(struct icmsg_hdr *icmsghdrp, unsigned char *buf,
+ unsigned int buflen, const int *fw_version, int fw_vercnt,
+ const int *srv_version, int srv_vercnt,
+ int *nego_fw_version, int *nego_srv_version)
+{
+ int icframe_major, icframe_minor;
+ int icmsg_major, icmsg_minor;
+ int fw_major, fw_minor;
+ int srv_major, srv_minor;
+ int i, j;
+ bool found_match = false;
+ struct icmsg_negotiate *negop;
+
+ /* Check that there's enough space for icframe_vercnt, icmsg_vercnt */
+ if (buflen < ICMSG_HDR + offsetof(struct icmsg_negotiate, reserved)) {
+ syslog(LOG_ERR, "Invalid icmsg negotiate");
+ return false;
+ }
+
+ icmsghdrp->icmsgsize = 0x10;
+ negop = (struct icmsg_negotiate *)&buf[ICMSG_HDR];
+
+ icframe_major = negop->icframe_vercnt;
+ icframe_minor = 0;
+
+ icmsg_major = negop->icmsg_vercnt;
+ icmsg_minor = 0;
+
+ /* Validate negop packet */
+ if (icframe_major > IC_VERSION_NEGOTIATION_MAX_VER_COUNT ||
+ icmsg_major > IC_VERSION_NEGOTIATION_MAX_VER_COUNT ||
+ ICMSG_NEGOTIATE_PKT_SIZE(icframe_major, icmsg_major) > buflen) {
+ syslog(LOG_ERR, "Invalid icmsg negotiate - icframe_major: %u, icmsg_major: %u\n",
+ icframe_major, icmsg_major);
+ goto fw_error;
+ }
+
+ /*
+ * Select the framework version number we will
+ * support.
+ */
+
+ for (i = 0; i < fw_vercnt; i++) {
+ fw_major = (fw_version[i] >> 16);
+ fw_minor = (fw_version[i] & 0xFFFF);
+
+ for (j = 0; j < negop->icframe_vercnt; j++) {
+ if (negop->icversion_data[j].major == fw_major &&
+ negop->icversion_data[j].minor == fw_minor) {
+ icframe_major = negop->icversion_data[j].major;
+ icframe_minor = negop->icversion_data[j].minor;
+ found_match = true;
+ break;
+ }
+ }
+
+ if (found_match)
+ break;
+ }
+
+ if (!found_match)
+ goto fw_error;
+
+ found_match = false;
+
+ for (i = 0; i < srv_vercnt; i++) {
+ srv_major = (srv_version[i] >> 16);
+ srv_minor = (srv_version[i] & 0xFFFF);
+
+ for (j = negop->icframe_vercnt;
+ (j < negop->icframe_vercnt + negop->icmsg_vercnt);
+ j++) {
+ if (negop->icversion_data[j].major == srv_major &&
+ negop->icversion_data[j].minor == srv_minor) {
+ icmsg_major = negop->icversion_data[j].major;
+ icmsg_minor = negop->icversion_data[j].minor;
+ found_match = true;
+ break;
+ }
+ }
+
+ if (found_match)
+ break;
+ }
+
+ /*
+ * Respond with the framework and service
+ * version numbers we can support.
+ */
+fw_error:
+ if (!found_match) {
+ negop->icframe_vercnt = 0;
+ negop->icmsg_vercnt = 0;
+ } else {
+ negop->icframe_vercnt = 1;
+ negop->icmsg_vercnt = 1;
+ }
+
+ if (nego_fw_version)
+ *nego_fw_version = (icframe_major << 16) | icframe_minor;
+
+ if (nego_srv_version)
+ *nego_srv_version = (icmsg_major << 16) | icmsg_minor;
+
+ negop->icversion_data[0].major = icframe_major;
+ negop->icversion_data[0].minor = icframe_minor;
+ negop->icversion_data[1].major = icmsg_major;
+ negop->icversion_data[1].minor = icmsg_minor;
+
+ return found_match;
+}
+
+static void wcstoutf8(char *dest, const __u16 *src, size_t dest_size)
+{
+ size_t len = 0;
+
+ while (len < dest_size) {
+ if (src[len] < 0x80)
+ dest[len++] = (char)(*src++);
+ else
+ dest[len++] = 'X';
+ }
+
+ dest[len] = '\0';
+}
+
+static int hv_fcopy_start(struct hv_start_fcopy *smsg_in)
+{
+ setlocale(LC_ALL, "en_US.utf8");
+ size_t file_size, path_size;
+ char *file_name, *path_name;
+ char *in_file_name = (char *)smsg_in->file_name;
+ char *in_path_name = (char *)smsg_in->path_name;
+
+ file_size = wcstombs(NULL, (const wchar_t *restrict)in_file_name, 0) + 1;
+ path_size = wcstombs(NULL, (const wchar_t *restrict)in_path_name, 0) + 1;
+
+ file_name = (char *)malloc(file_size * sizeof(char));
+ path_name = (char *)malloc(path_size * sizeof(char));
+
+ wcstoutf8(file_name, (__u16 *)in_file_name, file_size);
+ wcstoutf8(path_name, (__u16 *)in_path_name, path_size);
+
+ return hv_fcopy_create_file(file_name, path_name, smsg_in->copy_flags);
+}
+
+static int hv_fcopy_send_data(struct hv_fcopy_hdr *fcopy_msg, int recvlen)
+{
+ int operation = fcopy_msg->operation;
+
+ /*
+ * The strings sent from the host are encoded in
+ * utf16; convert it to utf8 strings.
+ * The host assures us that the utf16 strings will not exceed
+ * the max lengths specified. We will however, reserve room
+ * for the string terminating character - in the utf16s_utf8s()
+ * function we limit the size of the buffer where the converted
+ * string is placed to W_MAX_PATH -1 to guarantee
+ * that the strings can be properly terminated!
+ */
+
+ switch (operation) {
+ case START_FILE_COPY:
+ return hv_fcopy_start((struct hv_start_fcopy *)fcopy_msg);
+ case WRITE_TO_FILE:
+ return hv_copy_data((struct hv_do_fcopy *)fcopy_msg);
+ case COMPLETE_FCOPY:
+ return hv_copy_finished();
+ }
+
+ return HV_E_FAIL;
+}
+
+/* process the packet recv from host */
+static int fcopy_pkt_process(struct vmbus_br *txbr)
+{
+ int ret, offset, pktlen;
+ int fcopy_srv_version;
+ const struct vmbus_chanpkt_hdr *pkt;
+ struct hv_fcopy_hdr *fcopy_msg;
+ struct icmsg_hdr *icmsghdr;
+
+ pkt = (const struct vmbus_chanpkt_hdr *)desc;
+ offset = pkt->hlen << 3;
+ pktlen = (pkt->tlen << 3) - offset;
+ icmsghdr = (struct icmsg_hdr *)&desc[offset + sizeof(struct vmbuspipe_hdr)];
+ icmsghdr->status = HV_E_FAIL;
+
+ if (icmsghdr->icmsgtype == ICMSGTYPE_NEGOTIATE) {
+ if (vmbus_prep_negotiate_resp(icmsghdr, desc + offset, pktlen, fw_versions,
+ FW_VER_COUNT, fcopy_versions, FCOPY_VER_COUNT,
+ NULL, &fcopy_srv_version)) {
+ syslog(LOG_INFO, "FCopy IC version %d.%d",
+ fcopy_srv_version >> 16, fcopy_srv_version & 0xFFFF);
+ icmsghdr->status = 0;
+ }
+ } else if (icmsghdr->icmsgtype == ICMSGTYPE_FCOPY) {
+ /* Ensure recvlen is big enough to contain hv_fcopy_hdr */
+ if (pktlen < ICMSG_HDR + sizeof(struct hv_fcopy_hdr)) {
+ syslog(LOG_ERR, "Invalid Fcopy hdr. Packet length too small: %u",
+ pktlen);
+ return -ENOBUFS;
+ }
+
+ fcopy_msg = (struct hv_fcopy_hdr *)&desc[offset + ICMSG_HDR];
+ icmsghdr->status = hv_fcopy_send_data(fcopy_msg, pktlen);
+ }
+
+ icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE;
+ ret = rte_vmbus_chan_send(txbr, 0x6, desc + offset, pktlen, 0);
+ if (ret) {
+ syslog(LOG_ERR, "Write to ringbuffer failed err: %d", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void fcopy_get_first_folder(char *path, char *chan_no)
+{
+ DIR *dir = opendir(path);
+ struct dirent *entry;
+
+ if (!dir) {
+ syslog(LOG_ERR, "Failed to open directory (errno=%s).\n", strerror(errno));
+ return;
+ }
+
+ while ((entry = readdir(dir)) != NULL) {
+ if (entry->d_type == DT_DIR && strcmp(entry->d_name, ".") != 0 &&
+ strcmp(entry->d_name, "..") != 0) {
+ strcpy(chan_no, entry->d_name);
+ break;
+ }
+ }
+
+ closedir(dir);
+}
+
+int main(int argc, char *argv[])
+{
+ int fcopy_fd = -1, tmp = 1;
+ int daemonize = 1, long_index = 0, opt, ret = -EINVAL;
+ struct vmbus_br txbr, rxbr;
+ void *ring;
+ uint32_t len = HV_RING_SIZE;
+ char uio_name[MAX_FOLDER_NAME] = {0};
+ char uio_dev_path[MAX_PATH_LEN] = {0};
+
+ static struct option long_options[] = {
+ {"help", no_argument, 0, 'h' },
+ {"no-daemon", no_argument, 0, 'n' },
+ {0, 0, 0, 0 }
+ };
+
+ while ((opt = getopt_long(argc, argv, "hn", long_options,
+ &long_index)) != -1) {
+ switch (opt) {
+ case 'n':
+ daemonize = 0;
+ break;
+ case 'h':
+ default:
+ print_usage(argv);
+ goto exit;
+ }
+ }
+
+ if (daemonize && daemon(1, 0)) {
+ syslog(LOG_ERR, "daemon() failed; error: %s", strerror(errno));
+ goto exit;
+ }
+
+ openlog("HV_UIO_FCOPY", 0, LOG_USER);
+ syslog(LOG_INFO, "starting; pid is:%d", getpid());
+
+ fcopy_get_first_folder(FCOPY_UIO, uio_name);
+ snprintf(uio_dev_path, sizeof(uio_dev_path), "/dev/%s", uio_name);
+ fcopy_fd = open(uio_dev_path, O_RDWR);
+
+ if (fcopy_fd < 0) {
+ syslog(LOG_ERR, "open %s failed; error: %d %s",
+ uio_dev_path, errno, strerror(errno));
+ ret = fcopy_fd;
+ goto exit;
+ }
+
+ ring = vmbus_uio_map(&fcopy_fd, HV_RING_SIZE);
+ if (!ring) {
+ ret = errno;
+ syslog(LOG_ERR, "mmap ringbuffer failed; error: %d %s", ret, strerror(ret));
+ goto close;
+ }
+ vmbus_br_setup(&txbr, ring, HV_RING_SIZE);
+ vmbus_br_setup(&rxbr, (char *)ring + HV_RING_SIZE, HV_RING_SIZE);
+
+ while (1) {
+ /*
+ * In this loop we process fcopy messages after the
+ * handshake is complete.
+ */
+ ret = pread(fcopy_fd, &tmp, sizeof(int), 0);
+ if (ret < 0) {
+ syslog(LOG_ERR, "pread failed: %s", strerror(errno));
+ continue;
+ }
+
+ len = HV_RING_SIZE;
+ ret = rte_vmbus_chan_recv_raw(&rxbr, desc, &len);
+ if (unlikely(ret <= 0)) {
+ /* This indicates a failure to communicate (or worse) */
+ syslog(LOG_ERR, "VMBus channel recv error: %d", ret);
+ } else {
+ ret = fcopy_pkt_process(&txbr);
+ if (ret < 0)
+ goto close;
+
+ /* Signal host */
+ if ((write(fcopy_fd, &tmp, sizeof(int))) != sizeof(int)) {
+ ret = errno;
+ syslog(LOG_ERR, "Registration failed: %s\n", strerror(ret));
+ goto close;
+ }
+ }
+ }
+close:
+ close(fcopy_fd);
+exit:
+ return ret;
+}
--
2.34.1


2024-02-18 07:10:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 0/6] Low speed Hyper-V devices support

On Sat, Feb 17, 2024 at 10:03:34AM -0800, Saurabh Sengar wrote:
> Hyper-V is adding multiple low speed "speciality" synthetic devices.
> Instead of writing a new kernel-level VMBus driver for each device,
> make the devices accessible to user space through UIO-based
> uio_hv_generic driver. Each device can then be supported by a user
> space driver. This approach optimizes the development process and
> provides flexibility to user space applications to control the key
> interactions with the VMBus ring buffer.
>
> The new synthetic devices are low speed devices that don't support
> VMBus monitor bits, and so they must use vmbus_setevent() to notify
> the host of ring buffer updates. These new devices also have smaller
> ring buffer sizes which requires to add support for variable ring buffer
> sizes.
>
> Moreover, this patch series adds a new implementation of the fcopy
> application that uses the new UIO driver. The older fcopy driver and
> application will be phased out gradually. Development of other similar
> userspace drivers is still underway.
>
>
> Efforts have been made previously to implement this solution earlier.
> Here are the discussions related to those attempts:
> https://lore.kernel.org/lkml/[email protected]/
> https://lore.kernel.org/lkml/[email protected]/
> https://lore.kernel.org/lkml/[email protected]/

So is this a v4 of the patch series? What has changed from those
previous submissions?

thanks,

greg k-h

2024-02-18 07:12:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/6] Drivers: hv: vmbus: Add utility function for querying ring size

On Sat, Feb 17, 2024 at 10:03:35AM -0800, Saurabh Sengar wrote:
> Add a function to query for the preferred ring buffer size of VMBus
> device.

That says what you did, but not why you did it.

>
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> drivers/hv/channel_mgmt.c | 7 +++++--
> drivers/hv/hyperv_vmbus.h | 5 +++++
> include/linux/hyperv.h | 1 +
> 3 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 2f4d09ce027a..7ea444d72f9f 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -120,7 +120,8 @@ const struct vmbus_device vmbus_devs[] = {
> },
>
> /* File copy */
> - { .dev_type = HV_FCOPY,
> + { .pref_ring_size = 0x4000,
> + .dev_type = HV_FCOPY,
> HV_FCOPY_GUID,
> .perf_device = false,
> .allowed_in_isolated = false,
> @@ -141,11 +142,13 @@ const struct vmbus_device vmbus_devs[] = {
> },
>
> /* Unknown GUID */
> - { .dev_type = HV_UNKNOWN,
> + { .pref_ring_size = 0x11000,
> + .dev_type = HV_UNKNOWN,

Where do these magic numbers for the size come from?

> .perf_device = false,
> .allowed_in_isolated = false,
> },
> };
> +EXPORT_SYMBOL_GPL(vmbus_devs);
>
> static const struct {
> guid_t guid;
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index f6b1e710f805..76ac5185a01a 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -417,6 +417,11 @@ static inline bool hv_is_perf_channel(struct vmbus_channel *channel)
> return vmbus_devs[channel->device_id].perf_device;
> }
>
> +static inline size_t hv_dev_ring_size(struct vmbus_channel *channel)
> +{
> + return vmbus_devs[channel->device_id].pref_ring_size;
> +}
> +
> static inline bool hv_is_allocated_cpu(unsigned int cpu)
> {
> struct vmbus_channel *channel, *sc;
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 2b00faf98017..5951c7bb5712 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -800,6 +800,7 @@ struct vmbus_requestor {
> #define VMBUS_RQST_RESET (U64_MAX - 3)
>
> struct vmbus_device {
> + size_t pref_ring_size;

No documentation for this? What is the size in units of?

thanks,

greg k-h

2024-02-18 07:51:27

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH 0/6] Low speed Hyper-V devices support

On Sun, Feb 18, 2024 at 08:10:36AM +0100, Greg KH wrote:
> On Sat, Feb 17, 2024 at 10:03:34AM -0800, Saurabh Sengar wrote:
> > Hyper-V is adding multiple low speed "speciality" synthetic devices.
> > Instead of writing a new kernel-level VMBus driver for each device,
> > make the devices accessible to user space through UIO-based
> > uio_hv_generic driver. Each device can then be supported by a user
> > space driver. This approach optimizes the development process and
> > provides flexibility to user space applications to control the key
> > interactions with the VMBus ring buffer.
> >
> > The new synthetic devices are low speed devices that don't support
> > VMBus monitor bits, and so they must use vmbus_setevent() to notify
> > the host of ring buffer updates. These new devices also have smaller
> > ring buffer sizes which requires to add support for variable ring buffer
> > sizes.
> >
> > Moreover, this patch series adds a new implementation of the fcopy
> > application that uses the new UIO driver. The older fcopy driver and
> > application will be phased out gradually. Development of other similar
> > userspace drivers is still underway.
> >
> >
> > Efforts have been made previously to implement this solution earlier.
> > Here are the discussions related to those attempts:
> > https://lore.kernel.org/lkml/[email protected]/
> > https://lore.kernel.org/lkml/[email protected]/
> > https://lore.kernel.org/lkml/[email protected]/
>
> So is this a v4 of the patch series? What has changed from those
> previous submissions?

In the most recent attempt we introduced a new driver uio_hv_vmbus_client
for slow devices, where as in this new approach we are making use of
existing uio_hv_generic driver.

We also introduced the function to query ring buffer sizes, this is an
attempt to address your comment on earlier series to avoid kernel params.
Comment ref: https://lore.kernel.org/lkml/[email protected]/

We later tried to have ring buffer sizes via sysfs for which we wrote a
new driver uio_hv_vmbus_client as explained above.

This is the next approach in an attempt to address all of the concerns
with all the previous series.

- Saurabh


>
> thanks,
>
> greg k-h

2024-02-18 08:03:15

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH 1/6] Drivers: hv: vmbus: Add utility function for querying ring size

On Sun, Feb 18, 2024 at 08:11:58AM +0100, Greg KH wrote:
> On Sat, Feb 17, 2024 at 10:03:35AM -0800, Saurabh Sengar wrote:
> > Add a function to query for the preferred ring buffer size of VMBus
> > device.
>
> That says what you did, but not why you did it.

I thought subsequent patch will make it clear, but I can add more
info in cover letter. I will enhance this commit as well.

>
> >
> > Signed-off-by: Saurabh Sengar <[email protected]>
> > ---
> > drivers/hv/channel_mgmt.c | 7 +++++--
> > drivers/hv/hyperv_vmbus.h | 5 +++++
> > include/linux/hyperv.h | 1 +
> > 3 files changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 2f4d09ce027a..7ea444d72f9f 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -120,7 +120,8 @@ const struct vmbus_device vmbus_devs[] = {
> > },
> >
> > /* File copy */
> > - { .dev_type = HV_FCOPY,
> > + { .pref_ring_size = 0x4000,
> > + .dev_type = HV_FCOPY,
> > HV_FCOPY_GUID,
> > .perf_device = false,
> > .allowed_in_isolated = false,
> > @@ -141,11 +142,13 @@ const struct vmbus_device vmbus_devs[] = {
> > },
> >
> > /* Unknown GUID */
> > - { .dev_type = HV_UNKNOWN,
> > + { .pref_ring_size = 0x11000,
> > + .dev_type = HV_UNKNOWN,
>
> Where do these magic numbers for the size come from?


This value is (16 + 1) page_size, which is found sufficient for most
of the slow devices. 16 page_size is for the ringbuffer and 1 page_size
for the headre. This is the approximation for default case, to avoid
using fall back case of 512 page_size as used by uio_hv_generic.

>
> > .perf_device = false,
> > .allowed_in_isolated = false,> > },
> > };
> > +EXPORT_SYMBOL_GPL(vmbus_devs);
> >
> > static const struct {
> > guid_t guid;
> > diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> > index f6b1e710f805..76ac5185a01a 100644
> > --- a/drivers/hv/hyperv_vmbus.h
> > +++ b/drivers/hv/hyperv_vmbus.h
> > @@ -417,6 +417,11 @@ static inline bool hv_is_perf_channel(struct vmbus_channel *channel)
> > return vmbus_devs[channel->device_id].perf_device;
> > }
> >
> > +static inline size_t hv_dev_ring_size(struct vmbus_channel *channel)
> > +{
> > + return vmbus_devs[channel->device_id].pref_ring_size;
> > +}
> > +
> > static inline bool hv_is_allocated_cpu(unsigned int cpu)
> > {
> > struct vmbus_channel *channel, *sc;
> > diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> > index 2b00faf98017..5951c7bb5712 100644
> > --- a/include/linux/hyperv.h
> > +++ b/include/linux/hyperv.h
> > @@ -800,6 +800,7 @@ struct vmbus_requestor {
> > #define VMBUS_RQST_RESET (U64_MAX - 3)
> >
> > struct vmbus_device {
> > + size_t pref_ring_size;
>
> No documentation for this? What is the size in units of?

I can add a comment here like below:

/*
* Total memory in bytes allocated for the one complete ring buffer,
* which includes the ring buffer header, of size PAGE_SIZE. This value
* should be aligned to page_size.
*/

- Saurabh

>
> thanks,
>
> greg k-h

2024-02-18 09:10:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 0/6] Low speed Hyper-V devices support

On Sat, Feb 17, 2024 at 11:51:14PM -0800, Saurabh Singh Sengar wrote:
> On Sun, Feb 18, 2024 at 08:10:36AM +0100, Greg KH wrote:
> > On Sat, Feb 17, 2024 at 10:03:34AM -0800, Saurabh Sengar wrote:
> > > Hyper-V is adding multiple low speed "speciality" synthetic devices.
> > > Instead of writing a new kernel-level VMBus driver for each device,
> > > make the devices accessible to user space through UIO-based
> > > uio_hv_generic driver. Each device can then be supported by a user
> > > space driver. This approach optimizes the development process and
> > > provides flexibility to user space applications to control the key
> > > interactions with the VMBus ring buffer.
> > >
> > > The new synthetic devices are low speed devices that don't support
> > > VMBus monitor bits, and so they must use vmbus_setevent() to notify
> > > the host of ring buffer updates. These new devices also have smaller
> > > ring buffer sizes which requires to add support for variable ring buffer
> > > sizes.
> > >
> > > Moreover, this patch series adds a new implementation of the fcopy
> > > application that uses the new UIO driver. The older fcopy driver and
> > > application will be phased out gradually. Development of other similar
> > > userspace drivers is still underway.
> > >
> > >
> > > Efforts have been made previously to implement this solution earlier.
> > > Here are the discussions related to those attempts:
> > > https://lore.kernel.org/lkml/[email protected]/
> > > https://lore.kernel.org/lkml/[email protected]/
> > > https://lore.kernel.org/lkml/[email protected]/
> >
> > So is this a v4 of the patch series? What has changed from those
> > previous submissions?
>
> In the most recent attempt we introduced a new driver uio_hv_vmbus_client
> for slow devices, where as in this new approach we are making use of
> existing uio_hv_generic driver.
>
> We also introduced the function to query ring buffer sizes, this is an
> attempt to address your comment on earlier series to avoid kernel params.
> Comment ref: https://lore.kernel.org/lkml/[email protected]/
>
> We later tried to have ring buffer sizes via sysfs for which we wrote a
> new driver uio_hv_vmbus_client as explained above.
>
> This is the next approach in an attempt to address all of the concerns
> with all the previous series.

Then you need to say that somewhere, what differs from the previous
submissions and why this is better. Remember, some of us get 1000+
emails a day to do something with, and trying to remember a review
comment from last week, let alone months ago, is impossible.

Make this easy for us please...

And as this is a "next approach", it should be versioned properly. What
would you want to see if you had to review this?

thanks,

greg k-h

2024-02-18 09:11:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 1/6] Drivers: hv: vmbus: Add utility function for querying ring size

On Sun, Feb 18, 2024 at 12:03:06AM -0800, Saurabh Singh Sengar wrote:
> On Sun, Feb 18, 2024 at 08:11:58AM +0100, Greg KH wrote:
> > On Sat, Feb 17, 2024 at 10:03:35AM -0800, Saurabh Sengar wrote:
> > > Add a function to query for the preferred ring buffer size of VMBus
> > > device.
> >
> > That says what you did, but not why you did it.
>
> I thought subsequent patch will make it clear, but I can add more
> info in cover letter. I will enhance this commit as well.

Each patch should stand on its own, as it will be on its own when
committed, right?

I don't know anything is happening "next", nor what any of this means
here, which is required.

Again, what would _you_ want to see if you had to review this?

Along those lines, why not get some internal review, and signed-off-by
first, before asking us to review this for you? You all know this area
the best, and have lots of experience with reviews, right?

thanks,

greg k-h

2024-02-19 08:51:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/6] uio_hv_generic: Query the ringbuffer size for device

On Sat, Feb 17, 2024 at 10:03:36AM -0800, Saurabh Sengar wrote:
> Query the ring buffer size from pre defined table per device.
> Keep the size as is if the device doesn't have any preferred
> ring size.

What is the "as is" size?

>
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> drivers/uio/uio_hv_generic.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
> index 20d9762331bd..4bda6b52e49e 100644
> --- a/drivers/uio/uio_hv_generic.c
> +++ b/drivers/uio/uio_hv_generic.c
> @@ -238,6 +238,7 @@ hv_uio_probe(struct hv_device *dev,
> struct hv_uio_private_data *pdata;
> void *ring_buffer;
> int ret;
> + size_t ring_size = hv_dev_ring_size(channel);
>
> /* Communicating with host has to be via shared memory not hypercall */
> if (!channel->offermsg.monitor_allocated) {
> @@ -245,12 +246,14 @@ hv_uio_probe(struct hv_device *dev,
> return -ENOTSUPP;
> }
>
> + if (!ring_size)
> + ring_size = HV_RING_SIZE * PAGE_SIZE;

Why the magic * PAGE_SIZE here?

Where is it documented that ring_size is in pages?

And what happens when PAGE_SIZE is changed? Why are you relying on that
arbritrary value to dictate your buffer sizes to a device that has
no relationship with PAGE_SIZE?

Yes, I know you are copying what was there today, but you have the
chance to rethink and most importantly, DOCUMENT this decision properly
now.

thanks,

greg k-h

2024-02-19 08:53:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5/6] tools: hv: Add new fcopy application based on uio driver

On Sat, Feb 17, 2024 at 10:03:39AM -0800, Saurabh Sengar wrote:
> New fcopy application which utilizes uio_hv_vmbus_client driver

What does this "application" do?

>
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> tools/hv/Build | 3 +-
> tools/hv/Makefile | 10 +-
> tools/hv/hv_fcopy_uio_daemon.c | 488 +++++++++++++++++++++++++++++++++
> 3 files changed, 495 insertions(+), 6 deletions(-)
> create mode 100644 tools/hv/hv_fcopy_uio_daemon.c
>
> diff --git a/tools/hv/Build b/tools/hv/Build
> index 6cf51fa4b306..7d1f1698069b 100644
> --- a/tools/hv/Build
> +++ b/tools/hv/Build
> @@ -1,3 +1,4 @@
> hv_kvp_daemon-y += hv_kvp_daemon.o
> hv_vss_daemon-y += hv_vss_daemon.o
> -hv_fcopy_daemon-y += hv_fcopy_daemon.o
> +hv_fcopy_uio_daemon-y += hv_fcopy_uio_daemon.o
> +hv_fcopy_uio_daemon-y += vmbus_bufring.o
> diff --git a/tools/hv/Makefile b/tools/hv/Makefile
> index fe770e679ae8..944180cf916e 100644
> --- a/tools/hv/Makefile
> +++ b/tools/hv/Makefile
> @@ -17,7 +17,7 @@ MAKEFLAGS += -r
>
> override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
>
> -ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
> +ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_uio_daemon
> ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
>
> ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
> @@ -39,10 +39,10 @@ $(HV_VSS_DAEMON_IN): FORCE
> $(OUTPUT)hv_vss_daemon: $(HV_VSS_DAEMON_IN)
> $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@
>
> -HV_FCOPY_DAEMON_IN := $(OUTPUT)hv_fcopy_daemon-in.o
> -$(HV_FCOPY_DAEMON_IN): FORCE
> - $(Q)$(MAKE) $(build)=hv_fcopy_daemon
> -$(OUTPUT)hv_fcopy_daemon: $(HV_FCOPY_DAEMON_IN)
> +HV_FCOPY_UIO_DAEMON_IN := $(OUTPUT)hv_fcopy_uio_daemon-in.o
> +$(HV_FCOPY_UIO_DAEMON_IN): FORCE
> + $(Q)$(MAKE) $(build)=hv_fcopy_uio_daemon
> +$(OUTPUT)hv_fcopy_uio_daemon: $(HV_FCOPY_UIO_DAEMON_IN)
> $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@
>
> clean:
> diff --git a/tools/hv/hv_fcopy_uio_daemon.c b/tools/hv/hv_fcopy_uio_daemon.c
> new file mode 100644
> index 000000000000..f72c899328fc
> --- /dev/null
> +++ b/tools/hv/hv_fcopy_uio_daemon.c
> @@ -0,0 +1,488 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * An implementation of host to guest copy functionality for Linux.

host to guest of what? I think it's a specific type of host and guest,
right?

> + *
> + * Copyright (C) 2023, Microsoft, Inc.
> + *
> + * Author : K. Y. Srinivasan <[email protected]>
> + * Author : Saurabh Sengar <[email protected]>
> + *
> + */
> +
> +#include <dirent.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <getopt.h>
> +#include <locale.h>
> +#include <stdbool.h>
> +#include <stddef.h>
> +#include <stdint.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <syslog.h>
> +#include <unistd.h>
> +#include <wchar.h>
> +#include <sys/stat.h>
> +#include <linux/hyperv.h>
> +#include <linux/limits.h>
> +#include "vmbus_bufring.h"
> +
> +#define ICMSGTYPE_NEGOTIATE 0
> +#define ICMSGTYPE_FCOPY 7
> +
> +#define WIN8_SRV_MAJOR 1
> +#define WIN8_SRV_MINOR 1
> +#define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR)
> +
> +#define MAX_FOLDER_NAME 15
> +#define MAX_PATH_LEN 15
> +#define FCOPY_UIO "/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/uio"
> +
> +#define FCOPY_VER_COUNT 1
> +static const int fcopy_versions[] = {
> + WIN8_SRV_VERSION
> +};
> +
> +#define FW_VER_COUNT 1
> +static const int fw_versions[] = {
> + UTIL_FW_VERSION
> +};
> +
> +#define HV_RING_SIZE (4 * 4096)

Hey, that doesn't match the kernel driver! Why these values?


> +
> +unsigned char desc[HV_RING_SIZE];
> +
> +static int target_fd;
> +static char target_fname[PATH_MAX];
> +static unsigned long long filesize;
> +
> +static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags)
> +{
> + int error = HV_E_FAIL;
> + char *q, *p;
> +
> + filesize = 0;
> + p = (char *)path_name;

Why the unneeded cast?

> + snprintf(target_fname, sizeof(target_fname), "%s/%s",
> + (char *)path_name, (char *)file_name);

Again, why all of the unneeded casts? This feels very odd, so I've
stopped reading here, perhaps get an internal review first before
sending this out again?

thanks,

greg k-h

2024-02-19 09:24:33

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH 5/6] tools: hv: Add new fcopy application based on uio driver

On Mon, Feb 19, 2024 at 09:53:01AM +0100, Greg KH wrote:
> On Sat, Feb 17, 2024 at 10:03:39AM -0800, Saurabh Sengar wrote:
> > New fcopy application which utilizes uio_hv_vmbus_client driver
>
> What does this "application" do?

I will improve the commit message with more information.

>
> >
> > Signed-off-by: Saurabh Sengar <[email protected]>
> > ---
> > tools/hv/Build | 3 +-
> > tools/hv/Makefile | 10 +-
> > tools/hv/hv_fcopy_uio_daemon.c | 488 +++++++++++++++++++++++++++++++++
> > 3 files changed, 495 insertions(+), 6 deletions(-)
> > create mode 100644 tools/hv/hv_fcopy_uio_daemon.c
> >
> > diff --git a/tools/hv/Build b/tools/hv/Build
> > index 6cf51fa4b306..7d1f1698069b 100644
> > --- a/tools/hv/Build
> > +++ b/tools/hv/Build
> > @@ -1,3 +1,4 @@
> > hv_kvp_daemon-y += hv_kvp_daemon.o
> > hv_vss_daemon-y += hv_vss_daemon.o
> > -hv_fcopy_daemon-y += hv_fcopy_daemon.o
> > +hv_fcopy_uio_daemon-y += hv_fcopy_uio_daemon.o
> > +hv_fcopy_uio_daemon-y += vmbus_bufring.o
> > diff --git a/tools/hv/Makefile b/tools/hv/Makefile
> > index fe770e679ae8..944180cf916e 100644
> > --- a/tools/hv/Makefile
> > +++ b/tools/hv/Makefile
> > @@ -17,7 +17,7 @@ MAKEFLAGS += -r
> >
> > override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
> >
> > -ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
> > +ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_uio_daemon
> > ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
> >
> > ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
> > @@ -39,10 +39,10 @@ $(HV_VSS_DAEMON_IN): FORCE
> > $(OUTPUT)hv_vss_daemon: $(HV_VSS_DAEMON_IN)
> > $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@
> >
> > -HV_FCOPY_DAEMON_IN := $(OUTPUT)hv_fcopy_daemon-in.o
> > -$(HV_FCOPY_DAEMON_IN): FORCE
> > - $(Q)$(MAKE) $(build)=hv_fcopy_daemon
> > -$(OUTPUT)hv_fcopy_daemon: $(HV_FCOPY_DAEMON_IN)
> > +HV_FCOPY_UIO_DAEMON_IN := $(OUTPUT)hv_fcopy_uio_daemon-in.o
> > +$(HV_FCOPY_UIO_DAEMON_IN): FORCE
> > + $(Q)$(MAKE) $(build)=hv_fcopy_uio_daemon
> > +$(OUTPUT)hv_fcopy_uio_daemon: $(HV_FCOPY_UIO_DAEMON_IN)
> > $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@
> >
> > clean:
> > diff --git a/tools/hv/hv_fcopy_uio_daemon.c b/tools/hv/hv_fcopy_uio_daemon.c
> > new file mode 100644
> > index 000000000000..f72c899328fc
> > --- /dev/null
> > +++ b/tools/hv/hv_fcopy_uio_daemon.c
> > @@ -0,0 +1,488 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * An implementation of host to guest copy functionality for Linux.
>
> host to guest of what? I think it's a specific type of host and guest,
> right?

This application is replacement of hv_fcopy_daemon, so copied the exact
comment here. This is specific to Hyper-V host, I can add these details.


>
> > + *
> > + * Copyright (C) 2023, Microsoft, Inc.
> > + *
> > + * Author : K. Y. Srinivasan <[email protected]>
> > + * Author : Saurabh Sengar <[email protected]>
> > + *
> > + */
> > +
> > +#include <dirent.h>
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <getopt.h>
> > +#include <locale.h>
> > +#include <stdbool.h>
> > +#include <stddef.h>
> > +#include <stdint.h>
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <syslog.h>
> > +#include <unistd.h>
> > +#include <wchar.h>
> > +#include <sys/stat.h>
> > +#include <linux/hyperv.h>
> > +#include <linux/limits.h>
> > +#include "vmbus_bufring.h"
> > +
> > +#define ICMSGTYPE_NEGOTIATE 0
> > +#define ICMSGTYPE_FCOPY 7
> > +
> > +#define WIN8_SRV_MAJOR 1
> > +#define WIN8_SRV_MINOR 1
> > +#define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR)
> > +
> > +#define MAX_FOLDER_NAME 15
> > +#define MAX_PATH_LEN 15
> > +#define FCOPY_UIO "/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/uio"
> > +
> > +#define FCOPY_VER_COUNT 1
> > +static const int fcopy_versions[] = {
> > + WIN8_SRV_VERSION
> > +};
> > +
> > +#define FW_VER_COUNT 1
> > +static const int fw_versions[] = {
> > + UTIL_FW_VERSION
> > +};
> > +
> > +#define HV_RING_SIZE (4 * 4096)
>
> Hey, that doesn't match the kernel driver! Why these values?

This application talks to device which is recognize as HV_FCOPY
is kernel. In the first patch of current patch series I have
mentioned .pref_ring_size = 0x4000 for HV_FCOPY which matches this.

This code is well tested.


>
>
> > +
> > +unsigned char desc[HV_RING_SIZE];
> > +
> > +static int target_fd;
> > +static char target_fname[PATH_MAX];
> > +static unsigned long long filesize;
> > +
> > +static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags)
> > +{
> > + int error = HV_E_FAIL;
> > + char *q, *p;
> > +
> > + filesize = 0;
> > + p = (char *)path_name;
>
> Why the unneeded cast?

This code is existing today as form of hv_fcopy_daemon. As this new
application is replacing hv_fcopy_daemon I reused the same code and
casting.

But, I agree I can improve these.

>
> > + snprintf(target_fname, sizeof(target_fname), "%s/%s",
> > + (char *)path_name, (char *)file_name);
>
> Again, why all of the unneeded casts? This feels very odd, so I've
> stopped reading here, perhaps get an internal review first before
> sending this out again?

This patch has gone through extensive review in past, here is the
reference to the history:

https://lore.kernel.org/lkml/[email protected]/

I will look for some more internal review and reviewed-by.

- Saurabh

>
> thanks,
>
> greg k-h

2024-02-19 09:40:35

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH 2/6] uio_hv_generic: Query the ringbuffer size for device

On Mon, Feb 19, 2024 at 09:50:54AM +0100, Greg KH wrote:
> On Sat, Feb 17, 2024 at 10:03:36AM -0800, Saurabh Sengar wrote:
> > Query the ring buffer size from pre defined table per device.
> > Keep the size as is if the device doesn't have any preferred
> > ring size.
>
> What is the "as is" size?

I will elaborate more here.

>
> >
> > Signed-off-by: Saurabh Sengar <[email protected]>
> > ---
> > drivers/uio/uio_hv_generic.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
> > index 20d9762331bd..4bda6b52e49e 100644
> > --- a/drivers/uio/uio_hv_generic.c
> > +++ b/drivers/uio/uio_hv_generic.c
> > @@ -238,6 +238,7 @@ hv_uio_probe(struct hv_device *dev,
> > struct hv_uio_private_data *pdata;
> > void *ring_buffer;
> > int ret;
> > + size_t ring_size = hv_dev_ring_size(channel);
> >
> > /* Communicating with host has to be via shared memory not hypercall */
> > if (!channel->offermsg.monitor_allocated) {
> > @@ -245,12 +246,14 @@ hv_uio_probe(struct hv_device *dev,
> > return -ENOTSUPP;
> > }
> >
> > + if (!ring_size)
> > + ring_size = HV_RING_SIZE * PAGE_SIZE;
>
> Why the magic * PAGE_SIZE here?
>
> Where is it documented that ring_size is in pages?
>
> And what happens when PAGE_SIZE is changed? Why are you relying on that
> arbritrary value to dictate your buffer sizes to a device that has
> no relationship with PAGE_SIZE?
>
> Yes, I know you are copying what was there today, but you have the
> chance to rethink and most importantly, DOCUMENT this decision properly
> now.

I agree PAGE_SIZE is not accurate here and we should use HV_HYP_PAGE_SIZE.
This can be further improved by using VMBUS_RING_SIZE macro.

However, I propose addressing this improvement in a separate patch, given
the are significant changes already present in this series.

- Saurabh


>
> thanks,
>
> greg k-h

2024-02-19 09:52:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5/6] tools: hv: Add new fcopy application based on uio driver

On Mon, Feb 19, 2024 at 01:24:21AM -0800, Saurabh Singh Sengar wrote:
> > > +#define HV_RING_SIZE (4 * 4096)
> >
> > Hey, that doesn't match the kernel driver! Why these values?
>
> This application talks to device which is recognize as HV_FCOPY
> is kernel. In the first patch of current patch series I have
> mentioned .pref_ring_size = 0x4000 for HV_FCOPY which matches this.
>
> This code is well tested.

I'm commenting on the fact the 4096 is sometimes PAGE_SIZE and sometimes
not, and you have a default of using PAGE_SIZE values in the kernel
driver, and as such, this will not match up. So be careful here.

> > > +
> > > +unsigned char desc[HV_RING_SIZE];
> > > +
> > > +static int target_fd;
> > > +static char target_fname[PATH_MAX];
> > > +static unsigned long long filesize;
> > > +
> > > +static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags)
> > > +{
> > > + int error = HV_E_FAIL;
> > > + char *q, *p;
> > > +
> > > + filesize = 0;
> > > + p = (char *)path_name;
> >
> > Why the unneeded cast?
>
> This code is existing today as form of hv_fcopy_daemon. As this new
> application is replacing hv_fcopy_daemon I reused the same code and
> casting.

That wasn't obvious that this was copied code, please be explicit about
that.

thanks,

greg k-h

2024-02-19 10:03:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/6] uio_hv_generic: Query the ringbuffer size for device

On Mon, Feb 19, 2024 at 01:40:23AM -0800, Saurabh Singh Sengar wrote:
> On Mon, Feb 19, 2024 at 09:50:54AM +0100, Greg KH wrote:
> > On Sat, Feb 17, 2024 at 10:03:36AM -0800, Saurabh Sengar wrote:
> > > Query the ring buffer size from pre defined table per device.
> > > Keep the size as is if the device doesn't have any preferred
> > > ring size.
> >
> > What is the "as is" size?
>
> I will elaborate more here.
>
> >
> > >
> > > Signed-off-by: Saurabh Sengar <[email protected]>
> > > ---
> > > drivers/uio/uio_hv_generic.c | 7 +++++--
> > > 1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
> > > index 20d9762331bd..4bda6b52e49e 100644
> > > --- a/drivers/uio/uio_hv_generic.c
> > > +++ b/drivers/uio/uio_hv_generic.c
> > > @@ -238,6 +238,7 @@ hv_uio_probe(struct hv_device *dev,
> > > struct hv_uio_private_data *pdata;
> > > void *ring_buffer;
> > > int ret;
> > > + size_t ring_size = hv_dev_ring_size(channel);
> > >
> > > /* Communicating with host has to be via shared memory not hypercall */
> > > if (!channel->offermsg.monitor_allocated) {
> > > @@ -245,12 +246,14 @@ hv_uio_probe(struct hv_device *dev,
> > > return -ENOTSUPP;
> > > }
> > >
> > > + if (!ring_size)
> > > + ring_size = HV_RING_SIZE * PAGE_SIZE;
> >
> > Why the magic * PAGE_SIZE here?
> >
> > Where is it documented that ring_size is in pages?
> >
> > And what happens when PAGE_SIZE is changed? Why are you relying on that
> > arbritrary value to dictate your buffer sizes to a device that has
> > no relationship with PAGE_SIZE?
> >
> > Yes, I know you are copying what was there today, but you have the
> > chance to rethink and most importantly, DOCUMENT this decision properly
> > now.
>
> I agree PAGE_SIZE is not accurate here and we should use HV_HYP_PAGE_SIZE.
> This can be further improved by using VMBUS_RING_SIZE macro.
>
> However, I propose addressing this improvement in a separate patch, given
> the are significant changes already present in this series.

Add it as a new patch to the series makes sense, thanks!

greg k-h

2024-02-19 10:21:26

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH 2/6] uio_hv_generic: Query the ringbuffer size for device

On Mon, Feb 19, 2024 at 11:02:54AM +0100, Greg KH wrote:
> On Mon, Feb 19, 2024 at 01:40:23AM -0800, Saurabh Singh Sengar wrote:
> > On Mon, Feb 19, 2024 at 09:50:54AM +0100, Greg KH wrote:
> > > On Sat, Feb 17, 2024 at 10:03:36AM -0800, Saurabh Sengar wrote:
> > > > Query the ring buffer size from pre defined table per device.
> > > > Keep the size as is if the device doesn't have any preferred
> > > > ring size.
> > >
> > > What is the "as is" size?
> >
> > I will elaborate more here.
> >
> > >
> > > >
> > > > Signed-off-by: Saurabh Sengar <[email protected]>
> > > > ---
> > > > drivers/uio/uio_hv_generic.c | 7 +++++--
> > > > 1 file changed, 5 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
> > > > index 20d9762331bd..4bda6b52e49e 100644
> > > > --- a/drivers/uio/uio_hv_generic.c
> > > > +++ b/drivers/uio/uio_hv_generic.c
> > > > @@ -238,6 +238,7 @@ hv_uio_probe(struct hv_device *dev,
> > > > struct hv_uio_private_data *pdata;
> > > > void *ring_buffer;
> > > > int ret;
> > > > + size_t ring_size = hv_dev_ring_size(channel);
> > > >
> > > > /* Communicating with host has to be via shared memory not hypercall */
> > > > if (!channel->offermsg.monitor_allocated) {
> > > > @@ -245,12 +246,14 @@ hv_uio_probe(struct hv_device *dev,
> > > > return -ENOTSUPP;
> > > > }
> > > >
> > > > + if (!ring_size)
> > > > + ring_size = HV_RING_SIZE * PAGE_SIZE;
> > >
> > > Why the magic * PAGE_SIZE here?
> > >
> > > Where is it documented that ring_size is in pages?
> > >
> > > And what happens when PAGE_SIZE is changed? Why are you relying on that
> > > arbritrary value to dictate your buffer sizes to a device that has
> > > no relationship with PAGE_SIZE?
> > >
> > > Yes, I know you are copying what was there today, but you have the
> > > chance to rethink and most importantly, DOCUMENT this decision properly
> > > now.
> >
> > I agree PAGE_SIZE is not accurate here and we should use HV_HYP_PAGE_SIZE.
> > This can be further improved by using VMBUS_RING_SIZE macro.
> >
> > However, I propose addressing this improvement in a separate patch, given
> > the are significant changes already present in this series.
>
> Add it as a new patch to the series makes sense, thanks!

Sure, will add in V2.

- Saurabh

>
> greg k-h

2024-02-19 10:24:06

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH 5/6] tools: hv: Add new fcopy application based on uio driver

On Mon, Feb 19, 2024 at 10:52:12AM +0100, Greg KH wrote:
> On Mon, Feb 19, 2024 at 01:24:21AM -0800, Saurabh Singh Sengar wrote:
> > > > +#define HV_RING_SIZE (4 * 4096)
> > >
> > > Hey, that doesn't match the kernel driver! Why these values?
> >
> > This application talks to device which is recognize as HV_FCOPY
> > is kernel. In the first patch of current patch series I have
> > mentioned .pref_ring_size = 0x4000 for HV_FCOPY which matches this.
> >
> > This code is well tested.
>
> I'm commenting on the fact the 4096 is sometimes PAGE_SIZE and sometimes
> not, and you have a default of using PAGE_SIZE values in the kernel
> driver, and as such, this will not match up. So be careful here.

Thanks for clarification, I will add some note for it in V2.

>
> > > > +
> > > > +unsigned char desc[HV_RING_SIZE];
> > > > +
> > > > +static int target_fd;
> > > > +static char target_fname[PATH_MAX];
> > > > +static unsigned long long filesize;
> > > > +
> > > > +static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags)
> > > > +{
> > > > + int error = HV_E_FAIL;
> > > > + char *q, *p;
> > > > +
> > > > + filesize = 0;
> > > > + p = (char *)path_name;
> > >
> > > Why the unneeded cast?
> >
> > This code is existing today as form of hv_fcopy_daemon. As this new
> > application is replacing hv_fcopy_daemon I reused the same code and
> > casting.
>
> That wasn't obvious that this was copied code, please be explicit about
> that.
>
> thanks,
>
> greg k-h

2024-03-12 20:57:52

by Long Li

[permalink] [raw]
Subject: RE: [PATCH 1/6] Drivers: hv: vmbus: Add utility function for querying ring size

> Subject: [PATCH 1/6] Drivers: hv: vmbus: Add utility function for querying ring size
>
> Add a function to query for the preferred ring buffer size of VMBus device.

Patch looks good to me. It will be helpful if you can document the ring sizes for each device and put it in the comment. (e.g use a new ring size, or keep using an existing ring size)

>
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> drivers/hv/channel_mgmt.c | 7 +++++--
> drivers/hv/hyperv_vmbus.h | 5 +++++
> include/linux/hyperv.h | 1 +
> 3 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index
> 2f4d09ce027a..7ea444d72f9f 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -120,7 +120,8 @@ const struct vmbus_device vmbus_devs[] = {
> },
>
> /* File copy */
> - { .dev_type = HV_FCOPY,
> + { .pref_ring_size = 0x4000,
> + .dev_type = HV_FCOPY,
> HV_FCOPY_GUID,
> .perf_device = false,
> .allowed_in_isolated = false,
> @@ -141,11 +142,13 @@ const struct vmbus_device vmbus_devs[] = {
> },
>
> /* Unknown GUID */
> - { .dev_type = HV_UNKNOWN,
> + { .pref_ring_size = 0x11000,
> + .dev_type = HV_UNKNOWN,
> .perf_device = false,
> .allowed_in_isolated = false,
> },
> };
> +EXPORT_SYMBOL_GPL(vmbus_devs);
>
> static const struct {
> guid_t guid;
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h index
> f6b1e710f805..76ac5185a01a 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -417,6 +417,11 @@ static inline bool hv_is_perf_channel(struct
> vmbus_channel *channel)
> return vmbus_devs[channel->device_id].perf_device;
> }
>
> +static inline size_t hv_dev_ring_size(struct vmbus_channel *channel) {
> + return vmbus_devs[channel->device_id].pref_ring_size;
> +}
> +
> static inline bool hv_is_allocated_cpu(unsigned int cpu) {
> struct vmbus_channel *channel, *sc;
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index
> 2b00faf98017..5951c7bb5712 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -800,6 +800,7 @@ struct vmbus_requestor { #define VMBUS_RQST_RESET
> (U64_MAX - 3)
>
> struct vmbus_device {
> + size_t pref_ring_size;
> u16 dev_type;
> guid_t guid;
> bool perf_device;
> --
> 2.34.1
>


2024-03-13 19:27:22

by Long Li

[permalink] [raw]
Subject: RE: [PATCH 4/6] tools: hv: Add vmbus_bufring

> +
> +#define rte_compiler_barrier() ({ asm volatile ("" : : : "memory"); })
> +
> +#define rte_smp_rwmb() ({ asm volatile ("" : : :
> "memory"); })
> +
> +#define VMBUS_RQST_ERROR 0xFFFFFFFFFFFFFFFF
> +#define ALIGN(val, align) ((typeof(val))((val) & (~((typeof(val))((align) -
> 1)))))
> +
> +void *vmbus_uio_map(int *fd, int size)
> +{
> + void *map;
> +
> + map = mmap(NULL, 2 * size, PROT_READ | PROT_WRITE, MAP_SHARED,
> *fd, 0);
> + if (map == MAP_FAILED)
> + return NULL;
> +
> + return map;
> +}
> +
> +/* Increase bufring index by inc with wraparound */ static inline
> +uint32_t vmbus_br_idxinc(uint32_t idx, uint32_t inc, uint32_t sz) {
> + idx += inc;
> + if (idx >= sz)
> + idx -= sz;
> +
> + return idx;
> +}
> +
> +void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen)
> +{
> + br->vbr = buf;
> + br->windex = br->vbr->windex;
> + br->dsize = blen - sizeof(struct vmbus_bufring); }
> +
> +static inline __always_inline void
> +rte_smp_mb(void)
> +{
> + asm volatile("lock addl $0, -128(%%rsp); " ::: "memory"); }
> +
> +static inline int
> +rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
> +{
> + uint8_t res;
> +
> + asm volatile("lock ; "
> + "cmpxchgl %[src], %[dst];"
> + "sete %[res];"
> + : [res] "=a" (res), /* output */
> + [dst] "=m" (*dst)
> + : [src] "r" (src), /* input */
> + "a" (exp),
> + "m" (*dst)
> + : "memory"); /* no-clobber list */
> + return res;
> +}

Those helper functions (rte_*) are difficult to read for someone without DPDK background. Can you add comments to those helper functions? Maybe just copy the comments over from DPDK.

> +
> +static inline uint32_t
> +vmbus_txbr_copyto(const struct vmbus_br *tbr, uint32_t windex,
> + const void *src0, uint32_t cplen)
> +{
> + uint8_t *br_data = tbr->vbr->data;
> + uint32_t br_dsize = tbr->dsize;
> + const uint8_t *src = src0;
> +
> + /* XXX use double mapping like Linux kernel? */
> + if (cplen > br_dsize - windex) {
> + uint32_t fraglen = br_dsize - windex;
> +
> + /* Wrap-around detected */
> + memcpy(br_data + windex, src, fraglen);
> + memcpy(br_data, src + fraglen, cplen - fraglen);
> + } else {
> + memcpy(br_data + windex, src, cplen);
> + }
> +
> + return vmbus_br_idxinc(windex, cplen, br_dsize); }
> +
> +/*
> + * Write scattered channel packet to TX bufring.
> + *
> + * The offset of this channel packet is written as a 64bits value
> + * immediately after this channel packet.
> + *
> + * The write goes through three stages:
> + * 1. Reserve space in ring buffer for the new data.
> + * Writer atomically moves priv_write_index.
> + * 2. Copy the new data into the ring.
> + * 3. Update the tail of the ring (visible to host) that indicates
> + * next read location. Writer updates write_index
> + */
> +static int
> +vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
> + bool *need_sig)

Do you need "need_sig"? It's probably not for non-network/storage devices.

2024-03-13 19:31:14

by Long Li

[permalink] [raw]
Subject: RE: [PATCH 5/6] tools: hv: Add new fcopy application based on uio driver

> + /* Signal host */
> + if ((write(fcopy_fd, &tmp, sizeof(int))) != sizeof(int)) {
> + ret = errno;
> + syslog(LOG_ERR, "Registration failed: %s\n",
> strerror(ret));

"Signaling failed" is a better word?


2024-03-13 19:31:48

by Long Li

[permalink] [raw]
Subject: RE: [PATCH 6/6] Drivers: hv: Remove fcopy driver

> Subject: [PATCH 6/6] Drivers: hv: Remove fcopy driver
>
> As the new fcopy driver using uio is introduced, remove obsolete driver and
> application.
>
> Signed-off-by: Saurabh Sengar <[email protected]>

Reviewed-by: Long Li <[email protected]>