Hi Folks,
Background
==========
Some days ago, we had a discussion on the topic of channel path virtualization.
Ref:
Subject: [PATCH 0/3] Channel Path realted CRW generation
Message-Id: <[email protected]>
URL: https://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg08414.html
Indeed that thread is not short and discussed many aspects in a
non-concentrated manner. The parts those are most valuable to me are:
1. a re-modelling for channel path is surely the best offer, but is not
possible to have in the near future.
2. to enhance the path related functionalities, using PNO and PNOM might
be something we can do for now. This may be something that I could improve
without model related arguments.
So here I have this series targeting to add basic channel path event handling
for vfio-ccw -- no touch of the channel path modelling in both the kernel and
the QEMU side, but find a way to sync path status change to guest lazily using
SCSW_FLAGS_MASK_PNO and pmcw->pnom. In short, I want to enhance path related
stuff (to be more specific: sync up path status to the guest) on a best effort
basis, which means in a way that won't get us invloed to do channel path
re-modelling.
What benifit can we get from this? The administrator of a virtual machine can
get uptodate (in some extent) status of the current using channel paths, so
he/she can monitor paths status and get path problem noticed timely (see the
example below).
I think we can start a new round discussion based on this series. So reviewers
can give their comments based on some code, and then we can decide if this is
we want or not.
As flagged with RFC, the intention of this series is to show what I have for
now, and what could the code look like in general. Thus I can get some early
feedbacks. I would expect to see opinions on:
- is the target (mentioned above) of this series welcomed or not.
- is the approach of this series good or bad.
So I can either move on with this (or with other suggested approach) or leave
it alone.
Basic Introduction of The Patches
=================================
This is the kernel counterpart, which mainly does:
1. add a schib vfio region for userland to _store_ subchannel information.
2. add a channel path vfio irq to notify userland with chp status change event.
3. add .chp_event handler for vfio-ccw driver, so the driver handles chp event,
and signals userland about the event.
With the above work, userland can be signaled with chp related event, and then
it can read out uptodate SCHIB from the new region, and sync up path related
information to the corresponding virtual subchannel. So a guest can sense the
path update in some extent.
For the QEMU counterpart, please ref:
[RFC PATCH 0/5] vfio/ccw: basic channel path event handling
The QEMU counterpart mainly does:
1. add handling of the schib region, so that it can read out the newest schib
information.
2. add handling of the chp irq, so that it can get notification of channel path
status change.
3. once there is a chp status event, synchronize related information from the
newest schib information to guest lazily.
What are still missing, thus need to be offered in the next version are:
- I/O termination and FSM state handling if currently we have I/O on the status
switched path.
- Vary on/off event is not sensible to a guest.
Example
=======
With both the kernel and Qemu parts applied, we can notice some new behaviors
of a channel path when we have a guest with a passed through vfio-ccw device
using it. The guest can reflect the chp status change of the host side lazily,
and synchronize the updated information.
For example:
0. Prepare a vfio subchannel on the host:
[root@host ~]# lscss --vfio 013f
MDEV Subchan. PIM PAM POM CHPIDs
------------------------------------------------------------------------------
6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920 0.0.013f f0 f0 ff 42434445 00000000
1. Pass-through subchannel 0.0.013f to a guest:
-device vfio-ccw,sysfsdev="$MDEV_FILE_PATH",devno=0.0.3f3f
2. Start the guest and check the device and path information:
[root@guest ~]# lscss 0002
Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
----------------------------------------------------------------------
0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 f0 ff 42434445 00000000
[root@guest ~]# lschp
CHPID Vary Cfg. Type Cmg Shared PCHID
============================================
0.00 1 - 32 - - -
0.42 1 3 1b - - -
0.43 1 3 1b - - -
0.44 1 3 1b - - -
0.45 1 3 1b - - -
3. On the host, configure off one path.
[root@host ~]# chchp -c 0 42
4. On the guest, check the status:
[root@guest ~]# lscss 0002
Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
----------------------------------------------------------------------
0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 f0 ff 42434445 00000000
#Notice: No change!
[root@localhost ~]# chccwdev -e 3f3f
Setting device 0.0.3f3f online
dasd-eckd 0.0.3f3f: A channel path to the device has become operational
dasd-eckd 0.0.3f3f: New DASD 3390/0C (CU 3990/01) with 30051 cylinders, 15 heads, 224 sectors
dasd-eckd 0.0.3f3f: DASD with 4 KB/block, 21636720 KB total size, 48 KB/track, compatible disk layout
dasda:VOL1/ 0X3F3F: dasda1
Done
[root@guest ~]# lscss 0002
Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
----------------------------------------------------------------------
0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 70 ff 42434445 00000000
#Notice: PAM value of path 0.42 changed.
5. On the host, configure on one path.
[root@host ~]# chchp -c 1 42
6. On the guest, check the status again:
[root@guest ~]# lscss 0002
Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
----------------------------------------------------------------------
0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 70 ff 42434445 00000000
#Notice: No change!
[root@localhost ~]# chccwdev -d 3f3f
Setting device 0.0.3f3f offline
Done
[root@guest ~]# lscss 0002
Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
----------------------------------------------------------------------
0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 f0 ff 42434445 00000000
#Notice: PAM changed again.
Dong Jia Shi (3):
vfio: ccw: introduce schib region
vfio: ccw: introduce channel path irq
vfio: ccw: handle chp event
drivers/s390/cio/vfio_ccw_drv.c | 51 +++++++++++++++++
drivers/s390/cio/vfio_ccw_fsm.c | 43 ++++++++++++++
drivers/s390/cio/vfio_ccw_ops.c | 108 ++++++++++++++++++++++++++----------
drivers/s390/cio/vfio_ccw_private.h | 8 +++
include/uapi/linux/vfio.h | 2 +
include/uapi/linux/vfio_ccw.h | 6 ++
6 files changed, 190 insertions(+), 28 deletions(-)
--
2.13.5
This introduces a new region for vfio-ccw to provide subchannel
information for user space.
Signed-off-by: Dong Jia Shi <[email protected]>
---
drivers/s390/cio/vfio_ccw_fsm.c | 21 ++++++++++
drivers/s390/cio/vfio_ccw_ops.c | 79 +++++++++++++++++++++++++++----------
drivers/s390/cio/vfio_ccw_private.h | 3 ++
include/uapi/linux/vfio.h | 1 +
include/uapi/linux/vfio_ccw.h | 6 +++
5 files changed, 90 insertions(+), 20 deletions(-)
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index c30420c517b1..be081ccabea3 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -172,6 +172,22 @@ static void fsm_irq(struct vfio_ccw_private *private,
complete(private->completion);
}
+static void fsm_update_subch(struct vfio_ccw_private *private,
+ enum vfio_ccw_event event)
+{
+ struct subchannel *sch;
+
+ sch = private->sch;
+ if (cio_update_schib(sch)) {
+ private->schib_region.cc = 3;
+ return;
+ }
+
+ private->schib_region.cc = 0;
+ memcpy(private->schib_region.schib_area, &sch->schib,
+ sizeof(sch->schib));
+}
+
/*
* Device statemachine
*/
@@ -180,25 +196,30 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_nop,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_disabled_irq,
+ [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
},
[VFIO_CCW_STATE_STANDBY] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
+ [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
},
[VFIO_CCW_STATE_IDLE] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_request,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
+ [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
},
[VFIO_CCW_STATE_BOXED] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
+ [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
},
[VFIO_CCW_STATE_BUSY] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
+ [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
},
};
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 41eeb57d68a3..9528fce2e7d9 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -13,6 +13,11 @@
#include "vfio_ccw_private.h"
+#define VFIO_CCW_OFFSET_SHIFT 40
+#define VFIO_CCW_OFFSET_TO_INDEX(off) (off >> VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_INDEX_TO_OFFSET(index) ((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
+#define VFIO_CCW_OFFSET_MASK (((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)
+
static int vfio_ccw_mdev_reset(struct mdev_device *mdev)
{
struct vfio_ccw_private *private;
@@ -168,14 +173,31 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
loff_t *ppos)
{
struct vfio_ccw_private *private;
- struct ccw_io_region *region;
+ void *region;
+ u32 index;
+ loff_t pos;
+
+ index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
+ pos = (*ppos) & VFIO_CCW_OFFSET_MASK;
+ private = dev_get_drvdata(mdev_parent_dev(mdev));
- if (*ppos + count > sizeof(*region))
+ switch (index) {
+ case VFIO_CCW_CONFIG_REGION_INDEX:
+ if (pos + count > sizeof(struct ccw_io_region))
+ return -EINVAL;
+ region = &private->io_region;
+ break;
+ case VFIO_CCW_SCHIB_REGION_INDEX:
+ if (pos + count > sizeof(struct ccw_schib_region))
+ return -EINVAL;
+ region = &private->schib_region;
+ vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_UPDATE_SUBCH);
+ break;
+ default:
return -EINVAL;
+ }
- private = dev_get_drvdata(mdev_parent_dev(mdev));
- region = &private->io_region;
- if (copy_to_user(buf, (void *)region + *ppos, count))
+ if (copy_to_user(buf, region + pos, count))
return -EFAULT;
return count;
@@ -187,23 +209,35 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
loff_t *ppos)
{
struct vfio_ccw_private *private;
- struct ccw_io_region *region;
-
- if (*ppos + count > sizeof(*region))
- return -EINVAL;
+ u32 index;
+ loff_t pos;
+ index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
+ pos = (*ppos) & VFIO_CCW_OFFSET_MASK;
private = dev_get_drvdata(mdev_parent_dev(mdev));
- if (private->state != VFIO_CCW_STATE_IDLE)
- return -EACCES;
- region = &private->io_region;
- if (copy_from_user((void *)region + *ppos, buf, count))
- return -EFAULT;
-
- vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
- if (region->ret_code != 0) {
- private->state = VFIO_CCW_STATE_IDLE;
- return region->ret_code;
+ switch (index) {
+ case VFIO_CCW_CONFIG_REGION_INDEX:
+ {
+ struct ccw_io_region *region;
+ if (pos + count > sizeof(*region))
+ return -EINVAL;
+ if (private->state != VFIO_CCW_STATE_IDLE)
+ return -EACCES;
+ region = &private->io_region;
+ if (copy_from_user((void *)region + *ppos, buf, count))
+ return -EFAULT;
+ vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
+ if (region->ret_code != 0) {
+ private->state = VFIO_CCW_STATE_IDLE;
+ return region->ret_code;
+ }
+ break;
+ }
+ case VFIO_CCW_SCHIB_REGION_INDEX:
+ return -EOPNOTSUPP;
+ default:
+ return -EINVAL;
}
return count;
@@ -224,11 +258,16 @@ static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
{
switch (info->index) {
case VFIO_CCW_CONFIG_REGION_INDEX:
- info->offset = 0;
+ info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
info->size = sizeof(struct ccw_io_region);
info->flags = VFIO_REGION_INFO_FLAG_READ
| VFIO_REGION_INFO_FLAG_WRITE;
return 0;
+ case VFIO_CCW_SCHIB_REGION_INDEX:
+ info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
+ info->size = sizeof(struct ccw_schib_region);
+ info->flags = VFIO_REGION_INFO_FLAG_READ;
+ return 0;
default:
return -EINVAL;
}
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 78a66d96756b..460c8b90d834 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -28,6 +28,7 @@
* @mdev: pointer to the mediated device
* @nb: notifier for vfio events
* @io_region: MMIO region to input/output I/O arguments/results
+ * @schib_region: MMIO region of subchannel information
* @cp: channel program for the current I/O operation
* @irb: irb info received from interrupt
* @scsw: scsw info
@@ -42,6 +43,7 @@ struct vfio_ccw_private {
struct mdev_device *mdev;
struct notifier_block nb;
struct ccw_io_region io_region;
+ struct ccw_schib_region schib_region;
struct channel_program cp;
struct irb irb;
@@ -76,6 +78,7 @@ enum vfio_ccw_event {
VFIO_CCW_EVENT_NOT_OPER,
VFIO_CCW_EVENT_IO_REQ,
VFIO_CCW_EVENT_INTERRUPT,
+ VFIO_CCW_EVENT_UPDATE_SUBCH,
/* last element! */
NR_VFIO_CCW_EVENTS
};
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index e3301dbd27d4..c60244debf71 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -457,6 +457,7 @@ enum {
enum {
VFIO_CCW_CONFIG_REGION_INDEX,
+ VFIO_CCW_SCHIB_REGION_INDEX,
VFIO_CCW_NUM_REGIONS
};
diff --git a/include/uapi/linux/vfio_ccw.h b/include/uapi/linux/vfio_ccw.h
index 2ec5f367ff78..12508ef6e6fc 100644
--- a/include/uapi/linux/vfio_ccw.h
+++ b/include/uapi/linux/vfio_ccw.h
@@ -22,4 +22,10 @@ struct ccw_io_region {
__u32 ret_code;
} __packed;
+struct ccw_schib_region {
+ __u8 cc;
+#define SCHIB_AREA_SIZE 52
+ __u8 schib_area[SCHIB_AREA_SIZE];
+} __packed;
+
#endif
--
2.13.5
This adds channel path related event handler for vfio-ccw.
This also signals userland when there is a chp event.
Signed-off-by: Dong Jia Shi <[email protected]>
---
drivers/s390/cio/vfio_ccw_drv.c | 51 +++++++++++++++++++++++++++++++++++++
drivers/s390/cio/vfio_ccw_fsm.c | 22 ++++++++++++++++
drivers/s390/cio/vfio_ccw_private.h | 3 +++
3 files changed, 76 insertions(+)
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index ea6a2d0b2894..5f01f3e6742d 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -17,6 +17,7 @@
#include <asm/isc.h>
+#include "chp.h"
#include "ioasm.h"
#include "css.h"
#include "vfio_ccw_private.h"
@@ -88,6 +89,15 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
private->state = VFIO_CCW_STATE_IDLE;
}
+static void vfio_ccw_sch_chp_todo(struct work_struct *work)
+{
+ struct vfio_ccw_private *private =
+ container_of(work, struct vfio_ccw_private, chp_work);
+
+ if (private->chp_trigger)
+ eventfd_signal(private->chp_trigger, 1);
+}
+
/*
* Css driver callbacks
*/
@@ -130,6 +140,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
goto out_disable;
INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
+ INIT_WORK(&private->chp_work, vfio_ccw_sch_chp_todo);
atomic_set(&private->avail, 1);
private->state = VFIO_CCW_STATE_STANDBY;
@@ -202,6 +213,45 @@ static int vfio_ccw_sch_event(struct subchannel *sch, int process)
return 0;
}
+static int vfio_ccw_sch_chp_event(struct subchannel *sch,
+ struct chp_link *link, int event)
+{
+ struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+ int mask = chp_ssd_get_mask(&sch->ssd_info, link);
+
+ /* move these checks around? */
+ if (!private || !mask)
+ return 0;
+
+ if (cio_update_schib(sch))
+ return -ENODEV;
+
+ switch (event) {
+ case CHP_VARY_OFF:
+ sch->opm &= ~mask;
+ sch->lpm &= ~mask;
+ /* TODO: terminate current I/O on path. */
+ break;
+ case CHP_VARY_ON:
+ sch->opm |= mask;
+ sch->lpm |= mask;
+ break;
+ case CHP_OFFLINE:
+ /* TODO: terminate current I/O on path. */
+ break;
+ case CHP_ONLINE:
+ sch->lpm |= mask & sch->opm;
+ break;
+ default:
+ /* Not possible? */
+ return 0;
+ }
+
+ vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_UPDATE_CHP);
+
+ return 0;
+}
+
static struct css_device_id vfio_ccw_sch_ids[] = {
{ .match_flags = 0x1, .type = SUBCHANNEL_TYPE_IO, },
{ /* end of list */ },
@@ -219,6 +269,7 @@ static struct css_driver vfio_ccw_sch_driver = {
.remove = vfio_ccw_sch_remove,
.shutdown = vfio_ccw_sch_shutdown,
.sch_event = vfio_ccw_sch_event,
+ .chp_event = vfio_ccw_sch_chp_event,
};
static int __init vfio_ccw_sch_init(void)
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index be081ccabea3..c400021134cb 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -188,6 +188,23 @@ static void fsm_update_subch(struct vfio_ccw_private *private,
sizeof(sch->schib));
}
+static void fsm_update_chp(struct vfio_ccw_private *private,
+ enum vfio_ccw_event event)
+{
+ queue_work(vfio_ccw_work_q, &private->chp_work);
+
+}
+
+static void fsm_update_chp_busy(struct vfio_ccw_private *private,
+ enum vfio_ccw_event event)
+{
+ /*
+ * TODO:
+ * If we are having I/O on the current path, do
+ * extra handling?
+ */
+}
+
/*
* Device statemachine
*/
@@ -197,29 +214,34 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_disabled_irq,
[VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
+ [VFIO_CCW_EVENT_UPDATE_CHP] = fsm_nop,
},
[VFIO_CCW_STATE_STANDBY] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
[VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
+ [VFIO_CCW_EVENT_UPDATE_CHP] = fsm_update_chp,
},
[VFIO_CCW_STATE_IDLE] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_request,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
[VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
+ [VFIO_CCW_EVENT_UPDATE_CHP] = fsm_update_chp,
},
[VFIO_CCW_STATE_BOXED] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
[VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
+ [VFIO_CCW_EVENT_UPDATE_CHP] = fsm_update_chp,
},
[VFIO_CCW_STATE_BUSY] = {
[VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
[VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
[VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
[VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
+ [VFIO_CCW_EVENT_UPDATE_CHP] = fsm_update_chp_busy,
},
};
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index da86f82dd7b9..e3bdb90555fb 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -35,6 +35,7 @@
* @io_trigger: eventfd ctx for signaling userspace I/O results
* @chp_trigger: eventfd ctx for signaling userspace chp event
* @io_work: work for deferral process of I/O handling
+ * @chp_work: work for deferral process of chp event
*/
struct vfio_ccw_private {
struct subchannel *sch;
@@ -53,6 +54,7 @@ struct vfio_ccw_private {
struct eventfd_ctx *io_trigger;
struct eventfd_ctx *chp_trigger;
struct work_struct io_work;
+ struct work_struct chp_work;
} __aligned(8);
extern int vfio_ccw_mdev_reg(struct subchannel *sch);
@@ -81,6 +83,7 @@ enum vfio_ccw_event {
VFIO_CCW_EVENT_IO_REQ,
VFIO_CCW_EVENT_INTERRUPT,
VFIO_CCW_EVENT_UPDATE_SUBCH,
+ VFIO_CCW_EVENT_UPDATE_CHP,
/* last element! */
NR_VFIO_CCW_EVENTS
};
--
2.13.5
This introduces a new irq for vfio-ccw to provide channel path
related event for userland.
Signed-off-by: Dong Jia Shi <[email protected]>
---
drivers/s390/cio/vfio_ccw_ops.c | 29 +++++++++++++++++++++--------
drivers/s390/cio/vfio_ccw_private.h | 2 ++
include/uapi/linux/vfio.h | 1 +
3 files changed, 24 insertions(+), 8 deletions(-)
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 9528fce2e7d9..972cfc8834fc 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -275,17 +275,20 @@ static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
{
- if (info->index != VFIO_CCW_IO_IRQ_INDEX)
+ switch (info->index) {
+ case VFIO_CCW_IO_IRQ_INDEX:
+ case VFIO_CCW_CHP_IRQ_INDEX:
+ info->count = 1;
+ info->flags = VFIO_IRQ_INFO_EVENTFD;
+ return 0;
+ default:
return -EINVAL;
-
- info->count = 1;
- info->flags = VFIO_IRQ_INFO_EVENTFD;
-
- return 0;
+ }
}
static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
uint32_t flags,
+ unsigned index,
void __user *data)
{
struct vfio_ccw_private *private;
@@ -295,7 +298,17 @@ static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
return -EINVAL;
private = dev_get_drvdata(mdev_parent_dev(mdev));
- ctx = &private->io_trigger;
+
+ switch (index) {
+ case VFIO_CCW_IO_IRQ_INDEX:
+ ctx = &private->io_trigger;
+ break;
+ case VFIO_CCW_CHP_IRQ_INDEX:
+ ctx = &private->chp_trigger;
+ break;
+ default:
+ return -EINVAL;
+ }
switch (flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
case VFIO_IRQ_SET_DATA_NONE:
@@ -433,7 +446,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
return ret;
data = (void __user *)(arg + minsz);
- return vfio_ccw_mdev_set_irqs(mdev, hdr.flags, data);
+ return vfio_ccw_mdev_set_irqs(mdev, hdr.flags, hdr.index, data);
}
case VFIO_DEVICE_RESET:
return vfio_ccw_mdev_reset(mdev);
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 460c8b90d834..da86f82dd7b9 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -33,6 +33,7 @@
* @irb: irb info received from interrupt
* @scsw: scsw info
* @io_trigger: eventfd ctx for signaling userspace I/O results
+ * @chp_trigger: eventfd ctx for signaling userspace chp event
* @io_work: work for deferral process of I/O handling
*/
struct vfio_ccw_private {
@@ -50,6 +51,7 @@ struct vfio_ccw_private {
union scsw scsw;
struct eventfd_ctx *io_trigger;
+ struct eventfd_ctx *chp_trigger;
struct work_struct io_work;
} __aligned(8);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index c60244debf71..179a7492d53a 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -463,6 +463,7 @@ enum {
enum {
VFIO_CCW_IO_IRQ_INDEX,
+ VFIO_CCW_CHP_IRQ_INDEX,
VFIO_CCW_NUM_IRQS
};
--
2.13.5
On Thu, 11 Jan 2018 04:04:18 +0100
Dong Jia Shi <[email protected]> wrote:
> Hi Folks,
>
> Background
> ==========
>
> Some days ago, we had a discussion on the topic of channel path virtualization.
> Ref:
> Subject: [PATCH 0/3] Channel Path realted CRW generation
> Message-Id: <[email protected]>
> URL: https://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg08414.html
>
> Indeed that thread is not short and discussed many aspects in a
> non-concentrated manner. The parts those are most valuable to me are:
> 1. a re-modelling for channel path is surely the best offer, but is not
> possible to have in the near future.
> 2. to enhance the path related functionalities, using PNO and PNOM might
> be something we can do for now. This may be something that I could improve
> without model related arguments.
>
> So here I have this series targeting to add basic channel path event handling
> for vfio-ccw -- no touch of the channel path modelling in both the kernel and
> the QEMU side, but find a way to sync path status change to guest lazily using
> SCSW_FLAGS_MASK_PNO and pmcw->pnom. In short, I want to enhance path related
> stuff (to be more specific: sync up path status to the guest) on a best effort
> basis, which means in a way that won't get us invloed to do channel path
> re-modelling.
The guest should also get the updated PIM/PAM/POM, shouldn't it?
>
> What benifit can we get from this? The administrator of a virtual machine can
> get uptodate (in some extent) status of the current using channel paths, so
> he/she can monitor paths status and get path problem noticed timely (see the
> example below).
>
> I think we can start a new round discussion based on this series. So reviewers
> can give their comments based on some code, and then we can decide if this is
> we want or not.
>
> As flagged with RFC, the intention of this series is to show what I have for
> now, and what could the code look like in general. Thus I can get some early
> feedbacks. I would expect to see opinions on:
> - is the target (mentioned above) of this series welcomed or not.
It certainly makes sense to have a way to get an updated schib.
> - is the approach of this series good or bad.
Still need to read :)
> So I can either move on with this (or with other suggested approach) or leave
> it alone.
>
> Basic Introduction of The Patches
> =================================
>
> This is the kernel counterpart, which mainly does:
> 1. add a schib vfio region for userland to _store_ subchannel information.
> 2. add a channel path vfio irq to notify userland with chp status change event.
> 3. add .chp_event handler for vfio-ccw driver, so the driver handles chp event,
> and signals userland about the event.
Do you plan to trigger schib updates for things other than path events?
>
> With the above work, userland can be signaled with chp related event, and then
> it can read out uptodate SCHIB from the new region, and sync up path related
> information to the corresponding virtual subchannel. So a guest can sense the
> path update in some extent.
That's basically what Linux could do before implementing chpid related
machine checks, so it should be at least helpful.
>
> For the QEMU counterpart, please ref:
> [RFC PATCH 0/5] vfio/ccw: basic channel path event handling
>
> The QEMU counterpart mainly does:
> 1. add handling of the schib region, so that it can read out the newest schib
> information.
> 2. add handling of the chp irq, so that it can get notification of channel path
> status change.
> 3. once there is a chp status event, synchronize related information from the
> newest schib information to guest lazily.
>
> What are still missing, thus need to be offered in the next version are:
> - I/O termination and FSM state handling if currently we have I/O on the status
> switched path.
I'm wondering up to which extent we should involve ourselves here. The
normal I/O subchannel driver handles all the path related things; but
for vfio, we basically want to hand the subchannel to the guest and not
involve ourselves in management. A configure off does an SCLP command;
does that already have an impact on running commands? (I can't check
myself due to lack of public documentation, sadly.)
> - Vary on/off event is not sensible to a guest.
As vary on/off basically means manipulating some internal masks and
updating path groups if applicable, I'm not sure how much we
could/should do here anyway.
>
> Example
> =======
>
> With both the kernel and Qemu parts applied, we can notice some new behaviors
> of a channel path when we have a guest with a passed through vfio-ccw device
> using it. The guest can reflect the chp status change of the host side lazily,
> and synchronize the updated information.
>
> For example:
> 0. Prepare a vfio subchannel on the host:
> [root@host ~]# lscss --vfio 013f
Oh, is this a new option? In which version had it been added? (My
Fedora 26 LPAR does not yet have it.)
> MDEV Subchan. PIM PAM POM CHPIDs
> ------------------------------------------------------------------------------
> 6dfd3ec5-e8b3-4e18-a6fe-57bc9eceb920 0.0.013f f0 f0 ff 42434445 00000000
>
> 1. Pass-through subchannel 0.0.013f to a guest:
> -device vfio-ccw,sysfsdev="$MDEV_FILE_PATH",devno=0.0.3f3f
>
> 2. Start the guest and check the device and path information:
> [root@guest ~]# lscss 0002
> Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
> ----------------------------------------------------------------------
> 0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 f0 ff 42434445 00000000
> [root@guest ~]# lschp
> CHPID Vary Cfg. Type Cmg Shared PCHID
> ============================================
> 0.00 1 - 32 - - -
> 0.42 1 3 1b - - -
> 0.43 1 3 1b - - -
> 0.44 1 3 1b - - -
> 0.45 1 3 1b - - -
>
> 3. On the host, configure off one path.
> [root@host ~]# chchp -c 0 42
>
> 4. On the guest, check the status:
> [root@guest ~]# lscss 0002
> Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
> ----------------------------------------------------------------------
> 0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 f0 ff 42434445 00000000
> #Notice: No change!
>
> [root@localhost ~]# chccwdev -e 3f3f
> Setting device 0.0.3f3f online
> dasd-eckd 0.0.3f3f: A channel path to the device has become operational
> dasd-eckd 0.0.3f3f: New DASD 3390/0C (CU 3990/01) with 30051 cylinders, 15 heads, 224 sectors
> dasd-eckd 0.0.3f3f: DASD with 4 KB/block, 21636720 KB total size, 48 KB/track, compatible disk layout
> dasda:VOL1/ 0X3F3F: dasda1
> Done
>
> [root@guest ~]# lscss 0002
> Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
> ----------------------------------------------------------------------
> 0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 70 ff 42434445 00000000
> #Notice: PAM value of path 0.42 changed.
>
> 5. On the host, configure on one path.
> [root@host ~]# chchp -c 1 42
>
> 6. On the guest, check the status again:
> [root@guest ~]# lscss 0002
> Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
> ----------------------------------------------------------------------
> 0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 70 ff 42434445 00000000
> #Notice: No change!
>
> [root@localhost ~]# chccwdev -d 3f3f
> Setting device 0.0.3f3f offline
> Done
>
> [root@guest ~]# lscss 0002
> Device Subchan. DevType CU Type Use PIM PAM POM CHPIDs
> ----------------------------------------------------------------------
> 0.0.3f3f 0.0.0002 3390/0c 3990/e9 f0 f0 ff 42434445 00000000
> #Notice: PAM changed again.
Yes, that looks reasonable. The guest being aware of changed masks only
if it actually did something that triggered path verification is
probably the best we can do without implementing channel path machine
checks.
On Thu, 11 Jan 2018 04:04:19 +0100
Dong Jia Shi <[email protected]> wrote:
> This introduces a new region for vfio-ccw to provide subchannel
> information for user space.
>
> Signed-off-by: Dong Jia Shi <[email protected]>
> ---
> drivers/s390/cio/vfio_ccw_fsm.c | 21 ++++++++++
> drivers/s390/cio/vfio_ccw_ops.c | 79 +++++++++++++++++++++++++++----------
> drivers/s390/cio/vfio_ccw_private.h | 3 ++
> include/uapi/linux/vfio.h | 1 +
> include/uapi/linux/vfio_ccw.h | 6 +++
> 5 files changed, 90 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index c30420c517b1..be081ccabea3 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -172,6 +172,22 @@ static void fsm_irq(struct vfio_ccw_private *private,
> complete(private->completion);
> }
>
> +static void fsm_update_subch(struct vfio_ccw_private *private,
> + enum vfio_ccw_event event)
> +{
> + struct subchannel *sch;
> +
> + sch = private->sch;
> + if (cio_update_schib(sch)) {
This implies device gone. Do we also want to trigger some event, or
just wait until a machine check comes around and we're notified in the
normal way? (Probably the latter.)
> + private->schib_region.cc = 3;
> + return;
> + }
> +
> + private->schib_region.cc = 0;
> + memcpy(private->schib_region.schib_area, &sch->schib,
> + sizeof(sch->schib));
We might want to add documentation that schib_area contains the schib
from the last successful invocation of stsch (if any). That makes sense
as the schib remains unchanged for cc=3 after stsch anyway, but it
can't hurt to spell it out.
> +}
> +
> /*
> * Device statemachine
> */
> @@ -180,25 +196,30 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_nop,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_disabled_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_STANDBY] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_IDLE] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_request,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_BOXED] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_BUSY] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
Does it makes to trigger this through the state machine if we always do
the same action and never change state?
> },
> };
Else, looks good.
On 01/11/2018 04:04 AM, Dong Jia Shi wrote:
> What are still missing, thus need to be offered in the next version are:
> - I/O termination and FSM state handling if currently we have I/O on the status
> switched path.
> - Vary on/off event is not sensible to a guest.
I don't see a doc update. We do have documentation (in
Documentation/s390/vfio-ccw.txt) in which the uapi interface with the
regions and their purpose/usage is at least kind of explained. You are
changing this interface without updating the doc.
I would like to see documentation on this because I'm under the
impression either the design is pretty convoluted or I did not
get it at all.
Regards,
Halil
On 11/01/2018 04:04, Dong Jia Shi wrote:
> This introduces a new region for vfio-ccw to provide subchannel
> information for user space.
>
> Signed-off-by: Dong Jia Shi <[email protected]>
> ---
> drivers/s390/cio/vfio_ccw_fsm.c | 21 ++++++++++
> drivers/s390/cio/vfio_ccw_ops.c | 79 +++++++++++++++++++++++++++----------
> drivers/s390/cio/vfio_ccw_private.h | 3 ++
> include/uapi/linux/vfio.h | 1 +
> include/uapi/linux/vfio_ccw.h | 6 +++
> 5 files changed, 90 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index c30420c517b1..be081ccabea3 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -172,6 +172,22 @@ static void fsm_irq(struct vfio_ccw_private *private,
> complete(private->completion);
> }
>
> +static void fsm_update_subch(struct vfio_ccw_private *private,
> + enum vfio_ccw_event event)
> +{
> + struct subchannel *sch;
> +
> + sch = private->sch;
> + if (cio_update_schib(sch)) {
> + private->schib_region.cc = 3;
> + return;
> + }
> +
> + private->schib_region.cc = 0;
> + memcpy(private->schib_region.schib_area, &sch->schib,
> + sizeof(sch->schib));
> +}
> +
> /*
> * Device statemachine
> */
> @@ -180,25 +196,30 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_nop,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_disabled_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_STANDBY] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_IDLE] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_request,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_BOXED] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> [VFIO_CCW_STATE_BUSY] = {
> [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> [VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
> [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> },
> };
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> index 41eeb57d68a3..9528fce2e7d9 100644
> --- a/drivers/s390/cio/vfio_ccw_ops.c
> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> @@ -13,6 +13,11 @@
>
> #include "vfio_ccw_private.h"
>
> +#define VFIO_CCW_OFFSET_SHIFT 40
> +#define VFIO_CCW_OFFSET_TO_INDEX(off) (off >> VFIO_CCW_OFFSET_SHIFT)
> +#define VFIO_CCW_INDEX_TO_OFFSET(index) ((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
> +#define VFIO_CCW_OFFSET_MASK (((u64)(1) << VFIO_CCW_OFFSET_SHIFT) - 1)
> +
> static int vfio_ccw_mdev_reset(struct mdev_device *mdev)
> {
> struct vfio_ccw_private *private;
> @@ -168,14 +173,31 @@ static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
> loff_t *ppos)
> {
> struct vfio_ccw_private *private;
> - struct ccw_io_region *region;
> + void *region;
> + u32 index;
> + loff_t pos;
> +
> + index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
> + pos = (*ppos) & VFIO_CCW_OFFSET_MASK;
> + private = dev_get_drvdata(mdev_parent_dev(mdev));
>
> - if (*ppos + count > sizeof(*region))
> + switch (index) {
> + case VFIO_CCW_CONFIG_REGION_INDEX:
> + if (pos + count > sizeof(struct ccw_io_region))
> + return -EINVAL;
> + region = &private->io_region;
> + break;
> + case VFIO_CCW_SCHIB_REGION_INDEX:
> + if (pos + count > sizeof(struct ccw_schib_region))
> + return -EINVAL;
> + region = &private->schib_region;
> + vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_UPDATE_SUBCH);
> + break;
> + default:
> return -EINVAL;
> + }
>
> - private = dev_get_drvdata(mdev_parent_dev(mdev));
> - region = &private->io_region;
> - if (copy_to_user(buf, (void *)region + *ppos, count))
> + if (copy_to_user(buf, region + pos, count))
> return -EFAULT;
>
> return count;
> @@ -187,23 +209,35 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> loff_t *ppos)
> {
> struct vfio_ccw_private *private;
> - struct ccw_io_region *region;
> -
> - if (*ppos + count > sizeof(*region))
> - return -EINVAL;
> + u32 index;
> + loff_t pos;
>
> + index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
> + pos = (*ppos) & VFIO_CCW_OFFSET_MASK;
> private = dev_get_drvdata(mdev_parent_dev(mdev));
> - if (private->state != VFIO_CCW_STATE_IDLE)
> - return -EACCES;
>
> - region = &private->io_region;
> - if (copy_from_user((void *)region + *ppos, buf, count))
> - return -EFAULT;
> -
> - vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
> - if (region->ret_code != 0) {
> - private->state = VFIO_CCW_STATE_IDLE;
> - return region->ret_code;
> + switch (index) {
> + case VFIO_CCW_CONFIG_REGION_INDEX:
> + {
> + struct ccw_io_region *region;
> + if (pos + count > sizeof(*region))
> + return -EINVAL;
> + if (private->state != VFIO_CCW_STATE_IDLE)
> + return -EACCES;
> + region = &private->io_region;
> + if (copy_from_user((void *)region + *ppos, buf, count))
> + return -EFAULT;
> + vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
> + if (region->ret_code != 0) {
> + private->state = VFIO_CCW_STATE_IDLE;
> + return region->ret_code;
> + }
> + break;
> + }
> + case VFIO_CCW_SCHIB_REGION_INDEX:
> + return -EOPNOTSUPP;
> + default:
> + return -EINVAL;
> }
>
> return count;
> @@ -224,11 +258,16 @@ static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
> {
> switch (info->index) {
> case VFIO_CCW_CONFIG_REGION_INDEX:
> - info->offset = 0;
> + info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
> info->size = sizeof(struct ccw_io_region);
> info->flags = VFIO_REGION_INFO_FLAG_READ
> | VFIO_REGION_INFO_FLAG_WRITE;
> return 0;
> + case VFIO_CCW_SCHIB_REGION_INDEX:
> + info->offset = VFIO_CCW_INDEX_TO_OFFSET(info->index);
> + info->size = sizeof(struct ccw_schib_region);
> + info->flags = VFIO_REGION_INFO_FLAG_READ;
> + return 0;
> default:
> return -EINVAL;
> }
> diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> index 78a66d96756b..460c8b90d834 100644
> --- a/drivers/s390/cio/vfio_ccw_private.h
> +++ b/drivers/s390/cio/vfio_ccw_private.h
> @@ -28,6 +28,7 @@
> * @mdev: pointer to the mediated device
> * @nb: notifier for vfio events
> * @io_region: MMIO region to input/output I/O arguments/results
> + * @schib_region: MMIO region of subchannel information
> * @cp: channel program for the current I/O operation
> * @irb: irb info received from interrupt
> * @scsw: scsw info
> @@ -42,6 +43,7 @@ struct vfio_ccw_private {
> struct mdev_device *mdev;
> struct notifier_block nb;
> struct ccw_io_region io_region;
> + struct ccw_schib_region schib_region;
>
> struct channel_program cp;
> struct irb irb;
Hi,
I have a problem with these patches: you have 3 definitions for the
subchannel status word:
1: direct in the scsw entry of the vfio_ccw_private structure
2: indirect in the IRB structure irb
3: now in the scsw of the schib_region
How do you keep them all in sync?
The direct scsw in io region seems to only serve as a trigger used by
userland, while
the IRB in the io region and the SCHIB in the schib region are updated
asynchronously,
from hardware, one by polling (scsw in schib region), one by IRQ (scsw
in irb in io region).
I find this at least a source for obfuscation.
Regards,
Pierre
> @@ -76,6 +78,7 @@ enum vfio_ccw_event {
> VFIO_CCW_EVENT_NOT_OPER,
> VFIO_CCW_EVENT_IO_REQ,
> VFIO_CCW_EVENT_INTERRUPT,
> + VFIO_CCW_EVENT_UPDATE_SUBCH,
> /* last element! */
> NR_VFIO_CCW_EVENTS
> };
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index e3301dbd27d4..c60244debf71 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -457,6 +457,7 @@ enum {
>
> enum {
> VFIO_CCW_CONFIG_REGION_INDEX,
> + VFIO_CCW_SCHIB_REGION_INDEX,
> VFIO_CCW_NUM_REGIONS
> };
>
> diff --git a/include/uapi/linux/vfio_ccw.h b/include/uapi/linux/vfio_ccw.h
> index 2ec5f367ff78..12508ef6e6fc 100644
> --- a/include/uapi/linux/vfio_ccw.h
> +++ b/include/uapi/linux/vfio_ccw.h
> @@ -22,4 +22,10 @@ struct ccw_io_region {
> __u32 ret_code;
> } __packed;
>
> +struct ccw_schib_region {
> + __u8 cc;
> +#define SCHIB_AREA_SIZE 52
> + __u8 schib_area[SCHIB_AREA_SIZE];
> +} __packed;
> +
> #endif
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On 15/01/2018 09:57, Dong Jia Shi wrote:
> * Cornelia Huck <[email protected]> [2018-01-11 11:54:22 +0100]:
>
>> On Thu, 11 Jan 2018 04:04:18 +0100
>> Dong Jia Shi <[email protected]> wrote:
>>
>>> Hi Folks,
>>>
>>> Background
>>> ==========
>>>
>>> Some days ago, we had a discussion on the topic of channel path virtualization.
>>> Ref:
>>> Subject: [PATCH 0/3] Channel Path realted CRW generation
>>> Message-Id: <[email protected]>
>>> URL: https://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg08414.html
>>>
>>> Indeed that thread is not short and discussed many aspects in a
>>> non-concentrated manner. The parts those are most valuable to me are:
>>> 1. a re-modelling for channel path is surely the best offer, but is not
>>> possible to have in the near future.
>>> 2. to enhance the path related functionalities, using PNO and PNOM might
>>> be something we can do for now. This may be something that I could improve
>>> without model related arguments.
>>>
>>> So here I have this series targeting to add basic channel path event handling
>>> for vfio-ccw -- no touch of the channel path modelling in both the kernel and
>>> the QEMU side, but find a way to sync path status change to guest lazily using
>>> SCSW_FLAGS_MASK_PNO and pmcw->pnom. In short, I want to enhance path related
>>> stuff (to be more specific: sync up path status to the guest) on a best effort
>>> basis, which means in a way that won't get us invloed to do channel path
>>> re-modelling.
>> The guest should also get the updated PIM/PAM/POM, shouldn't it?
>>
> Yes. The following values will be updated for the guest:
> PMCW:
> - PIM/PAM/POM
> - PNOM
> - CHPIDs
> SCSW
> - PNOM bit
>
> See vfio_ccw_update_schib in patch #4 of the QEMU series.
>
>>> What benifit can we get from this? The administrator of a virtual machine can
>>> get uptodate (in some extent) status of the current using channel paths, so
>>> he/she can monitor paths status and get path problem noticed timely (see the
>>> example below).
>>>
>>> I think we can start a new round discussion based on this series. So reviewers
>>> can give their comments based on some code, and then we can decide if this is
>>> we want or not.
>>>
>>> As flagged with RFC, the intention of this series is to show what I have for
>>> now, and what could the code look like in general. Thus I can get some early
>>> feedbacks. I would expect to see opinions on:
>>> - is the target (mentioned above) of this series welcomed or not.
>> It certainly makes sense to have a way to get an updated schib.
>>
> :)
I think so too, if the guest's administrator wants to be able to do
something.
But I would like to see something about path virtualization.
Having more accurate information on hardware without virtualization is a
big handicap for migration and hotplug.
Regards,
Pierre
--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
On Mon, 15 Jan 2018 14:43:09 +0800
Dong Jia Shi <[email protected]> wrote:
> * Cornelia Huck <[email protected]> [2018-01-11 15:16:59 +0100]:
>
> Hi Conny,
>
> > On Thu, 11 Jan 2018 04:04:19 +0100
> > Dong Jia Shi <[email protected]> wrote:
> >
> > > This introduces a new region for vfio-ccw to provide subchannel
> > > information for user space.
> > >
> > > Signed-off-by: Dong Jia Shi <[email protected]>
> > > ---
> > > drivers/s390/cio/vfio_ccw_fsm.c | 21 ++++++++++
> > > drivers/s390/cio/vfio_ccw_ops.c | 79 +++++++++++++++++++++++++++----------
> > > drivers/s390/cio/vfio_ccw_private.h | 3 ++
> > > include/uapi/linux/vfio.h | 1 +
> > > include/uapi/linux/vfio_ccw.h | 6 +++
> > > 5 files changed, 90 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> > > index c30420c517b1..be081ccabea3 100644
> > > --- a/drivers/s390/cio/vfio_ccw_fsm.c
> > > +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> > > @@ -172,6 +172,22 @@ static void fsm_irq(struct vfio_ccw_private *private,
> > > complete(private->completion);
> > > }
> > >
> > > +static void fsm_update_subch(struct vfio_ccw_private *private,
> > > + enum vfio_ccw_event event)
> > > +{
> > > + struct subchannel *sch;
> > > +
> > > + sch = private->sch;
> > > + if (cio_update_schib(sch)) {
> >
> > This implies device gone. Do we also want to trigger some event, or
> > just wait until a machine check comes around and we're notified in the
> > normal way? (Probably the latter.)
> >
> We'd need to handle machine checks better anyway, and we can trigger
> event there. I think we can choose the latter one.
Agreed. We can tackle the whole machine check complex later, especially
as it also has implications for interacting with user space.
>
> > > + private->schib_region.cc = 3;
> > > + return;
> > > + }
> > > +
> > > + private->schib_region.cc = 0;
> > > + memcpy(private->schib_region.schib_area, &sch->schib,
> > > + sizeof(sch->schib));
> >
> > We might want to add documentation that schib_area contains the schib
> > from the last successful invocation of stsch (if any). That makes sense
> > as the schib remains unchanged for cc=3 after stsch anyway, but it
> > can't hurt to spell it out.
> >
> PoP doesn't say anything about the content of SCHIB when cc=3. So it's
> fine to remain the last content I guess. I can add comments here and
> document in vfio-ccw.txt. Ok?
The PoP says "Condition code 3 is set, and no other action is taken" -
I'd interpret this as "no content is changed, but you probably should
not look at that storage area". I'd hope that the caller does not look
at the contents for cc 3, but it's a good idea to document this.
>
> > > +}
> > > +
> > > /*
> > > * Device statemachine
> > > */
> > > @@ -180,25 +196,30 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
> > > [VFIO_CCW_EVENT_NOT_OPER] = fsm_nop,
> > > [VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
> > > [VFIO_CCW_EVENT_INTERRUPT] = fsm_disabled_irq,
> > > + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> > > },
> > > [VFIO_CCW_STATE_STANDBY] = {
> > > [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> > > [VFIO_CCW_EVENT_IO_REQ] = fsm_io_error,
> > > [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> > > + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> > > },
> > > [VFIO_CCW_STATE_IDLE] = {
> > > [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> > > [VFIO_CCW_EVENT_IO_REQ] = fsm_io_request,
> > > [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> > > + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> > > },
> > > [VFIO_CCW_STATE_BOXED] = {
> > > [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> > > [VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
> > > [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> > > + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> > > },
> > > [VFIO_CCW_STATE_BUSY] = {
> > > [VFIO_CCW_EVENT_NOT_OPER] = fsm_notoper,
> > > [VFIO_CCW_EVENT_IO_REQ] = fsm_io_busy,
> > > [VFIO_CCW_EVENT_INTERRUPT] = fsm_irq,
> > > + [VFIO_CCW_EVENT_UPDATE_SUBCH] = fsm_update_subch,
> >
> > Does it makes to trigger this through the state machine if we always do
> > the same action and never change state?
> Yes. Ah, are you implying that we can call update_subch directly without
> state machine involved? If so, I agree. There seems no benifit to add
> a new VFIO_CCW_EVENT_UPDATE_SUBCH event to the FSM.
Yes, that's what I meant. Whatever makes the code easy to understand.
On Mon, 15 Jan 2018 10:50:00 +0100
Pierre Morel <[email protected]> wrote:
> On 11/01/2018 04:04, Dong Jia Shi wrote:
> > This introduces a new region for vfio-ccw to provide subchannel
> > information for user space.
> >
> > Signed-off-by: Dong Jia Shi <[email protected]>
> > ---
> > drivers/s390/cio/vfio_ccw_fsm.c | 21 ++++++++++
> > drivers/s390/cio/vfio_ccw_ops.c | 79 +++++++++++++++++++++++++++----------
> > drivers/s390/cio/vfio_ccw_private.h | 3 ++
> > include/uapi/linux/vfio.h | 1 +
> > include/uapi/linux/vfio_ccw.h | 6 +++
> > 5 files changed, 90 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
> > index 78a66d96756b..460c8b90d834 100644
> > --- a/drivers/s390/cio/vfio_ccw_private.h
> > +++ b/drivers/s390/cio/vfio_ccw_private.h
> > @@ -28,6 +28,7 @@
> > * @mdev: pointer to the mediated device
> > * @nb: notifier for vfio events
> > * @io_region: MMIO region to input/output I/O arguments/results
> > + * @schib_region: MMIO region of subchannel information
> > * @cp: channel program for the current I/O operation
> > * @irb: irb info received from interrupt
> > * @scsw: scsw info
> > @@ -42,6 +43,7 @@ struct vfio_ccw_private {
> > struct mdev_device *mdev;
> > struct notifier_block nb;
> > struct ccw_io_region io_region;
> > + struct ccw_schib_region schib_region;
> >
> > struct channel_program cp;
> > struct irb irb;
>
> Hi,
>
> I have a problem with these patches: you have 3 definitions for the
> subchannel status word:
> 1: direct in the scsw entry of the vfio_ccw_private structure
> 2: indirect in the IRB structure irb
> 3: now in the scsw of the schib_region
>
> How do you keep them all in sync?
>
> The direct scsw in io region seems to only serve as a trigger used by
> userland, while
> the IRB in the io region and the SCHIB in the schib region are updated
> asynchronously,
> from hardware, one by polling (scsw in schib region), one by IRQ (scsw
> in irb in io region).
>
> I find this at least a source for obfuscation.
I agree that this wants some more documentation.
However, some of this structure duplication is a consequence of the
hardware design, because the scsw is present in both the schib (updated
by stsch) and the irb (updated by tsch). There are cases where the irb
is simply not enough (it does not contain a pmcw, and you can only do
tsch on an enabled subchannel). So I think that we really need two
structures for those two distinct operations (and everything
interfacing with this needs to keep synced on the scsw, as current
users of stsch/tsch already need to do now).
The direct scsw entry is used for initiating the different functions
(only start right now), I don't really see a good alternative for that
(and I also don't really see any problem with needed syncing.)
On Tue, 16 Jan 2018 11:16:27 +0800
Dong Jia Shi <[email protected]> wrote:
> * Pierre Morel <[email protected]> [2018-01-15 11:21:47 +0100]:
>
> > On 15/01/2018 09:57, Dong Jia Shi wrote:
> > >* Cornelia Huck <[email protected]> [2018-01-11 11:54:22 +0100]:
> > >
> > >>On Thu, 11 Jan 2018 04:04:18 +0100
> > >>Dong Jia Shi <[email protected]> wrote:
> > >>
> > >>>Hi Folks,
> > >>>
> > >>>Background
> > >>>==========
> > >>>
> > >>>Some days ago, we had a discussion on the topic of channel path virtualization.
> > >>>Ref:
> > >>>Subject: [PATCH 0/3] Channel Path realted CRW generation
> > >>>Message-Id: <[email protected]>
> > >>>URL: https://lists.nongnu.org/archive/html/qemu-devel/2017-07/msg08414.html
> > >>>
> > >>>Indeed that thread is not short and discussed many aspects in a
> > >>>non-concentrated manner. The parts those are most valuable to me are:
> > >>>1. a re-modelling for channel path is surely the best offer, but is not
> > >>> possible to have in the near future.
> > >>>2. to enhance the path related functionalities, using PNO and PNOM might
> > >>> be something we can do for now. This may be something that I could improve
> > >>> without model related arguments.
> > >>>
> > >>>So here I have this series targeting to add basic channel path event handling
> > >>>for vfio-ccw -- no touch of the channel path modelling in both the kernel and
> > >>>the QEMU side, but find a way to sync path status change to guest lazily using
> > >>>SCSW_FLAGS_MASK_PNO and pmcw->pnom. In short, I want to enhance path related
> > >>>stuff (to be more specific: sync up path status to the guest) on a best effort
> > >>>basis, which means in a way that won't get us invloed to do channel path
> > >>>re-modelling.
> > >>The guest should also get the updated PIM/PAM/POM, shouldn't it?
> > >>
> > >Yes. The following values will be updated for the guest:
> > >PMCW:
> > > - PIM/PAM/POM
> > > - PNOM
> > > - CHPIDs
> > >SCSW
> > > - PNOM bit
> > >
> > >See vfio_ccw_update_schib in patch #4 of the QEMU series.
> > >
> > >>>What benifit can we get from this? The administrator of a virtual machine can
> > >>>get uptodate (in some extent) status of the current using channel paths, so
> > >>>he/she can monitor paths status and get path problem noticed timely (see the
> > >>>example below).
> > >>>
> > >>>I think we can start a new round discussion based on this series. So reviewers
> > >>>can give their comments based on some code, and then we can decide if this is
> > >>>we want or not.
> > >>>
> > >>>As flagged with RFC, the intention of this series is to show what I have for
> > >>>now, and what could the code look like in general. Thus I can get some early
> > >>>feedbacks. I would expect to see opinions on:
> > >>>- is the target (mentioned above) of this series welcomed or not.
> > >>It certainly makes sense to have a way to get an updated schib.
> > >>
> > >:)
> >
> > I think so too, if the guest's administrator wants to be able to do
> > something.
> >
> > But I would like to see something about path virtualization.
> Me too... As pointed in the discussion thread (URL listed above), this
> is something that really hard to have in the near future. The question
> is do we want some enhancements like this without channel path
> re-modelling, or we want nothing until we have the re-modelling firstly?
I consider the ability to grab an updated schib useful not only for
path-related stuff, but for getting the whole content of it updated;
this makes the interface interesting even in the future.
And I think everybody wants more path virtualization, but that's not
going to be easy.
>
> > Having more accurate information on hardware without virtualization is a
> > big handicap for migration and hotplug.
> >
> vfio-ccw does not support migration. What could be the handicap for
> that? :^)
>
Heh :)
Actually, thinking about migration has been on my to-do list for a
while; unfortunately, it's not alone there. (I fully expect the items
on my to-do list to hold tea parties so they don't get bored.)