2018-09-14 07:40:10

by Dongli Zhang

Subject: Introduce xenwatch multithreading (mtwatch)

Hi,

This patch set introduces xenwatch multithreading (mtwatch) based on the
below xen summit 2018 design session notes:

https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html


xenwatch_thread is a single kernel thread that processes the callback
functions of subscribed xenwatch events one after another. The xenwatch
thread is stalled in 'D' state if any callback function is stalled and
uninterruptible.

domU creation/destruction fails while xenwatch is stalled in 'D' state, as
the paravirtual driver init/uninit cannot complete. Usually, the only
option is to reboot the dom0 server, unless there is a solution/workaround
to move forward and complete the stalled xenwatch event callback function.
Below is the output of 'xl create' when xenwatch is stalled (the issue is
reproduced on purpose by hooking netif_receive_skb() to intercept an
sk_buff sent out from vifX.Y on dom0 with patch at
https://github.com/finallyjustice/patchset/blob/master/xenwatch-stall-by-vif.patch):

# xl create pv.cfg
Parsing config from pv.cfg
libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to add device with path /local/domain/0/backend/vbd/2/51712
libxl: error: libxl_create.c:1278:domcreate_launch_dm: Domain 2:unable to add disk devices
libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to remove device with path /local/domain/0/backend/vbd/2/51712
libxl: error: libxl_domain.c:1073:devices_destroy_cb: Domain 2:libxl__devices_destroy failed
libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 2:Non-existant domain
libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 2:Unable to destroy guest
libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 2:Destruction of domain failed


The idea of this patch set is to create a per-domU xenwatch thread for each
domid. The per-domid thread is created when the 1st pv backend device (for
this domid and with xenwatch multithreading enabled) is created, while this
thread is destroyed when the last such pv backend device is removed. A
per-domid xs_watch_event is never put on the default event list; it is put
directly on the per-domid event list.
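The routing rule above can be modeled in a few lines of userspace C. This
is only a sketch with hypothetical names (route_event, domid_events), not
the kernel code itself; the bounded array stands in for "a per-domid
thread exists for this domid":

```c
#include <assert.h>

#define MAX_DOMS 4  /* stand-in for "a per-domid thread exists" */

static int default_events;          /* events on the default list */
static int domid_events[MAX_DOMS];  /* events on per-domid lists  */

/* Route one watch event: a per-domid event is appended to that
 * domid's own list and never to the default event list; anything
 * without a live per-domid thread falls back to the default list. */
static void route_event(unsigned int domid)
{
    if (domid != 0 && domid < MAX_DOMS)
        domid_events[domid]++;      /* handled by xen-mtwatch-<domid> */
    else
        default_events++;           /* handled by default xenwatch */
}
```

A stalled callback for one domU therefore only blocks that domU's own
list, leaving the default thread and the other per-domid threads free.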


For more details, please refer to the xen summit 2018 design session notes
and presentation slides:

https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
http://www.donglizhang.org/xenwatch_multithreading.pdf

----------------------------------------------------------------

Dongli Zhang (6):
xenbus: prepare data structures and parameter for xenwatch multithreading
xenbus: implement the xenwatch multithreading framework
xenbus: dispatch per-domU watch event to per-domU xenwatch thread
xenbus: process otherend_watch event at 'state' entry in xenwatch multithreading
xenbus: process be_watch events in xenwatch multithreading
drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver

Documentation/admin-guide/kernel-parameters.txt | 3 +
drivers/block/xen-blkback/xenbus.c | 3 +-
drivers/net/xen-netback/xenbus.c | 1 +
drivers/xen/xenbus/xenbus_probe.c | 24 +-
drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++
drivers/xen/xenbus/xenbus_xs.c | 357 +++++++++++++++++++++++-
include/xen/xenbus.h | 70 +++++
7 files changed, 484 insertions(+), 6 deletions(-)

Thank you very much!

Dongli Zhang



2018-09-14 07:34:24

by Dongli Zhang

Subject: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

This is the 1st patch of a (6-patch) patch set.

This patch set of six patches introduces xenwatch multithreading (or
multithreaded xenwatch, abbreviated as 'mtwatch') to the dom0 kernel. In
addition to the existing single xenwatch thread, each domU has its own
kernel thread ([xen-mtwatch-<domid>]) to process its xenwatch events.

A kernel parameter 'xen_mtwatch' is introduced to control whether the
feature is enabled during dom0 kernel boot. The feature is disabled by
default unless 'xen_mtwatch' is set on the kernel command line (e.g., in
grub). In addition, this patch also introduces the data structures that
maintain the status of each per-domU xenwatch thread. The status of each
xenwatch thread (except the default one) is maintained by a mtwatch domain.

The feature is available only on dom0.
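The per-domU threads introduced here are tracked in a hash table; the
patch hashes a domid into one of 256 buckets with the MTWATCH_HASH macro
below (copied from the diff), which can be exercised directly in
userspace:

```c
#include <assert.h>

/* Copied from the patch: domid -> bucket index in a 256-entry hash
 * table. MTWATCH_HASH_SIZE is a power of two, so the mask is
 * equivalent to domid % 256. */
#define MTWATCH_HASH_SIZE 256
#define MTWATCH_HASH(_id) ((int)(_id) & (MTWATCH_HASH_SIZE - 1))
```

Since domid_t is 16-bit, multiple domids can share a bucket, which is why
lookups walk a per-bucket hlist comparing the full domid.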

Signed-off-by: Dongli Zhang <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 3 ++
drivers/xen/xenbus/xenbus_xs.c | 31 ++++++++++++
include/xen/xenbus.h | 65 +++++++++++++++++++++++++
3 files changed, 99 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 64a3bf5..fc295ef 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4992,6 +4992,9 @@
the unplug protocol
never -- do not unplug even if version check succeeds

+ xen_mtwatch [KNL,XEN]
+ Enables the multithreaded xenwatch (mtwatch).
+
xen_nopvspin [X86,XEN]
Disables the ticketlock slowpath using Xen PV
optimizations.
diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
index 49a3874..3f137d2 100644
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -95,6 +95,19 @@ static pid_t xenwatch_pid;
static DEFINE_MUTEX(xenwatch_mutex);
static DECLARE_WAIT_QUEUE_HEAD(watch_events_waitq);

+bool xen_mtwatch;
+EXPORT_SYMBOL_GPL(xen_mtwatch);
+
+struct mtwatch_info *mtwatch_info;
+
+static bool param_xen_mtwatch;
+static __init int xen_parse_mtwatch(char *arg)
+{
+ param_xen_mtwatch = true;
+ return 0;
+}
+early_param("xen_mtwatch", xen_parse_mtwatch);
+
static void xs_suspend_enter(void)
{
spin_lock(&xs_state_lock);
@@ -929,6 +942,24 @@ int xs_init(void)
if (err)
return err;

+ if (xen_initial_domain() && param_xen_mtwatch) {
+ int i;
+
+ mtwatch_info = kmalloc(sizeof(*mtwatch_info), GFP_KERNEL);
+ if (!mtwatch_info)
+ return -ENOMEM;
+
+ for (i = 0; i < MTWATCH_HASH_SIZE; i++)
+ INIT_HLIST_HEAD(&mtwatch_info->domain_hash[i]);
+ spin_lock_init(&mtwatch_info->domain_lock);
+ INIT_LIST_HEAD(&mtwatch_info->domain_list);
+
+ spin_lock_init(&mtwatch_info->purge_lock);
+ INIT_LIST_HEAD(&mtwatch_info->purge_list);
+
+ xen_mtwatch = true;
+
+ pr_info("xenwatch multithreading is enabled\n");
+ }
+
task = kthread_run(xenwatch_thread, NULL, "xenwatch");
if (IS_ERR(task))
return PTR_ERR(task);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 869c816..e807114 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -62,6 +62,13 @@ struct xenbus_watch
/* Callback (executed in a process context with no locks held). */
void (*callback)(struct xenbus_watch *,
const char *path, const char *token);
+
+ /* Callback to help calculate the domid the path belongs to */
+ domid_t (*get_domid)(struct xenbus_watch *watch,
+ const char *path, const char *token);
+
+ /* The owner's domid if the watch is for a specific domain */
+ domid_t owner_id;
};


@@ -93,6 +100,7 @@ struct xenbus_device_id
struct xenbus_driver {
const char *name; /* defaults to ids[0].devicetype */
const struct xenbus_device_id *ids;
+ bool use_mtwatch;
int (*probe)(struct xenbus_device *dev,
const struct xenbus_device_id *id);
void (*otherend_changed)(struct xenbus_device *dev,
@@ -233,4 +241,61 @@ extern const struct file_operations xen_xenbus_fops;
extern struct xenstore_domain_interface *xen_store_interface;
extern int xen_store_evtchn;

+extern bool xen_mtwatch;
+
+#define MTWATCH_HASH_SIZE 256
+#define MTWATCH_HASH(_id) ((int)(_id)&(MTWATCH_HASH_SIZE-1))
+
+struct mtwatch_info {
+ /*
+ * The mtwatch_domain is put on both a hash table and a list.
+ * domain_list is used to optimize xenbus_watch un-registration.
+ *
+ * The mtwatch_domain is removed from domain_hash (with state set
+ * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
+ * left on domain_list until all events belonging to such a
+ * mtwatch_domain are processed in mtwatch_thread().
+ *
+ * While there may exist two mtwatch_domain with the same domid on
+ * domain_list simultaneously, all mtwatch_domain on domain_hash
+ * should have unique domids.
+ */
+ spinlock_t domain_lock;
+ struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
+ struct list_head domain_list;
+
+ /*
+ * When a per-domU kthread is going to be destroyed, it is put
+ * on the purge_list, and will be flushed by purge_work later.
+ */
+ struct work_struct purge_work;
+ spinlock_t purge_lock;
+ struct list_head purge_list;
+};
+
+enum mtwatch_domain_state {
+ MTWATCH_DOMAIN_UP = 1,
+ MTWATCH_DOMAIN_DOWN = 2,
+};
+
+struct mtwatch_domain {
+ domid_t domid;
+ struct task_struct *task;
+ atomic_t refcnt;
+
+ pid_t pid;
+ struct mutex domain_mutex;
+ struct rcu_head rcu;
+
+ struct hlist_node hash_node;
+ struct list_head list_node;
+ struct list_head purge_node;
+
+ wait_queue_head_t events_wq;
+
+ spinlock_t events_lock;
+ struct list_head events;
+ enum mtwatch_domain_state state;
+};
+
#endif /* _XEN_XENBUS_H */
--
2.7.4


2018-09-14 07:34:28

by Dongli Zhang

Subject: [PATCH 2/6] xenbus: implement the xenwatch multithreading framework

This is the 2nd patch of a (6-patch) patch set.

This patch implements the xenwatch multithreading framework to create or
destroy the per-domU xenwatch thread. The xenwatch thread is created or
destroyed during xenbus device probing or removing (that is,
xenbus_dev_probe() or xenbus_dev_remove()) if the corresponding pv driver
has the xenwatch multithreading feature enabled. As there is only one
per-domU xenwatch thread for each domid, probing another xenbus device for
the same domid does not create a second thread, but only increments the
reference count of the thread's mtwatch domain. When a xenbus device is
removed, the reference count is decremented. The per-domU xenwatch thread
is destroyed when the reference count of its mtwatch domain drops to zero,
that is, when all xenbus devices (whose mtwatch feature is enabled) of such
mtwatch domain are removed.

Therefore, a domid has its own per-domU xenwatch thread only when at least
one dom0 backend xenbus device whose pv driver has the feature enabled is
attached to it. A domid does not have its own xenwatch thread when it is
not running any mtwatch-enabled xenbus device.

When a watch (with xenwatch multithreading enabled) is unregistered, we
generally traverse all mtwatch domains to remove all inflight pending
events fired by that watch. However, one optimization in this patch is that
we only need to remove pending events from a specific mtwatch domain when
the watch is registered for a specific domid, that is, when its owner_id
field is non-zero.
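The probe/remove reference counting described above can be sketched as a
small userspace model (hypothetical names; the real code also manages the
kthread, RCU, and locking):

```c
#include <assert.h>

/* Userspace model of the per-domid thread lifecycle: the first
 * probe creates the domain with refcnt 1 and starts its thread,
 * later probes only increment the count, and the thread is torn
 * down when the last mtwatch-enabled device is removed. */
struct domain_model {
    int refcnt;
    int thread_up;   /* 1 while the per-domid kthread would run */
};

static void model_probe(struct domain_model *d)
{
    if (d->refcnt++ == 0)
        d->thread_up = 1;      /* first device: create the thread */
}

static void model_remove(struct domain_model *d)
{
    if (--d->refcnt == 0)
        d->thread_up = 0;      /* last device: destroy the thread */
}
```

In the kernel, "destroy the thread" is asynchronous: the domain is marked
MTWATCH_DOMAIN_DOWN, drained of pending events, then purged via
purge_work and call_rcu().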

Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/xen/xenbus/xenbus_probe.c | 6 +
drivers/xen/xenbus/xenbus_xs.c | 273 ++++++++++++++++++++++++++++++++++++++
include/xen/xenbus.h | 3 +
3 files changed, 282 insertions(+)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 5b47188..5755596 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -236,6 +236,9 @@ int xenbus_dev_probe(struct device *_dev)
if (err)
goto fail;

+ if (xen_mtwatch && drv->use_mtwatch)
+ mtwatch_create_domain(dev->otherend_id);
+
err = watch_otherend(dev);
if (err) {
dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
@@ -263,6 +266,9 @@ int xenbus_dev_remove(struct device *_dev)
if (drv->remove)
drv->remove(dev);

+ if (xen_mtwatch && drv->use_mtwatch)
+ mtwatch_put_domain(dev->otherend_id);
+
free_otherend_details(dev);

xenbus_switch_state(dev, XenbusStateClosed);
diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
index 3f137d2..741dc54 100644
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -108,6 +108,201 @@ static __init int xen_parse_mtwatch(char *arg)
}
early_param("xen_mtwatch", xen_parse_mtwatch);

+struct mtwatch_domain *mtwatch_find_domain(domid_t domid)
+{
+ struct mtwatch_domain *domain;
+ int hash = MTWATCH_HASH(domid);
+ struct hlist_head *hash_head = &mtwatch_info->domain_hash[hash];
+
+ hlist_for_each_entry_rcu(domain, hash_head, hash_node) {
+ if (domain->domid == domid)
+ return domain;
+ }
+
+ return NULL;
+}
+
+/* per-domU thread for xenwatch multithreading. */
+static int mtwatch_thread(void *arg)
+{
+ struct mtwatch_domain *domain = (struct mtwatch_domain *) arg;
+ struct list_head *ent;
+ struct xs_watch_event *event;
+
+ domain->pid = current->pid;
+
+ for (;;) {
+ wait_event_interruptible(domain->events_wq,
+ !list_empty(&domain->events) ||
+ domain->state == MTWATCH_DOMAIN_DOWN);
+
+ if (domain->state == MTWATCH_DOMAIN_DOWN &&
+ list_empty(&domain->events))
+ break;
+
+ mutex_lock(&domain->domain_mutex);
+
+ spin_lock(&domain->events_lock);
+ ent = domain->events.next;
+ if (ent != &domain->events)
+ list_del(ent);
+ spin_unlock(&domain->events_lock);
+
+ if (ent != &domain->events) {
+ event = list_entry(ent, struct xs_watch_event, list);
+ event->handle->callback(event->handle, event->path,
+ event->token);
+ kfree(event);
+ }
+
+ mutex_unlock(&domain->domain_mutex);
+ }
+
+ /*
+ * domain->state is already set to MTWATCH_DOMAIN_DOWN (so that no
+ * new event is added to domain->events) when the above for loop
+ * breaks, and there is no need to clean up domain->events again.
+ */
+
+ spin_lock(&mtwatch_info->domain_lock);
+ list_del_rcu(&domain->list_node);
+ spin_unlock(&mtwatch_info->domain_lock);
+
+ spin_lock(&mtwatch_info->purge_lock);
+ list_add(&domain->purge_node, &mtwatch_info->purge_list);
+ spin_unlock(&mtwatch_info->purge_lock);
+
+ schedule_work(&mtwatch_info->purge_work);
+
+ return 0;
+}
+
+static void delayed_destroy_domain(struct rcu_head *head)
+{
+ struct mtwatch_domain *domain;
+
+ domain = container_of(head, struct mtwatch_domain, rcu);
+ kfree(domain);
+}
+
+static void xen_mtwatch_purge_domain(struct work_struct *work)
+{
+ struct mtwatch_domain *domain;
+ struct list_head *node;
+
+ while (!list_empty(&mtwatch_info->purge_list)) {
+
+ spin_lock(&mtwatch_info->purge_lock);
+ node = mtwatch_info->purge_list.next;
+ if (node != &mtwatch_info->purge_list)
+ list_del(node);
+ spin_unlock(&mtwatch_info->purge_lock);
+
+ if (node != &mtwatch_info->purge_list) {
+ domain = list_entry(node, struct mtwatch_domain,
+ purge_node);
+ kthread_stop(domain->task);
+
+ call_rcu(&domain->rcu, delayed_destroy_domain);
+ }
+ }
+}
+
+/* Running in the context of default xenwatch kthread. */
+void mtwatch_create_domain(domid_t domid)
+{
+ struct mtwatch_domain *domain;
+
+ if (!domid) {
+ pr_err("Default xenwatch thread is for dom0\n");
+ return;
+ }
+
+ spin_lock(&mtwatch_info->domain_lock);
+
+ domain = mtwatch_find_domain(domid);
+ if (domain) {
+ atomic_inc(&domain->refcnt);
+ spin_unlock(&mtwatch_info->domain_lock);
+ return;
+ }
+
+ domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
+ if (!domain) {
+ spin_unlock(&mtwatch_info->domain_lock);
+ pr_err("Failed to allocate memory for mtwatch thread %d\n",
+ domid);
+ return;
+ }
+
+ domain->domid = domid;
+ atomic_set(&domain->refcnt, 1);
+ mutex_init(&domain->domain_mutex);
+ INIT_LIST_HEAD(&domain->purge_node);
+
+ init_waitqueue_head(&domain->events_wq);
+ spin_lock_init(&domain->events_lock);
+ INIT_LIST_HEAD(&domain->events);
+
+ list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
+
+ hlist_add_head_rcu(&domain->hash_node,
+ &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
+
+ spin_unlock(&mtwatch_info->domain_lock);
+
+ domain->task = kthread_run(mtwatch_thread, domain,
+ "xen-mtwatch-%d", domid);
+
+ if (IS_ERR(domain->task)) {
+ pr_err("mtwatch kthread creation failed\n");
+ domain->task = NULL;
+ domain->state = MTWATCH_DOMAIN_DOWN;
+
+ return;
+ }
+
+ domain->state = MTWATCH_DOMAIN_UP;
+}
+
+/* Running in the context of default xenwatch kthread. */
+void mtwatch_put_domain(domid_t domid)
+{
+ struct mtwatch_domain *domain;
+
+ spin_lock(&mtwatch_info->domain_lock);
+
+ domain = mtwatch_find_domain(domid);
+ if (!domain) {
+ spin_unlock(&mtwatch_info->domain_lock);
+ pr_err("mtwatch kthread for domid=%d does not exist\n",
+ domid);
+ return;
+ }
+
+ if (atomic_dec_and_test(&domain->refcnt)) {
+
+ hlist_del_rcu(&domain->hash_node);
+
+ if (!domain->task) {
+ /*
+ * As the task failed to initialize in
+ * mtwatch_create_domain(), there is no kernel
+ * thread to wait for.
+ */
+ list_del_rcu(&domain->list_node);
+ call_rcu(&domain->rcu, delayed_destroy_domain);
+ } else {
+ spin_lock(&domain->events_lock);
+ domain->state = MTWATCH_DOMAIN_DOWN;
+ spin_unlock(&domain->events_lock);
+
+ wake_up(&domain->events_wq);
+ }
+ }
+
+ spin_unlock(&mtwatch_info->domain_lock);
+}
+
static void xs_suspend_enter(void)
{
spin_lock(&xs_state_lock);
@@ -793,6 +988,80 @@ int register_xenbus_watch(struct xenbus_watch *watch)
}
EXPORT_SYMBOL_GPL(register_xenbus_watch);

+static void __unregister_single_mtwatch(struct xenbus_watch *watch,
+ struct mtwatch_domain *domain)
+{
+ struct xs_watch_event *event, *tmp;
+
+ if (current->pid != domain->pid)
+ mutex_lock(&domain->domain_mutex);
+
+ spin_lock(&domain->events_lock);
+ list_for_each_entry_safe(event, tmp,
+ &domain->events, list) {
+ if (event->handle != watch)
+ continue;
+ list_del(&event->list);
+ kfree(event);
+ }
+ spin_unlock(&domain->events_lock);
+
+ if (current->pid != domain->pid)
+ mutex_unlock(&domain->domain_mutex);
+}
+
+static void unregister_single_mtwatch(struct xenbus_watch *watch,
+ domid_t domid)
+{
+ struct mtwatch_domain *domain;
+ bool found = false;
+
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
+ list_node) {
+ if (domain->domid == domid) {
+ found = true;
+ __unregister_single_mtwatch(watch, domain);
+ }
+ }
+
+ WARN_ON_ONCE(unlikely(!found));
+
+ rcu_read_unlock();
+}
+
+static void unregister_all_mtwatch(struct xenbus_watch *watch)
+{
+ struct mtwatch_domain *domain;
+
+ rcu_read_lock();
+
+ list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
+ list_node) {
+ __unregister_single_mtwatch(watch, domain);
+ }
+
+ rcu_read_unlock();
+}
+
+static void unregister_mtwatch(struct xenbus_watch *watch)
+{
+ /*
+ * Generally, to unregister a watch, we need to traverse all
+ * mtwatch domains to remove all inflight pending watch events
+ * fired by that watch.
+ *
+ * One exception is that we only need to remove pending watch
+ * events from a single mtwatch domain when the watch is registered
+ * for a specific domid.
+ */
+ if (watch->owner_id)
+ unregister_single_mtwatch(watch, watch->owner_id);
+ else
+ unregister_all_mtwatch(watch);
+}
+
void unregister_xenbus_watch(struct xenbus_watch *watch)
{
struct xs_watch_event *event, *tmp;
@@ -831,6 +1100,9 @@ void unregister_xenbus_watch(struct xenbus_watch *watch)

if (current->pid != xenwatch_pid)
mutex_unlock(&xenwatch_mutex);
+
+ if (xen_mtwatch && watch->get_domid)
+ unregister_mtwatch(watch);
}
EXPORT_SYMBOL_GPL(unregister_xenbus_watch);

@@ -954,6 +1226,7 @@ int xs_init(void)

spin_lock_init(&mtwatch_info->purge_lock);
INIT_LIST_HEAD(&mtwatch_info->purge_list);
+ INIT_WORK(&mtwatch_info->purge_work, xen_mtwatch_purge_domain);

xen_mtwatch = true;

diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index e807114..4ac2cee 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -241,6 +241,9 @@ extern const struct file_operations xen_xenbus_fops;
extern struct xenstore_domain_interface *xen_store_interface;
extern int xen_store_evtchn;

+void mtwatch_create_domain(domid_t domid);
+void mtwatch_put_domain(domid_t domid);
+
extern bool xen_mtwatch;

#define MTWATCH_HASH_SIZE 256
--
2.7.4


2018-09-14 07:34:33

by Dongli Zhang

Subject: [PATCH 4/6] xenbus: process otherend_watch event at 'state' entry in xenwatch multithreading

This is the 4th patch of a (6-patch) patch set.

With this patch, watch events in the absolute path pattern
'/local/domain/<domid>/device/<pvdev>/<handle>/state' can be processed in
the per-domU xenwatch thread.

Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/xen/xenbus/xenbus_probe.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 5755596..ba0644c 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -129,11 +129,27 @@ static int talk_to_otherend(struct xenbus_device *dev)
}


+static domid_t otherend_get_domid(struct xenbus_watch *watch,
+ const char *path,
+ const char *token)
+{
+ struct xenbus_device *xendev =
+ container_of(watch, struct xenbus_device, otherend_watch);
+
+ return xendev->otherend_id;
+}
+

static int watch_otherend(struct xenbus_device *dev)
{
struct xen_bus_type *bus =
container_of(dev->dev.bus, struct xen_bus_type, bus);
+ struct xenbus_driver *drv = to_xenbus_driver(dev->dev.driver);
+
+ if (xen_mtwatch && drv->use_mtwatch) {
+ dev->otherend_watch.get_domid = otherend_get_domid;
+ dev->otherend_watch.owner_id = dev->otherend_id;
+ }

return xenbus_watch_pathfmt(dev, &dev->otherend_watch,
bus->otherend_changed,
--
2.7.4


2018-09-14 07:34:36

by Dongli Zhang

Subject: [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

This is the 5th patch of a (6-patch) patch set.

With this patch, watch events in the relative path pattern
'backend/<pvdev>/<domid>/...' can be processed in the per-domU xenwatch
thread.
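The parsing helpers this patch adds can be lifted almost verbatim into
userspace to check the extraction of <domid> from a backend path (domid_t
and the helper names mirror the diff; the get_domid signature is
simplified to take only the path):

```c
#include <assert.h>
#include <string.h>

typedef unsigned short domid_t;

/* Count occurrences of 'c' in 'str' (as in xenbus_probe.c). */
static unsigned int char_count(const char *str, char c)
{
    unsigned int i, ret = 0;

    for (i = 0; str[i]; i++)
        if (str[i] == c)
            ret++;
    return ret;
}

/* Parse a leading decimal run into a domid; stops at the first
 * non-digit. (d << 3) + (d << 1) in the patch is just d * 10. */
static domid_t path_to_domid(const char *p)
{
    domid_t domid = 0;

    while (*p >= '0' && *p <= '9') {
        domid = domid * 10 + (*p - '0');
        p++;
    }
    return domid;
}

/* path: backend/<pvdev>/<domid>/... */
static domid_t be_get_domid(const char *path)
{
    const char *p = path;

    if (char_count(path, '/') < 2)
        return 0;

    p = strchr(p, '/') + 1;   /* skip 'backend/' */
    p = strchr(p, '/') + 1;   /* skip '<pvdev>/' */
    return path_to_domid(p);
}
```

Returning 0 for malformed paths falls back to the default xenwatch
thread, since domid 0 never has a per-domid thread.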

Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/xen/xenbus/xenbus_probe.c | 2 +-
drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
include/xen/xenbus.h | 2 ++
3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index ba0644c..aa1b15a 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
}
EXPORT_SYMBOL_GPL(xenbus_probe_devices);

-static unsigned int char_count(const char *str, char c)
+unsigned int char_count(const char *str, char c)
{
unsigned int i, ret = 0;

diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
index b0bed4f..50df86a 100644
--- a/drivers/xen/xenbus/xenbus_probe_backend.c
+++ b/drivers/xen/xenbus/xenbus_probe_backend.c
@@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
xenbus_dev_changed(path, &xenbus_backend);
}

+static domid_t path_to_domid(const char *path)
+{
+ const char *p = path;
+ domid_t domid = 0;
+
+ while (*p) {
+ if (*p < '0' || *p > '9')
+ break;
+ domid = (domid << 3) + (domid << 1) + (*p - '0');
+ p++;
+ }
+
+ return domid;
+}
+
+/* path: backend/<pvdev>/<domid>/... */
+static domid_t be_get_domid(struct xenbus_watch *watch,
+ const char *path,
+ const char *token)
+{
+ const char *p = path;
+
+ if (char_count(path, '/') < 2)
+ return 0;
+
+ p = strchr(p, '/') + 1;
+ p = strchr(p, '/') + 1;
+
+ return path_to_domid(p);
+}
+
static struct xenbus_watch be_watch = {
.node = "backend",
.callback = backend_changed,
+ .get_domid = be_get_domid,
};

static int read_frontend_details(struct xenbus_device *xendev)
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 4ac2cee..79d784e 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -241,6 +241,8 @@ extern const struct file_operations xen_xenbus_fops;
extern struct xenstore_domain_interface *xen_store_interface;
extern int xen_store_evtchn;

+unsigned int char_count(const char *str, char c);
+
void mtwatch_create_domain(domid_t domid);
void mtwatch_put_domain(domid_t domid);

--
2.7.4


2018-09-14 07:34:41

by Dongli Zhang

Subject: [PATCH 6/6] drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver

This is the 6th patch of a (6-patch) patch set.

As 'use_mtwatch' for xen-netback and xen-blkback is set to true, probing
any xenbus device of these two drivers creates the per-domU xenwatch
thread for the domid the new device belongs to, or increments the
reference count of the existing thread.

Xenwatch multithreading might be enabled for more xen backend pv drivers in
the future.

Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/block/xen-blkback/xenbus.c | 3 ++-
drivers/net/xen-netback/xenbus.c | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index a4bc74e..debbbd0 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -1108,7 +1108,8 @@ static struct xenbus_driver xen_blkbk_driver = {
.ids = xen_blkbk_ids,
.probe = xen_blkbk_probe,
.remove = xen_blkbk_remove,
- .otherend_changed = frontend_changed
+ .otherend_changed = frontend_changed,
+ .use_mtwatch = true,
};

int xen_blkif_xenbus_init(void)
diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index cd51492..63d46a7 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -1203,6 +1203,7 @@ static struct xenbus_driver netback_driver = {
.remove = netback_remove,
.uevent = netback_uevent,
.otherend_changed = frontend_changed,
+ .use_mtwatch = true,
};

int xenvif_xenbus_init(void)
--
2.7.4


2018-09-14 07:34:49

by Dongli Zhang

Subject: [PATCH 3/6] xenbus: dispatch per-domU watch event to per-domU xenwatch thread

This is the 3rd patch of a (6-patch) patch set.

This patch dispatches a watch event to the per-domU xenwatch thread when
the event meets all of the conditions below:

1. The watch is registered to use the xenwatch multithreading feature and
its get_domid() method returns a valid domid.
2. There is a xenwatch thread (mtwatch domain) available for the domid
obtained from the get_domid() method.
3. The xenwatch thread was forked successfully by kthread_run() during
initialization and therefore its state is MTWATCH_DOMAIN_UP.
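The three conditions collapse into a single predicate; a userspace sketch
with hypothetical names (use_per_domu_thread is not a kernel symbol):

```c
#include <assert.h>

/* Model of the dispatch decision: an event goes to a per-domU
 * thread only if the watch yields a nonzero domid, a mtwatch
 * domain exists for it, and that domain's thread came up.
 * Everything else is handled by the default xenwatch thread. */
enum state { DOMAIN_ABSENT, DOMAIN_UP, DOMAIN_DOWN };

static int use_per_domu_thread(unsigned int domid, enum state s)
{
    return domid != 0 && s == DOMAIN_UP;
}
```

In the patch this check is split between xs_watch_msg() (domid lookup)
and xs_watch_insert_event() (domain lookup and state test under
events_lock), with a fallback to the default watch_events list whenever
it fails.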

Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/xen/xenbus/xenbus_xs.c | 53 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 49 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
index 741dc54..7335e19 100644
--- a/drivers/xen/xenbus/xenbus_xs.c
+++ b/drivers/xen/xenbus/xenbus_xs.c
@@ -896,6 +896,32 @@ static struct xenbus_watch *find_watch(const char *token)
return NULL;
}

+static int xs_watch_insert_event(struct xs_watch_event *event, domid_t domid)
+{
+ struct mtwatch_domain *domain;
+ int ret = -1;
+
+ rcu_read_lock();
+
+ domain = mtwatch_find_domain(domid);
+ if (!domain) {
+ rcu_read_unlock();
+ return ret;
+ }
+
+ spin_lock(&domain->events_lock);
+ if (domain->state == MTWATCH_DOMAIN_UP) {
+ list_add_tail(&event->list, &domain->events);
+ wake_up(&domain->events_wq);
+ ret = 0;
+ }
+ spin_unlock(&domain->events_lock);
+
+ rcu_read_unlock();
+
+ return ret;
+}
+
int xs_watch_msg(struct xs_watch_event *event)
{
if (count_strings(event->body, event->len) != 2) {
@@ -908,10 +934,29 @@ int xs_watch_msg(struct xs_watch_event *event)
spin_lock(&watches_lock);
event->handle = find_watch(event->token);
if (event->handle != NULL) {
- spin_lock(&watch_events_lock);
- list_add_tail(&event->list, &watch_events);
- wake_up(&watch_events_waitq);
- spin_unlock(&watch_events_lock);
+ domid_t domid = 0;
+
+ if (xen_mtwatch && event->handle->get_domid)
+ domid = event->handle->get_domid(event->handle,
+ event->path,
+ event->token);
+
+ /*
+ * The event is processed by default xenwatch thread if:
+ *
+ * 1. The watch does not use xenwatch multithreading.
+ * 2. There is no per-domU xenwatch thread (or mtwatch
+ * domain) available for this domid.
+ * 3. The per-domU xenwatch thread is not created
+ * successfully by kthread_run() during initialization.
+ */
+ if (!(domid &&
+ xs_watch_insert_event(event, domid) == 0)) {
+ spin_lock(&watch_events_lock);
+ list_add_tail(&event->list, &watch_events);
+ wake_up(&watch_events_waitq);
+ spin_unlock(&watch_events_lock);
+ }
} else
kfree(event);
spin_unlock(&watches_lock);
--
2.7.4


2018-09-14 08:12:41

by Paul Durrant

Subject: RE: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

> -----Original Message-----
> From: Dongli Zhang [mailto:[email protected]]
> Sent: 14 September 2018 08:34
> To: [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Paul Durrant
> <[email protected]>; Wei Liu <[email protected]>;
> [email protected]; Roger Pau Monne <[email protected]>;
> [email protected]
> Subject: [PATCH 1/6] xenbus: prepare data structures and parameter for
> xenwatch multithreading
>
> This is the 1st patch of a (6-patch) patch set.
>
> This patch set of six patches introduces xenwatch multithreading (or
> multithreaded xenwatch, abbreviated as 'mtwatch') to dom0 kernel. In
> addition to the existing single xenwatch thread, each domU has its own
> kernel thread ([xen-mtwatch-<domid>]) to process its xenwatch event.
>

^ You need to put comments like this in a cover letter. Each patch needs to stand on its own merit and the comments should only relate to the context of that patch or a 'subsequent patch'.

> A kernel parameter 'xen_mtwatch' is introduced to control whether the
> feature is enabled or not during dom0 kernel boot. The feature is disabled
> by default if 'xen_mtwatch' is not set in grub.

Why is it disabled by default? Concerns about resource consumption?

> In addition, this patch
> also introduces the data structures to maintain the status of each per-
> domU
> xenwatch thread. The status of each xenwatch thread (except the default
> one) is maintained by a mtwatch domain.
>
> The feature is available only on dom0.

Whilst I can see it is intended for a backend domain, why limit it to dom0? What about driver domains?

>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 3 ++
> drivers/xen/xenbus/xenbus_xs.c | 31 ++++++++++++
> include/xen/xenbus.h | 65
> +++++++++++++++++++++++++
> 3 files changed, 99 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> index 64a3bf5..fc295ef 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4992,6 +4992,9 @@
> the unplug protocol
> never -- do not unplug even if version check succeeds
>
> + xen_mtwatch [KNL,XEN]
> + Enables the multithreaded xenwatch (mtwatch).
> +
> xen_nopvspin [X86,XEN]
> Disables the ticketlock slowpath using Xen PV
> optimizations.
> diff --git a/drivers/xen/xenbus/xenbus_xs.c
> b/drivers/xen/xenbus/xenbus_xs.c
> index 49a3874..3f137d2 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -95,6 +95,19 @@ static pid_t xenwatch_pid;
> static DEFINE_MUTEX(xenwatch_mutex);
> static DECLARE_WAIT_QUEUE_HEAD(watch_events_waitq);
>
> +bool xen_mtwatch;
> +EXPORT_SYMBOL_GPL(xen_mtwatch);
> +
> +struct mtwatch_info *mtwatch_info;
> +
> +static bool param_xen_mtwatch;
> +static __init int xen_parse_mtwatch(char *arg)
> +{
> + param_xen_mtwatch = true;
> + return 0;
> +}
> +early_param("xen_mtwatch", xen_parse_mtwatch);
> +
> static void xs_suspend_enter(void)
> {
> spin_lock(&xs_state_lock);
> @@ -929,6 +942,24 @@ int xs_init(void)
> if (err)
> return err;
>
> + if (xen_initial_domain() && param_xen_mtwatch) {
> + int i;
> +
> + mtwatch_info = kmalloc(sizeof(*mtwatch_info), GFP_KERNEL);
> +
> + for (i = 0; i < MTWATCH_HASH_SIZE; i++)
> + INIT_HLIST_HEAD(&mtwatch_info->domain_hash[i]);
> + spin_lock_init(&mtwatch_info->domain_lock);
> + INIT_LIST_HEAD(&mtwatch_info->domain_list);
> +
> + spin_lock_init(&mtwatch_info->purge_lock);
> + INIT_LIST_HEAD(&mtwatch_info->purge_list);
> +
> + xen_mtwatch = true;
> +
> + pr_info("xenwatch multithreading is enabled\n");
> + }
> +
> task = kthread_run(xenwatch_thread, NULL, "xenwatch");
> if (IS_ERR(task))
> return PTR_ERR(task);
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index 869c816..e807114 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -62,6 +62,13 @@ struct xenbus_watch
> /* Callback (executed in a process context with no locks held). */
> void (*callback)(struct xenbus_watch *,
> const char *path, const char *token);
> +
> + /* Callback to help calculate the domid the path belongs to */
> + domid_t (*get_domid)(struct xenbus_watch *watch,
> + const char *path, const char *token);
> +
> + /* The owner's domid if the watch is for a specific domain */
> + domid_t owner_id;
> };
>
>
> @@ -93,6 +100,7 @@ struct xenbus_device_id
> struct xenbus_driver {
> const char *name; /* defaults to ids[0].devicetype */
> const struct xenbus_device_id *ids;
> + bool use_mtwatch;
> int (*probe)(struct xenbus_device *dev,
> const struct xenbus_device_id *id);
> void (*otherend_changed)(struct xenbus_device *dev,
> @@ -233,4 +241,61 @@ extern const struct file_operations xen_xenbus_fops;
> extern struct xenstore_domain_interface *xen_store_interface;
> extern int xen_store_evtchn;
>
> +extern bool xen_mtwatch;
> +
> +#define MTWATCH_HASH_SIZE 256
> +#define MTWATCH_HASH(_id) ((int)(_id)&(MTWATCH_HASH_SIZE-1))
> +
> +struct mtwatch_info {
> + /*
> + * The mtwatch_domain is put on both a hash table and a list.
> + * domain_list is used to optimize xenbus_watch un-registration.
> + *
> + * The mtwatch_domain is removed from domain_hash (with state set
> + * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
> + * left on domain_list until all events belonging to such a
> + * mtwatch_domain are processed in mtwatch_thread().
> + *
> + * While there may exist two mtwatch_domain with the same domid on
> + * domain_list simultaneously, all mtwatch_domain on domain_hash
> + * should have unique domid.
> + */
> + spinlock_t domain_lock;
> + struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
> + struct list_head domain_list;
> +
> + /*
> + * When a per-domU

'per-frontend-domain' to be more descriptive?

Paul

> kthread is going to be destroyed, it is put
> + * on the purge_list, and will be flushed by purge_work later.
> + */
> + struct work_struct purge_work;
> + spinlock_t purge_lock;
> + struct list_head purge_list;
> +};
> +
> +enum mtwatch_domain_state {
> + MTWATCH_DOMAIN_UP = 1,
> + MTWATCH_DOMAIN_DOWN = 2,
> +};
> +
> +struct mtwatch_domain {
> + domid_t domid;
> + struct task_struct *task;
> + atomic_t refcnt;
> +
> + pid_t pid;
> + struct mutex domain_mutex;
> + struct rcu_head rcu;
> +
> + struct hlist_node hash_node;
> + struct list_head list_node;
> + struct list_head purge_node;
> +
> + wait_queue_head_t events_wq;
> +
> + spinlock_t events_lock;
> + struct list_head events;
> + enum mtwatch_domain_state state;
> +};
> +
> #endif /* _XEN_XENBUS_H */
> --
> 2.7.4
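The `MTWATCH_HASH` macro quoted above yields a valid bucket index only because `MTWATCH_HASH_SIZE` is a power of two, which makes the bitwise AND equivalent to `domid % MTWATCH_HASH_SIZE`. A minimal userspace check of the macro exactly as quoted (the wrapper function is added here only for testing):

```c
#include <assert.h>

/* Copied from the quoted include/xen/xenbus.h hunk. */
#define MTWATCH_HASH_SIZE 256
#define MTWATCH_HASH(_id) ((int)(_id)&(MTWATCH_HASH_SIZE-1))

/* Hypothetical wrapper, not part of the series; exists so the
 * macro's behaviour can be exercised from a test. */
int mtwatch_hash_of(unsigned int domid)
{
	return MTWATCH_HASH(domid);
}
```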


2018-09-14 08:17:48

by Paul Durrant

[permalink] [raw]
Subject: RE: Introduce xenwatch multithreading (mtwatch)

> -----Original Message-----
> From: Dongli Zhang [mailto:[email protected]]
> Sent: 14 September 2018 08:34
> To: [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Paul Durrant
> <[email protected]>; Wei Liu <[email protected]>;
> [email protected]; Roger Pau Monne <[email protected]>;
> [email protected]
> Subject: Introduce xenwatch multithreading (mtwatch)
>
> Hi,
>
> This patch set introduces xenwatch multithreading (mtwatch) based on the
> below xen summit 2018 design session notes:

Ah, here is the cover letter... just not labelled 0/6.

>
> https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
>
>
> xenwatch_thread is a single kernel thread that successively processes the
> callback functions of subscribed xenwatch events. xenwatch is stalled in
> 'D' state if any callback function stalls and is uninterruptible.
>
> The domU

I think you should use 'frontend'. It is perfectly possible that a frontend could run in dom0 if, say, the system were using a driver domain.

> create/destroy fails if xenwatch is stalled in 'D' state, as
> the paravirtual driver init/uninit cannot complete. Usually, the only
> option is to reboot the dom0 server unless there is a solution/workaround to

Similarly 'backend' instead of 'dom0', pointing out that this is bad because it is normal for a single backend domain to serve multiple frontend domains, and the PV protocol is not (yet) re-startable if a backend goes away.

Paul

> move forward and complete the stalled xenwatch event callback function.
> Below is the output of 'xl create' when xenwatch is stalled (the issue is
> reproduced on purpose by hooking netif_receive_skb() to intercept an
> sk_buff sent out from vifX.Y on dom0 with patch at
> https://github.com/finallyjustice/patchset/blob/master/xenwatch-stall-by-vif.patch):
>
> # xl create pv.cfg
> Parsing config from pv.cfg
> libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to add device with path /local/domain/0/backend/vbd/2/51712
> libxl: error: libxl_create.c:1278:domcreate_launch_dm: Domain 2:unable to add disk devices
> libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to remove device with path /local/domain/0/backend/vbd/2/51712
> libxl: error: libxl_domain.c:1073:devices_destroy_cb: Domain 2:libxl__devices_destroy failed
> libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 2:Non-existant domain
> libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 2:Unable to destroy guest
> libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 2:Destruction of domain failed
>
>
> The idea of this patch set is to create a per-domU xenwatch thread for each
> domid. The per-domid thread is created when the 1st pv backend device (for
> this domid and with xenwatch multithreading enabled) is created, while this
> thread is destroyed when the last pv backend device (for this domid and
> with xenwatch multithreading enabled) is removed. Per-domid xs_watch_event
> is never put on the default event list, but is put on the per-domid event
> list directly.
>
>
> For more details, please refer to the xen summit 2018 design session notes
> and presentation slides:
>
> https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
> http://www.donglizhang.org/xenwatch_multithreading.pdf
>
> ----------------------------------------------------------------
>
> Dongli Zhang (6):
> xenbus: prepare data structures and parameter for xenwatch
> multithreading
> xenbus: implement the xenwatch multithreading framework
> xenbus: dispatch per-domU watch event to per-domU xenwatch thread
> xenbus: process otherend_watch event at 'state' entry in xenwatch
> multithreading
> xenbus: process be_watch events in xenwatch multithreading
> drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver
>
> Documentation/admin-guide/kernel-parameters.txt | 3 +
> drivers/block/xen-blkback/xenbus.c | 3 +-
> drivers/net/xen-netback/xenbus.c | 1 +
> drivers/xen/xenbus/xenbus_probe.c | 24 +-
> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++
> drivers/xen/xenbus/xenbus_xs.c | 357 +++++++++++++++++++++++-
> include/xen/xenbus.h | 70 +++++
> 7 files changed, 484 insertions(+), 6 deletions(-)
>
> Thank you very much!
>
> Dongli Zhang


2018-09-14 08:33:05

by Jürgen Groß

[permalink] [raw]
Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 1st patch of a (6-patch) patch set.
>
> This patch set of six patches introduces xenwatch multithreading (or
> multithreaded xenwatch, abbreviated as 'mtwatch') to dom0 kernel. In
> addition to the existing single xenwatch thread, each domU has its own
> kernel thread ([xen-mtwatch-<domid>]) to process its xenwatch event.
>
> A kernel parameter 'xen_mtwatch' is introduced to control whether the
> feature is enabled or not during dom0 kernel boot. The feature is disabled
> by default if 'xen_mtwatch' is not set in grub. In addition, this patch
> also introduces the data structures to maintain the status of each per-domU
> xenwatch thread. The status of each xenwatch thread (except the default
> one) is maintained by a mtwatch domain.
>
> The feature is available only on dom0.
>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 3 ++
> drivers/xen/xenbus/xenbus_xs.c | 31 ++++++++++++
> include/xen/xenbus.h | 65 +++++++++++++++++++++++++
> 3 files changed, 99 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 64a3bf5..fc295ef 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4992,6 +4992,9 @@
> the unplug protocol
> never -- do not unplug even if version check succeeds
>
> + xen_mtwatch [KNL,XEN]
> + Enables the multithreaded xenwatch (mtwatch).
> +
> xen_nopvspin [X86,XEN]
> Disables the ticketlock slowpath using Xen PV
> optimizations.
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index 49a3874..3f137d2 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -95,6 +95,19 @@ static pid_t xenwatch_pid;
> static DEFINE_MUTEX(xenwatch_mutex);
> static DECLARE_WAIT_QUEUE_HEAD(watch_events_waitq);
>
> +bool xen_mtwatch;
> +EXPORT_SYMBOL_GPL(xen_mtwatch);
> +
> +struct mtwatch_info *mtwatch_info;
> +
> +static bool param_xen_mtwatch;
> +static __init int xen_parse_mtwatch(char *arg)
> +{
> + param_xen_mtwatch = true;
> + return 0;
> +}
> +early_param("xen_mtwatch", xen_parse_mtwatch);

Add a Kconfig item to set the default when building the kernel? We can
start with default "off", but later we might want to enable this feature
as the default.
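A sketch of what such a Kconfig item might look like (the symbol name, dependency, and placement are hypothetical, not part of the posted series):

```
config XEN_MTWATCH_DEFAULT
	bool "Enable xenwatch multithreading (mtwatch) by default"
	depends on XEN_BACKEND
	default n
	help
	  If enabled, the per-frontend xenwatch threads are active
	  without passing xen_mtwatch on the kernel command line.
```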

> +
> static void xs_suspend_enter(void)
> {
> spin_lock(&xs_state_lock);
> @@ -929,6 +942,24 @@ int xs_init(void)
> if (err)
> return err;
>
> + if (xen_initial_domain() && param_xen_mtwatch) {

Wouldn't it be better to drop the test for xen_initial_domain() and do
this initialization only when a caller (backend) is requesting the
multithread mode? This would avoid wasting memory in case it isn't going
to be used and - more importantly - would support driver domains.


Juergen

2018-09-14 08:46:45

by Paul Durrant

[permalink] [raw]
Subject: RE: [PATCH 2/6] xenbus: implement the xenwatch multithreading framework

> -----Original Message-----
> From: Dongli Zhang [mailto:[email protected]]
> Sent: 14 September 2018 08:34
> To: [email protected]; [email protected]
> Cc: [email protected]; [email protected]; Paul Durrant
> <[email protected]>; Wei Liu <[email protected]>;
> [email protected]; Roger Pau Monne <[email protected]>;
> [email protected]
> Subject: [PATCH 2/6] xenbus: implement the xenwatch multithreading
> framework
>
> This is the 2nd patch of a (6-patch) patch set.
>
> This patch implements the xenwatch multithreading framework to create or
> destroy the per-domU xenwatch thread. The xenwatch thread is created or
> destroyed during xenbus device probing or removing (that is,
> xenbus_dev_probe() or xenbus_dev_remove()) if the corresponding pv driver
> has xenwatch multithreading feature enabled. As there is only one single
> per-domU xenwatch thread for each domid, probing the xenbus device for the
> same domid again would not create the thread for the same domid again, but
> only increment the reference count of the thread's mtwatch domain. When a
> xenbus device is removed, the reference count is decremented. The per-domU
> xenwatch thread is destroyed when the reference count of its mtwatch
> domain
> is zero, that is, all xenbus devices (whose mtwatch feature is enabled) of
> such mtwatch domain are removed.
>
> Therefore, a domid has its own per-domU xenwatch thread only when it is
> attached with dom0 backend xenbus device whose pv driver has the feature
> enabled. The domid would not have its own xenwatch thread when it is not
> running any mtwatch-enabled xenbus device.
>
> When a watch (with xenwatch multithreading enabled) is unregistered, we
> will generally traverse all mtwatch domains to remove all inflight pending
> events fired by such watch. However, one optimization in this patch is we
> only need to remove pending events from a specific mtwatch domain when the
> watch is registered for a specific domid, that is, when its owner_id field
> is non-zero.
>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> drivers/xen/xenbus/xenbus_probe.c | 6 +
> drivers/xen/xenbus/xenbus_xs.c | 273 ++++++++++++++++++++++++++
> include/xen/xenbus.h | 3 +
> 3 files changed, 282 insertions(+)
>
> diff --git a/drivers/xen/xenbus/xenbus_probe.c
> b/drivers/xen/xenbus/xenbus_probe.c
> index 5b47188..5755596 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -236,6 +236,9 @@ int xenbus_dev_probe(struct device *_dev)
> if (err)
> goto fail;
>
> + if (xen_mtwatch && drv->use_mtwatch)
> + mtwatch_create_domain(dev->otherend_id);

mtwatch_get_domain()? (Since you have mtwatch_put_domain() below).

> +
> err = watch_otherend(dev);
> if (err) {
> dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
> @@ -263,6 +266,9 @@ int xenbus_dev_remove(struct device *_dev)
> if (drv->remove)
> drv->remove(dev);
>
> + if (xen_mtwatch && drv->use_mtwatch)
> + mtwatch_put_domain(dev->otherend_id);
> +
> free_otherend_details(dev);
>
> xenbus_switch_state(dev, XenbusStateClosed);
> diff --git a/drivers/xen/xenbus/xenbus_xs.c
> b/drivers/xen/xenbus/xenbus_xs.c
> index 3f137d2..741dc54 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -108,6 +108,201 @@ static __init int xen_parse_mtwatch(char *arg)
> }
> early_param("xen_mtwatch", xen_parse_mtwatch);
>
> +struct mtwatch_domain *mtwatch_find_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> + int hash = MTWATCH_HASH(domid);
> + struct hlist_head *hash_head = &mtwatch_info->domain_hash[hash];
> +
> + hlist_for_each_entry_rcu(domain, hash_head, hash_node) {
> + if (domain->domid == domid)
> + return domain;
> + }
> +
> + return NULL;
> +}
> +
> +/* per-domU thread for xenwatch multithreading. */
> +static int mtwatch_thread(void *arg)
> +{
> + struct mtwatch_domain *domain = (struct mtwatch_domain *) arg;
> + struct list_head *ent;
> + struct xs_watch_event *event;
> +
> + domain->pid = current->pid;
> +
> + for (;;) {
> + wait_event_interruptible(domain->events_wq,
> + !list_empty(&domain->events) ||
> + domain->state == MTWATCH_DOMAIN_DOWN);
> +
> + if (domain->state == MTWATCH_DOMAIN_DOWN &&
> + list_empty(&domain->events))
> + break;
> +
> + mutex_lock(&domain->domain_mutex);
> +
> + spin_lock(&domain->events_lock);
> + ent = domain->events.next;
> + if (ent != &domain->events)
> + list_del(ent);
> + spin_unlock(&domain->events_lock);
> +
> + if (ent != &domain->events) {
> + event = list_entry(ent, struct xs_watch_event, list);
> + event->handle->callback(event->handle, event->path,
> + event->token);
> + kfree(event);
> + }
> +
> + mutex_unlock(&domain->domain_mutex);
> + }
> +
> + /*
> + * domain->state is already set to MTWATCH_DOMAIN_DOWN (to avoid
> + * new events being added to domain->events) when the above for
> + * loop breaks, so there is no need to clean up domain->events again.
> + */
> +
> + spin_lock(&mtwatch_info->domain_lock);
> + list_del_rcu(&domain->list_node);
> + spin_unlock(&mtwatch_info->domain_lock);
> +
> + spin_lock(&mtwatch_info->purge_lock);
> + list_add(&domain->purge_node, &mtwatch_info->purge_list);
> + spin_unlock(&mtwatch_info->purge_lock);
> +
> + schedule_work(&mtwatch_info->purge_work);
> +
> + return 0;
> +}
> +
> +static void delayed_destroy_domain(struct rcu_head *head)
> +{
> + struct mtwatch_domain *domain;
> +
> + domain = container_of(head, struct mtwatch_domain, rcu);
> + kfree(domain);
> +}
> +
> +static void xen_mtwatch_purge_domain(struct work_struct *work)
> +{
> + struct mtwatch_domain *domain;
> + struct list_head *node;
> +
> + while (!list_empty(&mtwatch_info->purge_list)) {
> +
> + spin_lock(&mtwatch_info->purge_lock);
> + node = mtwatch_info->purge_list.next;
> + if (node != &mtwatch_info->purge_list)
> + list_del(node);
> + spin_unlock(&mtwatch_info->purge_lock);
> +
> + if (node != &mtwatch_info->purge_list) {
> + domain = list_entry(node, struct mtwatch_domain,
> + purge_node);
> + kthread_stop(domain->task);
> +
> + call_rcu(&domain->rcu, delayed_destroy_domain);
> + }
> + }
> +}
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_create_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> +
> + if (!domid) {
> + pr_err("Default xenwatch thread is for dom0\n");
> + return;
> + }
> +
> + spin_lock(&mtwatch_info->domain_lock);
> +
> + domain = mtwatch_find_domain(domid);
> + if (domain) {
> + atomic_inc(&domain->refcnt);
> + spin_unlock(&mtwatch_info->domain_lock);
> + return;
> + }
> +
> + domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
> + if (!domain) {
> + spin_unlock(&mtwatch_info->domain_lock);
> + pr_err("Failed to allocate memory for mtwatch thread %d\n",
> + domid);
> + return;
> + }
> +
> + domain->domid = domid;
> + atomic_set(&domain->refcnt, 1);
> + mutex_init(&domain->domain_mutex);
> + INIT_LIST_HEAD(&domain->purge_node);
> +
> + init_waitqueue_head(&domain->events_wq);
> + spin_lock_init(&domain->events_lock);
> + INIT_LIST_HEAD(&domain->events);
> +
> + list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
> +
> + hlist_add_head_rcu(&domain->hash_node,
> + &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
> +
> + spin_unlock(&mtwatch_info->domain_lock);
> +
> + domain->task = kthread_run(mtwatch_thread, domain,
> + "xen-mtwatch-%d", domid);
> +
> + if (!domain->task) {
> + pr_err("mtwatch kthread creation failed\n");
> + domain->state = MTWATCH_DOMAIN_DOWN;
> +
> + return;
> + }
> +
> + domain->state = MTWATCH_DOMAIN_UP;
> +}
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_put_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> +
> + spin_lock(&mtwatch_info->domain_lock);
> +
> + domain = mtwatch_find_domain(domid);
> + if (!domain) {
> + spin_unlock(&mtwatch_info->domain_lock);
> + pr_err("mtwatch kthread for domid=%d does not exist\n",
> + domid);
> + return;
> + }
> +
> + if (atomic_dec_and_test(&domain->refcnt)) {
> +
> + hlist_del_rcu(&domain->hash_node);
> +
> + if (!domain->task) {
> + /*
> + * As the task failed to initialize during
> + * mtwatch_create_domain(), we do not need to wait
> + * for the kernel thread to complete.
> + */
> + list_del_rcu(&domain->list_node);
> + call_rcu(&domain->rcu, delayed_destroy_domain);
> + } else {
> + spin_lock(&domain->events_lock);
> + domain->state = MTWATCH_DOMAIN_DOWN;
> + spin_unlock(&domain->events_lock);
> +
> + wake_up(&domain->events_wq);
> + }
> + }
> +
> + spin_unlock(&mtwatch_info->domain_lock);
> +}
> +
> static void xs_suspend_enter(void)
> {
> spin_lock(&xs_state_lock);
> @@ -793,6 +988,80 @@ int register_xenbus_watch(struct xenbus_watch *watch)
> }
> EXPORT_SYMBOL_GPL(register_xenbus_watch);
>
> +static void __unregister_single_mtwatch(struct xenbus_watch *watch,
> + struct mtwatch_domain *domain)
> +{
> + struct xs_watch_event *event, *tmp;
> +
> + if (current->pid != domain->pid)
> + mutex_lock(&domain->domain_mutex);

Since you avoid the lock here, what's to stop another thread with a different pid now racing with this?

Paul

> +
> + spin_lock(&domain->events_lock);
> + list_for_each_entry_safe(event, tmp,
> + &domain->events, list) {
> + if (event->handle != watch)
> + continue;
> + list_del(&event->list);
> + kfree(event);
> + }
> + spin_unlock(&domain->events_lock);
> +
> + if (current->pid != domain->pid)
> + mutex_unlock(&domain->domain_mutex);
> +}
> +
> +static void unregister_single_mtwatch(struct xenbus_watch *watch,
> + domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> + bool found = false;
> +
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
> + list_node) {
> + if (domain->domid == domid) {
> + found = true;
> + __unregister_single_mtwatch(watch, domain);
> + }
> + }
> +
> + WARN_ON_ONCE(unlikely(!found));
> +
> + rcu_read_unlock();
> +}
> +
> +static void unregister_all_mtwatch(struct xenbus_watch *watch)
> +{
> + struct mtwatch_domain *domain;
> +
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
> + list_node) {
> + __unregister_single_mtwatch(watch, domain);
> + }
> +
> + rcu_read_unlock();
> +}
> +
> +static void unregister_mtwatch(struct xenbus_watch *watch)
> +{
> + /*
> + * Generally, to unregister a watch, we need to traverse all
> + * mtwatch domains to remove all inflight pending watch events for
> + * such watch.
> + *
> + * One exception is we only need to remove pending watch events
> + * from a single mtwatch domain when the watch is registered for a
> + * specific domid.
> + */
> + if (watch->owner_id)
> + unregister_single_mtwatch(watch, watch->owner_id);
> + else
> + unregister_all_mtwatch(watch);
> +}
> +
> void unregister_xenbus_watch(struct xenbus_watch *watch)
> {
> struct xs_watch_event *event, *tmp;
> @@ -831,6 +1100,9 @@ void unregister_xenbus_watch(struct xenbus_watch
> *watch)
>
> if (current->pid != xenwatch_pid)
> mutex_unlock(&xenwatch_mutex);
> +
> + if (xen_mtwatch && watch->get_domid)
> + unregister_mtwatch(watch);
> }
> EXPORT_SYMBOL_GPL(unregister_xenbus_watch);
>
> @@ -954,6 +1226,7 @@ int xs_init(void)
>
> spin_lock_init(&mtwatch_info->purge_lock);
> INIT_LIST_HEAD(&mtwatch_info->purge_list);
> + INIT_WORK(&mtwatch_info->purge_work,
> xen_mtwatch_purge_domain);
>
> xen_mtwatch = true;
>
> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
> index e807114..4ac2cee 100644
> --- a/include/xen/xenbus.h
> +++ b/include/xen/xenbus.h
> @@ -241,6 +241,9 @@ extern const struct file_operations xen_xenbus_fops;
> extern struct xenstore_domain_interface *xen_store_interface;
> extern int xen_store_evtchn;
>
> +void mtwatch_create_domain(domid_t domid);
> +void mtwatch_put_domain(domid_t domid);
> +
> extern bool xen_mtwatch;
>
> #define MTWATCH_HASH_SIZE 256
> --
> 2.7.4


2018-09-14 08:58:39

by Jürgen Groß

[permalink] [raw]
Subject: Re: [PATCH 2/6] xenbus: implement the xenwatch multithreading framework

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 2nd patch of a (6-patch) patch set.
>
> This patch implements the xenwatch multithreading framework to create or
> destroy the per-domU xenwatch thread. The xenwatch thread is created or
> destroyed during xenbus device probing or removing (that is,
> xenbus_dev_probe() or xenbus_dev_remove()) if the corresponding pv driver
> has xenwatch multithreading feature enabled. As there is only one single
> per-domU xenwatch thread for each domid, probing the xenbus device for the
> same domid again would not create the thread for the same domid again, but
> only increment the reference count of the thread's mtwatch domain. When a
> xenbus device is removed, the reference count is decremented. The per-domU
> xenwatch thread is destroyed when the reference count of its mtwatch domain
> is zero, that is, all xenbus devices (whose mtwatch feature is enabled) of
> such mtwatch domain are removed.
>
> Therefore, a domid has its own per-domU xenwatch thread only when it is
> attached with dom0 backend xenbus device whose pv driver has the feature
> enabled. The domid would not have its own xenwatch thread when it is not
> running any mtwatch-enabled xenbus device.
>
> When a watch (with xenwatch multithreading enabled) is unregistered, we
> will generally traverse all mtwatch domains to remove all inflight pending
> events fired by such watch. However, one optimization in this patch is we
> only need to remove pending events from a specific mtwatch domain when the
> watch is registered for a specific domid, that is, when its owner_id field
> is non-zero.
>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> drivers/xen/xenbus/xenbus_probe.c | 6 +
> drivers/xen/xenbus/xenbus_xs.c | 273 ++++++++++++++++++++++++++++++++++++++
> include/xen/xenbus.h | 3 +
> 3 files changed, 282 insertions(+)
>
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index 5b47188..5755596 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -236,6 +236,9 @@ int xenbus_dev_probe(struct device *_dev)
> if (err)
> goto fail;
>
> + if (xen_mtwatch && drv->use_mtwatch)
> + mtwatch_create_domain(dev->otherend_id);
> +
> err = watch_otherend(dev);
> if (err) {
> dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
> @@ -263,6 +266,9 @@ int xenbus_dev_remove(struct device *_dev)
> if (drv->remove)
> drv->remove(dev);
>
> + if (xen_mtwatch && drv->use_mtwatch)
> + mtwatch_put_domain(dev->otherend_id);
> +
> free_otherend_details(dev);
>
> xenbus_switch_state(dev, XenbusStateClosed);
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index 3f137d2..741dc54 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -108,6 +108,201 @@ static __init int xen_parse_mtwatch(char *arg)
> }
> early_param("xen_mtwatch", xen_parse_mtwatch);
>
> +struct mtwatch_domain *mtwatch_find_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> + int hash = MTWATCH_HASH(domid);
> + struct hlist_head *hash_head = &mtwatch_info->domain_hash[hash];
> +
> + hlist_for_each_entry_rcu(domain, hash_head, hash_node) {
> + if (domain->domid == domid)
> + return domain;
> + }
> +
> + return NULL;
> +}
> +
> +/* per-domU thread for xenwatch multithreading. */
> +static int mtwatch_thread(void *arg)
> +{
> + struct mtwatch_domain *domain = (struct mtwatch_domain *) arg;
> + struct list_head *ent;
> + struct xs_watch_event *event;
> +
> + domain->pid = current->pid;
> +
> + for (;;) {
> + wait_event_interruptible(domain->events_wq,
> + !list_empty(&domain->events) ||
> + domain->state == MTWATCH_DOMAIN_DOWN);
> +
> + if (domain->state == MTWATCH_DOMAIN_DOWN &&
> + list_empty(&domain->events))
> + break;
> +
> + mutex_lock(&domain->domain_mutex);
> +
> + spin_lock(&domain->events_lock);
> + ent = domain->events.next;
> + if (ent != &domain->events)
> + list_del(ent);
> + spin_unlock(&domain->events_lock);
> +
> + if (ent != &domain->events) {
> + event = list_entry(ent, struct xs_watch_event, list);
> + event->handle->callback(event->handle, event->path,
> + event->token);
> + kfree(event);
> + }
> +
> + mutex_unlock(&domain->domain_mutex);
> + }

This is nearly the same coding as xenwatch_thread(). Why don't you
define a new (sub-)structure containing the needed elements and move
xenwatch_pid, watch_events_waitq, watch_events, xenwatch_mutex,
watch_events_lock into such a structure? Then you could use a common
function for both purposes (you'd need to set state for
xenwatch_thread() to DOMAIN_UP and add a callback for testing the
thread end condition).
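A userspace sketch of that factoring, with hypothetical names, no locking, and the blocking waitqueue replaced by a simple drain: the default thread and a per-domain thread would run the same loop over a shared sub-structure, differing only in the end-condition callback:

```c
#include <assert.h>
#include <stdlib.h>

enum watch_state { DOMAIN_UP = 1, DOMAIN_DOWN = 2 };

struct watch_event {
	struct watch_event *next;
};

/* Hypothetical sub-structure holding what xenwatch_thread() and
 * mtwatch_thread() share: the event list, the thread state, and a
 * predicate telling the common loop when it may exit. */
struct watch_queue {
	struct watch_event *events;
	enum watch_state state;
	int (*can_stop)(struct watch_queue *wq);
	int processed;
};

static int domain_can_stop(struct watch_queue *wq)
{
	return wq->state == DOMAIN_DOWN; /* default thread would return 0 */
}

/* Common loop body: in the kernel this would block on the waitqueue
 * and take events_lock/domain_mutex; here it just drains the list. */
static void process_watch_events(struct watch_queue *wq)
{
	while (wq->events || !wq->can_stop(wq)) {
		struct watch_event *ev = wq->events;

		if (!ev)
			break;      /* sketch only: kernel code would sleep here */
		wq->events = ev->next;
		wq->processed++;    /* stands in for event->handle->callback() */
		free(ev);
	}
}

/* Usage example: a per-domain queue that is already DOWN drains its
 * pending events before the loop ends. */
static int run_demo(void)
{
	struct watch_queue wq = { .state = DOMAIN_DOWN,
				  .can_stop = domain_can_stop };
	int i;

	for (i = 0; i < 3; i++) {
		struct watch_event *ev = malloc(sizeof(*ev));

		ev->next = wq.events;
		wq.events = ev;
	}
	process_watch_events(&wq);
	return wq.processed;
}
```

The default xenwatch thread would then use the same structure with a `can_stop` that always returns 0, folding `xenwatch_mutex`, `watch_events`, and `watch_events_lock` into it, which is roughly what the review suggests.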

> +
> + /*
> + * domain->state is already set to MTWATCH_DOMAIN_DOWN (to avoid
> + * new event to domain->events) when above for loop breaks, so
> + * that there is no requirement to cleanup domain->events again.
> + */
> +
> + spin_lock(&mtwatch_info->domain_lock);
> + list_del_rcu(&domain->list_node);
> + spin_unlock(&mtwatch_info->domain_lock);
> +
> + spin_lock(&mtwatch_info->purge_lock);
> + list_add(&domain->purge_node, &mtwatch_info->purge_list);
> + spin_unlock(&mtwatch_info->purge_lock);
> +
> + schedule_work(&mtwatch_info->purge_work);
> +
> + return 0;
> +}
> +
> +static void delayed_destroy_domain(struct rcu_head *head)
> +{
> + struct mtwatch_domain *domain;
> +
> + domain = container_of(head, struct mtwatch_domain, rcu);
> + kfree(domain);
> +}
> +
> +static void xen_mtwatch_purge_domain(struct work_struct *work)
> +{
> + struct mtwatch_domain *domain;
> + struct list_head *node;
> +
> + while (!list_empty(&mtwatch_info->purge_list)) {
> +
> + spin_lock(&mtwatch_info->purge_lock);
> + node = mtwatch_info->purge_list.next;
> + if (node != &mtwatch_info->purge_list)
> + list_del(node);
> + spin_unlock(&mtwatch_info->purge_lock);
> +
> + if (node != &mtwatch_info->purge_list) {
> + domain = list_entry(node, struct mtwatch_domain,
> + purge_node);
> + kthread_stop(domain->task);
> +
> + call_rcu(&domain->rcu, delayed_destroy_domain);
> + }
> + }
> +}
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_create_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> +
> + if (!domid) {
> + pr_err("Default xenwatch thread is for dom0\n");

Should we really exclude dom0? What if a driver domain wants to support
a dom0-based frontend?

> + return;
> + }
> +
> + spin_lock(&mtwatch_info->domain_lock);
> +
> + domain = mtwatch_find_domain(domid);
> + if (domain) {
> + atomic_inc(&domain->refcnt);
> + spin_unlock(&mtwatch_info->domain_lock);
> + return;
> + }
> +
> + domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
> + if (!domain) {
> + spin_unlock(&mtwatch_info->domain_lock);
> + pr_err("Failed to allocate memory for mtwatch thread %d\n",
> + domid);

No alloc error messages, please!

> + return;
> + }
> +
> + domain->domid = domid;
> + atomic_set(&domain->refcnt, 1);
> + mutex_init(&domain->domain_mutex);
> + INIT_LIST_HEAD(&domain->purge_node);
> +
> + init_waitqueue_head(&domain->events_wq);
> + spin_lock_init(&domain->events_lock);
> + INIT_LIST_HEAD(&domain->events);
> +
> + list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
> +
> + hlist_add_head_rcu(&domain->hash_node,
> + &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
> +
> + spin_unlock(&mtwatch_info->domain_lock);
> +
> + domain->task = kthread_run(mtwatch_thread, domain,
> + "xen-mtwatch-%d", domid);
> +
> + if (!domain->task) {
> + pr_err("mtwatch kthread creation failed\n");
> + domain->state = MTWATCH_DOMAIN_DOWN;
> +
> + return;
> + }
> +
> + domain->state = MTWATCH_DOMAIN_UP;
> +}
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_put_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> +
> + spin_lock(&mtwatch_info->domain_lock);
> +
> + domain = mtwatch_find_domain(domid);
> + if (!domain) {
> + spin_unlock(&mtwatch_info->domain_lock);
> + pr_err("mtwatch kthread for domid=%d does not exist\n",
> + domid);
> + return;
> + }
> +
> + if (atomic_dec_and_test(&domain->refcnt)) {
> +
> + hlist_del_rcu(&domain->hash_node);
> +
> + if (!domain->task) {
> + /*
> + * As the task failed to initialize during
> + * mtwatch_create_domain(), we do not need to wait
> + * for the kernel thread to complete.
> + */
> + list_del_rcu(&domain->list_node);
> + call_rcu(&domain->rcu, delayed_destroy_domain);
> + } else {
> + spin_lock(&domain->events_lock);
> + domain->state = MTWATCH_DOMAIN_DOWN;
> + spin_unlock(&domain->events_lock);
> +
> + wake_up(&domain->events_wq);
> + }
> + }
> +
> + spin_unlock(&mtwatch_info->domain_lock);
> +}
> +
> static void xs_suspend_enter(void)
> {
> spin_lock(&xs_state_lock);
> @@ -793,6 +988,80 @@ int register_xenbus_watch(struct xenbus_watch *watch)
> }
> EXPORT_SYMBOL_GPL(register_xenbus_watch);
>
> +static void __unregister_single_mtwatch(struct xenbus_watch *watch,
> + struct mtwatch_domain *domain)
> +{
> + struct xs_watch_event *event, *tmp;
> +
> + if (current->pid != domain->pid)
> + mutex_lock(&domain->domain_mutex);
> +
> + spin_lock(&domain->events_lock);
> + list_for_each_entry_safe(event, tmp,
> + &domain->events, list) {
> + if (event->handle != watch)
> + continue;
> + list_del(&event->list);
> + kfree(event);
> + }
> + spin_unlock(&domain->events_lock);
> +
> + if (current->pid != domain->pid)
> + mutex_unlock(&domain->domain_mutex);
> +}
> +
> +static void unregister_single_mtwatch(struct xenbus_watch *watch,
> + domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> + bool found = false;
> +
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
> + list_node) {
> + if (domain->domid == domid) {
> + found = true;
> + __unregister_single_mtwatch(watch, domain);
> + }
> + }
> +
> + WARN_ON_ONCE(unlikely(!found));
> +
> + rcu_read_unlock();
> +}
> +
> +static void unregister_all_mtwatch(struct xenbus_watch *watch)
> +{
> + struct mtwatch_domain *domain;
> +
> + rcu_read_lock();
> +
> + list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
> + list_node) {
> + __unregister_single_mtwatch(watch, domain);
> + }
> +
> + rcu_read_unlock();
> +}
> +
> +static void unregister_mtwatch(struct xenbus_watch *watch)
> +{
> + /*
> + * Generally, to unregister a watch, we need to traverse all
> + * mtwatch domains to remove all inflight pending watch events
> + * for such a watch.
> + *
> + * One exception is we only need to remove pending watch events
> + * from a single mtwatch domain when the watch is registered for a
> + * specific domid.
> + */
> + if (watch->owner_id)

Again: 0 as a special value isn't a good idea. Maybe use one of the
reserved DOMIDs, like DOMID_SELF?


Juergen

2018-09-14 09:03:02

by Jürgen Groß

Subject: Re: [PATCH 3/6] xenbus: dispatch per-domU watch event to per-domU xenwatch thread

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 3rd patch of a (6-patch) patch set.
>
> This patch dispatches the watch event to per-domU xenwatch thread when the
> event meets all of below conditions:
>
> 1. The watch is registered to use the xenwatch multithreading feature and
> the get_domid() method returns a valid domid.
> 2. There is a xenwatch thread (mtwatch domain) available for the domid
> obtained from get_domid() method.
> 3. The xenwatch thread is forked successfully by kthread_run() during
> initialization and therefore its state should be MTWATCH_DOMAIN_UP.
>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> drivers/xen/xenbus/xenbus_xs.c | 53 ++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 49 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
> index 741dc54..7335e19 100644
> --- a/drivers/xen/xenbus/xenbus_xs.c
> +++ b/drivers/xen/xenbus/xenbus_xs.c
> @@ -896,6 +896,32 @@ static struct xenbus_watch *find_watch(const char *token)
> return NULL;
> }
>
> +static int xs_watch_insert_event(struct xs_watch_event *event, domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> + int ret = -1;
> +
> + rcu_read_lock();
> +
> + domain = mtwatch_find_domain(domid);
> + if (!domain) {
> + rcu_read_unlock();
> + return ret;
> + }
> +
> + spin_lock(&domain->events_lock);
> + if (domain->state == MTWATCH_DOMAIN_UP) {
> + list_add_tail(&event->list, &domain->events);
> + wake_up(&domain->events_wq);
> + ret = 0;
> + }
> + spin_unlock(&domain->events_lock);
> +
> + rcu_read_unlock();
> +
> + return ret;
> +}
> +
> int xs_watch_msg(struct xs_watch_event *event)
> {
> if (count_strings(event->body, event->len) != 2) {
> @@ -908,10 +934,29 @@ int xs_watch_msg(struct xs_watch_event *event)
> spin_lock(&watches_lock);
> event->handle = find_watch(event->token);
> if (event->handle != NULL) {
> - spin_lock(&watch_events_lock);
> - list_add_tail(&event->list, &watch_events);
> - wake_up(&watch_events_waitq);
> - spin_unlock(&watch_events_lock);
> + domid_t domid = 0;

I won't repeat it again (including following patches): use another
special value like DOMID_SELF.


Juergen

2018-09-14 09:04:44

by Jürgen Groß

Subject: Re: [PATCH 4/6] xenbus: process otherend_watch event at 'state' entry in xenwatch multithreading

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 4th patch of a (6-patch) patch set.
>
> With this patch, watch event in absolute path pattern
> '/local/domain/<domid>/device/<pvdev>/<handle>/state' can be processed in
> per-domU xenwatch thread.
>
> Signed-off-by: Dongli Zhang <[email protected]>

Reviewed-by: Juergen Gross <[email protected]>


Juergen

2018-09-14 09:13:19

by Jürgen Groß

Subject: Re: [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 5th patch of a (6-patch) patch set.
>
> With this patch, watch event in relative path pattern
> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch

superfluous "i" ----------^

> thread.
>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> drivers/xen/xenbus/xenbus_probe.c | 2 +-
> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
> include/xen/xenbus.h | 2 ++
> 3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
> index ba0644c..aa1b15a 100644
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
> }
> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>
> -static unsigned int char_count(const char *str, char c)
> +unsigned int char_count(const char *str, char c)

Please change the name of the function when making it globally
visible, e.g. by prefixing "xenbus_".

Generally I think you don't need to use it below.

> {
> unsigned int i, ret = 0;
>
> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
> index b0bed4f..50df86a 100644
> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
> xenbus_dev_changed(path, &xenbus_backend);
> }
>
> +static domid_t path_to_domid(const char *path)
> +{
> + const char *p = path;
> + domid_t domid = 0;
> +
> + while (*p) {
> + if (*p < '0' || *p > '9')
> + break;
> + domid = (domid << 3) + (domid << 1) + (*p - '0');

reinventing atoi()?

Please don't do that. kstrtou16() seems to be a perfect fit.

> + p++;
> + }
> +
> + return domid;
> +}
> +
> +/* path: backend/<pvdev>/<domid>/... */
> +static domid_t be_get_domid(struct xenbus_watch *watch,
> + const char *path,
> + const char *token)
> +{
> + const char *p = path;
> +
> + if (char_count(path, '/') < 2)
> + return 0;
> +
> + p = strchr(p, '/') + 1;
> + p = strchr(p, '/') + 1;

Drop the call of char_count() above and test p for being non-NULL
after each call of strchr?


Juergen

2018-09-14 09:17:04

by Jürgen Groß

Subject: Re: [PATCH 6/6] drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver

On 14/09/18 09:34, Dongli Zhang wrote:
> This is the 6th patch of a (6-patch) patch set.
>
> As 'use_mtwatch' for xen-netback and xen-blkback is set to true,
> probing any xenbus devices of those two drivers would create the per-domU
> xenwatch thread for the domid the new devices belong to, or increment the
> reference count of the existing thread.
>
> Xenwatch multithreading might be enabled for more xen backend pv drivers in
> the future.
>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> drivers/block/xen-blkback/xenbus.c | 3 ++-
> drivers/net/xen-netback/xenbus.c | 1 +
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index a4bc74e..debbbd0 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -1108,7 +1108,8 @@ static struct xenbus_driver xen_blkbk_driver = {
> .ids = xen_blkbk_ids,
> .probe = xen_blkbk_probe,
> .remove = xen_blkbk_remove,
> - .otherend_changed = frontend_changed
> + .otherend_changed = frontend_changed,
> + .use_mtwatch = true,
> };
>
> int xen_blkif_xenbus_init(void)
> diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> index cd51492..63d46a7 100644
> --- a/drivers/net/xen-netback/xenbus.c
> +++ b/drivers/net/xen-netback/xenbus.c
> @@ -1203,6 +1203,7 @@ static struct xenbus_driver netback_driver = {
> .remove = netback_remove,
> .uevent = netback_uevent,
> .otherend_changed = frontend_changed,
> + .use_mtwatch = true,

Is there a special reason why kernel based backends shouldn't all use
the multithread model? This would avoid the need for the use_mtwatch
struct member.

This is meant as an honest question. I'm really not sure we should
switch all backends at once. OTOH I can't see any real downsides.

Thoughts?


Juergen

2018-09-14 09:19:09

by Jürgen Groß

Subject: Re: Introduce xenwatch multithreading (mtwatch)

On 14/09/18 09:34, Dongli Zhang wrote:
> Hi,
>
> This patch set introduces xenwatch multithreading (mtwatch) based on the
> below xen summit 2018 design session notes:
>
> https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
>
>
> xenwatch_thread is a single kernel thread processing the callback function
> for subscribed xenwatch events successively. xenwatch is stalled in 'D'
> state if any callback function is stalled and uninterruptible.
>
> The domU create/destroy fails if xenwatch is stalled in 'D' state, as
> the paravirtual driver init/uninit cannot complete. Usually, the only
> option is to reboot the dom0 server unless there is a solution/workaround
> to move forward and complete the stalled xenwatch event callback function.
> Below is the output of 'xl create' when xenwatch is stalled (the issue is
> reproduced on purpose by hooking netif_receive_skb() to intercept an
> sk_buff sent out from vifX.Y on dom0 with patch at
> https://github.com/finallyjustice/patchset/blob/master/xenwatch-stall-by-vif.patch):
>
> # xl create pv.cfg
> Parsing config from pv.cfg
> libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to add device with path /local/domain/0/backend/vbd/2/51712
> libxl: error: libxl_create.c:1278:domcreate_launch_dm: Domain 2:unable to add disk devices
> libxl: error: libxl_device.c:1080:device_backend_callback: Domain 2:unable to remove device with path /local/domain/0/backend/vbd/2/51712
> libxl: error: libxl_domain.c:1073:devices_destroy_cb: Domain 2:libxl__devices_destroy failed
> libxl: error: libxl_domain.c:1000:libxl__destroy_domid: Domain 2:Non-existant domain
> libxl: error: libxl_domain.c:959:domain_destroy_callback: Domain 2:Unable to destroy guest
> libxl: error: libxl_domain.c:886:domain_destroy_cb: Domain 2:Destruction of domain failed
>
>
> The idea of this patch set is to create a per-domU xenwatch thread for each
> domid. The per-domid thread is created when the 1st pv backend device (for
> this domid and with xenwatch multithreading enabled) is created, while this
> thread is destroyed when the last pv backend device (for this domid and
> with xenwatch multithreading enabled) is removed. Per-domid xs_watch_event
> is never put on the default event list, but is put on the per-domid event
> list directly.
>
>
> For more details, please refer to the xen summit 2018 design session notes
> and presentation slides:
>
> https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg00017.html
> http://www.donglizhang.org/xenwatch_multithreading.pdf
>
> ----------------------------------------------------------------
>
> Dongli Zhang (6):
> xenbus: prepare data structures and parameter for xenwatch multithreading
> xenbus: implement the xenwatch multithreading framework
> xenbus: dispatch per-domU watch event to per-domU xenwatch thread
> xenbus: process otherend_watch event at 'state' entry in xenwatch multithreading
> xenbus: process be_watch events in xenwatch multithreading
> drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver
>
> Documentation/admin-guide/kernel-parameters.txt | 3 +
> drivers/block/xen-blkback/xenbus.c | 3 +-
> drivers/net/xen-netback/xenbus.c | 1 +
> drivers/xen/xenbus/xenbus_probe.c | 24 +-
> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++
> drivers/xen/xenbus/xenbus_xs.c | 357 +++++++++++++++++++++++-
> include/xen/xenbus.h | 70 +++++
> 7 files changed, 484 insertions(+), 6 deletions(-)
>
> Thank you very much!

One general remark regarding your commit messages: Please drop the
patch number from the text itself, its enough to have it in the mail
subject line. Later it is completely irrelevant in git history.


Juergen


2018-09-14 09:39:27

by Wei Liu

Subject: Re: [PATCH 6/6] drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver

On Fri, Sep 14, 2018 at 11:16:30AM +0200, Juergen Gross wrote:
> On 14/09/18 09:34, Dongli Zhang wrote:
> > This is the 6th patch of a (6-patch) patch set.
> >
> > As the 'use_mtwatch' for xen-netback and xen-blkback are set to true,
> > probing any xenbus devices of those two drivers would create the per-domU
> > xenwatch thread for the domid the new devices belong to, or increment the
> > reference count of existing thread.
> >
> > Xenwatch multithreading might be enabled for more xen backend pv drivers in
> > the future.
> >
> > Signed-off-by: Dongli Zhang <[email protected]>
> > ---
> > drivers/block/xen-blkback/xenbus.c | 3 ++-
> > drivers/net/xen-netback/xenbus.c | 1 +
> > 2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> > index a4bc74e..debbbd0 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -1108,7 +1108,8 @@ static struct xenbus_driver xen_blkbk_driver = {
> > .ids = xen_blkbk_ids,
> > .probe = xen_blkbk_probe,
> > .remove = xen_blkbk_remove,
> > - .otherend_changed = frontend_changed
> > + .otherend_changed = frontend_changed,
> > + .use_mtwatch = true,
> > };
> >
> > int xen_blkif_xenbus_init(void)
> > diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> > index cd51492..63d46a7 100644
> > --- a/drivers/net/xen-netback/xenbus.c
> > +++ b/drivers/net/xen-netback/xenbus.c
> > @@ -1203,6 +1203,7 @@ static struct xenbus_driver netback_driver = {
> > .remove = netback_remove,
> > .uevent = netback_uevent,
> > .otherend_changed = frontend_changed,
> > + .use_mtwatch = true,
>
> Is there a special reason why kernel based backends shouldn't all use
> the multithread model? This would avoid the need for the use_mtwatch
> struct member.
>
> This is meant as an honest question. I'm really not sure we should
> switch all backends at once. OTOH I can't see any real downsides.
>
> Thoughts?

I don't see any downside.

Wei.

>
>
> Juergen

2018-09-14 09:57:40

by Roger Pau Monne

Subject: Re: [PATCH 6/6] drivers: enable xenwatch multithreading for xen-netback and xen-blkback driver

On Fri, Sep 14, 2018 at 11:16:30AM +0200, Juergen Gross wrote:
> On 14/09/18 09:34, Dongli Zhang wrote:
> > This is the 6th patch of a (6-patch) patch set.
> >
> > As the 'use_mtwatch' for xen-netback and xen-blkback are set to true,
> > probing any xenbus devices of those two drivers would create the per-domU
> > xenwatch thread for the domid the new devices belong to, or increment the
> > reference count of existing thread.
> >
> > Xenwatch multithreading might be enabled for more xen backend pv drivers in
> > the future.
> >
> > Signed-off-by: Dongli Zhang <[email protected]>
> > ---
> > drivers/block/xen-blkback/xenbus.c | 3 ++-
> > drivers/net/xen-netback/xenbus.c | 1 +
> > 2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> > index a4bc74e..debbbd0 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -1108,7 +1108,8 @@ static struct xenbus_driver xen_blkbk_driver = {
> > .ids = xen_blkbk_ids,
> > .probe = xen_blkbk_probe,
> > .remove = xen_blkbk_remove,
> > - .otherend_changed = frontend_changed
> > + .otherend_changed = frontend_changed,
> > + .use_mtwatch = true,
> > };
> >
> > int xen_blkif_xenbus_init(void)
> > diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
> > index cd51492..63d46a7 100644
> > --- a/drivers/net/xen-netback/xenbus.c
> > +++ b/drivers/net/xen-netback/xenbus.c
> > @@ -1203,6 +1203,7 @@ static struct xenbus_driver netback_driver = {
> > .remove = netback_remove,
> > .uevent = netback_uevent,
> > .otherend_changed = frontend_changed,
> > + .use_mtwatch = true,
>
> Is there a special reason why kernel based backends shouldn't all use
> the multithread model? This would avoid the need for the use_mtwatch
> struct member.
>
> This is meant as an honest question. I'm really not sure we should
> switch all backends at once. OTOH I can't see any real downsides.
>
> Thoughts?

From a blkback PoV, I think it's safe to switch as long as there are
no concurrent watches firing in parallel for the same domain, which
from my reading of the series is indeed not possible.

Roger.

2018-09-14 13:41:52

by Dongli Zhang

Subject: Re: [Xen-devel] [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

Hi Paul,

On 09/14/2018 04:11 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Dongli Zhang [mailto:[email protected]]
>> Sent: 14 September 2018 08:34
>> To: [email protected]; [email protected]
>> Cc: [email protected]; [email protected]; Paul Durrant
>> <[email protected]>; Wei Liu <[email protected]>;
>> [email protected]; Roger Pau Monne <[email protected]>;
>> [email protected]
>> Subject: [PATCH 1/6] xenbus: prepare data structures and parameter for
>> xenwatch multithreading
>>
>> This is the 1st patch of a (6-patch) patch set.
>>
>> This patch set of six patches introduces xenwatch multithreading (or
>> multithreaded xenwatch, abbreviated as 'mtwatch') to dom0 kernel. In
>> addition to the existing single xenwatch thread, each domU has its own
>> kernel thread ([xen-mtwatch-<domid>]) to process its xenwatch event.
>>
>
> ^ You need to put comments like this in a cover letter. Each patch needs to stand on its own merit and the comments should only relate to the context of that patch or a 'subsequent patch'.
>
>> A kernel parameter 'xen_mtwatch' is introduced to control whether the
>> feature is enabled or not during dom0 kernel boot. The feature is disabled
>> by default if 'xen_mtwatch' is not set in grub.
>
> Why is it disabled by default? Concerns about resource consumption?

I would prefer to leave this feature disabled until it is stable and has been
used and tested by more people, or until the maintainers think it is time to
enable it by default.

>
>> In addition, this patch
>> also introduces the data structures to maintain the status of each per-
>> domU
>> xenwatch thread. The status of each xenwatch thread (except the default
>> one) is maintained by a mtwatch domain.
>>
>> The feature is available only on dom0.
>
> Whilst I can see it is intended for a backend domain, why limit it to dom0? What about driver domains?

As more people suggest, I would enable this on all domains used as pv backend.

I will use terms like 'per-frontend-domain' or 'frontend-id'. Seems this does
not change the existing code a lot.

Dongli Zhang



>
>>
>> Signed-off-by: Dongli Zhang <[email protected]>
>> ---
>> Documentation/admin-guide/kernel-parameters.txt | 3 ++
>> drivers/xen/xenbus/xenbus_xs.c | 31 ++++++++++++
>> include/xen/xenbus.h | 65
>> +++++++++++++++++++++++++
>> 3 files changed, 99 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index 64a3bf5..fc295ef 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4992,6 +4992,9 @@
>> the unplug protocol
>> never -- do not unplug even if version check succeeds
>>
>> + xen_mtwatch [KNL,XEN]
>> + Enables the multithreaded xenwatch (mtwatch).
>> +
>> xen_nopvspin [X86,XEN]
>> Disables the ticketlock slowpath using Xen PV
>> optimizations.
>> diff --git a/drivers/xen/xenbus/xenbus_xs.c
>> b/drivers/xen/xenbus/xenbus_xs.c
>> index 49a3874..3f137d2 100644
>> --- a/drivers/xen/xenbus/xenbus_xs.c
>> +++ b/drivers/xen/xenbus/xenbus_xs.c
>> @@ -95,6 +95,19 @@ static pid_t xenwatch_pid;
>> static DEFINE_MUTEX(xenwatch_mutex);
>> static DECLARE_WAIT_QUEUE_HEAD(watch_events_waitq);
>>
>> +bool xen_mtwatch;
>> +EXPORT_SYMBOL_GPL(xen_mtwatch);
>> +
>> +struct mtwatch_info *mtwatch_info;
>> +
>> +static bool param_xen_mtwatch;
>> +static __init int xen_parse_mtwatch(char *arg)
>> +{
>> + param_xen_mtwatch = true;
>> + return 0;
>> +}
>> +early_param("xen_mtwatch", xen_parse_mtwatch);
>> +
>> static void xs_suspend_enter(void)
>> {
>> spin_lock(&xs_state_lock);
>> @@ -929,6 +942,24 @@ int xs_init(void)
>> if (err)
>> return err;
>>
>> + if (xen_initial_domain() && param_xen_mtwatch) {
>> + int i;
>> +
>> + mtwatch_info = kmalloc(sizeof(*mtwatch_info), GFP_KERNEL);
>> +
>> + for (i = 0; i < MTWATCH_HASH_SIZE; i++)
>> + INIT_HLIST_HEAD(&mtwatch_info->domain_hash[i]);
>> + spin_lock_init(&mtwatch_info->domain_lock);
>> + INIT_LIST_HEAD(&mtwatch_info->domain_list);
>> +
>> + spin_lock_init(&mtwatch_info->purge_lock);
>> + INIT_LIST_HEAD(&mtwatch_info->purge_list);
>> +
>> + xen_mtwatch = true;
>> +
>> + pr_info("xenwatch multithreading is enabled\n");
>> + }
>> +
>> task = kthread_run(xenwatch_thread, NULL, "xenwatch");
>> if (IS_ERR(task))
>> return PTR_ERR(task);
>> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
>> index 869c816..e807114 100644
>> --- a/include/xen/xenbus.h
>> +++ b/include/xen/xenbus.h
>> @@ -62,6 +62,13 @@ struct xenbus_watch
>> /* Callback (executed in a process context with no locks held). */
>> void (*callback)(struct xenbus_watch *,
>> const char *path, const char *token);
>> +
>> + /* Callback to help calculate the domid the path belongs to */
>> + domid_t (*get_domid)(struct xenbus_watch *watch,
>> + const char *path, const char *token);
>> +
>> + /* The owner's domid if the watch is for a specific domain */
>> + domid_t owner_id;
>> };
>>
>>
>> @@ -93,6 +100,7 @@ struct xenbus_device_id
>> struct xenbus_driver {
>> const char *name; /* defaults to ids[0].devicetype */
>> const struct xenbus_device_id *ids;
>> + bool use_mtwatch;
>> int (*probe)(struct xenbus_device *dev,
>> const struct xenbus_device_id *id);
>> void (*otherend_changed)(struct xenbus_device *dev,
>> @@ -233,4 +241,61 @@ extern const struct file_operations xen_xenbus_fops;
>> extern struct xenstore_domain_interface *xen_store_interface;
>> extern int xen_store_evtchn;
>>
>> +extern bool xen_mtwatch;
>> +
>> +#define MTWATCH_HASH_SIZE 256
>> +#define MTWATCH_HASH(_id) ((int)(_id)&(MTWATCH_HASH_SIZE-1))
>> +
>> +struct mtwatch_info {
>> + /*
>> + * The mtwatch_domain is put on both a hash table and a list.
>> + * domain_list is used to optimize xenbus_watch un-registration.
>> + *
>> + * The mtwatch_domain is removed from domain_hash (with state set
>> + * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
>> + * left on domain_list until all events belonging to such
>> + * mtwatch_domain are processed in mtwatch_thread().
>> + *
>> + * While there may exist two mtwatch_domain with the same domid on
>> + * domain_list simultaneously, all mtwatch_domain on domain_hash
>> + * should have unique domid.
>> + */
>> + spinlock_t domain_lock;
>> + struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
>> + struct list_head domain_list;
>> +
>> + /*
>> + * When a per-domU
>
> 'per-frontend-domain' to be more descriptive?
>
> Paul
>
>> kthread is going to be destroyed, it is put
>> + * on the purge_list, and will be flushed by purge_work later.
>> + */
>> + struct work_struct purge_work;
>> + spinlock_t purge_lock;
>> + struct list_head purge_list;
>> +};
>> +
>> +enum mtwatch_domain_state {
>> + MTWATCH_DOMAIN_UP = 1,
>> + MTWATCH_DOMAIN_DOWN = 2,
>> +};
>> +
>> +struct mtwatch_domain {
>> + domid_t domid;
>> + struct task_struct *task;
>> + atomic_t refcnt;
>> +
>> + pid_t pid;
>> + struct mutex domain_mutex;
>> + struct rcu_head rcu;
>> +
>> + struct hlist_node hash_node;
>> + struct list_head list_node;
>> + struct list_head purge_node;
>> +
>> + wait_queue_head_t events_wq;
>> +
>> + spinlock_t events_lock;
>> + struct list_head events;
>> + enum mtwatch_domain_state state;
>> +};
>> +
>> #endif /* _XEN_XENBUS_H */
>> --
>> 2.7.4
>
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel
>

2018-09-14 14:00:09

by Dongli Zhang

Subject: Re: [Xen-devel] [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

Hi Juergen and Paul,

On 09/14/2018 04:32 PM, Juergen Gross wrote:
> On 14/09/18 09:34, Dongli Zhang wrote:
>> This is the 1st patch of a (6-patch) patch set.
>>
>> This patch set of six patches introduces xenwatch multithreading (or
>> multithreaded xenwatch, abbreviated as 'mtwatch') to dom0 kernel. In
>> addition to the existing single xenwatch thread, each domU has its own
>> kernel thread ([xen-mtwatch-<domid>]) to process its xenwatch event.
>>
>> A kernel parameter 'xen_mtwatch' is introduced to control whether the
>> feature is enabled or not during dom0 kernel boot. The feature is disabled
>> by default if 'xen_mtwatch' is not set in grub. In addition, this patch
>> also introduces the data structures to maintain the status of each per-domU
>> xenwatch thread. The status of each xenwatch thread (except the default
>> one) is maintained by a mtwatch domain.
>>
>> The feature is available only on dom0.
>>
>> Signed-off-by: Dongli Zhang <[email protected]>
>> ---
>> Documentation/admin-guide/kernel-parameters.txt | 3 ++
>> drivers/xen/xenbus/xenbus_xs.c | 31 ++++++++++++
>> include/xen/xenbus.h | 65 +++++++++++++++++++++++++
>> 3 files changed, 99 insertions(+)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 64a3bf5..fc295ef 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4992,6 +4992,9 @@
>> the unplug protocol
>> never -- do not unplug even if version check succeeds
>>
>> + xen_mtwatch [KNL,XEN]
>> + Enables the multithreaded xenwatch (mtwatch).
>> +
>> xen_nopvspin [X86,XEN]
>> Disables the ticketlock slowpath using Xen PV
>> optimizations.
>> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
>> index 49a3874..3f137d2 100644
>> --- a/drivers/xen/xenbus/xenbus_xs.c
>> +++ b/drivers/xen/xenbus/xenbus_xs.c
>> @@ -95,6 +95,19 @@ static pid_t xenwatch_pid;
>> static DEFINE_MUTEX(xenwatch_mutex);
>> static DECLARE_WAIT_QUEUE_HEAD(watch_events_waitq);
>>
>> +bool xen_mtwatch;
>> +EXPORT_SYMBOL_GPL(xen_mtwatch);
>> +
>> +struct mtwatch_info *mtwatch_info;
>> +
>> +static bool param_xen_mtwatch;
>> +static __init int xen_parse_mtwatch(char *arg)
>> +{
>> + param_xen_mtwatch = true;
>> + return 0;
>> +}
>> +early_param("xen_mtwatch", xen_parse_mtwatch);
>
> Add a Kconfig item to set the default when building the kernel? We can
> start with default "off", but later we might want to enable this feature
> as the default.

Is there any weakness/downside to configuring the param via early_param? Or is
there any advantage to using Kconfig?

I would prefer to not fix the configuration when building the kernel. The
administrator/user will not be able to choose the preferred option.

I would prefer to configure the option via kernel param and set default as 'off'
(false). In a common cloud environment, the administrator is only required to
enable the param in dom0/driver domain kernel.

For a common domU (not used as a backend), this feature is useless. I assume
there are always more domUs than dom0s (and driver domains), especially in a
cloud environment.

>
>> +
>> static void xs_suspend_enter(void)
>> {
>> spin_lock(&xs_state_lock);
>> @@ -929,6 +942,24 @@ int xs_init(void)
>> if (err)
>> return err;
>>
>> + if (xen_initial_domain() && param_xen_mtwatch) {
>
> Wouldn't it be better to drop the test for xen_initial_domain() and do
> this initialization only when a caller (backend) is requesting the
> multithread mode? This would avoid wasting memory in case it isn't going
>> to be used and - more important - would support driver domains.

This would save us <4KB.

Suppose the default 'param_xen_mtwatch' is 'off'. The administrator knows the
domain is used as pv backend when he/she enables the option manually.

> In such a scenario, is there any benefit to making it on-demand to save <4KB of memory?

Thank you very much!

Dongli Zhang

>
>
> Juergen
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel
>

2018-09-14 14:11:46

by Jürgen Groß

Subject: Re: [Xen-devel] [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

On 14/09/18 15:57, Dongli Zhang wrote:
> Hi Juergen and Paul,
>
> On 09/14/2018 04:32 PM, Juergen Gross wrote:
>> On 14/09/18 09:34, Dongli Zhang wrote:
>>> This is the 1st patch of a (6-patch) patch set.
>>>
>>> This patch set of six patches introduces xenwatch multithreading (or
>>> multithreaded xenwatch, abbreviated as 'mtwatch') to dom0 kernel. In
>>> addition to the existing single xenwatch thread, each domU has its own
>>> kernel thread ([xen-mtwatch-<domid>]) to process its xenwatch event.
>>>
>>> A kernel parameter 'xen_mtwatch' is introduced to control whether the
>>> feature is enabled or not during dom0 kernel boot. The feature is disabled
>>> by default if 'xen_mtwatch' is not set in grub. In addition, this patch
>>> also introduces the data structures to maintain the status of each per-domU
>>> xenwatch thread. The status of each xenwatch thread (except the default
>>> one) is maintained by a mtwatch domain.
>>>
>>> The feature is available only on dom0.
>>>
>>> Signed-off-by: Dongli Zhang <[email protected]>
>>> ---
>>> Documentation/admin-guide/kernel-parameters.txt | 3 ++
>>> drivers/xen/xenbus/xenbus_xs.c | 31 ++++++++++++
>>> include/xen/xenbus.h | 65 +++++++++++++++++++++++++
>>> 3 files changed, 99 insertions(+)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>> index 64a3bf5..fc295ef 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -4992,6 +4992,9 @@
>>> the unplug protocol
>>> never -- do not unplug even if version check succeeds
>>>
>>> + xen_mtwatch [KNL,XEN]
>>> + Enables the multithreaded xenwatch (mtwatch).
>>> +
>>> xen_nopvspin [X86,XEN]
>>> Disables the ticketlock slowpath using Xen PV
>>> optimizations.
>>> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
>>> index 49a3874..3f137d2 100644
>>> --- a/drivers/xen/xenbus/xenbus_xs.c
>>> +++ b/drivers/xen/xenbus/xenbus_xs.c
>>> @@ -95,6 +95,19 @@ static pid_t xenwatch_pid;
>>> static DEFINE_MUTEX(xenwatch_mutex);
>>> static DECLARE_WAIT_QUEUE_HEAD(watch_events_waitq);
>>>
>>> +bool xen_mtwatch;
>>> +EXPORT_SYMBOL_GPL(xen_mtwatch);
>>> +
>>> +struct mtwatch_info *mtwatch_info;
>>> +
>>> +static bool param_xen_mtwatch;
>>> +static __init int xen_parse_mtwatch(char *arg)
>>> +{
>>> + param_xen_mtwatch = true;
>>> + return 0;
>>> +}
>>> +early_param("xen_mtwatch", xen_parse_mtwatch);
>>
>> Add a Kconfig item to set the default when building the kernel? We can
>> start with default "off", but later we might want to enable this feature
>> as the default.
>
> Is there any weakness/downside to configuring the param via early_param? Or is
> there any advantage to using Kconfig?
>
> I would prefer to not fix the configuration when building the kernel. The
> administrator/user will not be able to choose the preferred option.

I wouldn't remove the kernel param, but let the config choose the
default to use. The user would still be able to set the value either
way via boot parameter.

>
> I would prefer to configure the option via kernel param and set default as 'off'
> (false). In a common cloud environment, the administrator is only required to
> enable the param in dom0/driver domain kernel.
>
> For a common domU (not used as a backend), this feature is useless. I assume there
> are always more domUs than dom0 (and driver domain), especially in cloud
> environment.
>
>>
>>> +
>>> static void xs_suspend_enter(void)
>>> {
>>> spin_lock(&xs_state_lock);
>>> @@ -929,6 +942,24 @@ int xs_init(void)
>>> if (err)
>>> return err;
>>>
>>> + if (xen_initial_domain() && param_xen_mtwatch) {
>>
>> Wouldn't it be better to drop the test for xen_initial_domain() and do
>> this initialization only when a caller (backend) is requesting the
>> multithread mode? This would avoid wasting memory in case it isn't going
>> to be used and - more important - would support driver domains.
>
> This would save us <4KB.
>
> Suppose the default 'param_xen_mtwatch' is 'off'. The administrator knows the
> domain is used as pv backend when he/she enables the option manually.
>
> In such scenario, is there any benefit to make it on-demand to save <4KB memory?

It is a benefit in case a distro ships the kernel with default on in
order to avoid the need to add kernel parameters for dom0 or a driver
domain. In a normal domU there is no backend running so the behavior
would be as today.

If the feature is stable and we know it is better to switch it on for
driver domains (and dom0, of course), we shouldn't require the user to
do so.


Juergen

2018-09-14 14:12:14

by Dongli Zhang

Subject: Re: [Xen-devel] [PATCH 2/6] xenbus: implement the xenwatch multithreading framework



On 09/14/2018 04:45 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Dongli Zhang [mailto:[email protected]]
>> Sent: 14 September 2018 08:34
>> To: [email protected]; [email protected]
>> Cc: [email protected]; [email protected]; Paul Durrant
>> <[email protected]>; Wei Liu <[email protected]>;
>> [email protected]; Roger Pau Monne <[email protected]>;
>> [email protected]
>> Subject: [PATCH 2/6] xenbus: implement the xenwatch multithreading
>> framework
>>
>> This is the 2nd patch of a (6-patch) patch set.
>>
>> This patch implements the xenwatch multithreading framework to create or
>> destroy the per-domU xenwatch thread. The xenwatch thread is created or
>> destroyed during xenbus device probing or removing (that is,
>> xenbus_dev_probe() or xenbus_dev_remove()) if the corresponding pv driver
>> has xenwatch multithreading feature enabled. As there is only one single
>> per-domU xenwatch thread for each domid, probing the xenbus device for the
>> same domid again would not create the thread for the same domid again, but
>> only increment the reference count of the thread's mtwatch domain. When a
>> xenbus device is removed, the reference count is decremented. The per-domU
>> xenwatch thread is destroyed when the reference count of its mtwatch domain
>> is zero, that is, all xenbus devices (whose mtwatch feature is enabled) of
>> such mtwatch domain are removed.
>>
>> Therefore, a domid has its own per-domU xenwatch thread only when it is
>> attached with dom0 backend xenbus device whose pv driver has the feature
>> enabled. The domid would not have its own xenwatch thread when it is not
>> running any mtwatch-enabled xenbus device.
>>
>> When a watch (with xenwatch multithreading enabled) is unregistered, we
>> will generally traverse all mtwatch domains to remove all inflight pending
>> events fired by such watch. However, one optimization in this patch is we
>> only need to remove pending events from a specific mtwatch domain when the
>> watch is registered for a specific domid, that is, when its owner_id field
>> is non-zero.
>>
>> Signed-off-by: Dongli Zhang <[email protected]>
>> ---
>> drivers/xen/xenbus/xenbus_probe.c | 6 +
>> drivers/xen/xenbus/xenbus_xs.c | 273 ++++++++++++++++++++++++++++++++++++++
>> include/xen/xenbus.h | 3 +
>> 3 files changed, 282 insertions(+)
>>
>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>> index 5b47188..5755596 100644
>> --- a/drivers/xen/xenbus/xenbus_probe.c
>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>> @@ -236,6 +236,9 @@ int xenbus_dev_probe(struct device *_dev)
>> if (err)
>> goto fail;
>>
>> + if (xen_mtwatch && drv->use_mtwatch)
>> + mtwatch_create_domain(dev->otherend_id);
>
> mtwatch_get_domain()? (Since you have mtwatch_put_domain() below).
>
>> +
>> err = watch_otherend(dev);
>> if (err) {
>> dev_warn(&dev->dev, "watch_otherend on %s failed.\n",
>> @@ -263,6 +266,9 @@ int xenbus_dev_remove(struct device *_dev)
>> if (drv->remove)
>> drv->remove(dev);
>>
>> + if (xen_mtwatch && drv->use_mtwatch)
>> + mtwatch_put_domain(dev->otherend_id);
>> +
>> free_otherend_details(dev);
>>
>> xenbus_switch_state(dev, XenbusStateClosed);
>> diff --git a/drivers/xen/xenbus/xenbus_xs.c b/drivers/xen/xenbus/xenbus_xs.c
>> index 3f137d2..741dc54 100644
>> --- a/drivers/xen/xenbus/xenbus_xs.c
>> +++ b/drivers/xen/xenbus/xenbus_xs.c
>> @@ -108,6 +108,201 @@ static __init int xen_parse_mtwatch(char *arg)
>> }
>> early_param("xen_mtwatch", xen_parse_mtwatch);
>>
>> +struct mtwatch_domain *mtwatch_find_domain(domid_t domid)
>> +{
>> + struct mtwatch_domain *domain;
>> + int hash = MTWATCH_HASH(domid);
>> + struct hlist_head *hash_head = &mtwatch_info->domain_hash[hash];
>> +
>> + hlist_for_each_entry_rcu(domain, hash_head, hash_node) {
>> + if (domain->domid == domid)
>> + return domain;
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +/* per-domU thread for xenwatch multithreading. */
>> +static int mtwatch_thread(void *arg)
>> +{
>> + struct mtwatch_domain *domain = (struct mtwatch_domain *) arg;
>> + struct list_head *ent;
>> + struct xs_watch_event *event;
>> +
>> + domain->pid = current->pid;
>> +
>> + for (;;) {
>> + wait_event_interruptible(domain->events_wq,
>> + !list_empty(&domain->events) ||
>> + domain->state == MTWATCH_DOMAIN_DOWN);
>> +
>> + if (domain->state == MTWATCH_DOMAIN_DOWN &&
>> + list_empty(&domain->events))
>> + break;
>> +
>> + mutex_lock(&domain->domain_mutex);
>> +
>> + spin_lock(&domain->events_lock);
>> + ent = domain->events.next;
>> + if (ent != &domain->events)
>> + list_del(ent);
>> + spin_unlock(&domain->events_lock);
>> +
>> + if (ent != &domain->events) {
>> + event = list_entry(ent, struct xs_watch_event, list);
>> + event->handle->callback(event->handle, event->path,
>> + event->token);
>> + kfree(event);
>> + }
>> +
>> + mutex_unlock(&domain->domain_mutex);
>> + }
>> +
>> + /*
>> + * domain->state is already set to MTWATCH_DOMAIN_DOWN (to avoid
>> + * new event to domain->events) when above for loop breaks, so
>> + * that there is no requirement to cleanup domain->events again.
>> + */
>> +
>> + spin_lock(&mtwatch_info->domain_lock);
>> + list_del_rcu(&domain->list_node);
>> + spin_unlock(&mtwatch_info->domain_lock);
>> +
>> + spin_lock(&mtwatch_info->purge_lock);
>> + list_add(&domain->purge_node, &mtwatch_info->purge_list);
>> + spin_unlock(&mtwatch_info->purge_lock);
>> +
>> + schedule_work(&mtwatch_info->purge_work);
>> +
>> + return 0;
>> +}
>> +
>> +static void delayed_destroy_domain(struct rcu_head *head)
>> +{
>> + struct mtwatch_domain *domain;
>> +
>> + domain = container_of(head, struct mtwatch_domain, rcu);
>> + kfree(domain);
>> +}
>> +
>> +static void xen_mtwatch_purge_domain(struct work_struct *work)
>> +{
>> + struct mtwatch_domain *domain;
>> + struct list_head *node;
>> +
>> + while (!list_empty(&mtwatch_info->purge_list)) {
>> +
>> + spin_lock(&mtwatch_info->purge_lock);
>> + node = mtwatch_info->purge_list.next;
>> + if (node != &mtwatch_info->purge_list)
>> + list_del(node);
>> + spin_unlock(&mtwatch_info->purge_lock);
>> +
>> + if (node != &mtwatch_info->purge_list) {
>> + domain = list_entry(node, struct mtwatch_domain,
>> + purge_node);
>> + kthread_stop(domain->task);
>> +
>> + call_rcu(&domain->rcu, delayed_destroy_domain);
>> + }
>> + }
>> +}
>> +
>> +/* Running in the context of default xenwatch kthread. */
>> +void mtwatch_create_domain(domid_t domid)
>> +{
>> + struct mtwatch_domain *domain;
>> +
>> + if (!domid) {
>> + pr_err("Default xenwatch thread is for dom0\n");
>> + return;
>> + }
>> +
>> + spin_lock(&mtwatch_info->domain_lock);
>> +
>> + domain = mtwatch_find_domain(domid);
>> + if (domain) {
>> + atomic_inc(&domain->refcnt);
>> + spin_unlock(&mtwatch_info->domain_lock);
>> + return;
>> + }
>> +
>> + domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
>> + if (!domain) {
>> + spin_unlock(&mtwatch_info->domain_lock);
>> + pr_err("Failed to allocate memory for mtwatch thread %d\n",
>> + domid);
>> + return;
>> + }
>> +
>> + domain->domid = domid;
>> + atomic_set(&domain->refcnt, 1);
>> + mutex_init(&domain->domain_mutex);
>> + INIT_LIST_HEAD(&domain->purge_node);
>> +
>> + init_waitqueue_head(&domain->events_wq);
>> + spin_lock_init(&domain->events_lock);
>> + INIT_LIST_HEAD(&domain->events);
>> +
>> + list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
>> +
>> + hlist_add_head_rcu(&domain->hash_node,
>> + &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
>> +
>> + spin_unlock(&mtwatch_info->domain_lock);
>> +
>> + domain->task = kthread_run(mtwatch_thread, domain,
>> + "xen-mtwatch-%d", domid);
>> +
>> + if (!domain->task) {
>> + pr_err("mtwatch kthread creation is failed\n");
>> + domain->state = MTWATCH_DOMAIN_DOWN;
>> +
>> + return;
>> + }
>> +
>> + domain->state = MTWATCH_DOMAIN_UP;
>> +}
>> +
>> +/* Running in the context of default xenwatch kthread. */
>> +void mtwatch_put_domain(domid_t domid)
>> +{
>> + struct mtwatch_domain *domain;
>> +
>> + spin_lock(&mtwatch_info->domain_lock);
>> +
>> + domain = mtwatch_find_domain(domid);
>> + if (!domain) {
>> + spin_unlock(&mtwatch_info->domain_lock);
>> + pr_err("mtwatch kthread for domid=%d does not exist\n",
>> + domid);
>> + return;
>> + }
>> +
>> + if (atomic_dec_and_test(&domain->refcnt)) {
>> +
>> + hlist_del_rcu(&domain->hash_node);
>> +
>> + if (!domain->task) {
>> + /*
>> + * As the task is failed to initialize during
>> + * mtwatch_create_domain(), we do not need to wait
>> + * for the kernel thread to complete.
>> + */
>> + list_del_rcu(&domain->list_node);
>> + call_rcu(&domain->rcu, delayed_destroy_domain);
>> + } else {
>> + spin_lock(&domain->events_lock);
>> + domain->state = MTWATCH_DOMAIN_DOWN;
>> + spin_unlock(&domain->events_lock);
>> +
>> + wake_up(&domain->events_wq);
>> + }
>> + }
>> +
>> + spin_unlock(&mtwatch_info->domain_lock);
>> +}
>> +
>> static void xs_suspend_enter(void)
>> {
>> spin_lock(&xs_state_lock);
>> @@ -793,6 +988,80 @@ int register_xenbus_watch(struct xenbus_watch *watch)
>> }
>> EXPORT_SYMBOL_GPL(register_xenbus_watch);
>>
>> +static void __unregister_single_mtwatch(struct xenbus_watch *watch,
>> + struct mtwatch_domain *domain)
>> +{
>> + struct xs_watch_event *event, *tmp;
>> +
>> + if (current->pid != domain->pid)
>> + mutex_lock(&domain->domain_mutex);
>
> Since you avoid the lock here, what's to stop another thread with a different pid now racing with this?
>
> Paul


Here I copy the code from unregister_xenbus_watch(). I think we can assume the
caller would not unregister the same xenbus_watch twice (otherwise, the default
unregister_xenbus_watch() would hit this issue as well)?

When a watch is unregistered, it is processed either in the context of a
xenwatch thread or in any other thread/context.

When we avoid the lock here, it indicates that the watch un-registration is
processed by a specific xenwatch thread, so nothing would race with it.

Dongli Zhang


>
>> +
>> + spin_lock(&domain->events_lock);
>> + list_for_each_entry_safe(event, tmp,
>> + &domain->events, list) {
>> + if (event->handle != watch)
>> + continue;
>> + list_del(&event->list);
>> + kfree(event);
>> + }
>> + spin_unlock(&domain->events_lock);
>> +
>> + if (current->pid != domain->pid)
>> + mutex_unlock(&domain->domain_mutex);
>> +}
>> +
>> +static void unregister_single_mtwatch(struct xenbus_watch *watch,
>> + domid_t domid)
>> +{
>> + struct mtwatch_domain *domain;
>> + bool found = false;
>> +
>> + rcu_read_lock();
>> +
>> + list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
>> + list_node) {
>> + if (domain->domid == domid) {
>> + found = true;
>> + __unregister_single_mtwatch(watch, domain);
>> + }
>> + }
>> +
>> + WARN_ON_ONCE(unlikely(!found));
>> +
>> + rcu_read_unlock();
>> +}
>> +
>> +static void unregister_all_mtwatch(struct xenbus_watch *watch)
>> +{
>> + struct mtwatch_domain *domain;
>> +
>> + rcu_read_lock();
>> +
>> + list_for_each_entry_rcu(domain, &mtwatch_info->domain_list,
>> + list_node) {
>> + __unregister_single_mtwatch(watch, domain);
>> + }
>> +
>> + rcu_read_unlock();
>> +}
>> +
>> +static void unregister_mtwatch(struct xenbus_watch *watch)
>> +{
>> + /*
>> + * Generally, to unregister a watch. we need to traverse all
>> + * mtwatch domains to remove all inflight pending watch events for
>> + * such watch.
>> + *
>> + * One exception is we only need to remove pending watch events
>> + * from a single mtwatch domain when the watch is registered for a
>> + * specific domid.
>> + */
>> + if (watch->owner_id)
>> + unregister_single_mtwatch(watch, watch->owner_id);
>> + else
>> + unregister_all_mtwatch(watch);
>> +}
>> +
>> void unregister_xenbus_watch(struct xenbus_watch *watch)
>> {
>> struct xs_watch_event *event, *tmp;
>> @@ -831,6 +1100,9 @@ void unregister_xenbus_watch(struct xenbus_watch *watch)
>>
>> if (current->pid != xenwatch_pid)
>> mutex_unlock(&xenwatch_mutex);
>> +
>> + if (xen_mtwatch && watch->get_domid)
>> + unregister_mtwatch(watch);
>> }
>> EXPORT_SYMBOL_GPL(unregister_xenbus_watch);
>>
>> @@ -954,6 +1226,7 @@ int xs_init(void)
>>
>> spin_lock_init(&mtwatch_info->purge_lock);
>> INIT_LIST_HEAD(&mtwatch_info->purge_list);
>> + INIT_WORK(&mtwatch_info->purge_work, xen_mtwatch_purge_domain);
>>
>> xen_mtwatch = true;
>>
>> diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
>> index e807114..4ac2cee 100644
>> --- a/include/xen/xenbus.h
>> +++ b/include/xen/xenbus.h
>> @@ -241,6 +241,9 @@ extern const struct file_operations xen_xenbus_fops;
>> extern struct xenstore_domain_interface *xen_store_interface;
>> extern int xen_store_evtchn;
>>
>> +void mtwatch_create_domain(domid_t domid);
>> +void mtwatch_put_domain(domid_t domid);
>> +
>> extern bool xen_mtwatch;
>>
>> #define MTWATCH_HASH_SIZE 256
>> --
>> 2.7.4
>
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel
>

2018-09-14 14:19:11

by Dongli Zhang

Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

Hi Juergen,

On 09/14/2018 05:12 PM, Juergen Gross wrote:
> On 14/09/18 09:34, Dongli Zhang wrote:
>> This is the 5th patch of a (6-patch) patch set.
>>
>> With this patch, watch event in relative path pattern
>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>
> superfluous "i" ----------^
>
>> thread.
>>
>> Signed-off-by: Dongli Zhang <[email protected]>
>> ---
>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>> include/xen/xenbus.h | 2 ++
>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>> index ba0644c..aa1b15a 100644
>> --- a/drivers/xen/xenbus/xenbus_probe.c
>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>> }
>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>
>> -static unsigned int char_count(const char *str, char c)
>> +unsigned int char_count(const char *str, char c)
>
> Please change the name of the function when making it globally
> visible, e.g. by prefixing "xenbus_".
>
> Generally I think you don't need to use it below.
>
>> {
>> unsigned int i, ret = 0;
>>
>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>> index b0bed4f..50df86a 100644
>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>> xenbus_dev_changed(path, &xenbus_backend);
>> }
>>
>> +static domid_t path_to_domid(const char *path)
>> +{
>> + const char *p = path;
>> + domid_t domid = 0;
>> +
>> + while (*p) {
>> + if (*p < '0' || *p > '9')
>> + break;
>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>
> reinventing atoi()?
>
> Please don't do that. kstrtou16() seems to be a perfect fit.

I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
if the input string contains non-numerical characters.

E.g., the input can be "1/0/state", where 1 is the otherend_id
(frontend_id) and 0 is the handle.

When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
-22 (error).

Dongli Zhang




>
>> + p++;
>> + }
>> +
>> + return domid;
>> +}
>> +
>> +/* path: backend/<pvdev>/<domid>/... */
>> +static domid_t be_get_domid(struct xenbus_watch *watch,
>> + const char *path,
>> + const char *token)
>> +{
>> + const char *p = path;
>> +
>> + if (char_count(path, '/') < 2)
>> + return 0;
>> +
>> + p = strchr(p, '/') + 1;
>> + p = strchr(p, '/') + 1;
>
> Drop the call of char_count() above and test p for being non-NULL
> after each call of strchr?
>
>
> Juergen
>
>

2018-09-14 14:27:10

by Jürgen Groß

Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

On 14/09/18 16:18, Dongli Zhang wrote:
> Hi Juergen,
>
> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>> On 14/09/18 09:34, Dongli Zhang wrote:
>>> This is the 5th patch of a (6-patch) patch set.
>>>
>>> With this patch, watch event in relative path pattern
>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>
>> superfluous "i" ----------^
>>
>>> thread.
>>>
>>> Signed-off-by: Dongli Zhang <[email protected]>
>>> ---
>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>> include/xen/xenbus.h | 2 ++
>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>> index ba0644c..aa1b15a 100644
>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>> }
>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>
>>> -static unsigned int char_count(const char *str, char c)
>>> +unsigned int char_count(const char *str, char c)
>>
>> Please change the name of the function when making it globally
>> visible, e.g. by prefixing "xenbus_".
>>
>> Generally I think you don't need to use it below.
>>
>>> {
>>> unsigned int i, ret = 0;
>>>
>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>> index b0bed4f..50df86a 100644
>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>> xenbus_dev_changed(path, &xenbus_backend);
>>> }
>>>
>>> +static domid_t path_to_domid(const char *path)
>>> +{
>>> + const char *p = path;
>>> + domid_t domid = 0;
>>> +
>>> + while (*p) {
>>> + if (*p < '0' || *p > '9')
>>> + break;
>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>
>> reinventing atoi()?
>>
>> Please don't do that. kstrtou16() seems to be a perfect fit.
>
> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
> if the input string contains non-numerical characters.
>
> E.g., the example of input can be "1/0/state", where 1 is fotherend_id
> (frontend_id) and 0 is handle.
>
> When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
> -22 (error).

Aah, okay. Then simple_strtoul()?


Juergen

2018-09-14 14:30:55

by Dongli Zhang

Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

Hi Juergen,

On 09/14/2018 10:26 PM, Juergen Gross wrote:
> On 14/09/18 16:18, Dongli Zhang wrote:
>> Hi Juergen,
>>
>> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>>> On 14/09/18 09:34, Dongli Zhang wrote:
>>>> This is the 5th patch of a (6-patch) patch set.
>>>>
>>>> With this patch, watch event in relative path pattern
>>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>>
>>> superfluous "i" ----------^
>>>
>>>> thread.
>>>>
>>>> Signed-off-by: Dongli Zhang <[email protected]>
>>>> ---
>>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>>> include/xen/xenbus.h | 2 ++
>>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>>> index ba0644c..aa1b15a 100644
>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>>> }
>>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>>
>>>> -static unsigned int char_count(const char *str, char c)
>>>> +unsigned int char_count(const char *str, char c)
>>>
>>> Please change the name of the function when making it globally
>>> visible, e.g. by prefixing "xenbus_".
>>>
>>> Generally I think you don't need to use it below.
>>>
>>>> {
>>>> unsigned int i, ret = 0;
>>>>
>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>> index b0bed4f..50df86a 100644
>>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>>> xenbus_dev_changed(path, &xenbus_backend);
>>>> }
>>>>
>>>> +static domid_t path_to_domid(const char *path)
>>>> +{
>>>> + const char *p = path;
>>>> + domid_t domid = 0;
>>>> +
>>>> + while (*p) {
>>>> + if (*p < '0' || *p > '9')
>>>> + break;
>>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>>
>>> reinventing atoi()?
>>>
>>> Please don't do that. kstrtou16() seems to be a perfect fit.
>>
>> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
>> if the input string contains non-numerical characters.
>>
>> E.g., the example of input can be "1/0/state", where 1 is fotherend_id
>> (frontend_id) and 0 is handle.
>>
>> When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
>> -22 (error).
>
> Aah, okay. Then simple_strtoul()?

I did consider simple_strtoul() initially. Unfortunately, it is obsolete (below
line 81). AFAIR, the patch would not be able to pass the checkpatch script when
this function is used.

75 /**
76 * simple_strtoul - convert a string to an unsigned long
77 * @cp: The start of the string
78 * @endp: A pointer to the end of the parsed string will be placed here
79 * @base: The number base to use
80 *
81 * This function is obsolete. Please use kstrtoul instead.
82 */
83 unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base)
84 {
85 return simple_strtoull(cp, endp, base);
86 }

Dongli Zhang

>
>
> Juergen
>

2018-09-14 14:35:41

by Dongli Zhang

Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

Hi Juergen,

On 09/14/2018 05:12 PM, Juergen Gross wrote:
> On 14/09/18 09:34, Dongli Zhang wrote:
>> This is the 5th patch of a (6-patch) patch set.
>>
>> With this patch, watch event in relative path pattern
>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>
> superfluous "i" ----------^
>
>> thread.
>>
>> Signed-off-by: Dongli Zhang <[email protected]>
>> ---
>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>> include/xen/xenbus.h | 2 ++
>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>> index ba0644c..aa1b15a 100644
>> --- a/drivers/xen/xenbus/xenbus_probe.c
>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>> }
>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>
>> -static unsigned int char_count(const char *str, char c)
>> +unsigned int char_count(const char *str, char c)
>
> Please change the name of the function when making it globally
> visible, e.g. by prefixing "xenbus_".
>
> Generally I think you don't need to use it below.
>
>> {
>> unsigned int i, ret = 0;
>>
>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>> index b0bed4f..50df86a 100644
>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>> xenbus_dev_changed(path, &xenbus_backend);
>> }
>>
>> +static domid_t path_to_domid(const char *path)
>> +{
>> + const char *p = path;
>> + domid_t domid = 0;
>> +
>> + while (*p) {
>> + if (*p < '0' || *p > '9')
>> + break;
>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>
> reinventing atoi()?
>
> Please don't do that. kstrtou16() seems to be a perfect fit.
>
>> + p++;
>> + }
>> +
>> + return domid;
>> +}
>> +
>> +/* path: backend/<pvdev>/<domid>/... */
>> +static domid_t be_get_domid(struct xenbus_watch *watch,
>> + const char *path,
>> + const char *token)
>> +{
>> + const char *p = path;
>> +
>> + if (char_count(path, '/') < 2)
>> + return 0;
>> +
>> + p = strchr(p, '/') + 1;
>> + p = strchr(p, '/') + 1;
>
> Drop the call of char_count() above and test p for being non-NULL
> after each call of strchr?

The usage of char_count() is copied from xenbus_dev_changed() which is used to
process pattern 'backend/<type>/...'.

I use char_count() just because I would prefer that we always use the same
algorithm to process xenstore paths in the linux kernel.

I can drop it if it is better not to change char_count() from a static to a
global function.

556 void xenbus_dev_changed(const char *node, struct xen_bus_type *bus)
557 {
558 int exists, rootlen;
559 struct xenbus_device *dev;
560 char type[XEN_BUS_ID_SIZE];
561 const char *p, *root;
562
563 if (char_count(node, '/') < 2)
564 return;
565
566 exists = xenbus_exists(XBT_NIL, node, "");
567 if (!exists) {
568 xenbus_cleanup_devices(node, &bus->bus);
569 return;
570 }
571
572 /* backend/<type>/... or device/<type>/... */
573 p = strchr(node, '/') + 1;
574 snprintf(type, XEN_BUS_ID_SIZE, "%.*s", (int)strcspn(p, "/"), p);
575 type[XEN_BUS_ID_SIZE-1] = '\0';

Dongli Zhang

>
>
> Juergen
>
>

2018-09-14 14:46:14

by Jürgen Groß

Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

On 14/09/18 16:29, Dongli Zhang wrote:
> Hi Juergen,
>
> On 09/14/2018 10:26 PM, Juergen Gross wrote:
>> On 14/09/18 16:18, Dongli Zhang wrote:
>>> Hi Juergen,
>>>
>>> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>>>> On 14/09/18 09:34, Dongli Zhang wrote:
>>>>> This is the 5th patch of a (6-patch) patch set.
>>>>>
>>>>> With this patch, watch event in relative path pattern
>>>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>>>
>>>> superfluous "i" ----------^
>>>>
>>>>> thread.
>>>>>
>>>>> Signed-off-by: Dongli Zhang <[email protected]>
>>>>> ---
>>>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>>>> include/xen/xenbus.h | 2 ++
>>>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>>>> index ba0644c..aa1b15a 100644
>>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>>>> }
>>>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>>>
>>>>> -static unsigned int char_count(const char *str, char c)
>>>>> +unsigned int char_count(const char *str, char c)
>>>>
>>>> Please change the name of the function when making it globally
>>>> visible, e.g. by prefixing "xenbus_".
>>>>
>>>> Generally I think you don't need to use it below.
>>>>
>>>>> {
>>>>> unsigned int i, ret = 0;
>>>>>
>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>> index b0bed4f..50df86a 100644
>>>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>>>> xenbus_dev_changed(path, &xenbus_backend);
>>>>> }
>>>>>
>>>>> +static domid_t path_to_domid(const char *path)
>>>>> +{
>>>>> + const char *p = path;
>>>>> + domid_t domid = 0;
>>>>> +
>>>>> + while (*p) {
>>>>> + if (*p < '0' || *p > '9')
>>>>> + break;
>>>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>>>
>>>> reinventing atoi()?
>>>>
>>>> Please don't do that. kstrtou16() seems to be a perfect fit.
>>>
>>> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
>>> if the input string contains non-numerical characters.
>>>
>>> E.g., the example of input can be "1/0/state", where 1 is fotherend_id
>>> (frontend_id) and 0 is handle.
>>>
>>> When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
>>> -22 (error).
>>
>> Aah, okay. Then simple_strtoul()?
>
> I did consider simple_strtoul() initially. Unfortunately, it is obsolete (below
> line 81). AFAIR, the patch would not be able to pass the check_patch script when
> this function is used.

Better use that than open coding a new instance of it.

Another variant would be to use sscanf() or similar. Then you could even
drop using strchr() by adding that in the format string:

return (sscanf(path, "%*u/%u/", &domid) == 1) ? domid : DOMID_SELF;


Juergen

2018-09-16 20:21:06

by Boris Ostrovsky

Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading



On 9/14/18 3:34 AM, Dongli Zhang wrote:

> +
> +struct mtwatch_info {
> + /*
> + * The mtwatch_domain is put on both a hash table and a list.
> + * domain_list is used to optimize xenbus_watch un-registration.
> + *
> + * The mtwatch_domain is removed from domain_hash (with state set
> + * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
> + * left on domain_list until all events belong to such
> + * mtwatch_domain are processed in mtwatch_thread().


Do we really need to keep mtwatch_domain on both lists? Why is keeping it
on, say, only the hash not sufficient?

> + *
> + * While there may exist two mtwatch_domain with the same domid on
> + * domain_list simultaneously,


How is it possible to have two domids on the list at the same time?
Don't you want to remove it (which IIUIC means moving it to the purge
list) when domain is destroyed?


-boris


> + * all mtwatch_domain on domain_hash
> + * should have unique domid.
> + */
> + spinlock_t domain_lock;
> + struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
> + struct list_head domain_list;
> +
> + /*
> + * When a per-domU kthread is going to be destroyed, it is put
> + * on the purge_list, and will be flushed by purge_work later.
> + */
> + struct work_struct purge_work;
> + spinlock_t purge_lock;
> + struct list_head purge_list;
> +};



2018-09-16 21:21:21

by Boris Ostrovsky

Subject: Re: [PATCH 2/6] xenbus: implement the xenwatch multithreading framework



On 9/14/18 3:34 AM, Dongli Zhang wrote:
>
> +
> +/* Running in the context of default xenwatch kthread. */
> +void mtwatch_create_domain(domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> +
> + if (!domid) {
> + pr_err("Default xenwatch thread is for dom0\n");
> + return;
> + }
> +
> + spin_lock(&mtwatch_info->domain_lock);
> +
> + domain = mtwatch_find_domain(domid);
> + if (domain) {
> + atomic_inc(&domain->refcnt);
> + spin_unlock(&mtwatch_info->domain_lock);
> + return;
> + }
> +
> + domain = kzalloc(sizeof(*domain), GFP_ATOMIC);

Is there a reason (besides this being done under spinlock) for using
GFP_ATOMIC? If domain_lock is the only reason I'd probably drop the lock
and do GFP_KERNEL.


> + if (!domain) {
> + spin_unlock(&mtwatch_info->domain_lock);
> + pr_err("Failed to allocate memory for mtwatch thread %d\n",
> + domid);
> + return;

This needs to return an error code, I think. Or do you want to fall back
to shared xenwatch thread?


> + }
> +
> + domain->domid = domid;
> + atomic_set(&domain->refcnt, 1);
> + mutex_init(&domain->domain_mutex);
> + INIT_LIST_HEAD(&domain->purge_node);
> +
> + init_waitqueue_head(&domain->events_wq);
> + spin_lock_init(&domain->events_lock);
> + INIT_LIST_HEAD(&domain->events);
> +
> + list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
> +
> + hlist_add_head_rcu(&domain->hash_node,
> + &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
> +
> + spin_unlock(&mtwatch_info->domain_lock);
> +
> + domain->task = kthread_run(mtwatch_thread, domain,
> + "xen-mtwatch-%d", domid);
> +
> + if (!domain->task) {
> + pr_err("mtwatch kthread creation is failed\n");
> + domain->state = MTWATCH_DOMAIN_DOWN;


Why not clean up right here?

> +
> + return;
> + }
> +
> + domain->state = MTWATCH_DOMAIN_UP;
> +}
> +


> +
> void unregister_xenbus_watch(struct xenbus_watch *watch)
> {
> struct xs_watch_event *event, *tmp;
> @@ -831,6 +1100,9 @@ void unregister_xenbus_watch(struct xenbus_watch *watch)
>
> if (current->pid != xenwatch_pid)
> mutex_unlock(&xenwatch_mutex);
> +
> + if (xen_mtwatch && watch->get_domid)
> + unregister_mtwatch(watch);


I may not be understanding the logic flow here, but if we successfully
removed/unregistered/purged the watch from mtwatch lists, do we still
need to try removing it from watch_events list below?


-boris


2018-09-17 01:20:59

by Dongli Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

Hi Boris,

On 09/17/2018 04:17 AM, Boris Ostrovsky wrote:
>
>
> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>
>> +
>> +struct mtwatch_info {
>> + /*
>> + * The mtwatch_domain is put on both a hash table and a list.
>> + * domain_list is used to optimize xenbus_watch un-registration.
>> + *
>> + * The mtwatch_domain is removed from domain_hash (with state set
>> + * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
>> + * left on domain_list until all events belonging to such
>> + * mtwatch_domain are processed in mtwatch_thread().
>
>
> Do we really need to keep mtwatch_domain on both lists? Why is keeping it on,
> say, only the hash not sufficient?

In the current xenbus implementation, when a watch is unregistered (e.g.,
in unregister_xenbus_watch()), we need to traverse the list 'watch_events'
to remove all inflight/pending events (for that watch) from 'watch_events'.

With this patch set, as each domain has its own event list, we need to
traverse the list of each domain to remove the pending events for the watch
being unregistered.

E.g.,
unregister_xenbus_watch()-->unregister_mtwatch()-->unregister_all_mtwatch() in
[PATCH 2/6] xenbus: implement the xenwatch multithreading framework.

Traversing a hash table is not as efficient as traversing a list. That's
why a domain is kept on both the hash table and the list.

>
>> + *
>> + * While there may exist two mtwatch_domain with the same domid on
>> + * domain_list simultaneously,
>
>
> How is it possible to have two domids on the list at the same time? Don't you
> want to remove it (which IIUIC means moving it to the purge list) when domain is
> destroyed?

Here is one case (suppose the domid/frontend-id is 9):

1. Suppose the last pv driver device is removed from domid=9; the reference
count of the per-domU xenwatch thread for domid=9 (which we call the old
thread below) drops to zero, so the thread should be removed. We remove it
from the hash table (it is left on the list).

Here we cannot remove the domain from the list immediately, because there
might be pending events still being processed by the corresponding per-domU
xenwatch thread. If we removed it from the list while a related watch is
being unregistered, as mentioned for the last question, we might hit a page
fault when processing a watch event.

2. Now the administrator attaches a new pv device to domid=9 immediately,
so the reference count is initially set to 1. The per-domU xenwatch thread
for domid=9 (called the new thread) is created again. It is inserted into
both the hash table and the list.

3. As the old thread for domid=9 might still be on the list, we would have
two threads for domid=9 (the old one to be removed and the newly inserted
one to be used by new pv devices).

Dongli Zhang

>
>
> -boris
>
>
>> + * all mtwatch_domain on domain_hash
>> + * should have unique domid.
>> + */
>> + spinlock_t domain_lock;
>> + struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
>> + struct list_head domain_list;
>> +
>> + /*
>> + * When a per-domU kthread is going to be destroyed, it is put
>> + * on the purge_list, and will be flushed by purge_work later.
>> + */
>> + struct work_struct purge_work;
>> + spinlock_t purge_lock;
>> + struct list_head purge_list;
>> +};
>
>

2018-09-17 01:48:39

by Dongli Zhang

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 2/6] xenbus: implement the xenwatch multithreading framework

Hi Boris,

On 09/17/2018 05:20 AM, Boris Ostrovsky wrote:
>
>
> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>>
>> +
>> +/* Running in the context of default xenwatch kthread. */
>> +void mtwatch_create_domain(domid_t domid)
>> +{
>> + struct mtwatch_domain *domain;
>> +
>> + if (!domid) {
>> + pr_err("Default xenwatch thread is for dom0\n");
>> + return;
>> + }
>> +
>> + spin_lock(&mtwatch_info->domain_lock);
>> +
>> + domain = mtwatch_find_domain(domid);
>> + if (domain) {
>> + atomic_inc(&domain->refcnt);
>> + spin_unlock(&mtwatch_info->domain_lock);
>> + return;
>> + }
>> +
>> + domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
>
> Is there a reason (besides this being done under spinlock) for using GFP_ATOMIC?
> If domain_lock is the only reason I'd probably drop the lock and do GFP_KERNEL.

spin_lock is the reason.

Would you like to switch to a mutex here?

>
>
>> + if (!domain) {
>> + spin_unlock(&mtwatch_info->domain_lock);
>> + pr_err("Failed to allocate memory for mtwatch thread %d\n",
>> + domid);
>> + return;
>
> This needs to return an error code, I think. Or do you want to fall back to
> shared xenwatch thread?

We would fall back to the shared default xenwatch thread. As in [PATCH 3/6], the
event is dispatched to the shared xenwatch thread if the per-domU one is not
available.

>
>
>> + }
>> +
>> + domain->domid = domid;
>> + atomic_set(&domain->refcnt, 1);
>> + mutex_init(&domain->domain_mutex);
>> + INIT_LIST_HEAD(&domain->purge_node);
>> +
>> + init_waitqueue_head(&domain->events_wq);
>> + spin_lock_init(&domain->events_lock);
>> + INIT_LIST_HEAD(&domain->events);
>> +
>> + list_add_tail_rcu(&domain->list_node, &mtwatch_info->domain_list);
>> +
>> + hlist_add_head_rcu(&domain->hash_node,
>> + &mtwatch_info->domain_hash[MTWATCH_HASH(domid)]);
>> +
>> + spin_unlock(&mtwatch_info->domain_lock);
>> +
>> + domain->task = kthread_run(mtwatch_thread, domain,
>> + "xen-mtwatch-%d", domid);
>> +
>> + if (!domain->task) {
>> + pr_err("mtwatch kthread creation is failed\n");
>> + domain->state = MTWATCH_DOMAIN_DOWN;
>
>
> Why not clean up right here?

I used to think there might be a race between mtwatch_create_domain() and
mtwatch_put_domain().

Just realized the race is impossible. I will clean up here.

>
>> +
>> + return;
>> + }
>> +
>> + domain->state = MTWATCH_DOMAIN_UP;
>> +}
>> +
>
>
>> +
>> void unregister_xenbus_watch(struct xenbus_watch *watch)
>> {
>> struct xs_watch_event *event, *tmp;
>> @@ -831,6 +1100,9 @@ void unregister_xenbus_watch(struct xenbus_watch *watch)
>> if (current->pid != xenwatch_pid)
>> mutex_unlock(&xenwatch_mutex);
>> +
>> + if (xen_mtwatch && watch->get_domid)
>> + unregister_mtwatch(watch);
>
>
> I may not be understanding the logic flow here, but if we successfully
> removed/unregistered/purged the watch from mtwatch lists, do we still need to
> try removing it from watch_events list below?

The earlier part of the original unregister_xenbus_watch() has already
removed the pending events from watch_events before the lines added above.


Dongli Zhang

>
>
> -boris
>
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel

2018-09-17 19:07:27

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

On 9/16/18 9:20 PM, Dongli Zhang wrote:
> Hi Boris,
>
> On 09/17/2018 04:17 AM, Boris Ostrovsky wrote:
>>
>> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>>
>>> +
>>> +struct mtwatch_info {
>>> + /*
>>> + * The mtwatch_domain is put on both a hash table and a list.
>>> + * domain_list is used to optimize xenbus_watch un-registration.
>>> + *
>>> + * The mtwatch_domain is removed from domain_hash (with state set
>>> + * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
>>> + * left on domain_list until all events belonging to such
>>> + * mtwatch_domain are processed in mtwatch_thread().
>>
>> Do we really need to keep mtwatch_domain on both lists? Why is keeping it on,
>> say, only the hash not sufficient?
> In the state of the art xenbus, when a watch is unregistered (e.g.,
> unregister_xenbus_watch()), we need to traverse the list 'watch_events' to
> remove all inflight/pending events (for such watch) from 'watch_events'.
>
> About this patch set, as each domain would have its own event list, we need to
> traverse the list of each domain to remove the pending events for the watch to
> be unregistered.
>
> E.g.,
> unregister_xenbus_watch()-->unregister_mtwatch()-->unregister_all_mtwatch() in
> [PATCH 2/6] xenbus: implement the xenwatch multithreading framework.
>
> To traverse a hash table is not as efficient as traversing a list. That's why a
> domain is kept on both the hash table and list.


Keeping the same object on two lists also has costs. More importantly,
IMO, it increases the chances of introducing a bug when people update one
instance but not the other.


>
>>> + *
>>> + * While there may exist two mtwatch_domain with the same domid on
>>> + * domain_list simultaneously,
>>
>> How is it possible to have two domids on the list at the same time? Don't you
>> want to remove it (which IIUIC means moving it to the purge list) when domain is
>> destroyed?
> Here is one case (suppose the domid/frontend-id is 9):
>
> 1. Suppose the last pv driver device is removed from domid=9, and therefore the
> reference count of per-domU xenwatch thread for domid=9 (which we call as old
> thread below) should be removed. We remove it from hash table (it is left in the
> list).
>
> Here we cannot remove the domain from the list immediately because there might
> be pending events being processed by the corresponding per-domU xenwatch thread.
> If we remove it from the list while there is related watch being unregistered as
> mentioned for last question, we may hit page fault when processing watch event.


Don't you need to grab domain->domain_mutex to remove the driver?
Meaning that events for that mtwatch thread cannot be processed?

In any case, I think that having two mtwatch_domains for the same domain
should be avoided. (and if you keep the hash list only then this issue
gets resolved automatically ;-))


-boris


>
> 2. Now the administrator attaches new pv device to domid=9 immediately and
> therefore reference count is initially set to 1. The per-domU xenwatch thread
> for domid=9 (called new thread) is created again. It is inserted to both the
> hash table and list.
>
> 3. As the old thread for domid=9 might still be on the list, we would have two
> threads for domid=9 (one old one to be removed and one newly inserted one to be
> used by new pv devices).
>
> Dongli Zhang
>
>>
>> -boris
>>
>>
>>> + * all mtwatch_domain on domain_hash
>>> + * should have unique domid.
>>> + */
>>> + spinlock_t domain_lock;
>>> + struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
>>> + struct list_head domain_list;
>>> +
>>> + /*
>>> + * When a per-domU kthread is going to be destroyed, it is put
>>> + * on the purge_list, and will be flushed by purge_work later.
>>> + */
>>> + struct work_struct purge_work;
>>> + spinlock_t purge_lock;
>>> + struct list_head purge_list;
>>> +};
>>


2018-09-17 19:59:27

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 2/6] xenbus: implement the xenwatch multithreading framework

On 9/16/18 9:48 PM, Dongli Zhang wrote:
> Hi Boris,
>
> On 09/17/2018 05:20 AM, Boris Ostrovsky wrote:
>>
>> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>>> +
>>> +/* Running in the context of default xenwatch kthread. */
>>> +void mtwatch_create_domain(domid_t domid)
>>> +{
>>> + struct mtwatch_domain *domain;
>>> +
>>> + if (!domid) {
>>> + pr_err("Default xenwatch thread is for dom0\n");
>>> + return;
>>> + }
>>> +
>>> + spin_lock(&mtwatch_info->domain_lock);
>>> +
>>> + domain = mtwatch_find_domain(domid);
>>> + if (domain) {
>>> + atomic_inc(&domain->refcnt);
>>> + spin_unlock(&mtwatch_info->domain_lock);
>>> + return;
>>> + }
>>> +
>>> + domain = kzalloc(sizeof(*domain), GFP_ATOMIC);
>> Is there a reason (besides this being done under spinlock) for using GFP_ATOMIC?
>> If domain_lock is the only reason I'd probably drop the lock and do GFP_KERNEL.
> spin_lock is the reason.
>
> Would you like to switch to a mutex here?

I'd use mutex.

-boris


2018-09-17 20:10:12

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [PATCH 3/6] xenbus: dispatch per-domU watch event to per-domU xenwatch thread

On 9/14/18 3:34 AM, Dongli Zhang wrote:
>
> +static int xs_watch_insert_event(struct xs_watch_event *event, domid_t domid)
> +{
> + struct mtwatch_domain *domain;
> + int ret = -1;

-ENODEV maybe?

-boris

> +
> + rcu_read_lock();
> +
> + domain = mtwatch_find_domain(domid);
> + if (!domain) {
> + rcu_read_unlock();
> + return ret;
> + }
> +
> + spin_lock(&domain->events_lock);
> + if (domain->state == MTWATCH_DOMAIN_UP) {
> + list_add_tail(&event->list, &domain->events);
> + wake_up(&domain->events_wq);
> + ret = 0;
> + }
> + spin_unlock(&domain->events_lock);
> +
> + rcu_read_unlock();
> +
> + return ret;
> +}
> +
>


2018-09-19 06:17:09

by Dongli Zhang

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

Hi Juergen,

On 09/14/2018 10:44 PM, Juergen Gross wrote:
> On 14/09/18 16:29, Dongli Zhang wrote:
>> Hi Juergen,
>>
>> On 09/14/2018 10:26 PM, Juergen Gross wrote:
>>> On 14/09/18 16:18, Dongli Zhang wrote:
>>>> Hi Juergen,
>>>>
>>>> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>>>>> On 14/09/18 09:34, Dongli Zhang wrote:
>>>>>> This is the 5th patch of a (6-patch) patch set.
>>>>>>
>>>>>> With this patch, watch event in relative path pattern
>>>>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>>>>
>>>>> superfluous "i" ----------^
>>>>>
>>>>>> thread.
>>>>>>
>>>>>> Signed-off-by: Dongli Zhang <[email protected]>
>>>>>> ---
>>>>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>>>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>>>>> include/xen/xenbus.h | 2 ++
>>>>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>>>>> index ba0644c..aa1b15a 100644
>>>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>>>>
>>>>>> -static unsigned int char_count(const char *str, char c)
>>>>>> +unsigned int char_count(const char *str, char c)
>>>>>
>>>>> Please change the name of the function when making it globally
>>>>> visible, e.g. by prefixing "xenbus_".
>>>>>
>>>>> Generally I think you don't need to use it below.
>>>>>
>>>>>> {
>>>>>> unsigned int i, ret = 0;
>>>>>>
>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>> index b0bed4f..50df86a 100644
>>>>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>>>>> xenbus_dev_changed(path, &xenbus_backend);
>>>>>> }
>>>>>>
>>>>>> +static domid_t path_to_domid(const char *path)
>>>>>> +{
>>>>>> + const char *p = path;
>>>>>> + domid_t domid = 0;
>>>>>> +
>>>>>> + while (*p) {
>>>>>> + if (*p < '0' || *p > '9')
>>>>>> + break;
>>>>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>>>>
>>>>> reinventing atoi()?
>>>>>
>>>>> Please don't do that. kstrtou16() seems to be a perfect fit.
>>>>
>>>> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
>>>> if the input string contains non-numerical characters.
>>>>
>>>> E.g., an example input can be "1/0/state", where 1 is the otherend_id
>>>> (frontend_id) and 0 is the handle.
>>>>
>>>> When "1/0/state" is used as input, kstrtou16() returns 0 (returned integer) and
>>>> -22 (error).
>>>
>>> Aah, okay. Then simple_strtoul()?
>>
>> I did consider simple_strtoul() initially. Unfortunately, it is obsolete (below
>> line 81). AFAIR, the patch would not be able to pass the checkpatch script when
>> this function is used.
>
> Better use that than open coding a new instance of it.
>
> Another variant would be to use sscanf() or similar. Then you could even
> drop using strchr() by adding that in the format string:
>
> return (sscanf(path, "%*u/%u/", &domid) == 1) ? domid : DOMID_SELF;

I recall what happened.

Suppose a sample path is "backend/vif/19/3/state" (we would like to obtain
domid=19).

Initially I wanted to use sscanf(path, "backend/%*[a-z]/%hu/%*u") to obtain
the domid from the xenstore path in one call.

Unfortunately, unlike the userspace sscanf(), the version in the linux kernel
does not support '[', so I was not able to use "%*[a-z]" in sscanf() in the
linux kernel.

Finally, I decided to skip two '/' to reach "19/3/state", but forgot to keep
using sscanf() and implemented my own parser :(

With sscanf(), I think I still need to skip two '/' (i.e., "backend/<pv
device>/") in order to reach "<domid>/xxxxxxx...":

1. return if the number of '/' in the path is less than 2

2. skip two '/' (e.g., "backend/vif/")

3. process the rest of the path via sscanf().

Dongli Zhang



2018-09-19 08:02:03

by Jürgen Groß

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

On 19/09/18 08:15, Dongli Zhang wrote:
> Hi Juergen,
>
> On 09/14/2018 10:44 PM, Juergen Gross wrote:
>> On 14/09/18 16:29, Dongli Zhang wrote:
>>> Hi Juergen,
>>>
>>> On 09/14/2018 10:26 PM, Juergen Gross wrote:
>>>> On 14/09/18 16:18, Dongli Zhang wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>>>>>> On 14/09/18 09:34, Dongli Zhang wrote:
>>>>>>> This is the 5th patch of a (6-patch) patch set.
>>>>>>>
>>>>>>> With this patch, watch event in relative path pattern
>>>>>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>>>>>
>>>>>> superfluous "i" ----------^
>>>>>>
>>>>>>> thread.
>>>>>>>
>>>>>>> Signed-off-by: Dongli Zhang <[email protected]>
>>>>>>> ---
>>>>>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>>>>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>>>>>> include/xen/xenbus.h | 2 ++
>>>>>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>>>>>> index ba0644c..aa1b15a 100644
>>>>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>>>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>>>>>> }
>>>>>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>>>>>
>>>>>>> -static unsigned int char_count(const char *str, char c)
>>>>>>> +unsigned int char_count(const char *str, char c)
>>>>>>
>>>>>> Please change the name of the function when making it globally
>>>>>> visible, e.g. by prefixing "xenbus_".
>>>>>>
>>>>>> Generally I think you don't need to use it below.
>>>>>>
>>>>>>> {
>>>>>>> unsigned int i, ret = 0;
>>>>>>>
>>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>> index b0bed4f..50df86a 100644
>>>>>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>>>>>> xenbus_dev_changed(path, &xenbus_backend);
>>>>>>> }
>>>>>>>
>>>>>>> +static domid_t path_to_domid(const char *path)
>>>>>>> +{
>>>>>>> + const char *p = path;
>>>>>>> + domid_t domid = 0;
>>>>>>> +
>>>>>>> + while (*p) {
>>>>>>> + if (*p < '0' || *p > '9')
>>>>>>> + break;
>>>>>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>>>>>
>>>>>> reinventing atoi()?
>>>>>>
>>>>>> Please don't do that. kstrtou16() seems to be a perfect fit.
>>>>>
>>>>> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
>>>>> if the input string contains non-numerical characters.
>>>>>
>>>>> E.g., the example of input can be "1/0/state", where 1 is fotherend_id
>>>>> (frontend_id) and 0 is handle.
>>>>>
>>>>> When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
>>>>> -22 (error).
>>>>
>>>> Aah, okay. Then simple_strtoul()?
>>>
>>> I did consider simple_strtoul() initially. Unfortunately, it is obsolete (below
>>> line 81). AFAIR, the patch would not be able to pass the check_patch script when
>>> this function is used.
>>
>> Better use that than open coding a new instance of it.
>>
>> Another variant would be to use sscanf() or similar. Then you could even
>> drop using strchr() by adding that in the format string:
>>
>> return (sscanf(path, "%*u/%u/", &domid) == 1) ? domid : DOMID_SELF;
>
> I recall what happened.
>
> Suppose one sample of path is "backend/vif/19/3/state". (we would like to obtain
> domid=19)
>
> Initially I would like to use sscanf(path, "backend/%*[a-z]/%hu/%*u") to obtain
> the domid from xenstore path in one call.
>
> Unfortunately, unlike userspace sscanf(), the version in linux kernel does not
> support '[' so that I would not be able to use "%*[a-z]" in sscanf() in linux
> kernel.

That is not correct. It doesn't support ranges in [], but it is
perfectly fine to use %[^/]. This requires a temporary buffer, as
%*[ isn't supported.

Why don't you use:

char temp[16];

...

/* kernel sscanf() %[] doesn't support '*' modifier and needs length. */
sscanf(path, "backend/%15[^/]%hu/%*u", temp, &domid)


Juergen

2018-09-19 12:28:29

by Dongli Zhang

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

Hi Juergen,

On 09/19/2018 04:01 PM, Juergen Gross wrote:
> On 19/09/18 08:15, Dongli Zhang wrote:
>> Hi Juergen,
>>
>> On 09/14/2018 10:44 PM, Juergen Gross wrote:
>>> On 14/09/18 16:29, Dongli Zhang wrote:
>>>> Hi Juergen,
>>>>
>>>> On 09/14/2018 10:26 PM, Juergen Gross wrote:
>>>>> On 14/09/18 16:18, Dongli Zhang wrote:
>>>>>> Hi Juergen,
>>>>>>
>>>>>> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>>>>>>> On 14/09/18 09:34, Dongli Zhang wrote:
>>>>>>>> This is the 5th patch of a (6-patch) patch set.
>>>>>>>>
>>>>>>>> With this patch, watch event in relative path pattern
>>>>>>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>>>>>>
>>>>>>> superfluous "i" ----------^
>>>>>>>
>>>>>>>> thread.
>>>>>>>>
>>>>>>>> Signed-off-by: Dongli Zhang <[email protected]>
>>>>>>>> ---
>>>>>>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>>>>>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>>>>>>> include/xen/xenbus.h | 2 ++
>>>>>>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>>>>>>> index ba0644c..aa1b15a 100644
>>>>>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>>>>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>>>>>>> }
>>>>>>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>>>>>>
>>>>>>>> -static unsigned int char_count(const char *str, char c)
>>>>>>>> +unsigned int char_count(const char *str, char c)
>>>>>>>
>>>>>>> Please change the name of the function when making it globally
>>>>>>> visible, e.g. by prefixing "xenbus_".
>>>>>>>
>>>>>>> Generally I think you don't need to use it below.
>>>>>>>
>>>>>>>> {
>>>>>>>> unsigned int i, ret = 0;
>>>>>>>>
>>>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>>> index b0bed4f..50df86a 100644
>>>>>>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>>>>>>> xenbus_dev_changed(path, &xenbus_backend);
>>>>>>>> }
>>>>>>>>
>>>>>>>> +static domid_t path_to_domid(const char *path)
>>>>>>>> +{
>>>>>>>> + const char *p = path;
>>>>>>>> + domid_t domid = 0;
>>>>>>>> +
>>>>>>>> + while (*p) {
>>>>>>>> + if (*p < '0' || *p > '9')
>>>>>>>> + break;
>>>>>>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>>>>>>
>>>>>>> reinventing atoi()?
>>>>>>>
>>>>>>> Please don't do that. kstrtou16() seems to be a perfect fit.
>>>>>>
>>>>>> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
>>>>>> if the input string contains non-numerical characters.
>>>>>>
>>>>>> E.g., the example of input can be "1/0/state", where 1 is fotherend_id
>>>>>> (frontend_id) and 0 is handle.
>>>>>>
>>>>>> When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
>>>>>> -22 (error).
>>>>>
>>>>> Aah, okay. Then simple_strtoul()?
>>>>
>>>> I did consider simple_strtoul() initially. Unfortunately, it is obsolete (below
>>>> line 81). AFAIR, the patch would not be able to pass the check_patch script when
>>>> this function is used.
>>>
>>> Better use that than open coding a new instance of it.
>>>
>>> Another variant would be to use sscanf() or similar. Then you could even
>>> drop using strchr() by adding that in the format string:
>>>
>>> return (sscanf(path, "%*u/%u/", &domid) == 1) ? domid : DOMID_SELF;
>>
>> I recall what happened.
>>
>> Suppose one sample of path is "backend/vif/19/3/state". (we would like to obtain
>> domid=19)
>>
>> Initially I would like to use sscanf(path, "backend/%*[a-z]/%hu/%*u") to obtain
>> the domid from xenstore path in one call.
>>
>> Unfortunately, unlike userspace sscanf(), the version in linux kernel does not
>> support '[' so that I would not be able to use "%*[a-z]" in sscanf() in linux
>> kernel.
>
> That is not correct. It doesn't support ranges in [], but it is
> perfectly fine to use %[^/]. This requires a temporary buffer, as
> %*[ isn't supported.
>
> Why don't you use:
>
> char temp[16];
>
> ...
>
> /* kernel sscanf() %[] doesn't support '*' modifier and needs length. */
> sscanf(path, "backend/%15[^/]%hu/%*u", temp, &domid)
>

One '/' between "%15[^/]" and "%hu" is missing.

We should use "backend/%15[^/]/%hu/%*u" instead.

It seems this is supported since commit f9310b2f9a19b7f16c7b1c1558f8b649b9b933c1;
only kernels since v4.6 support this feature.

I should avoid using the old ubuntu 4.4.0 kernel to test such features next time :(

Thank you very much for your help!

As the "devicetype[32]" in struct xenbus_device_id is of size 32, should I use
temp[32] instead of temp[16]?

Dongli Zhang

2018-09-19 12:44:45

by Jürgen Groß

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 5/6] xenbus: process be_watch events in xenwatch multithreading

On 19/09/18 14:27, Dongli Zhang wrote:
> Hi Juergen,
>
> On 09/19/2018 04:01 PM, Juergen Gross wrote:
>> On 19/09/18 08:15, Dongli Zhang wrote:
>>> Hi Juergen,
>>>
>>> On 09/14/2018 10:44 PM, Juergen Gross wrote:
>>>> On 14/09/18 16:29, Dongli Zhang wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 09/14/2018 10:26 PM, Juergen Gross wrote:
>>>>>> On 14/09/18 16:18, Dongli Zhang wrote:
>>>>>>> Hi Juergen,
>>>>>>>
>>>>>>> On 09/14/2018 05:12 PM, Juergen Gross wrote:
>>>>>>>> On 14/09/18 09:34, Dongli Zhang wrote:
>>>>>>>>> This is the 5th patch of a (6-patch) patch set.
>>>>>>>>>
>>>>>>>>> With this patch, watch event in relative path pattern
>>>>>>>>> 'backend/<pvdev>/<domid>i/...' can be processed in per-domU xenwatch
>>>>>>>>
>>>>>>>> superfluous "i" ----------^
>>>>>>>>
>>>>>>>>> thread.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Dongli Zhang <[email protected]>
>>>>>>>>> ---
>>>>>>>>> drivers/xen/xenbus/xenbus_probe.c | 2 +-
>>>>>>>>> drivers/xen/xenbus/xenbus_probe_backend.c | 32 +++++++++++++++++++++++++++++++
>>>>>>>>> include/xen/xenbus.h | 2 ++
>>>>>>>>> 3 files changed, 35 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
>>>>>>>>> index ba0644c..aa1b15a 100644
>>>>>>>>> --- a/drivers/xen/xenbus/xenbus_probe.c
>>>>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>>>>>>>>> @@ -552,7 +552,7 @@ int xenbus_probe_devices(struct xen_bus_type *bus)
>>>>>>>>> }
>>>>>>>>> EXPORT_SYMBOL_GPL(xenbus_probe_devices);
>>>>>>>>>
>>>>>>>>> -static unsigned int char_count(const char *str, char c)
>>>>>>>>> +unsigned int char_count(const char *str, char c)
>>>>>>>>
>>>>>>>> Please change the name of the function when making it globally
>>>>>>>> visible, e.g. by prefixing "xenbus_".
>>>>>>>>
>>>>>>>> Generally I think you don't need to use it below.
>>>>>>>>
>>>>>>>>> {
>>>>>>>>> unsigned int i, ret = 0;
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/xen/xenbus/xenbus_probe_backend.c b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>>>> index b0bed4f..50df86a 100644
>>>>>>>>> --- a/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>>>> +++ b/drivers/xen/xenbus/xenbus_probe_backend.c
>>>>>>>>> @@ -211,9 +211,41 @@ static void backend_changed(struct xenbus_watch *watch,
>>>>>>>>> xenbus_dev_changed(path, &xenbus_backend);
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> +static domid_t path_to_domid(const char *path)
>>>>>>>>> +{
>>>>>>>>> + const char *p = path;
>>>>>>>>> + domid_t domid = 0;
>>>>>>>>> +
>>>>>>>>> + while (*p) {
>>>>>>>>> + if (*p < '0' || *p > '9')
>>>>>>>>> + break;
>>>>>>>>> + domid = (domid << 3) + (domid << 1) + (*p - '0');
>>>>>>>>
>>>>>>>> reinventing atoi()?
>>>>>>>>
>>>>>>>> Please don't do that. kstrtou16() seems to be a perfect fit.
>>>>>>>
>>>>>>> I did use kstrtou*() in the early prototype and realized kstrtou16() returns 0
>>>>>>> if the input string contains non-numerical characters.
>>>>>>>
>>>>>>> E.g., the example of input can be "1/0/state", where 1 is fotherend_id
>>>>>>> (frontend_id) and 0 is handle.
>>>>>>>
>>>>>>> When "1/0/state" is used at input, kstrtou16() returns 0 (returned integer) and
>>>>>>> -22 (error).
>>>>>>
>>>>>> Aah, okay. Then simple_strtoul()?
>>>>>
>>>>> I did consider simple_strtoul() initially. Unfortunately, it is obsolete (below
>>>>> line 81). AFAIR, the patch would not be able to pass the check_patch script when
>>>>> this function is used.
>>>>
>>>> Better use that than open coding a new instance of it.
>>>>
>>>> Another variant would be to use sscanf() or similar. Then you could even
>>>> drop using strchr() by adding that in the format string:
>>>>
>>>> return (sscanf(path, "%*u/%u/", &domid) == 1) ? domid : DOMID_SELF;
>>>
>>> I recall what happened.
>>>
>>> Suppose one sample of path is "backend/vif/19/3/state". (we would like to obtain
>>> domid=19)
>>>
>>> Initially I would like to use sscanf(path, "backend/%*[a-z]/%hu/%*u") to obtain
>>> the domid from xenstore path in one call.
>>>
>>> Unfortunately, unlike userspace sscanf(), the version in linux kernel does not
>>> support '[' so that I would not be able to use "%*[a-z]" in sscanf() in linux
>>> kernel.
>>
>> That is not correct. It doesn't support ranges in [], but it is
>> perfectly fine to use %[^/]. This requires a temporary buffer, as
>> %*[ isn't supported.
>>
>> Why don't you use:
>>
>> char temp[16];
>>
>> ...
>>
>> /* kernel sscanf() %[] doesn't support '*' modifier and needs length. */
>> sscanf(path, "backend/%15[^/]%hu/%*u", temp, &domid)
>>
>
> One '/' between "%15[^/]" and "%hu" is missing.
>
> We should use "backend/%15[^/]/%hu/%*u" instead.

Yes.

> It seems this is supported since commit f9310b2f9a19b7f16c7b1c1558f8b649b9b933c1.
> Only tags since v4.6 support this feature.
>
> I should avoid using the old Ubuntu 4.4.0 kernel to test such features next time :(

Right. I suggest to always use a very recent upstream kernel for
testing.

>
> Thank you very much for your help!
>
> As the "devicetype[32]" in struct xenbus_device_id is of size 32, should I use
> temp[32] instead of temp[16]?

Seems to be a good idea.


Juergen


2018-09-25 05:14:06

by Dongli Zhang

[permalink] [raw]
Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

Hi Boris,

On 09/18/2018 03:08 AM, Boris Ostrovsky wrote:
> On 9/16/18 9:20 PM, Dongli Zhang wrote:
>> Hi Boris,
>>
>> On 09/17/2018 04:17 AM, Boris Ostrovsky wrote:
>>>
>>> On 9/14/18 3:34 AM, Dongli Zhang wrote:
>>>
>>>> +
>>>> +struct mtwatch_info {
>>>> + /*
>>>> + * The mtwatch_domain is put on both a hash table and a list.
>>>> + * domain_list is used to optimize xenbus_watch un-registration.
>>>> + *
>>>> + * The mtwatch_domain is removed from domain_hash (with state set
>>>> + * to MTWATCH_DOMAIN_DOWN) when its refcnt is zero. However, it is
>>>> + * left on domain_list until all events belonging to such a
>>>> + * mtwatch_domain are processed in mtwatch_thread().
>>>
>>> Do we really need to keep mwatch_domain on both lists? Why is keeping it on,
>>> say, only the hash not sufficient?
>> In the current xenbus implementation, when a watch is unregistered (e.g.,
>> unregister_xenbus_watch()), we need to traverse the list 'watch_events' to
>> remove all inflight/pending events (for such watch) from 'watch_events'.
>>
>> About this patch set, as each domain would have its own event list, we need to
>> traverse the list of each domain to remove the pending events for the watch to
>> be unregistered.
>>
>> E.g.,
>> unregister_xenbus_watch()-->unregister_mtwatch()-->unregister_all_mtwatch() in
>> [PATCH 2/6] xenbus: implement the xenwatch multithreading framework.
>>
>> Traversing a hash table is not as efficient as traversing a list. That's why a
>> domain is kept on both the hash table and the list.
>
>
> Keeping the same object on two lists also has costs. More importantly
> IMO, it increases chances on introducing a bug when people update one
> instance but not the other.
>
>
>>
>>>> + *
>>>> + * While there may exist two mtwatch_domain with the same domid on
>>>> + * domain_list simultaneously,
>>>
>>> How is it possible to have two entries with the same domid on the list at the
>>> same time? Don't you
>>> want to remove it (which IIUIC means moving it to the purge list) when domain is
>>> destroyed?
>> Here is one case (suppose the domid/frontend-id is 9):
>>
>> 1. Suppose the last pv driver device is removed from domid=9, and therefore the
>> reference count of the per-domU xenwatch thread for domid=9 (which we call the
>> old thread below) drops to zero, so the thread should be removed. We remove it
>> from the hash table (it is left on the list).
>>
>> Here we cannot remove the domain from the list immediately because there might
>> be pending events being processed by the corresponding per-domU xenwatch thread.
>> If we remove it from the list while a related watch is being unregistered, as
>> mentioned in the last question, we may hit a page fault when processing a watch
>> event.
>
>
> Don't you need to grab domain->domain_mutex to remove the driver?
> Meaning that events for that mtwatch thread cannot be processed?

I think the domain_mutex is not required. Most of the domain_mutex related code
is copied from xenwatch_thread().

The prerequisite to removing a driver is that all pv backend devices belonging
to this driver are removed. Once a pv backend device is removed, I think we can
assume there should be no xenwatch event relying on that device anymore.

>
> In any case, I think that having two mtwatch_domains for the same domain
> should be avoided. (and if you keep the hash list only then this issue
> gets resolved automatically ;-))


So far we have: (1) domain hash table, (2) domain list (where duplicate entries
may exist) and (3) purge list.

Can I assume you would like to discard the domain list and only keep domain hash
table and purge list?

The purpose of the domain list is to let unregister_xenbus_watch() scan the
pending events for all otherend ids. Should we remove it? The Xen hypervisor
uses both a hash table and a list to manage struct domain.


About the duplicate entries in the domain list, can we change the flow like below:

1. Suppose the thread status is DOWN. To avoid having duplicate entries on the
domain list, instead of keeping the deprecated thread on the domain list (until
all its events get processed), we move it to the purge list immediately.

We will tag this deprecated thread so that the purge worker does not purge it
until all events belonging to that thread are processed. We cannot purge it
immediately because the worker thread (which purges the purge list) would hang
if the deprecated thread is already stuck.

In this flow, we may have duplicate entries on the purge list, but not on the
domain list.

2. During unregister_xenbus_watch(), we need to scan both the domain list and
the purge list to remove pending events for the watch. In the previous design,
we only scanned the domain list.


Another option is to allow the deprecated thread to be resurrected: we would
not move the thread to the purge list immediately when it is deprecated.

Suppose the thread for domid=9 is needed again: we would not create a new one
if the deprecated one for domid=9 is still on the domain list. Instead, we would
resurrect it, change its status and reuse it. In this way, we would not have
duplicate entries on the domain list.

I prefer the 1st option. I do not like resurrecting a deprecated thread.
Would you please let me know what you think about it?

Thank you very much!

Dongli Zhang




>
>
> -boris
>
>
>>
>> 2. Now the administrator attaches new pv device to domid=9 immediately and
>> therefore reference count is initially set to 1. The per-domU xenwatch thread
>> for domid=9 (called new thread) is created again. It is inserted to both the
>> hash table and list.
>>
>> 3. As the old thread for domid=9 might still be on the list, we would have two
>> threads for domid=9 (one old one to be removed and one newly inserted one to be
>> used by new pv devices).
>>
>> Dongli Zhang
>>
>>>
>>> -boris
>>>
>>>
>>>> + * all mtwatch_domain on domain_hash
>>>> + * should have unique domids.
>>>> + */
>>>> + spinlock_t domain_lock;
>>>> + struct hlist_head domain_hash[MTWATCH_HASH_SIZE];
>>>> + struct list_head domain_list;
>>>> +
>>>> + /*
>>>> + * When a per-domU kthread is going to be destroyed, it is put
>>>> + * on the purge_list, and will be flushed by purge_work later.
>>>> + */
>>>> + struct work_struct purge_work;
>>>> + spinlock_t purge_lock;
>>>> + struct list_head purge_list;
>>>> +};
>>>
>

2018-09-25 20:18:12

by Boris Ostrovsky

[permalink] [raw]
Subject: Re: [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

On 9/25/18 1:14 AM, Dongli Zhang wrote:
>
> So far we have: (1) domain hash table, (2) domain list (where duplicate entries
> may exist) and (3) purge list.
>
> Can I assume you would like to discard the domain list and only keep domain hash
> table and purge list?

Yes, that's what I was thinking.

>
> The purpose of the domain list is to let unregister_xenbus_watch() scan the
> pending events for all otherend ids. Should we remove it? The Xen hypervisor
> uses both a hash table and a list to manage struct domain.


Your concern is that searching for a pending event in the hash is not
especially efficient. Since you are hashing on domain_id, then
unregister_single_mtwatch() should be pretty fast if searching in the
hash table (in fact, it's faster than traversing the domain list).
unregister_all_mtwatch() will indeed take longer since you will be
checking empty buckets. Right?

How often do you expect we will call unregister_all_mtwatch() compared
to unregister_single_mtwatch()?


>
>
> About the duplicate entries in the domain list, can we change the flow like below:
>
> 1. Suppose the thread status is DOWN. To avoid having duplicate entries on the
> domain list, instead of keeping the deprecated thread on domain list (until all
> its events get processed), we move it to the purge list immediately.
>
> We will tag this deprecated thread so that the purge worker does not purge it
> until all events belonging to that thread are processed. We cannot purge it
> immediately because the worker thread (which purges the purge list) would hang
> if the deprecated thread is already stuck.
>
> In this flow, we may have duplicate entries on purge list, but not domain list.
>
> 2. During unregister_xenbus_watch(), we need to scan both the domain list and
> the purge list to remove pending events for the watch. In the previous design,
> we only scanned the domain list.
>
>
> Another option is to allow the deprecated thread to be resurrected: we would
> not move the thread to the purge list immediately when it is deprecated.
>
> Suppose the thread for domid=9 is needed again: we would not create a new one
> if the deprecated one for domid=9 is still on the domain list. Instead, we would
> resurrect it, change its status and reuse it. In this way, we would not have
> duplicate entries on the domain list.
>
> I prefer the 1st option. I do not like resurrecting a deprecated thread.
> Would you please let me know what you think about it?


Yes, I also think (1) is better --- you are walking the whole purge list
anyway.


-boris



2018-09-26 02:57:05

by Dongli Zhang

[permalink] [raw]
Subject: Re: [Xen-devel] [PATCH 1/6] xenbus: prepare data structures and parameter for xenwatch multithreading

Hi Boris,

On 09/26/2018 04:19 AM, Boris Ostrovsky wrote:
> On 9/25/18 1:14 AM, Dongli Zhang wrote:
>>
>> So far we have: (1) domain hash table, (2) domain list (where duplicate entries
>> may exist) and (3) purge list.
>>
>> Can I assume you would like to discard the domain list and only keep domain hash
>> table and purge list?
>
> Yes, that's what I was thinking.
>
>>
>> The purpose of the domain list is to let unregister_xenbus_watch() scan the
>> pending events for all otherend ids. Should we remove it? The Xen hypervisor
>> uses both a hash table and a list to manage struct domain.
>
>
> Your concern is that searching for a pending event in the hash is not
> especially efficient. Since you are hashing on domain_id, then
> unregister_single_mtwatch() should be pretty fast if searching in the
> hash table (in fact, it's faster than traversing the domain list).
> unregister_all_mtwatch() will indeed take longer since you will be
> checking empty buckets. Right?
>
> How often do you expect we will call unregister_all_mtwatch() compared
> to unregister_single_mtwatch()?

In the patch set v1, unregister_all_mtwatch() is indeed never called.


A watch registered globally seems to never (let's say "rarely") be unregistered.
E.g., the be_watch (at node 'backend') is registered during initialization and
will be used during the lifecycle of a backend domain (dom0 or driver domain).

I agree that unregister_all_mtwatch() is not called so far and will not be used
very often in the future. The performance overhead is trivial.

Therefore, I would discard the domain list. Only 'domain hash' and 'purge list'
will be used.

>
>
>>
>>
>> About the duplicate entries in the domain list, can we change the flow like below:
>>
>> 1. Suppose the thread status is DOWN. To avoid having duplicate entries on the
>> domain list, instead of keeping the deprecated thread on domain list (until all
>> its events get processed), we move it to the purge list immediately.
>>
>> We will tag this deprecated thread so that the purge list would not purge it
>> unless all events belong to such thread get processed. We cannot purge it
>> immediately because the worker thread (to purge the purge list) would hang if
>> the deprecated thread is already stuck.
>>
>> In this flow, we may have duplicate entries on purge list, but not domain list.
>>
>> 2. During unregister_xenbus_watch(), we need to scan both the domain list and
>> purge list to remove pending events for the watch. In previous design, we only
>> scan domain lit.
>>
>>
>> One option is we allow the deprecated thread to resurrect and we would not move
>> the thread to purge list immediately when the thread is deprecated.
>>
>> Suppose when thread for domid=9 is needed, we would not create new one if the
>> deprecated one for domid=9 is still in domain list. Instead, we would resurrect
>> it, change its status and reuse it again. In this way, we would not have
>> duplicate entries on the domain list.
>>
>> I like the 1st option. I do not like to resurrect a deprecated thread again.
>> Would you please let me know how you think about it?
>
>
> Yes, I also think (1) is better --- you are walking the whole purge list
> anyway.

I will take (1).

When unregister_all_mtwatch() is called: scan both the 'hash table' and the
'purge list'.

The entries on the purge list are classified (tagged) as either (1) can be
purged immediately, or (2) not to be purged until all of its pending events
complete.



Thank you very much!

Dongli Zhang

>
>
> -boris
>
>
>
> _______________________________________________
> Xen-devel mailing list
> [email protected]
> https://lists.xenproject.org/mailman/listinfo/xen-devel
>