Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp2030988pxb; Thu, 11 Feb 2021 02:27:26 -0800 (PST) X-Google-Smtp-Source: ABdhPJxOYqAOOghrTzusCj9bdjtMKfso4RZ6iqly2opqtR0HmQhwie9C+wN/wnUOlENkqxyNyxOl X-Received: by 2002:a05:6402:1152:: with SMTP id g18mr7859150edw.18.1613039246186; Thu, 11 Feb 2021 02:27:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1613039246; cv=none; d=google.com; s=arc-20160816; b=WhRklS9EO8At7K2SqjiHNER9YdyvSQudQmh/wSJC42V30fKn8oXXQN2misxr7/RSJ6 oiqhf6SuWrF7/79LuGDlz8zJQan2VXCbqqh4ePOYZu1ARAEXz7mRskoFUOYPMddxcxka ANAvJTpWS1abr8auj+/e4uq9p1VGBmheU9cLw+dqGgaBUu86n9zpWVSbc1Nu7Y1fJn8I CC32vzZJXyAMVnCM0QHqJ0h1GKWtrR8DUfQCyuF0ls0w3c2A1GAWLMThRtVNYtaQH3uZ XFPsg8eU1gRMWxNdPWNIQ6swScf4Hq6EFc/Fh9VTzVju9W0MI2BnFA4NsxFS+lY0sF3s /bQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=3cGioF3XhVwmtqHEwtIlRzTuYQpd1I+EFdYzeGvtMh0=; b=yDAMfTW1fkgpVAvYr/CfriyZ9e2z92mTIeDeqLQLkp9smj3IIpKhuINH1q09JX/44i h2Q5BaFHePAPG2552Nqx1afBHMy0YQsao92zT8Acn9b7ID51IvTjatFThi0kJAhRGhSf VC0W0OtJpDLRBlAHwhPgU7X9GIS1G9zv6fG7bEiX2byca00kRQ+b92lDKdCfbqfGNCJg 7GQPjMG2wMfm0iGKfb8OrgVOHJUIgLmLW1gE9G0wTtgVWsY7SzpeBn3mNK4CghDi+uvl 4GIQFIujMnIzbJSamezNufhzJ7LI7yp+/kVodbCMsxkf/a2Iu2ZPzMgv3aaklqSl2lHi Z+CA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=nJWGxbfs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=NONE dis=NONE) header.from=suse.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id m7si3507829edj.442.2021.02.11.02.27.02; Thu, 11 Feb 2021 02:27:26 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.com header.s=susede1 header.b=nJWGxbfs; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=NONE dis=NONE) header.from=suse.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230289AbhBKK0X (ORCPT + 99 others); Thu, 11 Feb 2021 05:26:23 -0500 Received: from mx2.suse.de ([195.135.220.15]:42182 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230093AbhBKKUg (ORCPT ); Thu, 11 Feb 2021 05:20:36 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613038661; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3cGioF3XhVwmtqHEwtIlRzTuYQpd1I+EFdYzeGvtMh0=; b=nJWGxbfsZic/fqYIe+mxFP821DzSm6m2XntEhzUHGLp8ReTw/1Z9p6Wf+Zog36NLKvRkvP aBBHU2g6Du1VAciaM66Ag1GwoVvwYFf2dWM8ssG4kS8RlcudQi1k9volZsavYT1ss5EL3j JZviBo2JsyCZQjZJXlTkzxRB8siacRs= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 3841FAF11; Thu, 11 Feb 2021 10:17:41 +0000 (UTC) From: Juergen Gross To: xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org Cc: Juergen Gross , Boris Ostrovsky , Stefano Stabellini Subject: [PATCH v2 6/8] xen/events: add per-xenbus device event statistics and settings Date: Thu, 11 Feb 2021 11:16:14 +0100 Message-Id: <20210211101616.13788-7-jgross@suse.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20210211101616.13788-1-jgross@suse.com> References: <20210211101616.13788-1-jgross@suse.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Add syfs nodes for each xenbus device showing event statistics (number of events and spurious events, number of associated event channels) and for setting a spurious event threshold in case a frontend is sending too many events without being rogue on purpose. Signed-off-by: Juergen Gross --- V2: - add documentation (Boris Ostrovsky) --- .../ABI/testing/sysfs-devices-xenbus | 41 ++++++++++++ drivers/xen/events/events_base.c | 27 +++++++- drivers/xen/xenbus/xenbus_probe.c | 66 +++++++++++++++++++ include/xen/xenbus.h | 7 ++ 4 files changed, 139 insertions(+), 2 deletions(-) create mode 100644 Documentation/ABI/testing/sysfs-devices-xenbus diff --git a/Documentation/ABI/testing/sysfs-devices-xenbus b/Documentation/ABI/testing/sysfs-devices-xenbus new file mode 100644 index 000000000000..fd796cb4f315 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-devices-xenbus @@ -0,0 +1,41 @@ +What: /sys/devices/*/xenbus/event_channels +Date: February 2021 +Contact: Xen Developers mailing list +Description: + Number of Xen event channels associated with a kernel based + paravirtualized device frontend or backend. + +What: /sys/devices/*/xenbus/events +Date: February 2021 +Contact: Xen Developers mailing list +Description: + Total number of Xen events received for a Xen pv device + frontend or backend. + +What: /sys/devices/*/xenbus/jiffies_eoi_delayed +Date: February 2021 +Contact: Xen Developers mailing list +Description: + Summed up time in jiffies the EOI of an interrupt for a Xen + pv device has been delayed in order to avoid stalls due to + event storms. This value rising is a first sign for a rogue + other end of the pv device. + +What: /sys/devices/*/xenbus/spurious_events +Date: February 2021 +Contact: Xen Developers mailing list +Description: + Number of events received for a Xen pv device which did not + require any action. Too many spurious events in a row will + trigger delayed EOI processing. + +What: /sys/devices/*/xenbus/spurious_threshold +Date: February 2021 +Contact: Xen Developers mailing list +Description: + Controls the tolerated number of subsequent spurious events + before delayed EOI processing is triggered for a Xen pv + device. Default is 1. This can be modified in case the other + end of the pv device is issuing spurious events on a regular + basis and is known not to be malicious on purpose. Raising + the value for such cases can improve pv device performance. diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index a5cce4c626c2..48210c1e62e3 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -332,6 +332,8 @@ static int xen_irq_info_evtchn_setup(unsigned irq, ret = xen_irq_info_common_setup(info, irq, IRQT_EVTCHN, evtchn, 0); info->u.interdomain = dev; + if (dev) + atomic_inc(&dev->event_channels); return ret; } @@ -606,18 +608,28 @@ static void xen_irq_lateeoi_locked(struct irq_info *info, bool spurious) return; if (spurious) { + struct xenbus_device *dev = info->u.interdomain; + unsigned int threshold = 1; + + if (dev && dev->spurious_threshold) + threshold = dev->spurious_threshold; + if ((1 << info->spurious_cnt) < (HZ << 2)) { if (info->spurious_cnt != 0xFF) info->spurious_cnt++; } - if (info->spurious_cnt > 1) { - delay = 1 << (info->spurious_cnt - 2); + if (info->spurious_cnt > threshold) { + delay = 1 << (info->spurious_cnt - 1 - threshold); if (delay > HZ) delay = HZ; if (!info->eoi_time) info->eoi_cpu = smp_processor_id(); info->eoi_time = get_jiffies_64() + delay; + if (dev) + atomic_add(delay, &dev->jiffies_eoi_delayed); } + if (dev) + atomic_inc(&dev->spurious_events); } else { info->spurious_cnt = 0; } @@ -950,6 +962,7 @@ static void __unbind_from_irq(unsigned int irq) if (VALID_EVTCHN(evtchn)) { unsigned int cpu = cpu_from_irq(irq); + struct xenbus_device *dev; xen_evtchn_close(evtchn); @@ -960,6 +973,11 @@ static void __unbind_from_irq(unsigned int irq) case IRQT_IPI: per_cpu(ipi_to_irq, cpu)[ipi_from_irq(irq)] = -1; break; + case IRQT_EVTCHN: + dev = info->u.interdomain; + if (dev) + atomic_dec(&dev->event_channels); + break; default: break; } @@ -1623,6 +1641,7 @@ void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl) { int irq; struct irq_info *info; + struct xenbus_device *dev; irq = get_evtchn_to_irq(port); if (irq == -1) @@ -1654,6 +1673,10 @@ void handle_irq_for_port(evtchn_port_t port, struct evtchn_loop_ctrl *ctrl) if (xchg_acquire(&info->is_active, 1)) return; + dev = (info->type == IRQT_EVTCHN) ? info->u.interdomain : NULL; + if (dev) + atomic_inc(&dev->events); + if (ctrl->defer_eoi) { info->eoi_cpu = smp_processor_id(); info->irq_epoch = __this_cpu_read(irq_epoch); diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c index 18ffd0551b54..9494ecad3c92 100644 --- a/drivers/xen/xenbus/xenbus_probe.c +++ b/drivers/xen/xenbus/xenbus_probe.c @@ -206,6 +206,65 @@ void xenbus_otherend_changed(struct xenbus_watch *watch, } EXPORT_SYMBOL_GPL(xenbus_otherend_changed); +#define XENBUS_SHOW_STAT(name) \ +static ssize_t show_##name(struct device *_dev, \ + struct device_attribute *attr, \ + char *buf) \ +{ \ + struct xenbus_device *dev = to_xenbus_device(_dev); \ + \ + return sprintf(buf, "%d\n", atomic_read(&dev->name)); \ +} \ +static DEVICE_ATTR(name, 0444, show_##name, NULL) + +XENBUS_SHOW_STAT(event_channels); +XENBUS_SHOW_STAT(events); +XENBUS_SHOW_STAT(spurious_events); +XENBUS_SHOW_STAT(jiffies_eoi_delayed); + +static ssize_t show_spurious_threshold(struct device *_dev, + struct device_attribute *attr, + char *buf) +{ + struct xenbus_device *dev = to_xenbus_device(_dev); + + return sprintf(buf, "%d\n", dev->spurious_threshold); +} + +static ssize_t set_spurious_threshold(struct device *_dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct xenbus_device *dev = to_xenbus_device(_dev); + unsigned int val; + ssize_t ret; + + ret = kstrtouint(buf, 0, &val); + if (ret) + return ret; + + dev->spurious_threshold = val; + + return count; +} + +static DEVICE_ATTR(spurious_threshold, 0644, show_spurious_threshold, + set_spurious_threshold); + +static struct attribute *xenbus_attrs[] = { + &dev_attr_event_channels.attr, + &dev_attr_events.attr, + &dev_attr_spurious_events.attr, + &dev_attr_jiffies_eoi_delayed.attr, + &dev_attr_spurious_threshold.attr, + NULL +}; + +static const struct attribute_group xenbus_group = { + .name = "xenbus", + .attrs = xenbus_attrs, +}; + int xenbus_dev_probe(struct device *_dev) { struct xenbus_device *dev = to_xenbus_device(_dev); @@ -253,6 +312,11 @@ int xenbus_dev_probe(struct device *_dev) return err; } + dev->spurious_threshold = 1; + if (sysfs_create_group(&dev->dev.kobj, &xenbus_group)) + dev_warn(&dev->dev, "sysfs_create_group on %s failed.\n", + dev->nodename); + return 0; fail_put: module_put(drv->driver.owner); @@ -269,6 +333,8 @@ int xenbus_dev_remove(struct device *_dev) DPRINTK("%s", dev->nodename); + sysfs_remove_group(&dev->dev.kobj, &xenbus_group); + free_otherend_watch(dev); if (drv->remove) { diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h index 2c43b0ef1e4d..13ee375a1f05 100644 --- a/include/xen/xenbus.h +++ b/include/xen/xenbus.h @@ -88,6 +88,13 @@ struct xenbus_device { struct completion down; struct work_struct work; struct semaphore reclaim_sem; + + /* Event channel based statistics and settings. */ + atomic_t event_channels; + atomic_t events; + atomic_t spurious_events; + atomic_t jiffies_eoi_delayed; + unsigned int spurious_threshold; }; static inline struct xenbus_device *to_xenbus_device(struct device *dev) -- 2.26.2