Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp1240422ybb; Wed, 1 Apr 2020 19:10:21 -0700 (PDT) X-Google-Smtp-Source: APiQypLvhW/ErcyZZ+ybPidR1LzGu2RTks9XLQBF4cqetfZq352zZ+TjtWnF9P5XMXpMjcSYy4RY X-Received: by 2002:a05:6830:2314:: with SMTP id u20mr670972ote.166.1585793420841; Wed, 01 Apr 2020 19:10:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585793420; cv=none; d=google.com; s=arc-20160816; b=XkKwJinMQqeU4gurPUb/znCPRc7e/BkQM8rtuNmc5yJGmXPfJ0HLG4QJK+FvvjwX27 if5VGHABzC5h/fCQIV/Sgv0exPZ7doP8rFThh7TIdK9iAw2LDcCe3UbfjYxOhRlCKDT5 yGy0yK/xGTaybS484svv0rp31c7Ap462qyfEdJvtsDiHxYZF5V6SVtJWClMIpdiKDa88 n2dmBgzDr+AG8HqGiUtDjTHN8bqnh90sSjfj8AHinnIyrDZFVy7ho2vjUa2CNfzh7rza 3GoSaEVd+zzzveK8F4K+4laLBr2nx1XR6vzMARDhu0FraKgk+XSMQ3JhkxBlGoYI6dIN ZoKQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=0AvdOMxSzudezFDGyo9moOxPb38tn3j7e1IB79LqmiE=; b=Eku1nKwyzLh6cvRGyXJfnWT6ROJiCLabr+8uwYaKHp6EKGfWUvSBbBT9g3ZiA4gina wS2Kssx7Skj2fL1HKUfPl//Gfnjf1mgHcBMQaq/nCL1XqpTBFgLMybTrPHbTRXo0I7+j L+gB0JHbcoJv31zI0blGEB77+t19YneArsbX6xzJoeuNbCrVJBf2gidFZbfXbaSBcwNY zBY2LJlnGeHyvUaqUxodOf5fQI6I94HYQYjrd0gMHQRqjC1ITDuGfScwxeKHGXQ/FOGL HKzXA4llI47xbOYNNmH55chTnP0YriLF17I0XREJ4JvPZtccIuAZdA5gBTJsA24qNtpB hhyQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel-com.20150623.gappssmtp.com header.s=20150623 header.b=H75OHKsz; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k137si1673461oih.226.2020.04.01.19.10.08; Wed, 01 Apr 2020 19:10:20 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@intel-com.20150623.gappssmtp.com header.s=20150623 header.b=H75OHKsz; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387447AbgDBCIW (ORCPT + 99 others); Wed, 1 Apr 2020 22:08:22 -0400 Received: from mail-ed1-f68.google.com ([209.85.208.68]:38332 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733213AbgDBCIW (ORCPT ); Wed, 1 Apr 2020 22:08:22 -0400 Received: by mail-ed1-f68.google.com with SMTP id e5so2289693edq.5 for ; Wed, 01 Apr 2020 19:08:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=0AvdOMxSzudezFDGyo9moOxPb38tn3j7e1IB79LqmiE=; b=H75OHKszjQii/JJueg9Otm9JKwkcDmohBHalnBaO+1I9/1u8Cbc5S5my3y64OuzR7H eyXkUJ8elmHM3MUgcTl6xMVLx6SYsHqO1tL2LrX7kCAj9ck0mnxhgsL+yAK5bpK1cOyC INNXgnKcgHB2uwr3YhuOQofMcrOJDCnfievy8Ttbqiry2VZ9zVPYJkRPzmA3vDJBvgjy Q7t6Yd0FJLv+vSoAHhykJX1UeteI37+caO+3+Y0uttzZ4bR+7M4ny47I/9XgOpoYgHhd fsFiKYQM0lDh5oZVw9VDLA3WFERxOVGS8KMlA4znSozPjaphuTvkvgrZhccnDWkwWNU2 JgdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=0AvdOMxSzudezFDGyo9moOxPb38tn3j7e1IB79LqmiE=; b=nxP1i2v0V6cJLYe+7mYkM0PrAJw8I4TfO+afm1iLE1AkYpaTBERVDW7KOLWlF+l7DY DmUhoMLenQ7r4+8etlFh5F2yk7/a69Hcepsv6Rq5p5Z3uLtczcAU3ABPngqNSFXQ3WrU 4BDrklnUHw+iWQc5TmApL5msVPrQU+nkRioqHKm0s5flcy8v3Xzd80xnQm66sxdTgicO fRlqLKbIYgmoYoS7CfIR2hpT0/MP6kvQh0+9hrYUwOC5amv812Pt9B+YcoLw8vlJMBx9 /xXimjTIdP4vYzzV/8hw8yIPyFgLC1Izns/AX+/vQcctGpfCiYzyFDKpihiMmqwnGSBX mWhg== X-Gm-Message-State: AGi0PuaMdg6gtcM0DtS21sZH0q1Bp9CEupvvJCS4vwvfGiv84wiqOA6m tlyFD+BiLabzB7rmJtN1JUzxsr2PVbRzcm7hLMWsVQ== X-Received: by 2002:a17:906:1697:: with SMTP id s23mr1019998ejd.211.1585793299540; Wed, 01 Apr 2020 19:08:19 -0700 (PDT) MIME-Version: 1.0 References: <20200327071202.2159885-1-alastair@d-silva.org> <20200327071202.2159885-20-alastair@d-silva.org> In-Reply-To: <20200327071202.2159885-20-alastair@d-silva.org> From: Dan Williams Date: Wed, 1 Apr 2020 19:08:08 -0700 Message-ID: Subject: Re: [PATCH v4 19/25] nvdimm/ocxl: Forward events to userspace To: "Alastair D'Silva" Cc: "Aneesh Kumar K . V" , "Oliver O'Halloran" , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Frederic Barrat , Andrew Donnellan , Arnd Bergmann , Greg Kroah-Hartman , Vishal Verma , Dave Jiang , Ira Weiny , Andrew Morton , Mauro Carvalho Chehab , "David S. Miller" , Rob Herring , Anton Blanchard , Krzysztof Kozlowski , Mahesh Salgaonkar , Madhavan Srinivasan , =?UTF-8?Q?C=C3=A9dric_Le_Goater?= , Anju T Sudhakar , Hari Bathini , Thomas Gleixner , Greg Kurz , Nicholas Piggin , Masahiro Yamada , Alexey Kardashevskiy , Linux Kernel Mailing List , linuxppc-dev , linux-nvdimm , Linux MM Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Mar 31, 2020 at 1:59 AM Alastair D'Silva wrote: > > Some of the interrupts that the card generates are better handled > by the userspace daemon, in particular: > Controller Hardware/Firmware Fatal > Controller Dump Available > Error Log available > > This patch allows a userspace application to register an eventfd with > the driver via SCM_IOCTL_EVENTFD to receive notifications of these > interrupts. > > Userspace can then identify what events have occurred by calling > SCM_IOCTL_EVENT_CHECK and checking against the SCM_IOCTL_EVENT_FOO > masks. The amount new ioctl's in this driver is too high, it seems much of this data can be exported via sysfs attributes which are more maintainable that ioctls. Then sysfs also has the ability to signal events on sysfs attributes, see sys_notify_dirent. Can you step back and review the ABI exposure of the driver and what can be moved to sysfs? If you need to have bus specific attributes ordered underneath the libnvdimm generic attributes you can create a sysfs attribute subdirectory. In general a roadmap document of all the proposed ABI is needed to make sure it is both sufficient and necessary. See the libnvdimm document that introduced the initial libnvdimm ABI: https://www.kernel.org/doc/Documentation/nvdimm/nvdimm.txt > > Signed-off-by: Alastair D'Silva > --- > drivers/nvdimm/ocxl/main.c | 220 +++++++++++++++++++++++++++++++++ > drivers/nvdimm/ocxl/ocxlpmem.h | 4 + > include/uapi/nvdimm/ocxlpmem.h | 12 ++ > 3 files changed, 236 insertions(+) > > diff --git a/drivers/nvdimm/ocxl/main.c b/drivers/nvdimm/ocxl/main.c > index 0040fc09cceb..cb6cdc9eb899 100644 > --- a/drivers/nvdimm/ocxl/main.c > +++ b/drivers/nvdimm/ocxl/main.c > @@ -10,6 +10,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -301,8 +302,19 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem) > { > int rc; > > + // Disable doorbells > + (void)ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIEC, > + OCXL_LITTLE_ENDIAN, > + GLOBAL_MMIO_CHI_ALL); > + > free_minor(ocxlpmem); > > + if (ocxlpmem->irq_addr[1]) > + iounmap(ocxlpmem->irq_addr[1]); > + > + if (ocxlpmem->irq_addr[0]) > + iounmap(ocxlpmem->irq_addr[0]); > + > if (ocxlpmem->ocxl_context) { > rc = ocxl_context_detach(ocxlpmem->ocxl_context); > if (rc == -EBUSY) > @@ -398,6 +410,11 @@ static int file_release(struct inode *inode, struct file *file) > { > struct ocxlpmem *ocxlpmem = file->private_data; > > + if (ocxlpmem->ev_ctx) { > + eventfd_ctx_put(ocxlpmem->ev_ctx); > + ocxlpmem->ev_ctx = NULL; > + } > + > ocxlpmem_put(ocxlpmem); > return 0; > } > @@ -928,6 +945,52 @@ static int ioctl_controller_stats(struct ocxlpmem *ocxlpmem, > return rc; > } > > +static int ioctl_eventfd(struct ocxlpmem *ocxlpmem, > + struct ioctl_ocxlpmem_eventfd __user *uarg) > +{ > + struct ioctl_ocxlpmem_eventfd args; > + > + if (copy_from_user(&args, uarg, sizeof(args))) > + return -EFAULT; > + > + if (ocxlpmem->ev_ctx) > + return -EBUSY; > + > + ocxlpmem->ev_ctx = eventfd_ctx_fdget(args.eventfd); > + if (IS_ERR(ocxlpmem->ev_ctx)) > + return PTR_ERR(ocxlpmem->ev_ctx); > + > + return 0; > +} > + > +static int ioctl_event_check(struct ocxlpmem *ocxlpmem, u64 __user *uarg) > +{ > + u64 val = 0; > + int rc; > + u64 chi = 0; > + > + rc = ocxlpmem_chi(ocxlpmem, &chi); > + if (rc < 0) > + return rc; > + > + if (chi & GLOBAL_MMIO_CHI_ELA) > + val |= IOCTL_OCXLPMEM_EVENT_ERROR_LOG_AVAILABLE; > + > + if (chi & GLOBAL_MMIO_CHI_CDA) > + val |= IOCTL_OCXLPMEM_EVENT_CONTROLLER_DUMP_AVAILABLE; > + > + if (chi & GLOBAL_MMIO_CHI_CFFS) > + val |= IOCTL_OCXLPMEM_EVENT_FIRMWARE_FATAL; > + > + if (chi & GLOBAL_MMIO_CHI_CHFS) > + val |= IOCTL_OCXLPMEM_EVENT_HARDWARE_FATAL; > + > + if (copy_to_user((u64 __user *)uarg, &val, sizeof(val))) > + return -EFAULT; > + > + return rc; > +} > + > static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args) > { > struct ocxlpmem *ocxlpmem = file->private_data; > @@ -956,6 +1019,15 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args) > rc = ioctl_controller_stats(ocxlpmem, > (struct ioctl_ocxlpmem_controller_stats __user *)args); > break; > + > + case IOCTL_OCXLPMEM_EVENTFD: > + rc = ioctl_eventfd(ocxlpmem, > + (struct ioctl_ocxlpmem_eventfd __user *)args); > + break; > + > + case IOCTL_OCXLPMEM_EVENT_CHECK: > + rc = ioctl_event_check(ocxlpmem, (u64 __user *)args); > + break; > } > > return rc; > @@ -1109,6 +1181,148 @@ static void dump_error_log(struct ocxlpmem *ocxlpmem) > kfree(buf); > } > > +static irqreturn_t imn0_handler(void *private) > +{ > + struct ocxlpmem *ocxlpmem = private; > + u64 chi = 0; > + > + (void)ocxlpmem_chi(ocxlpmem, &chi); > + > + if (chi & GLOBAL_MMIO_CHI_ELA) { > + dev_warn(&ocxlpmem->dev, "Error log is available\n"); > + > + if (ocxlpmem->ev_ctx) > + eventfd_signal(ocxlpmem->ev_ctx, 1); > + } > + > + if (chi & GLOBAL_MMIO_CHI_CDA) { > + dev_warn(&ocxlpmem->dev, "Controller dump is available\n"); > + > + if (ocxlpmem->ev_ctx) > + eventfd_signal(ocxlpmem->ev_ctx, 1); > + } > + > + return IRQ_HANDLED; > +} > + > +static irqreturn_t imn1_handler(void *private) > +{ > + struct ocxlpmem *ocxlpmem = private; > + u64 chi = 0; > + > + (void)ocxlpmem_chi(ocxlpmem, &chi); > + > + if (chi & (GLOBAL_MMIO_CHI_CFFS | GLOBAL_MMIO_CHI_CHFS)) { > + dev_err(&ocxlpmem->dev, > + "Controller status is fatal, chi=0x%llx, going offline\n", > + chi); > + > + if (ocxlpmem->nvdimm_bus) { > + nvdimm_bus_unregister(ocxlpmem->nvdimm_bus); > + ocxlpmem->nvdimm_bus = NULL; > + } > + > + if (ocxlpmem->ev_ctx) > + eventfd_signal(ocxlpmem->ev_ctx, 1); > + } > + > + return IRQ_HANDLED; > +} > + > +/** > + * ocxlpmem_setup_irq() - Set up the IRQs for the OpenCAPI Persistent Memory device > + * @ocxlpmem: the device metadata > + * Return: 0 on success, negative on failure > + */ > +static int setup_irqs(struct ocxlpmem *ocxlpmem) > +{ > + int rc; > + u64 irq_addr; > + > + rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, > + &ocxlpmem->irq_id[0]); > + if (rc) > + return rc; > + > + rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[0], > + imn0_handler, NULL, ocxlpmem); > + > + irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, > + ocxlpmem->irq_id[0]); > + if (!irq_addr) > + return -EFAULT; > + > + ocxlpmem->irq_addr[0] = ioremap(irq_addr, PAGE_SIZE); > + if (!ocxlpmem->irq_addr[0]) > + return -ENODEV; > + > + rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_OHP, > + OCXL_LITTLE_ENDIAN, > + (u64)ocxlpmem->irq_addr[0]); > + if (rc) > + goto out_irq0; > + > + rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA0_CFP, > + OCXL_LITTLE_ENDIAN, 0); > + if (rc) > + goto out_irq0; > + > + rc = ocxl_afu_irq_alloc(ocxlpmem->ocxl_context, &ocxlpmem->irq_id[1]); > + if (rc) > + goto out_irq0; > + > + rc = ocxl_irq_set_handler(ocxlpmem->ocxl_context, ocxlpmem->irq_id[1], > + imn1_handler, NULL, ocxlpmem); > + if (rc) > + goto out_irq0; > + > + irq_addr = ocxl_afu_irq_get_addr(ocxlpmem->ocxl_context, > + ocxlpmem->irq_id[1]); > + if (!irq_addr) { > + rc = -EFAULT; > + goto out_irq0; > + } > + > + ocxlpmem->irq_addr[1] = ioremap(irq_addr, PAGE_SIZE); > + if (!ocxlpmem->irq_addr[1]) { > + rc = -ENODEV; > + goto out_irq0; > + } > + > + rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_OHP, > + OCXL_LITTLE_ENDIAN, > + (u64)ocxlpmem->irq_addr[1]); > + if (rc) > + goto out_irq1; > + > + rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_IMA1_CFP, > + OCXL_LITTLE_ENDIAN, 0); > + if (rc) > + goto out_irq1; > + > + // Enable doorbells > + rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIE, > + OCXL_LITTLE_ENDIAN, > + GLOBAL_MMIO_CHI_ELA | > + GLOBAL_MMIO_CHI_CDA | > + GLOBAL_MMIO_CHI_CFFS | > + GLOBAL_MMIO_CHI_CHFS); > + if (rc) > + goto out_irq1; > + > + return 0; > + > +out_irq1: > + iounmap(ocxlpmem->irq_addr[1]); > + ocxlpmem->irq_addr[1] = NULL; > + > +out_irq0: > + iounmap(ocxlpmem->irq_addr[0]); > + ocxlpmem->irq_addr[0] = NULL; > + > + return rc; > +} > + > /** > * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device > * This is important as it enables templates higher than 0 across all other > @@ -1212,6 +1426,12 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent) > goto err; > } > > + rc = setup_irqs(ocxlpmem); > + if (rc) { > + dev_err(&pdev->dev, "Could not set up OCXL IRQs\n"); > + goto err; > + } > + > rc = setup_command_metadata(ocxlpmem); > if (rc) { > dev_err(&pdev->dev, "Could not read command metadata\n"); > diff --git a/drivers/nvdimm/ocxl/ocxlpmem.h b/drivers/nvdimm/ocxl/ocxlpmem.h > index ee3bd651f254..01721596f982 100644 > --- a/drivers/nvdimm/ocxl/ocxlpmem.h > +++ b/drivers/nvdimm/ocxl/ocxlpmem.h > @@ -106,6 +106,9 @@ struct ocxlpmem { > struct pci_dev *pdev; > struct cdev cdev; > struct ocxl_fn *ocxl_fn; > +#define SCM_IRQ_COUNT 2 > + int irq_id[SCM_IRQ_COUNT]; > + void __iomem *irq_addr[SCM_IRQ_COUNT]; > struct nd_interleave_set nd_set; > struct nvdimm_bus_descriptor bus_desc; > struct nvdimm_bus *nvdimm_bus; > @@ -117,6 +120,7 @@ struct ocxlpmem { > struct nd_region *nd_region; > char fw_version[8 + 1]; > u32 timeouts[ADMIN_COMMAND_MAX + 1]; > + struct eventfd_ctx *ev_ctx; > u32 max_controller_dump_size; > u16 scm_revision; // major/minor > u8 readiness_timeout; /* The worst case time (in seconds) that the host > diff --git a/include/uapi/nvdimm/ocxlpmem.h b/include/uapi/nvdimm/ocxlpmem.h > index ca3a7098fa9d..d573bd307e35 100644 > --- a/include/uapi/nvdimm/ocxlpmem.h > +++ b/include/uapi/nvdimm/ocxlpmem.h > @@ -71,6 +71,16 @@ struct ioctl_ocxlpmem_controller_stats { > __u64 fast_write_count; > }; > > +struct ioctl_ocxlpmem_eventfd { > + __s32 eventfd; > + __u32 reserved; > +}; > + > +#define IOCTL_OCXLPMEM_EVENT_CONTROLLER_DUMP_AVAILABLE (1ULL << (0)) > +#define IOCTL_OCXLPMEM_EVENT_ERROR_LOG_AVAILABLE (1ULL << (1)) > +#define IOCTL_OCXLPMEM_EVENT_HARDWARE_FATAL (1ULL << (2)) > +#define IOCTL_OCXLPMEM_EVENT_FIRMWARE_FATAL (1ULL << (3)) > + > /* ioctl numbers */ > #define OCXLPMEM_MAGIC 0xCA > /* OpenCAPI Persistent memory devices */ > @@ -79,5 +89,7 @@ struct ioctl_ocxlpmem_controller_stats { > #define IOCTL_OCXLPMEM_CONTROLLER_DUMP_DATA _IOWR(OCXLPMEM_MAGIC, 0x32, struct ioctl_ocxlpmem_controller_dump_data) > #define IOCTL_OCXLPMEM_CONTROLLER_DUMP_COMPLETE _IO(OCXLPMEM_MAGIC, 0x33) > #define IOCTL_OCXLPMEM_CONTROLLER_STATS _IO(OCXLPMEM_MAGIC, 0x34) > +#define IOCTL_OCXLPMEM_EVENTFD _IOW(OCXLPMEM_MAGIC, 0x35, struct ioctl_ocxlpmem_eventfd) > +#define IOCTL_OCXLPMEM_EVENT_CHECK _IOR(OCXLPMEM_MAGIC, 0x36, __u64) > > #endif /* _UAPI_OCXL_SCM_H */ > -- > 2.24.1 >