From: Dan Williams
Date: Tue, 2 Feb 2021 15:54:03 -0800
Subject: Re: [PATCH 04/14] cxl/mem: Implement polled mode mailbox
To: Ben Widawsky
Cc: David Rientjes, linux-cxl@vger.kernel.org, Linux ACPI,
    Linux Kernel Mailing List, linux-nvdimm, Linux PCI, Bjorn Helgaas,
    Chris Browy, Christoph Hellwig, Ira Weiny, Jon Masters,
    Jonathan Cameron, Rafael Wysocki, Randy Dunlap, Vishal Verma,
    daniel.lll@alibaba-inc.com, "John Groves (jgroves)", "Kelley, Sean V"
References: <20210130002438.1872527-1-ben.widawsky@intel.com>
    <20210130002438.1872527-5-ben.widawsky@intel.com>
    <5986abe5-1248-30b2-5f53-fa7013baafad@google.com>
    <20210202225733.miq5sl3mqit2zuhg@intel.com>
In-Reply-To: <20210202225733.miq5sl3mqit2zuhg@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 2, 2021 at 2:57 PM Ben Widawsky wrote:
>
> On 21-02-01 12:00:18, Dan Williams wrote:
> > On Sat, Jan 30, 2021 at 3:52 PM David Rientjes wrote:
> > >
> > > On Fri, 29 Jan 2021, Ben Widawsky wrote:
> > >
> > > > Provide enough functionality to utilize the mailbox of a memory device.
> > > > The mailbox is used to interact with the firmware running on the memory
> > > > device.
> > > >
> > > > The CXL specification defines separate capabilities for the mailbox and
> > > > the memory device. The mailbox interface has a doorbell to indicate
> > > > ready to accept commands and the memory device has a capability register
> > > > that indicates the mailbox interface is ready. The expectation is that
> > > > the doorbell-ready is always later than the memory-device-indication
> > > > that the mailbox is ready.
> > > >
> > > > Create a function to handle sending a command, optionally with a
> > > > payload, to the memory device, polling on a result, and then optionally
> > > > copying out the payload. The algorithm for doing this comes straight out
> > > > of the CXL 2.0 specification.
> > > >
> > > > Primary mailboxes are capable of generating an interrupt when submitting
> > > > a command in the background. That implementation is saved for a later
> > > > time.
> > > >
> > > > Secondary mailboxes aren't implemented at this time.
> > > >
> > > > The flow is proven with one implemented command, "identify". Because the
> > > > class code has already told the driver this is a memory device and the
> > > > identify command is mandatory.
> > > >
> > > > Signed-off-by: Ben Widawsky
> > > > ---
> > > >  drivers/cxl/Kconfig |  14 ++
> > > >  drivers/cxl/cxl.h   |  39 +++++
> > > >  drivers/cxl/mem.c   | 342 +++++++++++++++++++++++++++++++++++++++++++-
> > > >  3 files changed, 394 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> > > > index 3b66b46af8a0..fe591f74af96 100644
> > > > --- a/drivers/cxl/Kconfig
> > > > +++ b/drivers/cxl/Kconfig
> > > > @@ -32,4 +32,18 @@ config CXL_MEM
> > > >            Chapter 2.3 Type 3 CXL Device in the CXL 2.0 specification.
> > > >
> > > >            If unsure say 'm'.
> > > > +
> > > > +config CXL_MEM_INSECURE_DEBUG
> > > > +        bool "CXL.mem debugging"
> > > > +        depends on CXL_MEM
> > > > +        help
> > > > +          Enable debug of all CXL command payloads.
> > > > +
> > > > +          Some CXL devices and controllers support encryption and other
> > > > +          security features. The payloads for the commands that enable
> > > > +          those features may contain sensitive clear-text security
> > > > +          material. Disable debug of those command payloads by default.
> > > > +          If you are a kernel developer actively working on CXL
> > > > +          security enabling say Y, otherwise say N.
> > >
> > > Not specific to this patch, but the reference to encryption made me
> > > curious about integrity: are all CXL.mem devices compatible with DIMP?
> > > Some? None?
> >
> > The encryption here is "device passphrase" similar to the NVDIMM
> > Security Management described here:
> >
> > https://pmem.io/documents/IntelOptanePMem_DSM_Interface-V2.0.pdf
> >
> > The LIBNVDIMM enabling wrapped this support with the Linux keys
> > interface which among other things enforces wrapping the clear text
> > passphrase with a Linux "trusted/encrypted" key.
> >
> > Additionally, the CXL.io interface optionally supports PCI IDE:
> >
> > https://www.intel.com/content/dam/www/public/us/en/documents/reference-guides/pcie-device-security-enhancements.pdf
> >
> > I'm otherwise not familiar with the DIMP acronym?
> >
> > > > +
> > > >  endif
> > > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > > index a3da7f8050c4..df3d97154b63 100644
> > > > --- a/drivers/cxl/cxl.h
> > > > +++ b/drivers/cxl/cxl.h
> > > > @@ -31,9 +31,36 @@
> > > >  #define CXLDEV_MB_CAPS_OFFSET 0x00
> > > >  #define CXLDEV_MB_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > > >  #define CXLDEV_MB_CTRL_OFFSET 0x04
> > > > +#define CXLDEV_MB_CTRL_DOORBELL BIT(0)
> > > >  #define CXLDEV_MB_CMD_OFFSET 0x08
> > > > +#define CXLDEV_MB_CMD_COMMAND_OPCODE_MASK GENMASK(15, 0)
> > > > +#define CXLDEV_MB_CMD_PAYLOAD_LENGTH_MASK GENMASK(36, 16)
> > > >  #define CXLDEV_MB_STATUS_OFFSET 0x10
> > > > +#define CXLDEV_MB_STATUS_RET_CODE_MASK GENMASK(47, 32)
> > > >  #define CXLDEV_MB_BG_CMD_STATUS_OFFSET 0x18
> > > > +#define CXLDEV_MB_PAYLOAD_OFFSET 0x20
> > > > +
> > > > +/* Memory Device (CXL 2.0 - 8.2.8.5.1.1) */
> > > > +#define CXLMDEV_STATUS_OFFSET 0x0
> > > > +#define CXLMDEV_DEV_FATAL BIT(0)
> > > > +#define CXLMDEV_FW_HALT BIT(1)
> > > > +#define CXLMDEV_STATUS_MEDIA_STATUS_MASK GENMASK(3, 2)
> > > > +#define CXLMDEV_MS_NOT_READY 0
> > > > +#define CXLMDEV_MS_READY 1
> > > > +#define CXLMDEV_MS_ERROR 2
> > > > +#define CXLMDEV_MS_DISABLED 3
> > > > +#define CXLMDEV_READY(status) \
> > > > +        (CXL_GET_FIELD(status, CXLMDEV_STATUS_MEDIA_STATUS) == CXLMDEV_MS_READY)
> > > > +#define CXLMDEV_MBOX_IF_READY BIT(4)
> > > > +#define CXLMDEV_RESET_NEEDED_SHIFT 5
> > > > +#define CXLMDEV_RESET_NEEDED_MASK GENMASK(7, 5)
> > > > +#define CXLMDEV_RESET_NEEDED_NOT 0
> > > > +#define CXLMDEV_RESET_NEEDED_COLD 1
> > > > +#define CXLMDEV_RESET_NEEDED_WARM 2
> > > > +#define CXLMDEV_RESET_NEEDED_HOT 3
> > > > +#define CXLMDEV_RESET_NEEDED_CXL 4
> > > > +#define CXLMDEV_RESET_NEEDED(status) \
> > > > +        (CXL_GET_FIELD(status, CXLMDEV_RESET_NEEDED) != CXLMDEV_RESET_NEEDED_NOT)
> > > >
> > > >  /**
> > > >   * struct cxl_mem - A CXL memory device
> > > > @@ -44,6 +71,16 @@ struct cxl_mem {
> > > >          struct pci_dev *pdev;
> > > >          void __iomem *regs;
> > > >
> > > > +        struct {
> > > > +                struct range range;
> > > > +        } pmem;
> > > > +
> > > > +        struct {
> > > > +                struct range range;
> > > > +        } ram;
> > > > +
> > > > +        char firmware_version[0x10];
> > > > +
> > > >          /* Cap 0001h - CXL_CAP_CAP_ID_DEVICE_STATUS */
> > > >          struct {
> > > >                  void __iomem *regs;
> > > > @@ -51,6 +88,7 @@ struct cxl_mem {
> > > >
> > > >          /* Cap 0002h - CXL_CAP_CAP_ID_PRIMARY_MAILBOX */
> > > >          struct {
> > > > +                struct mutex mutex; /* Protects device mailbox and firmware */
> > > >                  void __iomem *regs;
> > > >                  size_t payload_size;
> > > >          } mbox;
> > > > @@ -89,5 +127,6 @@ struct cxl_mem {
> > > >
> > > >  cxl_reg(status);
> > > >  cxl_reg(mbox);
> > > > +cxl_reg(mem);
> > > >
> > > >  #endif /* __CXL_H__ */
> > > > diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> > > > index fa14d51243ee..69ed15bfa5d4 100644
> > > > --- a/drivers/cxl/mem.c
> > > > +++ b/drivers/cxl/mem.c
> > > > @@ -6,6 +6,270 @@
> > > >  #include "pci.h"
> > > >  #include "cxl.h"
> > > >
> > > > +#define cxl_doorbell_busy(cxlm) \
> > > > +        (cxl_read_mbox_reg32(cxlm, CXLDEV_MB_CTRL_OFFSET) & \
> > > > +         CXLDEV_MB_CTRL_DOORBELL)
> > > > +
> > > > +#define CXL_MAILBOX_TIMEOUT_US 2000
> > >
> > > This should be _MS?
> > >
> > > > +
> > > > +enum opcode {
> > > > +        CXL_MBOX_OP_IDENTIFY = 0x4000,
> > > > +        CXL_MBOX_OP_MAX = 0x10000
> > > > +};
> > > > +
> > > > +/**
> > > > + * struct mbox_cmd - A command to be submitted to hardware.
> > > > + * @opcode: (input) The command set and command submitted to hardware.
> > > > + * @payload_in: (input) Pointer to the input payload.
> > > > + * @payload_out: (output) Pointer to the output payload. Must be allocated by
> > > > + *               the caller.
> > > > + * @size_in: (input) Number of bytes to load from @payload.
> > > > + * @size_out: (output) Number of bytes loaded into @payload.
> > > > + * @return_code: (output) Error code returned from hardware.
> > > > + *
> > > > + * This is the primary mechanism used to send commands to the hardware.
> > > > + * All the fields except @payload_* correspond exactly to the fields described in
> > > > + * Command Register section of the CXL 2.0 spec (8.2.8.4.5). @payload_in and
> > > > + * @payload_out are written to, and read from the Command Payload Registers
> > > > + * defined in (8.2.8.4.8).
> > > > + */
> > > > +struct mbox_cmd {
> > > > +        u16 opcode;
> > > > +        void *payload_in;
> > > > +        void *payload_out;
> > > > +        size_t size_in;
> > > > +        size_t size_out;
> > > > +        u16 return_code;
> > > > +#define CXL_MBOX_SUCCESS 0
> > > > +};
> > > > +
> > > > +static int cxl_mem_wait_for_doorbell(struct cxl_mem *cxlm)
> > > > +{
> > > > +        const int timeout = msecs_to_jiffies(CXL_MAILBOX_TIMEOUT_US);
> > > > +        const unsigned long start = jiffies;
> > > > +        unsigned long end = start;
> > > > +
> > > > +        while (cxl_doorbell_busy(cxlm)) {
> > > > +                end = jiffies;
> > > > +
> > > > +                if (time_after(end, start + timeout)) {
> > > > +                        /* Check again in case preempted before timeout test */
> > > > +                        if (!cxl_doorbell_busy(cxlm))
> > > > +                                break;
> > > > +                        return -ETIMEDOUT;
> > > > +                }
> > > > +                cpu_relax();
> > > > +        }
> > > > +
> > > > +        dev_dbg(&cxlm->pdev->dev, "Doorbell wait took %dms",
> > > > +                jiffies_to_msecs(end) - jiffies_to_msecs(start));
> > > > +        return 0;
> > > > +}
> > > > +
> > > > +static void cxl_mem_mbox_timeout(struct cxl_mem *cxlm,
> > > > +                                 struct mbox_cmd *mbox_cmd)
> > > > +{
> > > > +        dev_warn(&cxlm->pdev->dev, "Mailbox command timed out\n");
> > > > +        dev_info(&cxlm->pdev->dev,
> > > > +                 "\topcode: 0x%04x\n"
> > > > +                 "\tpayload size: %zub\n",
> > > > +                 mbox_cmd->opcode, mbox_cmd->size_in);
> > > > +
> > > > +        if (IS_ENABLED(CONFIG_CXL_MEM_INSECURE_DEBUG)) {
> > > > +                print_hex_dump_debug("Payload ", DUMP_PREFIX_OFFSET, 16, 1,
> > > > +                                     mbox_cmd->payload_in, mbox_cmd->size_in,
> > > > +                                     true);
> > > > +        }
> > > > +
> > > > +        /* Here's a good place to figure out if a device reset is needed */
> > >
> > > What are the implications if we don't do a reset, as this implementation
> > > does not? IOW, does a timeout require a device to be recovered through a
> > > reset before it can receive additional commands, or is it safe to simply
> > > drop the command that timed out on the floor and proceed?
> >
> > Not a satisfying answer, but "it depends". It's also complicated by
> > the fact that a reset may need to be coordinated with other devices in
> > the interleave-set as the HDM decoders may bounce.
> >
> > For comparison, to date there have been no problems with the "drop on
> > the floor" policy of LIBNVDIMM command timeouts. At the same time
> > there simply was not a software visible reset mechanism for those
> > devices so this problem never came out. This mailbox isn't a fast
> > path, so the device is likely completely dead if this timeout is ever
> > violated, and the firmware reporting a timeout might as well assume
> > that the OS gives up on the device.
> >
> > I'll let Ben chime in on the rest...
>
> Reset handling is next on the TODO list for the driver. I had two main reasons
> for not even taking a stab at it.
> 1. I have no good way to test it. We are working on adding some test conditions
> to QEMU for it.
> 2. The main difficulty in my mind with reset is you can't pull the memory out
> from under the OS here. While the driver doesn't yet handle persistent memory
> capacities, it may have volatile capacity configured by the BIOS. So the goal
> was, get the bits of the driver in that would at least allow developers,
> hardware vendors, and folks contributing to the spec a way to have basic
> interaction with a CXL type 3 device.

Honestly I think in most cases if the firmware decides to return a
"reset required" status the Linux response will be "lol, no" because
the firmware has no concept of the violence that would impose on the
rest of the system.
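
For anyone following the timeout discussion without a kernel tree handy, here is a rough userspace sketch of the polled doorbell wait quoted above. It is not the driver code: the kernel version uses jiffies, time_after() and cpu_relax(), while this one assumes a POSIX monotonic clock. The names MAILBOX_TIMEOUT_MS, doorbell_busy_fn, now_ms, wait_for_doorbell, fake_dev and fake_busy are all made up for illustration. The constant is named for the unit it is actually used with, which is the point of the "_MS?" review comment, and the loop keeps the "check again in case preempted" re-read before giving up.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical constant: the value is consumed as milliseconds, so name it so. */
#define MAILBOX_TIMEOUT_MS 2000

/* Hypothetical callback standing in for a read of the doorbell bit. */
typedef bool (*doorbell_busy_fn)(void *ctx);

/* Monotonic time in milliseconds. */
static uint64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/*
 * Poll until the doorbell clears or timeout_ms elapses.
 * Returns 0 when the doorbell cleared, -ETIMEDOUT otherwise.
 */
static int wait_for_doorbell(doorbell_busy_fn busy, void *ctx,
			     unsigned int timeout_ms)
{
	const uint64_t deadline = now_ms() + timeout_ms;

	while (busy(ctx)) {
		if (now_ms() > deadline) {
			/* Re-check once in case we were preempted before the time test. */
			if (!busy(ctx))
				break;
			return -ETIMEDOUT;
		}
	}
	return 0;
}

/* Hypothetical fake device: reports busy for a fixed number of polls. */
struct fake_dev {
	unsigned int polls_left;
};

static bool fake_busy(void *ctx)
{
	struct fake_dev *dev = ctx;

	if (dev->polls_left == 0)
		return false;
	dev->polls_left--;
	return true;
}
```

A caller would pass MAILBOX_TIMEOUT_MS as the last argument. On -ETIMEDOUT this sketch does what the thread calls "drop on the floor": it reports the error and leaves any reset decision to the caller.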