Date: Wed, 10 Feb 2021 17:30:54 +0000
From: Jonathan Cameron
To: Ben Widawsky
CC: Bjorn Helgaas, "Chris Browy", Christoph Hellwig, "Dan Williams", David Hildenbrand, David Rientjes, Ira Weiny, "Jon Masters", Rafael Wysocki, Randy Dunlap, Vishal Verma, "John Groves (jgroves)", "Kelley, Sean V"
Subject: Re: [PATCH v2 2/8] cxl/mem: Find device capabilities
Message-ID: <20210210173054.00002e9f@Huawei.com>
In-Reply-To: <20210210165557.7fuqbyr7e7zjoxaa@intel.com>
References:
 <20210210000259.635748-1-ben.widawsky@intel.com>
 <20210210000259.635748-3-ben.widawsky@intel.com>
 <20210210133252.000047af@Huawei.com>
 <20210210150759.00005684@Huawei.com>
 <20210210165557.7fuqbyr7e7zjoxaa@intel.com>
Organization: Huawei Technologies Research and Development (UK) Ltd.
X-Mailer: Claws Mail 3.17.4 (GTK+ 2.24.32; i686-w64-mingw32)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.47.67.2]
X-ClientProxiedBy: lhreml725-chm.china.huawei.com (10.201.108.76) To lhreml710-chm.china.huawei.com (10.201.108.61)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 10 Feb 2021 08:55:57 -0800
Ben Widawsky wrote:

> On 21-02-10 15:07:59, Jonathan Cameron wrote:
> > On Wed, 10 Feb 2021 13:32:52 +0000
> > Jonathan Cameron wrote:
> >
> > > On Tue, 9 Feb 2021 16:02:53 -0800
> > > Ben Widawsky wrote:
> > >
> > > > Provide enough functionality to utilize the mailbox of a memory device.
> > > > The mailbox is used to interact with the firmware running on the memory
> > > > device. The flow is proven with one implemented command, "identify".
> > > > Because the class code has already told the driver this is a memory
> > > > device and the identify command is mandatory.
> > > >
> > > > CXL devices contain an array of capabilities that describe the
> > > > interactions software can have with the device or firmware running on
> > > > the device. A CXL compliant device must implement the device status and
> > > > the mailbox capability. Additionally, a CXL compliant memory device must
> > > > implement the memory device capability. Each of the capabilities can
> > > > [will] provide an offset within the MMIO region for interacting with the
> > > > CXL device.
> > > >
> > > > The capabilities tell the driver how to find and map the register space
> > > > for CXL Memory Devices. The registers are required to utilize the CXL
> > > > spec defined mailbox interface. The spec outlines two mailboxes, primary
> > > > and secondary. The secondary mailbox is earmarked for system firmware,
> > > > and not handled in this driver.
> > > >
> > > > Primary mailboxes are capable of generating an interrupt when submitting
> > > > a background command. That implementation is saved for a later time.
> > > >
> > > > Link: https://www.computeexpresslink.org/download-the-specification
> > > > Signed-off-by: Ben Widawsky
> > > > Reviewed-by: Dan Williams
> > >
> > > Hi Ben,
> > >
> > >
> > > > +/**
> > > > + * cxl_mem_mbox_send_cmd() - Send a mailbox command to a memory device.
> > > > + * @cxlm: The CXL memory device to communicate with.
> > > > + * @mbox_cmd: Command to send to the memory device.
> > > > + *
> > > > + * Context: Any context. Expects mbox_lock to be held.
> > > > + * Return: -ETIMEDOUT if timeout occurred waiting for completion. 0 on success.
> > > > + *         Caller should check the return code in @mbox_cmd to make sure it
> > > > + *         succeeded.
> > >
> > > cxl_xfer_log() doesn't check mbox_cmd->return_code and for my test it currently
> > > enters an infinite loop as a result.
>
> I meant to fix that.
>
> > >
> > > I haven't checked other paths, but to my mind it is not a good idea to require
> > > two levels of error checking - the example here proves how easy it is to forget
> > > one.
>
> Demonstrably, you're correct. I think it would be good to have a kernel only
> mbox command that does the error checking though. Let me type something up and
> see how it looks.
>
> > >
> > > Now all I have to do is figure out why I'm getting an error in the first place!
> >
> > For reference this seems to be our old issue of arm64 memcpy_fromio() only doing 8 byte
> > or 1 byte copies. The hack in QEMU to allow that to work, doesn't work.
> > Result is that 1 byte reads replicate across the register
> > (in this case instead of 0000001c I get 1c1c1c1c)
> >
> > For these particular registers, we are covered by the rules in 8.2 which says that
> > a 1, 2, 4, 8 aligned reads of 64 bit registers etc are fine.
> >
> > So we should not have to care. This isn't true for the component registers where
> > we need to guarantee 4 or 8 byte reads only.
> >
> > For this particular issue the mailbox_read_reg() function in the QEMU code
> > needs to handle the size 1 case and set min_access_size = 1 for
> > mailbox_ops. Logically it should also handle the 2 byte case I think,
> > but I'm not hitting that.
> >
> > Jonathan
>
> I think the latest QEMU patches should do the right thing (I have a v4 branch if
> you want to try it). If it doesn't, it'd be worth debugging. The memory
> accessors should split up or combine the reads/writes to whatever the emulation
> supports (4 or 8 only in this case).
>
> We can move this discussion to the QEMU list if it's not just a simple bug on my
> part.

I'm on your v4 QEMU branch. I can follow up in the QEMU thread, but it needs
to do 1 byte reads as well.

(But as I'm here, and someone might find this thread:)

The arm64 implementation is 'interesting'. Maybe we want to fix it, but I
suspect we'll have a non-trivial issue arguing it is broken. The CXL spec
allows (I think) both 1 and 2 byte reads to this particular register.

/*
 * Copy data from IO memory space to "real" memory space.
 */
void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
{
	while (count && !IS_ALIGNED((unsigned long)from, 8)) {
		*(u8 *)to = __raw_readb(from);
		from++;
		to++;
		count--;
	}

	while (count >= 8) {
		*(u64 *)to = __raw_readq(from);
		from += 8;
		to += 8;
		count -= 8;
	}

	while (count) {
		*(u8 *)to = __raw_readb(from);
		from++;
		to++;
		count--;
	}
}
EXPORT_SYMBOL(__memcpy_fromio);

>
> >
> > >
> > > Jonathan
> > >
> > >
> > >
> > > > + *
> > > > + * This is a generic form of the CXL mailbox send command, thus the only I/O
> > > > + * operations used are cxl_read_mbox_reg(). Memory devices, and perhaps other
> > > > + * types of CXL devices may have further information available upon error
> > > > + * conditions.
> > > > + *
> > > > + * The CXL spec allows for up to two mailboxes. The intention is for the primary
> > > > + * mailbox to be OS controlled and the secondary mailbox to be used by system
> > > > + * firmware. This allows the OS and firmware to communicate with the device and
> > > > + * not need to coordinate with each other. The driver only uses the primary
> > > > + * mailbox.
> > > > + */
> > > > +static int cxl_mem_mbox_send_cmd(struct cxl_mem *cxlm,
> > > > +				 struct mbox_cmd *mbox_cmd)
> > > > +{
> > > > +	void __iomem *payload = cxlm->mbox_regs + CXLDEV_MBOX_PAYLOAD_OFFSET;
> > > > +	u64 cmd_reg, status_reg;
> > > > +	size_t out_len;
> > > > +	int rc;
> > > > +
> > > > +	lockdep_assert_held(&cxlm->mbox_mutex);
> > > > +
> > > > +	/*
> > > > +	 * Here are the steps from 8.2.8.4 of the CXL 2.0 spec.
> > > > +	 *   1. Caller reads MB Control Register to verify doorbell is clear
> > > > +	 *   2. Caller writes Command Register
> > > > +	 *   3. Caller writes Command Payload Registers if input payload is non-empty
> > > > +	 *   4. Caller writes MB Control Register to set doorbell
> > > > +	 *   5. Caller either polls for doorbell to be clear or waits for interrupt if configured
> > > > +	 *   6. Caller reads MB Status Register to fetch Return code
> > > > +	 *   7. If command successful, Caller reads Command Register to get Payload Length
> > > > +	 *   8. If output payload is non-empty, host reads Command Payload Registers
> > > > +	 *
> > > > +	 * Hardware is free to do whatever it wants before the doorbell is rung,
> > > > +	 * and isn't allowed to change anything after it clears the doorbell. As
> > > > +	 * such, steps 2 and 3 can happen in any order, and steps 6, 7, 8 can
> > > > +	 * also happen in any order (though some orders might not make sense).
> > > > +	 */
> > > > +
> > > > +	/* #1 */
> > > > +	if (cxl_doorbell_busy(cxlm)) {
> > > > +		dev_err_ratelimited(&cxlm->pdev->dev,
> > > > +				    "Mailbox re-busy after acquiring\n");
> > > > +		return -EBUSY;
> > > > +	}
> > > > +
> > > > +	cmd_reg = FIELD_PREP(CXLDEV_MBOX_CMD_COMMAND_OPCODE_MASK,
> > > > +			     mbox_cmd->opcode);
> > > > +	if (mbox_cmd->size_in) {
> > > > +		if (WARN_ON(!mbox_cmd->payload_in))
> > > > +			return -EINVAL;
> > > > +
> > > > +		cmd_reg |= FIELD_PREP(CXLDEV_MBOX_CMD_PAYLOAD_LENGTH_MASK,
> > > > +				      mbox_cmd->size_in);
> > > > +		memcpy_toio(payload, mbox_cmd->payload_in, mbox_cmd->size_in);
> > > > +	}
> > > > +
> > > > +	/* #2, #3 */
> > > > +	writeq(cmd_reg, cxlm->mbox_regs + CXLDEV_MBOX_CMD_OFFSET);
> > > > +
> > > > +	/* #4 */
> > > > +	dev_dbg(&cxlm->pdev->dev, "Sending command\n");
> > > > +	writel(CXLDEV_MBOX_CTRL_DOORBELL,
> > > > +	       cxlm->mbox_regs + CXLDEV_MBOX_CTRL_OFFSET);
> > > > +
> > > > +	/* #5 */
> > > > +	rc = cxl_mem_wait_for_doorbell(cxlm);
> > > > +	if (rc == -ETIMEDOUT) {
> > > > +		cxl_mem_mbox_timeout(cxlm, mbox_cmd);
> > > > +		return rc;
> > > > +	}
> > > > +
> > > > +	/* #6 */
> > > > +	status_reg = readq(cxlm->mbox_regs + CXLDEV_MBOX_STATUS_OFFSET);
> > > > +	mbox_cmd->return_code =
> > > > +		FIELD_GET(CXLDEV_MBOX_STATUS_RET_CODE_MASK, status_reg);
> > > > +
> > > > +	if (mbox_cmd->return_code != 0) {
> > > > +		dev_dbg(&cxlm->pdev->dev, "Mailbox operation had an error\n");
> > > > +		return 0;
> > >
> > > I'd return some sort of error in this path. Otherwise the sort of missing
> > > handling I mention above is too easy to hit.
> > >
> > > > +	}
> > > > +
> > > > +	/* #7 */
> > > > +	cmd_reg = readq(cxlm->mbox_regs + CXLDEV_MBOX_CMD_OFFSET);
> > > > +	out_len = FIELD_GET(CXLDEV_MBOX_CMD_PAYLOAD_LENGTH_MASK, cmd_reg);
> > > > +
> > > > +	/* #8 */
> > > > +	if (out_len && mbox_cmd->payload_out)
> > > > +		memcpy_fromio(mbox_cmd->payload_out, payload, out_len);
> > > > +
> > > > +	mbox_cmd->size_out = out_len;
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +/**
> > > > + * cxl_mem_mbox_get() - Acquire exclusive access to the mailbox.
> > > > + * @cxlm: The memory device to gain access to.
> > > > + *
> > > > + * Context: Any context. Takes the mbox_lock.
> > > > + * Return: 0 if exclusive access was acquired.
> > > > + */
> > > > +static int cxl_mem_mbox_get(struct cxl_mem *cxlm)
> > > > +{
> > > > +	struct device *dev = &cxlm->pdev->dev;
> > > > +	int rc = -EBUSY;
> > > > +	u64 md_status;
> > > > +
> > > > +	mutex_lock_io(&cxlm->mbox_mutex);
> > > > +
> > > > +	/*
> > > > +	 * XXX: There is some amount of ambiguity in the 2.0 version of the spec
> > > > +	 * around the mailbox interface ready (8.2.8.5.1.1). The purpose of the
> > > > +	 * bit is to allow firmware running on the device to notify the driver
> > > > +	 * that it's ready to receive commands. It is unclear if the bit needs
> > > > +	 * to be read for each transaction mailbox, ie. the firmware can switch
> > > > +	 * it on and off as needed. Second, there is no defined timeout for
> > > > +	 * mailbox ready, like there is for the doorbell interface.
> > > > +	 *
> > > > +	 * Assumptions:
> > > > +	 * 1. The firmware might toggle the Mailbox Interface Ready bit, check
> > > > +	 *    it for every command.
> > > > +	 *
> > > > +	 * 2. If the doorbell is clear, the firmware should have first set the
> > > > +	 *    Mailbox Interface Ready bit. Therefore, waiting for the doorbell
> > > > +	 *    to be ready is sufficient.
> > > > +	 */
> > > > +	rc = cxl_mem_wait_for_doorbell(cxlm);
> > > > +	if (rc) {
> > > > +		dev_warn(dev, "Mailbox interface not ready\n");
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	md_status = readq(cxlm->memdev_regs + CXLMDEV_STATUS_OFFSET);
> > > > +	if (!(md_status & CXLMDEV_MBOX_IF_READY && CXLMDEV_READY(md_status))) {
> > > > +		dev_err(dev,
> > > > +			"mbox: reported doorbell ready, but not mbox ready\n");
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	/*
> > > > +	 * Hardware shouldn't allow a ready status but also have failure bits
> > > > +	 * set. Spit out an error, this should be a bug report
> > > > +	 */
> > > > +	rc = -EFAULT;
> > > > +	if (md_status & CXLMDEV_DEV_FATAL) {
> > > > +		dev_err(dev, "mbox: reported ready, but fatal\n");
> > > > +		goto out;
> > > > +	}
> > > > +	if (md_status & CXLMDEV_FW_HALT) {
> > > > +		dev_err(dev, "mbox: reported ready, but halted\n");
> > > > +		goto out;
> > > > +	}
> > > > +	if (CXLMDEV_RESET_NEEDED(md_status)) {
> > > > +		dev_err(dev, "mbox: reported ready, but reset needed\n");
> > > > +		goto out;
> > > > +	}
> > > > +
> > > > +	/* with lock held */
> > > > +	return 0;
> > > > +
> > > > +out:
> > > > +	mutex_unlock(&cxlm->mbox_mutex);
> > > > +	return rc;
> > > > +}
> > > > +
> > > > +/**
> > > > + * cxl_mem_mbox_put() - Release exclusive access to the mailbox.
> > > > + * @cxlm: The CXL memory device to communicate with.
> > > > + *
> > > > + * Context: Any context. Expects mbox_lock to be held.
> > > > + */
> > > > +static void cxl_mem_mbox_put(struct cxl_mem *cxlm)
> > > > +{
> > > > +	mutex_unlock(&cxlm->mbox_mutex);
> > > > +}
> > > > +
> > > > +/**
> > > > + * cxl_mem_setup_regs() - Setup necessary MMIO.
> > > > + * @cxlm: The CXL memory device to communicate with.
> > > > + *
> > > > + * Return: 0 if all necessary registers mapped.
> > > > + *
> > > > + * A memory device is required by spec to implement a certain set of MMIO
> > > > + * regions. The purpose of this function is to enumerate and map those
> > > > + * registers.
> > > > + */
> > > > +static int cxl_mem_setup_regs(struct cxl_mem *cxlm)
> > > > +{
> > > > +	struct device *dev = &cxlm->pdev->dev;
> > > > +	int cap, cap_count;
> > > > +	u64 cap_array;
> > > > +
> > > > +	cap_array = readq(cxlm->regs + CXLDEV_CAP_ARRAY_OFFSET);
> > > > +	if (FIELD_GET(CXLDEV_CAP_ARRAY_ID_MASK, cap_array) !=
> > > > +	    CXLDEV_CAP_ARRAY_CAP_ID)
> > > > +		return -ENODEV;
> > > > +
> > > > +	cap_count = FIELD_GET(CXLDEV_CAP_ARRAY_COUNT_MASK, cap_array);
> > > > +
> > > > +	for (cap = 1; cap <= cap_count; cap++) {
> > > > +		void __iomem *register_block;
> > > > +		u32 offset;
> > > > +		u16 cap_id;
> > > > +
> > > > +		cap_id = readl(cxlm->regs + cap * 0x10) & 0xffff;
> > > > +		offset = readl(cxlm->regs + cap * 0x10 + 0x4);
> > > > +		register_block = cxlm->regs + offset;
> > > > +
> > > > +		switch (cap_id) {
> > > > +		case CXLDEV_CAP_CAP_ID_DEVICE_STATUS:
> > > > +			dev_dbg(dev, "found Status capability (0x%x)\n", offset);
> > > > +			cxlm->status_regs = register_block;
> > > > +			break;
> > > > +		case CXLDEV_CAP_CAP_ID_PRIMARY_MAILBOX:
> > > > +			dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
> > > > +			cxlm->mbox_regs = register_block;
> > > > +			break;
> > > > +		case CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX:
> > > > +			dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
> > > > +			break;
> > > > +		case CXLDEV_CAP_CAP_ID_MEMDEV:
> > > > +			dev_dbg(dev, "found Memory Device capability (0x%x)\n", offset);
> > > > +			cxlm->memdev_regs = register_block;
> > > > +			break;
> > > > +		default:
> > > > +			dev_dbg(dev, "Unknown cap ID: %d (0x%x)\n", cap_id, offset);
> > > > +			break;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	if (!cxlm->status_regs || !cxlm->mbox_regs || !cxlm->memdev_regs) {
> > > > +		dev_err(dev, "registers not found: %s%s%s\n",
> > > > +			!cxlm->status_regs ? "status " : "",
> > > > +			!cxlm->mbox_regs ? "mbox " : "",
> > > > +			!cxlm->memdev_regs ? "memdev" : "");
> > > > +		return -ENXIO;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static int cxl_mem_setup_mailbox(struct cxl_mem *cxlm)
> > > > +{
> > > > +	const int cap = readl(cxlm->mbox_regs + CXLDEV_MBOX_CAPS_OFFSET);
> > > > +
> > > > +	cxlm->payload_size =
> > > > +		1 << FIELD_GET(CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK, cap);
> > > > +
> > > > +	/*
> > > > +	 * CXL 2.0 8.2.8.4.3 Mailbox Capabilities Register
> > > > +	 *
> > > > +	 * If the size is too small, mandatory commands will not work and so
> > > > +	 * there's no point in going forward. If the size is too large, there's
> > > > +	 * no harm is soft limiting it.
> > > > +	 */
> > > > +	cxlm->payload_size = min_t(size_t, cxlm->payload_size, SZ_1M);
> > > > +	if (cxlm->payload_size < 256) {
> > > > +		dev_err(&cxlm->pdev->dev, "Mailbox is too small (%zub)",
> > > > +			cxlm->payload_size);
> > > > +		return -ENXIO;
> > > > +	}
> > > > +
> > > > +	dev_dbg(&cxlm->pdev->dev, "Mailbox payload sized %zu",
> > > > +		cxlm->payload_size);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +static struct cxl_mem *cxl_mem_create(struct pci_dev *pdev, u32 reg_lo,
> > > > +				      u32 reg_hi)
> > > > +{
> > > > +	struct device *dev = &pdev->dev;
> > > > +	struct cxl_mem *cxlm;
> > > > +	void __iomem *regs;
> > > > +	u64 offset;
> > > > +	u8 bar;
> > > > +	int rc;
> > > > +
> > > > +	cxlm = devm_kzalloc(&pdev->dev, sizeof(*cxlm), GFP_KERNEL);
> > > > +	if (!cxlm) {
> > > > +		dev_err(dev, "No memory available\n");
> > > > +		return NULL;
> > > > +	}
> > > > +
> > > > +	offset = ((u64)reg_hi << 32) | FIELD_GET(CXL_REGLOC_ADDR_MASK, reg_lo);
> > > > +	bar = FIELD_GET(CXL_REGLOC_BIR_MASK, reg_lo);
> > > > +
> > > > +	/* Basic sanity check that BAR is big enough */
> > > > +	if (pci_resource_len(pdev, bar) < offset) {
> > > > +		dev_err(dev, "BAR%d: %pr: too small (offset: %#llx)\n", bar,
> > > > +			&pdev->resource[bar], (unsigned long long)offset);
> > > > +		return NULL;
> > > > +	}
> > > > +
> > > > +	rc = pcim_iomap_regions(pdev, BIT(bar), pci_name(pdev));
> > > > +	if (rc != 0) {
> > > > +		dev_err(dev, "failed to map registers\n");
> > > > +		return NULL;
> > > > +	}
> > > > +	regs = pcim_iomap_table(pdev)[bar];
> > > > +
> > > > +	mutex_init(&cxlm->mbox_mutex);
> > > > +	cxlm->pdev = pdev;
> > > > +	cxlm->regs = regs + offset;
> > > > +
> > > > +	dev_dbg(dev, "Mapped CXL Memory Device resource\n");
> > > > +	return cxlm;
> > > > +}
> > > >
> > > >  static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
> > > >  {
> > > > @@ -28,10 +423,85 @@ static int cxl_mem_dvsec(struct pci_dev *pdev, int dvsec)
> > > >  	return 0;
> > > >  }
> > > >
> > > > +/**
> > > > + * cxl_mem_identify() - Send the IDENTIFY command to the device.
> > > > + * @cxlm: The device to identify.
> > > > + *
> > > > + * Return: 0 if identify was executed successfully.
> > > > + *
> > > > + * This will dispatch the identify command to the device and on success populate
> > > > + * structures to be exported to sysfs.
> > > > + */
> > > > +static int cxl_mem_identify(struct cxl_mem *cxlm)
> > > > +{
> > > > +	struct cxl_mbox_identify {
> > > > +		char fw_revision[0x10];
> > > > +		__le64 total_capacity;
> > > > +		__le64 volatile_capacity;
> > > > +		__le64 persistent_capacity;
> > > > +		__le64 partition_align;
> > > > +		__le16 info_event_log_size;
> > > > +		__le16 warning_event_log_size;
> > > > +		__le16 failure_event_log_size;
> > > > +		__le16 fatal_event_log_size;
> > > > +		__le32 lsa_size;
> > > > +		u8 poison_list_max_mer[3];
> > > > +		__le16 inject_poison_limit;
> > > > +		u8 poison_caps;
> > > > +		u8 qos_telemetry_caps;
> > > > +	} __packed id;
> > > > +	struct mbox_cmd mbox_cmd = {
> > > > +		.opcode = CXL_MBOX_OP_IDENTIFY,
> > > > +		.payload_out = &id,
> > > > +		.size_in = 0,
> > > > +	};
> > > > +	int rc;
> > > > +
> > > > +	/* Retrieve initial device memory map */
> > > > +	rc = cxl_mem_mbox_get(cxlm);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	rc = cxl_mem_mbox_send_cmd(cxlm, &mbox_cmd);
> > > > +	cxl_mem_mbox_put(cxlm);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	/* TODO: Handle retry or reset responses from firmware. */
> > > > +	if (mbox_cmd.return_code != CXL_MBOX_SUCCESS) {
> > > > +		dev_err(&cxlm->pdev->dev, "Mailbox command failed (%d)\n",
> > > > +			mbox_cmd.return_code);
> > > > +		return -ENXIO;
> > > > +	}
> > > > +
> > > > +	if (mbox_cmd.size_out != sizeof(id))
> > > > +		return -ENXIO;
> > > > +
> > > > +	/*
> > > > +	 * TODO: enumerate DPA map, as 'ram' and 'pmem' do not alias.
> > > > +	 * For now, only the capacity is exported in sysfs
> > > > +	 */
> > > > +	cxlm->ram.range.start = 0;
> > > > +	cxlm->ram.range.end = le64_to_cpu(id.volatile_capacity) - 1;
> > > > +
> > > > +	cxlm->pmem.range.start = 0;
> > > > +	cxlm->pmem.range.end = le64_to_cpu(id.persistent_capacity) - 1;
> > > > +
> > > > +	memcpy(cxlm->firmware_version, id.fw_revision, sizeof(id.fw_revision));
> > > > +
> > > > +	return rc;
> > > > +}
> > > > +
> > > >  static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > >  {
> > > >  	struct device *dev = &pdev->dev;
> > > > -	int regloc;
> > > > +	struct cxl_mem *cxlm;
> > > > +	int rc, regloc, i;
> > > > +	u32 regloc_size;
> > > > +
> > > > +	rc = pcim_enable_device(pdev);
> > > > +	if (rc)
> > > > +		return rc;
> > > >
> > > >  	regloc = cxl_mem_dvsec(pdev, PCI_DVSEC_ID_CXL_REGLOC_OFFSET);
> > > >  	if (!regloc) {
> > > > @@ -39,7 +509,44 @@ static int cxl_mem_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > >  		return -ENXIO;
> > > >  	}
> > > >
> > > > -	return 0;
> > > > +	/* Get the size of the Register Locator DVSEC */
> > > > +	pci_read_config_dword(pdev, regloc + PCI_DVSEC_HEADER1, &regloc_size);
> > > > +	regloc_size = FIELD_GET(PCI_DVSEC_HEADER1_LENGTH_MASK, regloc_size);
> > > > +
> > > > +	regloc += PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET;
> > > > +
> > > > +	rc = -ENXIO;
> > > > +	for (i = regloc; i < regloc + regloc_size; i += 8) {
> > > > +		u32 reg_lo, reg_hi;
> > > > +		u8 reg_type;
> > > > +
> > > > +		/* "register low and high" contain other bits */
> > > > +		pci_read_config_dword(pdev, i, &reg_lo);
> > > > +		pci_read_config_dword(pdev, i + 4, &reg_hi);
> > > > +
> > > > +		reg_type = FIELD_GET(CXL_REGLOC_RBI_MASK, reg_lo);
> > > > +
> > > > +		if (reg_type == CXL_REGLOC_RBI_MEMDEV) {
> > > > +			rc = 0;
> > > > +			cxlm = cxl_mem_create(pdev, reg_lo, reg_hi);
> > > > +			if (!cxlm)
> > > > +				rc = -ENODEV;
> > > > +			break;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	rc = cxl_mem_setup_regs(cxlm);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	rc = cxl_mem_setup_mailbox(cxlm);
> > > > +	if (rc)
> > > > +		return rc;
> > > > +
> > > > +	return cxl_mem_identify(cxlm);
> > > >  }
> > > >
> > > >  static const struct pci_device_id cxl_mem_pci_tbl[] = {
> > > > diff --git a/drivers/cxl/pci.h b/drivers/cxl/pci.h
> > > > index f135b9f7bb21..ffcbc13d7b5b 100644
> > > > --- a/drivers/cxl/pci.h
> > > > +++ b/drivers/cxl/pci.h
> > > > @@ -14,5 +14,18 @@
> > > >  #define PCI_DVSEC_ID_CXL	0x0
> > > >
> > > >  #define PCI_DVSEC_ID_CXL_REGLOC_OFFSET	0x8
> > > > +#define PCI_DVSEC_ID_CXL_REGLOC_BLOCK1_OFFSET	0xC
> > > > +
> > > > +/* BAR Indicator Register (BIR) */
> > > > +#define CXL_REGLOC_BIR_MASK GENMASK(2, 0)
> > > > +
> > > > +/* Register Block Identifier (RBI) */
> > > > +#define CXL_REGLOC_RBI_MASK GENMASK(15, 8)
> > > > +#define CXL_REGLOC_RBI_EMPTY 0
> > > > +#define CXL_REGLOC_RBI_COMPONENT 1
> > > > +#define CXL_REGLOC_RBI_VIRT 2
> > > > +#define CXL_REGLOC_RBI_MEMDEV 3
> > > > +
> > > > +#define CXL_REGLOC_ADDR_MASK GENMASK(31, 16)
> > > >
> > > >  #endif /* __CXL_PCI_H__ */
> > > > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> > > > index e709ae8235e7..6267ca9ae683 100644
> > > > --- a/include/uapi/linux/pci_regs.h
> > > > +++ b/include/uapi/linux/pci_regs.h
> > > > @@ -1080,6 +1080,7 @@
> > > >
> > > >  /* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
> > > >  #define PCI_DVSEC_HEADER1	0x4 /* Designated Vendor-Specific Header1 */
> > > > +#define PCI_DVSEC_HEADER1_LENGTH_MASK	0xFFF00000
> > > >  #define PCI_DVSEC_HEADER2	0x8 /* Designated Vendor-Specific Header2 */
> > > >
> > > >  /* Data Link Feature */
> > > >
>
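
For anyone finding this thread later, the access-width problem described above can be modelled in user-space C. This is only a sketch, not kernel or QEMU code: fake_mmio, fake_readb(), fake_readq(), fake_memcpy_fromio() and read_reg32() are hypothetical names, and the way a 1-byte read is "replicated" is an assumed simplification of an emulated device that mishandles sub-word accesses. The copy loop follows the same structure as the arm64 __memcpy_fromio() quoted in this mail (byte reads until 8-byte aligned, 8-byte reads, byte reads for the tail), so a 4-byte register read ends up as four 1-byte reads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of an emulated register block (not a real API). */
struct fake_mmio {
	uint8_t backing[16];	/* register contents, little-endian */
	int byte_reads_ok;	/* nonzero: emulation handles 1-byte reads */
};

/*
 * 1-byte read. In the assumed broken case, every byte lane of a 4-byte
 * register returns that register's low byte.
 */
static uint8_t fake_readb(const struct fake_mmio *dev, size_t off)
{
	if (dev->byte_reads_ok)
		return dev->backing[off];
	return dev->backing[off & ~(size_t)3];
}

/* 8-byte read, assumed to be handled correctly by the emulation. */
static uint64_t fake_readq(const struct fake_mmio *dev, size_t off)
{
	uint64_t v;

	memcpy(&v, &dev->backing[off], sizeof(v));
	return v;
}

/* Same loop structure as the arm64 copy routine: 1, then 8, then 1 byte. */
static void fake_memcpy_fromio(void *to, const struct fake_mmio *dev,
			       size_t from, size_t count)
{
	uint8_t *dst = to;

	assert(from + count <= sizeof(dev->backing));

	while (count && (from & 7)) {
		*dst++ = fake_readb(dev, from++);
		count--;
	}
	while (count >= 8) {
		uint64_t v = fake_readq(dev, from);

		memcpy(dst, &v, sizeof(v));
		dst += 8;
		from += 8;
		count -= 8;
	}
	while (count) {
		*dst++ = fake_readb(dev, from++);
		count--;
	}
}

/* Read a 32-bit register holding 0x0000001c through the copy loop. */
uint32_t read_reg32(int byte_reads_ok)
{
	struct fake_mmio dev = {
		.backing = { 0x1c },
		.byte_reads_ok = byte_reads_ok,
	};
	uint8_t buf[4];

	fake_memcpy_fromio(buf, &dev, 0, sizeof(buf));
	return (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
	       (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
}
```

Under these assumptions read_reg32(0) gives 0x1c1c1c1c and read_reg32(1) gives 0x0000001c, which matches the symptom reported above.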
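
On the two levels of error checking: one possible shape for a wrapper that folds the firmware return code into the function's return value is sketched below in plain C. It is not the driver's code. struct mbox_cmd_sketch, mbox_send_checked() and the fake_send_*() stubs are hypothetical, and -ENXIO for a firmware-reported failure is an assumption borrowed from what cxl_mem_identify() in the quoted patch returns for the same condition.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's struct mbox_cmd. */
struct mbox_cmd_sketch {
	uint16_t opcode;
	uint16_t return_code;	/* 0 means the firmware reported success */
};

/* Transport-level send: returns 0 or a negative errno (e.g. -ETIMEDOUT). */
typedef int (*mbox_send_fn)(struct mbox_cmd_sketch *cmd);

/*
 * Single-level check: a transport failure or a nonzero firmware return
 * code both come back as a negative errno, so callers test one value.
 */
int mbox_send_checked(mbox_send_fn send, struct mbox_cmd_sketch *cmd)
{
	int rc;

	assert(send != NULL && cmd != NULL);

	rc = send(cmd);
	if (rc)
		return rc;
	if (cmd->return_code != 0)
		return -ENXIO;	/* assumed errno for a firmware error */
	return 0;
}

/* Stubs standing in for the real mailbox transport. */
int fake_send_ok(struct mbox_cmd_sketch *cmd)
{
	cmd->return_code = 0;
	return 0;
}

int fake_send_fw_error(struct mbox_cmd_sketch *cmd)
{
	cmd->return_code = 2;	/* arbitrary nonzero firmware code */
	return 0;
}

int fake_send_timeout(struct mbox_cmd_sketch *cmd)
{
	(void)cmd;
	return -ETIMEDOUT;
}
```

With a wrapper like this, a caller that forgets to look at return_code still sees a nonzero result when the firmware rejects the command; the raw code stays available in the command structure for callers that want it.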