Subject: Re: [PATCH v3 17/27] powerpc/powernv/pmem: Implement the Read Error Log command
To: "Alastair D'Silva", alastair@d-silva.org
Cc: "Aneesh Kumar K . V", "Oliver O'Halloran", Benjamin Herrenschmidt,
    Paul Mackerras, Michael Ellerman, Andrew Donnellan, Arnd Bergmann,
    Greg Kroah-Hartman, Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny,
    Andrew Morton, Mauro Carvalho Chehab, "David S. Miller", Rob Herring,
    Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar,
    Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar, Hari Bathini,
    Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada,
    Alexey Kardashevskiy, linux-kernel@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, linux-nvdimm@lists.01.org,
    linux-mm@kvack.org
References: <20200221032720.33893-1-alastair@au1.ibm.com>
    <20200221032720.33893-18-alastair@au1.ibm.com>
From: Frederic Barrat
Date: Tue, 3 Mar 2020 11:36:15 +0100
In-Reply-To: <20200221032720.33893-18-alastair@au1.ibm.com>
Message-Id: <7767dec4-fb78-dd3e-3720-8d15f544639e@linux.ibm.com>

On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> From: Alastair D'Silva
>
> The read error log command extracts information from the controller's
> internal error log.
>
> This patch exposes this information in 2 ways:
> - During probe, if an error occurs & a log is available, print it to the
>   console
> - After probe, make the error log available to userspace via an IOCTL.
>   Userspace is notified of pending error logs in a later patch
>   ("powerpc/powernv/pmem: Forward events to userspace")
>
> Signed-off-by: Alastair D'Silva
> ---
>  arch/powerpc/platforms/powernv/pmem/ocxl.c    | 269 ++++++++++++++++++
>  .../platforms/powernv/pmem/ocxl_internal.h    |   1 +
>  include/uapi/nvdimm/ocxl-pmem.h               |  46 +++
>  3 files changed, 316 insertions(+)
>  create mode 100644 include/uapi/nvdimm/ocxl-pmem.h
>
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index 63109a870d2c..2b64504f9129 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -447,10 +447,219 @@ static int file_release(struct inode *inode, struct file *file)
>  	return 0;
>  }
>
> +/**
> + * error_log_header_parse() - Parse the first 64 bits of the error log command response
> + * @ocxlpmem: the device metadata
> + * @length: out, returns the number of bytes in the response (excluding the 64 bit header)
> + */
> +static int error_log_header_parse(struct ocxlpmem *ocxlpmem, u16 *length)
> +{
> +	int rc;
> +	u64 val;
> +

Empty line in the middle of declarations

> +	u16 data_identifier;
> +	u32 data_length;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	data_identifier = val >> 48;
> +	data_length = val & 0xFFFF;
> +
> +	if (data_identifier != 0x454C) { // 'EL'
> +		dev_err(&ocxlpmem->dev,
> +			"Bad data identifier for error log data, expected 'EL', got '%2s' (%#x), data_length=%u\n",
> +			(char *)&data_identifier,
> +			(unsigned int)data_identifier, data_length);
> +		return -EINVAL;
> +	}
> +
> +	*length = data_length;
> +	return 0;
> +}
> +
> +static int error_log_offset_0x08(struct ocxlpmem *ocxlpmem,
> +				 u32 *log_identifier, u32 *program_ref_code)
> +{
> +	int rc;
> +	u64 val;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x08,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		return rc;
> +
> +	*log_identifier = val >> 32;
> +	*program_ref_code = val & 0xFFFFFFFF;
> +
> +	return 0;
> +}
> +
> +static int read_error_log(struct ocxlpmem *ocxlpmem,
> +			  struct ioctl_ocxl_pmem_error_log *log, bool buf_is_user)
> +{
> +	u64 val;
> +	u16 user_buf_length;
> +	u16 buf_length;
> +	u16 i;
> +	int rc;
> +
> +	if (log->buf_size % 8)
> +		return -EINVAL;
> +
> +	rc = ocxlpmem_chi(ocxlpmem, &val);
> +	if (rc)
> +		goto out;

"out" will unlock a mutex which has not been taken yet.

> +
> +	if (!(val & GLOBAL_MMIO_CHI_ELA))
> +		return -EAGAIN;
> +
> +	user_buf_length = log->buf_size;
> +
> +	mutex_lock(&ocxlpmem->admin_command.lock);
> +
> +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_execute(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +	rc = admin_command_complete_timeout(ocxlpmem, ADMIN_COMMAND_ERRLOG);
> +	if (rc < 0) {
> +		dev_warn(&ocxlpmem->dev, "Read error log timed out\n");
> +		goto out;
> +	}
> +
> +	rc = admin_response(ocxlpmem);
> +	if (rc < 0)
> +		goto out;
> +	if (rc != STATUS_SUCCESS) {
> +		warn_status(ocxlpmem, "Unexpected status from retrieve error log", rc);
> +		goto out;
> +	}
> +
> +
> +	rc = error_log_header_parse(ocxlpmem, &log->buf_size);
> +	if (rc)
> +		goto out;
> +	// log->buf_size now contains the returned buffer size, not the user size
> +
> +	rc = error_log_offset_0x08(ocxlpmem, &log->log_identifier,
> +				   &log->program_reference_code);
> +	if (rc)
> +		goto out;

Offset 0x08 gets preferential treatment compared to 0x10 below and it's
not clear why. I would create a subfunction which parses all the fields
linearly.
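
A rough sketch of what I have in mind, written as plain userspace C so it
can be tried in isolation. It assumes the seven 64-bit words at offsets
0x00 to 0x30 have already been read into an array, with whatever
endianness turns out to be right. struct error_log_fields and
parse_error_log_fields() are made-up names, and the shifts and masks are
simply the ones used in the patch:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the fixed part of the response; the field
 * names follow the quoted struct, the type itself is an assumption. */
struct error_log_fields {
	uint16_t data_length;
	uint32_t log_identifier;
	uint32_t program_reference_code;
	uint8_t  error_log_type;
	uint32_t action_flags;
	uint32_t power_on_seconds;
	uint64_t timestamp;
	uint64_t wwid[2];
	char     fw_revision[8 + 1];
};

/* words[n] is assumed to hold the 64-bit value read at offset n * 8 */
static int parse_error_log_fields(const uint64_t words[7],
				  struct error_log_fields *out)
{
	/* 0x00: 'EL' identifier in the top 16 bits, length in the low 16 */
	if ((words[0] >> 48) != 0x454C)
		return -EINVAL;
	out->data_length = words[0] & 0xFFFF;

	/* 0x08: log identifier and program reference code */
	out->log_identifier = words[1] >> 32;
	out->program_reference_code = words[1] & 0xFFFFFFFF;

	/* 0x10: type, action flags (general logs only), power-on seconds */
	out->error_log_type = words[2] >> 56;
	out->action_flags = (out->error_log_type == 0x00) ?
			    (words[2] >> 32) & 0xFFFFFF : 0;
	out->power_on_seconds = words[2] & 0xFFFFFFFF;

	/* 0x18 to 0x30: timestamp, WWID, firmware revision text */
	out->timestamp = words[3];
	out->wwid[0] = words[4];
	out->wwid[1] = words[5];
	memcpy(out->fw_revision, &words[6], 8);
	out->fw_revision[8] = '\0';

	return 0;
}
```

In the driver, the MMIO reads could then sit in a single loop filling the
array, with all the field extraction done in one place.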
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x10,
> +				     OCXL_LITTLE_ENDIAN, &val);
> +	if (rc)
> +		goto out;
> +
> +	log->error_log_type = val >> 56;
> +	log->action_flags = (log->error_log_type == OCXL_PMEM_ERROR_LOG_TYPE_GENERAL) ?
> +			    (val >> 32) & 0xFFFFFF : 0;
> +	log->power_on_seconds = val & 0xFFFFFFFF;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x18,
> +				     OCXL_LITTLE_ENDIAN, &log->timestamp);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x20,
> +				     OCXL_HOST_ENDIAN, &log->wwid[0]);

A bit of a moot point, but is there a reason why some of those MMIO ops
use OCXL_LITTLE_ENDIAN and the others OCXL_HOST_ENDIAN?

> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x28,
> +				     OCXL_HOST_ENDIAN, &log->wwid[1]);
> +	if (rc)
> +		goto out;
> +
> +	rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +				     ocxlpmem->admin_command.data_offset + 0x30,
> +				     OCXL_HOST_ENDIAN, (u64 *)log->fw_revision);
> +	if (rc)
> +		goto out;
> +	log->fw_revision[8] = '\0';
> +
> +	buf_length = (user_buf_length < log->buf_size) ?
> +		     user_buf_length : log->buf_size;
> +	for (i = 0; i < buf_length + 0x48; i += 8) {
> +		u64 val;
> +
> +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> +					     ocxlpmem->admin_command.data_offset + i,
> +					     OCXL_HOST_ENDIAN, &val);
> +		if (rc)
> +			goto out;
> +
> +		if (buf_is_user) {
> +			if (copy_to_user(&log->buf[i], &val, sizeof(u64))) {
> +				rc = -EFAULT;
> +				goto out;
> +			}
> +		} else
> +			log->buf[i] = val;
> +	}

I think this could be simplified a bit by keeping the handling of the
user buffer out of this function. Always call it with a kernel buffer,
and have only one copy_to_user() call on the ioctl() path.
You'd need to allocate a kernel buf on the ioctl path, but you're already
doing it on the probe() path, so it should be doable to share code.

> +
> +	rc = admin_response_handled(ocxlpmem);
> +	if (rc)
> +		goto out;
> +
> +out:
> +	mutex_unlock(&ocxlpmem->admin_command.lock);
> +	return rc;
> +
> +}
> +
> +static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
> +			   struct ioctl_ocxl_pmem_error_log __user *uarg)
> +{
> +	struct ioctl_ocxl_pmem_error_log args;
> +	int rc;
> +
> +	if (copy_from_user(&args, uarg, sizeof(args)))
> +		return -EFAULT;
> +
> +	rc = read_error_log(ocxlpmem, &args, true);
> +	if (rc)
> +		return rc;
> +
> +	if (copy_to_user(uarg, &args, sizeof(args)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
> +static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> +{
> +	struct ocxlpmem *ocxlpmem = file->private_data;
> +	int rc = -EINVAL;
> +
> +	switch (cmd) {
> +	case IOCTL_OCXL_PMEM_ERROR_LOG:
> +		rc = ioctl_error_log(ocxlpmem,
> +				     (struct ioctl_ocxl_pmem_error_log __user *)args);
> +		break;
> +	}
> +	return rc;
> +}
> +
>  static const struct file_operations fops = {
>  	.owner = THIS_MODULE,
>  	.open = file_open,
>  	.release = file_release,
> +	.unlocked_ioctl = file_ioctl,
> +	.compat_ioctl = file_ioctl,
>  };
>
>  /**
> @@ -527,6 +736,60 @@ static int read_device_metadata(struct ocxlpmem *ocxlpmem)
>  	return 0;
>  }
>
> +static const char *decode_error_log_type(u8 error_log_type)
> +{
> +	switch (error_log_type) {
> +	case 0x00:
> +		return "general";
> +	case 0x01:
> +		return "predictive failure";
> +	case 0x02:
> +		return "thermal warning";
> +	case 0x03:
> +		return "data loss";
> +	case 0x04:
> +		return "health & performance";
> +	default:
> +		return "unknown";
> +	}
> +}
> +
> +static void dump_error_log(struct ocxlpmem *ocxlpmem)
> +{
> +	struct ioctl_ocxl_pmem_error_log log;
> +	u32 buf_size;
> +	u8 *buf;
> +	int rc;
> +
> +	if (ocxlpmem->admin_command.data_size == 0)
> +		return;
> +
> +	buf_size = ocxlpmem->admin_command.data_size - 0x48;
> +	buf = kzalloc(buf_size, GFP_KERNEL);
> +	if (!buf)
> +		return;
> +
> +	log.buf = buf;
> +	log.buf_size = buf_size;
> +
> +	rc = read_error_log(ocxlpmem, &log, false);
> +	if (rc < 0)
> +		goto out;
> +
> +	dev_warn(&ocxlpmem->dev,
> +		 "OCXL PMEM Error log: WWID=0x%016llx%016llx LID=0x%x PRC=%x type=0x%x %s, Uptime=%u seconds timestamp=0x%llx\n",
> +		 log.wwid[0], log.wwid[1],
> +		 log.log_identifier, log.program_reference_code,
> +		 log.error_log_type,
> +		 decode_error_log_type(log.error_log_type),
> +		 log.power_on_seconds, log.timestamp);
> +	print_hex_dump(KERN_WARNING, "buf", DUMP_PREFIX_OFFSET, 16, 1, buf,
> +		       log.buf_size, false);

dev_warn already logs a warning. Isn't KERN_DEBUG more appropriate for
the hex dump?

> +
> +out:
> +	kfree(buf);
> +}
> +
>  /**
>   * probe_function0() - Set up function 0 for an OpenCAPI persistent memory device
>   * This is important as it enables templates higher than 0 across all other functions,
> @@ -568,6 +831,7 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	struct ocxlpmem *ocxlpmem;
>  	int rc;
>  	u16 elapsed, timeout;
> +	u64 chi;
>
>  	if (PCI_FUNC(pdev->devfn) == 0)
>  		return probe_function0(pdev);
> @@ -667,6 +931,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	return 0;
>
>  err:
> +	if (ocxlpmem &&
> +	    (ocxlpmem_chi(ocxlpmem, &chi) == 0) &&
> +	    (chi & GLOBAL_MMIO_CHI_ELA))
> +		dump_error_log(ocxlpmem);
> +
>  	/*
>  	 * Further cleanup is done in the release handler via free_ocxlpmem()
>  	 * This allows us to keep the character device live to handle IOCTLs to
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> index d2d81fec7bb1..b953ee522ed4 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl_internal.h
> @@ -5,6 +5,7 @@
>  #include
>  #include
>  #include
> +#include

Can't we limit the extra include to ocxl.c?

Completely unrelated, but ocxl.c contains most of the code for this
driver. We should consider renaming it to ocxlpmem.c or something along
those lines, since it does a lot more than just interfacing with the
opencapi interface. That would also avoid confusion with another, already
existing ocxl.c file.

>  #include
>
>  #define LABEL_AREA_SIZE (1UL << PA_SECTION_SHIFT)
> diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> new file mode 100644
> index 000000000000..b10f8ac0c20f
> --- /dev/null
> +++ b/include/uapi/nvdimm/ocxl-pmem.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
> +/* Copyright 2017 IBM Corp. */
> +#ifndef _UAPI_OCXL_SCM_H
> +#define _UAPI_OCXL_SCM_H
> +
> +#include
> +#include
> +
> +#define OCXL_PMEM_ERROR_LOG_ACTION_RESET	(1 << (32-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_CHKFW	(1 << (53-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_REPLACE	(1 << (54-32))
> +#define OCXL_PMEM_ERROR_LOG_ACTION_DUMP	(1 << (55-32))
> +
> +#define OCXL_PMEM_ERROR_LOG_TYPE_GENERAL		(0x00)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_PREDICTIVE_FAILURE	(0x01)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_THERMAL_WARNING	(0x02)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_DATA_LOSS		(0x03)
> +#define OCXL_PMEM_ERROR_LOG_TYPE_HEALTH_PERFORMANCE	(0x04)
> +
> +struct ioctl_ocxl_pmem_error_log {
> +	__u32 log_identifier; /* out */
> +	__u32 program_reference_code; /* out */
> +	__u32 action_flags; /* out, recommended course of action */
> +	__u32 power_on_seconds; /* out, Number of seconds the controller has been on when the error occurred */
> +	__u64 timestamp; /* out, relative time since the current IPL */
> +	__u64 wwid[2]; /* out, the NAA formatted WWID associated with the controller */
> +	char fw_revision[8+1]; /* out, firmware revision as null terminated text */

The 8+1 size will make the compiler add some padding here.
Are we confident that all compilers, at least on powerpc, will do the
same thing, so that we can guarantee the kernel ABI? I would play it safe
and have a discussion with folks who understand compilers better.

> +	__u16 buf_size; /* in/out, buffer size provided/required.
> +			 * If required is greater than provided, the buffer
> +			 * will be truncated to the amount provided. If its
> +			 * less, then only the required bytes will be populated.
> +			 * If it is 0, then there are no more error log entries.
> +			 */
> +	__u8 error_log_type;
> +	__u8 reserved1;
> +	__u32 reserved2;
> +	__u64 reserved3[2];
> +	__u8 *buf; /* pointer to output buffer */
> +};
> +
> +/* ioctl numbers */
> +#define OCXL_PMEM_MAGIC 0x5C

Was this picked at random? See (and add an entry in)
Documentation/userspace-api/ioctl/ioctl-number.rst

  Fred

> +/* SCM devices */
> +#define IOCTL_OCXL_PMEM_ERROR_LOG _IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> +
> +#endif /* _UAPI_OCXL_SCM_H */
>
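
Coming back to the fw_revision[8+1] padding question: one option, and
this is only a sketch, would be to order the members so that no implicit
padding is needed and to check the layout at build time. The struct below
is a hypothetical rearrangement, not the layout from the patch. It uses
<stdint.h> types so that it builds in userspace, and carrying the buffer
pointer as a 64-bit integer is my assumption as well:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: every member lands on its natural alignment, so
 * the compiler has no padding decision left to make. */
struct error_log_abi {
	uint32_t log_identifier;
	uint32_t program_reference_code;
	uint32_t action_flags;
	uint32_t power_on_seconds;
	uint64_t timestamp;
	uint64_t wwid[2];
	char     fw_revision[8 + 1];	/* ends at offset 49 */
	uint8_t  error_log_type;	/* fills the odd byte at 49 */
	uint16_t buf_size;		/* offset 50, 2-byte aligned */
	uint32_t reserved1;		/* offset 52, 4-byte aligned */
	uint64_t reserved2[2];		/* offset 56, 8-byte aligned */
	uint64_t buf;			/* user pointer as a fixed-width value */
};

/* Build-time checks: a layout change breaks the build, not the ABI */
_Static_assert(offsetof(struct error_log_abi, fw_revision) == 40,
	       "fw_revision moved");
_Static_assert(offsetof(struct error_log_abi, buf_size) == 50,
	       "buf_size moved");
_Static_assert(offsetof(struct error_log_abi, buf) == 72,
	       "buf moved");
_Static_assert(sizeof(struct error_log_abi) == 80,
	       "unexpected padding in struct error_log_abi");
```

With something along these lines, the size and offsets no longer depend
on what a given compiler does after the 9-byte array.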