Subject: Re: [PATCH v3 18/27] powerpc/powernv/pmem: Add controller dump IOCTLs
From: "Alastair D'Silva"
To: Frederic Barrat
Cc: "Aneesh Kumar K . V", "Oliver O'Halloran", Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman, Dan Williams, Vishal Verma, Dave Jiang, Ira Weiny, Andrew Morton, Mauro Carvalho Chehab, "David S. Miller", Rob Herring, Anton Blanchard, Krzysztof Kozlowski, Mahesh Salgaonkar, Madhavan Srinivasan, Cédric Le Goater, Anju T Sudhakar, Hari Bathini, Thomas Gleixner, Greg Kurz, Nicholas Piggin, Masahiro Yamada, Alexey Kardashevskiy, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-nvdimm@lists.01.org, linux-mm@kvack.org
Date: Fri, 06 Mar 2020 10:37:12 +1100
In-Reply-To: <6d1f28bc-334c-e85b-9974-71cf88a1ad20@linux.ibm.com>
References: <20200221032720.33893-1-alastair@au1.ibm.com> <20200221032720.33893-19-alastair@au1.ibm.com> <6d1f28bc-334c-e85b-9974-71cf88a1ad20@linux.ibm.com>
Organization: IBM Australia
Message-Id: <6410f5f56e6d0c902026b7e323c352d5d1f7bb17.camel@au1.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2020-03-03 at 19:04 +0100, Frederic Barrat wrote:
>
> On 21/02/2020 at 04:27, Alastair D'Silva wrote:
> > From: Alastair D'Silva
> >
> > This patch adds IOCTLs to allow userspace to request & fetch dumps
> > of the internal controller state.
> >
> > This is useful during debugging or when a fatal error on the controller
> > has occurred.
> >
> > Signed-off-by: Alastair D'Silva
> > ---
> >  arch/powerpc/platforms/powernv/pmem/ocxl.c | 132 +++++++++++++++++++++
> >  include/uapi/nvdimm/ocxl-pmem.h            |  15 +++
> >  2 files changed, 147 insertions(+)
> >
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index 2b64504f9129..2cabafe1fc58 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -640,6 +640,124 @@ static int ioctl_error_log(struct ocxlpmem *ocxlpmem,
> >  	return 0;
> >  }
> >  
> > +static int ioctl_controller_dump_data(struct ocxlpmem *ocxlpmem,
> > +	struct ioctl_ocxl_pmem_controller_dump_data __user *uarg)
> > +{
> > +	struct ioctl_ocxl_pmem_controller_dump_data args;
> > +	u16 i;
> > +	u64 val;
> > +	int rc;
> > +
> > +	if (copy_from_user(&args, uarg, sizeof(args)))
> > +		return -EFAULT;
> > +
> > +	if (args.buf_size % 8)
> > +		return -EINVAL;
> > +
> > +	if (args.buf_size > ocxlpmem->admin_command.data_size)
> > +		return -EINVAL;
> > +
> > +	mutex_lock(&ocxlpmem->admin_command.lock);
> > +
> > +	rc = admin_command_request(ocxlpmem, ADMIN_COMMAND_CONTROLLER_DUMP);
> > +	if (rc)
> > +		goto out;
> > +
> > +	val = ((u64)args.offset) << 32;
> > +	val |= args.buf_size;
> > +	rc = ocxl_global_mmio_write64(ocxlpmem->ocxl_afu,
> > +				      ocxlpmem->admin_command.request_offset + 0x08,
> > +				      OCXL_LITTLE_ENDIAN, val);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_execute(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +	rc = admin_command_complete_timeout(ocxlpmem,
> > +					    ADMIN_COMMAND_CONTROLLER_DUMP);
> > +	if (rc < 0) {
> > +		dev_warn(&ocxlpmem->dev, "Controller dump timed out\n");
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response(ocxlpmem);
> > +	if (rc < 0)
> > +		goto out;
> > +	if (rc != STATUS_SUCCESS) {
> > +		warn_status(ocxlpmem,
> > +			    "Unexpected status from retrieve error log",
> > +			    rc);
> > +		goto out;
> > +	}
>
>
> It would help if there was a comment indicating how the 3 ioctls are
> used. My understanding is that the userland is:
> - requesting the controller to prepare a state dump
> - then one or more ioctls to fetch the data. The number of calls
>   required to get the full state really depends on the size of the
>   buffer passed by user
> - a last ioctl to tell the controller that we're done, presumably to
>   let it free some resources.
>

Ok, will add it to the blurb.

>
> > +
> > +	for (i = 0; i < args.buf_size; i += 8) {
> > +		u64 val;
> > +
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     ocxlpmem->admin_command.data_offset + i,
> > +					     OCXL_HOST_ENDIAN, &val);
> > +		if (rc)
> > +			goto out;
> > +
> > +		if (copy_to_user(&args.buf[i], &val, sizeof(u64))) {
> > +			rc = -EFAULT;
> > +			goto out;
> > +		}
> > +	}
> > +
> > +	if (copy_to_user(uarg, &args, sizeof(args))) {
> > +		rc = -EFAULT;
> > +		goto out;
> > +	}
> > +
> > +	rc = admin_response_handled(ocxlpmem);
> > +	if (rc)
> > +		goto out;
> > +
> > +out:
> > +	mutex_unlock(&ocxlpmem->admin_command.lock);
> > +	return rc;
> > +}
> > +
> > +int request_controller_dump(struct ocxlpmem *ocxlpmem)
> > +{
> > +	int rc;
> > +	u64 busy = 1;
> > +
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_CHIC,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_CHI_CDA);
> > +
>
> rc is not checked here.

Whoops

>
>
> > +
> > +	rc = ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> > +				    OCXL_LITTLE_ENDIAN,
> > +				    GLOBAL_MMIO_HCI_CONTROLLER_DUMP);
> > +	if (rc)
> > +		return rc;
> > +
> > +	while (busy) {
> > +		rc = ocxl_global_mmio_read64(ocxlpmem->ocxl_afu,
> > +					     GLOBAL_MMIO_HCI,
> > +					     OCXL_LITTLE_ENDIAN, &busy);
> > +		if (rc)
> > +			return rc;
> > +
> > +		busy &= GLOBAL_MMIO_HCI_CONTROLLER_DUMP;
>
> Setting 'busy' doesn't hurt, but it's not really useful, is it?
>
> We should add some kind of timeout so that if the controller hits an
> issue, we don't spin in kernel space endlessly.
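
A minimal sketch of the bounded polling asked for above, written as plain userspace C so it can be run standalone. Everything in it is an assumption for illustration: poll_bits_clear(), read_reg_fn, struct fake_ctrl and the read-count budget are hypothetical names and are not part of the ocxl API. A kernel version would more likely bound the loop by elapsed time and sleep or reschedule between reads, and the actual limit still has to come from the firmware spec.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a register read; NOT the ocxl API. */
typedef int (*read_reg_fn)(void *ctx, uint64_t *val);

/*
 * Poll until every bit in 'mask' reads back as clear, giving up after
 * 'max_polls' reads. Returns 0 once the bits are clear, -ETIMEDOUT if
 * they are still set when the budget runs out, or the read error.
 */
static int poll_bits_clear(read_reg_fn read_reg, void *ctx,
			   uint64_t mask, unsigned int max_polls)
{
	uint64_t val;
	unsigned int i;
	int rc;

	for (i = 0; i < max_polls; i++) {
		rc = read_reg(ctx, &val);
		if (rc)
			return rc;

		/* Mask off the bits we don't care about. */
		if (!(val & mask))
			return 0;

		/* A kernel version would sleep or reschedule here. */
	}

	return -ETIMEDOUT;
}

/* Fake controller for the demo: clears its register after N reads. */
struct fake_ctrl {
	uint64_t reg;
	unsigned int reads_until_clear;	/* 0 means "never clears" */
};

static int fake_read(void *ctx, uint64_t *val)
{
	struct fake_ctrl *c = ctx;

	if (c->reads_until_clear > 0 && --c->reads_until_clear == 0)
		c->reg = 0;

	*val = c->reg;
	return 0;
}
```

With a fake controller that clears the bit on the third read, poll_bits_clear() returns 0. With one that never clears it, the call returns -ETIMEDOUT after the budget is spent instead of spinning forever.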
>
>

Here we are polling the controller dump bit of the HCI register until
the controller clears it - that line is masking off the bits we don't
care about.

I'll talk to the firmware team about adding a timeout for that to the
spec so we know how long to wait before giving up.

>
> > +		cond_resched();
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int ioctl_controller_dump_complete(struct ocxlpmem *ocxlpmem)
> > +{
> > +	return ocxl_global_mmio_set64(ocxlpmem->ocxl_afu, GLOBAL_MMIO_HCI,
> > +				      OCXL_LITTLE_ENDIAN,
> > +				      GLOBAL_MMIO_HCI_CONTROLLER_DUMP_COLLECTED);
> > +}
> > +
> >  static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> >  {
> >  	struct ocxlpmem *ocxlpmem = file->private_data;
> > @@ -650,7 +768,21 @@ static long file_ioctl(struct file *file, unsigned int cmd, unsigned long args)
> >  		rc = ioctl_error_log(ocxlpmem,
> >  				     (struct ioctl_ocxl_pmem_error_log __user *)args);
> >  		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP:
> > +		rc = request_controller_dump(ocxlpmem);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA:
> > +		rc = ioctl_controller_dump_data(ocxlpmem,
> > +						(struct ioctl_ocxl_pmem_controller_dump_data __user *)args);
> > +		break;
> > +
> > +	case IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE:
> > +		rc = ioctl_controller_dump_complete(ocxlpmem);
> > +		break;
> >  	}
> > +
> >  	return rc;
> >  }
> >  
> > diff --git a/include/uapi/nvdimm/ocxl-pmem.h b/include/uapi/nvdimm/ocxl-pmem.h
> > index b10f8ac0c20f..d4d8512d03f7 100644
> > --- a/include/uapi/nvdimm/ocxl-pmem.h
> > +++ b/include/uapi/nvdimm/ocxl-pmem.h
> > @@ -38,9 +38,24 @@ struct ioctl_ocxl_pmem_error_log {
> >  	__u8 *buf; /* pointer to output buffer */
> >  };
> >  
> > +struct ioctl_ocxl_pmem_controller_dump_data {
> > +	__u8 *buf; /* pointer to output buffer */
>
> We only support 64-bit user apps on powerpc, but using a pointer type in
> a kernel ABI is unusual. We should use a known size like __u64.
> (this also applies to the buf pointer in struct ioctl_ocxl_pmem_error_log
> from the previous patch)
>
> The rest of the structure will also be padded by the compiler, which we
> should avoid.
>
>    Fred
>

Ok, I'll coerce the pointers into a __u64.

>
>
> > +	__u16 buf_size; /* in/out, buffer size provided/required.
> > +			 * If required is greater than provided, the buffer
> > +			 * will be truncated to the amount provided. If its
> > +			 * less, then only the required bytes will be populated.
> > +			 * If it is 0, then there is no more dump data available.
> > +			 */
> > +	__u32 offset; /* in, Offset within the dump */
> > +	__u64 reserved[8];
> > +};
> > +
> >  /* ioctl numbers */
> >  #define OCXL_PMEM_MAGIC 0x5C
> >  /* SCM devices */
> >  #define IOCTL_OCXL_PMEM_ERROR_LOG			_IOWR(OCXL_PMEM_MAGIC, 0x01, struct ioctl_ocxl_pmem_error_log)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP			_IO(OCXL_PMEM_MAGIC, 0x02)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_DATA		_IOWR(OCXL_PMEM_MAGIC, 0x03, struct ioctl_ocxl_pmem_controller_dump_data)
> > +#define IOCTL_OCXL_PMEM_CONTROLLER_DUMP_COMPLETE	_IO(OCXL_PMEM_MAGIC, 0x04)
> >  
> >  #endif /* _UAPI_OCXL_SCM_H */
> >

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
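
For reference, a rough sketch of what the two layout comments above could translate to: the buffer pointer carried as a 64-bit integer, and the fields ordered with explicit padding so the compiler inserts none. This is an illustration only, not the revised patch. struct example_dump_data, its field order, the padding member and the two helper functions are all hypothetical, and it uses <stdint.h> types rather than __u64 so that it builds as ordinary userspace C.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical fixed-layout variant of the dump-data argument struct.
 * Largest fields first, explicit padding, no raw pointer members.
 */
struct example_dump_data {
	uint64_t buf;		/* user buffer address, carried as an integer */
	uint32_t offset;	/* in, offset within the dump */
	uint16_t buf_size;	/* in/out, buffer size provided/required */
	uint16_t padding;	/* explicit, keeps 'reserved' 8-byte aligned */
	uint64_t reserved[8];
};

/* The layout should be identical on every ABI: 8 + 4 + 2 + 2 + 64 bytes. */
_Static_assert(sizeof(struct example_dump_data) == 80,
	       "unexpected size, compiler inserted padding");
_Static_assert(offsetof(struct example_dump_data, reserved) == 16,
	       "reserved[] is not where we expect it");

/* Userspace side: store a pointer in the 64-bit field. */
static inline uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Reverse conversion, shown only to demonstrate the round trip. */
static inline void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```

On the kernel side the integer would be turned back into a user pointer before any copy_to_user() call; the existing u64_to_user_ptr() helper does that conversion.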