2018-01-17 10:54:07

by Johannes Thumshirn

[permalink] [raw]
Subject: [PATCH v3 0/2] add tracepoints for nvme command submission and completion

Add tracepoints for nvme command submission and completion. The tracepoints
are modeled after SCSI's trace_scsi_dispatch_cmd_start() and
trace_scsi_dispatch_cmd_done() tracepoints and fulfil a similar purpose,
namely a fast way to check which command is going to be queued into the HW or
Fabric driver and which command is completed again.

Here's an example output using the qemu emulated pci nvme:

# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
dd-205 [002] .... 7.128368: nvme_setup_cmd: nsid=1, command_id=568, flags=0x0, metadata=0x0, cmd=(nvme_cmd_read slba=0, length=31, control=0x0, dsmgmt=0, reftag=0)
<idle>-0 [002] d.h. 7.129098: nvme_complete_rq: command_id=568, result=0, retries=0, flags=0x0, status=0
dd-205 [002] .... 7.129150: nvme_setup_cmd: nsid=1, command_id=568, flags=0x0, metadata=0x0, cmd=(nvme_cmd_read slba=32, length=63, control=0x0, dsmgmt=0, reftag=0)
dd-205 [002] .... 7.129189: nvme_setup_cmd: nsid=1, command_id=569, flags=0x0, metadata=0x0, cmd=(nvme_cmd_read slba=96, length=127, control=0x0, dsmgmt=0, reftag=0)
<idle>-0 [002] d.h. 7.129318: nvme_complete_rq: command_id=568, result=0, retries=0, flags=0x0, status=0
dd-205 [002] d.h. 7.129355: nvme_complete_rq: command_id=569, result=0, retries=0, flags=0x0, status=0
dd-206 [003] .... 7.137389: nvme_setup_cmd: nsid=1, command_id=22, flags=0x0, metadata=0x0, cmd=(nvme_cmd_write slba=0, length=2559, control=0x0, dsmgmt=0, reftag=0)
dd-206 [003] .... 7.137888: nvme_setup_cmd: nsid=1, command_id=23, flags=0x0, metadata=0x0, cmd=(nvme_cmd_write slba=2560, length=2559, control=0x0, dsmgmt=0, reftag=0)
dd-206 [003] .... 7.138365: nvme_setup_cmd: nsid=1, command_id=24, flags=0x0, metadata=0x0, cmd=(nvme_cmd_write slba=5120, length=2559, control=0x0, dsmgmt=0, reftag=0)
dd-206 [003] d.h. 7.138634: nvme_complete_rq: command_id=22, result=0, retries=0, flags=0x0, status=0
dd-206 [003] .... 7.139014: nvme_setup_cmd: nsid=1, command_id=25, flags=0x0, metadata=0x0, cmd=(nvme_cmd_write slba=7680, length=2559, control=0x0, dsmgmt=0, reftag=0)
<idle>-0 [003] d.h. 7.139658: nvme_complete_rq: command_id=23, result=0, retries=0, flags=0x0, status=0
<idle>-0 [003] d.h. 7.140650: nvme_complete_rq: command_id=24, result=0, retries=0, flags=0x0, status=0
<idle>-0 [003] d.h. 7.141670: nvme_complete_rq: command_id=25, result=0, retries=0, flags=0x0, status=0

Johannes Thumshirn (2):
nvme: add tracepoint for nvme_setup_cmd
nvme: add tracepoint for nvme_complete_rq

drivers/nvme/host/Makefile | 5 ++-
drivers/nvme/host/core.c | 6 +++
drivers/nvme/host/trace.c | 82 ++++++++++++++++++++++++++++++++++++
drivers/nvme/host/trace.h | 103 +++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 195 insertions(+), 1 deletion(-)
create mode 100644 drivers/nvme/host/trace.c
create mode 100644 drivers/nvme/host/trace.h

--
2.12.3


2018-01-17 10:54:04

by Johannes Thumshirn

[permalink] [raw]
Subject: [PATCH v3 2/2] nvme: add tracepoint for nvme_complete_rq

Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>

---
Changes to v2:
* Pass the whole struct request to the tracepoint
* Removed spaces after parenthesis (Christoph)
---
drivers/nvme/host/core.c | 2 ++
drivers/nvme/host/trace.h | 26 ++++++++++++++++++++++++++
2 files changed, 28 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9c5d476c8c6d..c61014d56279 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -193,6 +193,8 @@ static inline bool nvme_req_needs_retry(struct request *req)

void nvme_complete_rq(struct request *req)
{
+ trace_nvme_complete_rq(req);
+
if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
if (nvme_req_needs_failover(req)) {
nvme_failover_req(req);
diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
index a168d806d494..b416dc0760c1 100644
--- a/drivers/nvme/host/trace.h
+++ b/drivers/nvme/host/trace.h
@@ -22,6 +22,8 @@
#include <linux/tracepoint.h>
#include <linux/trace_seq.h>

+#include "nvme.h"
+
#define nvme_opcode_name(opcode) { opcode, #opcode }
#define show_opcode_name(val) \
__print_symbolic(val, \
@@ -66,6 +68,30 @@ TRACE_EVENT(nvme_setup_cmd,
__parse_nvme_cmd(__entry->cmnd))
);

+TRACE_EVENT(nvme_complete_rq,
+ TP_PROTO(struct request *req),
+ TP_ARGS(req),
+ TP_STRUCT__entry(
+ __field(int, cid)
+ __field(__le64, result)
+ __field(u8, retries)
+ __field(u8, flags)
+ __field(u16, status)
+ ),
+ TP_fast_assign(
+ __entry->cid = req->tag;
+ __entry->result = nvme_req(req)->result.u64;
+ __entry->retries = nvme_req(req)->retries;
+ __entry->flags = nvme_req(req)->flags;
+ __entry->status = nvme_req(req)->status;
+ ),
+ TP_printk("cmdid=%u, res=%llu, retries=%u, flags=0x%x, status=%u",
+ __entry->cid,
+ le64_to_cpu(__entry->result),
+ __entry->retries, __entry->flags, __entry->status)
+
+);
+
#endif /* _TRACE_NVME_H */

#undef TRACE_INCLUDE_PATH
--
2.12.3

2018-01-17 10:54:30

by Johannes Thumshirn

[permalink] [raw]
Subject: [PATCH v3 1/2] nvme: add tracepoint for nvme_setup_cmd

Signed-off-by: Johannes Thumshirn <[email protected]>

---
Changes to v2:
* Don't cast le64_to_cpu() conversions to unsigned long long (Christoph)
* Add proper copyright header (Christoph)
* Move trace decoding into own file (Christoph)
* Include the src directory in the Makefile for trace (Christoph)
* Removed spaces before and after parenthesis (Christoph)
* Reduced print lines to fit the 80 char limit (Christoph)

Changes to v1:
* Fix typo (Hannes)
* move include/trace/events/nvme.h -> drivers/nvme/host/trace.h (Christoph)
---
drivers/nvme/host/Makefile | 5 ++-
drivers/nvme/host/core.c | 4 +++
drivers/nvme/host/trace.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++
drivers/nvme/host/trace.h | 77 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 167 insertions(+), 1 deletion(-)
create mode 100644 drivers/nvme/host/trace.c
create mode 100644 drivers/nvme/host/trace.h

diff --git a/drivers/nvme/host/Makefile b/drivers/nvme/host/Makefile
index a25fd43650ad..3e42ed0e1be5 100644
--- a/drivers/nvme/host/Makefile
+++ b/drivers/nvme/host/Makefile
@@ -1,11 +1,14 @@
# SPDX-License-Identifier: GPL-2.0
+
+ccflags-y += -I$(src)
+
obj-$(CONFIG_NVME_CORE) += nvme-core.o
obj-$(CONFIG_BLK_DEV_NVME) += nvme.o
obj-$(CONFIG_NVME_FABRICS) += nvme-fabrics.o
obj-$(CONFIG_NVME_RDMA) += nvme-rdma.o
obj-$(CONFIG_NVME_FC) += nvme-fc.o

-nvme-core-y := core.o
+nvme-core-y := trace.o core.o
nvme-core-$(CONFIG_NVME_MULTIPATH) += multipath.o
nvme-core-$(CONFIG_NVM) += lightnvm.o

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 839650e0926a..9c5d476c8c6d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -29,6 +29,9 @@
#include <linux/pm_qos.h>
#include <asm/unaligned.h>

+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
#include "nvme.h"
#include "fabrics.h"

@@ -591,6 +594,7 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
}

cmd->common.command_id = req->tag;
+ trace_nvme_setup_cmd(cmd);
return ret;
}
EXPORT_SYMBOL_GPL(nvme_setup_cmd);
diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
new file mode 100644
index 000000000000..9400cf53c43a
--- /dev/null
+++ b/drivers/nvme/host/trace.c
@@ -0,0 +1,82 @@
+/*
+ * NVM Express device driver tracepoints
+ * Copyright (c) 2018 Johannes Thumshirn, SUSE Linux GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+
+#include "trace.h"
+
+static const char *nvme_trace_read_write(struct trace_seq *p,
+ struct nvme_rw_command rw)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+
+ trace_seq_printf(p,
+ "slba=%llu, len=%u, ctrl=0x%x, dsmgmt=%u, reftag=%u",
+ le64_to_cpu(rw.slba), le16_to_cpu(rw.length),
+ le16_to_cpu(rw.control), le32_to_cpu(rw.dsmgmt),
+ le32_to_cpu(rw.reftag));
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
+static const char *nvme_trace_dsm(struct trace_seq *p, struct nvme_dsm_cmd dsm)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+
+ trace_seq_printf(p, "nr=%u, attributes=%u",
+ le32_to_cpu(dsm.nr), le32_to_cpu(dsm.attributes));
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
+static const char *nvme_trace_write_zeroes(struct trace_seq *p,
+ struct nvme_write_zeroes_cmd wz)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+
+ trace_seq_printf(p,
+ "slba=%llu, len=%u, ctrl=0x%x, dsmgmt=%u, reftag=%u",
+ le64_to_cpu(wz.slba), le16_to_cpu(wz.length),
+ le16_to_cpu(wz.control), le32_to_cpu(wz.dsmgmt),
+ le32_to_cpu(wz.reftag));
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
+static const char *nvme_trace_common(struct trace_seq *p,
+ struct nvme_common_command c)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+
+ trace_seq_printf(p, "cdw2=%*ph, cdw10=%*ph", 4, c.cdw2, 24, c.cdw10);
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
+const char *nvme_trace_parse_cmd(struct trace_seq *p, struct nvme_command cmd)
+{
+ switch (cmd.common.opcode) {
+ case nvme_cmd_read:
+ case nvme_cmd_write:
+ return nvme_trace_read_write(p, cmd.rw);
+ case nvme_cmd_dsm:
+ return nvme_trace_dsm(p, cmd.dsm);
+ case nvme_cmd_write_zeroes:
+ return nvme_trace_write_zeroes(p, cmd.write_zeroes);
+ default:
+ return nvme_trace_common(p, cmd.common);
+ }
+}
diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
new file mode 100644
index 000000000000..a168d806d494
--- /dev/null
+++ b/drivers/nvme/host/trace.h
@@ -0,0 +1,77 @@
+/*
+ * NVM Express device driver tracepoints
+ * Copyright (c) 2018 Johannes Thumshirn, SUSE Linux GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM nvme
+
+#if !defined(_TRACE_NVME_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NVME_H
+
+#include <linux/nvme.h>
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#define nvme_opcode_name(opcode) { opcode, #opcode }
+#define show_opcode_name(val) \
+ __print_symbolic(val, \
+ nvme_opcode_name(nvme_cmd_flush), \
+ nvme_opcode_name(nvme_cmd_write), \
+ nvme_opcode_name(nvme_cmd_read), \
+ nvme_opcode_name(nvme_cmd_write_uncor), \
+ nvme_opcode_name(nvme_cmd_compare), \
+ nvme_opcode_name(nvme_cmd_write_zeroes), \
+ nvme_opcode_name(nvme_cmd_dsm), \
+ nvme_opcode_name(nvme_cmd_resv_register), \
+ nvme_opcode_name(nvme_cmd_resv_report), \
+ nvme_opcode_name(nvme_cmd_resv_acquire), \
+ nvme_opcode_name(nvme_cmd_resv_release))
+
+const char *nvme_trace_parse_cmd(struct trace_seq *p, struct nvme_command cmnd);
+#define __parse_nvme_cmd(cmd) nvme_trace_parse_cmd(p, cmd)
+
+TRACE_EVENT(nvme_setup_cmd,
+ TP_PROTO(struct nvme_command *cmd),
+ TP_ARGS(cmd),
+ TP_STRUCT__entry(
+ __field(__u8, opcode)
+ __field(__u8, flags)
+ __field(__u16, cid)
+ __field(__le32, nsid)
+ __field(__le64, metadata)
+ __field_struct( struct nvme_command, cmnd )
+ ),
+ TP_fast_assign(
+ __entry->opcode = cmd->common.opcode;
+ __entry->flags = cmd->common.flags;
+ __entry->cid = cmd->common.command_id;
+ __entry->nsid = cmd->common.nsid;
+ __entry->metadata = cmd->common.metadata;
+ memcpy(&__entry->cmnd, cmd, sizeof(__entry->cmnd));
+ ),
+ TP_printk("nsid=%u, cmdid=%u, flags=0x%x, meta=0x%llx, cmd=(%s %s)",
+ le32_to_cpu(__entry->nsid), __entry->cid, __entry->flags,
+ le64_to_cpu(__entry->metadata),
+ show_opcode_name(__entry->opcode),
+ __parse_nvme_cmd(__entry->cmnd))
+);
+
+#endif /* _TRACE_NVME_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--
2.12.3

2018-01-18 01:57:08

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] nvme: add tracepoint for nvme_setup_cmd


Johannes,

Very nice!

Reviewed-by: Martin K. Petersen <[email protected]>

--
Martin K. Petersen Oracle Linux Engineering

2018-01-18 01:57:16

by Martin K. Petersen

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] nvme: add tracepoint for nvme_complete_rq


Johannes,

Reviewed-by: Martin K. Petersen <[email protected]>

--
Martin K. Petersen Oracle Linux Engineering

2018-01-18 03:15:59

by Keith Busch

[permalink] [raw]

2018-01-18 03:16:27

by Keith Busch

[permalink] [raw]

2018-01-18 15:22:17

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] nvme: add tracepoint for nvme_setup_cmd

On Wed, Jan 17, 2018 at 11:53:35AM +0100, Johannes Thumshirn wrote:
> +nvme-core-y := trace.o core.o

trace.o should be conditional on CONFIG_TRACEPOINTS.

> +TRACE_EVENT(nvme_setup_cmd,
> + TP_PROTO(struct nvme_command *cmd),
> + TP_ARGS(cmd),
> + TP_STRUCT__entry(
> + __field(__u8, opcode)
> + __field(__u8, flags)
> + __field(__u16, cid)
> + __field(__le32, nsid)
> + __field(__le64, metadata)
> + __field_struct( struct nvme_command, cmnd )

This still copies the whole SQE. I think that is way to much to
copy, especially given that you also copy many of the fields separately
as well.

2018-01-18 16:24:17

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH v3 1/2] nvme: add tracepoint for nvme_setup_cmd

On Thu, Jan 18, 2018 at 04:21:34PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 17, 2018 at 11:53:35AM +0100, Johannes Thumshirn wrote:
> > +nvme-core-y := trace.o core.o
>
> trace.o should be conditional on CONFIG_TRACEPOINTS.

Doesn't do too much of a difference for me if CONFIG_TRACEPOINTS=n but if you
insist I can do it of cause.

> > +TRACE_EVENT(nvme_setup_cmd,
> > + TP_PROTO(struct nvme_command *cmd),
> > + TP_ARGS(cmd),
> > + TP_STRUCT__entry(
> > + __field(__u8, opcode)
> > + __field(__u8, flags)
> > + __field(__u16, cid)
> > + __field(__le32, nsid)
> > + __field(__le64, metadata)
> > + __field_struct( struct nvme_command, cmnd )
>
> This still copies the whole SQE. I think that is way to much to
> copy, especially given that you also copy many of the fields separately
> as well.

it's not that much, but ok.

Will be sending out a v4 tomorrow.

--
Johannes Thumshirn Storage
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N?rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850