2020-11-03 13:01:26

by Siddharth Gupta

Subject: [PATCH v7 0/4] Introduce mini-dump support for remoteproc

Firmware images can be tens of megabytes in size, and reading all of that
memory during a coredump can consume a lot of time and memory.

Introduce support for mini-dumps. A mini-dump contains the smallest amount
of useful information that could help to debug subsystem crashes.

During bootup, memory is allocated in SMEM (shared memory) in the form of a
table that contains the physical addresses and sizes of the regions that
are supposed to be collected during coredump. This memory is shared amongst
all processors in a Qualcomm platform, so all remoteprocs fill in their
entry in the global table once they are out of reset.

This patch series adds support for parsing the global minidump table and
uses the current coredump framework to expose this memory to userspace
during remoteproc's recovery.
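
As an illustration of the table walk described above, the following is a minimal userspace sketch, not the driver code. It assumes the region table has already been copied into a byte buffer and that each entry follows the 40-byte little-endian layout of struct minidump_region from patch 3/4. The names parse_regions(), struct region, get_le32() and get_le64() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REGION_NAME_LEN   16
#define REGION_ENTRY_SIZE 40	/* 16 (name) + 4 + 4 + 8 + 8 bytes */

/* Four-character tag, built the same way as MD_REGION_VALID in the patch. */
#define REGION_VALID \
	((uint32_t)'V' << 24 | (uint32_t)'A' << 16 | \
	 (uint32_t)'L' << 8 | (uint32_t)'I')

/* Hypothetical CPU-endian copy of one table entry. */
struct region {
	char name[REGION_NAME_LEN + 1];	/* always NUL-terminated */
	uint64_t address;
	uint64_t size;
};

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint64_t get_le64(const uint8_t *p)
{
	return (uint64_t)get_le32(p) | (uint64_t)get_le32(p + 4) << 32;
}

/* Decode up to max valid entries from table; return how many were decoded. */
static size_t parse_regions(const uint8_t *table, size_t count,
			    struct region *out, size_t max)
{
	size_t n = 0;

	assert(table && out);

	for (size_t i = 0; i < count && n < max; i++) {
		const uint8_t *entry = table + i * REGION_ENTRY_SIZE;

		/* Offsets: name 0, seq_num 16, valid 20, address 24, size 32. */
		if (get_le32(entry + 20) != REGION_VALID)
			continue;

		memcpy(out[n].name, entry, REGION_NAME_LEN);
		out[n].name[REGION_NAME_LEN] = '\0';
		out[n].address = get_le64(entry + 24);
		out[n].size = get_le64(entry + 32);
		n++;
	}

	return n;
}
```

Decoding every field explicitly keeps the sketch independent of host endianness, which is the same concern the __le32/__le64 annotations in the series address.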

This patch series also integrates the patch:
https://patchwork.kernel.org/patch/11695541/ sent by Siddharth.

Changelog:
v6 -> v7:
- The STR_TAB size is calculated dynamically now instead of a predefined size.
- Added comments to indicate details about the reserved null section header. More
details can be found at https://refspecs.linuxfoundation.org/elf/elf.pdf.

v5 -> v6:
- Removed priv_cleanup operation from rproc_ops. The dump_segments list is
updated and cleaned up each time minidump is invoked.
- Split patch #2 into 2 parts - one that adds the rproc_minidump function, and
the other that uses the new function in the qcom_q6v5_pas driver.
- Updated structs in qcom_minidump to explicitly indicate the endianness of the
data stored in SMEM, also updated member names.
- Read the global table of contents in SMEM each time adsp_minidump is invoked.

v4 -> v5:
- Fixed adsp_add_minidump_segments to read IO memory using appropriate functions.

v3 -> v4:
- Made adsp_priv_cleanup a static function.

v2 -> v3:
- Refactored code to remove dependency on Qualcomm configs.
- Renamed do_rproc_minidump to rproc_minidump and marked as exported
symbol.

v1 -> v2:
- 3 kernel test robot warnings have been resolved.
- Introduced priv_cleanup op in order to make the cleanup of
private elements used by the remoteproc more readable.
- Removed rproc_cleanup_priv as it is no longer needed.
- Switched to if/else format for rproc_alloc in order to keep
the static const declaration of adsp_minidump_ops.

Siddharth Gupta (4):
remoteproc: core: Add ops to enable custom coredump functionality
remoteproc: coredump: Add minidump functionality
remoteproc: qcom: Add capability to collect minidumps
remoteproc: qcom: Add minidump id for sm8150 modem

drivers/remoteproc/qcom_minidump.h | 64 +++++++++++++
drivers/remoteproc/qcom_q6v5_pas.c | 105 ++++++++++++++++++++-
drivers/remoteproc/remoteproc_core.c | 6 +-
drivers/remoteproc/remoteproc_coredump.c | 140 ++++++++++++++++++++++++++++
drivers/remoteproc/remoteproc_elf_helpers.h | 26 ++++++
include/linux/remoteproc.h | 3 +
6 files changed, 341 insertions(+), 3 deletions(-)
create mode 100644 drivers/remoteproc/qcom_minidump.h

--
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


2020-11-03 13:01:28

by Siddharth Gupta

Subject: [PATCH v7 2/4] remoteproc: coredump: Add minidump functionality

This change adds a new kind of core dump mechanism which instead of dumping
entire program segments of the firmware, dumps sections of the remoteproc
memory which are sufficient to allow debugging the firmware. This function
thus uses section headers instead of program headers during creation of the
core dump elf.
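
To make the resulting file layout concrete, here is a minimal sketch of the size and offset arithmetic for a section-based dump: ELF header, then the section header table (null entry, string table entry, one entry per segment), then the string table, then the segment data. It assumes an ELF64 target (64-byte file header, 64-byte section headers) and a fully copied dump. minidump_layout() and struct dump_layout are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ELF64_EHDR_SIZE 64	/* sizeof(Elf64_Ehdr) */
#define ELF64_SHDR_SIZE 64	/* sizeof(Elf64_Shdr) */

/* Hypothetical summary of where each part of the dump lands. */
struct dump_layout {
	size_t shnum;		/* number of section headers */
	size_t strtab_offset;	/* file offset of the string table */
	size_t strtab_size;	/* bytes used by the string table */
	size_t data_offset;	/* file offset of the first segment's data */
	size_t total_size;	/* size of the whole dump */
};

static struct dump_layout minidump_layout(const char *const *names,
					  const size_t *sizes, size_t nsegs)
{
	struct dump_layout l;
	size_t payload = 0;

	assert(nsegs == 0 || (names && sizes));

	/* Null section header plus the string table's own header. */
	l.shnum = 2;
	/* Leading NUL at index 0, then "STR_TBL" and its terminator. */
	l.strtab_size = 1 + strlen("STR_TBL") + 1;

	for (size_t i = 0; i < nsegs; i++) {
		l.shnum++;
		l.strtab_size += strlen(names[i]) + 1;
		payload += sizes[i];
	}

	l.strtab_offset = ELF64_EHDR_SIZE + l.shnum * ELF64_SHDR_SIZE;
	l.data_offset = l.strtab_offset + l.strtab_size;
	l.total_size = l.data_offset + payload;

	return l;
}
```

For two segments named "a" and "bcd" of 16 and 32 bytes, this gives four section headers, a 15-byte string table at offset 320, and a 383-byte dump.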

Signed-off-by: Rishabh Bhatnagar <[email protected]>
Signed-off-by: Siddharth Gupta <[email protected]>
---
drivers/remoteproc/remoteproc_coredump.c | 140 ++++++++++++++++++++++++++++
drivers/remoteproc/remoteproc_elf_helpers.h | 26 ++++++
include/linux/remoteproc.h | 1 +
3 files changed, 167 insertions(+)

diff --git a/drivers/remoteproc/remoteproc_coredump.c b/drivers/remoteproc/remoteproc_coredump.c
index 34530dc..a6c0099 100644
--- a/drivers/remoteproc/remoteproc_coredump.c
+++ b/drivers/remoteproc/remoteproc_coredump.c
@@ -323,3 +323,143 @@ void rproc_coredump(struct rproc *rproc)
*/
wait_for_completion(&dump_state.dump_done);
}
+
+/**
+ * rproc_minidump() - perform minidump
+ * @rproc: rproc handle
+ *
+ * This function will generate an ELF header for the registered sections of
+ * segments and create a devcoredump device associated with rproc. Based on
+ * the coredump configuration this function will directly copy the segments
+ * from device memory to userspace or copy segments from device memory to
+ * a separate buffer, which can then be read by userspace.
+ * The first approach avoids using extra vmalloc memory. But it will stall
+ * recovery flow until dump is read by userspace.
+ */
+void rproc_minidump(struct rproc *rproc)
+{
+ struct rproc_dump_segment *segment;
+ void *shdr;
+ void *ehdr;
+ size_t data_size;
+ size_t strtbl_size = 0;
+ size_t strtbl_index = 1;
+ size_t offset;
+ void *data;
+ u8 class = rproc->elf_class;
+ int shnum;
+ struct rproc_coredump_state dump_state;
+ unsigned int dump_conf = rproc->dump_conf;
+ char *str_tbl = "STR_TBL";
+
+ if (list_empty(&rproc->dump_segments) ||
+ dump_conf == RPROC_COREDUMP_DISABLED)
+ return;
+
+ if (class == ELFCLASSNONE) {
+ dev_err(&rproc->dev, "Elf class is not set\n");
+ return;
+ }
+
+ /*
+ * We allocate two extra section headers. The first one is null.
+ * Second section header is for the string table. Also space is
+ * allocated for string table.
+ */
+ data_size = elf_size_of_hdr(class) + 2 * elf_size_of_shdr(class);
+ shnum = 2;
+
+ /* the extra byte is for the null character at index 0 */
+ strtbl_size += strlen(str_tbl) + 2;
+
+ list_for_each_entry(segment, &rproc->dump_segments, node) {
+ data_size += elf_size_of_shdr(class);
+ strtbl_size += strlen(segment->priv) + 1;
+ if (dump_conf == RPROC_COREDUMP_ENABLED)
+ data_size += segment->size;
+ shnum++;
+ }
+
+ data_size += strtbl_size;
+
+ data = vmalloc(data_size);
+ if (!data)
+ return;
+
+ ehdr = data;
+ memset(ehdr, 0, elf_size_of_hdr(class));
+ /* e_ident field is common for both elf32 and elf64 */
+ elf_hdr_init_ident(ehdr, class);
+
+ elf_hdr_set_e_type(class, ehdr, ET_CORE);
+ elf_hdr_set_e_machine(class, ehdr, rproc->elf_machine);
+ elf_hdr_set_e_version(class, ehdr, EV_CURRENT);
+ elf_hdr_set_e_entry(class, ehdr, rproc->bootaddr);
+ elf_hdr_set_e_shoff(class, ehdr, elf_size_of_hdr(class));
+ elf_hdr_set_e_ehsize(class, ehdr, elf_size_of_hdr(class));
+ elf_hdr_set_e_shentsize(class, ehdr, elf_size_of_shdr(class));
+ elf_hdr_set_e_shnum(class, ehdr, shnum);
+ elf_hdr_set_e_shstrndx(class, ehdr, 1);
+
+ /*
+ * The zeroth index of the section header is reserved and is rarely used.
+ * Set the section header as null (SHN_UNDEF) and move to the next one.
+ */
+ shdr = data + elf_hdr_get_e_shoff(class, ehdr);
+ memset(shdr, 0, elf_size_of_shdr(class));
+ shdr += elf_size_of_shdr(class);
+
+ /* Initialize the string table. */
+ offset = elf_hdr_get_e_shoff(class, ehdr) +
+ elf_size_of_shdr(class) * elf_hdr_get_e_shnum(class, ehdr);
+ memset(data + offset, 0, strtbl_size);
+
+ /* Fill in the string table section header. */
+ memset(shdr, 0, elf_size_of_shdr(class));
+ elf_shdr_set_sh_type(class, shdr, SHT_STRTAB);
+ elf_shdr_set_sh_offset(class, shdr, offset);
+ elf_shdr_set_sh_size(class, shdr, strtbl_size);
+ elf_shdr_set_sh_entsize(class, shdr, 0);
+ elf_shdr_set_sh_flags(class, shdr, 0);
+ elf_shdr_set_sh_name(class, shdr, set_section_name(str_tbl, ehdr, class, &strtbl_index));
+ offset += elf_shdr_get_sh_size(class, shdr);
+ shdr += elf_size_of_shdr(class);
+
+ list_for_each_entry(segment, &rproc->dump_segments, node) {
+ memset(shdr, 0, elf_size_of_shdr(class));
+ elf_shdr_set_sh_type(class, shdr, SHT_PROGBITS);
+ elf_shdr_set_sh_offset(class, shdr, offset);
+ elf_shdr_set_sh_addr(class, shdr, segment->da);
+ elf_shdr_set_sh_size(class, shdr, segment->size);
+ elf_shdr_set_sh_entsize(class, shdr, 0);
+ elf_shdr_set_sh_flags(class, shdr, SHF_WRITE);
+ elf_shdr_set_sh_name(class, shdr,
+ set_section_name(segment->priv, ehdr, class, &strtbl_index));
+
+ /* No need to copy segments for inline dumps */
+ if (dump_conf == RPROC_COREDUMP_ENABLED)
+ rproc_copy_segment(rproc, data + offset, segment, 0,
+ segment->size);
+ offset += elf_shdr_get_sh_size(class, shdr);
+ shdr += elf_size_of_shdr(class);
+ }
+
+ if (dump_conf == RPROC_COREDUMP_ENABLED) {
+ dev_coredumpv(&rproc->dev, data, data_size, GFP_KERNEL);
+ return;
+ }
+
+ /* Initialize the dump state struct to be used by rproc_coredump_read */
+ dump_state.rproc = rproc;
+ dump_state.header = data;
+ init_completion(&dump_state.dump_done);
+
+ dev_coredumpm(&rproc->dev, NULL, &dump_state, data_size, GFP_KERNEL,
+ rproc_coredump_read, rproc_coredump_free);
+
+ /* Wait until the dump is read and free is called. Data is freed
+ * by devcoredump framework automatically after 5 minutes.
+ */
+ wait_for_completion(&dump_state.dump_done);
+}
+EXPORT_SYMBOL(rproc_minidump);
diff --git a/drivers/remoteproc/remoteproc_elf_helpers.h b/drivers/remoteproc/remoteproc_elf_helpers.h
index 4b6be7b..fa669ad 100644
--- a/drivers/remoteproc/remoteproc_elf_helpers.h
+++ b/drivers/remoteproc/remoteproc_elf_helpers.h
@@ -65,6 +65,7 @@ ELF_GEN_FIELD_GET_SET(hdr, e_type, u16)
ELF_GEN_FIELD_GET_SET(hdr, e_version, u32)
ELF_GEN_FIELD_GET_SET(hdr, e_ehsize, u32)
ELF_GEN_FIELD_GET_SET(hdr, e_phentsize, u16)
+ELF_GEN_FIELD_GET_SET(hdr, e_shentsize, u16)

ELF_GEN_FIELD_GET_SET(phdr, p_paddr, u64)
ELF_GEN_FIELD_GET_SET(phdr, p_vaddr, u64)
@@ -75,6 +76,9 @@ ELF_GEN_FIELD_GET_SET(phdr, p_offset, u64)
ELF_GEN_FIELD_GET_SET(phdr, p_flags, u32)
ELF_GEN_FIELD_GET_SET(phdr, p_align, u64)

+ELF_GEN_FIELD_GET_SET(shdr, sh_type, u32)
+ELF_GEN_FIELD_GET_SET(shdr, sh_flags, u32)
+ELF_GEN_FIELD_GET_SET(shdr, sh_entsize, u16)
ELF_GEN_FIELD_GET_SET(shdr, sh_size, u64)
ELF_GEN_FIELD_GET_SET(shdr, sh_offset, u64)
ELF_GEN_FIELD_GET_SET(shdr, sh_name, u32)
@@ -93,4 +97,26 @@ ELF_STRUCT_SIZE(shdr)
ELF_STRUCT_SIZE(phdr)
ELF_STRUCT_SIZE(hdr)

+static inline unsigned int set_section_name(const char *name, void *ehdr, u8 class, size_t *index)
+{
+ u16 shstrndx = elf_hdr_get_e_shstrndx(class, ehdr);
+ void *shdr;
+ char *strtab;
+ size_t idx, ret;
+
+ shdr = ehdr + elf_size_of_hdr(class) + shstrndx * elf_size_of_shdr(class);
+ strtab = ehdr + elf_shdr_get_sh_offset(class, shdr);
+ idx = index ? *index : 0;
+ if (!strtab || !name)
+ return 0;
+
+ ret = idx;
+ strcpy((strtab + idx), name);
+ idx += strlen(name) + 1;
+ if (index)
+ *index = idx;
+
+ return ret;
+}
+
#endif /* REMOTEPROC_ELF_LOADER_H */
diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
index a419878..844021e 100644
--- a/include/linux/remoteproc.h
+++ b/include/linux/remoteproc.h
@@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
int rproc_boot(struct rproc *rproc);
void rproc_shutdown(struct rproc *rproc);
void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
+void rproc_minidump(struct rproc *rproc);
int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
int rproc_coredump_add_custom_segment(struct rproc *rproc,
dma_addr_t da, size_t size,
--
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2020-11-03 14:02:13

by Siddharth Gupta

Subject: [PATCH v7 3/4] remoteproc: qcom: Add capability to collect minidumps

This patch adds support for collecting minidump in the event of remoteproc
crash. Parse the minidump table based on remoteproc's unique minidump-id,
read all memory regions from the remoteproc's minidump table entry and
expose the memory to userspace. The remoteproc platform driver can choose
to collect a full/mini dump by specifying the coredump op.
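
The readiness check that gates the collection can be sketched in isolation as follows. This is a minimal userspace illustration that assumes the subsystem entry has already been converted to CPU endianness. struct subsystem_entry, subsystem_ready() and the TAG() macro are hypothetical names for this example, and the tag values mirror MD_SS_ENABLED and MD_SS_ENCR_DONE from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a 32-bit tag, most significant byte first. */
#define TAG(a, b, c, d) \
	((uint32_t)(a) << 24 | (uint32_t)(b) << 16 | \
	 (uint32_t)(c) << 8 | (uint32_t)(d))

#define SS_ENABLED   TAG('E', 'N', 'B', 'L')
#define SS_ENCR_DONE TAG('D', 'O', 'N', 'E')

/* Hypothetical CPU-endian copy of a subsystem table-of-contents entry. */
struct subsystem_entry {
	uint32_t status;
	uint32_t enabled;
	uint32_t encryption_status;
	uint32_t region_count;
	uint64_t regions_base;
};

/*
 * A subsystem is ready for collection only if its region table exists,
 * its entry is initialized, minidump is enabled and encryption is done.
 */
static bool subsystem_ready(const struct subsystem_entry *ss)
{
	assert(ss != NULL);

	return ss->regions_base != 0 &&
	       ss->status == 1 &&
	       ss->enabled == SS_ENABLED &&
	       ss->encryption_status == SS_ENCR_DONE;
}
```

Keeping the check in one predicate makes it easy to see that a missing region table, an uninitialized entry, a disabled subsystem, or pending encryption each cause the dump to be skipped.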

Co-developed-by: Rishabh Bhatnagar <[email protected]>
Signed-off-by: Rishabh Bhatnagar <[email protected]>
Co-developed-by: Gurbir Arora <[email protected]>
Signed-off-by: Gurbir Arora <[email protected]>
Signed-off-by: Siddharth Gupta <[email protected]>
---
drivers/remoteproc/qcom_minidump.h | 64 +++++++++++++++++++++++
drivers/remoteproc/qcom_q6v5_pas.c | 104 ++++++++++++++++++++++++++++++++++++-
2 files changed, 166 insertions(+), 2 deletions(-)
create mode 100644 drivers/remoteproc/qcom_minidump.h

diff --git a/drivers/remoteproc/qcom_minidump.h b/drivers/remoteproc/qcom_minidump.h
new file mode 100644
index 0000000..5857d06
--- /dev/null
+++ b/drivers/remoteproc/qcom_minidump.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2020, The Linux Foundation. All rights reserved.
+ */
+
+#ifndef __QCOM_MINIDUMP_H
+#define __QCOM_MINIDUMP_H
+
+#define MAX_NUM_OF_SS 10
+#define MAX_REGION_NAME_LENGTH 16
+#define SBL_MINIDUMP_SMEM_ID 602
+#define MD_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MD_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MD_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
+
+/**
+ * struct minidump_region - Minidump region
+ * @name : Name of the region to be dumped
+ * @seq_num: : Use to differentiate regions with same name.
+ * @valid : This entry to be dumped (if set to 1)
+ * @address : Physical address of region to be dumped
+ * @size : Size of the region
+ */
+struct minidump_region {
+ char name[MAX_REGION_NAME_LENGTH];
+ __le32 seq_num;
+ __le32 valid;
+ __le64 address;
+ __le64 size;
+};
+
+/**
+ * struct minidump_subsystem_toc: Subsystem's SMEM Table of content
+ * @status : Subsystem toc init status
+ * @enabled : if set to 1, this region would be copied during coredump
+ * @encryption_status: Encryption status for this subsystem
+ * @encryption_required : Decides to encrypt the subsystem regions or not
+ * @ss_region_count : Number of regions added in this subsystem toc
+ * @md_ss_smem_regions_baseptr : regions base pointer of the subsystem
+ */
+struct minidump_subsystem_toc {
+ __le32 status;
+ __le32 enabled;
+ __le32 encryption_status;
+ __le32 encryption_required;
+ __le32 ss_region_count;
+ __le64 md_ss_smem_regions_baseptr;
+};
+
+/**
+ * struct minidump_global_toc: Global Table of Content
+ * @md_toc_init : Global Minidump init status
+ * @md_revision : Minidump revision
+ * @md_enable_status : Minidump enable status
+ * @md_ss_toc : Array of subsystems toc
+ */
+struct minidump_global_toc {
+ __le32 status;
+ __le32 md_revision;
+ __le32 enabled;
+ struct minidump_subsystem_toc md_ss_toc[MAX_NUM_OF_SS];
+};
+
+#endif
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
index 3837f23..349f725 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -28,11 +28,13 @@
#include "qcom_pil_info.h"
#include "qcom_q6v5.h"
#include "remoteproc_internal.h"
+#include "qcom_minidump.h"

struct adsp_data {
int crash_reason_smem;
const char *firmware_name;
int pas_id;
+ unsigned int minidump_id;
bool has_aggre2_clk;
bool auto_boot;

@@ -63,6 +65,7 @@ struct qcom_adsp {
int proxy_pd_count;

int pas_id;
+ unsigned int minidump_id;
int crash_reason_smem;
bool has_aggre2_clk;
const char *info_name;
@@ -116,6 +119,88 @@ static void adsp_pds_disable(struct qcom_adsp *adsp, struct device **pds,
}
}

+static void adsp_minidump_cleanup(struct rproc *rproc)
+{
+ struct rproc_dump_segment *entry, *tmp;
+
+ list_for_each_entry_safe(entry, tmp, &rproc->dump_segments, node) {
+ list_del(&entry->node);
+ kfree(entry->priv);
+ kfree(entry);
+ }
+}
+
+static void adsp_add_minidump_segments(struct rproc *rproc,
+ struct minidump_subsystem_toc *minidump_ss)
+{
+ struct minidump_region __iomem *ptr;
+ struct minidump_region region;
+ int seg_cnt, i;
+ dma_addr_t da;
+ size_t size;
+ char *name;
+
+ if (!list_empty(&rproc->dump_segments)) {
+ dev_err(&rproc->dev, "dump segment list already populated\n");
+ return;
+ }
+
+ seg_cnt = le32_to_cpu(minidump_ss->ss_region_count);
+ ptr = ioremap((unsigned long)le64_to_cpu(minidump_ss->md_ss_smem_regions_baseptr),
+ seg_cnt * sizeof(struct minidump_region));
+
+ if (!ptr)
+ return;
+
+ for (i = 0; i < seg_cnt; i++) {
+ memcpy_fromio(&region, ptr + i, sizeof(region));
+ if (region.valid == MD_REGION_VALID) {
+ name = kmalloc(MAX_REGION_NAME_LENGTH, GFP_KERNEL);
+ strlcpy(name, region.name, MAX_REGION_NAME_LENGTH);
+ da = le64_to_cpu(region.address);
+ size = le32_to_cpu(region.size);
+ rproc_coredump_add_custom_segment(rproc, da, size, NULL, name);
+ }
+ }
+
+ iounmap(ptr);
+}
+
+static void adsp_dump(struct rproc *rproc)
+{
+ struct qcom_adsp *adsp = rproc->priv;
+ struct minidump_subsystem_toc *minidump_ss;
+ struct minidump_global_toc *minidump_toc;
+
+ /* Get Global minidump ToC*/
+ minidump_toc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
+
+ /* check if global table pointer exists and init is set */
+ if (IS_ERR(minidump_toc) || !minidump_toc->status) {
+ dev_err(&rproc->dev, "SMEM is not initialized.\n");
+ return;
+ }
+
+ /* Get subsystem table of contents using the minidump id */
+ minidump_ss = &minidump_toc->md_ss_toc[adsp->minidump_id];
+
+ /**
+ * Collect minidump if SS ToC is valid and segment table
+ * is initialized in memory and encryption status is set.
+ */
+ if (minidump_ss->md_ss_smem_regions_baseptr == 0 ||
+ le32_to_cpu(minidump_ss->status) != 1 ||
+ le32_to_cpu(minidump_ss->enabled) != MD_SS_ENABLED ||
+ le32_to_cpu(minidump_ss->encryption_status) != MD_SS_ENCR_DONE) {
+ dev_err(&rproc->dev, "Minidump not ready!! Aborting\n");
+ return;
+ }
+
+ adsp_add_minidump_segments(rproc, minidump_ss);
+ rproc_minidump(rproc);
+ adsp_minidump_cleanup(rproc);
+}
+
static int adsp_load(struct rproc *rproc, const struct firmware *fw)
{
struct qcom_adsp *adsp = (struct qcom_adsp *)rproc->priv;
@@ -258,6 +343,15 @@ static const struct rproc_ops adsp_ops = {
.panic = adsp_panic,
};

+static const struct rproc_ops adsp_minidump_ops = {
+ .start = adsp_start,
+ .stop = adsp_stop,
+ .da_to_va = adsp_da_to_va,
+ .load = adsp_load,
+ .panic = adsp_panic,
+ .coredump = adsp_dump,
+};
+
static int adsp_init_clock(struct qcom_adsp *adsp)
{
int ret;
@@ -398,8 +492,13 @@ static int adsp_probe(struct platform_device *pdev)
if (ret < 0 && ret != -EINVAL)
return ret;

- rproc = rproc_alloc(&pdev->dev, pdev->name, &adsp_ops,
- fw_name, sizeof(*adsp));
+ if (desc->minidump_id)
+ rproc = rproc_alloc(&pdev->dev, pdev->name, &adsp_minidump_ops, fw_name,
+ sizeof(*adsp));
+ else
+ rproc = rproc_alloc(&pdev->dev, pdev->name, &adsp_ops, fw_name,
+ sizeof(*adsp));
+
if (!rproc) {
dev_err(&pdev->dev, "unable to allocate remoteproc\n");
return -ENOMEM;
@@ -411,6 +510,7 @@ static int adsp_probe(struct platform_device *pdev)
adsp = (struct qcom_adsp *)rproc->priv;
adsp->dev = &pdev->dev;
adsp->rproc = rproc;
+ adsp->minidump_id = desc->minidump_id;
adsp->pas_id = desc->pas_id;
adsp->has_aggre2_clk = desc->has_aggre2_clk;
adsp->info_name = desc->sysmon_name;
--
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project

2020-11-18 15:54:21

by Bjorn Andersson

Subject: Re: [PATCH v7 2/4] remoteproc: coredump: Add minidump functionality

On Tue 03 Nov 03:19 CST 2020, Siddharth Gupta wrote:

> This change adds a new kind of core dump mechanism which instead of dumping
> entire program segments of the firmware, dumps sections of the remoteproc
> memory which are sufficient to allow debugging the firmware. This function
> thus uses section headers instead of program headers during creation of the
> core dump elf.
>
> Signed-off-by: Rishabh Bhatnagar <[email protected]>

Co-developed-by: Rishabh

> Signed-off-by: Siddharth Gupta <[email protected]>
> ---
> drivers/remoteproc/remoteproc_coredump.c | 140 ++++++++++++++++++++++++++++
> drivers/remoteproc/remoteproc_elf_helpers.h | 26 ++++++
> include/linux/remoteproc.h | 1 +
> 3 files changed, 167 insertions(+)
>
> diff --git a/drivers/remoteproc/remoteproc_coredump.c b/drivers/remoteproc/remoteproc_coredump.c
> index 34530dc..a6c0099 100644
> --- a/drivers/remoteproc/remoteproc_coredump.c
> +++ b/drivers/remoteproc/remoteproc_coredump.c
> @@ -323,3 +323,143 @@ void rproc_coredump(struct rproc *rproc)
> */
> wait_for_completion(&dump_state.dump_done);
> }
> +
> +/**
> + * rproc_minidump() - perform minidump
> + * @rproc: rproc handle
> + *
> + * This function will generate an ELF header for the registered sections of
> + * segments and create a devcoredump device associated with rproc. Based on
> + * the coredump configuration this function will directly copy the segments
> + * from device memory to userspace or copy segments from device memory to
> + * a separate buffer, which can then be read by userspace.
> + * The first approach avoids using extra vmalloc memory. But it will stall
> + * recovery flow until dump is read by userspace.
> + */
> +void rproc_minidump(struct rproc *rproc)

Implementation wise I think this looks good now!

But the name "minidump" isn't descriptive - nor is the "perform
minidump" description. I think you should name this
rproc_coredump_using_sections().

> +{
> + struct rproc_dump_segment *segment;
> + void *shdr;
> + void *ehdr;
> + size_t data_size;
> + size_t strtbl_size = 0;
> + size_t strtbl_index = 1;
> + size_t offset;
> + void *data;
> + u8 class = rproc->elf_class;
> + int shnum;
> + struct rproc_coredump_state dump_state;
> + unsigned int dump_conf = rproc->dump_conf;
> + char *str_tbl = "STR_TBL";
> +
> + if (list_empty(&rproc->dump_segments) ||
> + dump_conf == RPROC_COREDUMP_DISABLED)
> + return;
> +
> + if (class == ELFCLASSNONE) {
> + dev_err(&rproc->dev, "Elf class is not set\n");
> + return;
> + }
> +
> + /*
> + * We allocate two extra section headers. The first one is null.
> + * Second section header is for the string table. Also space is
> + * allocated for string table.
> + */
> + data_size = elf_size_of_hdr(class) + 2 * elf_size_of_shdr(class);
> + shnum = 2;
> +
> + /* the extra byte is for the null character at index 0 */
> + strtbl_size += strlen(str_tbl) + 2;
> +
> + list_for_each_entry(segment, &rproc->dump_segments, node) {
> + data_size += elf_size_of_shdr(class);
> + strtbl_size += strlen(segment->priv) + 1;
> + if (dump_conf == RPROC_COREDUMP_ENABLED)
> + data_size += segment->size;
> + shnum++;
> + }
> +
> + data_size += strtbl_size;
> +
> + data = vmalloc(data_size);
> + if (!data)
> + return;
> +
> + ehdr = data;
> + memset(ehdr, 0, elf_size_of_hdr(class));
> + /* e_ident field is common for both elf32 and elf64 */
> + elf_hdr_init_ident(ehdr, class);
> +
> + elf_hdr_set_e_type(class, ehdr, ET_CORE);
> + elf_hdr_set_e_machine(class, ehdr, rproc->elf_machine);
> + elf_hdr_set_e_version(class, ehdr, EV_CURRENT);
> + elf_hdr_set_e_entry(class, ehdr, rproc->bootaddr);
> + elf_hdr_set_e_shoff(class, ehdr, elf_size_of_hdr(class));
> + elf_hdr_set_e_ehsize(class, ehdr, elf_size_of_hdr(class));
> + elf_hdr_set_e_shentsize(class, ehdr, elf_size_of_shdr(class));
> + elf_hdr_set_e_shnum(class, ehdr, shnum);
> + elf_hdr_set_e_shstrndx(class, ehdr, 1);
> +
> + /*
> + * The zeroth index of the section header is reserved and is rarely used.
> + * Set the section header as null (SHN_UNDEF) and move to the next one.
> + */
> + shdr = data + elf_hdr_get_e_shoff(class, ehdr);
> + memset(shdr, 0, elf_size_of_shdr(class));
> + shdr += elf_size_of_shdr(class);
> +
> + /* Initialize the string table. */
> + offset = elf_hdr_get_e_shoff(class, ehdr) +
> + elf_size_of_shdr(class) * elf_hdr_get_e_shnum(class, ehdr);
> + memset(data + offset, 0, strtbl_size);
> +
> + /* Fill in the string table section header. */
> + memset(shdr, 0, elf_size_of_shdr(class));
> + elf_shdr_set_sh_type(class, shdr, SHT_STRTAB);
> + elf_shdr_set_sh_offset(class, shdr, offset);
> + elf_shdr_set_sh_size(class, shdr, strtbl_size);
> + elf_shdr_set_sh_entsize(class, shdr, 0);
> + elf_shdr_set_sh_flags(class, shdr, 0);
> + elf_shdr_set_sh_name(class, shdr, set_section_name(str_tbl, ehdr, class, &strtbl_index));
> + offset += elf_shdr_get_sh_size(class, shdr);
> + shdr += elf_size_of_shdr(class);
> +
> + list_for_each_entry(segment, &rproc->dump_segments, node) {
> + memset(shdr, 0, elf_size_of_shdr(class));
> + elf_shdr_set_sh_type(class, shdr, SHT_PROGBITS);
> + elf_shdr_set_sh_offset(class, shdr, offset);
> + elf_shdr_set_sh_addr(class, shdr, segment->da);
> + elf_shdr_set_sh_size(class, shdr, segment->size);
> + elf_shdr_set_sh_entsize(class, shdr, 0);
> + elf_shdr_set_sh_flags(class, shdr, SHF_WRITE);
> + elf_shdr_set_sh_name(class, shdr,
> + set_section_name(segment->priv, ehdr, class, &strtbl_index));
> +
> + /* No need to copy segments for inline dumps */
> + if (dump_conf == RPROC_COREDUMP_ENABLED)
> + rproc_copy_segment(rproc, data + offset, segment, 0,
> + segment->size);
> + offset += elf_shdr_get_sh_size(class, shdr);
> + shdr += elf_size_of_shdr(class);
> + }
> +
> + if (dump_conf == RPROC_COREDUMP_ENABLED) {
> + dev_coredumpv(&rproc->dev, data, data_size, GFP_KERNEL);
> + return;
> + }
> +
> + /* Initialize the dump state struct to be used by rproc_coredump_read */
> + dump_state.rproc = rproc;
> + dump_state.header = data;
> + init_completion(&dump_state.dump_done);
> +
> + dev_coredumpm(&rproc->dev, NULL, &dump_state, data_size, GFP_KERNEL,
> + rproc_coredump_read, rproc_coredump_free);
> +
> + /* Wait until the dump is read and free is called. Data is freed
> + * by devcoredump framework automatically after 5 minutes.
> + */
> + wait_for_completion(&dump_state.dump_done);
> +}
> +EXPORT_SYMBOL(rproc_minidump);
> diff --git a/drivers/remoteproc/remoteproc_elf_helpers.h b/drivers/remoteproc/remoteproc_elf_helpers.h
> index 4b6be7b..fa669ad 100644
> --- a/drivers/remoteproc/remoteproc_elf_helpers.h
> +++ b/drivers/remoteproc/remoteproc_elf_helpers.h
> @@ -65,6 +65,7 @@ ELF_GEN_FIELD_GET_SET(hdr, e_type, u16)
> ELF_GEN_FIELD_GET_SET(hdr, e_version, u32)
> ELF_GEN_FIELD_GET_SET(hdr, e_ehsize, u32)
> ELF_GEN_FIELD_GET_SET(hdr, e_phentsize, u16)
> +ELF_GEN_FIELD_GET_SET(hdr, e_shentsize, u16)
>
> ELF_GEN_FIELD_GET_SET(phdr, p_paddr, u64)
> ELF_GEN_FIELD_GET_SET(phdr, p_vaddr, u64)
> @@ -75,6 +76,9 @@ ELF_GEN_FIELD_GET_SET(phdr, p_offset, u64)
> ELF_GEN_FIELD_GET_SET(phdr, p_flags, u32)
> ELF_GEN_FIELD_GET_SET(phdr, p_align, u64)
>
> +ELF_GEN_FIELD_GET_SET(shdr, sh_type, u32)
> +ELF_GEN_FIELD_GET_SET(shdr, sh_flags, u32)
> +ELF_GEN_FIELD_GET_SET(shdr, sh_entsize, u16)
> ELF_GEN_FIELD_GET_SET(shdr, sh_size, u64)
> ELF_GEN_FIELD_GET_SET(shdr, sh_offset, u64)
> ELF_GEN_FIELD_GET_SET(shdr, sh_name, u32)
> @@ -93,4 +97,26 @@ ELF_STRUCT_SIZE(shdr)
> ELF_STRUCT_SIZE(phdr)
> ELF_STRUCT_SIZE(hdr)
>
> +static inline unsigned int set_section_name(const char *name, void *ehdr, u8 class, size_t *index)

I think set_section_name() is a rather generic name for a function
living in a header file. So I think you should prefix this with "elf_".

Also, doesn't this function just "add a string to a string table",
rather than "set a section name"? Is it elf_strtbl_add()?
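
For illustration only, a string-table append helper along these lines might look like the sketch below. It is a standalone userspace version that takes the table buffer directly instead of locating it through the ELF header, and it adds a bounds check that the posted helper does not have. The name strtbl_add() and its signature are assumptions made for this example, not the API that was eventually merged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append name to the string table and advance *index past it.
 * Returns the offset of the stored name, or 0 (the empty string at
 * index 0) if an argument is missing or the name does not fit.
 */
static size_t strtbl_add(char *strtab, size_t strtab_size, size_t *index,
			 const char *name)
{
	size_t len, off;

	if (!strtab || !index || !name)
		return 0;

	len = strlen(name) + 1;		/* include the terminating NUL */
	off = *index;

	if (off >= strtab_size || len > strtab_size - off)
		return 0;

	memcpy(strtab + off, name, len);
	*index = off + len;
	assert(*index <= strtab_size);

	return off;
}
```

Returning 0 on failure maps the section to the empty name at index 0 of the table, so a caller that ignores the error still produces a well-formed section header.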

Regards,
Bjorn

> +{
> + u16 shstrndx = elf_hdr_get_e_shstrndx(class, ehdr);
> + void *shdr;
> + char *strtab;
> + size_t idx, ret;
> +
> + shdr = ehdr + elf_size_of_hdr(class) + shstrndx * elf_size_of_shdr(class);
> + strtab = ehdr + elf_shdr_get_sh_offset(class, shdr);
> + idx = index ? *index : 0;
> + if (!strtab || !name)
> + return 0;
> +
> + ret = idx;
> + strcpy((strtab + idx), name);
> + idx += strlen(name) + 1;
> + if (index)
> + *index = idx;
> +
> + return ret;
> +}
> +
> #endif /* REMOTEPROC_ELF_LOADER_H */
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index a419878..844021e 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> int rproc_boot(struct rproc *rproc);
> void rproc_shutdown(struct rproc *rproc);
> void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> +void rproc_minidump(struct rproc *rproc);
> int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> int rproc_coredump_add_custom_segment(struct rproc *rproc,
> dma_addr_t da, size_t size,
> --
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

2020-11-18 16:53:38

by Bjorn Andersson

Subject: Re: [PATCH v7 3/4] remoteproc: qcom: Add capability to collect minidumps

On Tue 03 Nov 03:19 CST 2020, Siddharth Gupta wrote:

> This patch adds support for collecting minidump in the event of remoteproc
> crash. Parse the minidump table based on remoteproc's unique minidump-id,
> read all memory regions from the remoteproc's minidump table entry and
> expose the memory to userspace. The remoteproc platform driver can choose
> to collect a full/mini dump by specifying the coredump op.
>
> Co-developed-by: Rishabh Bhatnagar <[email protected]>
> Signed-off-by: Rishabh Bhatnagar <[email protected]>
> Co-developed-by: Gurbir Arora <[email protected]>
> Signed-off-by: Gurbir Arora <[email protected]>
> Signed-off-by: Siddharth Gupta <[email protected]>
> ---
> drivers/remoteproc/qcom_minidump.h | 64 +++++++++++++++++++++++
> drivers/remoteproc/qcom_q6v5_pas.c | 104 ++++++++++++++++++++++++++++++++++++-
> 2 files changed, 166 insertions(+), 2 deletions(-)
> create mode 100644 drivers/remoteproc/qcom_minidump.h
>
> diff --git a/drivers/remoteproc/qcom_minidump.h b/drivers/remoteproc/qcom_minidump.h

This only needs to live in a header file if it's going to be accessed
from more than one C file.

> new file mode 100644
> index 0000000..5857d06
> --- /dev/null
> +++ b/drivers/remoteproc/qcom_minidump.h
> @@ -0,0 +1,64 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) 2020, The Linux Foundation. All rights reserved.
> + */
> +
> +#ifndef __QCOM_MINIDUMP_H
> +#define __QCOM_MINIDUMP_H
> +
> +#define MAX_NUM_OF_SS 10
> +#define MAX_REGION_NAME_LENGTH 16
> +#define SBL_MINIDUMP_SMEM_ID 602
> +#define MD_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
> +#define MD_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
> +#define MD_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
> +
> +/**
> + * struct minidump_region - Minidump region
> + * @name : Name of the region to be dumped
> + * @seq_num: : Use to differentiate regions with same name.
> + * @valid : This entry to be dumped (if set to 1)
> + * @address : Physical address of region to be dumped
> + * @size : Size of the region
> + */
> +struct minidump_region {
> + char name[MAX_REGION_NAME_LENGTH];
> + __le32 seq_num;
> + __le32 valid;
> + __le64 address;
> + __le64 size;
> +};
> +
> +/**
> + * struct minidump_subsystem_toc: Subsystem's SMEM Table of content
> + * @status : Subsystem toc init status
> + * @enabled : if set to 1, this region would be copied during coredump
> + * @encryption_status: Encryption status for this subsystem
> + * @encryption_required : Decides to encrypt the subsystem regions or not
> + * @ss_region_count : Number of regions added in this subsystem toc
> + * @md_ss_smem_regions_baseptr : regions base pointer of the subsystem
> + */
> +struct minidump_subsystem_toc {
> + __le32 status;
> + __le32 enabled;
> + __le32 encryption_status;
> + __le32 encryption_required;
> + __le32 ss_region_count;

Please drop the "ss_" prefix.

> + __le64 md_ss_smem_regions_baseptr;

Please drop the "md_ss_smem_" prefix.

> +};
> +
> +/**
> + * struct minidump_global_toc: Global Table of Content
> + * @md_toc_init : Global Minidump init status
> + * @md_revision : Minidump revision
> + * @md_enable_status : Minidump enable status
> + * @md_ss_toc : Array of subsystems toc
> + */
> +struct minidump_global_toc {
> + __le32 status;
> + __le32 md_revision;
> + __le32 enabled;
> + struct minidump_subsystem_toc md_ss_toc[MAX_NUM_OF_SS];

How about "subsystems" and how about dropping the "_toc" suffix on the
type?
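
For illustration, with the renames suggested above applied, the two
structs might read roughly as below. This is a userspace sketch only:
uint32_t/uint64_t stand in for __le32/__le64, the value 10 for
MAX_NUM_OF_SS is an assumption (it is defined outside this hunk), and
the names region_count, regions_baseptr, subsystems and
struct minidump_subsystem are just one reading of the suggestions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NUM_OF_SS 10 /* assumed value; defined outside this hunk */

/* uint32_t/uint64_t stand in for __le32/__le64 in this userspace sketch. */
struct minidump_subsystem {		/* was struct minidump_subsystem_toc */
	uint32_t status;
	uint32_t enabled;
	uint32_t encryption_status;
	uint32_t encryption_required;
	uint32_t region_count;		/* was ss_region_count */
	uint64_t regions_baseptr;	/* was md_ss_smem_regions_baseptr */
};

struct minidump_global_toc {
	uint32_t status;
	uint32_t md_revision;
	uint32_t enabled;
	struct minidump_subsystem subsystems[MAX_NUM_OF_SS]; /* was md_ss_toc */
};

/* Renaming must not move any field: five 32-bit words precede region_count's successor. */
static_assert(offsetof(struct minidump_subsystem, region_count) == 16,
	      "region_count must stay the fifth 32-bit field");
```

Only identifiers change here; the field order and widths are the same as
in the quoted patch.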

> +};
> +
> +#endif
> diff --git a/drivers/remoteproc/qcom_q6v5_pas.c b/drivers/remoteproc/qcom_q6v5_pas.c
> index 3837f23..349f725 100644
> --- a/drivers/remoteproc/qcom_q6v5_pas.c
> +++ b/drivers/remoteproc/qcom_q6v5_pas.c
> @@ -28,11 +28,13 @@
> #include "qcom_pil_info.h"
> #include "qcom_q6v5.h"
> #include "remoteproc_internal.h"
> +#include "qcom_minidump.h"
>
> struct adsp_data {
> int crash_reason_smem;
> const char *firmware_name;
> int pas_id;
> + unsigned int minidump_id;
> bool has_aggre2_clk;
> bool auto_boot;
>
> @@ -63,6 +65,7 @@ struct qcom_adsp {
> int proxy_pd_count;
>
> int pas_id;
> + unsigned int minidump_id;
> int crash_reason_smem;
> bool has_aggre2_clk;
> const char *info_name;
> @@ -116,6 +119,88 @@ static void adsp_pds_disable(struct qcom_adsp *adsp, struct device **pds,
> }
> }
>
> +static void adsp_minidump_cleanup(struct rproc *rproc)
> +{
> + struct rproc_dump_segment *entry, *tmp;
> +
> + list_for_each_entry_safe(entry, tmp, &rproc->dump_segments, node) {
> + list_del(&entry->node);
> + kfree(entry->priv);
> + kfree(entry);
> + }
> +}
> +
> +static void adsp_add_minidump_segments(struct rproc *rproc,
> + struct minidump_subsystem_toc *minidump_ss)
> +{
> + struct minidump_region __iomem *ptr;
> + struct minidump_region region;
> + int seg_cnt, i;
> + dma_addr_t da;
> + size_t size;
> + char *name;
> +
> + if (!list_empty(&rproc->dump_segments)) {

if (WARN_ON(!list_empty(&rproc->dump_segments)))

Because this would only happen if we have a bug somewhere that leaves
items lingering on the dump_segments list.

> + dev_err(&rproc->dev, "dump segment list already populated\n");
> + return;
> + }
> +
> + seg_cnt = le32_to_cpu(minidump_ss->ss_region_count);
> + ptr = ioremap((unsigned long)le64_to_cpu(minidump_ss->md_ss_smem_regions_baseptr),
> + seg_cnt * sizeof(struct minidump_region));
> +
> + if (!ptr)
> + return;
> +
> + for (i = 0; i < seg_cnt; i++) {
> + memcpy_fromio(&region, ptr + i, sizeof(region));
> + if (region.valid == MD_REGION_VALID) {
> + name = kmalloc(MAX_REGION_NAME_LENGTH, GFP_KERNEL);
> + strlcpy(name, region.name, MAX_REGION_NAME_LENGTH);

Please use kstrdup() and don't forget to check for (and handle)
allocation failures.
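
A rough userspace model of the copy with the failure path handled is
below. It assumes the 16-byte name field is not guaranteed to be
NUL-terminated, so it bounds the copy the way kstrndup() would in the
kernel; dup_region_name() is a hypothetical helper name, and malloc()
stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_REGION_NAME_LENGTH 16

/*
 * Duplicate a fixed-size name field into a NUL-terminated heap string.
 * Returns NULL on allocation failure; the caller has to handle that,
 * for example by skipping the region or aborting the minidump.
 */
static char *dup_region_name(const char *field)
{
	const char *nul;
	size_t len;
	char *copy;

	assert(field != NULL);

	/* Stop at the first NUL, or at the end of the field if there is none. */
	nul = memchr(field, '\0', MAX_REGION_NAME_LENGTH);
	len = nul ? (size_t)(nul - field) : MAX_REGION_NAME_LENGTH;

	copy = malloc(len + 1);
	if (!copy)
		return NULL;

	memcpy(copy, field, len);
	copy[len] = '\0';
	return copy;
}
```

In the loop, a NULL return would then be checked before the name is
handed to rproc_coredump_add_custom_segment().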

> + da = le64_to_cpu(region.address);
> + size = le64_to_cpu(region.size);
> + rproc_coredump_add_custom_segment(rproc, da, size, NULL, name);
> + }
> + }
> +
> + iounmap(ptr);
> +}
> +
> +static void adsp_dump(struct rproc *rproc)

Here I think it makes sense to spell out adsp_minidump()

That said, the only thing I see specific to this driver here is the
use of adsp->minidump_id, so how about moving all this to qcom_common.c
and just calling qcom_minidump(rproc, adsp->minidump_id); from here?

That way we can easily integrate it in the other remoteprocs as needed
later.
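
The resulting split could look something like the sketch below. It is a
userspace model, not kernel code: struct rproc and struct qcom_adsp are
cut down to the one member each that matters here, the body of
qcom_minidump() is a placeholder, and last_minidump_id exists only so
the call can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins so the sketch compiles outside the kernel. */
struct rproc { void *priv; };
struct qcom_adsp { unsigned int minidump_id; };

static unsigned int last_minidump_id; /* records the call, illustration only */

/*
 * Would live in qcom_common.c: look up the TOC in SMEM, add the
 * segments, call rproc_minidump() and clean up. Placeholder body.
 */
static void qcom_minidump(struct rproc *rproc, unsigned int minidump_id)
{
	assert(rproc != NULL);
	last_minidump_id = minidump_id;
}

/* All that would remain in qcom_q6v5_pas.c. */
static void adsp_minidump(struct rproc *rproc)
{
	struct qcom_adsp *adsp = rproc->priv;

	qcom_minidump(rproc, adsp->minidump_id);
}
```

Another remoteproc driver would then need only its own minidump id and
a similar one-line wrapper.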

> +{
> + struct qcom_adsp *adsp = rproc->priv;
> + struct minidump_subsystem_toc *minidump_ss;
> + struct minidump_global_toc *minidump_toc;

How about just naming this "toc" and minidump_ss just "minidump"?

> +
> + /* Get Global minidump ToC*/
> + minidump_toc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
> +
> + /* check if global table pointer exists and init is set */
> + if (IS_ERR(minidump_toc) || !minidump_toc->status) {
> + dev_err(&rproc->dev, "SMEM is not initialized.\n");

"Minidump TOC not found in SMEM\n"

> + return;
> + }
> +
> + /* Get subsystem table of contents using the minidump id */
> + minidump_ss = &minidump_toc->md_ss_toc[adsp->minidump_id];
> +
> + /**
> + * Collect minidump if SS ToC is valid and segment table
> + * is initialized in memory and encryption status is set.
> + */
> + if (minidump_ss->md_ss_smem_regions_baseptr == 0 ||
> + le32_to_cpu(minidump_ss->status) != 1 ||
> + le32_to_cpu(minidump_ss->enabled) != MD_SS_ENABLED ||
> + le32_to_cpu(minidump_ss->encryption_status) != MD_SS_ENCR_DONE) {
> + dev_err(&rproc->dev, "Minidump not ready!! Aborting\n");

"Minidump not ready, skipping\n"
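
For reference, the readiness test guarding this message can be modelled
in userspace as below. The struct keeps only the four fields the check
reads, stored as raw little-endian bytes, so it is not the real SMEM
layout; md_subsystem_raw, le32_decode() and minidump_ready() are
hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MD_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
#define MD_SS_ENABLED   ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)

/* Only the fields the check reads, as raw little-endian bytes. */
struct md_subsystem_raw {
	uint8_t status[4];
	uint8_t enabled[4];
	uint8_t encryption_status[4];
	uint8_t regions_baseptr[8];
};

/* Portable equivalent of le32_to_cpu() on a byte array. */
static uint32_t le32_decode(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Same four conditions as the quoted check, with the sense inverted. */
static bool minidump_ready(const struct md_subsystem_raw *ss)
{
	static const uint8_t zero[8];

	assert(ss != NULL);
	return memcmp(ss->regions_baseptr, zero, sizeof(zero)) != 0 &&
	       le32_decode(ss->status) == 1 &&
	       le32_decode(ss->enabled) == MD_SS_ENABLED &&
	       le32_decode(ss->encryption_status) == MD_SS_ENCR_DONE;
}
```

Since the magic values are ASCII tags packed most-significant byte
first, MD_SS_ENABLED sits in little-endian memory as the bytes
'L', 'B', 'N', 'E'.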

> + return;
> + }
> +
> + adsp_add_minidump_segments(rproc, minidump_ss);
> + rproc_minidump(rproc);
> + adsp_minidump_cleanup(rproc);
> +}
> +
> static int adsp_load(struct rproc *rproc, const struct firmware *fw)
> {
> struct qcom_adsp *adsp = (struct qcom_adsp *)rproc->priv;
> @@ -258,6 +343,15 @@ static const struct rproc_ops adsp_ops = {
> .panic = adsp_panic,
> };
>
> +static const struct rproc_ops adsp_minidump_ops = {
> + .start = adsp_start,
> + .stop = adsp_stop,
> + .da_to_va = adsp_da_to_va,
> + .load = adsp_load,
> + .panic = adsp_panic,
> + .coredump = adsp_dump,
> +};
> +
> static int adsp_init_clock(struct qcom_adsp *adsp)
> {
> int ret;
> @@ -398,8 +492,13 @@ static int adsp_probe(struct platform_device *pdev)
> if (ret < 0 && ret != -EINVAL)
> return ret;
>
> - rproc = rproc_alloc(&pdev->dev, pdev->name, &adsp_ops,
> - fw_name, sizeof(*adsp));
> + if (desc->minidump_id)

Please use a local variable to reference adsp_minidump_ops vs adsp_ops,
instead of making the whole rproc_alloc() call conditional.
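
Something along these lines, shown as a userspace sketch: struct
rproc_ops is reduced to a single member, and adsp_select_ops() is a
hypothetical helper that exists only to make the selection testable. In
adsp_probe() the pointer would simply be a local variable.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct rproc_ops. */
struct rproc_ops {
	void (*coredump)(void *rproc);
};

static void adsp_minidump(void *rproc)
{
	(void)rproc; /* placeholder body */
}

static const struct rproc_ops adsp_ops = { .coredump = NULL };
static const struct rproc_ops adsp_minidump_ops = { .coredump = adsp_minidump };

/* Pick the ops table once, so that only one rproc_alloc() call is needed. */
static const struct rproc_ops *adsp_select_ops(unsigned int minidump_id)
{
	const struct rproc_ops *ops = &adsp_ops;

	if (minidump_id)
		ops = &adsp_minidump_ops;

	/*
	 * In adsp_probe() a single call would follow:
	 * rproc = rproc_alloc(&pdev->dev, pdev->name, ops, fw_name,
	 *		       sizeof(*adsp));
	 */
	return ops;
}
```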

Regards,
Bjorn

> + rproc = rproc_alloc(&pdev->dev, pdev->name, &adsp_minidump_ops, fw_name,
> + sizeof(*adsp));
> + else
> + rproc = rproc_alloc(&pdev->dev, pdev->name, &adsp_ops, fw_name,
> + sizeof(*adsp));
> +
> if (!rproc) {
> dev_err(&pdev->dev, "unable to allocate remoteproc\n");
> return -ENOMEM;
> @@ -411,6 +510,7 @@ static int adsp_probe(struct platform_device *pdev)
> adsp = (struct qcom_adsp *)rproc->priv;
> adsp->dev = &pdev->dev;
> adsp->rproc = rproc;
> + adsp->minidump_id = desc->minidump_id;
> adsp->pas_id = desc->pas_id;
> adsp->has_aggre2_clk = desc->has_aggre2_clk;
> adsp->info_name = desc->sysmon_name;
> --
> Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>