2023-02-21 11:26:35

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 0/6] Add basic Minidump kernel driver support

Minidump is a best effort mechanism to collect useful and predefined data
for first level of debugging on end user devices running on Qualcomm SoCs.
It is built on the premise that System on Chip (SoC) or subsystem part of
SoC crashes, due to a range of hardware and software bugs. Hence, the
ability to collect accurate data is only a best-effort. The data collected
could be invalid or corrupted, data collection itself could fail, and so on.

Qualcomm devices in engineering mode provides a mechanism for generating
full system ramdumps for post mortem debugging. But in some cases it's
however not feasible to capture the entire content of RAM. The minidump
mechanism provides the means for selecting which snippets should be
included in the ramdump.

The core of minidump feature is part of Qualcomm's boot firmware code.
It initializes shared memory (SMEM), which is a part of DDR and
allocates a small section of SMEM to minidump table i.e also called
global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
their own table of segments to be included in the minidump and all get
their reference from G-ToC. Each segment/region has some details like
name, physical address and it's size etc. and it could be anywhere
scattered in the DDR.

Existing upstream Qualcomm remoteproc driver[1] already supports minidump
feature for remoteproc instances like ADSP, MODEM, ... where predefined
selective segments of subsystem region can be dumped as part of
coredump collection which generates smaller size artifacts compared to
complete coredump of subsystem on crash.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142

In addition to managing and querying the APSS minidump description,
the Linux driver maintains a ELF header in a segment. This segment
gets updated with section/program header whenever a new entry gets
registered.

Patch 1/6 is very trivial change.
Patch 2/6 moves the minidump specific data structure and macro to
qcom_minidump.h so that (3/6) minidump driver can use.
Patch 3/6 implements qualcomm minidump kernel driver and exports
symbol which other minidump kernel client can use.
Patch 4/6 enables the qualcomm minidump driver.
Patch 5/6 Use the exported symbol from minidump driver in qcom_common
for querying minidump descriptor for a subsystem.
Patch 6/6 Register pstore region with minidump.

Testing of the patches has been done on sm8450 target with the help
of out of tree patch which helps to set the download mode and storage
type(on which dump will be saved) for which i will send separate series.

Mukesh Ojha (6):
remoteproc: qcom: Expand MD_* as MINIDUMP_*
remoteproc: qcom: Move minidump specific data to qcom_minidump.h
soc: qcom: Add Qualcomm minidump kernel driver
arm64: defconfig: Enable Qualcomm minidump driver
remoterproc: qcom: refactor to leverage exported minidump symbol
pstore/ram: Register context with minidump

arch/arm64/configs/defconfig | 1 +
drivers/remoteproc/qcom_common.c | 75 +-----
drivers/soc/qcom/Kconfig | 14 ++
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/qcom_minidump.c | 490 +++++++++++++++++++++++++++++++++++++++++
fs/pstore/ram.c | 77 ++++++
include/soc/qcom/minidump.h | 40 ++++
include/soc/qcom/qcom_minidump.h | 88 +++++++
8 files changed, 717 insertions(+), 69 deletions(-)
create mode 100644 drivers/soc/qcom/qcom_minidump.c
create mode 100644 include/soc/qcom/minidump.h
create mode 100644 include/soc/qcom/qcom_minidump.h

--
2.7.4



2023-02-21 11:26:40

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 1/6] remoteproc: qcom: Expand MD_* as MINIDUMP_*

Expand MD_* as MINIDUMP_* which makes more sense than the
abbreviation.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/remoteproc/qcom_common.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
index 020349f..95bd4b8 100644
--- a/drivers/remoteproc/qcom_common.c
+++ b/drivers/remoteproc/qcom_common.c
@@ -29,9 +29,9 @@
#define MAX_NUM_OF_SS 10
#define MAX_REGION_NAME_LENGTH 16
#define SBL_MINIDUMP_SMEM_ID 602
-#define MD_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
-#define MD_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
-#define MD_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
+#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)

/**
* struct minidump_region - Minidump region
@@ -123,7 +123,7 @@ static int qcom_add_minidump_segments(struct rproc *rproc, struct minidump_subsy

for (i = 0; i < seg_cnt; i++) {
memcpy_fromio(&region, ptr + i, sizeof(region));
- if (region.valid == MD_REGION_VALID) {
+ if (region.valid == MINIDUMP_REGION_VALID) {
name = kstrdup(region.name, GFP_KERNEL);
if (!name) {
iounmap(ptr);
@@ -163,8 +163,8 @@ void qcom_minidump(struct rproc *rproc, unsigned int minidump_id)
*/
if (subsystem->regions_baseptr == 0 ||
le32_to_cpu(subsystem->status) != 1 ||
- le32_to_cpu(subsystem->enabled) != MD_SS_ENABLED ||
- le32_to_cpu(subsystem->encryption_status) != MD_SS_ENCR_DONE) {
+ le32_to_cpu(subsystem->enabled) != MINIDUMP_SS_ENABLED ||
+ le32_to_cpu(subsystem->encryption_status) != MINIDUMP_SS_ENCR_DONE) {
dev_err(&rproc->dev, "Minidump not ready, skipping\n");
return;
}
--
2.7.4


2023-02-21 11:26:50

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 2/6] remoteproc: qcom: Move minidump specific data to qcom_minidump.h

Move minidump specific data types and macros to a separate internal
header(qcom_minidump.h) so that it can be shared among different
Qualcomm drivers.

There is no change in functional behavior after this.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/remoteproc/qcom_common.c | 56 +---------------------------------
include/soc/qcom/qcom_minidump.h | 66 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+), 55 deletions(-)
create mode 100644 include/soc/qcom/qcom_minidump.h

diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
index 95bd4b8..f607e1e 100644
--- a/drivers/remoteproc/qcom_common.c
+++ b/drivers/remoteproc/qcom_common.c
@@ -18,6 +18,7 @@
#include <linux/slab.h>
#include <linux/soc/qcom/mdt_loader.h>
#include <linux/soc/qcom/smem.h>
+#include <soc/qcom/qcom_minidump.h>

#include "remoteproc_internal.h"
#include "qcom_common.h"
@@ -26,61 +27,6 @@
#define to_smd_subdev(d) container_of(d, struct qcom_rproc_subdev, subdev)
#define to_ssr_subdev(d) container_of(d, struct qcom_rproc_ssr, subdev)

-#define MAX_NUM_OF_SS 10
-#define MAX_REGION_NAME_LENGTH 16
-#define SBL_MINIDUMP_SMEM_ID 602
-#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
-#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
-#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
-
-/**
- * struct minidump_region - Minidump region
- * @name : Name of the region to be dumped
- * @seq_num: : Use to differentiate regions with same name.
- * @valid : This entry to be dumped (if set to 1)
- * @address : Physical address of region to be dumped
- * @size : Size of the region
- */
-struct minidump_region {
- char name[MAX_REGION_NAME_LENGTH];
- __le32 seq_num;
- __le32 valid;
- __le64 address;
- __le64 size;
-};
-
-/**
- * struct minidump_subsystem - Subsystem's SMEM Table of content
- * @status : Subsystem toc init status
- * @enabled : if set to 1, this region would be copied during coredump
- * @encryption_status: Encryption status for this subsystem
- * @encryption_required : Decides to encrypt the subsystem regions or not
- * @region_count : Number of regions added in this subsystem toc
- * @regions_baseptr : regions base pointer of the subsystem
- */
-struct minidump_subsystem {
- __le32 status;
- __le32 enabled;
- __le32 encryption_status;
- __le32 encryption_required;
- __le32 region_count;
- __le64 regions_baseptr;
-};
-
-/**
- * struct minidump_global_toc - Global Table of Content
- * @status : Global Minidump init status
- * @md_revision : Minidump revision
- * @enabled : Minidump enable status
- * @subsystems : Array of subsystems toc
- */
-struct minidump_global_toc {
- __le32 status;
- __le32 md_revision;
- __le32 enabled;
- struct minidump_subsystem subsystems[MAX_NUM_OF_SS];
-};
-
struct qcom_ssr_subsystem {
const char *name;
struct srcu_notifier_head notifier_list;
diff --git a/include/soc/qcom/qcom_minidump.h b/include/soc/qcom/qcom_minidump.h
new file mode 100644
index 0000000..84c8605
--- /dev/null
+++ b/include/soc/qcom/qcom_minidump.h
@@ -0,0 +1,66 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Qualcomm minidump shared data structures and macros
+ *
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _QCOM_MINIDUMP_H_
+#define _QCOM_MINIDUMP_H_
+
+#define MAX_NUM_OF_SS 10
+#define MAX_REGION_NAME_LENGTH 16
+#define SBL_MINIDUMP_SMEM_ID 602
+#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
+
+/**
+ * struct minidump_region - Minidump region
+ * @name : Name of the region to be dumped
+ * @seq_num: : Use to differentiate regions with same name.
+ * @valid : This entry to be dumped (if set to 1)
+ * @address : Physical address of region to be dumped
+ * @size : Size of the region
+ */
+struct minidump_region {
+ char name[MAX_REGION_NAME_LENGTH];
+ __le32 seq_num;
+ __le32 valid;
+ __le64 address;
+ __le64 size;
+};
+
+/**
+ * struct minidump_subsystem - Subsystem's SMEM Table of content
+ * @status : Subsystem toc init status
+ * @enabled : if set to 1, this region would be copied during coredump
+ * @encryption_status: Encryption status for this subsystem
+ * @encryption_required : Decides to encrypt the subsystem regions or not
+ * @region_count : Number of regions added in this subsystem toc
+ * @regions_baseptr : regions base pointer of the subsystem
+ */
+struct minidump_subsystem {
+ __le32 status;
+ __le32 enabled;
+ __le32 encryption_status;
+ __le32 encryption_required;
+ __le32 region_count;
+ __le64 regions_baseptr;
+};
+
+/**
+ * struct minidump_global_toc - Global Table of Content
+ * @status : Global Minidump init status
+ * @md_revision : Minidump revision
+ * @enabled : Minidump enable status
+ * @subsystems : Array of subsystems toc
+ */
+struct minidump_global_toc {
+ __le32 status;
+ __le32 md_revision;
+ __le32 enabled;
+ struct minidump_subsystem subsystems[MAX_NUM_OF_SS];
+};
+
+#endif /* _QCOM_MINIDUMP_H_ */
--
2.7.4


2023-02-21 11:26:58

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 4/6] arm64: defconfig: Enable Qualcomm minidump driver

Previous patches add the Qualcomm minidump driver support, so
lets enable minidump config so that it can be used by kernel
clients.

Signed-off-by: Mukesh Ojha <[email protected]>
---
arch/arm64/configs/defconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 989183c..d87800b 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1211,6 +1211,7 @@ CONFIG_QCOM_STATS=m
CONFIG_QCOM_WCNSS_CTRL=m
CONFIG_QCOM_APR=m
CONFIG_QCOM_ICC_BWMON=m
+CONFIG_QCOM_MINIDUMP=y
CONFIG_ARCH_R8A77995=y
CONFIG_ARCH_R8A77990=y
CONFIG_ARCH_R8A77950=y
--
2.7.4


2023-02-21 11:27:01

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 3/6] soc: qcom: Add Qualcomm minidump kernel driver

Minidump is a best effort mechanism to collect useful and predefined
data for first level of debugging on end user devices running on
Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
or subsystem part of SoC crashes, due to a range of hardware and
software bugs. Hence, the ability to collect accurate data is only
a best-effort. The data collected could be invalid or corrupted,
data collection itself could fail, and so on.

Qualcomm devices in engineering mode provides a mechanism for
generating full system ramdumps for post mortem debugging. But in some
cases it's however not feasible to capture the entire content of RAM.
The minidump mechanism provides the means for selecting region should
be included in the ramdump. The solution supports extracting the
ramdump/minidump produced either over USB or stored to an attached
storage device.

The core of minidump feature is part of Qualcomm's boot firmware code.
It initializes shared memory(SMEM), which is a part of DDR and
allocates a small section of it to minidump table i.e also called
global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
their own table of segments to be included in the minidump, all
references from a descriptor in SMEM (G-ToC). Each segment/region has
some details like name, physical address and it's size etc. and it
could be anywhere scattered in the DDR.

Minidump kernel driver adds the capability to add linux region to be
dumped as part of ram dump collection. It provides appropriate symbol
to check its enablement and register client regions.

To simplify post mortem debugging, it creates and maintain an ELF
header as first region that gets updated with upon registration
of a new region.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/soc/qcom/Kconfig | 14 ++
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/qcom_minidump.c | 490 +++++++++++++++++++++++++++++++++++++++++
include/soc/qcom/minidump.h | 40 ++++
include/soc/qcom/qcom_minidump.h | 24 +-
5 files changed, 568 insertions(+), 1 deletion(-)
create mode 100644 drivers/soc/qcom/qcom_minidump.c
create mode 100644 include/soc/qcom/minidump.h

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index ae504c4..0fc7698 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -248,4 +248,18 @@ config QCOM_ICC_BWMON
the fixed bandwidth votes from cpufreq (CPU nodes) thus achieve high
memory throughput even with lower CPU frequencies.

+config QCOM_MINIDUMP
+ bool "QCOM Minidump Support"
+ depends on ARCH_QCOM || COMPILE_TEST
+ depends on QCOM_SMEM
+ help
+ Enablement of core minidump feature is controlled from boot firmware
+ side, and this config allow linux to query and manages APPS minidump
+ table.
+
+ Client drivers can register their internal data structures and debug
+ messages as part of the minidump region and when the SoC is crashed,
+ these selective regions will be dumped instead of the entire DDR.
+ This saves significant amount of time and/or storage space.
+
endmenu
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index d66604a..e1ff492 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -29,3 +29,4 @@ obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) += kryo-l2-accessors.o
obj-$(CONFIG_QCOM_ICC_BWMON) += icc-bwmon.o
+obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
diff --git a/drivers/soc/qcom/qcom_minidump.c b/drivers/soc/qcom/qcom_minidump.c
new file mode 100644
index 0000000..eb63b75
--- /dev/null
+++ b/drivers/soc/qcom/qcom_minidump.c
@@ -0,0 +1,490 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#define pr_fmt(fmt) "Minidump: " fmt
+
+#include <linux/init.h>
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/err.h>
+#include <linux/elf.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/of.h>
+#include <linux/printk.h>
+#include <linux/soc/qcom/smem.h>
+#include <soc/qcom/qcom_minidump.h>
+#include <soc/qcom/minidump.h>
+
+/**
+ * DOC: Overview
+ *
+ * +-----------------------------------------------+
+ * | DDR +-------------+ |
+ * | | SS0-ToC| |
+ * | +----------------+ +----------------+ | |
+ * | |Shared memory | | SS1-ToC| | |
+ * | |(SMEM) | | | | |
+ * | | | +-->|--------+ | | |
+ * | |G-ToC | | | SS-ToC \ | | |
+ * | |+-------------+ | | | +-----------+ | | |
+ * | ||-------------| | | | |-----------| | | |
+ * | || SS0-ToC | | | +-|<|SS1 region1| | | |
+ * | ||-------------| | | | | |-----------| | | |
+ * | || SS1-ToC |-|>+ | | |SS1 region2| | | |
+ * | ||-------------| | | | |-----------| | | |
+ * | || SS2-ToC | | | | | ... | | | |
+ * | ||-------------| | | | |-----------| | | |
+ * | || ... | | |-|<|SS1 regionN| | | |
+ * | ||-------------| | | | |-----------| | | |
+ * | || SSn-ToC | | | | +-----------+ | | |
+ * | |+-------------+ | | | | | |
+ * | | | | |----------------| | |
+ * | | | +>| regionN | | |
+ * | | | | |----------------| | |
+ * | +----------------+ | | | | |
+ * | | |----------------| | |
+ * | +>| region1 | | |
+ * | |----------------| | |
+ * | | | | |
+ * | |----------------|-+ |
+ * | | region5 | |
+ * | |----------------| |
+ * | | | |
+ * | Region information +----------------+ |
+ * | +---------------+ |
+ * | |region name | |
+ * | |---------------| |
+ * | |region address | |
+ * | |---------------| |
+ * | |region size | |
+ * | +---------------+ |
+ * +-----------------------------------------------+
+ * G-ToC: Global table of content
+ * SS-ToC: Subsystem table of content
+ * SS0-SSn: Subsystem numbered from 0 to n
+ *
+ * The core of minidump feature is part of Qualcomm's boot firmware code.
+ * It initializes shared memory(SMEM), which is a part of DDR and
+ * allocates a small section of it to minidump table i.e also called
+ * global table of content (G-ToC).
+ *
+ * Each subsystem (APSS, ADSP, ...) has their own table of segments to be
+ * included in the minidump, all references from a descriptor in SMEM
+ * (G-ToC). Each segment/region has some details like name, physical
+ * address and it's size etc. and it could be anywhere scattered
+ * in the DDR.
+ *
+ * Qualcomm SoCs supports extracting the ramdump/minidump produced
+ * either over USB or stored to an attached storage device.
+ *
+ */
+
+/**
+ * struct minidump_elfhdr - Minidump table elf header
+ * @ehdr: Elf main header
+ * @shdr: Section header
+ * @phdr: Program header
+ * @elf_offset: Section offset in elf
+ * @strtable_idx: String table current index position
+ */
+struct minidump_elfhdr {
+ struct elfhdr *ehdr;
+ struct elf_shdr *shdr;
+ struct elf_phdr *phdr;
+ size_t elf_offset;
+ size_t strtable_idx;
+};
+
+/**
+ * struct minidump - Minidump driver private data
+ * @md_gbl_toc : Global TOC pointer
+ * @md_ss_toc : High level OS TOC pointer
+ * @md_regions : High level OS region base pointer
+ * @elf : Minidump elf header
+ */
+struct minidump {
+ struct minidump_global_toc *md_gbl_toc;
+ struct minidump_subsystem *md_ss_toc;
+ struct minidump_region *md_regions;
+ struct minidump_elfhdr elf;
+};
+
+/*
+ * In some of the Old Qualcomm devices, boot firmware statically allocates 300
+ * as total number of supported region (including all co-processors) in
+ * minidump table out of which linux was using 200. In future, this limitation
+ * from boot firmware might get removed by allocating the region dynamically.
+ * So, keep it compatible with older devices, we can the current limit for Linux
+ * to 200.
+ */
+#define MAX_NUM_ENTRIES 200
+
+static struct minidump minidump;
+static DEFINE_MUTEX(minidump_lock);
+
+static struct elf_shdr *elf_shdr_entry_addr(struct elfhdr *ehdr, int idx)
+{
+ struct elf_shdr *eshdr = (struct elf_shdr *)((size_t)ehdr + ehdr->e_shoff);
+
+ return &eshdr[idx];
+}
+
+static struct elf_phdr *elf_phdr_entry_addr(struct elfhdr *ehdr, int idx)
+{
+ struct elf_phdr *ephdr = (struct elf_phdr *)((size_t)ehdr + ehdr->e_phoff);
+
+ return &ephdr[idx];
+}
+
+static char *elf_str_table_start(struct elfhdr *ehdr)
+{
+ struct elf_shdr *eshdr;
+
+ if (ehdr->e_shstrndx == SHN_UNDEF)
+ return NULL;
+
+ eshdr = elf_shdr_entry_addr(ehdr, ehdr->e_shstrndx);
+ return (char *)ehdr + eshdr->sh_offset;
+}
+
+static unsigned int append_str_to_strtable(const char *name)
+{
+ char *strtab = elf_str_table_start(minidump.elf.ehdr);
+ unsigned int old_idx = minidump.elf.strtable_idx;
+ unsigned int ret;
+
+ if (!strtab || !name)
+ return 0;
+
+ ret = old_idx;
+ old_idx += strscpy((strtab + old_idx), name, MAX_REGION_NAME_LENGTH);
+ minidump.elf.strtable_idx = old_idx + 1;
+ return ret;
+}
+
+static int get_minidump_region_index(const struct qcom_minidump_region *region)
+{
+ struct minidump_region *mdr;
+ unsigned int i;
+ unsigned int count;
+
+ count = le32_to_cpu(minidump.md_ss_toc->region_count);
+ for (i = 0; i < count; i++) {
+ mdr = &minidump.md_regions[i];
+ if (!strcmp(mdr->name, region->name))
+ return i;
+ }
+ return -ENOENT;
+}
+
+/* Update ELF header */
+static void minidump_update_elf_header(const struct qcom_minidump_region *region)
+{
+ struct elfhdr *ehdr = minidump.elf.ehdr;
+ struct elf_shdr *shdr;
+ struct elf_phdr *phdr;
+
+ shdr = elf_shdr_entry_addr(ehdr, ehdr->e_shnum++);
+ phdr = elf_phdr_entry_addr(ehdr, ehdr->e_phnum++);
+
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_name = append_str_to_strtable(region->name);
+ shdr->sh_addr = (elf_addr_t)region->virt_addr;
+ shdr->sh_size = region->size;
+ shdr->sh_flags = SHF_WRITE;
+ shdr->sh_offset = minidump.elf.elf_offset;
+ shdr->sh_entsize = 0;
+
+ phdr->p_type = PT_LOAD;
+ phdr->p_offset = minidump.elf.elf_offset;
+ phdr->p_vaddr = (elf_addr_t)region->virt_addr;
+ phdr->p_paddr = region->phys_addr;
+ phdr->p_filesz = phdr->p_memsz = region->size;
+ phdr->p_flags = PF_R | PF_W;
+ minidump.elf.elf_offset += shdr->sh_size;
+}
+
+/* Add region in subsystem ToC */
+static void minidump_ss_add_region(const struct qcom_minidump_region *region)
+{
+ struct minidump_region *mdr;
+ unsigned int region_cnt = le32_to_cpu(minidump.md_ss_toc->region_count);
+
+ mdr = &minidump.md_regions[region_cnt];
+ strscpy(mdr->name, region->name, sizeof(mdr->name));
+ mdr->address = cpu_to_le64(region->phys_addr);
+ mdr->size = cpu_to_le64(region->size);
+ mdr->valid = cpu_to_le32(MINIDUMP_REGION_VALID);
+ region_cnt++;
+ minidump.md_ss_toc->region_count = cpu_to_le32(region_cnt);
+}
+
+static bool qcom_minidump_valid_region(const struct qcom_minidump_region *region)
+{
+ return region &&
+ strnlen(region->name, MAX_NAME_LENGTH) < MAX_NAME_LENGTH &&
+ region->virt_addr &&
+ region->size &&
+ IS_ALIGNED(region->size, 4);
+}
+
+#define MAX_STRTBL_SIZE (MAX_NUM_ENTRIES * MAX_REGION_NAME_LENGTH)
+static int minidump_add_elf_header(void)
+{
+ struct qcom_minidump_region elfregion;
+ struct elfhdr *ehdr;
+ struct elf_shdr *shdr;
+ struct elf_phdr *phdr;
+ unsigned int elfh_size;
+ unsigned int strtbl_off;
+ unsigned int phdr_off;
+ char *banner;
+ unsigned int banner_len;
+
+ banner_len = strlen(linux_banner);
+ /*
+ * Header buffer contains:
+ * ELF header, (MAX_NUM_ENTRIES + 4) of Section and Program ELF headers,
+ * where, 4 additional entries, one for empty header, one for string table
+ * one for minidump table and one for linux banner.
+ *
+ * Linux banner is stored in minidump to aid post mortem tools to determine
+ * the kernel version.
+ */
+ elfh_size = sizeof(*ehdr);
+ elfh_size += MAX_STRTBL_SIZE;
+ elfh_size += banner_len + 1;
+ elfh_size += ((sizeof(*shdr) + sizeof(*phdr)) * (MAX_NUM_ENTRIES + 4));
+ elfh_size = ALIGN(elfh_size, 4);
+
+ minidump.elf.ehdr = kzalloc(elfh_size, GFP_KERNEL);
+ if (!minidump.elf.ehdr)
+ return -ENOMEM;
+
+ /* Register ELF header as first region */
+ strscpy(elfregion.name, "KELF_HEADER", sizeof(elfregion.name));
+ elfregion.virt_addr = minidump.elf.ehdr;
+ elfregion.phys_addr = virt_to_phys(minidump.elf.ehdr);
+ elfregion.size = elfh_size;
+ minidump_ss_add_region(&elfregion);
+
+ ehdr = minidump.elf.ehdr;
+ /* Assign Section/Program headers offset */
+ minidump.elf.shdr = shdr = (struct elf_shdr *)(ehdr + 1);
+ minidump.elf.phdr = phdr = (struct elf_phdr *)(shdr + MAX_NUM_ENTRIES);
+ phdr_off = sizeof(*ehdr) + (sizeof(*shdr) * MAX_NUM_ENTRIES);
+
+ memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
+ ehdr->e_ident[EI_CLASS] = ELF_CLASS;
+ ehdr->e_ident[EI_DATA] = ELF_DATA;
+ ehdr->e_ident[EI_VERSION] = EV_CURRENT;
+ ehdr->e_ident[EI_OSABI] = ELF_OSABI;
+ ehdr->e_type = ET_CORE;
+ ehdr->e_machine = ELF_ARCH;
+ ehdr->e_version = EV_CURRENT;
+ ehdr->e_ehsize = sizeof(*ehdr);
+ ehdr->e_phoff = phdr_off;
+ ehdr->e_phentsize = sizeof(*phdr);
+ ehdr->e_shoff = sizeof(*ehdr);
+ ehdr->e_shentsize = sizeof(*shdr);
+ ehdr->e_shstrndx = 1;
+
+ minidump.elf.elf_offset = elfh_size;
+
+ /*
+ * The zeroth index of the section header is reserved and is rarely used.
+ * Set the section header as null (SHN_UNDEF) and move to the next one.
+ * 2nd Section is String table.
+ */
+ minidump.elf.strtable_idx = 1;
+ strtbl_off = sizeof(*ehdr) + ((sizeof(*phdr) + sizeof(*shdr)) * MAX_NUM_ENTRIES);
+ shdr++;
+ shdr->sh_type = SHT_STRTAB;
+ shdr->sh_offset = (elf_addr_t)strtbl_off;
+ shdr->sh_size = MAX_STRTBL_SIZE;
+ shdr->sh_entsize = 0;
+ shdr->sh_flags = 0;
+ shdr->sh_name = append_str_to_strtable("STR_TBL");
+ shdr++;
+
+ /* 3rd Section is for Minidump_table VA, used by parsers */
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_entsize = 0;
+ shdr->sh_flags = 0;
+ shdr->sh_addr = (elf_addr_t)&minidump;
+ shdr->sh_name = append_str_to_strtable("minidump_table");
+ shdr++;
+
+ /* 4th Section is Linux banner */
+ banner = (char *)ehdr + strtbl_off + MAX_STRTBL_SIZE;
+ memcpy(banner, linux_banner, banner_len);
+
+ shdr->sh_type = SHT_PROGBITS;
+ shdr->sh_offset = (elf_addr_t)(strtbl_off + MAX_STRTBL_SIZE);
+ shdr->sh_size = banner_len + 1;
+ shdr->sh_addr = (elf_addr_t)linux_banner;
+ shdr->sh_entsize = 0;
+ shdr->sh_flags = SHF_WRITE;
+ shdr->sh_name = append_str_to_strtable("linux_banner");
+
+ phdr->p_type = PT_LOAD;
+ phdr->p_offset = (elf_addr_t)(strtbl_off + MAX_STRTBL_SIZE);
+ phdr->p_vaddr = (elf_addr_t)linux_banner;
+ phdr->p_paddr = virt_to_phys(linux_banner);
+ phdr->p_filesz = phdr->p_memsz = banner_len + 1;
+ phdr->p_flags = PF_R | PF_W;
+
+ /* Update headers count*/
+ ehdr->e_phnum = 1;
+ ehdr->e_shnum = 4;
+ return 0;
+}
+
+static int qcom_minidump_init(void)
+{
+ struct minidump_subsystem *mdsstoc;
+
+ mdsstoc = qcom_minidump_subsystem_desc(MINIDUMP_APSS_DESC);
+ if (IS_ERR(mdsstoc))
+ return -EINVAL;
+
+ minidump.md_regions = kcalloc(MAX_NUM_ENTRIES,
+ sizeof(struct minidump_region), GFP_KERNEL);
+ if (!minidump.md_regions)
+ return -ENOMEM;
+
+ minidump.md_ss_toc = mdsstoc;
+ /* Share memory table update */
+ mdsstoc->regions_baseptr = cpu_to_le64(virt_to_phys(minidump.md_regions));
+ /* Tell bootloader not to encrypt the regions of this subsystem */
+ mdsstoc->encryption_status = cpu_to_le32(MINIDUMP_SS_ENCR_DONE);
+ mdsstoc->encryption_required = cpu_to_le32(MINIDUMP_SS_ENCR_NOTREQ);
+
+ mdsstoc->enabled = cpu_to_le32(MINIDUMP_SS_ENABLED);
+ mdsstoc->status = cpu_to_le32(1);
+ mdsstoc->region_count = cpu_to_le32(0);
+
+ return 0;
+}
+
+/**
+ * qcom_minidump_ready - Check, if minidump is ready for client registration or not.
+ *
+ * Return: zero on success and negative on failure.
+ *
+ * Qualcomm minidump feature is dependent on Qualcomm's shared memory and it is
+ * possible for a arm64 target to not have it's device tree entry present, for
+ * such case, qcom_minidump_ready() returns -ENODEV and client should not further
+ * try to register their region with minidump driver.
+ */
+int qcom_minidump_ready(void)
+{
+ void *ptr;
+ struct device_node *np;
+ static bool is_smem_available = true;
+
+ if (!is_smem_available || !(np = of_find_compatible_node(NULL, NULL, "qcom,smem"))) {
+ is_smem_available = false;
+ return -ENODEV;
+ }
+
+ of_node_put(np);
+ ptr = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
+ if (IS_ERR(ptr))
+ return PTR_ERR(ptr);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(qcom_minidump_ready);
+
+/**
+ * qcom_minidump_subsystem_desc() - Get minidump subsystem descriptor.
+ * @minidump_index: minidump index for a subsystem in minidump table
+ *
+ * Return: minidump subsystem descriptor address on success and error
+ * on failure
+ */
+struct minidump_subsystem *qcom_minidump_subsystem_desc(unsigned int minidump_index)
+{
+ struct minidump_global_toc *mdgtoc;
+ size_t size;
+
+ mdgtoc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, &size);
+ if (IS_ERR(mdgtoc)) {
+ pr_err("Unable to find minidump smem item\n");
+ return ERR_CAST(mdgtoc);
+ }
+
+ if (size < sizeof(*mdgtoc) || !mdgtoc->status) {
+ pr_err("Minidump table is not initialized\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ return &mdgtoc->subsystems[minidump_index];
+}
+EXPORT_SYMBOL_GPL(qcom_minidump_subsystem_desc);
+
+/**
+ * qcom_minidump_region_register() - Register a region in Minidump table.
+ * @region: minidump region.
+ *
+ * Client should not call this directly instead first call qcom_minidump_ready()
+ * to check if minidump is ready to do registration if yes, then call this API.
+ *
+ * Return: On success, it returns region index in minidump table and negative
+ * error value on failure.
+ */
+int qcom_minidump_region_register(const struct qcom_minidump_region *region)
+{
+ static bool minidump_init_done;
+ unsigned int num_region;
+ int ret;
+
+ /* Initialize minidump context on first call */
+ mutex_lock(&minidump_lock);
+ if (!minidump_init_done) {
+ ret = qcom_minidump_init();
+ if (ret)
+ goto unlock;
+
+ minidump_init_done = true;
+ /* First entry would be ELF header */
+ ret = minidump_add_elf_header();
+ if (ret) {
+ kfree(minidump.md_regions);
+ goto unlock;
+ }
+ }
+
+ if (!qcom_minidump_valid_region(region)) {
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ ret = get_minidump_region_index(region);
+ if (ret >= 0) {
+ pr_info("%s region is already registered\n", region->name);
+ ret = -EEXIST;
+ goto unlock;
+ }
+
+ /* Check if there is a room for a new entry */
+ ret = num_region = le32_to_cpu(minidump.md_ss_toc->region_count);
+ if (num_region >= MAX_NUM_ENTRIES) {
+ pr_err("maximum region limit %u reached\n", num_region);
+ ret = -ENOSPC;
+ goto unlock;
+ }
+
+ minidump_ss_add_region(region);
+ minidump_update_elf_header(region);
+unlock:
+ mutex_unlock(&minidump_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(qcom_minidump_region_register);
diff --git a/include/soc/qcom/minidump.h b/include/soc/qcom/minidump.h
new file mode 100644
index 0000000..a50843f
--- /dev/null
+++ b/include/soc/qcom/minidump.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#ifndef _MINIDUMP_H_
+#define _MINIDUMP_H_
+
+#include <linux/types.h>
+
+#define MAX_NAME_LENGTH 12
+
+/**
+ * struct qcom_minidump_region - Minidump region information
+ *
+ * @name: Entry name, Minidump will dump binary with this name.
+ * @virt_addr: Virtual address of the entry.
+ * @phys_addr: Physical address of the entry to dump.
+ * @size: Number of byte to dump from @address location,
+ * and it should be 4 byte aligned.
+ */
+struct qcom_minidump_region {
+ char name[MAX_NAME_LENGTH];
+ void *virt_addr;
+ phys_addr_t phys_addr;
+ size_t size;
+};
+
+#if IS_ENABLED(CONFIG_QCOM_MINIDUMP)
+extern int qcom_minidump_ready(void);
+extern int qcom_minidump_region_register(const struct qcom_minidump_region *entry);
+#else
+static inline int qcom_minidump_ready(void) { return 0; }
+static inline int qcom_minidump_region_register(const struct qcom_minidump_region *entry)
+{
+ /* Return quietly, if minidump is not enabled */
+ return 0;
+}
+#endif
+#endif /* _MINIDUMP_H_ */
diff --git a/include/soc/qcom/qcom_minidump.h b/include/soc/qcom/qcom_minidump.h
index 84c8605..e8630ed 100644
--- a/include/soc/qcom/qcom_minidump.h
+++ b/include/soc/qcom/qcom_minidump.h
@@ -10,11 +10,25 @@

#define MAX_NUM_OF_SS 10
#define MAX_REGION_NAME_LENGTH 16
+
+#define MINIDUMP_REVISION 1
#define SBL_MINIDUMP_SMEM_ID 602
+
+/* Application processor minidump descriptor */
+#define MINIDUMP_APSS_DESC 0
+#define SMEM_ENTRY_SIZE 40
+
#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MINIDUMP_REGION_INVALID ('I' << 24 | 'N' << 16 | 'V' << 8 | 'A' << 0)
+#define MINIDUMP_REGION_INIT ('I' << 24 | 'N' << 16 | 'I' << 8 | 'T' << 0)
+#define MINIDUMP_REGION_NOINIT 0
+
+#define MINIDUMP_SS_ENCR_REQ (0 << 24 | 'Y' << 16 | 'E' << 8 | 'S' << 0)
+#define MINIDUMP_SS_ENCR_NOTREQ (0 << 24 | 0 << 16 | 'N' << 8 | 'R' << 0)
+#define MINIDUMP_SS_ENCR_NONE ('N' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MINIDUMP_SS_ENCR_START ('S' << 24 | 'T' << 16 | 'R' << 8 | 'T' << 0)
#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
-
/**
* struct minidump_region - Minidump region
* @name : Name of the region to be dumped
@@ -63,4 +77,12 @@ struct minidump_global_toc {
struct minidump_subsystem subsystems[MAX_NUM_OF_SS];
};

+#if IS_ENABLED(CONFIG_QCOM_MINIDUMP)
+extern struct minidump_subsystem *qcom_minidump_subsystem_desc(unsigned int minidump_index);
+#else
+static inline struct minidump_subsystem *qcom_minidump_subsystem_desc(unsigned int minidump_index)
+{
+ return NULL;
+}
+#endif
#endif /* _QCOM_MINIDUMP_H_ */
--
2.7.4


2023-02-21 11:27:14

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 5/6] remoterproc: qcom: refactor to leverage exported minidump symbol

qcom_minidump driver provides qcom_minidump_subsystem_desc()
exported API which other driver can use it query subsystem
descriptor. Refactor qcom_minidump() to use this symbol.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/remoteproc/qcom_common.c | 13 ++-----------
1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
index f607e1e..273f92b 100644
--- a/drivers/remoteproc/qcom_common.c
+++ b/drivers/remoteproc/qcom_common.c
@@ -89,19 +89,10 @@ void qcom_minidump(struct rproc *rproc, unsigned int minidump_id)
{
int ret;
struct minidump_subsystem *subsystem;
- struct minidump_global_toc *toc;

- /* Get Global minidump ToC*/
- toc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
-
- /* check if global table pointer exists and init is set */
- if (IS_ERR(toc) || !toc->status) {
- dev_err(&rproc->dev, "Minidump TOC not found in SMEM\n");
+ subsystem = qcom_minidump_subsystem_desc(minidump_id);
+ if (IS_ERR(subsystem))
return;
- }
-
- /* Get subsystem table of contents using the minidump id */
- subsystem = &toc->subsystems[minidump_id];

/**
* Collect minidump if SS ToC is valid and segment table
--
2.7.4


2023-02-21 11:27:22

by Mukesh Ojha

[permalink] [raw]
Subject: [RFC PATCH 6/6] pstore/ram: Register context with minidump

There are system which does not uses pstore directly but
may have the interest in the context saved by pstore.
Register pstore regions with minidump so that it get
dumped on minidump collection.

Signed-off-by: Mukesh Ojha <[email protected]>
---
fs/pstore/ram.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)

diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index ade66db..038da1a 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -20,6 +20,7 @@
#include <linux/compiler.h>
#include <linux/of.h>
#include <linux/of_address.h>
+#include <soc/qcom/minidump.h>

#include "internal.h"
#include "ram_internal.h"
@@ -714,6 +715,74 @@ static int ramoops_parse_dt(struct platform_device *pdev,
return 0;
}

+#if IS_ENABLED(CONFIG_QCOM_MINIDUMP)
+static int ramoops_qcom_minidump_register(struct ramoops_context *cxt)
+{
+ struct qcom_minidump_region pstore_entry;
+ struct persistent_ram_zone *prz;
+ int ret = 0;
+ int i;
+
+ for (i = 0; i < cxt->max_dump_cnt; i++) {
+ prz = cxt->dprzs[i];
+ scnprintf(pstore_entry.name, sizeof(pstore_entry.name),
+ "KDMESG%d", i);
+ pstore_entry.virt_addr = prz->vaddr;
+ pstore_entry.phys_addr = prz->paddr;
+ pstore_entry.size = prz->size;
+ ret = qcom_minidump_region_register(&pstore_entry);
+ if (ret < 0) {
+ pr_err("failed to add dmesg in minidump: err: %d\n", ret);
+ return ret;
+ }
+ }
+
+ if (cxt->console_size) {
+ prz = cxt->cprz;
+ strlcpy(pstore_entry.name, "KCONSOLE", sizeof(pstore_entry.name));
+ pstore_entry.virt_addr = prz->vaddr;
+ pstore_entry.phys_addr = prz->paddr;
+ pstore_entry.size = prz->size;
+ ret = qcom_minidump_region_register(&pstore_entry);
+ if (ret < 0) {
+ pr_err("failed to add console in minidump: err: %d\n", ret);
+ return ret;
+ }
+ }
+
+ for (i = 0; i < cxt->max_ftrace_cnt; i++) {
+ prz = cxt->fprzs[i];
+ scnprintf(pstore_entry.name, sizeof(pstore_entry.name),
+ "KFTRACE%d", i);
+ pstore_entry.virt_addr = prz->vaddr;
+ pstore_entry.phys_addr = prz->paddr;
+ pstore_entry.size = prz->size;
+ ret = qcom_minidump_region_register(&pstore_entry);
+ if (ret < 0) {
+ pr_err("failed to add ftrace in minidump: err: %d\n", ret);
+ return ret;
+ }
+ }
+
+ if (cxt->pmsg_size) {
+ prz = cxt->mprz;
+ strlcpy(pstore_entry.name, "KPMSG", sizeof(pstore_entry.name));
+ pstore_entry.virt_addr = prz->vaddr;
+ pstore_entry.phys_addr = prz->paddr;
+ pstore_entry.size = prz->size;
+ ret = qcom_minidump_region_register(&pstore_entry);
+ if (ret < 0) {
+ pr_err("failed to add pmsg in minidump: err: %d\n", ret);
+ return ret;
+ }
+ }
+
+ return ret;
+}
+#else
+static int ramoops_qcom_minidump_register(struct ramoops_context *cxt) { return 0; }
+#endif
+
static int ramoops_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -724,6 +793,10 @@ static int ramoops_probe(struct platform_device *pdev)
phys_addr_t paddr;
int err = -EINVAL;

+ err = qcom_minidump_ready();
+ if (err && err != -ENODEV)
+ return -EPROBE_DEFER;
+
/*
* Only a single ramoops area allowed at a time, so fail extra
* probes.
@@ -841,6 +914,10 @@ static int ramoops_probe(struct platform_device *pdev)
}
}

+ err = ramoops_qcom_minidump_register(cxt);
+ if (err < 0)
+ goto fail_buf;
+
err = pstore_register(&cxt->pstore);
if (err) {
pr_err("registering with pstore failed\n");
--
2.7.4


2023-02-23 12:38:57

by Brian Masney

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

On Tue, Feb 21, 2023 at 04:55:07PM +0530, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that System on Chip (SoC) or subsystem part of
> SoC crashes, due to a range of hardware and software bugs. Hence, the
> ability to collect accurate data is only a best-effort. The data collected
> could be invalid or corrupted, data collection itself could fail, and so on.
>
> Qualcomm devices in engineering mode provides a mechanism for generating
> full system ramdumps for post mortem debugging. But in some cases it's
> however not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
>
> The core of minidump feature is part of Qualcomm's boot firmware code.
> It initializes shared memory (SMEM), which is a part of DDR and
> allocates a small section of SMEM to minidump table i.e also called
> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
> their own table of segments to be included in the minidump and all get
> their reference from G-ToC. Each segment/region has some details like
> name, physical address and it's size etc. and it could be anywhere
> scattered in the DDR.
>
> Existing upstream Qualcomm remoteproc driver[1] already supports minidump
> feature for remoteproc instances like ADSP, MODEM, ... where predefined
> selective segments of subsystem region can be dumped as part of
> coredump collection which generates smaller size artifacts compared to
> complete coredump of subsystem on crash.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>
> In addition to managing and querying the APSS minidump description,
> the Linux driver maintains a ELF header in a segment. This segment
> gets updated with section/program header whenever a new entry gets
> registered.

I'd like to test this series plus your series that sets the multiple
download modes. Can you include documentation about how to actually use
this new feature? Also the information that you provided above is really
useful. I think that should also go in the documentation file as well.

I already have a reliable way to make a board go BOOM and go into
ramdump mode.

Brian


2023-02-23 19:43:33

by Kees Cook

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] pstore/ram: Register context with minidump

On Tue, Feb 21, 2023 at 04:55:13PM +0530, Mukesh Ojha wrote:
> There are system which does not uses pstore directly but
> may have the interest in the context saved by pstore.
> Register pstore regions with minidump so that it get
> dumped on minidump collection.

Okay, so, this is a really interesting case -- it's a RAM backend that
is already found on a system by pstore via device tree, but there is
_another_ RAM overlay (minidump) that would like to know more about how
the pstore ram backend carves up the memory regions so it can examine
them itself too. (i.e. it's another "interface" like the pstorefs.)

So we need to provide the mapping back to the overlay. It feels to me
like the logic for this needs to live in the minidump driver itself
(rather than in the pstore RAM backend). Specifically, it wants to know
about all the operational frontends (dmesg, console, ftrace, pmsg) with
their virt & phys addresses and size.

The frontends are defined via enum pstore_type_id, and the other values
are "normal" types, so it should be possible to move this logic into
minidump instead, leaving a simpler callback. Perhaps something like:

void pstore_region_defined(enum pstore_type_id, void *virt,
phys_addr_t phys, size_t size);

How the pstore ram backend should know to call this, though, I'm
struggling to find a sensible way. How can it determine if the device
tree region is actually contained by a minidump overlay?

--
Kees Cook

2023-02-24 10:26:09

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] pstore/ram: Register context with minidump

Thanks Kees for checking into this.

On 2/24/2023 1:13 AM, Kees Cook wrote:
> On Tue, Feb 21, 2023 at 04:55:13PM +0530, Mukesh Ojha wrote:
>> There are system which does not uses pstore directly but
>> may have the interest in the context saved by pstore.
>> Register pstore regions with minidump so that it get
>> dumped on minidump collection.
>
> Okay, so, this is a really interesting case -- it's a RAM backend that
> is already found on a system by pstore via device tree, but there is
> _another_ RAM overlay (minidump) that would like to know more about how
> the pstore ram backend carves up the memory regions so it can examine
> them itself too. (i.e. it's another "interface" like the pstorefs.)

It is not exactly like pstorefs where we can mount and check the logs in
next boot.

This interface needs physical address and size of the region to be
conveyed to the boot firmware via the minidump registration and firmware
will dump this region as a blob out of the device on device crash.
Blobs can be either plain text like console logs or some thing which
needs parser for that reason virtual address is required.

>
> So we need to provide the mapping back to the overlay. It feels to me
> like the logic for this needs to live in the minidump driver itself
> (rather than in the pstore RAM backend). Specifically, it wants to know
> about all the operational frontends (dmesg, console, ftrace, pmsg) with
> their virt & phys addresses and size.
>
> The frontends are defined via enum pstore_type_id, and the other values
> are "normal" types, so it should be possible to move this logic into
> minidump instead, leaving a simpler callback. Perhaps something like:
>
> void pstore_region_defined(enum pstore_type_id, void *virt,
> phys_addr_t phys, size_t size);
Thanks, i will check on this.

>
> How the pstore ram backend should know to call this, though, I'm
> struggling to find a sensible way. How can it determine if the device
> tree region is actually contained by a minidump overlay?

I agree, pstore will not have any awareness about minidump adding
something inside pstore does not look good if it can provide API which
provide this information.

One more thing, since minidump is also one of the user of pstore in this
case and it may not need the fixed ram addresses to be mentioned in the
carve out.

Do you think this can be accepted or if not any suggestion

https://lore.kernel.org/lkml/[email protected]/


-Mukesh
>

2023-02-24 10:43:17

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

Thanks Brian for your interest in this series.

On 2/23/2023 6:07 PM, Brian Masney wrote:
> On Tue, Feb 21, 2023 at 04:55:07PM +0530, Mukesh Ojha wrote:
>> Minidump is a best effort mechanism to collect useful and predefined data
>> for first level of debugging on end user devices running on Qualcomm SoCs.
>> It is built on the premise that System on Chip (SoC) or subsystem part of
>> SoC crashes, due to a range of hardware and software bugs. Hence, the
>> ability to collect accurate data is only a best-effort. The data collected
>> could be invalid or corrupted, data collection itself could fail, and so on.
>>
>> Qualcomm devices in engineering mode provides a mechanism for generating
>> full system ramdumps for post mortem debugging. But in some cases it's
>> however not feasible to capture the entire content of RAM. The minidump
>> mechanism provides the means for selecting which snippets should be
>> included in the ramdump.
>>
>> The core of minidump feature is part of Qualcomm's boot firmware code.
>> It initializes shared memory (SMEM), which is a part of DDR and
>> allocates a small section of SMEM to minidump table i.e also called
>> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
>> their own table of segments to be included in the minidump and all get
>> their reference from G-ToC. Each segment/region has some details like
>> name, physical address and it's size etc. and it could be anywhere
>> scattered in the DDR.
>>
>> Existing upstream Qualcomm remoteproc driver[1] already supports minidump
>> feature for remoteproc instances like ADSP, MODEM, ... where predefined
>> selective segments of subsystem region can be dumped as part of
>> coredump collection which generates smaller size artifacts compared to
>> complete coredump of subsystem on crash.
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>>
>> In addition to managing and querying the APSS minidump description,
>> the Linux driver maintains a ELF header in a segment. This segment
>> gets updated with section/program header whenever a new entry gets
>> registered.
>
> I'd like to test this series plus your series that sets the multiple
> download modes.

Sure, you are welcome, but for that you need a device running with
Qualcomm SoC and if it has a upstream support.

Also, testing of this patch needs some minimal out of tree patches and
i can help you with that.

> Can you include documentation about how to actually use
> this new feature?

Will surely do, Since this is still RFC, and i am doubtful on the path
of it in documentation directory.

Also the information that you provided above is really
> useful. I think that should also go in the documentation file as well.
>
> I already have a reliable way to make a board go BOOM and go into
> ramdump mode.

That's very nice to hear; but again if you can specify your target
specification.

-Mukesh
>
> Brian
>

2023-02-24 17:15:09

by Trilok Soni

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

On 2/24/2023 2:40 AM, Mukesh Ojha wrote:
> Thanks Brian for your interest in this series.
>
> On 2/23/2023 6:07 PM, Brian Masney wrote:
>> On Tue, Feb 21, 2023 at 04:55:07PM +0530, Mukesh Ojha wrote:
>>> Minidump is a best effort mechanism to collect useful and predefined
>>> data
>>> for first level of debugging on end user devices running on Qualcomm
>>> SoCs.
>>> It is built on the premise that System on Chip (SoC) or subsystem
>>> part of
>>> SoC crashes, due to a range of hardware and software bugs. Hence, the
>>> ability to collect accurate data is only a best-effort. The data
>>> collected
>>> could be invalid or corrupted, data collection itself could fail, and
>>> so on.
>>>
>>> Qualcomm devices in engineering mode provides a mechanism for generating
>>> full system ramdumps for post mortem debugging. But in some cases it's
>>> however not feasible to capture the entire content of RAM. The minidump
>>> mechanism provides the means for selecting which snippets should be
>>> included in the ramdump.
>>>
>>> The core of minidump feature is part of Qualcomm's boot firmware code.
>>> It initializes shared memory (SMEM), which is a part of DDR and
>>> allocates a small section of SMEM to minidump table i.e also called
>>> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
>>> their own table of segments to be included in the minidump and all get
>>> their reference from G-ToC. Each segment/region has some details like
>>> name, physical address and it's size etc. and it could be anywhere
>>> scattered in the DDR.
>>>
>>> Existing upstream Qualcomm remoteproc driver[1] already supports
>>> minidump
>>> feature for remoteproc instances like ADSP, MODEM, ... where predefined
>>> selective segments of subsystem region can be dumped as part of
>>> coredump collection which generates smaller size artifacts compared to
>>> complete coredump of subsystem on crash.
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>>>
>>> In addition to managing and querying the APSS minidump description,
>>> the Linux driver maintains a ELF header in a segment. This segment
>>> gets updated with section/program header whenever a new entry gets
>>> registered.
>>
>> I'd like to test this series plus your series that sets the multiple
>> download modes.
>
> Sure, you are welcome, but for that you need a device running with
> Qualcomm SoC and if it has a upstream support.
>
> Also, testing of this patch needs some minimal out of tree patches and
> i can help you with that.
>
>> Can you include documentation about how to actually use
>> this new feature?
>
> Will surely do, Since this is still RFC, and i am doubtful on the path
> of it in documentation directory.

This is RFC anyways, you can start w/ the directory which you think best
fits here. The point here is to have the documentation file rather than
path to be fixed.

You can start w/ Documentation/features/debug and let's see what others
have any suggestion. Please add a file in your next revision without
worrying about the path for now.

---Trilok Soni


2023-02-24 19:07:06

by Brian Masney

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

Hi Mukesh,

On Fri, Feb 24, 2023 at 04:10:42PM +0530, Mukesh Ojha wrote:
> On 2/23/2023 6:07 PM, Brian Masney wrote:
> > I'd like to test this series plus your series that sets the multiple
> > download modes.
>
> Sure, you are welcome, but for that you need a device running with Qualcomm
> SoC and if it has a upstream support.

I will be testing this series on a sa8540p (QDrive3 Automotive
Development Board), which has the sc8280xp SoC with good upstream
support. This is also the same board that I have a reliable way to
make the board crash due to a known firmware bug.

> Also, testing of this patch needs some minimal out of tree patches and
> i can help you with that.

Yup, that's fine. Hopefully we can also work to get those dependencies
merged upstream as well.

Brian


2023-02-27 10:16:14

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support



On 2/25/2023 12:36 AM, Brian Masney wrote:
> Hi Mukesh,
>
> On Fri, Feb 24, 2023 at 04:10:42PM +0530, Mukesh Ojha wrote:
>> On 2/23/2023 6:07 PM, Brian Masney wrote:
>>> I'd like to test this series plus your series that sets the multiple
>>> download modes.
>>
>> Sure, you are welcome, but for that you need a device running with Qualcomm
>> SoC and if it has a upstream support.
>
> I will be testing this series on a sa8540p (QDrive3 Automotive
> Development Board), which has the sc8280xp SoC with good upstream
> support. This is also the same board that I have a reliable way to
> make the board crash due to a known firmware bug.
>


Can you try below patch to just select minidump download mode and make
the device crash ?

--------------------------------------->8-------------------------------
diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
index 0d02599..bd8e1a8 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
@@ -280,6 +280,7 @@
firmware {
scm: scm {
compatible = "qcom,scm-sc8280xp", "qcom,scm";
+ qcom,dload-mode = <&tcsr 0x13000>;
};
};

diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index cdbfe54..e1539a2 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -20,7 +20,7 @@

#include "qcom_scm.h"

-static bool download_mode =
IS_ENABLED(CONFIG_QCOM_SCM_DOWNLOAD_MODE_DEFAULT);
+static bool download_mode = true;
module_param(download_mode, bool, 0);

#define SCM_HAS_CORE_CLK BIT(0)
@@ -427,7 +427,7 @@ static void qcom_scm_set_download_mode(bool enable)
ret = __qcom_scm_set_dload_mode(__scm->dev, enable);
} else if (__scm->dload_mode_addr) {
ret = qcom_scm_io_writel(__scm->dload_mode_addr,
- enable ? QCOM_SCM_BOOT_SET_DLOAD_MODE : 0);
+ enable ? 0x20 : 0);
} else {
dev_err(__scm->dev,
"No available mechanism for setting download
mode\n");




>> Also, testing of this patch needs some minimal out of tree patches and
>> i can help you with that.
>
> Yup, that's fine. Hopefully we can also work to get those dependencies
> merged upstream as well.
>
> Brian
>

2023-03-06 15:29:22

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

Friendly review reminder..

-Mukesh

On 2/21/2023 4:55 PM, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that System on Chip (SoC) or subsystem part of
> SoC crashes, due to a range of hardware and software bugs. Hence, the
> ability to collect accurate data is only a best-effort. The data collected
> could be invalid or corrupted, data collection itself could fail, and so on.
>
> Qualcomm devices in engineering mode provides a mechanism for generating
> full system ramdumps for post mortem debugging. But in some cases it's
> however not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
>
> The core of minidump feature is part of Qualcomm's boot firmware code.
> It initializes shared memory (SMEM), which is a part of DDR and
> allocates a small section of SMEM to minidump table i.e also called
> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
> their own table of segments to be included in the minidump and all get
> their reference from G-ToC. Each segment/region has some details like
> name, physical address and it's size etc. and it could be anywhere
> scattered in the DDR.
>
> Existing upstream Qualcomm remoteproc driver[1] already supports minidump
> feature for remoteproc instances like ADSP, MODEM, ... where predefined
> selective segments of subsystem region can be dumped as part of
> coredump collection which generates smaller size artifacts compared to
> complete coredump of subsystem on crash.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>
> In addition to managing and querying the APSS minidump description,
> the Linux driver maintains a ELF header in a segment. This segment
> gets updated with section/program header whenever a new entry gets
> registered.
>
> Patch 1/6 is very trivial change.
> Patch 2/6 moves the minidump specific data structure and macro to
> qcom_minidump.h so that (3/6) minidump driver can use.
> Patch 3/6 implements qualcomm minidump kernel driver and exports
> symbol which other minidump kernel client can use.
> Patch 4/6 enables the qualcomm minidump driver.
> Patch 5/6 Use the exported symbol from minidump driver in qcom_common
> for querying minidump descriptor for a subsystem.
> Patch 6/6 Register pstore region with minidump.
>
> Testing of the patches has been done on sm8450 target with the help
> of out of tree patch which helps to set the download mode and storage
> type(on which dump will be saved) for which i will send separate series.
>
> Mukesh Ojha (6):
> remoteproc: qcom: Expand MD_* as MINIDUMP_*
> remoteproc: qcom: Move minidump specific data to qcom_minidump.h
> soc: qcom: Add Qualcomm minidump kernel driver
> arm64: defconfig: Enable Qualcomm minidump driver
> remoterproc: qcom: refactor to leverage exported minidump symbol
> pstore/ram: Register context with minidump
>
> arch/arm64/configs/defconfig | 1 +
> drivers/remoteproc/qcom_common.c | 75 +-----
> drivers/soc/qcom/Kconfig | 14 ++
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/qcom_minidump.c | 490 +++++++++++++++++++++++++++++++++++++++++
> fs/pstore/ram.c | 77 ++++++
> include/soc/qcom/minidump.h | 40 ++++
> include/soc/qcom/qcom_minidump.h | 88 +++++++
> 8 files changed, 717 insertions(+), 69 deletions(-)
> create mode 100644 drivers/soc/qcom/qcom_minidump.c
> create mode 100644 include/soc/qcom/minidump.h
> create mode 100644 include/soc/qcom/qcom_minidump.h
>

2023-03-06 18:12:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

On Mon, Mar 06, 2023 at 08:58:04PM +0530, Mukesh Ojha wrote:
> Friendly review reminder..

It is a few hours after the merge window closed, please be patient.

And to help out, please review other submissions to reduce the review
load on maintainers. To not do that is just asking for others to do
work for you without any help, right?

thanks,

greg k-h

2023-03-07 17:33:33

by Brian Masney

[permalink] [raw]
Subject: Re: [RFC PATCH 0/6] Add basic Minidump kernel driver support

On Mon, Feb 27, 2023 at 03:45:31PM +0530, Mukesh Ojha wrote:
>
>
> On 2/25/2023 12:36 AM, Brian Masney wrote:
> > Hi Mukesh,
> >
> > On Fri, Feb 24, 2023 at 04:10:42PM +0530, Mukesh Ojha wrote:
> > > On 2/23/2023 6:07 PM, Brian Masney wrote:
> > > > I'd like to test this series plus your series that sets the multiple
> > > > download modes.
> > >
> > > Sure, you are welcome, but for that you need a device running with Qualcomm
> > > SoC and if it has a upstream support.
> >
> > I will be testing this series on a sa8540p (QDrive3 Automotive
> > Development Board), which has the sc8280xp SoC with good upstream
> > support. This is also the same board that I have a reliable way to
> > make the board crash due to a known firmware bug.
> >
>
>
> Can you try below patch to just select minidump download mode and make the
> device crash ?
>
> --------------------------------------->8-------------------------------
> diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> index 0d02599..bd8e1a8 100644
> --- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
> @@ -280,6 +280,7 @@
> firmware {
> scm: scm {
> compatible = "qcom,scm-sc8280xp", "qcom,scm";
> + qcom,dload-mode = <&tcsr 0x13000>;
> };
> };
>
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index cdbfe54..e1539a2 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -20,7 +20,7 @@
>
> #include "qcom_scm.h"
>
> -static bool download_mode =
> IS_ENABLED(CONFIG_QCOM_SCM_DOWNLOAD_MODE_DEFAULT);
> +static bool download_mode = true;
> module_param(download_mode, bool, 0);
>
> #define SCM_HAS_CORE_CLK BIT(0)
> @@ -427,7 +427,7 @@ static void qcom_scm_set_download_mode(bool enable)
> ret = __qcom_scm_set_dload_mode(__scm->dev, enable);
> } else if (__scm->dload_mode_addr) {
> ret = qcom_scm_io_writel(__scm->dload_mode_addr,
> - enable ? QCOM_SCM_BOOT_SET_DLOAD_MODE : 0);
> + enable ? 0x20 : 0);
> } else {
> dev_err(__scm->dev,
> "No available mechanism for setting download
> mode\n");

Hi Mukesh,

I tried to test this series but I don't know how to actually use the
minidump feature that's in this series. Some more documentation is
needed.

I added this series, plus your other series that adds the download modes
to the SCM driver to my tree, along with your changes above. I
downgraded the firmware on my sa8540p and I have my reproducible crash.
Linux immediately loses control and the board firmware takes over.

I assumed that I'd need to do a warm reboot so that DDR contents are
still present so Linux can grab the memory contents on next reboot.
However, 'fastboot devices' shows no devices so I can't reboot that
way. I can do a cold boot but the DDR contents will be lost.

Also this series needs to be rebased against 6.3rc1.

Brian


2023-03-08 20:22:58

by Srinivas Kandagatla

[permalink] [raw]
Subject: Re: [RFC PATCH 3/6] soc: qcom: Add Qualcomm minidump kernel driver



On 21/02/2023 11:25, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined
> data for first level of debugging on end user devices running on
> Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
> or subsystem part of SoC crashes, due to a range of hardware and
> software bugs. Hence, the ability to collect accurate data is only
> a best-effort. The data collected could be invalid or corrupted,
> data collection itself could fail, and so on.
>
> Qualcomm devices in engineering mode provides a mechanism for
> generating full system ramdumps for post mortem debugging. But in some
> cases it's however not feasible to capture the entire content of RAM.
> The minidump mechanism provides the means for selecting region should
> be included in the ramdump. The solution supports extracting the
> ramdump/minidump produced either over USB or stored to an attached
> storage device.
>
> The core of minidump feature is part of Qualcomm's boot firmware code.
> It initializes shared memory(SMEM), which is a part of DDR and
> allocates a small section of it to minidump table i.e also called
> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
> their own table of segments to be included in the minidump, all
> references from a descriptor in SMEM (G-ToC). Each segment/region has
> some details like name, physical address and it's size etc. and it
> could be anywhere scattered in the DDR.
>
> Minidump kernel driver adds the capability to add linux region to be
> dumped as part of ram dump collection. It provides appropriate symbol
> to check its enablement and register client regions.
>
> To simplify post mortem debugging, it creates and maintain an ELF
> header as first region that gets updated with upon registration
> of a new region.
>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> drivers/soc/qcom/Kconfig | 14 ++
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/qcom_minidump.c | 490 +++++++++++++++++++++++++++++++++++++++++
> include/soc/qcom/minidump.h | 40 ++++
> include/soc/qcom/qcom_minidump.h | 24 +-
> 5 files changed, 568 insertions(+), 1 deletion(-)
> create mode 100644 drivers/soc/qcom/qcom_minidump.c
> create mode 100644 include/soc/qcom/minidump.h
>
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index ae504c4..0fc7698 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -248,4 +248,18 @@ config QCOM_ICC_BWMON
> the fixed bandwidth votes from cpufreq (CPU nodes) thus achieve high
> memory throughput even with lower CPU frequencies.
>
> +config QCOM_MINIDUMP
> + bool "QCOM Minidump Support"

Why can't this be a module?


> + depends on ARCH_QCOM || COMPILE_TEST
> + depends on QCOM_SMEM
> + help
> + Enablement of core minidump feature is controlled from boot firmware
> + side, and this config allow linux to query and manages APPS minidump
> + table.
> +
> + Client drivers can register their internal data structures and debug
> + messages as part of the minidump region and when the SoC is crashed,
> + these selective regions will be dumped instead of the entire DDR.
> + This saves significant amount of time and/or storage space.
> +
> endmenu
> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
> index d66604a..e1ff492 100644
> --- a/drivers/soc/qcom/Makefile
> +++ b/drivers/soc/qcom/Makefile
> @@ -29,3 +29,4 @@ obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
> obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
> obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) += kryo-l2-accessors.o
> obj-$(CONFIG_QCOM_ICC_BWMON) += icc-bwmon.o
> +obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
> diff --git a/drivers/soc/qcom/qcom_minidump.c b/drivers/soc/qcom/qcom_minidump.c
> new file mode 100644
> index 0000000..eb63b75
> --- /dev/null
> +++ b/drivers/soc/qcom/qcom_minidump.c
> @@ -0,0 +1,490 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +/*
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#define pr_fmt(fmt) "Minidump: " fmt
> +
> +#include <linux/init.h>
> +#include <linux/export.h>
> +#include <linux/kernel.h>
> +#include <linux/err.h>
> +#include <linux/elf.h>
> +#include <linux/errno.h>
> +#include <linux/string.h>
> +#include <linux/slab.h>
> +#include <linux/of.h>
> +#include <linux/printk.h>
> +#include <linux/soc/qcom/smem.h>
> +#include <soc/qcom/qcom_minidump.h>
> +#include <soc/qcom/minidump.h>
> +
> +/**
> + * DOC: Overview
> + *
> + * +-----------------------------------------------+
> + * | DDR +-------------+ |
> + * | | SS0-ToC| |
> + * | +----------------+ +----------------+ | |
> + * | |Shared memory | | SS1-ToC| | |
> + * | |(SMEM) | | | | |
> + * | | | +-->|--------+ | | |
> + * | |G-ToC | | | SS-ToC \ | | |
> + * | |+-------------+ | | | +-----------+ | | |
> + * | ||-------------| | | | |-----------| | | |
> + * | || SS0-ToC | | | +-|<|SS1 region1| | | |
> + * | ||-------------| | | | | |-----------| | | |
> + * | || SS1-ToC |-|>+ | | |SS1 region2| | | |
> + * | ||-------------| | | | |-----------| | | |
> + * | || SS2-ToC | | | | | ... | | | |
> + * | ||-------------| | | | |-----------| | | |
> + * | || ... | | |-|<|SS1 regionN| | | |
> + * | ||-------------| | | | |-----------| | | |
> + * | || SSn-ToC | | | | +-----------+ | | |
> + * | |+-------------+ | | | | | |
> + * | | | | |----------------| | |
> + * | | | +>| regionN | | |
> + * | | | | |----------------| | |
> + * | +----------------+ | | | | |
> + * | | |----------------| | |
> + * | +>| region1 | | |
> + * | |----------------| | |
> + * | | | | |
> + * | |----------------|-+ |
> + * | | region5 | |
> + * | |----------------| |
> + * | | | |
> + * | Region information +----------------+ |
> + * | +---------------+ |
> + * | |region name | |
> + * | |---------------| |
> + * | |region address | |
> + * | |---------------| |
> + * | |region size | |
> + * | +---------------+ |
> + * +-----------------------------------------------+
> + * G-ToC: Global table of content
> + * SS-ToC: Subsystem table of content
> + * SS0-SSn: Subsystem numbered from 0 to n
> + *
> + * The core of minidump feature is part of Qualcomm's boot firmware code.
> + * It initializes shared memory(SMEM), which is a part of DDR and
> + * allocates a small section of it to minidump table i.e also called
> + * global table of content (G-ToC).
> + *
> + * Each subsystem (APSS, ADSP, ...) has their own table of segments to be
> + * included in the minidump, all references from a descriptor in SMEM
> + * (G-ToC). Each segment/region has some details like name, physical
> + * address and it's size etc. and it could be anywhere scattered
> + * in the DDR.
> + *
> + * Qualcomm SoCs supports extracting the ramdump/minidump produced
> + * either over USB or stored to an attached storage device.
> + *
Are there any userspace tools to parse these dumps?

How can someone collect minidump and use this in upstream setup, do you
have any quick guide to do this?

> + */
> +
> +/**
> + * struct minidump_elfhdr - Minidump table elf header
> + * @ehdr: Elf main header
> + * @shdr: Section header
> + * @phdr: Program header
> + * @elf_offset: Section offset in elf
> + * @strtable_idx: String table current index position
> + */
> +struct minidump_elfhdr {
> + struct elfhdr *ehdr;
> + struct elf_shdr *shdr;
> + struct elf_phdr *phdr;
> + size_t elf_offset;
> + size_t strtable_idx;
> +};
> +
> +/**
> + * struct minidump - Minidump driver private data
> + * @md_gbl_toc : Global TOC pointer
> + * @md_ss_toc : High level OS TOC pointer
> + * @md_regions : High level OS region base pointer
> + * @elf : Minidump elf header
> + */
> +struct minidump {
> + struct minidump_global_toc *md_gbl_toc;

Totally unused.

> + struct minidump_subsystem *md_ss_toc;
> + struct minidump_region *md_regions;
> + struct minidump_elfhdr elf;
> +};
> +
> +/*
> + * In some of the Old Qualcomm devices, boot firmware statically allocates 300
> + * as total number of supported region (including all co-processors) in
> + * minidump table out of which linux was using 200. In future, this limitation
> + * from boot firmware might get removed by allocating the region dynamically.
> + * So, keep it compatible with older devices, we can the current limit for Linux
> + * to 200.
> + */
> +#define MAX_NUM_ENTRIES 200
> +
> +static struct minidump minidump;
> +static DEFINE_MUTEX(minidump_lock);
> +
...

> +/* Update ELF header */
> +static void minidump_update_elf_header(const struct qcom_minidump_region *region)
> +{
> + struct elfhdr *ehdr = minidump.elf.ehdr;
> + struct elf_shdr *shdr;
> + struct elf_phdr *phdr;
> +
> + shdr = elf_shdr_entry_addr(ehdr, ehdr->e_shnum++);
> + phdr = elf_phdr_entry_addr(ehdr, ehdr->e_phnum++);
> +
> + shdr->sh_type = SHT_PROGBITS;
> + shdr->sh_name = append_str_to_strtable(region->name);
> + shdr->sh_addr = (elf_addr_t)region->virt_addr;
> + shdr->sh_size = region->size;
> + shdr->sh_flags = SHF_WRITE;
> + shdr->sh_offset = minidump.elf.elf_offset;
> + shdr->sh_entsize = 0;
> +
> + phdr->p_type = PT_LOAD;
> + phdr->p_offset = minidump.elf.elf_offset;
> + phdr->p_vaddr = (elf_addr_t)region->virt_addr;
> + phdr->p_paddr = region->phys_addr;
> + phdr->p_filesz = phdr->p_memsz = region->size;
> + phdr->p_flags = PF_R | PF_W;
> + minidump.elf.elf_offset += shdr->sh_size;
> +}
> +
> +/* Add region in subsystem ToC */
> +static void minidump_ss_add_region(const struct qcom_minidump_region *region)
> +{
> + struct minidump_region *mdr;
> + unsigned int region_cnt = le32_to_cpu(minidump.md_ss_toc->region_count);
> +
> + mdr = &minidump.md_regions[region_cnt];
> + strscpy(mdr->name, region->name, sizeof(mdr->name));
> + mdr->address = cpu_to_le64(region->phys_addr);
> + mdr->size = cpu_to_le64(region->size);
> + mdr->valid = cpu_to_le32(MINIDUMP_REGION_VALID);
> + region_cnt++;
> + minidump.md_ss_toc->region_count = cpu_to_le32(region_cnt);
> +}
> +
> +static bool qcom_minidump_valid_region(const struct qcom_minidump_region *region)
> +{
> + return region &&
> + strnlen(region->name, MAX_NAME_LENGTH) < MAX_NAME_LENGTH &&
> + region->virt_addr &&
> + region->size &&minidump

mindump is static variable, why are we checking for this?

> + IS_ALIGNED(region->size, 4);

This function looks very much unreadable, can we rearrage this.

> +}
> +
> +#define MAX_STRTBL_SIZE (MAX_NUM_ENTRIES * MAX_REGION_NAME_LENGTH)

please move to the top.

> +static int minidump_add_elf_header(void)
> +{
> + struct qcom_minidump_region elfregion;
> + struct elfhdr *ehdr;
> + struct elf_shdr *shdr;
> + struct elf_phdr *phdr;
> + unsigned int elfh_size;
> + unsigned int strtbl_off;
> + unsigned int phdr_off;
> + char *banner;
> + unsigned int banner_len;
> +
> + banner_len = strlen(linux_banner);
> + /*
> + * Header buffer contains:
> + * ELF header, (MAX_NUM_ENTRIES + 4) of Section and Program ELF headers,
> + * where, 4 additional entries, one for empty header, one for string table
> + * one for minidump table and one for linux banner.
> + *
> + * Linux banner is stored in minidump to aid post mortem tools to determine
> + * the kernel version.
> + */
> + elfh_size = sizeof(*ehdr);
> + elfh_size += MAX_STRTBL_SIZE;
> + elfh_size += banner_len + 1;
> + elfh_size += ((sizeof(*shdr) + sizeof(*phdr)) * (MAX_NUM_ENTRIES + 4));
> + elfh_size = ALIGN(elfh_size, 4);
> +
> + minidump.elf.ehdr = kzalloc(elfh_size, GFP_KERNEL);

when will this memory freed?

> + if (!minidump.elf.ehdr)
> + return -ENOMEM;
> +
> + /* Register ELF header as first region */
> + strscpy(elfregion.name, "KELF_HEADER", sizeof(elfregion.name));
> + elfregion.virt_addr = minidump.elf.ehdr;
> + elfregion.phys_addr = virt_to_phys(minidump.elf.ehdr);
> + elfregion.size = elfh_size;
> + minidump_ss_add_region(&elfregion);
> +
> + ehdr = minidump.elf.ehdr;
> + /* Assign Section/Program headers offset */
> + minidump.elf.shdr = shdr = (struct elf_shdr *)(ehdr + 1);
> + minidump.elf.phdr = phdr = (struct elf_phdr *)(shdr + MAX_NUM_ENTRIES);
> + phdr_off = sizeof(*ehdr) + (sizeof(*shdr) * MAX_NUM_ENTRIES);
> +
> + memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
> + ehdr->e_ident[EI_CLASS] = ELF_CLASS;
> + ehdr->e_ident[EI_DATA] = ELF_DATA;
> + ehdr->e_ident[EI_VERSION] = EV_CURRENT;
> + ehdr->e_ident[EI_OSABI] = ELF_OSABI;
> + ehdr->e_type = ET_CORE;
> + ehdr->e_machine = ELF_ARCH;
> + ehdr->e_version = EV_CURRENT;
> + ehdr->e_ehsize = sizeof(*ehdr);
> + ehdr->e_phoff = phdr_off;
> + ehdr->e_phentsize = sizeof(*phdr);
> + ehdr->e_shoff = sizeof(*ehdr);
> + ehdr->e_shentsize = sizeof(*shdr);
> + ehdr->e_shstrndx = 1;
> +
> + minidump.elf.elf_offset = elfh_size;
> +
> + /*
> + * The zeroth index of the section header is reserved and is rarely used.
> + * Set the section header as null (SHN_UNDEF) and move to the next one.
> + * 2nd Section is String table.
> + */
> + minidump.elf.strtable_idx = 1;
> + strtbl_off = sizeof(*ehdr) + ((sizeof(*phdr) + sizeof(*shdr)) * MAX_NUM_ENTRIES);
> + shdr++;
> + shdr->sh_type = SHT_STRTAB;
> + shdr->sh_offset = (elf_addr_t)strtbl_off;
> + shdr->sh_size = MAX_STRTBL_SIZE;
> + shdr->sh_entsize = 0;
> + shdr->sh_flags = 0;
> + shdr->sh_name = append_str_to_strtable("STR_TBL");
> + shdr++;
> +
> + /* 3rd Section is for Minidump_table VA, used by parsers */
> + shdr->sh_type = SHT_PROGBITS;
> + shdr->sh_entsize = 0;
> + shdr->sh_flags = 0;
> + shdr->sh_addr = (elf_addr_t)&minidump;
> + shdr->sh_name = append_str_to_strtable("minidump_table");
> + shdr++;
> +
> + /* 4th Section is Linux banner */
> + banner = (char *)ehdr + strtbl_off + MAX_STRTBL_SIZE;
> + memcpy(banner, linux_banner, banner_len);
> +
> + shdr->sh_type = SHT_PROGBITS;
> + shdr->sh_offset = (elf_addr_t)(strtbl_off + MAX_STRTBL_SIZE);
> + shdr->sh_size = banner_len + 1;
> + shdr->sh_addr = (elf_addr_t)linux_banner;
> + shdr->sh_entsize = 0;
> + shdr->sh_flags = SHF_WRITE;
> + shdr->sh_name = append_str_to_strtable("linux_banner");
> +
> + phdr->p_type = PT_LOAD;
> + phdr->p_offset = (elf_addr_t)(strtbl_off + MAX_STRTBL_SIZE);
> + phdr->p_vaddr = (elf_addr_t)linux_banner;
> + phdr->p_paddr = virt_to_phys(linux_banner);
> + phdr->p_filesz = phdr->p_memsz = banner_len + 1;
> + phdr->p_flags = PF_R | PF_W;
> +
> + /* Update headers count*/
> + ehdr->e_phnum = 1;
> + ehdr->e_shnum = 4;
Can we define these magic constants.

nit, a new line before return would be nice.

> + return 0;
> +}
> +
> +static int qcom_minidump_init(void)
> +{
> + struct minidump_subsystem *mdsstoc;
> +
> + mdsstoc = qcom_minidump_subsystem_desc(MINIDUMP_APSS_DESC);
> + if (IS_ERR(mdsstoc))
> + return -EINVAL;
> +
> + minidump.md_regions = kcalloc(MAX_NUM_ENTRIES,
> + sizeof(struct minidump_region), GFP_KERNEL);
> + if (!minidump.md_regions)
> + return -ENOMEM;
> +
> + minidump.md_ss_toc = mdsstoc;
> + /* Share memory table update */
> + mdsstoc->regions_baseptr = cpu_to_le64(virt_to_phys(minidump.md_regions));
> + /* Tell bootloader not to encrypt the regions of this subsystem */
> + mdsstoc->encryption_status = cpu_to_le32(MINIDUMP_SS_ENCR_DONE);
> + mdsstoc->encryption_required = cpu_to_le32(MINIDUMP_SS_ENCR_NOTREQ);
> +
> + mdsstoc->enabled = cpu_to_le32(MINIDUMP_SS_ENABLED);
> + mdsstoc->status = cpu_to_le32(1);
> + mdsstoc->region_count = cpu_to_le32(0);
> +
> + return 0;
> +}
> +
> +/**
> + * qcom_minidump_ready - Check, if minidump is ready for client registration or not.
> + *
> + * Return: zero on success and negative on failure.
> + *
> + * Qualcomm minidump feature is dependent on Qualcomm's shared memory and it is
> + * possible for a arm64 target to not have it's device tree entry present, for
> + * such case, qcom_minidump_ready() returns -ENODEV and client should not further
> + * try to register their region with minidump driver.
> + */
> +int qcom_minidump_ready(void)
> +{
> + void *ptr;
> + struct device_node *np;
> + static bool is_smem_available = true;
> +
> + if (!is_smem_available || !(np = of_find_compatible_node(NULL, NULL, "qcom,smem"))) {

just check for dt node here does not mean that smem device is available,
you should probably check if the device is avaliable aswell using
of_device_is_available()


We should proabably return -EPROBEDEFER incase the node is present and
device is not present.


> + is_smem_available = false;
> + return -ENODEV;
> + }
> +
> + of_node_put(np);

<--
> + ptr = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
> + if (IS_ERR(ptr))
> + return PTR_ERR(ptr);
-->

If we are already checking for global toc here, why not just set it in
minidump
minidump.md_gbl_toc = qcom_smem_get(QCOM_SMEM_HOST_ANY,
SBL_MINIDUMP_SMEM_ID, &size);
...

and then stop calling qcom_smem_get to get global toc on every call to
qcom_minidump_subsystem_desc()


> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(qcom_minidump_ready);
> +
> +/**
> + * qcom_minidump_subsystem_desc() - Get minidump subsystem descriptor.
> + * @minidump_index: minidump index for a subsystem in minidump table
> + *
> + * Return: minidump subsystem descriptor address on success and error
> + * on failure
> + */
> +struct minidump_subsystem *qcom_minidump_subsystem_desc(unsigned int minidump_index)
> +{
> + struct minidump_global_toc *mdgtoc;
> + size_t size;
> +
> + mdgtoc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, &size);
> + if (IS_ERR(mdgtoc)) {
> + pr_err("Unable to find minidump smem item\n");
> + return ERR_CAST(mdgtoc);
> + }
> +
> + if (size < sizeof(*mdgtoc) || !mdgtoc->status) {
> + pr_err("Minidump table is not initialized\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + return &mdgtoc->subsystems[minidump_index];

once we fix qcom_minidump_ready() with the suggestion then this call will be

struct minidump_subsystem *qcom_minidump_subsystem_desc(..)
{
return &minidump.md_gbl_toc->subsystems[minidump_index];
}

> +}
> +EXPORT_SYMBOL_GPL(qcom_minidump_subsystem_desc);
> +
> +/**
> + * qcom_minidump_region_register() - Register a region in Minidump table.
> + * @region: minidump region.
> + *
> + * Client should not call this directly instead first call qcom_minidump_ready()
> + * to check if minidump is ready to do registration if yes, then call this API.
> + *
> + * Return: On success, it returns region index in minidump table and negative
> + * error value on failure.
> + */
> +int qcom_minidump_region_register(const struct qcom_minidump_region *region)
> +{
> + static bool minidump_init_done;
why not move this type of static variables to struct minidump.

> + unsigned int num_region;
> + int ret;

we should check qcom_minidump_ready() has been called either by setting
a flag in struct minidump and return early on in case its not ready.


> +
> + /* Initialize minidump context on first call */
> + mutex_lock(&minidump_lock);
> + if (!minidump_init_done) {
> + ret = qcom_minidump_init();
> + if (ret)
> + goto unlock;
> +
> + minidump_init_done = true;
> + /* First entry would be ELF header */
> + ret = minidump_add_elf_header();
> + if (ret) {
> + kfree(minidump.md_regions);

should we not reset minidump_init_done = false;
or move
minidump_init_done = true;
to end of this loop.

> + goto unlock;
> + }
> + }
> +
> + if (!qcom_minidump_valid_region(region)) {
> + ret = -EINVAL;
> + goto unlock;
> + }
> +
> + ret = get_minidump_region_index(region);
> + if (ret >= 0) {
> + pr_info("%s region is already registered\n", region->name);
> + ret = -EEXIST;
> + goto unlock;
> + }
> +
> + /* Check if there is a room for a new entry */
> + ret = num_region = le32_to_cpu(minidump.md_ss_toc->region_count);
> + if (num_region >= MAX_NUM_ENTRIES) {
> + pr_err("maximum region limit %u reached\n", num_region);
> + ret = -ENOSPC;
> + goto unlock;
> + }
> +
> + minidump_ss_add_region(region);
> + minidump_update_elf_header(region);
> +unlock:
> + mutex_unlock(&minidump_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(qcom_minidump_region_register);


2023-03-08 20:50:32

by Konrad Dybcio

[permalink] [raw]
Subject: Re: [RFC PATCH 3/6] soc: qcom: Add Qualcomm minidump kernel driver



On 8.03.2023 21:22, Srinivas Kandagatla wrote:
>
>
> On 21/02/2023 11:25, Mukesh Ojha wrote:
>> Minidump is a best effort mechanism to collect useful and predefined
>> data for first level of debugging on end user devices running on
>> Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
>> or subsystem part of SoC crashes, due to a range of hardware and
>> software bugs. Hence, the ability to collect accurate data is only
>> a best-effort. The data collected could be invalid or corrupted,
>> data collection itself could fail, and so on.
>>
>> Qualcomm devices in engineering mode provides a mechanism for
>> generating full system ramdumps for post mortem debugging. But in some
>> cases it's however not feasible to capture the entire content of RAM.
>> The minidump mechanism provides the means for selecting region should
>> be included in the ramdump. The solution supports extracting the
>> ramdump/minidump produced either over USB or stored to an attached
>> storage device.
>>
>> The core of minidump feature is part of Qualcomm's boot firmware code.
>> It initializes shared memory(SMEM), which is a part of DDR and
>> allocates a small section of it to minidump table i.e also called
>> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
>> their own table of segments to be included in the minidump, all
>> references from a descriptor in SMEM (G-ToC). Each segment/region has
>> some details like name, physical address and it's size etc. and it
>> could be anywhere scattered in the DDR.
>>
>> Minidump kernel driver adds the capability to add linux region to be
>> dumped as part of ram dump collection. It provides appropriate symbol
>> to check its enablement and register client regions.
>>
>> To simplify post mortem debugging, it creates and maintain an ELF
>> header as first region that gets updated with upon registration
>> of a new region.
>>
>> Signed-off-by: Mukesh Ojha <[email protected]>
>> ---
[...]

>> +int qcom_minidump_ready(void)
>> +{
>> +    void *ptr;
>> +    struct device_node *np;
>> +    static bool is_smem_available = true;
>> +
>> +    if (!is_smem_available || !(np = of_find_compatible_node(NULL, NULL, "qcom,smem"))) {
>
> just check for dt node here does not mean that smem device is available, you should probably check if the device is avaliable aswell using of_device_is_available()
>
>
> We should proabably return -EPROBEDEFER incase the node is present and device is not present.
qcom_smem_get() seems to handle -EPROBE_DEFER internally, so this check
may be entirely redundant.

Konrad

2023-03-15 15:10:19

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 3/6] soc: qcom: Add Qualcomm minidump kernel driver

Thanks Srinivas for the review and suggestions.

I will try to address most of the comments in the next version.

There is one thing, i would like to highlight and seeking suggestion
on doing the right way to do it. We wanted any core kernel clients
should be able to register with minidump; say for example to collect
sched info/memstats/irqstats and it could be any debug related data
which could be helpful for postmortem debugging.

But for that I could not add qcom_minidump_ready() and
qcom_minidump_add_region() in core kernel.

One possible way to cache the registration of the
client in minidump driver and do all the registration once minidump
is ready.


Any suggestion ?

On 3/9/2023 1:52 AM, Srinivas Kandagatla wrote:
>
>
> On 21/02/2023 11:25, Mukesh Ojha wrote:
>> Minidump is a best effort mechanism to collect useful and predefined
>> data for first level of debugging on end user devices running on
>> Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
>> or subsystem part of SoC crashes, due to a range of hardware and
>> software bugs. Hence, the ability to collect accurate data is only
>> a best-effort. The data collected could be invalid or corrupted,
>> data collection itself could fail, and so on.
>>
>> Qualcomm devices in engineering mode provides a mechanism for
>> generating full system ramdumps for post mortem debugging. But in some
>> cases it's however not feasible to capture the entire content of RAM.
>> The minidump mechanism provides the means for selecting region should
>> be included in the ramdump. The solution supports extracting the
>> ramdump/minidump produced either over USB or stored to an attached
>> storage device.
>>
>> The core of minidump feature is part of Qualcomm's boot firmware code.
>> It initializes shared memory(SMEM), which is a part of DDR and
>> allocates a small section of it to minidump table i.e also called
>> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
>> their own table of segments to be included in the minidump, all
>> references from a descriptor in SMEM (G-ToC). Each segment/region has
>> some details like name, physical address and it's size etc. and it
>> could be anywhere scattered in the DDR.
>>
>> Minidump kernel driver adds the capability to add linux region to be
>> dumped as part of ram dump collection. It provides appropriate symbol
>> to check its enablement and register client regions.
>>
>> To simplify post mortem debugging, it creates and maintain an ELF
>> header as first region that gets updated with upon registration
>> of a new region.
>>
>> Signed-off-by: Mukesh Ojha <[email protected]>
>> ---
>>   drivers/soc/qcom/Kconfig         |  14 ++
>>   drivers/soc/qcom/Makefile        |   1 +
>>   drivers/soc/qcom/qcom_minidump.c | 490
>> +++++++++++++++++++++++++++++++++++++++++
>>   include/soc/qcom/minidump.h      |  40 ++++
>>   include/soc/qcom/qcom_minidump.h |  24 +-
>>   5 files changed, 568 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/soc/qcom/qcom_minidump.c
>>   create mode 100644 include/soc/qcom/minidump.h
>>
>> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
>> index ae504c4..0fc7698 100644
>> --- a/drivers/soc/qcom/Kconfig
>> +++ b/drivers/soc/qcom/Kconfig
>> @@ -248,4 +248,18 @@ config QCOM_ICC_BWMON
>>         the fixed bandwidth votes from cpufreq (CPU nodes) thus
>> achieve high
>>         memory throughput even with lower CPU frequencies.
>> +config QCOM_MINIDUMP
>> +    bool "QCOM Minidump Support"
>
> Why can't this be a module?

Yes, it can be..

>
>
>> +    depends on ARCH_QCOM || COMPILE_TEST
>> +    depends on QCOM_SMEM
>> +    help
>> +      Enablement of core minidump feature is controlled from boot
>> firmware
>> +      side, and this config allow linux to query and manages APPS
>> minidump
>> +      table.
>> +
>> +      Client drivers can register their internal data structures and
>> debug
>> +      messages as part of the minidump region and when the SoC is
>> crashed,
>> +      these selective regions will be dumped instead of the entire DDR.
>> +      This saves significant amount of time and/or storage space.
>> +
>>   endmenu
>> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
>> index d66604a..e1ff492 100644
>> --- a/drivers/soc/qcom/Makefile
>> +++ b/drivers/soc/qcom/Makefile
>> @@ -29,3 +29,4 @@ obj-$(CONFIG_QCOM_RPMHPD) += rpmhpd.o
>>   obj-$(CONFIG_QCOM_RPMPD) += rpmpd.o
>>   obj-$(CONFIG_QCOM_KRYO_L2_ACCESSORS) +=    kryo-l2-accessors.o
>>   obj-$(CONFIG_QCOM_ICC_BWMON)    += icc-bwmon.o
>> +obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
>> diff --git a/drivers/soc/qcom/qcom_minidump.c
>> b/drivers/soc/qcom/qcom_minidump.c
>> new file mode 100644
>> index 0000000..eb63b75
>> --- /dev/null
>> +++ b/drivers/soc/qcom/qcom_minidump.c
>> @@ -0,0 +1,490 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +
>> +/*
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights
>> reserved.
>> + */
>> +
>> +#define pr_fmt(fmt) "Minidump: " fmt
>> +
>> +#include <linux/init.h>
>> +#include <linux/export.h>
>> +#include <linux/kernel.h>
>> +#include <linux/err.h>
>> +#include <linux/elf.h>
>> +#include <linux/errno.h>
>> +#include <linux/string.h>
>> +#include <linux/slab.h>
>> +#include <linux/of.h>
>> +#include <linux/printk.h>
>> +#include <linux/soc/qcom/smem.h>
>> +#include <soc/qcom/qcom_minidump.h>
>> +#include <soc/qcom/minidump.h>
>> +
>> +/**
>> + * DOC: Overview
>> + *
>> + *   +-----------------------------------------------+
>> + *   |   DDR                       +-------------+   |
>> + *   |                             |      SS0-ToC|   |
>> + *   | +----------------+     +----------------+ |   |
>> + *   | |Shared memory   |     |         SS1-ToC| |   |
>> + *   | |(SMEM)          |     |                | |   |
>> + *   | |                | +-->|--------+       | |   |
>> + *   | |G-ToC           | |   | SS-ToC  \      | |   |
>> + *   | |+-------------+ | |   | +-----------+  | |   |
>> + *   | ||-------------| | |   | |-----------|  | |   |
>> + *   | || SS0-ToC     | | | +-|<|SS1 region1|  | |   |
>> + *   | ||-------------| | | | | |-----------|  | |   |
>> + *   | || SS1-ToC     |-|>+ | | |SS1 region2|  | |   |
>> + *   | ||-------------| |   | | |-----------|  | |   |
>> + *   | || SS2-ToC     | |   | | |  ...      |  | |   |
>> + *   | ||-------------| |   | | |-----------|  | |   |
>> + *   | ||  ...        | |   |-|<|SS1 regionN|  | |   |
>> + *   | ||-------------| |   | | |-----------|  | |   |
>> + *   | || SSn-ToC     | |   | | +-----------+  | |   |
>> + *   | |+-------------+ |   | |                | |   |
>> + *   | |                |   | |----------------| |   |
>> + *   | |                |   +>|  regionN       | |   |
>> + *   | |                |   | |----------------| |   |
>> + *   | +----------------+   | |                | |   |
>> + *   |                      | |----------------| |   |
>> + *   |                      +>|  region1       | |   |
>> + *   |                        |----------------| |   |
>> + *   |                        |                | |   |
>> + *   |                        |----------------|-+   |
>> + *   |                        |  region5       |     |
>> + *   |                        |----------------|     |
>> + *   |                        |                |     |
>> + *   |  Region information    +----------------+     |
>> + *   | +---------------+                             |
>> + *   | |region name    |                             |
>> + *   | |---------------|                             |
>> + *   | |region address |                             |
>> + *   | |---------------|                             |
>> + *   | |region size    |                             |
>> + *   | +---------------+                             |
>> + *   +-----------------------------------------------+
>> + *       G-ToC: Global table of content
>> + *       SS-ToC: Subsystem table of content
>> + *       SS0-SSn: Subsystem numbered from 0 to n
>> + *
>> + * The core of minidump feature is part of Qualcomm's boot firmware
>> code.
>> + * It initializes shared memory(SMEM), which is a part of DDR and
>> + * allocates a small section of it to minidump table i.e also called
>> + * global table of content (G-ToC).
>> + *
>> + * Each subsystem (APSS, ADSP, ...) has their own table of segments
>> to be
>> + * included in the minidump, all references from a descriptor in SMEM
>> + * (G-ToC). Each segment/region has some details like name, physical
>> + * address and it's size etc. and it could be anywhere scattered
>> + * in the DDR.
>> + *
>> + * Qualcomm SoCs supports extracting the ramdump/minidump produced
>> + * either over USB or stored to an attached storage device.
>> + *
> Are there any userspace tools to parse these dumps?
>
> How can someone collect minidump and use this in upstream setup, do you
> have any quick guide to do this?

So, will try to add in v2.

>
>> + */
>> +
>> +/**
>> + * struct minidump_elfhdr - Minidump table elf header
>> + * @ehdr: Elf main header
>> + * @shdr: Section header
>> + * @phdr: Program header
>> + * @elf_offset: Section offset in elf
>> + * @strtable_idx: String table current index position
>> + */
>> +struct minidump_elfhdr {
>> +    struct elfhdr        *ehdr;
>> +    struct elf_shdr        *shdr;
>> +    struct elf_phdr        *phdr;
>> +    size_t            elf_offset;
>> +    size_t            strtable_idx;
>> +};
>> +
>> +/**
>> + * struct minidump - Minidump driver private data
>> + * @md_gbl_toc     : Global TOC pointer
>> + * @md_ss_toc     : High level OS TOC pointer
>> + * @md_regions     : High level OS region base pointer
>> + * @elf         : Minidump elf header
>> + */
>> +struct minidump {
>> +    struct minidump_global_toc    *md_gbl_toc;
>
> Totally unused.

Yes, unused at this point, let me use this with your below suggestion.

>
>> +    struct minidump_subsystem    *md_ss_toc;
>> +    struct minidump_region        *md_regions;
>> +    struct minidump_elfhdr        elf;
>> +};
>> +
>> +/*
>> + * In some of the Old Qualcomm devices, boot firmware statically
>> allocates 300
>> + * as total number of supported region (including all co-processors) in
>> + * minidump table out of which linux was using 200. In future, this
>> limitation
>> + * from boot firmware might get removed by allocating the region
>> dynamically.
>> + * So, keep it compatible with older devices, we can the current
>> limit for Linux
>> + * to 200.
>> + */
>> +#define MAX_NUM_ENTRIES      200
>> +
>> +static struct minidump minidump;
>> +static DEFINE_MUTEX(minidump_lock);
>> +
> ...
>
>> +/* Update ELF header */
>> +static void minidump_update_elf_header(const struct
>> qcom_minidump_region *region)
>> +{
>> +    struct elfhdr *ehdr = minidump.elf.ehdr;
>> +    struct elf_shdr *shdr;
>> +    struct elf_phdr *phdr;
>> +
>> +    shdr = elf_shdr_entry_addr(ehdr, ehdr->e_shnum++);
>> +    phdr = elf_phdr_entry_addr(ehdr, ehdr->e_phnum++);
>> +
>> +    shdr->sh_type = SHT_PROGBITS;
>> +    shdr->sh_name = append_str_to_strtable(region->name);
>> +    shdr->sh_addr = (elf_addr_t)region->virt_addr;
>> +    shdr->sh_size = region->size;
>> +    shdr->sh_flags = SHF_WRITE;
>> +    shdr->sh_offset = minidump.elf.elf_offset;
>> +    shdr->sh_entsize = 0;
>> +
>> +    phdr->p_type = PT_LOAD;
>> +    phdr->p_offset = minidump.elf.elf_offset;
>> +    phdr->p_vaddr = (elf_addr_t)region->virt_addr;
>> +    phdr->p_paddr = region->phys_addr;
>> +    phdr->p_filesz = phdr->p_memsz = region->size;
>> +    phdr->p_flags = PF_R | PF_W;
>> +    minidump.elf.elf_offset += shdr->sh_size;
>> +}
>> +
>> +/* Add region in subsystem ToC */
>> +static void minidump_ss_add_region(const struct qcom_minidump_region
>> *region)
>> +{
>> +    struct minidump_region *mdr;
>> +    unsigned int region_cnt =
>> le32_to_cpu(minidump.md_ss_toc->region_count);
>> +
>> +    mdr = &minidump.md_regions[region_cnt];
>> +    strscpy(mdr->name, region->name, sizeof(mdr->name));
>> +    mdr->address = cpu_to_le64(region->phys_addr);
>> +    mdr->size = cpu_to_le64(region->size);
>> +    mdr->valid = cpu_to_le32(MINIDUMP_REGION_VALID);
>> +    region_cnt++;
>> +    minidump.md_ss_toc->region_count = cpu_to_le32(region_cnt);
>> +}
>> +
>> +static bool qcom_minidump_valid_region(const struct
>> qcom_minidump_region *region)
>> +{
>> +    return region &&
>> +        strnlen(region->name, MAX_NAME_LENGTH) < MAX_NAME_LENGTH &&
>> +        region->virt_addr &&
>> +        region->size &&minidump
>
> mindump is static variable, why are we checking for this?

This is a mistake from my side, it was not there in my original patch.
Thanks for catching this.

>
>> +        IS_ALIGNED(region->size, 4);
>
> This function looks very much unreadable, can we rearrage this.
>
>> +}
>> +
>> +#define MAX_STRTBL_SIZE       (MAX_NUM_ENTRIES * MAX_REGION_NAME_LENGTH)
>
> please move to the top.

>
>> +static int minidump_add_elf_header(void)
>> +{
>> +    struct qcom_minidump_region elfregion;
>> +    struct elfhdr *ehdr;
>> +    struct elf_shdr *shdr;
>> +    struct elf_phdr *phdr;
>> +    unsigned int  elfh_size;
>> +    unsigned int strtbl_off;
>> +    unsigned int phdr_off;
>> +    char *banner;
>> +    unsigned int banner_len;
>> +
>> +    banner_len = strlen(linux_banner);
>> +    /*
>> +     * Header buffer contains:
>> +     * ELF header, (MAX_NUM_ENTRIES + 4) of Section and Program ELF
>> headers,
>> +     * where, 4 additional entries, one for empty header, one for
>> string table
>> +     * one for minidump table and one for linux banner.
>> +     *
>> +     * Linux banner is stored in minidump to aid post mortem tools to
>> determine
>> +     * the kernel version.
>> +     */
>> +    elfh_size = sizeof(*ehdr);
>> +    elfh_size += MAX_STRTBL_SIZE;
>> +    elfh_size += banner_len + 1;
>> +    elfh_size += ((sizeof(*shdr) + sizeof(*phdr)) * (MAX_NUM_ENTRIES
>> + 4));
>> +    elfh_size = ALIGN(elfh_size, 4);
>> +
>> +    minidump.elf.ehdr = kzalloc(elfh_size, GFP_KERNEL);
>
> when will this memory freed?

This memory should not be freed. Since this will be first region and it
will be updated on each minidump registration.

>
>> +    if (!minidump.elf.ehdr)
>> +        return -ENOMEM;
>> +
>> +    /* Register ELF header as first region */
>> +    strscpy(elfregion.name, "KELF_HEADER", sizeof(elfregion.name));
>> +    elfregion.virt_addr = minidump.elf.ehdr;
>> +    elfregion.phys_addr = virt_to_phys(minidump.elf.ehdr);
>> +    elfregion.size = elfh_size;
>> +    minidump_ss_add_region(&elfregion);
>> +
>> +    ehdr = minidump.elf.ehdr;
>> +    /* Assign Section/Program headers offset */
>> +    minidump.elf.shdr = shdr = (struct elf_shdr *)(ehdr + 1);
>> +    minidump.elf.phdr = phdr = (struct elf_phdr *)(shdr +
>> MAX_NUM_ENTRIES);
>> +    phdr_off = sizeof(*ehdr) + (sizeof(*shdr) * MAX_NUM_ENTRIES);
>> +
>> +    memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
>> +    ehdr->e_ident[EI_CLASS] = ELF_CLASS;
>> +    ehdr->e_ident[EI_DATA] = ELF_DATA;
>> +    ehdr->e_ident[EI_VERSION] = EV_CURRENT;
>> +    ehdr->e_ident[EI_OSABI] = ELF_OSABI;
>> +    ehdr->e_type = ET_CORE;
>> +    ehdr->e_machine  = ELF_ARCH;
>> +    ehdr->e_version = EV_CURRENT;
>> +    ehdr->e_ehsize = sizeof(*ehdr);
>> +    ehdr->e_phoff = phdr_off;
>> +    ehdr->e_phentsize = sizeof(*phdr);
>> +    ehdr->e_shoff = sizeof(*ehdr);
>> +    ehdr->e_shentsize = sizeof(*shdr);
>> +    ehdr->e_shstrndx = 1;
>> +
>> +    minidump.elf.elf_offset = elfh_size;
>> +
>> +    /*
>> +     * The zeroth index of the section header is reserved and is
>> rarely used.
>> +     * Set the section header as null (SHN_UNDEF) and move to the
>> next one.
>> +     * 2nd Section is String table.
>> +     */
>> +    minidump.elf.strtable_idx = 1;
>> +    strtbl_off = sizeof(*ehdr) + ((sizeof(*phdr) + sizeof(*shdr)) *
>> MAX_NUM_ENTRIES);
>> +    shdr++;
>> +    shdr->sh_type = SHT_STRTAB;
>> +    shdr->sh_offset = (elf_addr_t)strtbl_off;
>> +    shdr->sh_size = MAX_STRTBL_SIZE;
>> +    shdr->sh_entsize = 0;
>> +    shdr->sh_flags = 0;
>> +    shdr->sh_name = append_str_to_strtable("STR_TBL");
>> +    shdr++;
>> +
>> +    /* 3rd Section is for Minidump_table VA, used by parsers */
>> +    shdr->sh_type = SHT_PROGBITS;
>> +    shdr->sh_entsize = 0;
>> +    shdr->sh_flags = 0;
>> +    shdr->sh_addr = (elf_addr_t)&minidump;
>> +    shdr->sh_name = append_str_to_strtable("minidump_table");
>> +    shdr++;
>> +
>> +    /* 4th Section is Linux banner */
>> +    banner = (char *)ehdr + strtbl_off + MAX_STRTBL_SIZE;
>> +    memcpy(banner, linux_banner, banner_len);
>> +
>> +    shdr->sh_type = SHT_PROGBITS;
>> +    shdr->sh_offset = (elf_addr_t)(strtbl_off + MAX_STRTBL_SIZE);
>> +    shdr->sh_size = banner_len + 1;
>> +    shdr->sh_addr = (elf_addr_t)linux_banner;
>> +    shdr->sh_entsize = 0;
>> +    shdr->sh_flags = SHF_WRITE;
>> +    shdr->sh_name = append_str_to_strtable("linux_banner");
>> +
>> +    phdr->p_type = PT_LOAD;
>> +    phdr->p_offset = (elf_addr_t)(strtbl_off + MAX_STRTBL_SIZE);
>> +    phdr->p_vaddr = (elf_addr_t)linux_banner;
>> +    phdr->p_paddr = virt_to_phys(linux_banner);
>> +    phdr->p_filesz = phdr->p_memsz = banner_len + 1;
>> +    phdr->p_flags = PF_R | PF_W;
>> +
>> +    /* Update headers count*/
>> +    ehdr->e_phnum = 1;
>> +    ehdr->e_shnum = 4;
> Can we define these magic constants.

Sure, would need to find some proper name.

>
> nit, a new line before return would be nice.

Ok.

>
>> +    return 0;
>> +}
>> +
>> +static int qcom_minidump_init(void)
>> +{
>> +    struct minidump_subsystem *mdsstoc;
>> +
>> +    mdsstoc = qcom_minidump_subsystem_desc(MINIDUMP_APSS_DESC);
>> +    if (IS_ERR(mdsstoc))
>> +        return -EINVAL;
>> +
>> +    minidump.md_regions = kcalloc(MAX_NUM_ENTRIES,
>> +                  sizeof(struct minidump_region), GFP_KERNEL);
>> +    if (!minidump.md_regions)
>> +        return -ENOMEM;
>> +
>> +    minidump.md_ss_toc = mdsstoc;
>> +    /* Share memory table update */
>> +    mdsstoc->regions_baseptr =
>> cpu_to_le64(virt_to_phys(minidump.md_regions));
>> +    /* Tell bootloader not to encrypt the regions of this subsystem */
>> +    mdsstoc->encryption_status = cpu_to_le32(MINIDUMP_SS_ENCR_DONE);
>> +    mdsstoc->encryption_required = cpu_to_le32(MINIDUMP_SS_ENCR_NOTREQ);
>> +
>> +    mdsstoc->enabled = cpu_to_le32(MINIDUMP_SS_ENABLED);
>> +    mdsstoc->status = cpu_to_le32(1);
>> +    mdsstoc->region_count = cpu_to_le32(0);
>> +
>> +    return 0;
>> +}
>> +
>> +/**
>> + * qcom_minidump_ready - Check, if minidump is ready for client
>> registration or not.
>> + *
>> + * Return: zero on success and negative on failure.
>> + *
>> + * Qualcomm minidump feature is dependent on Qualcomm's shared memory
>> and it is
>> + * possible for a arm64 target to not have it's device tree entry
>> present, for
>> + * such case, qcom_minidump_ready() returns -ENODEV and client should
>> not further
>> + * try to register their region with minidump driver.
>> + */
>> +int qcom_minidump_ready(void)
>> +{
>> +    void *ptr;
>> +    struct device_node *np;
>> +    static bool is_smem_available = true;
>> +
>> +    if (!is_smem_available || !(np = of_find_compatible_node(NULL,
>> NULL, "qcom,smem"))) {
>
> just check for dt node here does not mean that smem device is available,
> you should probably check if the device is avaliable aswell using
> of_device_is_available()
>
>
> We should proabably return -EPROBEDEFER incase the node is present and
> device is not present.

Since, this is an RFC, i may be wrong with the initial idea of asking
every client to do qcom_minidump_ready() before actual registration.

Also, deferring a driver will not look good atleast for any core kernel
client, say for e.g why should ramoops driver probe defer itself for
minidump which does not depends on minidump for its core functionality.

>
>
>> +        is_smem_available = false;
>> +        return -ENODEV;
>> +    }
>> +
>> +    of_node_put(np);
>
> <--
>> +    ptr = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, NULL);
>> +    if (IS_ERR(ptr))
>> +        return PTR_ERR(ptr);
> -->
>
> If we are already checking for global toc here, why not just set it in
> minidump
> minidump.md_gbl_toc = qcom_smem_get(QCOM_SMEM_HOST_ANY,
> SBL_MINIDUMP_SMEM_ID, &size);
> ...
>
> and then stop calling qcom_smem_get to get global toc on every call to
> qcom_minidump_subsystem_desc()
>

Let me assign this on successful call to qcom_smem_get() and call to
qcom_smem_get() if !minidump.md_gbl_toc, since qcom_minidump() call
from driver/remoteproc/qcom_common.c also uses this and it is
independent to this driver but same in calling qcom_smem_get() for
global toc, so they should get advantage from each other.

>
>> +
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(qcom_minidump_ready);
>> +
>> +/**
>> + * qcom_minidump_subsystem_desc() - Get minidump subsystem descriptor.
>> + * @minidump_index: minidump index for a subsystem in minidump table
>> + *
>> + * Return: minidump subsystem descriptor address on success and error
>> + * on failure
>> + */
>> +struct minidump_subsystem *qcom_minidump_subsystem_desc(unsigned int
>> minidump_index)
>> +{
>> +    struct minidump_global_toc *mdgtoc;
>> +    size_t size;
>> +
>> +    mdgtoc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID,
>> &size);
>> +    if (IS_ERR(mdgtoc)) {
>> +        pr_err("Unable to find minidump smem item\n");
>> +        return ERR_CAST(mdgtoc);
>> +    }
>> +
>> +    if (size < sizeof(*mdgtoc) || !mdgtoc->status) {
>> +        pr_err("Minidump table is not initialized\n");
>> +        return ERR_PTR(-EINVAL);
>> +    }
>> +
>> +    return &mdgtoc->subsystems[minidump_index];
>
> once we fix qcom_minidump_ready() with the suggestion then this call
> will be
>
> struct minidump_subsystem *qcom_minidump_subsystem_desc(..)
> {
>     return &minidump.md_gbl_toc->subsystems[minidump_index];

Will do as i described above, let me know if you don't agree..

> }
>
>> +}
>> +EXPORT_SYMBOL_GPL(qcom_minidump_subsystem_desc);
>> +
>> +/**
>> + * qcom_minidump_region_register() - Register a region in Minidump
>> table.
>> + * @region: minidump region.
>> + *
>> + * Client should not call this directly instead first call
>> qcom_minidump_ready()
>> + * to check if minidump is ready to do registration if yes, then call
>> this API.
>> + *
>> + * Return: On success, it returns region index in minidump table and
>> negative
>> + * error value on failure.
>> + */
>> +int qcom_minidump_region_register(const struct qcom_minidump_region
>> *region)
>> +{
>> +    static bool minidump_init_done;
> why not move this type of static variables to struct minidump.

Sure, will do.

>
>> +    unsigned int num_region;
>> +    int ret;
>
> we should check qcom_minidump_ready() has been called either by setting
> a flag in struct minidump  and return early on in case its not ready.

Agree, that will bind qcom_minidump_ready() and
qcom_minidump_region_register() together.

>
>
>> +
>> +    /* Initialize minidump context on first call */
>> +    mutex_lock(&minidump_lock);
>> +    if (!minidump_init_done) {
>> +        ret = qcom_minidump_init();
>> +        if (ret)
>> +            goto unlock;
>> +
>> +        minidump_init_done = true;
>> +        /* First entry would be ELF header */
>> +        ret = minidump_add_elf_header();
>> +        if (ret) {
>> +            kfree(minidump.md_regions);
>
> should we not reset minidump_init_done = false;
> or move
> minidump_init_done = true;
> to end of this loop.

Moving it after successful call to minidump_add_elf_header()
is better.

-- Mukesh
>
>> +            goto unlock;
>> +        }
>> +    }
>> +
>> +    if (!qcom_minidump_valid_region(region)) {
>> +        ret = -EINVAL;
>> +        goto unlock;
>> +    }
>> +
>> +    ret = get_minidump_region_index(region);
>> +    if (ret >= 0) {
>> +        pr_info("%s region is already registered\n", region->name);
>> +        ret = -EEXIST;
>> +        goto unlock;
>> +    }
>> +
>> +    /* Check if there is a room for a new entry */
>> +    ret = num_region = le32_to_cpu(minidump.md_ss_toc->region_count);
>> +    if (num_region >= MAX_NUM_ENTRIES) {
>> +        pr_err("maximum region limit %u reached\n", num_region);
>> +        ret = -ENOSPC;
>> +        goto unlock;
>> +    }
>> +
>> +    minidump_ss_add_region(region);
>> +    minidump_update_elf_header(region);
>> +unlock:
>> +    mutex_unlock(&minidump_lock);
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(qcom_minidump_region_register);
>


2023-03-15 15:24:05

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 3/6] soc: qcom: Add Qualcomm minidump kernel driver

Thanks for the review and comment.

On 3/9/2023 2:20 AM, Konrad Dybcio wrote:
>
>
> On 8.03.2023 21:22, Srinivas Kandagatla wrote:
>>
>>
>> On 21/02/2023 11:25, Mukesh Ojha wrote:
>>> Minidump is a best effort mechanism to collect useful and predefined
>>> data for first level of debugging on end user devices running on
>>> Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
>>> or subsystem part of SoC crashes, due to a range of hardware and
>>> software bugs. Hence, the ability to collect accurate data is only
>>> a best-effort. The data collected could be invalid or corrupted,
>>> data collection itself could fail, and so on.
>>>
>>> Qualcomm devices in engineering mode provides a mechanism for
>>> generating full system ramdumps for post mortem debugging. But in some
>>> cases it's however not feasible to capture the entire content of RAM.
>>> The minidump mechanism provides the means for selecting region should
>>> be included in the ramdump. The solution supports extracting the
>>> ramdump/minidump produced either over USB or stored to an attached
>>> storage device.
>>>
>>> The core of minidump feature is part of Qualcomm's boot firmware code.
>>> It initializes shared memory(SMEM), which is a part of DDR and
>>> allocates a small section of it to minidump table i.e also called
>>> global table of content (G-ToC). Each subsystem (APSS, ADSP, ...) has
>>> their own table of segments to be included in the minidump, all
>>> references from a descriptor in SMEM (G-ToC). Each segment/region has
>>> some details like name, physical address and it's size etc. and it
>>> could be anywhere scattered in the DDR.
>>>
>>> Minidump kernel driver adds the capability to add linux region to be
>>> dumped as part of ram dump collection. It provides appropriate symbol
>>> to check its enablement and register client regions.
>>>
>>> To simplify post mortem debugging, it creates and maintain an ELF
>>> header as first region that gets updated with upon registration
>>> of a new region.
>>>
>>> Signed-off-by: Mukesh Ojha <[email protected]>
>>> ---
> [...]
>
>>> +int qcom_minidump_ready(void)
>>> +{
>>> +    void *ptr;
>>> +    struct device_node *np;
>>> +    static bool is_smem_available = true;
>>> +
>>> +    if (!is_smem_available || !(np = of_find_compatible_node(NULL, NULL, "qcom,smem"))) {
>>
>> just check for dt node here does not mean that smem device is available, you should probably check if the device is avaliable aswell using of_device_is_available()
>>
>>
>> We should proabably return -EPROBEDEFER incase the node is present and device is not present.
> qcom_smem_get() seems to handle -EPROBE_DEFER internally, so this check
> may be entirely redundant.

The main idea behind checking qcom,smem availability is to not stop
client(core kernel) probe if the smem dt node is not present at all
and qcom_smem_get() will always return -EPROBE_DEFER for such cases.

For e.g: Again i am taking ramoops example which seems relevant for
this case, ramoops driver should still probe and not defer forever
if the smem node is not available at all for non-qcom device tree.

-Mukesh
>
> Konrad

2023-03-21 16:14:15

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [RFC PATCH 6/6] pstore/ram: Register context with minidump



On 2/24/2023 1:13 AM, Kees Cook wrote:
> On Tue, Feb 21, 2023 at 04:55:13PM +0530, Mukesh Ojha wrote:
>> There are system which does not uses pstore directly but
>> may have the interest in the context saved by pstore.
>> Register pstore regions with minidump so that it get
>> dumped on minidump collection.
>
> Okay, so, this is a really interesting case -- it's a RAM backend that
> is already found on a system by pstore via device tree, but there is
> _another_ RAM overlay (minidump) that would like to know more about how
> the pstore ram backend carves up the memory regions so it can examine
> them itself too. (i.e. it's another "interface" like the pstorefs.)
>
> So we need to provide the mapping back to the overlay. It feels to me
> like the logic for this needs to live in the minidump driver itself
> (rather than in the pstore RAM backend). Specifically, it wants to know
> about all the operational frontends (dmesg, console, ftrace, pmsg) with
> their virt & phys addresses and size.
>
> The frontends are defined via enum pstore_type_id, and the other values
> are "normal" types, so it should be possible to move this logic into
> minidump instead, leaving a simpler callback. Perhaps something like:
>
> void pstore_region_defined(enum pstore_type_id, void *virt,
> phys_addr_t phys, size_t size);
>
> How the pstore ram backend should know to call this, though, I'm
> struggling to find a sensible way. How can it determine if the device
> tree region is actually contained by a minidump overlay?


Do you think, if qcom_minidump_ready() can be used which checks minidump
readiness ?

-Mukesh
>