2023-06-28 12:51:09

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

Minidump is a best effort mechanism to collect useful and predefined data
for first level of debugging on end user devices running on Qualcomm SoCs.
It is built on the premise that System on Chip (SoC) or subsystem part of
SoC crashes, due to a range of hardware and software bugs. Hence, the
ability to collect accurate data is only a best-effort. The data collected
could be invalid or corrupted, data collection itself could fail, and so on.

Qualcomm devices in engineering mode provides a mechanism for generating
full system ramdumps for post mortem debugging. But in some cases it's
however not feasible to capture the entire content of RAM. The minidump
mechanism provides the means for selecting which snippets should be
included in the ramdump.

Minidump kernel driver implementation is divided into two parts for
simplicity, one is minidump core which can also be called minidump
frontend(As API gets exported from this driver for registration with
backend) and the other part is minidump backend i.e, where the underlying
implementation of minidump will be there. There could be different way
how the backend is implemented like Shared memory, Memory mapped IO
or Resource manager(gunyah) based where the guest region information is
passed to hypervisor via hypercalls.

Minidump Client-1 Client-2 Client-5 Client-n
| | | |
| | ... | ... |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| +---+--------------+----+ |
+-----------+ qcom_minidump(core) +--------+
| |
+------+-----+------+---+
| | |
| | |
+---------------+ | +--------------------+
| | |
| | |
| | |
v v v
+-------------------+ +-------------------+ +------------------+
|qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
| | | | | |
+-------------------+ +-------------------+ +------------------+
Shared memory Memory mapped IO Resource manager
(backend) (backend) (backend)


Here, we will be giving all analogy of backend with SMEM as it is the
only implemented backend at present but general idea remains the same.

The core of SMEM based minidump feature is part of Qualcomm's boot
firmware code. It initializes shared memory (SMEM), which is a part of
DDR and allocates a small section of SMEM to minidump table i.e also
called global table of content (G-ToC). Each subsystem (APSS, ADSP, ...)
has their own table of segments to be included in the minidump and all
get their reference from G-ToC. Each segment/region has some details
like name, physical address and it's size etc. and it could be anywhere
scattered in the DDR.

Existing upstream Qualcomm remoteproc driver[1] already supports SMEM
based minidump feature for remoteproc instances like ADSP, MODEM, ...
where predefined selective segments of subsystem region can be dumped
as part of coredump collection which generates smaller size artifacts
compared to complete coredump of subsystem on crash.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142

In addition to managing and querying the APSS minidump description,
the Linux driver maintains a ELF header in a segment. This segment
gets updated with section/program header whenever a new entry gets
registered.


docs: qcom: Add qualcomm minidump guide
kallsyms: Export kallsyms_lookup_name
soc: qcom: Add qcom_minidump_smem module
soc: qcom: Add Qualcomm APSS minidump (frontend) feature support
soc: qcom: Add linux minidump smem backend driver support
soc: qcom: minidump: Add pending region registration support
soc: qcom: minidump: Add update region support
dt-bindings: reserved-memory: Add qcom,ramoops binding
pstore/ram : Export ramoops_parse_dt() symbol
soc: qcom: Add qcom's pstore minidump driver support
soc: qcom: Register pstore frontend region with minidump
remoteproc: qcom: Expand MD_* as MINIDUMP_*
remoterproc: qcom: refactor to leverage exported minidump symbol
MAINTAINERS: Add entry for minidump driver related support
arm64: defconfig: Enable Qualcomm Minidump related drivers
arm64: dts: qcom: sm8450: Add Qualcomm ramoops minidump node
firmware: qcom_scm: provide a read-modify-write function
pinctrl: qcom: Use qcom_scm_io_update_field()
firmware: scm: Modify only the download bits in TCSR register
firmware: qcom_scm: Refactor code to support multiple download mode
firmware: qcom_scm: Add multiple download mode support

Patch 1/21 is qualcomm minidump document
Patch 2/21 will export kallsyms_lookup_name will be needed for minidump module
Patch 3/21 moves the minidump specific data structure and macro to
qcom_minidump_smem.c and later 13/21 will use the API and remove
minidump specific code to qcom_minidump_smem file.
Patch 4/21 is qualcomm minidump core(frontend) driver
Patch 5/21 implements qualcomm smem backend kernel driver
Patch 6/21 add pending region support for the clients who came for
registration before minidump.
Patch 7/21 add update region support for registered clients.
Patch 8/21 Add dt-binding for qualcomm ramoops driver which is also a minidump client driver
Patch 9/21 exported symbol from ramoops driver to avoid copy of the code.
Patch 10/21 Add qcom's pstore minidump driver support which adds ramoops platform device
and 11/21 register existing pstore frontend regions.
Patch 12/21 and 13/21 does some clean up and code reuse.
Patch 16/21 enable qcom_ramoops driver for sm8450
Patch 17-21 are not new and has already been through 6 versions and
reason of adding here is for minidump testing purpose and it will be rebased
automatically along with new version of minidump series.

Testing of the patches has been done on sm8450 target after enabling config like
CONFIG_PSTORE_RAM and CONFIG_PSTORE_CONSOLE and once the device boots up.
Try crashing it via devmem2 0xf11c000(this is known to create xpu violation and
and put the device in download mode) on command prompt.

I have added download patch here numbered from 14/18 to 18/18
Earlier download mode setting patches were sent separately
https://lore.kernel.org/lkml/[email protected]/

Default storage type is set to via USB, so minidump would be downloaded with the
help of x86_64 machine (running PCAT tool) attached to Qualcomm device which has
backed minidump boot firmware support (more can be found patch 3/18)

Below patch [1] is to warm reset Qualcomm device which has upstream qcom
watchdog driver support.

After applying all patches, we can boot the device and can execute
following command.

echo mini > /sys/module/qcom_scm/parameters/download_mode
echo c > /proc/sysrq-trigger

This will make the device go to download mode and collect the minidump on to the
attached x86 machine running the Qualcomm PCAT tool(This comes as part Qualcomm
package manager kit).
After that we will see a bunch of predefined registered region as binary blobs files
starts with md_* downloaded on the x86 machine on given location in PCAT tool from
the target device.

A sample client example to dump a linux region has been given in patch 3/18 and as
well as can be seen in patch 12/18.

[1]
--------------------------->8-------------------------------------

commit f1124ccebd47550b4c9627aa162d9cdceba2b76f
Author: Mukesh Ojha <[email protected]>
Date: Thu Mar 16 14:08:35 2023 +0530

do not merge: watchdog bite on panic

Signed-off-by: Mukesh Ojha <[email protected]>

diff --git a/drivers/watchdog/qcom-wdt.c b/drivers/watchdog/qcom-wdt.c
index 0d2209c..767e84a 100644
--- a/drivers/watchdog/qcom-wdt.c
+++ b/drivers/watchdog/qcom-wdt.c
@@ -12,6 +12,7 @@
#include <linux/platform_device.h>
#include <linux/watchdog.h>
#include <linux/of_device.h>
+#include <linux/panic.h>

enum wdt_reg {
WDT_RST,
@@ -114,12 +115,28 @@ static int qcom_wdt_set_pretimeout(struct watchdog_device *wdd,
return qcom_wdt_start(wdd);
}

+static void qcom_wdt_bite_on_panic(struct qcom_wdt *wdt)
+{
+ writel(0, wdt_addr(wdt, WDT_EN));
+ writel(1, wdt_addr(wdt, WDT_BITE_TIME));
+ writel(1, wdt_addr(wdt, WDT_RST));
+ writel(QCOM_WDT_ENABLE, wdt_addr(wdt, WDT_EN));
+
+ wmb();
+
+ while(1)
+ udelay(1);
+}
+
static int qcom_wdt_restart(struct watchdog_device *wdd, unsigned long action,
void *data)
{
struct qcom_wdt *wdt = to_qcom_wdt(wdd);
u32 timeout;

+ if (in_panic)
+ qcom_wdt_bite_on_panic(wdt);
+
/*
* Trigger watchdog bite:
* Setup BITE_TIME to be 128ms, and enable WDT.
diff --git a/include/linux/panic.h b/include/linux/panic.h
index 979b776..f913629 100644
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -22,6 +22,7 @@ extern int panic_on_oops;
extern int panic_on_unrecovered_nmi;
extern int panic_on_io_nmi;
extern int panic_on_warn;
+extern bool in_panic;

extern unsigned long panic_on_taint;
extern bool panic_on_taint_nousertaint;
diff --git a/kernel/panic.c b/kernel/panic.c
index 487f5b0..714f7f4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -65,6 +65,8 @@ static unsigned int warn_limit __read_mostly;

int panic_timeout = CONFIG_PANIC_TIMEOUT;
EXPORT_SYMBOL_GPL(panic_timeout);
+bool in_panic = false;
+EXPORT_SYMBOL_GPL(in_panic);

#define PANIC_PRINT_TASK_INFO 0x00000001
#define PANIC_PRINT_MEM_INFO 0x00000002
@@ -261,6 +263,7 @@ void panic(const char *fmt, ...)
int old_cpu, this_cpu;
bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;

+ in_panic = true;
if (panic_on_warn) {
/*
* This thread may hit another WARN() in the panic path.
--------------------------------------------------------------------------

Changes in v4:
- Redesigned the driver and divided the driver into front end and backend (smem) so
that any new backend can be attached easily to avoid code duplication.
- Patch reordering as per the driver and subsystem to easier review of the code.
- Removed minidump specific code from remoteproc to minidump smem based driver.
- Enabled the all the driver as modules.
- Address comments made on documentation and yaml and Device tree file [Krzysztof/Konrad]
- Address comments made qcom_pstore_minidump driver and given its Device tree
same set of properties as ramoops. [Luca/Kees]
- Added patch for MAINTAINER file.
- Include defconfig change as one patch as per [Krzysztof] suggestion.
- Tried to remove the redundant file scope variables from the module as per [Krzysztof] suggestion.
- Addressed comments made on dload mode patch v6 version
https://lore.kernel.org/lkml/[email protected]/

Changes in v3: https://lore.kernel.org/lkml/[email protected]/
- Addressed most of the comments by Srini on v2 and refactored the minidump driver.
- Added platform device support
- Unregister region support.
- Added update region for clients.
- Added pending region support.
- Modified the documentation guide accordingly.
- Added qcom_pstore_ramdump client driver which happen to add ramoops platform
device and also registers ramoops region with minidump.
- Added download mode patch series with this minidump series.
https://lore.kernel.org/lkml/[email protected]/

Changes in v2: https://lore.kernel.org/lkml/[email protected]/
- Addressed review comment made by [quic_tsoni/bmasney] to add documentation.
- Addressed comments made by [srinivas.kandagatla]
- Dropped pstore 6/6 from the last series, till i get conclusion to get pstore
region in minidump.
- Fixed issue reported by kernel test robot.

Changes in v1: https://lore.kernel.org/lkml/[email protected]/

Mukesh Ojha (21):
docs: qcom: Add qualcomm minidump guide
kallsyms: Export kallsyms_lookup_name
soc: qcom: Add qcom_minidump_smem module
soc: qcom: Add Qualcomm APSS minidump (frontend) feature support
soc: qcom: Add linux minidump smem backend driver support
soc: qcom: minidump: Add pending region registration support
soc: qcom: minidump: Add update region support
dt-bindings: reserved-memory: Add qcom,ramoops binding
pstore/ram : Export ramoops_parse_dt() symbol
soc: qcom: Add qcom's pstore minidump driver support
soc: qcom: Register pstore frontend region with minidump
remoteproc: qcom: Expand MD_* as MINIDUMP_*
remoterproc: qcom: refactor to leverage exported minidump symbol
MAINTAINERS: Add entry for minidump driver related support
arm64: defconfig: Enable Qualcomm Minidump related drivers
arm64: dts: qcom: sm8450: Add Qualcomm ramoops minidump node
firmware: qcom_scm: provide a read-modify-write function
pinctrl: qcom: Use qcom_scm_io_update_field()
firmware: scm: Modify only the download bits in TCSR register
firmware: qcom_scm: Refactor code to support multiple download mode
firmware: qcom_scm: Add multiple download mode support

Documentation/admin-guide/index.rst | 1 +
Documentation/admin-guide/qcom_minidump.rst | 293 +++++++++++
.../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++
MAINTAINERS | 15 +
arch/arm64/boot/dts/qcom/sm8450.dtsi | 12 +
arch/arm64/configs/defconfig | 4 +
drivers/firmware/Kconfig | 11 -
drivers/firmware/qcom_scm.c | 85 ++-
drivers/pinctrl/qcom/pinctrl-msm.c | 12 +-
drivers/remoteproc/qcom_common.c | 142 +----
drivers/soc/qcom/Kconfig | 39 ++
drivers/soc/qcom/Makefile | 3 +
drivers/soc/qcom/qcom_minidump.c | 582 +++++++++++++++++++++
drivers/soc/qcom/qcom_minidump_internal.h | 98 ++++
drivers/soc/qcom/qcom_minidump_smem.c | 387 ++++++++++++++
drivers/soc/qcom/qcom_pstore_minidump.c | 210 ++++++++
drivers/soc/qcom/smem.c | 9 +
fs/pstore/ram.c | 26 +-
include/linux/firmware/qcom/qcom_scm.h | 2 +
include/linux/pstore_ram.h | 2 +
include/soc/qcom/qcom_minidump.h | 64 +++
kernel/kallsyms.c | 2 +-
22 files changed, 1973 insertions(+), 152 deletions(-)
create mode 100644 Documentation/admin-guide/qcom_minidump.rst
create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
create mode 100644 drivers/soc/qcom/qcom_minidump.c
create mode 100644 drivers/soc/qcom/qcom_minidump_internal.h
create mode 100644 drivers/soc/qcom/qcom_minidump_smem.c
create mode 100644 drivers/soc/qcom/qcom_pstore_minidump.c
create mode 100644 include/soc/qcom/qcom_minidump.h

--
2.7.4



2023-06-28 12:51:35

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 21/21] firmware: qcom_scm: Add multiple download mode support

Currently, scm driver only supports full dump when download
mode is selected. Add support to enable minidump as well as
enable it along with fulldump.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/firmware/qcom_scm.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 946cb0f76a17..52e3b613a1af 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -31,6 +31,8 @@ static u32 download_mode;
#define SCM_HAS_BUS_CLK BIT(2)

#define QCOM_DOWNLOAD_FULLDUMP 0x1
+#define QCOM_DOWNLOAD_MINIDUMP 0x2
+#define QCOM_DOWNLOAD_BOTHDUMP (QCOM_DOWNLOAD_FULLDUMP | QCOM_DOWNLOAD_MINIDUMP)
#define QCOM_DOWNLOAD_NODUMP 0x0
#define QCOM_DOWNLOAD_MODE_SHIFT 4
#define QCOM_DOWNLOAD_MODE_MASK 0x30
@@ -85,6 +87,8 @@ static const char * const qcom_scm_convention_names[] = {
static const char * const download_mode_name[] = {
[QCOM_DOWNLOAD_NODUMP] = "off",
[QCOM_DOWNLOAD_FULLDUMP] = "full",
+ [QCOM_DOWNLOAD_MINIDUMP] = "mini",
+ [QCOM_DOWNLOAD_BOTHDUMP] = "full,mini",
};

static struct qcom_scm *__scm;
@@ -1465,7 +1469,7 @@ static const struct kernel_param_ops download_mode_param_ops = {

module_param_cb(download_mode, &download_mode_param_ops, NULL, 0644);
MODULE_PARM_DESC(download_mode,
- "download mode: off/full are acceptable values");
+ "download mode: off/full/mini/full,mini are acceptable values");

static int qcom_scm_probe(struct platform_device *pdev)
{
--
2.7.4


2023-06-28 12:51:35

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 01/21] docs: qcom: Add qualcomm minidump guide

Add the qualcomm minidump guide for the users which
tries to cover the dependency and the way to test
and collect minidump on Qualcomm supported platforms.

Signed-off-by: Mukesh Ojha <[email protected]>
---
Documentation/admin-guide/index.rst | 1 +
Documentation/admin-guide/qcom_minidump.rst | 293 ++++++++++++++++++++++++++++
2 files changed, 294 insertions(+)
create mode 100644 Documentation/admin-guide/qcom_minidump.rst

diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 43ea35613dfc..251d070486c2 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -120,6 +120,7 @@ configure specific aspects of kernel behavior to your liking.
perf-security
pm/index
pnp
+ qcom_minidump
rapidio
ras
rtc
diff --git a/Documentation/admin-guide/qcom_minidump.rst b/Documentation/admin-guide/qcom_minidump.rst
new file mode 100644
index 000000000000..a3a8cfee4555
--- /dev/null
+++ b/Documentation/admin-guide/qcom_minidump.rst
@@ -0,0 +1,293 @@
+Qualcomm Minidump Feature
+=========================
+
+Introduction
+------------
+
+Minidump is a best effort mechanism to collect useful and predefined
+data for first level of debugging on end user devices running on
+Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
+or subsystem part of SoC crashes, due to a range of hardware and
+software bugs. Hence, the ability to collect accurate data is only
+a best-effort. The data collected could be invalid or corrupted, data
+collection itself could fail, and so on.
+
+Qualcomm devices in engineering mode provides a mechanism for generating
+full system RAM dumps for post-mortem debugging. But in some cases it's
+however not feasible to capture the entire content of RAM. The minidump
+mechanism provides the means for selecting region should be included in
+the ramdump.
+
+::
+
+ +-----------------------------------------------+
+ | DDR +-------------+ |
+ | | SS0-ToC| |
+ | +----------------+ +----------------+ | |
+ | |Shared memory | | SS1-ToC| | |
+ | |(SMEM) | | | | |
+ | | | +-->|--------+ | | |
+ | |G-ToC | | | SS-ToC \ | | |
+ | |+-------------+ | | | +-----------+ | | |
+ | ||-------------| | | | |-----------| | | |
+ | || SS0-ToC | | | +-|<|SS1 region1| | | |
+ | ||-------------| | | | | |-----------| | | |
+ | || SS1-ToC |-|>+ | | |SS1 region2| | | |
+ | ||-------------| | | | |-----------| | | |
+ | || SS2-ToC | | | | | ... | | | |
+ | ||-------------| | | | |-----------| | | |
+ | || ... | | |-|<|SS1 regionN| | | |
+ | ||-------------| | | | |-----------| | | |
+ | || SSn-ToC | | | | +-----------+ | | |
+ | |+-------------+ | | | | | |
+ | | | | |----------------| | |
+ | | | +>| regionN | | |
+ | | | | |----------------| | |
+ | +----------------+ | | | | |
+ | | |----------------| | |
+ | +>| region1 | | |
+ | |----------------| | |
+ | | | | |
+ | |----------------|-+ |
+ | | region5 | |
+ | |----------------| |
+ | | | |
+ | Region information +----------------+ |
+ | +---------------+ |
+ | |region name | |
+ | |---------------| |
+ | |region address | |
+ | |---------------| |
+ | |region size | |
+ | +---------------+ |
+ +-----------------------------------------------+
+ G-ToC: Global table of contents
+ SS-ToC: Subsystem table of contents
+ SS0-SSn: Subsystem numbered from 0 to n
+
+It depends on targets how the underlying hardware taking care of the
+implementation part for minidump like above diagram is for shared
+memory and it is possible that this could be implemented via memory
+mapped regions but the general idea remain same.
+
+In this document, SMEM will be used as the backend implementation of
+minidump.
+
+SMEM as backend
+----------------
+
+The core of minidump feature is part of Qualcomm's boot firmware code.
+It initializes shared memory (SMEM), which is a part of DDR and
+allocates a small section of it to minidump table, i.e. also called
+global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...) has
+its own table of segments to be included in the minidump, all
+references from a descriptor in SMEM (G-ToC). Each segment/region has
+some details like name, physical address and its size etc. and it
+could be anywhere scattered in the DDR.
+
+Minidump kernel driver concept
+------------------------------
+::
+
+ Minidump Client-1 Client-2 Client-5 Client-n
+ | | | |
+ | | ... | ... |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | | | |
+ | +---+--------------+----+ |
+ +-----------+ qcom_minidump(core) +--------+
+ | |
+ +------+-----+------+---+
+ | | |
+ | | |
+ +---------------+ | +--------------------+
+ | | |
+ | | |
+ | | |
+ v v v
+ +-------------------+ +-------------------+ +------------------+
+ |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
+ | | | | | |
+ +-------------------+ +-------------------+ +------------------+
+ Shared memory Memory mapped IO Resource manager
+ (backend) (backend) (backend)
+
+
+Kernel implementation of minidump driver is divided into two parts one is,
+the core implementation called frontend driver ``qcom_minidump.c`` and this
+is the driver will be exposing the API for clients and the other part is,
+backend driver and its depends whether it is based on SMEM, MMIO or some
+other way corressponding driver will be hooking itself up with the core
+driver to get itself working. As of now, at a time one and only one backend
+can be attached to the front-end either it is HOST or a guest VM.
+
+Qualcomm minidump kernel driver adds the capability to add Linux region
+to be dumped as part of RAM dump collection. At the moment, shared memory
+driver creates platform device for minidump driver and give a means to
+APSS minidump to initialize itself on probe.
+
+This driver provides ``qcom_minidump_region_register`` and
+``qcom_minidump_region_unregister`` API's to register and unregister
+APSS minidump region. It also gives a mechanism to update physical/virtual
+address for the client whose addresses keeps on changing, e.g., current stack
+address of task keeps on changing on context switch for each core. So these
+clients can update their addresses with ``qcom_minidump_update_region``
+API.
+
+The driver also supports registration for the clients who came before
+minidump driver was initialized. It maintains pending list of clients
+who came before minidump and once minidump is initialized it registers
+them in one go.
+
+To simplify post-mortem debugging, driver creates and maintain an ELF
+header as first region that gets updated each time a new region gets
+registered.
+
+The solution supports extracting the RAM dump/minidump produced either
+over USB or stored to an attached storage device.
+
+Dependency of minidump kernel driver
+------------------------------------
+
+It is to note that whole of minidump depends on Qualcomm boot
+firmware whether it supports minidump or not. So, if the minidump
+SMEM ID is present in shared memory, it indicates that minidump
+is supported from boot firmware and it is possible to dump Linux
+(APSS) region as part of minidump collection.
+
+How a kernel client driver can register region with minidump
+------------------------------------------------------------
+
+Client driver can use ``qcom_minidump_region_register`` API's to
+register and ``qcom_minidump_region_unregister`` to unregister
+their region from minidump driver.
+
+Client needs to fill their region by filling ``qcom_minidump_region``
+structure object which consists of the region name, region's
+virtual and physical address and its size.
+
+Below is one sample client driver snippet which tries to allocate
+a region from kernel heap of certain size and it writes a certain
+known pattern (that can help in verification after collection
+that we got the exact pattern, what we wrote) and registers it with
+minidump.
+
+ .. code-block:: c
+
+ #include <soc/qcom/qcom_minidump.h>
+ [...]
+
+
+ [... inside a function ...]
+ struct qcom_minidump_region region;
+
+ [...]
+
+ client_mem_region = kzalloc(region_size, GFP_KERNEL);
+ if (!client_mem_region)
+ return -ENOMEM;
+
+ [... Just write a pattern ...]
+ memset(client_mem_region, 0xAB, region_size);
+
+ [... Fill up the region object ...]
+ strlcpy(region.name, "REGION_A", sizeof(region.name));
+ region.virt_addr = client_mem_region;
+ region.phys_addr = virt_to_phys(client_mem_region);
+ region.size = region_size;
+
+ ret = qcom_minidump_region_register(&region);
+ if (ret < 0) {
+ pr_err("failed to add region in minidump: err: %d\n", ret);
+ return ret;
+ }
+
+ [...]
+
+
+Test
+----
+
+Existing Qualcomm devices already supports entire RAM dump (also called
+full dump) by writing appropriate value to Qualcomm's top control and
+status register (tcsr) in ``driver/firmware/qcom_scm.c`` .
+
+SCM device Tree bindings required to support download mode
+For example (sm8450) ::
+
+ / {
+
+ [...]
+
+ firmware {
+ scm: scm {
+ compatible = "qcom,scm-sm8450", "qcom,scm";
+ [... tcsr register ... ]
+ qcom,dload-mode = <&tcsr 0x13000>;
+
+ [...]
+ };
+ };
+
+ [...]
+
+ soc: soc@0 {
+
+ [...]
+
+ tcsr: syscon@1fc0000 {
+ compatible = "qcom,sm8450-tcsr", "syscon";
+ reg = <0x0 0x1fc0000 0x0 0x30000>;
+ };
+
+ [...]
+ };
+ [...]
+
+ };
+
+User of minidump can pass ``qcom_scm.download_mode="mini"`` to kernel
+commandline to set the current download mode to minidump.
+Similarly, ``"full"`` is passed to set the download mode to full dump
+where entire RAM dump will be collected while setting it ``"full,mini"``
+will collect minidump along with fulldump.
+
+Writing to sysfs node can also be used to set the mode to minidump::
+
+ echo "mini" > /sys/module/qcom_scm/parameter/download_mode
+
+Once the download mode is set, any kind of crash will make the device collect
+respective dump as per set download mode.
+
+Dump collection
+---------------
+
+The solution supports extracting the minidump produced either over USB or
+stored to an attached storage device.
+
+By default, dumps are downloaded via USB to the attached x86_64 machine
+running PCAT (Qualcomm tool) software. Upon download, we will see
+a set of binary blobs starting with name ``md_*`` in PCAT configured directory
+in x86_64 machine, so for above example from the client it will be
+``md_REGION_A.BIN``. This binary blob depends on region content to determine
+whether it needs external parser support to get the content of the region,
+so for simple plain ASCII text we don't need any parsing and the content
+can be seen just opening the binary file.
+
+To collect the dump to attached storage type, one needs to write appropriate
+value to IMEM register, in that case dumps are collected in rawdump
+partition on the target device itself.
+
+One needs to read the entire rawdump partition and pull out content to
+save it onto the attached x86_64 machine over USB. Later, this rawdump
+can be passed to another tool (``dexter.exe`` [Qualcomm tool]) which converts
+this into the similar binary blobs which we have got it when download type
+was set to USB, i.e. a set of registered regions as blobs and their name
+starts with ``md_*``.
+
+Replacing the ``dexter.exe`` with some open source tool can be added as future
+scope of this document.
--
2.7.4


2023-06-28 12:52:25

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

Qualcomm ramoops minidump logger provide a means of storing
the ramoops data to some dynamically reserved memory instead
of traditionally implemented ramoops where the region should
be statically fixed ram region. Its device tree binding
would be exactly same as ramoops device tree binding and is
going to contain traditional ramoops frontend data and this
content will be collected via Qualcomm minidump infrastructure
provided from the boot firmware.

Signed-off-by: Mukesh Ojha <[email protected]>
---
.../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++++++++++++++++++
1 file changed, 126 insertions(+)
create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml

diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
new file mode 100644
index 000000000000..b1fdcf3f8ad4
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
@@ -0,0 +1,126 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/soc/qcom/qcom,ramoops.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: Qualcomm Ramoops minidump logger
+
+description: |
+ Qualcomm ramoops minidump logger provide a means of storing the ramoops
+ data to some dynamically reserved memory instead of traditionally
+ implemented ramoops where the region should be statically fixed ram
+ region. Because of its similarity with ramoops it will also have same
+ set of property what ramoops have it in its schema and is going to
+ contain traditional ramoops frontend data and this region will be
+ collected via Qualcomm minidump infrastructure provided from the
+ boot firmware.
+
+maintainers:
+ - Mukesh Ojha <[email protected]>
+
+properties:
+ compatible:
+ items:
+ - enum:
+ - qcom,sm8450-ramoops
+ - const: qcom,ramoops
+
+ memory-region:
+ maxItems: 1
+ description: handle to memory reservation for qcom,ramoops region.
+
+ ecc-size:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: enables ECC support and specifies ECC buffer size in bytes
+ default: 0 # no ECC
+
+ record-size:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: maximum size in bytes of each kmsg dump
+ default: 0
+
+ console-size:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: size in bytes of log buffer reserved for kernel messages
+ default: 0
+
+ ftrace-size:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: size in bytes of log buffer reserved for function tracing and profiling
+ default: 0
+
+ pmsg-size:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: size in bytes of log buffer reserved for userspace messages
+ default: 0
+
+ mem-type:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: if present, sets the type of mapping is to be used to map the reserved region.
+ default: 0
+ oneOf:
+ - const: 0
+ description: write-combined
+ - const: 1
+ description: unbuffered
+ - const: 2
+ description: cached
+
+ max-reason:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 2 # log oopses and panics
+ maximum: 0x7fffffff
+ description: |
+ If present, sets maximum type of kmsg dump reasons to store.
+ This can be set to INT_MAX to store all kmsg dumps.
+ See include/linux/kmsg_dump.h KMSG_DUMP_* for other kmsg dump reason values.
+ Setting this to 0 (KMSG_DUMP_UNDEF), means the reason filtering will be
+ controlled by the printk.always_kmsg_dump boot param.
+ If unset, it will be 2 (KMSG_DUMP_OOPS), otherwise 5 (KMSG_DUMP_MAX).
+
+ flags:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ default: 0
+ description: |
+ If present, pass ramoops behavioral flags
+ (see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
+
+ no-dump-oops:
+ deprecated: true
+ type: boolean
+ description: |
+ Use max_reason instead. If present, and max_reason is not specified,
+ it is equivalent to max_reason = 1 (KMSG_DUMP_PANIC).
+
+ unbuffered:
+ deprecated: true
+ type: boolean
+ description: |
+ Use mem_type instead. If present, and mem_type is not specified,
+ it is equivalent to mem_type = 1 and uses unbuffered mappings to map
+ the reserved region (defaults to buffered mappings mem_type = 0).
+ If both are specified -- "mem_type" overrides "unbuffered".
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - memory-region
+
+anyOf:
+ - required: [record-size]
+ - required: [console-size]
+ - required: [ftrace-size]
+ - required: [pmsg-size]
+
+examples:
+ - |
+
+ qcom_ramoops {
+ compatible = "qcom,sm8450-ramoops", "qcom,ramoops";
+ memory-region = <&qcom_ramoops_region>;
+ console-size = <0x8000>; /* 32kB */
+ record-size = <0x400>; /* 1kB */
+ ecc-size = <16>;
+ };
--
2.7.4


2023-06-28 12:52:45

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 15/21] arm64: defconfig: Enable Qualcomm Minidump related drivers

Enable QCOM_MINIDUMP_SMEM backend driver which automatically
selects QCOM_MINIDUMP core for its functionality to work.

While at it, also enable clients of minidump driver which
is QCOM_PSTORE_MINIDUMP that uses PSTORE_RAM and
PSTORE_CONSOLE to capture console logs into minidump.

Signed-off-by: Mukesh Ojha <[email protected]>
---
arch/arm64/configs/defconfig | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index a24609e14d50..9a47c9ec0391 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1250,6 +1250,8 @@ CONFIG_QCOM_STATS=m
CONFIG_QCOM_WCNSS_CTRL=m
CONFIG_QCOM_APR=m
CONFIG_QCOM_ICC_BWMON=m
+CONFIG_QCOM_MINIDUMP_SMEM=m
+CONFIG_QCOM_PSTORE_MINIDUMP=m
CONFIG_ARCH_R8A77995=y
CONFIG_ARCH_R8A77990=y
CONFIG_ARCH_R8A77951=y
@@ -1445,6 +1447,8 @@ CONFIG_HUGETLBFS=y
CONFIG_CONFIGFS_FS=y
CONFIG_EFIVAR_FS=y
CONFIG_SQUASHFS=y
+CONFIG_PSTORE_CONSOLE=y
+CONFIG_PSTORE_RAM=y
CONFIG_NFS_FS=y
CONFIG_NFS_V4=y
CONFIG_NFS_V4_1=y
--
2.7.4


2023-06-28 12:52:57

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 12/21] remoteproc: qcom: Expand MD_* as MINIDUMP_*

Expand MD_* as MINIDUMP_* which makes more sense than the
abbreviation.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/remoteproc/qcom_common.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/remoteproc/qcom_common.c b/drivers/remoteproc/qcom_common.c
index a0d4238492e9..805e525df3b5 100644
--- a/drivers/remoteproc/qcom_common.c
+++ b/drivers/remoteproc/qcom_common.c
@@ -29,9 +29,9 @@
#define MAX_NUM_OF_SS 10
#define MAX_REGION_NAME_LENGTH 16
#define SBL_MINIDUMP_SMEM_ID 602
-#define MD_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
-#define MD_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
-#define MD_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
+#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)

/**
* struct minidump_region - Minidump region
@@ -125,7 +125,7 @@ static int qcom_add_minidump_segments(struct rproc *rproc, struct minidump_subsy

for (i = 0; i < seg_cnt; i++) {
memcpy_fromio(&region, ptr + i, sizeof(region));
- if (le32_to_cpu(region.valid) == MD_REGION_VALID) {
+ if (le32_to_cpu(region.valid) == MINIDUMP_REGION_VALID) {
name = kstrndup(region.name, MAX_REGION_NAME_LENGTH - 1, GFP_KERNEL);
if (!name) {
iounmap(ptr);
@@ -168,8 +168,8 @@ void qcom_minidump(struct rproc *rproc, unsigned int minidump_id,
*/
if (subsystem->regions_baseptr == 0 ||
le32_to_cpu(subsystem->status) != 1 ||
- le32_to_cpu(subsystem->enabled) != MD_SS_ENABLED ||
- le32_to_cpu(subsystem->encryption_status) != MD_SS_ENCR_DONE) {
+ le32_to_cpu(subsystem->enabled) != MINIDUMP_SS_ENABLED ||
+ le32_to_cpu(subsystem->encryption_status) != MINIDUMP_SS_ENCR_DONE) {
dev_err(&rproc->dev, "Minidump not ready, skipping\n");
return;
}
--
2.7.4


2023-06-28 12:53:00

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 07/21] soc: qcom: minidump: Add update region support

Add support to update client's region physical/virtual addresses,
which is useful for dynamic loadable modules, dynamic address
changing clients like if we want to collect current stack
information for each core and the current stack is changing on
each sched_switch event, So here virtual/physical address of
the current stack is changing. So, to cover such use cases
add the update region support in minidump driver and the
corresponding smem backend support.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/soc/qcom/qcom_minidump.c | 55 +++++++++++++++++++++++++++++++
drivers/soc/qcom/qcom_minidump_internal.h | 3 ++
drivers/soc/qcom/qcom_minidump_smem.c | 21 ++++++++++++
include/soc/qcom/qcom_minidump.h | 5 +++
4 files changed, 84 insertions(+)

diff --git a/drivers/soc/qcom/qcom_minidump.c b/drivers/soc/qcom/qcom_minidump.c
index cfdb63cc29d6..37d6ceb4d85c 100644
--- a/drivers/soc/qcom/qcom_minidump.c
+++ b/drivers/soc/qcom/qcom_minidump.c
@@ -318,6 +318,61 @@ int qcom_minidump_region_unregister(const struct qcom_minidump_region *region)
}
EXPORT_SYMBOL_GPL(qcom_minidump_region_unregister);

+/**
+ * qcom_minidump_update_region() - Update region information with new physical
+ * address and virtual address for already registered region e.g, current task
+ * stack for a core keeps on changing on each context switch, there it needs to
+ * change registered region with new updated addresses.
+ *
+ * @region: Should be already registered minidump region.
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+int qcom_minidump_update_region(const struct qcom_minidump_region *region)
+{
+ struct minidump_pregion *md_pregion;
+ struct qcom_minidump_region *tmp;
+ struct elfhdr *ehdr;
+ struct elf_shdr *shdr;
+ struct elf_phdr *phdr;
+ int idx;
+ int ret = 0;
+
+ if (!qcom_minidump_valid_region(region))
+ return -EINVAL;
+
+ mutex_lock(&md_lock);
+ if (!md) {
+ md_pregion = check_if_pending_region_exist(region);
+ if (!md_pregion) {
+ ret = -ENOENT;
+ goto unlock;
+ }
+ tmp = &md_pregion->region;
+ tmp->phys_addr = region->phys_addr;
+ tmp->virt_addr = region->virt_addr;
+ goto unlock;
+ }
+
+ ret = md->ops->md_update_region(md, &idx, region);
+ if (ret)
+ goto unlock;
+
+ /* Skip predefined shdr/phdr header entry at the start */
+ ehdr = md->elf.ehdr;
+ shdr = elf_shdr_entry_addr(ehdr, idx + 4);
+ phdr = elf_phdr_entry_addr(ehdr, idx + 1);
+
+ shdr->sh_addr = (elf_addr_t)region->virt_addr;
+ phdr->p_vaddr = (elf_addr_t)region->virt_addr;
+ phdr->p_paddr = region->phys_addr;
+
+unlock:
+ mutex_unlock(&md_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(qcom_minidump_update_region);
+
static int qcom_minidump_add_elf_header(struct minidump *md_data)
{
struct qcom_minidump_region elfregion;
diff --git a/drivers/soc/qcom/qcom_minidump_internal.h b/drivers/soc/qcom/qcom_minidump_internal.h
index 6993e3be10c2..dc1e76c1f063 100644
--- a/drivers/soc/qcom/qcom_minidump_internal.h
+++ b/drivers/soc/qcom/qcom_minidump_internal.h
@@ -21,6 +21,7 @@ struct minidump;
* @md_table_exit: Clean up the stuff populated while md_table_init()
* @md_region_register: Callback to register the region at the backend.
* @md_region_unregister: Callback to unregister the region at the backend.
+ * @md_update_region: Callback to update address of already registered regions.
*/
struct minidump_ops {
int (*md_table_init)(struct minidump *md);
@@ -29,6 +30,8 @@ struct minidump_ops {
const struct qcom_minidump_region *region);
int (*md_region_unregister)(struct minidump *md,
const struct qcom_minidump_region *region);
+ int (*md_update_region)(struct minidump *md, int *idx,
+ const struct qcom_minidump_region *region);
};

/**
diff --git a/drivers/soc/qcom/qcom_minidump_smem.c b/drivers/soc/qcom/qcom_minidump_smem.c
index bdc75aa2f518..9d4021a5e4c8 100644
--- a/drivers/soc/qcom/qcom_minidump_smem.c
+++ b/drivers/soc/qcom/qcom_minidump_smem.c
@@ -263,6 +263,26 @@ static int smem_md_region_unregister(struct minidump *md,
return 0;
}

+static int smem_md_update_region(struct minidump *md, int *idx,
+ const struct qcom_minidump_region *region)
+{
+ struct minidump_ss_data *mdss_data = md->apss_data;
+ struct minidump_region *mdr;
+ int ret;
+
+ ret = smem_md_get_region_index(mdss_data, region);
+ if (ret < 0) {
+ dev_err(md->dev, "%s region is not present\n", region->name);
+ return ret;
+ }
+
+ *idx = ret;
+ mdr = &mdss_data->md_regions[*idx];
+ mdr->address = cpu_to_le64(region->phys_addr);
+
+ return 0;
+}
+
static int smem_md_table_init(struct minidump *md)
{
struct minidump_global_toc *mdgtoc;
@@ -324,6 +344,7 @@ static struct minidump_ops smem_md_ops = {
.md_table_exit = smem_md_table_exit,
.md_region_register = smem_md_region_register,
.md_region_unregister = smem_md_region_unregister,
+ .md_update_region = smem_md_update_region,
};

static int qcom_minidump_smem_probe(struct platform_device *pdev)
diff --git a/include/soc/qcom/qcom_minidump.h b/include/soc/qcom/qcom_minidump.h
index d0bebc3daac5..a86b0313698f 100644
--- a/include/soc/qcom/qcom_minidump.h
+++ b/include/soc/qcom/qcom_minidump.h
@@ -29,6 +29,7 @@ struct qcom_minidump_region {
#if IS_ENABLED(CONFIG_QCOM_MINIDUMP)
int qcom_minidump_region_register(const struct qcom_minidump_region *region);
int qcom_minidump_region_unregister(const struct qcom_minidump_region *region);
+int qcom_minidump_update_region(const struct qcom_minidump_region *region);
#else
static inline int qcom_minidump_region_register(const struct qcom_minidump_region *region)
{
@@ -39,6 +40,10 @@ static inline int qcom_minidump_region_unregister(const struct qcom_minidump_reg
{
return 0;
}
+static inline int qcom_minidump_update_region(const struct qcom_minidump_region *region)
+{
+ return 0;
+}
#endif /* CONFIG_QCOM_MINIDUMP */

#if IS_ENABLED(CONFIG_QCOM_MINIDUMP_SMEM)
--
2.7.4


2023-06-28 12:53:35

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 16/21] arm64: dts: qcom: sm8450: Add Qualcomm ramoops minidump node

This enable dynamic reserve memory for Qualcomm ramoops device,
which will be used to save ramoops frontend data and this region
gets dumped on crash via Qualcomm's minidump infrastructure.
qcom_pstore_minidump is the associated driver for this node.

Signed-off-by: Mukesh Ojha <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8450.dtsi | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index d59ea8ee7111..0b1dedee606b 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -390,6 +390,12 @@
};
};

+ ramoops-minidump {
+ compatible = "qcom,sm8450-ramoops", "qcom,ramoops";
+ console-size = <0x200000>;
+ memory-region = <&qcom_ramoops>;
+ };
+
reserved_memory: reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
@@ -623,6 +629,12 @@
reg = <0x0 0xed900000 0x0 0x3b00000>;
no-map;
};
+
+ qcom_ramoops: ramoops {
+ alloc-ranges = <0x0 0x00000000 0xffffffff 0xffffffff>;
+ size = <0x0 0x200000>;
+ no-map;
+ };
};

smp2p-adsp {
--
2.7.4


2023-06-28 12:53:39

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 09/21] pstore/ram : Export ramoops_parse_dt() symbol

It is possible if some driver do not want ramoops region to
be static instead it sets up the mem_address and mem_size
and use everything what this driver has to offer. To offer
this convenience export ramoops_parse_dt() function.

Signed-off-by: Mukesh Ojha <[email protected]>
---
fs/pstore/ram.c | 26 ++++++++++++++++++--------
include/linux/pstore_ram.h | 2 ++
2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/fs/pstore/ram.c b/fs/pstore/ram.c
index ade66dbe5f39..6bb66d044bef 100644
--- a/fs/pstore/ram.c
+++ b/fs/pstore/ram.c
@@ -638,7 +638,7 @@ static int ramoops_parse_dt_u32(struct platform_device *pdev,
return 0;
}

-static int ramoops_parse_dt(struct platform_device *pdev,
+int ramoops_parse_dt(struct platform_device *pdev,
struct ramoops_platform_data *pdata)
{
struct device_node *of_node = pdev->dev.of_node;
@@ -649,15 +649,24 @@ static int ramoops_parse_dt(struct platform_device *pdev,

dev_dbg(&pdev->dev, "using Device Tree\n");

- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- if (!res) {
- dev_err(&pdev->dev,
- "failed to locate DT /reserved-memory resource\n");
- return -EINVAL;
+ /*
+ * It is possible if some driver do not want ramoops
+ * region to be static instead it sets up the mem_address
+ * and mem_size and use everything what this driver has
+ * to offer.
+ */
+ if (!pdata->mem_address && !pdata->mem_size) {
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ dev_err(&pdev->dev,
+ "failed to locate DT /reserved-memory resource\n");
+ return -EINVAL;
+ }
+
+ pdata->mem_size = resource_size(res);
+ pdata->mem_address = res->start;
}

- pdata->mem_size = resource_size(res);
- pdata->mem_address = res->start;
/*
* Setting "unbuffered" is deprecated and will be ignored if
* "mem_type" is also specified.
@@ -713,6 +722,7 @@ static int ramoops_parse_dt(struct platform_device *pdev,

return 0;
}
+EXPORT_SYMBOL_GPL(ramoops_parse_dt);

static int ramoops_probe(struct platform_device *pdev)
{
diff --git a/include/linux/pstore_ram.h b/include/linux/pstore_ram.h
index 9d65ff94e216..55df4e631a25 100644
--- a/include/linux/pstore_ram.h
+++ b/include/linux/pstore_ram.h
@@ -8,6 +8,7 @@
#ifndef __LINUX_PSTORE_RAM_H__
#define __LINUX_PSTORE_RAM_H__

+#include <linux/platform_device.h>
#include <linux/pstore.h>

struct persistent_ram_ecc_info {
@@ -39,4 +40,5 @@ struct ramoops_platform_data {
struct persistent_ram_ecc_info ecc_info;
};

+int ramoops_parse_dt(struct platform_device *, struct ramoops_platform_data *);
#endif
--
2.7.4


2023-06-28 12:54:07

by Mukesh Ojha

[permalink] [raw]
Subject: [PATCH v4 10/21] soc: qcom: Add qcom's pstore minidump driver support

This driver was inspired from the fact pstore ram region should be
fixed and boot firmware need to have awarness about this region,
so that it will be persistent across boot. But, there are many
QCOM SoC which does not support warm boot from hardware but they
have minidump support from the software, and for them, there is
no need of this pstore ram region to be fixed, but at the same
time have interest in the pstore frontends data. So, this driver
get the dynamic reserved region from the ram and register the
ramoops platform device.

+---------+ +---------+ +--------+ +---------+
| console | | pmsg | | ftrace | | dmesg |
+---------+ +---------+ +--------+ +---------+
| | | |
| | | |
+------------------------------------------+
|
\ /
+----------------+
(1) |pstore frontends|
+----------------+
|
\ /
+------------------- +
(2) | pstore backend(ram)|
+--------------------+
|
\ /
+--------------------+
(3) |qcom_pstore_minidump|
+--------------------+
|
\ /
+---------------+
(4) | qcom_minidump |
+---------------+

This driver will route all the pstore front data to the stored
in qcom pstore reserved region and the reason of showing an
arrow from (3) to (4) as qcom_pstore_minidump driver will register
all the available frontends region with qcom minidump driver
in upcoming patch.

Signed-off-by: Mukesh Ojha <[email protected]>
---
drivers/soc/qcom/Kconfig | 12 +++++
drivers/soc/qcom/Makefile | 1 +
drivers/soc/qcom/qcom_pstore_minidump.c | 85 +++++++++++++++++++++++++++++++++
3 files changed, 98 insertions(+)
create mode 100644 drivers/soc/qcom/qcom_pstore_minidump.c

diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
index 1834213fd652..fbf08e30feda 100644
--- a/drivers/soc/qcom/Kconfig
+++ b/drivers/soc/qcom/Kconfig
@@ -306,4 +306,16 @@ config QCOM_MINIDUMP_SMEM

This config should be enabled if the low level minidump is implemented
as part of SMEM.
+
+config QCOM_PSTORE_MINIDUMP
+ tristate "Pstore support for QCOM Minidump"
+ depends on ARCH_QCOM
+ depends on PSTORE_RAM
+ depends on QCOM_MINIDUMP
+ help
+ Enablement of this driver ensures that ramoops region can be anywhere
+ reserved in ram instead of being fixed address which needs boot firmware
+ awareness. So, this driver creates plaform device and registers available
+ frontend region with the Qualcomm's minidump driver.
+
endmenu
diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
index 737d868757ac..1ab59c1b364d 100644
--- a/drivers/soc/qcom/Makefile
+++ b/drivers/soc/qcom/Makefile
@@ -36,3 +36,4 @@ qcom_ice-objs += ice.o
obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
obj-$(CONFIG_QCOM_MINIDUMP_SMEM) += qcom_minidump_smem.o
+obj-$(CONFIG_QCOM_PSTORE_MINIDUMP) += qcom_pstore_minidump.o
diff --git a/drivers/soc/qcom/qcom_pstore_minidump.c b/drivers/soc/qcom/qcom_pstore_minidump.c
new file mode 100644
index 000000000000..b07cd10340df
--- /dev/null
+++ b/drivers/soc/qcom/qcom_pstore_minidump.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
+ */
+
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/platform_device.h>
+#include <linux/pstore_ram.h>
+
+struct qcom_ramoops_dd {
+ struct ramoops_platform_data qcom_ramoops_pdata;
+ struct platform_device *ramoops_pdev;
+};
+
+static int qcom_ramoops_probe(struct platform_device *pdev)
+{
+ struct device_node *of_node = pdev->dev.of_node;
+ struct qcom_ramoops_dd *qcom_rdd;
+ struct ramoops_platform_data *pdata;
+ struct reserved_mem *rmem;
+ struct device_node *node;
+ long ret;
+
+ node = of_parse_phandle(of_node, "memory-region", 0);
+ if (!node)
+ return -ENODEV;
+
+ rmem = of_reserved_mem_lookup(node);
+ of_node_put(node);
+ if (!rmem) {
+ dev_err(&pdev->dev, "failed to locate DT /reserved-memory resource\n");
+ return -EINVAL;
+ }
+
+ qcom_rdd = devm_kzalloc(&pdev->dev, sizeof(*qcom_rdd), GFP_KERNEL);
+ if (!qcom_rdd)
+ return -ENOMEM;
+
+ pdata = &qcom_rdd->qcom_ramoops_pdata;
+ pdata->mem_size = rmem->size;
+ pdata->mem_address = rmem->base;
+ ramoops_parse_dt(pdev, pdata);
+
+ qcom_rdd->ramoops_pdev = platform_device_register_data(NULL, "ramoops", -1,
+ pdata, sizeof(*pdata));
+ if (IS_ERR(qcom_rdd->ramoops_pdev)) {
+ ret = PTR_ERR(qcom_rdd->ramoops_pdev);
+ dev_err(&pdev->dev, "could not create platform device: %ld\n", ret);
+ qcom_rdd->ramoops_pdev = NULL;
+ }
+ platform_set_drvdata(pdev, qcom_rdd);
+
+ return ret;
+}
+
+static void qcom_ramoops_remove(struct platform_device *pdev)
+{
+ struct qcom_ramoops_dd *qcom_rdd = platform_get_drvdata(pdev);
+
+ platform_device_unregister(qcom_rdd->ramoops_pdev);
+ qcom_rdd->ramoops_pdev = NULL;
+}
+
+static const struct of_device_id qcom_ramoops_of_match[] = {
+ { .compatible = "qcom,ramoops"},
+ {}
+};
+MODULE_DEVICE_TABLE(of, qcom_ramoops_of_match);
+
+static struct platform_driver qcom_ramoops_drv = {
+ .driver = {
+ .name = "qcom,ramoops",
+ .of_match_table = qcom_ramoops_of_match,
+ },
+ .probe = qcom_ramoops_probe,
+ .remove_new = qcom_ramoops_remove,
+};
+
+module_platform_driver(qcom_ramoops_drv);
+
+MODULE_DESCRIPTION("Qualcomm ramoops minidump driver");
+MODULE_LICENSE("GPL");
--
2.7.4


2023-06-28 14:55:03

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On Wed, Jun 28, 2023 at 06:04:35PM +0530, Mukesh Ojha wrote:
> Qualcomm ramoops minidump logger provide a means of storing
> the ramoops data to some dynamically reserved memory instead
> of traditionally implemented ramoops where the region should
> be statically fixed ram region. Its device tree binding
> would be exactly same as ramoops device tree binding and is
> going to contain traditional ramoops frontend data and this
> content will be collected via Qualcomm minidump infrastructure
> provided from the boot firmware.
>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++++++++++++++++++
> 1 file changed, 126 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
>
> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> new file mode 100644
> index 000000000000..b1fdcf3f8ad4
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> @@ -0,0 +1,126 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: "http://devicetree.org/schemas/soc/qcom/qcom,ramoops.yaml#"
> +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> +
> +title: Qualcomm Ramoops minidump logger
> +
> +description: |
> + Qualcomm ramoops minidump logger provide a means of storing the ramoops
> + data to some dynamically reserved memory instead of traditionally
> + implemented ramoops where the region should be statically fixed ram
> + region. Because of its similarity with ramoops it will also have same
> + set of property what ramoops have it in its schema and is going to
> + contain traditional ramoops frontend data and this region will be
> + collected via Qualcomm minidump infrastructure provided from the
> + boot firmware.
> +
> +maintainers:
> + - Mukesh Ojha <[email protected]>
> +
> +properties:
> + compatible:
> + items:
> + - enum:
> + - qcom,sm8450-ramoops
> + - const: qcom,ramoops
> +
> + memory-region:
> + maxItems: 1
> + description: handle to memory reservation for qcom,ramoops region.
> +
> + ecc-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: enables ECC support and specifies ECC buffer size in bytes
> + default: 0 # no ECC
> +
> + record-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: maximum size in bytes of each kmsg dump
> + default: 0
> +
> + console-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: size in bytes of log buffer reserved for kernel messages
> + default: 0
> +
> + ftrace-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: size in bytes of log buffer reserved for function tracing and profiling
> + default: 0
> +
> + pmsg-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: size in bytes of log buffer reserved for userspace messages
> + default: 0
> +
> + mem-type:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: if present, sets the type of mapping is to be used to map the reserved region.
> + default: 0
> + oneOf:
> + - const: 0
> + description: write-combined
> + - const: 1
> + description: unbuffered
> + - const: 2
> + description: cached
> +
> + max-reason:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 2 # log oopses and panics
> + maximum: 0x7fffffff
> + description: |
> + If present, sets maximum type of kmsg dump reasons to store.
> + This can be set to INT_MAX to store all kmsg dumps.
> + See include/linux/kmsg_dump.h KMSG_DUMP_* for other kmsg dump reason values.
> + Setting this to 0 (KMSG_DUMP_UNDEF), means the reason filtering will be
> + controlled by the printk.always_kmsg_dump boot param.
> + If unset, it will be 2 (KMSG_DUMP_OOPS), otherwise 5 (KMSG_DUMP_MAX).
> +
> + flags:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 0
> + description: |
> + If present, pass ramoops behavioral flags
> + (see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
> +
> + no-dump-oops:
> + deprecated: true
> + type: boolean
> + description: |
> + Use max_reason instead. If present, and max_reason is not specified,
> + it is equivalent to max_reason = 1 (KMSG_DUMP_PANIC).
> +
> + unbuffered:
> + deprecated: true
> + type: boolean
> + description: |
> + Use mem_type instead. If present, and mem_type is not specified,
> + it is equivalent to mem_type = 1 and uses unbuffered mappings to map
> + the reserved region (defaults to buffered mappings mem_type = 0).
> + If both are specified -- "mem_type" overrides "unbuffered".
> +

Most of the properties you added here are already documented at
Documentation/devicetree/bindings/reserved-memory/ramoops.yaml

Can't we just reference them here? would something like work?

max-reason:
$ref: "../../reserved-memory/ramoops.yaml#/properties/max-reason

> +unevaluatedProperties: false
> +

will there be any additional properties be added dynamically? if not,
should not we use "additionalProperties: false" here?

Thanks,
Pavan

2023-06-28 15:16:56

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding



On 6/28/2023 8:17 PM, Rob Herring wrote:
> On Wed, Jun 28, 2023 at 6:36 AM Mukesh Ojha <[email protected]> wrote:
>>
>> Qualcomm ramoops minidump logger provide a means of storing
>> the ramoops data to some dynamically reserved memory instead
>> of traditionally implemented ramoops where the region should
>> be statically fixed ram region. Its device tree binding
>> would be exactly same as ramoops device tree binding and is
>> going to contain traditional ramoops frontend data and this
>> content will be collected via Qualcomm minidump infrastructure
>> provided from the boot firmware.
>
> The big difference is if firmware is not deciding where this log
> lives, then it doesn't need to be in DT. How does anything except the
> kernel that allocates the log find the logs?

Yes, you are correct, firmware is not deciding where the logs lives
instead here, Kernel has reserved the region where the ramoops region
lives and later with the minidump registration where, physical
address/size/virtual address(for parsing) are passed and that is how
firmware is able to know and dump those region before triggering system
reset.

A part of this registration code you can find in 11/21

> I'm pretty sure I already said all this before.

Yes, you said this before but that's the reason i came up with vendor
ramoops instead of changing traditional ramoops binding.

>
>>
>> Signed-off-by: Mukesh Ojha <[email protected]>
>> ---
>> .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++++++++++++++++++
>> 1 file changed, 126 insertions(+)
>> create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
>>
>> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
>> new file mode 100644
>> index 000000000000..b1fdcf3f8ad4
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
>> @@ -0,0 +1,126 @@
>> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
>> +%YAML 1.2
>> +---
>> +$id: "http://devicetree.org/schemas/soc/qcom/qcom,ramoops.yaml#"
>> +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
>> +
>> +title: Qualcomm Ramoops minidump logger
>> +
>> +description: |
>> + Qualcomm ramoops minidump logger provide a means of storing the ramoops
>> + data to some dynamically reserved memory instead of traditionally
>> + implemented ramoops where the region should be statically fixed ram
>> + region. Because of its similarity with ramoops it will also have same
>> + set of property what ramoops have it in its schema and is going to
>> + contain traditional ramoops frontend data and this region will be
>> + collected via Qualcomm minidump infrastructure provided from the
>> + boot firmware.
>> +
>> +maintainers:
>> + - Mukesh Ojha <[email protected]>
>> +
>> +properties:
>> + compatible:
>> + items:
>> + - enum:
>> + - qcom,sm8450-ramoops
>> + - const: qcom,ramoops
>> +
>> + memory-region:
>> + maxItems: 1
>> + description: handle to memory reservation for qcom,ramoops region.
>> +
>> + ecc-size:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: enables ECC support and specifies ECC buffer size in bytes
>> + default: 0 # no ECC
>> +
>> + record-size:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: maximum size in bytes of each kmsg dump
>> + default: 0
>> +
>> + console-size:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: size in bytes of log buffer reserved for kernel messages
>> + default: 0
>> +
>> + ftrace-size:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: size in bytes of log buffer reserved for function tracing and profiling
>> + default: 0
>> +
>> + pmsg-size:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: size in bytes of log buffer reserved for userspace messages
>> + default: 0
>> +
>> + mem-type:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + description: if present, sets the type of mapping is to be used to map the reserved region.
>> + default: 0
>> + oneOf:
>> + - const: 0
>> + description: write-combined
>> + - const: 1
>> + description: unbuffered
>> + - const: 2
>> + description: cached
>> +
>> + max-reason:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + default: 2 # log oopses and panics
>> + maximum: 0x7fffffff
>> + description: |
>> + If present, sets maximum type of kmsg dump reasons to store.
>> + This can be set to INT_MAX to store all kmsg dumps.
>> + See include/linux/kmsg_dump.h KMSG_DUMP_* for other kmsg dump reason values.
>> + Setting this to 0 (KMSG_DUMP_UNDEF), means the reason filtering will be
>> + controlled by the printk.always_kmsg_dump boot param.
>> + If unset, it will be 2 (KMSG_DUMP_OOPS), otherwise 5 (KMSG_DUMP_MAX).
>> +
>> + flags:
>> + $ref: /schemas/types.yaml#/definitions/uint32
>> + default: 0
>> + description: |
>> + If present, pass ramoops behavioral flags
>> + (see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
>> +
>> + no-dump-oops:
>> + deprecated: true
>
> Why would you define deprecated properties in a *new* binding?

Since, it is exact copy of ramoops binding, that's the reason of mention.

Will remove it.

-Mukesh
>
>> + type: boolean
>> + description: |
>> + Use max_reason instead. If present, and max_reason is not specified,
>> + it is equivalent to max_reason = 1 (KMSG_DUMP_PANIC).
>> +
>> + unbuffered:
>> + deprecated: true
>> + type: boolean
>> + description: |
>> + Use mem_type instead. If present, and mem_type is not specified,
>> + it is equivalent to mem_type = 1 and uses unbuffered mappings to map
>> + the reserved region (defaults to buffered mappings mem_type = 0).
>> + If both are specified -- "mem_type" overrides "unbuffered".
>> +
>> +unevaluatedProperties: false
>> +
>> +required:
>> + - compatible
>> + - memory-region
>> +
>> +anyOf:
>> + - required: [record-size]
>> + - required: [console-size]
>> + - required: [ftrace-size]
>> + - required: [pmsg-size]
>> +
>> +examples:
>> + - |
>> +
>> + qcom_ramoops {
>> + compatible = "qcom,sm8450-ramoops", "qcom,ramoops";
>> + memory-region = <&qcom_ramoops_region>;
>> + console-size = <0x8000>; /* 32kB */
>> + record-size = <0x400>; /* 1kB */
>> + ecc-size = <16>;
>> + };
>> --
>> 2.7.4
>>

2023-06-28 15:19:00

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On Wed, Jun 28, 2023 at 6:36 AM Mukesh Ojha <[email protected]> wrote:
>
> Qualcomm ramoops minidump logger provide a means of storing
> the ramoops data to some dynamically reserved memory instead
> of traditionally implemented ramoops where the region should
> be statically fixed ram region. Its device tree binding
> would be exactly same as ramoops device tree binding and is
> going to contain traditional ramoops frontend data and this
> content will be collected via Qualcomm minidump infrastructure
> provided from the boot firmware.

The big difference is if firmware is not deciding where this log
lives, then it doesn't need to be in DT. How does anything except the
kernel that allocates the log find the logs?

I'm pretty sure I already said all this before.

>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++++++++++++++++++
> 1 file changed, 126 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
>
> diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> new file mode 100644
> index 000000000000..b1fdcf3f8ad4
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> @@ -0,0 +1,126 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: "http://devicetree.org/schemas/soc/qcom/qcom,ramoops.yaml#"
> +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> +
> +title: Qualcomm Ramoops minidump logger
> +
> +description: |
> + Qualcomm ramoops minidump logger provide a means of storing the ramoops
> + data to some dynamically reserved memory instead of traditionally
> + implemented ramoops where the region should be statically fixed ram
> + region. Because of its similarity with ramoops it will also have same
> + set of property what ramoops have it in its schema and is going to
> + contain traditional ramoops frontend data and this region will be
> + collected via Qualcomm minidump infrastructure provided from the
> + boot firmware.
> +
> +maintainers:
> + - Mukesh Ojha <[email protected]>
> +
> +properties:
> + compatible:
> + items:
> + - enum:
> + - qcom,sm8450-ramoops
> + - const: qcom,ramoops
> +
> + memory-region:
> + maxItems: 1
> + description: handle to memory reservation for qcom,ramoops region.
> +
> + ecc-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: enables ECC support and specifies ECC buffer size in bytes
> + default: 0 # no ECC
> +
> + record-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: maximum size in bytes of each kmsg dump
> + default: 0
> +
> + console-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: size in bytes of log buffer reserved for kernel messages
> + default: 0
> +
> + ftrace-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: size in bytes of log buffer reserved for function tracing and profiling
> + default: 0
> +
> + pmsg-size:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: size in bytes of log buffer reserved for userspace messages
> + default: 0
> +
> + mem-type:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + description: if present, sets the type of mapping is to be used to map the reserved region.
> + default: 0
> + oneOf:
> + - const: 0
> + description: write-combined
> + - const: 1
> + description: unbuffered
> + - const: 2
> + description: cached
> +
> + max-reason:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 2 # log oopses and panics
> + maximum: 0x7fffffff
> + description: |
> + If present, sets maximum type of kmsg dump reasons to store.
> + This can be set to INT_MAX to store all kmsg dumps.
> + See include/linux/kmsg_dump.h KMSG_DUMP_* for other kmsg dump reason values.
> + Setting this to 0 (KMSG_DUMP_UNDEF), means the reason filtering will be
> + controlled by the printk.always_kmsg_dump boot param.
> + If unset, it will be 2 (KMSG_DUMP_OOPS), otherwise 5 (KMSG_DUMP_MAX).
> +
> + flags:
> + $ref: /schemas/types.yaml#/definitions/uint32
> + default: 0
> + description: |
> + If present, pass ramoops behavioral flags
> + (see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
> +
> + no-dump-oops:
> + deprecated: true

Why would you define deprecated properties in a *new* binding?

> + type: boolean
> + description: |
> + Use max_reason instead. If present, and max_reason is not specified,
> + it is equivalent to max_reason = 1 (KMSG_DUMP_PANIC).
> +
> + unbuffered:
> + deprecated: true
> + type: boolean
> + description: |
> + Use mem_type instead. If present, and mem_type is not specified,
> + it is equivalent to mem_type = 1 and uses unbuffered mappings to map
> + the reserved region (defaults to buffered mappings mem_type = 0).
> + If both are specified -- "mem_type" overrides "unbuffered".
> +
> +unevaluatedProperties: false
> +
> +required:
> + - compatible
> + - memory-region
> +
> +anyOf:
> + - required: [record-size]
> + - required: [console-size]
> + - required: [ftrace-size]
> + - required: [pmsg-size]
> +
> +examples:
> + - |
> +
> + qcom_ramoops {
> + compatible = "qcom,sm8450-ramoops", "qcom,ramoops";
> + memory-region = <&qcom_ramoops_region>;
> + console-size = <0x8000>; /* 32kB */
> + record-size = <0x400>; /* 1kB */
> + ecc-size = <16>;
> + };
> --
> 2.7.4
>

2023-06-28 16:13:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Wed, Jun 28, 2023 at 06:04:27PM +0530, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that System on Chip (SoC) or subsystem part of
> SoC crashes, due to a range of hardware and software bugs. Hence, the
> ability to collect accurate data is only a best-effort. The data collected
> could be invalid or corrupted, data collection itself could fail, and so on.
>
> Qualcomm devices in engineering mode provides a mechanism for generating
> full system ramdumps for post mortem debugging. But in some cases it's
> however not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
>
> Minidump kernel driver implementation is divided into two parts for
> simplicity, one is minidump core which can also be called minidump
> frontend(As API gets exported from this driver for registration with
> backend) and the other part is minidump backend i.e, where the underlying
> implementation of minidump will be there. There could be different way
> how the backend is implemented like Shared memory, Memory mapped IO
> or Resource manager(gunyah) based where the guest region information is
> passed to hypervisor via hypercalls.
>
> Minidump Client-1 Client-2 Client-5 Client-n
> | | | |
> | | ... | ... |
> | | | |
> | | | |
> | | | |
> | | | |
> | | | |
> | | | |
> | +---+--------------+----+ |
> +-----------+ qcom_minidump(core) +--------+
> | |
> +------+-----+------+---+
> | | |
> | | |
> +---------------+ | +--------------------+
> | | |
> | | |
> | | |
> v v v
> +-------------------+ +-------------------+ +------------------+
> |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
> | | | | | |
> +-------------------+ +-------------------+ +------------------+
> Shared memory Memory mapped IO Resource manager
> (backend) (backend) (backend)
>
>
> Here, we will be giving all analogy of backend with SMEM as it is the
> only implemented backend at present but general idea remains the same.

If you only have one "backend" then you don't need the extra compexity
here at all, just remove that whole middle layer please and make this
much simpler and smaller and easier to review and possibly accept.

We don't add layers when they are not needed, and never when there is no
actual user. If you need the extra "complexity" later, then add it
later when it is needed as who knows when that will ever be.

Please redo this series based on that, thanks.

greg k-h

2023-06-28 16:31:28

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support



On 6/28/2023 9:15 PM, Greg KH wrote:
> On Wed, Jun 28, 2023 at 06:04:27PM +0530, Mukesh Ojha wrote:
>> Minidump is a best effort mechanism to collect useful and predefined data
>> for first level of debugging on end user devices running on Qualcomm SoCs.
>> It is built on the premise that System on Chip (SoC) or subsystem part of
>> SoC crashes, due to a range of hardware and software bugs. Hence, the
>> ability to collect accurate data is only a best-effort. The data collected
>> could be invalid or corrupted, data collection itself could fail, and so on.
>>
>> Qualcomm devices in engineering mode provides a mechanism for generating
>> full system ramdumps for post mortem debugging. But in some cases it's
>> however not feasible to capture the entire content of RAM. The minidump
>> mechanism provides the means for selecting which snippets should be
>> included in the ramdump.
>>
>> Minidump kernel driver implementation is divided into two parts for
>> simplicity, one is minidump core which can also be called minidump
>> frontend(As API gets exported from this driver for registration with
>> backend) and the other part is minidump backend i.e, where the underlying
>> implementation of minidump will be there. There could be different way
>> how the backend is implemented like Shared memory, Memory mapped IO
>> or Resource manager(gunyah) based where the guest region information is
>> passed to hypervisor via hypercalls.
>>
>> Minidump Client-1 Client-2 Client-5 Client-n
>> | | | |
>> | | ... | ... |
>> | | | |
>> | | | |
>> | | | |
>> | | | |
>> | | | |
>> | | | |
>> | +---+--------------+----+ |
>> +-----------+ qcom_minidump(core) +--------+
>> | |
>> +------+-----+------+---+
>> | | |
>> | | |
>> +---------------+ | +--------------------+
>> | | |
>> | | |
>> | | |
>> v v v
>> +-------------------+ +-------------------+ +------------------+
>> |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
>> | | | | | |
>> +-------------------+ +-------------------+ +------------------+
>> Shared memory Memory mapped IO Resource manager
>> (backend) (backend) (backend)
>>
>>
>> Here, we will be giving all analogy of backend with SMEM as it is the
>> only implemented backend at present but general idea remains the same.
>
> If you only have one "backend" then you don't need the extra compexity
> here at all, just remove that whole middle layer please and make this
> much simpler and smaller and easier to review and possibly accept.
>
> We don't add layers when they are not needed, and never when there is no
> actual user. If you need the extra "complexity" later, then add it
> later when it is needed as who knows when that will ever be.
>
> Please redo this series based on that, thanks.

I already followed without this middle layer till v3 since without
the middle layer it will be end up with lot of code duplication if there
is another backend.

We already have other backend implementation in the downstream, if you
want to see them, i will try to post them in upcoming series..

-Mukesh

>
> greg k-h

2023-06-28 17:26:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Wed, Jun 28, 2023 at 09:50:00PM +0530, Mukesh Ojha wrote:
>
>
> On 6/28/2023 9:15 PM, Greg KH wrote:
> > On Wed, Jun 28, 2023 at 06:04:27PM +0530, Mukesh Ojha wrote:
> > > Minidump is a best effort mechanism to collect useful and predefined data
> > > for first level of debugging on end user devices running on Qualcomm SoCs.
> > > It is built on the premise that System on Chip (SoC) or subsystem part of
> > > SoC crashes, due to a range of hardware and software bugs. Hence, the
> > > ability to collect accurate data is only a best-effort. The data collected
> > > could be invalid or corrupted, data collection itself could fail, and so on.
> > >
> > > Qualcomm devices in engineering mode provides a mechanism for generating
> > > full system ramdumps for post mortem debugging. But in some cases it's
> > > however not feasible to capture the entire content of RAM. The minidump
> > > mechanism provides the means for selecting which snippets should be
> > > included in the ramdump.
> > >
> > > Minidump kernel driver implementation is divided into two parts for
> > > simplicity, one is minidump core which can also be called minidump
> > > frontend(As API gets exported from this driver for registration with
> > > backend) and the other part is minidump backend i.e, where the underlying
> > > implementation of minidump will be there. There could be different way
> > > how the backend is implemented like Shared memory, Memory mapped IO
> > > or Resource manager(gunyah) based where the guest region information is
> > > passed to hypervisor via hypercalls.
> > >
> > > Minidump Client-1 Client-2 Client-5 Client-n
> > > | | | |
> > > | | ... | ... |
> > > | | | |
> > > | | | |
> > > | | | |
> > > | | | |
> > > | | | |
> > > | | | |
> > > | +---+--------------+----+ |
> > > +-----------+ qcom_minidump(core) +--------+
> > > | |
> > > +------+-----+------+---+
> > > | | |
> > > | | |
> > > +---------------+ | +--------------------+
> > > | | |
> > > | | |
> > > | | |
> > > v v v
> > > +-------------------+ +-------------------+ +------------------+
> > > |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
> > > | | | | | |
> > > +-------------------+ +-------------------+ +------------------+
> > > Shared memory Memory mapped IO Resource manager
> > > (backend) (backend) (backend)
> > >
> > >
> > > Here, we will be giving all analogy of backend with SMEM as it is the
> > > only implemented backend at present but general idea remains the same.
> >
> > If you only have one "backend" then you don't need the extra compexity
> > here at all, just remove that whole middle layer please and make this
> > much simpler and smaller and easier to review and possibly accept.
> >
> > We don't add layers when they are not needed, and never when there is no
> > actual user. If you need the extra "complexity" later, then add it
> > later when it is needed as who knows when that will ever be.
> >
> > Please redo this series based on that, thanks.
>
> I already followed without this middle layer till v3 since without
> the middle layer it will be end up with lot of code duplication if there
> is another backend.

But as this series does not have such a thing, only add it when needed
please. Don't make us review a whole bunch of stuff that is not
actually used here.

Would you want to review such a thing?

> We already have other backend implementation in the downstream, if you
> want to see them, i will try to post them in upcoming series..

Ok, so if you already have it, yes, post it as part of the series,
otherwise such a layer makes no sense.

thanks,

greg k-h

2023-06-28 23:14:51

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 10/21] soc: qcom: Add qcom's pstore minidump driver support

On Wed, Jun 28, 2023 at 6:37 AM Mukesh Ojha <[email protected]> wrote:
>
> This driver was inspired from the fact pstore ram region should be
> fixed and boot firmware need to have awarness about this region,
> so that it will be persistent across boot. But, there are many
> QCOM SoC which does not support warm boot from hardware but they
> have minidump support from the software, and for them, there is
> no need of this pstore ram region to be fixed, but at the same
> time have interest in the pstore frontends data. So, this driver
> get the dynamic reserved region from the ram and register the
> ramoops platform device.
>
> +---------+ +---------+ +--------+ +---------+
> | console | | pmsg | | ftrace | | dmesg |
> +---------+ +---------+ +--------+ +---------+
> | | | |
> | | | |
> +------------------------------------------+
> |
> \ /
> +----------------+
> (1) |pstore frontends|
> +----------------+
> |
> \ /
> +------------------- +
> (2) | pstore backend(ram)|
> +--------------------+
> |
> \ /
> +--------------------+
> (3) |qcom_pstore_minidump|
> +--------------------+
> |
> \ /
> +---------------+
> (4) | qcom_minidump |
> +---------------+
>
> This driver will route all the pstore front data to the stored
> in qcom pstore reserved region and the reason of showing an
> arrow from (3) to (4) as qcom_pstore_minidump driver will register
> all the available frontends region with qcom minidump driver
> in upcoming patch.
>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> drivers/soc/qcom/Kconfig | 12 +++++
> drivers/soc/qcom/Makefile | 1 +
> drivers/soc/qcom/qcom_pstore_minidump.c | 85 +++++++++++++++++++++++++++++++++

drivers/soc/ is the dumping ground for things with no other place. As
this is a pstore driver, it belongs with pstore.

> 3 files changed, 98 insertions(+)
> create mode 100644 drivers/soc/qcom/qcom_pstore_minidump.c
>
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index 1834213fd652..fbf08e30feda 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -306,4 +306,16 @@ config QCOM_MINIDUMP_SMEM
>
> This config should be enabled if the low level minidump is implemented
> as part of SMEM.
> +
> +config QCOM_PSTORE_MINIDUMP
> + tristate "Pstore support for QCOM Minidump"
> + depends on ARCH_QCOM
> + depends on PSTORE_RAM
> + depends on QCOM_MINIDUMP
> + help
> + Enablement of this driver ensures that ramoops region can be anywhere
> + reserved in ram instead of being fixed address which needs boot firmware
> + awareness. So, this driver creates plaform device and registers available
> + frontend region with the Qualcomm's minidump driver.
> +
> endmenu
> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
> index 737d868757ac..1ab59c1b364d 100644
> --- a/drivers/soc/qcom/Makefile
> +++ b/drivers/soc/qcom/Makefile
> @@ -36,3 +36,4 @@ qcom_ice-objs += ice.o
> obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
> obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
> obj-$(CONFIG_QCOM_MINIDUMP_SMEM) += qcom_minidump_smem.o
> +obj-$(CONFIG_QCOM_PSTORE_MINIDUMP) += qcom_pstore_minidump.o
> diff --git a/drivers/soc/qcom/qcom_pstore_minidump.c b/drivers/soc/qcom/qcom_pstore_minidump.c
> new file mode 100644
> index 000000000000..b07cd10340df
> --- /dev/null
> +++ b/drivers/soc/qcom/qcom_pstore_minidump.c
> @@ -0,0 +1,85 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +/*
> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/of_device.h>

You probably don't need this include. Use the actual includes you need
and don't rely on implicit includes (because I'm trying to remove
those).

> +#include <linux/of_reserved_mem.h>
> +#include <linux/platform_device.h>
> +#include <linux/pstore_ram.h>
> +
> +struct qcom_ramoops_dd {
> + struct ramoops_platform_data qcom_ramoops_pdata;
> + struct platform_device *ramoops_pdev;
> +};
> +
> +static int qcom_ramoops_probe(struct platform_device *pdev)
> +{
> + struct device_node *of_node = pdev->dev.of_node;
> + struct qcom_ramoops_dd *qcom_rdd;
> + struct ramoops_platform_data *pdata;
> + struct reserved_mem *rmem;
> + struct device_node *node;
> + long ret;
> +
> + node = of_parse_phandle(of_node, "memory-region", 0);
> + if (!node)
> + return -ENODEV;
> +
> + rmem = of_reserved_mem_lookup(node);
> + of_node_put(node);
> + if (!rmem) {
> + dev_err(&pdev->dev, "failed to locate DT /reserved-memory resource\n");
> + return -EINVAL;
> + }
> +
> + qcom_rdd = devm_kzalloc(&pdev->dev, sizeof(*qcom_rdd), GFP_KERNEL);
> + if (!qcom_rdd)
> + return -ENOMEM;
> +
> + pdata = &qcom_rdd->qcom_ramoops_pdata;
> + pdata->mem_size = rmem->size;
> + pdata->mem_address = rmem->base;
> + ramoops_parse_dt(pdev, pdata);
> +
> + qcom_rdd->ramoops_pdev = platform_device_register_data(NULL, "ramoops", -1,
> + pdata, sizeof(*pdata));
> + if (IS_ERR(qcom_rdd->ramoops_pdev)) {
> + ret = PTR_ERR(qcom_rdd->ramoops_pdev);
> + dev_err(&pdev->dev, "could not create platform device: %ld\n", ret);
> + qcom_rdd->ramoops_pdev = NULL;
> + }
> + platform_set_drvdata(pdev, qcom_rdd);
> +
> + return ret;
> +}
> +
> +static void qcom_ramoops_remove(struct platform_device *pdev)
> +{
> + struct qcom_ramoops_dd *qcom_rdd = platform_get_drvdata(pdev);
> +
> + platform_device_unregister(qcom_rdd->ramoops_pdev);
> + qcom_rdd->ramoops_pdev = NULL;
> +}
> +
> +static const struct of_device_id qcom_ramoops_of_match[] = {
> + { .compatible = "qcom,ramoops"},
> + {}
> +};
> +MODULE_DEVICE_TABLE(of, qcom_ramoops_of_match);
> +
> +static struct platform_driver qcom_ramoops_drv = {
> + .driver = {
> + .name = "qcom,ramoops",
> + .of_match_table = qcom_ramoops_of_match,
> + },
> + .probe = qcom_ramoops_probe,
> + .remove_new = qcom_ramoops_remove,
> +};
> +
> +module_platform_driver(qcom_ramoops_drv);
> +
> +MODULE_DESCRIPTION("Qualcomm ramoops minidump driver");
> +MODULE_LICENSE("GPL");
> --
> 2.7.4
>

2023-06-28 23:25:27

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On Wed, Jun 28, 2023 at 8:11 AM Pavan Kondeti <[email protected]> wrote:
>
> On Wed, Jun 28, 2023 at 06:04:35PM +0530, Mukesh Ojha wrote:
> > Qualcomm ramoops minidump logger provide a means of storing
> > the ramoops data to some dynamically reserved memory instead
> > of traditionally implemented ramoops where the region should
> > be statically fixed ram region. Its device tree binding
> > would be exactly same as ramoops device tree binding and is
> > going to contain traditional ramoops frontend data and this
> > content will be collected via Qualcomm minidump infrastructure
> > provided from the boot firmware.
> >
> > Signed-off-by: Mukesh Ojha <[email protected]>
> > ---
> > .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++++++++++++++++++
> > 1 file changed, 126 insertions(+)
> > create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> > new file mode 100644
> > index 000000000000..b1fdcf3f8ad4
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> > @@ -0,0 +1,126 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: "http://devicetree.org/schemas/soc/qcom/qcom,ramoops.yaml#"
> > +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> > +
> > +title: Qualcomm Ramoops minidump logger
> > +
> > +description: |
> > + Qualcomm ramoops minidump logger provide a means of storing the ramoops
> > + data to some dynamically reserved memory instead of traditionally
> > + implemented ramoops where the region should be statically fixed ram
> > + region. Because of its similarity with ramoops it will also have same
> > + set of property what ramoops have it in its schema and is going to
> > + contain traditional ramoops frontend data and this region will be
> > + collected via Qualcomm minidump infrastructure provided from the
> > + boot firmware.
> > +
> > +maintainers:
> > + - Mukesh Ojha <[email protected]>
> > +
> > +properties:
> > + compatible:
> > + items:
> > + - enum:
> > + - qcom,sm8450-ramoops
> > + - const: qcom,ramoops
> > +
> > + memory-region:
> > + maxItems: 1
> > + description: handle to memory reservation for qcom,ramoops region.
> > +
> > + ecc-size:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description: enables ECC support and specifies ECC buffer size in bytes
> > + default: 0 # no ECC
> > +
> > + record-size:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description: maximum size in bytes of each kmsg dump
> > + default: 0
> > +
> > + console-size:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description: size in bytes of log buffer reserved for kernel messages
> > + default: 0
> > +
> > + ftrace-size:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description: size in bytes of log buffer reserved for function tracing and profiling
> > + default: 0
> > +
> > + pmsg-size:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description: size in bytes of log buffer reserved for userspace messages
> > + default: 0
> > +
> > + mem-type:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + description: if present, sets the type of mapping is to be used to map the reserved region.
> > + default: 0
> > + oneOf:
> > + - const: 0
> > + description: write-combined
> > + - const: 1
> > + description: unbuffered
> > + - const: 2
> > + description: cached
> > +
> > + max-reason:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + default: 2 # log oopses and panics
> > + maximum: 0x7fffffff
> > + description: |
> > + If present, sets maximum type of kmsg dump reasons to store.
> > + This can be set to INT_MAX to store all kmsg dumps.
> > + See include/linux/kmsg_dump.h KMSG_DUMP_* for other kmsg dump reason values.
> > + Setting this to 0 (KMSG_DUMP_UNDEF), means the reason filtering will be
> > + controlled by the printk.always_kmsg_dump boot param.
> > + If unset, it will be 2 (KMSG_DUMP_OOPS), otherwise 5 (KMSG_DUMP_MAX).
> > +
> > + flags:
> > + $ref: /schemas/types.yaml#/definitions/uint32
> > + default: 0
> > + description: |
> > + If present, pass ramoops behavioral flags
> > + (see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
> > +
> > + no-dump-oops:
> > + deprecated: true
> > + type: boolean
> > + description: |
> > + Use max_reason instead. If present, and max_reason is not specified,
> > + it is equivalent to max_reason = 1 (KMSG_DUMP_PANIC).
> > +
> > + unbuffered:
> > + deprecated: true
> > + type: boolean
> > + description: |
> > + Use mem_type instead. If present, and mem_type is not specified,
> > + it is equivalent to mem_type = 1 and uses unbuffered mappings to map
> > + the reserved region (defaults to buffered mappings mem_type = 0).
> > + If both are specified -- "mem_type" overrides "unbuffered".
> > +
>
> Most of the properties you added here are already documented at
> Documentation/devicetree/bindings/reserved-memory/ramoops.yaml

That is certainly a problem. Don't define the same property more than
once. Not yet checked and enforced by the tools, but it will be.

> Can't we just reference them here? would something like work?
>
> max-reason:
> $ref: "../../reserved-memory/ramoops.yaml#/properties/max-reason

Can work, but no. Common properties need to go into a schema of common
properties which the device specific schemas reference.

>
> > +unevaluatedProperties: false
> > +
>
> will there be any additional properties be added dynamically? if not,
> should not we use "additionalProperties: false" here?

I don't know what you mean by dynamically, but that's not the criteria
for which to use.

Rob

2023-06-28 23:25:29

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Wed, Jun 28, 2023 at 9:45 AM Greg KH <[email protected]> wrote:
>
> On Wed, Jun 28, 2023 at 06:04:27PM +0530, Mukesh Ojha wrote:
> > Minidump is a best effort mechanism to collect useful and predefined data
> > for first level of debugging on end user devices running on Qualcomm SoCs.
> > It is built on the premise that System on Chip (SoC) or subsystem part of
> > SoC crashes, due to a range of hardware and software bugs. Hence, the
> > ability to collect accurate data is only a best-effort. The data collected
> > could be invalid or corrupted, data collection itself could fail, and so on.
> >
> > Qualcomm devices in engineering mode provides a mechanism for generating
> > full system ramdumps for post mortem debugging. But in some cases it's
> > however not feasible to capture the entire content of RAM. The minidump
> > mechanism provides the means for selecting which snippets should be
> > included in the ramdump.
> >
> > Minidump kernel driver implementation is divided into two parts for
> > simplicity, one is minidump core which can also be called minidump
> > frontend(As API gets exported from this driver for registration with
> > backend) and the other part is minidump backend i.e, where the underlying
> > implementation of minidump will be there. There could be different way
> > how the backend is implemented like Shared memory, Memory mapped IO
> > or Resource manager(gunyah) based where the guest region information is
> > passed to hypervisor via hypercalls.
> >
> > Minidump Client-1 Client-2 Client-5 Client-n
> > | | | |
> > | | ... | ... |
> > | | | |
> > | | | |
> > | | | |
> > | | | |
> > | | | |
> > | | | |
> > | +---+--------------+----+ |
> > +-----------+ qcom_minidump(core) +--------+
> > | |
> > +------+-----+------+---+
> > | | |
> > | | |
> > +---------------+ | +--------------------+
> > | | |
> > | | |
> > | | |
> > v v v
> > +-------------------+ +-------------------+ +------------------+
> > |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
> > | | | | | |
> > +-------------------+ +-------------------+ +------------------+
> > Shared memory Memory mapped IO Resource manager
> > (backend) (backend) (backend)
> >
> >
> > Here, we will be giving all analogy of backend with SMEM as it is the
> > only implemented backend at present but general idea remains the same.
>
> If you only have one "backend" then you don't need the extra compexity
> here at all, just remove that whole middle layer please and make this
> much simpler and smaller and easier to review and possibly accept.

pstore already supports backends. Why aren't the above backends just
pstore backends rather than having an intermediate pstore backend in
RAM which then somehow gets moved into these minidump backends.

> We don't add layers when they are not needed, and never when there is no
> actual user. If you need the extra "complexity" later, then add it
> later when it is needed as who knows when that will ever be.
>
> Please redo this series based on that, thanks.

My bigger issue with this whole series is what would this all look
like if every SoC vendor upstreamed their own custom dumping
mechanism. That would be a mess. (I have similar opinions on the
$soc-vendor hypervisors.)

Rob

2023-06-28 23:44:13

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 01/21] docs: qcom: Add qualcomm minidump guide

On Wed, Jun 28, 2023 at 6:36 AM Mukesh Ojha <[email protected]> wrote:
>
> Add the qualcomm minidump guide for the users which
> tries to cover the dependency and the way to test
> and collect minidump on Qualcomm supported platforms.
>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> Documentation/admin-guide/index.rst | 1 +
> Documentation/admin-guide/qcom_minidump.rst | 293 ++++++++++++++++++++++++++++
> 2 files changed, 294 insertions(+)
> create mode 100644 Documentation/admin-guide/qcom_minidump.rst
>
> diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> index 43ea35613dfc..251d070486c2 100644
> --- a/Documentation/admin-guide/index.rst
> +++ b/Documentation/admin-guide/index.rst
> @@ -120,6 +120,7 @@ configure specific aspects of kernel behavior to your liking.
> perf-security
> pm/index
> pnp
> + qcom_minidump
> rapidio
> ras
> rtc
> diff --git a/Documentation/admin-guide/qcom_minidump.rst b/Documentation/admin-guide/qcom_minidump.rst
> new file mode 100644
> index 000000000000..a3a8cfee4555
> --- /dev/null
> +++ b/Documentation/admin-guide/qcom_minidump.rst
> @@ -0,0 +1,293 @@
> +Qualcomm Minidump Feature
> +=========================
> +
> +Introduction
> +------------
> +
> +Minidump is a best effort mechanism to collect useful and predefined
> +data for first level of debugging on end user devices running on
> +Qualcomm SoCs. It is built on the premise that System on Chip (SoC)
> +or subsystem part of SoC crashes, due to a range of hardware and
> +software bugs. Hence, the ability to collect accurate data is only
> +a best-effort. The data collected could be invalid or corrupted, data
> +collection itself could fail, and so on.
> +
> +Qualcomm devices in engineering mode provides a mechanism for generating
> +full system RAM dumps for post-mortem debugging. But in some cases it's
> +however not feasible to capture the entire content of RAM. The minidump
> +mechanism provides the means for selecting region should be included in
> +the ramdump.
> +
> +::
> +
> + +-----------------------------------------------+
> + | DDR +-------------+ |
> + | | SS0-ToC| |
> + | +----------------+ +----------------+ | |
> + | |Shared memory | | SS1-ToC| | |
> + | |(SMEM) | | | | |
> + | | | +-->|--------+ | | |
> + | |G-ToC | | | SS-ToC \ | | |
> + | |+-------------+ | | | +-----------+ | | |
> + | ||-------------| | | | |-----------| | | |
> + | || SS0-ToC | | | +-|<|SS1 region1| | | |
> + | ||-------------| | | | | |-----------| | | |
> + | || SS1-ToC |-|>+ | | |SS1 region2| | | |
> + | ||-------------| | | | |-----------| | | |
> + | || SS2-ToC | | | | | ... | | | |
> + | ||-------------| | | | |-----------| | | |
> + | || ... | | |-|<|SS1 regionN| | | |
> + | ||-------------| | | | |-----------| | | |
> + | || SSn-ToC | | | | +-----------+ | | |
> + | |+-------------+ | | | | | |
> + | | | | |----------------| | |
> + | | | +>| regionN | | |
> + | | | | |----------------| | |
> + | +----------------+ | | | | |
> + | | |----------------| | |
> + | +>| region1 | | |
> + | |----------------| | |
> + | | | | |
> + | |----------------|-+ |
> + | | region5 | |
> + | |----------------| |
> + | | | |
> + | Region information +----------------+ |
> + | +---------------+ |
> + | |region name | |
> + | |---------------| |
> + | |region address | |
> + | |---------------| |
> + | |region size | |
> + | +---------------+ |
> + +-----------------------------------------------+
> + G-ToC: Global table of contents
> + SS-ToC: Subsystem table of contents
> + SS0-SSn: Subsystem numbered from 0 to n
> +
> +It depends on targets how the underlying hardware taking care of the
> +implementation part for minidump like above diagram is for shared
> +memory and it is possible that this could be implemented via memory
> +mapped regions but the general idea remain same.
> +
> +In this document, SMEM will be used as the backend implementation of
> +minidump.
> +
> +SMEM as backend
> +----------------
> +
> +The core of minidump feature is part of Qualcomm's boot firmware code.
> +It initializes shared memory (SMEM), which is a part of DDR and
> +allocates a small section of it to minidump table, i.e. also called
> +global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...) has
> +its own table of segments to be included in the minidump, all
> +references from a descriptor in SMEM (G-ToC). Each segment/region has
> +some details like name, physical address and its size etc. and it
> +could be anywhere scattered in the DDR.
> +
> +Minidump kernel driver concept
> +------------------------------
> +::
> +
> + Minidump Client-1 Client-2 Client-5 Client-n
> + | | | |
> + | | ... | ... |
> + | | | |
> + | | | |
> + | | | |
> + | | | |
> + | | | |
> + | | | |
> + | +---+--------------+----+ |
> + +-----------+ qcom_minidump(core) +--------+
> + | |
> + +------+-----+------+---+
> + | | |
> + | | |
> + +---------------+ | +--------------------+
> + | | |
> + | | |
> + | | |
> + v v v
> + +-------------------+ +-------------------+ +------------------+
> + |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
> + | | | | | |
> + +-------------------+ +-------------------+ +------------------+
> + Shared memory Memory mapped IO Resource manager
> + (backend) (backend) (backend)
> +
> +
> +Kernel implementation of minidump driver is divided into two parts one is,
> +the core implementation called frontend driver ``qcom_minidump.c`` and this
> +is the driver will be exposing the API for clients and the other part is,
> +backend driver and its depends whether it is based on SMEM, MMIO or some
> +other way corressponding driver will be hooking itself up with the core
> +driver to get itself working. As of now, at a time one and only one backend
> +can be attached to the front-end either it is HOST or a guest VM.
> +
> +Qualcomm minidump kernel driver adds the capability to add Linux region
> +to be dumped as part of RAM dump collection. At the moment, shared memory
> +driver creates platform device for minidump driver and give a means to
> +APSS minidump to initialize itself on probe.
> +
> +This driver provides ``qcom_minidump_region_register`` and
> +``qcom_minidump_region_unregister`` API's to register and unregister
> +APSS minidump region. It also gives a mechanism to update physical/virtual
> +address for the client whose addresses keeps on changing, e.g., current stack
> +address of task keeps on changing on context switch for each core. So these
> +clients can update their addresses with ``qcom_minidump_update_region``
> +API.
> +
> +The driver also supports registration for the clients who came before
> +minidump driver was initialized. It maintains pending list of clients
> +who came before minidump and once minidump is initialized it registers
> +them in one go.
> +
> +To simplify post-mortem debugging, driver creates and maintain an ELF
> +header as first region that gets updated each time a new region gets
> +registered.
> +
> +The solution supports extracting the RAM dump/minidump produced either
> +over USB or stored to an attached storage device.
> +
> +Dependency of minidump kernel driver
> +------------------------------------
> +
> +It is to note that whole of minidump depends on Qualcomm boot
> +firmware whether it supports minidump or not. So, if the minidump
> +SMEM ID is present in shared memory, it indicates that minidump
> +is supported from boot firmware and it is possible to dump Linux
> +(APSS) region as part of minidump collection.
> +
> +How a kernel client driver can register region with minidump
> +------------------------------------------------------------
> +
> +Client driver can use ``qcom_minidump_region_register`` API's to
> +register and ``qcom_minidump_region_unregister`` to unregister
> +their region from minidump driver.
> +
> +Client needs to fill their region by filling ``qcom_minidump_region``
> +structure object which consists of the region name, region's
> +virtual and physical address and its size.
> +
> +Below is one sample client driver snippet which tries to allocate
> +a region from kernel heap of certain size and it writes a certain
> +known pattern (that can help in verification after collection
> +that we got the exact pattern, what we wrote) and registers it with
> +minidump.
> +
> + .. code-block:: c
> +
> + #include <soc/qcom/qcom_minidump.h>
> + [...]
> +
> +
> + [... inside a function ...]
> + struct qcom_minidump_region region;
> +
> + [...]
> +
> + client_mem_region = kzalloc(region_size, GFP_KERNEL);
> + if (!client_mem_region)
> + return -ENOMEM;
> +
> + [... Just write a pattern ...]
> + memset(client_mem_region, 0xAB, region_size);
> +
> + [... Fill up the region object ...]
> + strlcpy(region.name, "REGION_A", sizeof(region.name));
> + region.virt_addr = client_mem_region;
> + region.phys_addr = virt_to_phys(client_mem_region);
> + region.size = region_size;
> +
> + ret = qcom_minidump_region_register(&region);
> + if (ret < 0) {
> + pr_err("failed to add region in minidump: err: %d\n", ret);
> + return ret;
> + }
> +
> + [...]
> +
> +
> +Test
> +----
> +
> +Existing Qualcomm devices already supports entire RAM dump (also called
> +full dump) by writing appropriate value to Qualcomm's top control and
> +status register (tcsr) in ``driver/firmware/qcom_scm.c`` .
> +
> +SCM device Tree bindings required to support download mode
> +For example (sm8450) ::
> +
> + / {
> +
> + [...]
> +
> + firmware {
> + scm: scm {
> + compatible = "qcom,scm-sm8450", "qcom,scm";
> + [... tcsr register ... ]
> + qcom,dload-mode = <&tcsr 0x13000>;
> +
> + [...]
> + };
> + };
> +
> + [...]
> +
> + soc: soc@0 {
> +
> + [...]
> +
> + tcsr: syscon@1fc0000 {
> + compatible = "qcom,sm8450-tcsr", "syscon";
> + reg = <0x0 0x1fc0000 0x0 0x30000>;
> + };
> +
> + [...]
> + };
> + [...]
> +
> + };
> +
> +User of minidump can pass ``qcom_scm.download_mode="mini"`` to kernel
> +commandline to set the current download mode to minidump.
> +Similarly, ``"full"`` is passed to set the download mode to full dump
> +where entire RAM dump will be collected while setting it ``"full,mini"``
> +will collect minidump along with fulldump.
> +
> +Writing to sysfs node can also be used to set the mode to minidump::
> +
> + echo "mini" > /sys/module/qcom_scm/parameter/download_mode
> +
> +Once the download mode is set, any kind of crash will make the device collect
> +respective dump as per set download mode.
> +
> +Dump collection
> +---------------
> +
> +The solution supports extracting the minidump produced either over USB or
> +stored to an attached storage device.
> +
> +By default, dumps are downloaded via USB to the attached x86_64 machine
> +running PCAT (Qualcomm tool) software. Upon download, we will see
> +a set of binary blobs starting with name ``md_*`` in PCAT configured directory
> +in x86_64 machine, so for above example from the client it will be

So I can't use my QCom laptop or M1 MacBook? This text won't age well,
so perhaps reword it.

Rob

2023-06-29 02:18:17

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On Wed, Jun 28, 2023 at 05:17:13PM -0600, Rob Herring wrote:
> On Wed, Jun 28, 2023 at 8:11 AM Pavan Kondeti <[email protected]> wrote:
> >
> > On Wed, Jun 28, 2023 at 06:04:35PM +0530, Mukesh Ojha wrote:
> > > Qualcomm ramoops minidump logger provide a means of storing
> > > the ramoops data to some dynamically reserved memory instead
> > > of traditionally implemented ramoops where the region should
> > > be statically fixed ram region. Its device tree binding
> > > would be exactly same as ramoops device tree binding and is
> > > going to contain traditional ramoops frontend data and this
> > > content will be collected via Qualcomm minidump infrastructure
> > > provided from the boot firmware.
> > >
> > > Signed-off-by: Mukesh Ojha <[email protected]>
> > > ---
> > > .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++++++++++++++++++
> > > 1 file changed, 126 insertions(+)
> > > create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> > >
> > > diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> > > new file mode 100644
> > > index 000000000000..b1fdcf3f8ad4
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> > > @@ -0,0 +1,126 @@
> > > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > > +%YAML 1.2
> > > +---
> > > +$id: "http://devicetree.org/schemas/soc/qcom/qcom,ramoops.yaml#"
> > > +$schema: "http://devicetree.org/meta-schemas/core.yaml#"
> > > +
> > > +title: Qualcomm Ramoops minidump logger
> > > +
> > > +description: |
> > > + Qualcomm ramoops minidump logger provide a means of storing the ramoops
> > > + data to some dynamically reserved memory instead of traditionally
> > > + implemented ramoops where the region should be statically fixed ram
> > > + region. Because of its similarity with ramoops it will also have same
> > > + set of property what ramoops have it in its schema and is going to
> > > + contain traditional ramoops frontend data and this region will be
> > > + collected via Qualcomm minidump infrastructure provided from the
> > > + boot firmware.
> > > +
> > > +maintainers:
> > > + - Mukesh Ojha <[email protected]>
> > > +
> > > +properties:
> > > + compatible:
> > > + items:
> > > + - enum:
> > > + - qcom,sm8450-ramoops
> > > + - const: qcom,ramoops
> > > +
> > > + memory-region:
> > > + maxItems: 1
> > > + description: handle to memory reservation for qcom,ramoops region.
> > > +
> > > + ecc-size:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + description: enables ECC support and specifies ECC buffer size in bytes
> > > + default: 0 # no ECC
> > > +
> > > + record-size:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + description: maximum size in bytes of each kmsg dump
> > > + default: 0
> > > +
> > > + console-size:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + description: size in bytes of log buffer reserved for kernel messages
> > > + default: 0
> > > +
> > > + ftrace-size:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + description: size in bytes of log buffer reserved for function tracing and profiling
> > > + default: 0
> > > +
> > > + pmsg-size:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + description: size in bytes of log buffer reserved for userspace messages
> > > + default: 0
> > > +
> > > + mem-type:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + description: if present, sets the type of mapping is to be used to map the reserved region.
> > > + default: 0
> > > + oneOf:
> > > + - const: 0
> > > + description: write-combined
> > > + - const: 1
> > > + description: unbuffered
> > > + - const: 2
> > > + description: cached
> > > +
> > > + max-reason:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + default: 2 # log oopses and panics
> > > + maximum: 0x7fffffff
> > > + description: |
> > > + If present, sets maximum type of kmsg dump reasons to store.
> > > + This can be set to INT_MAX to store all kmsg dumps.
> > > + See include/linux/kmsg_dump.h KMSG_DUMP_* for other kmsg dump reason values.
> > > + Setting this to 0 (KMSG_DUMP_UNDEF), means the reason filtering will be
> > > + controlled by the printk.always_kmsg_dump boot param.
> > > + If unset, it will be 2 (KMSG_DUMP_OOPS), otherwise 5 (KMSG_DUMP_MAX).
> > > +
> > > + flags:
> > > + $ref: /schemas/types.yaml#/definitions/uint32
> > > + default: 0
> > > + description: |
> > > + If present, pass ramoops behavioral flags
> > > + (see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
> > > +
> > > + no-dump-oops:
> > > + deprecated: true
> > > + type: boolean
> > > + description: |
> > > + Use max_reason instead. If present, and max_reason is not specified,
> > > + it is equivalent to max_reason = 1 (KMSG_DUMP_PANIC).
> > > +
> > > + unbuffered:
> > > + deprecated: true
> > > + type: boolean
> > > + description: |
> > > + Use mem_type instead. If present, and mem_type is not specified,
> > > + it is equivalent to mem_type = 1 and uses unbuffered mappings to map
> > > + the reserved region (defaults to buffered mappings mem_type = 0).
> > > + If both are specified -- "mem_type" overrides "unbuffered".
> > > +
> >
> > Most of the properties you added here are already documented at
> > Documentation/devicetree/bindings/reserved-memory/ramoops.yaml
>
> That is certainly a problem. Don't define the same property more than
> once. Not yet checked and enforced by the tools, but it will be.
>
> > Can't we just reference them here? would something like work?
> >
> > max-reason:
> > $ref: "../../reserved-memory/ramoops.yaml#/properties/max-reason
>
> Can work, but no. Common properties need to go into a schema of common
> properties which the device specific schemas reference.
>

Thanks for the clarification. We need to define the common properties
and make it available under /schemas/<>.yaml and add a reference to it
in these bindings. Is my understanding correct?

> >
> > > +unevaluatedProperties: false
> > > +
> >
> > will there be any additional properties be added dynamically? if not,
> > should not we use "additionalProperties: false" here?
>
> I don't know what you mean by dynamically, but that's not the criteria
> for which to use.
>
ok, I was wrong in saying dynamically. I should say "are there any other
properties that will be included that are not documented here".

is below my understanding correct?

additionalProperties needs to be set to false if at all this binding
defines all the possible properties (including child nodes) and not
referring to any other schemas.

If that is not the case and this binding references to other schemas,
then unevaluatedProperties could be used since additionalProperties
can't support combining schemas.

Thanks,
Pavan

2023-06-29 03:07:36

by Pavan Kondeti

[permalink] [raw]
Subject: Re: [PATCH v4 07/21] soc: qcom: minidump: Add update region support

On Wed, Jun 28, 2023 at 06:04:34PM +0530, Mukesh Ojha wrote:
> Add support to update client's region physical/virtual addresses,
> which is useful for dynamic loadable modules, dynamic address
> changing clients like if we want to collect current stack
> information for each core and the current stack is changing on
> each sched_switch event, So here virtual/physical address of
> the current stack is changing. So, to cover such use cases
> add the update region support in minidump driver and the
> corresponding smem backend support.
>
> Signed-off-by: Mukesh Ojha <[email protected]>
> ---
> drivers/soc/qcom/qcom_minidump.c | 55 +++++++++++++++++++++++++++++++
> drivers/soc/qcom/qcom_minidump_internal.h | 3 ++
> drivers/soc/qcom/qcom_minidump_smem.c | 21 ++++++++++++
> include/soc/qcom/qcom_minidump.h | 5 +++
> 4 files changed, 84 insertions(+)
>
> diff --git a/drivers/soc/qcom/qcom_minidump.c b/drivers/soc/qcom/qcom_minidump.c
> index cfdb63cc29d6..37d6ceb4d85c 100644
> --- a/drivers/soc/qcom/qcom_minidump.c
> +++ b/drivers/soc/qcom/qcom_minidump.c
> @@ -318,6 +318,61 @@ int qcom_minidump_region_unregister(const struct qcom_minidump_region *region)
> }
> EXPORT_SYMBOL_GPL(qcom_minidump_region_unregister);
>
> +/**
> + * qcom_minidump_update_region() - Update region information with new physical
> + * address and virtual address for already registered region e.g, current task
> + * stack for a core keeps on changing on each context switch, there it needs to
> + * change registered region with new updated addresses.
> + *
> + * @region: Should be already registered minidump region.
> + *
> + * Return: On success, it returns 0 and negative error value on failure.
> + */
> +int qcom_minidump_update_region(const struct qcom_minidump_region *region)
> +{
> + struct minidump_pregion *md_pregion;
> + struct qcom_minidump_region *tmp;
> + struct elfhdr *ehdr;
> + struct elf_shdr *shdr;
> + struct elf_phdr *phdr;
> + int idx;
> + int ret = 0;
> +
> + if (!qcom_minidump_valid_region(region))
> + return -EINVAL;
> +
> + mutex_lock(&md_lock);
> + if (!md) {
> + md_pregion = check_if_pending_region_exist(region);
> + if (!md_pregion) {
> + ret = -ENOENT;
> + goto unlock;
> + }
> + tmp = &md_pregion->region;
> + tmp->phys_addr = region->phys_addr;
> + tmp->virt_addr = region->virt_addr;
> + goto unlock;
> + }
> +
> + ret = md->ops->md_update_region(md, &idx, region);
> + if (ret)
> + goto unlock;
> +
> + /* Skip predefined shdr/phdr header entry at the start */
> + ehdr = md->elf.ehdr;
> + shdr = elf_shdr_entry_addr(ehdr, idx + 4);
> + phdr = elf_phdr_entry_addr(ehdr, idx + 1);
> +
> + shdr->sh_addr = (elf_addr_t)region->virt_addr;
> + phdr->p_vaddr = (elf_addr_t)region->virt_addr;
> + phdr->p_paddr = region->phys_addr;
> +
> +unlock:
> + mutex_unlock(&md_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(qcom_minidump_update_region);
> +

I don't see any use of this API in the series. Do you plan to add one in
the next version?

Thanks,
Pavan

2023-06-29 09:42:45

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 10/21] soc: qcom: Add qcom's pstore minidump driver support



On 6/29/2023 4:27 AM, Rob Herring wrote:
> On Wed, Jun 28, 2023 at 6:37 AM Mukesh Ojha <[email protected]> wrote:
>>
>> This driver was inspired from the fact pstore ram region should be
>> fixed and boot firmware need to have awarness about this region,
>> so that it will be persistent across boot. But, there are many
>> QCOM SoC which does not support warm boot from hardware but they
>> have minidump support from the software, and for them, there is
>> no need of this pstore ram region to be fixed, but at the same
>> time have interest in the pstore frontends data. So, this driver
>> get the dynamic reserved region from the ram and register the
>> ramoops platform device.
>>
>> +---------+ +---------+ +--------+ +---------+
>> | console | | pmsg | | ftrace | | dmesg |
>> +---------+ +---------+ +--------+ +---------+
>> | | | |
>> | | | |
>> +------------------------------------------+
>> |
>> \ /
>> +----------------+
>> (1) |pstore frontends|
>> +----------------+
>> |
>> \ /
>> +------------------- +
>> (2) | pstore backend(ram)|
>> +--------------------+
>> |
>> \ /
>> +--------------------+
>> (3) |qcom_pstore_minidump|
>> +--------------------+
>> |
>> \ /
>> +---------------+
>> (4) | qcom_minidump |
>> +---------------+
>>
>> This driver will route all the pstore front data to the stored
>> in qcom pstore reserved region and the reason of showing an
>> arrow from (3) to (4) as qcom_pstore_minidump driver will register
>> all the available frontends region with qcom minidump driver
>> in upcoming patch.
>>
>> Signed-off-by: Mukesh Ojha <[email protected]>
>> ---
>> drivers/soc/qcom/Kconfig | 12 +++++
>> drivers/soc/qcom/Makefile | 1 +
>> drivers/soc/qcom/qcom_pstore_minidump.c | 85 +++++++++++++++++++++++++++++++++
>
> drivers/soc/ is the dumping ground for things with no other place. As
> this is a pstore driver, it belongs with pstore.

The inspiration of this driver was taken from
drivers/platform/chrome/chromeos_pstore.c, do you think that is misplaced ?

>
>> 3 files changed, 98 insertions(+)
>> create mode 100644 drivers/soc/qcom/qcom_pstore_minidump.c
>>
>> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
>> index 1834213fd652..fbf08e30feda 100644
>> --- a/drivers/soc/qcom/Kconfig
>> +++ b/drivers/soc/qcom/Kconfig
>> @@ -306,4 +306,16 @@ config QCOM_MINIDUMP_SMEM
>>
>> This config should be enabled if the low level minidump is implemented
>> as part of SMEM.
>> +
>> +config QCOM_PSTORE_MINIDUMP
>> + tristate "Pstore support for QCOM Minidump"
>> + depends on ARCH_QCOM
>> + depends on PSTORE_RAM
>> + depends on QCOM_MINIDUMP
>> + help
>> + Enablement of this driver ensures that ramoops region can be anywhere
>> + reserved in ram instead of being fixed address which needs boot firmware
>> + awareness. So, this driver creates plaform device and registers available
>> + frontend region with the Qualcomm's minidump driver.
>> +
>> endmenu
>> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
>> index 737d868757ac..1ab59c1b364d 100644
>> --- a/drivers/soc/qcom/Makefile
>> +++ b/drivers/soc/qcom/Makefile
>> @@ -36,3 +36,4 @@ qcom_ice-objs += ice.o
>> obj-$(CONFIG_QCOM_INLINE_CRYPTO_ENGINE) += qcom_ice.o
>> obj-$(CONFIG_QCOM_MINIDUMP) += qcom_minidump.o
>> obj-$(CONFIG_QCOM_MINIDUMP_SMEM) += qcom_minidump_smem.o
>> +obj-$(CONFIG_QCOM_PSTORE_MINIDUMP) += qcom_pstore_minidump.o
>> diff --git a/drivers/soc/qcom/qcom_pstore_minidump.c b/drivers/soc/qcom/qcom_pstore_minidump.c
>> new file mode 100644
>> index 000000000000..b07cd10340df
>> --- /dev/null
>> +++ b/drivers/soc/qcom/qcom_pstore_minidump.c
>> @@ -0,0 +1,85 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +
>> +/*
>> + * Copyright (c) 2023 Qualcomm Innovation Center, Inc. All rights reserved.
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>
> You probably don't need this include. Use the actual includes you need
> and don't rely on implicit includes (because I'm trying to remove
> those).

Ok, will try to check..

- Mukesh

>
>> +#include <linux/of_reserved_mem.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/pstore_ram.h>
>> +
>> +struct qcom_ramoops_dd {
>> + struct ramoops_platform_data qcom_ramoops_pdata;
>> + struct platform_device *ramoops_pdev;
>> +};
>> +
>> +static int qcom_ramoops_probe(struct platform_device *pdev)
>> +{
>> + struct device_node *of_node = pdev->dev.of_node;
>> + struct qcom_ramoops_dd *qcom_rdd;
>> + struct ramoops_platform_data *pdata;
>> + struct reserved_mem *rmem;
>> + struct device_node *node;
>> + long ret;
>> +
>> + node = of_parse_phandle(of_node, "memory-region", 0);
>> + if (!node)
>> + return -ENODEV;
>> +
>> + rmem = of_reserved_mem_lookup(node);
>> + of_node_put(node);
>> + if (!rmem) {
>> + dev_err(&pdev->dev, "failed to locate DT /reserved-memory resource\n");
>> + return -EINVAL;
>> + }
>> +
>> + qcom_rdd = devm_kzalloc(&pdev->dev, sizeof(*qcom_rdd), GFP_KERNEL);
>> + if (!qcom_rdd)
>> + return -ENOMEM;
>> +
>> + pdata = &qcom_rdd->qcom_ramoops_pdata;
>> + pdata->mem_size = rmem->size;
>> + pdata->mem_address = rmem->base;
>> + ramoops_parse_dt(pdev, pdata);
>> +
>> + qcom_rdd->ramoops_pdev = platform_device_register_data(NULL, "ramoops", -1,
>> + pdata, sizeof(*pdata));
>> + if (IS_ERR(qcom_rdd->ramoops_pdev)) {
>> + ret = PTR_ERR(qcom_rdd->ramoops_pdev);
>> + dev_err(&pdev->dev, "could not create platform device: %ld\n", ret);
>> + qcom_rdd->ramoops_pdev = NULL;
>> + }
>> + platform_set_drvdata(pdev, qcom_rdd);
>> +
>> + return ret;
>> +}
>> +
>> +static void qcom_ramoops_remove(struct platform_device *pdev)
>> +{
>> + struct qcom_ramoops_dd *qcom_rdd = platform_get_drvdata(pdev);
>> +
>> + platform_device_unregister(qcom_rdd->ramoops_pdev);
>> + qcom_rdd->ramoops_pdev = NULL;
>> +}
>> +
>> +static const struct of_device_id qcom_ramoops_of_match[] = {
>> + { .compatible = "qcom,ramoops"},
>> + {}
>> +};
>> +MODULE_DEVICE_TABLE(of, qcom_ramoops_of_match);
>> +
>> +static struct platform_driver qcom_ramoops_drv = {
>> + .driver = {
>> + .name = "qcom,ramoops",
>> + .of_match_table = qcom_ramoops_of_match,
>> + },
>> + .probe = qcom_ramoops_probe,
>> + .remove_new = qcom_ramoops_remove,
>> +};
>> +
>> +module_platform_driver(qcom_ramoops_drv);
>> +
>> +MODULE_DESCRIPTION("Qualcomm ramoops minidump driver");
>> +MODULE_LICENSE("GPL");
>> --
>> 2.7.4
>>

2023-06-30 16:30:17

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support



On 6/29/2023 4:42 AM, Rob Herring wrote:
> On Wed, Jun 28, 2023 at 9:45 AM Greg KH <[email protected]> wrote:
>>
>> On Wed, Jun 28, 2023 at 06:04:27PM +0530, Mukesh Ojha wrote:
>>> Minidump is a best effort mechanism to collect useful and predefined data
>>> for first level of debugging on end user devices running on Qualcomm SoCs.
>>> It is built on the premise that System on Chip (SoC) or subsystem part of
>>> SoC crashes, due to a range of hardware and software bugs. Hence, the
>>> ability to collect accurate data is only a best-effort. The data collected
>>> could be invalid or corrupted, data collection itself could fail, and so on.
>>>
>>> Qualcomm devices in engineering mode provides a mechanism for generating
>>> full system ramdumps for post mortem debugging. But in some cases it's
>>> however not feasible to capture the entire content of RAM. The minidump
>>> mechanism provides the means for selecting which snippets should be
>>> included in the ramdump.
>>>
>>> Minidump kernel driver implementation is divided into two parts for
>>> simplicity, one is minidump core which can also be called minidump
>>> frontend(As API gets exported from this driver for registration with
>>> backend) and the other part is minidump backend i.e, where the underlying
>>> implementation of minidump will be there. There could be different way
>>> how the backend is implemented like Shared memory, Memory mapped IO
>>> or Resource manager(gunyah) based where the guest region information is
>>> passed to hypervisor via hypercalls.
>>>
>>> Minidump Client-1 Client-2 Client-5 Client-n
>>> | | | |
>>> | | ... | ... |
>>> | | | |
>>> | | | |
>>> | | | |
>>> | | | |
>>> | | | |
>>> | | | |
>>> | +---+--------------+----+ |
>>> +-----------+ qcom_minidump(core) +--------+
>>> | |
>>> +------+-----+------+---+
>>> | | |
>>> | | |
>>> +---------------+ | +--------------------+
>>> | | |
>>> | | |
>>> | | |
>>> v v v
>>> +-------------------+ +-------------------+ +------------------+
>>> |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
>>> | | | | | |
>>> +-------------------+ +-------------------+ +------------------+
>>> Shared memory Memory mapped IO Resource manager
>>> (backend) (backend) (backend)
>>>
>>>
>>> Here, we will be giving all analogy of backend with SMEM as it is the
>>> only implemented backend at present but general idea remains the same.
>>
>> If you only have one "backend" then you don't need the extra compexity
>> here at all, just remove that whole middle layer please and make this
>> much simpler and smaller and easier to review and possibly accept.
>
> pstore already supports backends. Why aren't the above backends just
> pstore backends rather than having an intermediate pstore backend in
> RAM which then somehow gets moved into these minidump backends.

It can't be another pstore backend since, pstore backend(ram) is for
the system where there is a guarantees of fixed ram range content
persist across boot, but that is not true with minidump, there is no
pstorefs kind of support with minidump.

Instead, the whole idea of backend/front-end of minidump here, is
entirely related to minidump and here minidump backend could be
anything like could be Shared memory in DDR which is shared across
multiple subsystem (CPUSS) is one of them or it could be something
where you want to collect minidump for a guest vm which guest
does not have access to backend instead it may be the hypervisor
do the register the region for the guest.

Pstore(ram) is one of the clients of minidump where, we want to collect
the console/pmsg/ftrace/dmesg logs for production devices where
DDR dump is difficult and the reason of doing this to not re-invent
the wheel and it just need physical address/size .

+---------+ +---------+ +--------+ +---------+
| console | | pmsg | | ftrace | | dmesg |
+---------+ +---------+ +--------+ +---------+
| | | |
| | | |
+------------------------------------------+
|
\ /
+----------------+
(1) |pstore frontends|
+----------------+
|
\ /
+------------------- +
(2) | pstore backend(ram)|
+--------------------+
|
\ /
+--------------------+
(3) |qcom_pstore_minidump|
+--------------------+

-Mukesh
>
>> We don't add layers when they are not needed, and never when there is no
>> actual user. If you need the extra "complexity" later, then add it
>> later when it is needed as who knows when that will ever be.
>>
>> Please redo this series based on that, thanks.
>
> My bigger issue with this whole series is what would this all look
> like if every SoC vendor upstreamed their own custom dumping
> mechanism. That would be a mess. (I have similar opinions on the
> $soc-vendor hypervisors.)
>
> Rob

2023-07-02 08:55:53

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 28/06/2023 14:34, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that System on Chip (SoC) or subsystem part of
> SoC crashes, due to a range of hardware and software bugs. Hence, the
> ability to collect accurate data is only a best-effort. The data collected
> could be invalid or corrupted, data collection itself could fail, and so on.
>

...hundred of unrelated lines...

> #define PANIC_PRINT_TASK_INFO 0x00000001
> #define PANIC_PRINT_MEM_INFO 0x00000002
> @@ -261,6 +263,7 @@ void panic(const char *fmt, ...)
> int old_cpu, this_cpu;
> bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
>
> + in_panic = true;
> if (panic_on_warn) {
> /*
> * This thread may hit another WARN() in the panic path.
> --------------------------------------------------------------------------
>
> Changes in v4:

Putting changelog at the end of very long cover letter does no help us
to find it.

> - Redesigned the driver and divided the driver into front end and backend (smem) so
> that any new backend can be attached easily to avoid code duplication.
> - Patch reordering as per the driver and subsystem to easier review of the code.
> - Removed minidump specific code from remoteproc to minidump smem based driver.
> - Enabled the all the driver as modules.
> - Address comments made on documentation and yaml and Device tree file [Krzysztof/Konrad]

That's not enough. Your binding changed a lot and I doubt we proposed
such changes. You need to be specific.

> - Address comments made qcom_pstore_minidump driver and given its Device tree
> same set of properties as ramoops. [Luca/Kees]
> - Added patch for MAINTAINER file.
> - Include defconfig change as one patch as per [Krzysztof] suggestion.
> - Tried to remove the redundant file scope variables from the module as per [Krzysztof] suggestion.
> - Addressed comments made on dload mode patch v6 version
> https://lore.kernel.org/lkml/[email protected]/
>
>

Best regards,
Krzysztof


2023-07-02 08:56:22

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On 28/06/2023 17:01, Mukesh Ojha wrote:
>
>
> On 6/28/2023 8:17 PM, Rob Herring wrote:
>> On Wed, Jun 28, 2023 at 6:36 AM Mukesh Ojha <[email protected]> wrote:
>>>
>>> Qualcomm ramoops minidump logger provide a means of storing
>>> the ramoops data to some dynamically reserved memory instead
>>> of traditionally implemented ramoops where the region should
>>> be statically fixed ram region. Its device tree binding
>>> would be exactly same as ramoops device tree binding and is
>>> going to contain traditional ramoops frontend data and this
>>> content will be collected via Qualcomm minidump infrastructure
>>> provided from the boot firmware.
>>
>> The big difference is if firmware is not deciding where this log
>> lives, then it doesn't need to be in DT. How does anything except the
>> kernel that allocates the log find the logs?
>
> Yes, you are correct, firmware is not deciding where the logs lives
> instead here, Kernel has reserved the region where the ramoops region
> lives and later with the minidump registration where, physical
> address/size/virtual address(for parsing) are passed and that is how
> firmware is able to know and dump those region before triggering system
> reset.

Your explanation does not justify storing all this in DT. Kernel can
allocate any memory it wishes, store there logs and pass the address to
the firmware. That's it, no need for DT.

>
> A part of this registration code you can find in 11/21
>
>> I'm pretty sure I already said all this before.
>
> Yes, you said this before but that's the reason i came up with vendor
> ramoops instead of changing traditional ramoops binding.

That's unexpected conclusion. Adding more bindings is not the answer to
comment that it should not be in the DTS in the first place.

Best regards,
Krzysztof


2023-07-02 09:08:41

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 30/06/2023 18:04, Mukesh Ojha wrote:
>>
>>> We don't add layers when they are not needed, and never when there is no
>>> actual user. If you need the extra "complexity" later, then add it
>>> later when it is needed as who knows when that will ever be.
>>>
>>> Please redo this series based on that, thanks.
>>
>> My bigger issue with this whole series is what would this all look
>> like if every SoC vendor upstreamed their own custom dumping
>> mechanism. That would be a mess. (I have similar opinions on the
>> $soc-vendor hypervisors.)

Mukesh,

LPC CFP is still open. There will be also Android and Kernel Debugging
LPC microconference tracks. Coming with a unified solution could be a
great topic for LPC. Solutions targeting only one user are quite often
frowned upon.

Best regards,
Krzysztof


2023-07-03 07:13:23

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding



On 7/2/2023 1:42 PM, Krzysztof Kozlowski wrote:
> On 28/06/2023 17:01, Mukesh Ojha wrote:
>>
>>
>> On 6/28/2023 8:17 PM, Rob Herring wrote:
>>> On Wed, Jun 28, 2023 at 6:36 AM Mukesh Ojha <[email protected]> wrote:
>>>>
>>>> Qualcomm ramoops minidump logger provide a means of storing
>>>> the ramoops data to some dynamically reserved memory instead
>>>> of traditionally implemented ramoops where the region should
>>>> be statically fixed ram region. Its device tree binding
>>>> would be exactly same as ramoops device tree binding and is
>>>> going to contain traditional ramoops frontend data and this
>>>> content will be collected via Qualcomm minidump infrastructure
>>>> provided from the boot firmware.
>>>
>>> The big difference is if firmware is not deciding where this log
>>> lives, then it doesn't need to be in DT. How does anything except the
>>> kernel that allocates the log find the logs?
>>
>> Yes, you are correct, firmware is not deciding where the logs lives
>> instead here, Kernel has reserved the region where the ramoops region
>> lives and later with the minidump registration where, physical
>> address/size/virtual address(for parsing) are passed and that is how
>> firmware is able to know and dump those region before triggering system
>> reset.
>
> Your explanation does not justify storing all this in DT. Kernel can
> allocate any memory it wishes, store there logs and pass the address to
> the firmware. That's it, no need for DT.

If you go through the driver, you will know that what it does, is
just create platform device for actual ramoops driver to probe and to
provide this it needs exact set of parameters of input what original
ramoops DT provides, we need to keep it in DT as maintaining this in
driver will not scale well with different size/parameter size
requirement for different targets.

>
>>
>> A part of this registration code you can find in 11/21
>>
>>> I'm pretty sure I already said all this before.
>>
>> Yes, you said this before but that's the reason i came up with vendor
>> ramoops instead of changing traditional ramoops binding.
>
> That's unexpected conclusion. Adding more bindings is not the answer to
> comment that it should not be in the DTS in the first place.

Please suggest, what is the other way being above text as requirement..

-- Mukesh
>
> Best regards,
> Krzysztof
>

2023-07-03 07:35:46

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On Mon, 3 Jul 2023 at 08:22, Mukesh Ojha <[email protected]> wrote:
> On 7/2/2023 1:42 PM, Krzysztof Kozlowski wrote:
> >>> The big difference is if firmware is not deciding where this log
> >>> lives, then it doesn't need to be in DT. How does anything except the
> >>> kernel that allocates the log find the logs?
> >>
> >> Yes, you are correct, firmware is not deciding where the logs lives
> >> instead here, Kernel has reserved the region where the ramoops region
> >> lives and later with the minidump registration where, physical
> >> address/size/virtual address(for parsing) are passed and that is how
> >> firmware is able to know and dump those region before triggering system
> >> reset.
> >
> > Your explanation does not justify storing all this in DT. Kernel can
> > allocate any memory it wishes, store there logs and pass the address to
> > the firmware. That's it, no need for DT.
>
> If you go through the driver, you will know that what it does, is

We talk about bindings and I should not be forced to look at the
driver to be able to understand them. Bindings should stand on their
own.

> just create platform device for actual ramoops driver to probe and to

Not really justification for Devicetree anyway. Whatever your driver
is doing, is driver's business, not bindings.

> provide this it needs exact set of parameters of input what original
> ramoops DT provides, we need to keep it in DT as maintaining this in
> driver will not scale well with different size/parameter size
> requirement for different targets.

Really? Why? I don't see a problem in scaling. At all.

>
> >
> >>
> >> A part of this registration code you can find in 11/21
> >>
> >>> I'm pretty sure I already said all this before.
> >>
> >> Yes, you said this before but that's the reason i came up with vendor
> >> ramoops instead of changing traditional ramoops binding.
> >
> > That's unexpected conclusion. Adding more bindings is not the answer to
> > comment that it should not be in the DTS in the first place.
>
> Please suggest, what is the other way being above text as requirement..

I do not see any requirement for us there. Forcing me to figure out
how to add non-hardware property to DT is not the way to convince
reviewers. But if you insist - we have ABI for this, called sysfs. If
it is debugging feature, then debugfs.

Best regards,
Krzysztof

2023-07-03 16:17:23

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding



On 7/3/2023 12:50 PM, Krzysztof Kozlowski wrote:
> On Mon, 3 Jul 2023 at 08:22, Mukesh Ojha <[email protected]> wrote:
>> On 7/2/2023 1:42 PM, Krzysztof Kozlowski wrote:
>>>>> The big difference is if firmware is not deciding where this log
>>>>> lives, then it doesn't need to be in DT. How does anything except the
>>>>> kernel that allocates the log find the logs?
>>>>
>>>> Yes, you are correct, firmware is not deciding where the logs lives
>>>> instead here, Kernel has reserved the region where the ramoops region
>>>> lives and later with the minidump registration where, physical
>>>> address/size/virtual address(for parsing) are passed and that is how
>>>> firmware is able to know and dump those region before triggering system
>>>> reset.
>>>
>>> Your explanation does not justify storing all this in DT. Kernel can
>>> allocate any memory it wishes, store there logs and pass the address to
>>> the firmware. That's it, no need for DT.
>>
>> If you go through the driver, you will know that what it does, is
>
> We talk about bindings and I should not be forced to look at the
> driver to be able to understand them. Bindings should stand on their
> own.

Why can't ramoops binding have one more feature where it can add a flag
*dynamic* to indicate the regions are dynamic and it is for platforms
where there is another entity 'minidump' who is interested in these
regions.

>
>> just create platform device for actual ramoops driver to probe and to
>
> Not really justification for Devicetree anyway. Whatever your driver
> is doing, is driver's business, not bindings.
>
>> provide this it needs exact set of parameters of input what original
>> ramoops DT provides, we need to keep it in DT as maintaining this in
>> driver will not scale well with different size/parameter size
>> requirement for different targets.
>
> Really? Why? I don't see a problem in scaling. At all.

I had attempted it here,

https://lore.kernel.org/lkml/[email protected]/

but got comments related to hard coding and some in favor of having
the same set of properties what ramoops has/provides

https://lore.kernel.org/lkml/[email protected]/

https://lore.kernel.org/lkml/202305161347.80204C1A0E@keescook/
>
>>
>>>
>>>>
>>>> A part of this registration code you can find in 11/21
>>>>
>>>>> I'm pretty sure I already said all this before.
>>>>
>>>> Yes, you said this before but that's the reason i came up with vendor
>>>> ramoops instead of changing traditional ramoops binding.
>>>
>>> That's unexpected conclusion. Adding more bindings is not the answer to
>>> comment that it should not be in the DTS in the first place.
>>
>> Please suggest, what is the other way being above text as requirement..
>
> I do not see any requirement for us there. Forcing me to figure out
> how to add non-hardware property to DT is not the way to convince
> reviewers. But if you insist - we have ABI for this, called sysfs. If
> it is debugging feature, then debugfs.

ramoops already support module params and a way to pass these parameters
from bootargs but it also need to know the hard-codes addresses, so,
doing something in sysfs will be again duplication with ramoops driver..

If this can be accommodated under ramoops, this will be very small
change, like this

https://lore.kernel.org/lkml/[email protected]/

-- Mukesh
>
> Best regards,
> Krzysztof

2023-07-03 21:11:09

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
> On 30/06/2023 18:04, Mukesh Ojha wrote:
>>>
>>>> We don't add layers when they are not needed, and never when there is no
>>>> actual user. If you need the extra "complexity" later, then add it
>>>> later when it is needed as who knows when that will ever be.
>>>>
>>>> Please redo this series based on that, thanks.
>>>
>>> My bigger issue with this whole series is what would this all look
>>> like if every SoC vendor upstreamed their own custom dumping
>>> mechanism. That would be a mess. (I have similar opinions on the
>>> $soc-vendor hypervisors.)
>
> Mukesh,
>
> LPC CFP is still open. There will be also Android and Kernel Debugging
> LPC microconference tracks. Coming with a unified solution could be a
> great topic for LPC. Solutions targeting only one user are quite often
> frowned upon.

LPC is far out and in November. Can we not have others speak up if they
have the similar solution now? We can expand this to linux-kernel and
ask for the other SOC vendors to chime in. I am sure that we may have
existing solutions which came in for the one user first like Intel RDT
if I remember. I am sure ARM MPAM usecase was present at that time but
Intel RDT based solution which was x86 specific but accepted.

---Trilok Soni

2023-07-04 06:15:13

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding

On 03/07/2023 17:55, Mukesh Ojha wrote:
>
>
> On 7/3/2023 12:50 PM, Krzysztof Kozlowski wrote:
>> On Mon, 3 Jul 2023 at 08:22, Mukesh Ojha <[email protected]> wrote:
>>> On 7/2/2023 1:42 PM, Krzysztof Kozlowski wrote:
>>>>>> The big difference is if firmware is not deciding where this log
>>>>>> lives, then it doesn't need to be in DT. How does anything except the
>>>>>> kernel that allocates the log find the logs?
>>>>>
>>>>> Yes, you are correct, firmware is not deciding where the logs lives
>>>>> instead here, Kernel has reserved the region where the ramoops region
>>>>> lives and later with the minidump registration where, physical
>>>>> address/size/virtual address(for parsing) are passed and that is how
>>>>> firmware is able to know and dump those region before triggering system
>>>>> reset.
>>>>
>>>> Your explanation does not justify storing all this in DT. Kernel can
>>>> allocate any memory it wishes, store there logs and pass the address to
>>>> the firmware. That's it, no need for DT.
>>>
>>> If you go through the driver, you will know that what it does, is
>>
>> We talk about bindings and I should not be forced to look at the
>> driver to be able to understand them. Bindings should stand on their
>> own.
>
> Why can't ramoops binding have one more feature where it can add a flag
> *dynamic* to indicate the regions are dynamic and it is for platforms
> where there is another entity 'minidump' who is interested in these
> regions.

Because we do not define dynamic stuff in Devicetree. Dynamic means
defined by SW or runtime configurable. It is against the entire idea of
Devicetree which is for non-discoverable hardware.

>
>>
>>> just create platform device for actual ramoops driver to probe and to
>>
>> Not really justification for Devicetree anyway. Whatever your driver
>> is doing, is driver's business, not bindings.
>>
>>> provide this it needs exact set of parameters of input what original
>>> ramoops DT provides, we need to keep it in DT as maintaining this in
>>> driver will not scale well with different size/parameter size
>>> requirement for different targets.
>>
>> Really? Why? I don't see a problem in scaling. At all.
>
> I had attempted it here,
>
> https://lore.kernel.org/lkml/[email protected]/
>
> but got comments related to hard coding and some in favor of having
> the same set of properties what ramoops has/provides
>
> https://lore.kernel.org/lkml/[email protected]/
>
> https://lore.kernel.org/lkml/202305161347.80204C1A0E@keescook/

Then you were tricked. I don't get why someone else suggests that
non-hardware property should be part of Devicetree, but anyway it's the
call of Devicetree binding maintainers, not someone else. DT is not
dumping ground for all the system configuration variables.


>>
>>>
>>>>
>>>>>
>>>>> A part of this registration code you can find in 11/21
>>>>>
>>>>>> I'm pretty sure I already said all this before.
>>>>>
>>>>> Yes, you said this before but that's the reason i came up with vendor
>>>>> ramoops instead of changing traditional ramoops binding.
>>>>
>>>> That's unexpected conclusion. Adding more bindings is not the answer to
>>>> comment that it should not be in the DTS in the first place.
>>>
>>> Please suggest, what is the other way being above text as requirement..
>>
>> I do not see any requirement for us there. Forcing me to figure out
>> how to add non-hardware property to DT is not the way to convince
>> reviewers. But if you insist - we have ABI for this, called sysfs. If
>> it is debugging feature, then debugfs.
>
> ramoops already support module params and a way to pass these parameters
> from bootargs but it also need to know the hard-codes addresses, so,
> doing something in sysfs will be again duplication with ramoops driver..

Why do you need hard-coded addresses?

>
> If this can be accommodated under ramoops, this will be very small
> change, like this
>
> https://lore.kernel.org/lkml/[email protected]/

That's also funny patch - missing bindings updated, missing CC DT
maintainers.

Best regards,
Krzysztof


2023-07-04 09:40:29

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Thu, Jun 29, 2023 at 1:12 AM Rob Herring <[email protected]> wrote:

> My bigger issue with this whole series is what would this all look
> like if every SoC vendor upstreamed their own custom dumping
> mechanism. That would be a mess. (I have similar opinions on the
> $soc-vendor hypervisors.)

I agree with Rob's stance.

I think it would be useful to get input from the hwtracing developers
(Alexander and Mathieu) who faced this "necessarily different" issue
with all the hwtrace mechanisms and found a way out of it. I suspect
they can have an idea of how this should be abstracted.

Yours,
Linus Walleij

2023-07-05 17:43:25

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 7/4/2023 2:27 AM, Linus Walleij wrote:
> On Thu, Jun 29, 2023 at 1:12 AM Rob Herring <[email protected]> wrote:
>
>> My bigger issue with this whole series is what would this all look
>> like if every SoC vendor upstreamed their own custom dumping
>> mechanism. That would be a mess. (I have similar opinions on the
>> $soc-vendor hypervisors.)
>
> I agree with Rob's stance.
>
> I think it would be useful to get input from the hwtracing developers
> (Alexander and Mathieu) who faced this "necessarily different" issue
> with all the hwtrace mechanisms and found a way out of it. I suspect
> they can have an idea of how this should be abstracted.

Any mailing list you suggest we expand to so that we get inputs from the
hwtracing developers and maintainers or just look into the MAINTAINERS
file and start an email thread?

We are fine to submit the abstract for the LPC in next two weeks, but
prefer to have lot of good discussion before it on the mailing list, so
that we have code to talk about in LPC.

---Trilok Soni

2023-07-05 23:43:58

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 10/21] soc: qcom: Add qcom's pstore minidump driver support

On Thu, Jun 29, 2023 at 3:16 AM Mukesh Ojha <[email protected]> wrote:
>
>
>
> On 6/29/2023 4:27 AM, Rob Herring wrote:
> > On Wed, Jun 28, 2023 at 6:37 AM Mukesh Ojha <[email protected]> wrote:
> >>
> >> This driver was inspired from the fact pstore ram region should be
> >> fixed and boot firmware need to have awarness about this region,
> >> so that it will be persistent across boot. But, there are many
> >> QCOM SoC which does not support warm boot from hardware but they
> >> have minidump support from the software, and for them, there is
> >> no need of this pstore ram region to be fixed, but at the same
> >> time have interest in the pstore frontends data. So, this driver
> >> get the dynamic reserved region from the ram and register the
> >> ramoops platform device.
> >>
> >> +---------+ +---------+ +--------+ +---------+
> >> | console | | pmsg | | ftrace | | dmesg |
> >> +---------+ +---------+ +--------+ +---------+
> >> | | | |
> >> | | | |
> >> +------------------------------------------+
> >> |
> >> \ /
> >> +----------------+
> >> (1) |pstore frontends|
> >> +----------------+
> >> |
> >> \ /
> >> +------------------- +
> >> (2) | pstore backend(ram)|
> >> +--------------------+
> >> |
> >> \ /
> >> +--------------------+
> >> (3) |qcom_pstore_minidump|
> >> +--------------------+
> >> |
> >> \ /
> >> +---------------+
> >> (4) | qcom_minidump |
> >> +---------------+
> >>
> >> This driver will route all the pstore front data to the stored
> >> in qcom pstore reserved region and the reason of showing an
> >> arrow from (3) to (4) as qcom_pstore_minidump driver will register
> >> all the available frontends region with qcom minidump driver
> >> in upcoming patch.
> >>
> >> Signed-off-by: Mukesh Ojha <[email protected]>
> >> ---
> >> drivers/soc/qcom/Kconfig | 12 +++++
> >> drivers/soc/qcom/Makefile | 1 +
> >> drivers/soc/qcom/qcom_pstore_minidump.c | 85 +++++++++++++++++++++++++++++++++
> >
> > drivers/soc/ is the dumping ground for things with no other place. As
> > this is a pstore driver, it belongs with pstore.
>
> The inspiration of this driver was taken from
> drivers/platform/chrome/chromeos_pstore.c, do you think that is misplaced ?

The difference is really that's nothing more than platform specific
logic to instantiate a normal ramoops device. It's kind of ugly, yes,
but it's still just a normal ramoops device in the end. Your case is
that plus all the extra parts for minidump.

Rob

2023-07-06 17:38:31

by Mathieu Poirier

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Mon, Jul 03, 2023 at 02:05:58PM -0700, Trilok Soni wrote:
> On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
> > On 30/06/2023 18:04, Mukesh Ojha wrote:
> > > >
> > > > > We don't add layers when they are not needed, and never when there is no
> > > > > actual user. If you need the extra "complexity" later, then add it
> > > > > later when it is needed as who knows when that will ever be.
> > > > >
> > > > > Please redo this series based on that, thanks.
> > > >
> > > > My bigger issue with this whole series is what would this all look
> > > > like if every SoC vendor upstreamed their own custom dumping
> > > > mechanism. That would be a mess. (I have similar opinions on the
> > > > $soc-vendor hypervisors.)
> >
> > Mukesh,
> >
> > LPC CFP is still open. There will be also Android and Kernel Debugging
> > LPC microconference tracks. Coming with a unified solution could be a
> > great topic for LPC. Solutions targeting only one user are quite often
> > frowned upon.
>
> LPC is far out and in November. Can we not have others speak up if they have
> the similar solution now? We can expand this to linux-kernel and ask for the
> other SOC vendors to chime in. I am sure that we may have existing solutions
> which came in for the one user first like Intel RDT if I remember. I am sure
> ARM MPAM usecase was present at that time but Intel RDT based solution which
> was x86 specific but accepted.

I am not familiar with Intel RDT and Arm MPAM but the community is always
improving on the way it does things.

LPC is indeed far out in November but it is an opportunity to cover the
groundwork needed to have this discussion. It is always best to improve on
something then introduce something new. Even better if something specific such
as Intel RDT and Arm MPAM can be made more generic. A perfect example is
hwtracing Linus referred to. The perf framework wasn't a perfect fit but it was
enhanced to accommodate our requirements. I suggest to look at what is currently
available and come up with a strategy to be presented at LPC - event better if
you have a prototype. If you can't find anything or the drawbacks inherent to
each avenue outweigh the benefits then we can have that conversation at LPC.

>
> ---Trilok Soni

2023-07-06 18:02:58

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Mon, Jul 3, 2023 at 3:06 PM Trilok Soni <[email protected]> wrote:
>
> On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
> > On 30/06/2023 18:04, Mukesh Ojha wrote:
> >>>
> >>>> We don't add layers when they are not needed, and never when there is no
> >>>> actual user. If you need the extra "complexity" later, then add it
> >>>> later when it is needed as who knows when that will ever be.
> >>>>
> >>>> Please redo this series based on that, thanks.
> >>>
> >>> My bigger issue with this whole series is what would this all look
> >>> like if every SoC vendor upstreamed their own custom dumping
> >>> mechanism. That would be a mess. (I have similar opinions on the
> >>> $soc-vendor hypervisors.)
> >
> > Mukesh,
> >
> > LPC CFP is still open. There will be also Android and Kernel Debugging
> > LPC microconference tracks. Coming with a unified solution could be a
> > great topic for LPC. Solutions targeting only one user are quite often
> > frowned upon.
>
> LPC is far out and in November. Can we not have others speak up if they
> have the similar solution now? We can expand this to linux-kernel and
> ask for the other SOC vendors to chime in. I am sure that we may have
> existing solutions which came in for the one user first like Intel RDT
> if I remember. I am sure ARM MPAM usecase was present at that time but
> Intel RDT based solution which was x86 specific but accepted.

RDT predated MPAM. resctrl is the kernel feature, and it supports
Intel and AMD which are not identical. resctrl is being (extensively)
refactored to add in MPAM support.

You are not the first here like Intel RDT, so I fail to see the
parallel with minidump. We have an existing logging to persistent
storage mechanism which is pstore. You should integrate into that
rather than grafting something on to the side or underneath.

Rob

2023-07-06 18:28:54

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 7/6/2023 10:40 AM, Rob Herring wrote:
> On Mon, Jul 3, 2023 at 3:06 PM Trilok Soni <[email protected]> wrote:
>>
>> On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
>>> On 30/06/2023 18:04, Mukesh Ojha wrote:
>>>>>
>>>>>> We don't add layers when they are not needed, and never when there is no
>>>>>> actual user. If you need the extra "complexity" later, then add it
>>>>>> later when it is needed as who knows when that will ever be.
>>>>>>
>>>>>> Please redo this series based on that, thanks.
>>>>>
>>>>> My bigger issue with this whole series is what would this all look
>>>>> like if every SoC vendor upstreamed their own custom dumping
>>>>> mechanism. That would be a mess. (I have similar opinions on the
>>>>> $soc-vendor hypervisors.)
>>>
>>> Mukesh,
>>>
>>> LPC CFP is still open. There will be also Android and Kernel Debugging
>>> LPC microconference tracks. Coming with a unified solution could be a
>>> great topic for LPC. Solutions targeting only one user are quite often
>>> frowned upon.
>>
>> LPC is far out and in November. Can we not have others speak up if they
>> have the similar solution now? We can expand this to linux-kernel and
>> ask for the other SOC vendors to chime in. I am sure that we may have
>> existing solutions which came in for the one user first like Intel RDT
>> if I remember. I am sure ARM MPAM usecase was present at that time but
>> Intel RDT based solution which was x86 specific but accepted.
>
> RDT predated MPAM. resctrl is the kernel feature, and it supports
> Intel and AMD which are not identical. resctrl is being (extensively)
> refactored to add in MPAM support.
>
> You are not the first here like Intel RDT, so I fail to see the
> parallel with minidump. We have an existing logging to persistent
> storage mechanism which is pstore. You should integrate into that
> rather than grafting something on to the side or underneath.
>

Mukesh will chime in once he looks at the hwtracing suggested by Linus W
and see if it fits. Mukesh seems have already looked at pstore and
discussions/patches are there w/ pstore logic I believe, but it is okay
if they are not perfect if we are still not decided on the right
framework. Best to decide if the existing frameworks fits or not or we
need to create the new one.

I would still prefer if other SOC vendors chime in here, since I am sure
in the Mobile and Embedded world various SOCs may have requirements to
get specific portion of the ramdump only for the quick analysis and
meeting the storage requirements on the device for its collection.

As mentioned on another patch, we are fine the submit abstract at LPC
debug MC, but I would like the framework discussion to continue so that
we can decide during the LPC that either existing frameworks fits the
needs or they need to be extended or new fwk is needed.

---Trilok Soni

2023-07-13 05:20:54

by Kathiravan Thirumoorthy

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support


On 6/28/2023 6:04 PM, Mukesh Ojha wrote:
> Minidump is a best effort mechanism to collect useful and predefined data
> for first level of debugging on end user devices running on Qualcomm SoCs.
> It is built on the premise that System on Chip (SoC) or subsystem part of
> SoC crashes, due to a range of hardware and software bugs. Hence, the
> ability to collect accurate data is only a best-effort. The data collected
> could be invalid or corrupted, data collection itself could fail, and so on.
>
> Qualcomm devices in engineering mode provides a mechanism for generating
> full system ramdumps for post mortem debugging. But in some cases it's
> however not feasible to capture the entire content of RAM. The minidump
> mechanism provides the means for selecting which snippets should be
> included in the ramdump.
>
> Minidump kernel driver implementation is divided into two parts for
> simplicity, one is minidump core which can also be called minidump
> frontend(As API gets exported from this driver for registration with
> backend) and the other part is minidump backend i.e, where the underlying
> implementation of minidump will be there. There could be different way
> how the backend is implemented like Shared memory, Memory mapped IO
> or Resource manager(gunyah) based where the guest region information is
> passed to hypervisor via hypercalls.
>
> Minidump Client-1 Client-2 Client-5 Client-n
> | | | |
> | | ... | ... |
> | | | |
> | | | |
> | | | |
> | | | |
> | | | |
> | | | |
> | +---+--------------+----+ |
> +-----------+ qcom_minidump(core) +--------+
> | |
> +------+-----+------+---+
> | | |
> | | |
> +---------------+ | +--------------------+
> | | |
> | | |
> | | |
> v v v
> +-------------------+ +-------------------+ +------------------+
> |qcom_minidump_smem | |qcom_minidump_mmio | | qcom_minidump_rm |
> | | | | | |
> +-------------------+ +-------------------+ +------------------+
> Shared memory Memory mapped IO Resource manager
> (backend) (backend) (backend)
>
>
> Here, we will be giving all analogy of backend with SMEM as it is the
> only implemented backend at present but general idea remains the same.
>
> The core of SMEM based minidump feature is part of Qualcomm's boot
> firmware code. It initializes shared memory (SMEM), which is a part of
> DDR and allocates a small section of SMEM to minidump table i.e also
> called global table of content (G-ToC). Each subsystem (APSS, ADSP, ...)
> has their own table of segments to be included in the minidump and all
> get their reference from G-ToC. Each segment/region has some details
> like name, physical address and it's size etc. and it could be anywhere
> scattered in the DDR.
>
> Existing upstream Qualcomm remoteproc driver[1] already supports SMEM
> based minidump feature for remoteproc instances like ADSP, MODEM, ...
> where predefined selective segments of subsystem region can be dumped
> as part of coredump collection which generates smaller size artifacts
> compared to complete coredump of subsystem on crash.
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>
> In addition to managing and querying the APSS minidump description,
> the Linux driver maintains a ELF header in a segment. This segment
> gets updated with section/program header whenever a new entry gets
> registered.
>
>
> docs: qcom: Add qualcomm minidump guide
> kallsyms: Export kallsyms_lookup_name
> soc: qcom: Add qcom_minidump_smem module
> soc: qcom: Add Qualcomm APSS minidump (frontend) feature support
> soc: qcom: Add linux minidump smem backend driver support
> soc: qcom: minidump: Add pending region registration support
> soc: qcom: minidump: Add update region support
> dt-bindings: reserved-memory: Add qcom,ramoops binding
> pstore/ram : Export ramoops_parse_dt() symbol
> soc: qcom: Add qcom's pstore minidump driver support
> soc: qcom: Register pstore frontend region with minidump
> remoteproc: qcom: Expand MD_* as MINIDUMP_*
> remoterproc: qcom: refactor to leverage exported minidump symbol
> MAINTAINERS: Add entry for minidump driver related support
> arm64: defconfig: Enable Qualcomm Minidump related drivers
> arm64: dts: qcom: sm8450: Add Qualcomm ramoops minidump node
> firmware: qcom_scm: provide a read-modify-write function
> pinctrl: qcom: Use qcom_scm_io_update_field()
> firmware: scm: Modify only the download bits in TCSR register
> firmware: qcom_scm: Refactor code to support multiple download mode
> firmware: qcom_scm: Add multiple download mode support
>
> Patch 1/21 is qualcomm minidump document
> Patch 2/21 will export kallsyms_lookup_name will be needed for minidump module
> Patch 3/21 moves the minidump specific data structure and macro to
> qcom_minidump_smem.c and later 13/21 will use the API and remove
> minidump specific code to qcom_minidump_smem file.
> Patch 4/21 is qualcomm minidump core(frontend) driver
> Patch 5/21 implements qualcomm smem backend kernel driver
> Patch 6/21 add pending region support for the clients who came for
> registration before minidump.
> Patch 7/21 add update region support for registered clients.
> Patch 8/21 Add dt-binding for qualcomm ramoops driver which is also a minidump client driver
> Patch 9/21 exported symbol from ramoops driver to avoid copy of the code.
> Patch 10/21 Add qcom's pstore minidump driver support which adds ramoops platform device
> and 11/21 register existing pstore frontend regions.
> Patch 12/21 and 13/21 does some clean up and code reuse.
> Patch 16/21 enable qcom_ramoops driver for sm8450
> Patch 17-21 are not new and has already been through 6 versions and
> reason of adding here is for minidump testing purpose and it will be rebased
> automatically along with new version of minidump series.
>
> Testing of the patches has been done on sm8450 target after enabling config like
> CONFIG_PSTORE_RAM and CONFIG_PSTORE_CONSOLE and once the device boots up.
> Try crashing it via devmem2 0xf11c000(this is known to create xpu violation and
> and put the device in download mode) on command prompt.
>
> I have added download patch here numbered from 14/18 to 18/18
> Earlier download mode setting patches were sent separately
> https://lore.kernel.org/lkml/[email protected]/
>
> Default storage type is set to via USB, so minidump would be downloaded with the
> help of x86_64 machine (running PCAT tool) attached to Qualcomm device which has
> backed minidump boot firmware support (more can be found patch 3/18)
>
> Below patch [1] is to warm reset Qualcomm device which has upstream qcom
> watchdog driver support.
>
> After applying all patches, we can boot the device and can execute
> following command.
>
> echo mini > /sys/module/qcom_scm/parameters/download_mode
> echo c > /proc/sysrq-trigger
>
> This will make the device go to download mode and collect the minidump on to the
> attached x86 machine running the Qualcomm PCAT tool(This comes as part Qualcomm
> package manager kit).
> After that we will see a bunch of predefined registered region as binary blobs files
> starts with md_* downloaded on the x86 machine on given location in PCAT tool from
> the target device.
>
> A sample client example to dump a linux region has been given in patch 3/18 and as
> well as can be seen in patch 12/18.
>
> [1]
> --------------------------->8-------------------------------------
>
> commit f1124ccebd47550b4c9627aa162d9cdceba2b76f
> Author: Mukesh Ojha <[email protected]>
> Date: Thu Mar 16 14:08:35 2023 +0530
>
> do not merge: watchdog bite on panic
>
> Signed-off-by: Mukesh Ojha <[email protected]>
>
> diff --git a/drivers/watchdog/qcom-wdt.c b/drivers/watchdog/qcom-wdt.c
> index 0d2209c..767e84a 100644
> --- a/drivers/watchdog/qcom-wdt.c
> +++ b/drivers/watchdog/qcom-wdt.c
> @@ -12,6 +12,7 @@
> #include <linux/platform_device.h>
> #include <linux/watchdog.h>
> #include <linux/of_device.h>
> +#include <linux/panic.h>
>
> enum wdt_reg {
> WDT_RST,
> @@ -114,12 +115,28 @@ static int qcom_wdt_set_pretimeout(struct watchdog_device *wdd,
> return qcom_wdt_start(wdd);
> }
>
> +static void qcom_wdt_bite_on_panic(struct qcom_wdt *wdt)
> +{
> + writel(0, wdt_addr(wdt, WDT_EN));
> + writel(1, wdt_addr(wdt, WDT_BITE_TIME));
> + writel(1, wdt_addr(wdt, WDT_RST));
> + writel(QCOM_WDT_ENABLE, wdt_addr(wdt, WDT_EN));
> +
> + wmb();
> +
> + while(1)
> + udelay(1);
> +}
> +
> static int qcom_wdt_restart(struct watchdog_device *wdd, unsigned long action,
> void *data)
> {
> struct qcom_wdt *wdt = to_qcom_wdt(wdd);
> u32 timeout;
>
> + if (in_panic)
> + qcom_wdt_bite_on_panic(wdt);
> +
> /*
> * Trigger watchdog bite:
> * Setup BITE_TIME to be 128ms, and enable WDT.
> diff --git a/include/linux/panic.h b/include/linux/panic.h
> index 979b776..f913629 100644
> --- a/include/linux/panic.h
> +++ b/include/linux/panic.h
> @@ -22,6 +22,7 @@ extern int panic_on_oops;
> extern int panic_on_unrecovered_nmi;
> extern int panic_on_io_nmi;
> extern int panic_on_warn;
> +extern bool in_panic;
>
> extern unsigned long panic_on_taint;
> extern bool panic_on_taint_nousertaint;
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 487f5b0..714f7f4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -65,6 +65,8 @@ static unsigned int warn_limit __read_mostly;
>
> int panic_timeout = CONFIG_PANIC_TIMEOUT;
> EXPORT_SYMBOL_GPL(panic_timeout);
> +bool in_panic = false;
> +EXPORT_SYMBOL_GPL(in_panic);
>
> #define PANIC_PRINT_TASK_INFO 0x00000001
> #define PANIC_PRINT_MEM_INFO 0x00000002
> @@ -261,6 +263,7 @@ void panic(const char *fmt, ...)
> int old_cpu, this_cpu;
> bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
>
> + in_panic = true;
> if (panic_on_warn) {
> /*
> * This thread may hit another WARN() in the panic path.
> --------------------------------------------------------------------------
>
> Changes in v4:
> - Redesigned the driver and divided the driver into front end and backend (smem) so
> that any new backend can be attached easily to avoid code duplication.
> - Patch reordering as per the driver and subsystem to easier review of the code.
> - Removed minidump specific code from remoteproc to minidump smem based driver.
> - Enabled the all the driver as modules.
> - Address comments made on documentation and yaml and Device tree file [Krzysztof/Konrad]
> - Address comments made qcom_pstore_minidump driver and given its Device tree
> same set of properties as ramoops. [Luca/Kees]
> - Added patch for MAINTAINER file.
> - Include defconfig change as one patch as per [Krzysztof] suggestion.
> - Tried to remove the redundant file scope variables from the module as per [Krzysztof] suggestion.
> - Addressed comments made on dload mode patch v6 version
> https://lore.kernel.org/lkml/[email protected]/
>
> Changes in v3: https://lore.kernel.org/lkml/[email protected]/
> - Addressed most of the comments by Srini on v2 and refactored the minidump driver.
> - Added platform device support
> - Unregister region support.
> - Added update region for clients.
> - Added pending region support.
> - Modified the documentation guide accordingly.
> - Added qcom_pstore_ramdump client driver which happen to add ramoops platform
> device and also registers ramoops region with minidump.
> - Added download mode patch series with this minidump series.
> https://lore.kernel.org/lkml/[email protected]/
>
> Changes in v2: https://lore.kernel.org/lkml/[email protected]/
> - Addressed review comment made by [quic_tsoni/bmasney] to add documentation.
> - Addressed comments made by [srinivas.kandagatla]
> - Dropped pstore 6/6 from the last series, till i get conclusion to get pstore
> region in minidump.
> - Fixed issue reported by kernel test robot.
>
> Changes in v1: https://lore.kernel.org/lkml/[email protected]/
>
> Mukesh Ojha (21):
> docs: qcom: Add qualcomm minidump guide
> kallsyms: Export kallsyms_lookup_name
> soc: qcom: Add qcom_minidump_smem module
> soc: qcom: Add Qualcomm APSS minidump (frontend) feature support
> soc: qcom: Add linux minidump smem backend driver support
> soc: qcom: minidump: Add pending region registration support
> soc: qcom: minidump: Add update region support
> dt-bindings: reserved-memory: Add qcom,ramoops binding
> pstore/ram : Export ramoops_parse_dt() symbol
> soc: qcom: Add qcom's pstore minidump driver support
> soc: qcom: Register pstore frontend region with minidump
> remoteproc: qcom: Expand MD_* as MINIDUMP_*
> remoterproc: qcom: refactor to leverage exported minidump symbol
> MAINTAINERS: Add entry for minidump driver related support
> arm64: defconfig: Enable Qualcomm Minidump related drivers
> arm64: dts: qcom: sm8450: Add Qualcomm ramoops minidump node
> firmware: qcom_scm: provide a read-modify-write function
> pinctrl: qcom: Use qcom_scm_io_update_field()
> firmware: scm: Modify only the download bits in TCSR register
> firmware: qcom_scm: Refactor code to support multiple download mode
> firmware: qcom_scm: Add multiple download mode support


Hi Mukesh,

For IPQ chipsets, for the crashdump to work, we need the below patch

firmware: scm: Modify only the download bits in TCSR register

can you post the below patches separately? Looks like minidump will take
some time and also I don't see any dependencies for these to go along
with the minidump. Given that, will it be possible to post the below
patches separately?

  firmware: qcom_scm: provide a read-modify-write function
  pinctrl: qcom: Use qcom_scm_io_update_field()
  firmware: scm: Modify only the download bits in TCSR register

Do let us know if we can take these patches and post it separately.

Thanks,

Kathiravan T.


>
> Documentation/admin-guide/index.rst | 1 +
> Documentation/admin-guide/qcom_minidump.rst | 293 +++++++++++
> .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++
> MAINTAINERS | 15 +
> arch/arm64/boot/dts/qcom/sm8450.dtsi | 12 +
> arch/arm64/configs/defconfig | 4 +
> drivers/firmware/Kconfig | 11 -
> drivers/firmware/qcom_scm.c | 85 ++-
> drivers/pinctrl/qcom/pinctrl-msm.c | 12 +-
> drivers/remoteproc/qcom_common.c | 142 +----
> drivers/soc/qcom/Kconfig | 39 ++
> drivers/soc/qcom/Makefile | 3 +
> drivers/soc/qcom/qcom_minidump.c | 582 +++++++++++++++++++++
> drivers/soc/qcom/qcom_minidump_internal.h | 98 ++++
> drivers/soc/qcom/qcom_minidump_smem.c | 387 ++++++++++++++
> drivers/soc/qcom/qcom_pstore_minidump.c | 210 ++++++++
> drivers/soc/qcom/smem.c | 9 +
> fs/pstore/ram.c | 26 +-
> include/linux/firmware/qcom/qcom_scm.h | 2 +
> include/linux/pstore_ram.h | 2 +
> include/soc/qcom/qcom_minidump.h | 64 +++
> kernel/kallsyms.c | 2 +-
> 22 files changed, 1973 insertions(+), 152 deletions(-)
> create mode 100644 Documentation/admin-guide/qcom_minidump.rst
> create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
> create mode 100644 drivers/soc/qcom/qcom_minidump.c
> create mode 100644 drivers/soc/qcom/qcom_minidump_internal.h
> create mode 100644 drivers/soc/qcom/qcom_minidump_smem.c
> create mode 100644 drivers/soc/qcom/qcom_pstore_minidump.c
> create mode 100644 include/soc/qcom/qcom_minidump.h
>

2023-07-14 16:09:45

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support


>
> Hi Mukesh,
>
> For IPQ chipsets, for the crashdump to work, we need the below patch
>
> firmware: scm: Modify only the download bits in TCSR register
>
> can you post the below patches separately? Looks like minidump will take
> some time and also I don't see any dependencies for these to go along
> with the minidump. Given that, will it be possible to post the below
> patches separately?
>
>   firmware: qcom_scm: provide a read-modify-write function
>   pinctrl: qcom: Use qcom_scm_io_update_field()
>   firmware: scm: Modify only the download bits in TCSR register
>
> Do let us know if we can take these patches and post it separately.

Yes, we can post this separately.

-Mukesh
>
>>
>>   Documentation/admin-guide/index.rst                |   1 +
>>   Documentation/admin-guide/qcom_minidump.rst        | 293 +++++++++++
>>   .../devicetree/bindings/soc/qcom/qcom,ramoops.yaml | 126 +++++
>>   MAINTAINERS                                        |  15 +
>>   arch/arm64/boot/dts/qcom/sm8450.dtsi               |  12 +
>>   arch/arm64/configs/defconfig                       |   4 +
>>   drivers/firmware/Kconfig                           |  11 -
>>   drivers/firmware/qcom_scm.c                        |  85 ++-
>>   drivers/pinctrl/qcom/pinctrl-msm.c                 |  12 +-
>>   drivers/remoteproc/qcom_common.c                   | 142 +----
>>   drivers/soc/qcom/Kconfig                           |  39 ++
>>   drivers/soc/qcom/Makefile                          |   3 +
>>   drivers/soc/qcom/qcom_minidump.c                   | 582
>> +++++++++++++++++++++
>>   drivers/soc/qcom/qcom_minidump_internal.h          |  98 ++++
>>   drivers/soc/qcom/qcom_minidump_smem.c              | 387 ++++++++++++++
>>   drivers/soc/qcom/qcom_pstore_minidump.c            | 210 ++++++++
>>   drivers/soc/qcom/smem.c                            |   9 +
>>   fs/pstore/ram.c                                    |  26 +-
>>   include/linux/firmware/qcom/qcom_scm.h             |   2 +
>>   include/linux/pstore_ram.h                         |   2 +
>>   include/soc/qcom/qcom_minidump.h                   |  64 +++
>>   kernel/kallsyms.c                                  |   2 +-
>>   22 files changed, 1973 insertions(+), 152 deletions(-)
>>   create mode 100644 Documentation/admin-guide/qcom_minidump.rst
>>   create mode 100644
>> Documentation/devicetree/bindings/soc/qcom/qcom,ramoops.yaml
>>   create mode 100644 drivers/soc/qcom/qcom_minidump.c
>>   create mode 100644 drivers/soc/qcom/qcom_minidump_internal.h
>>   create mode 100644 drivers/soc/qcom/qcom_minidump_smem.c
>>   create mode 100644 drivers/soc/qcom/qcom_pstore_minidump.c
>>   create mode 100644 include/soc/qcom/qcom_minidump.h
>>

2023-07-15 00:55:44

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 7/5/2023 10:29 AM, Trilok Soni wrote:
> On 7/4/2023 2:27 AM, Linus Walleij wrote:
>> On Thu, Jun 29, 2023 at 1:12 AM Rob Herring <[email protected]> wrote:
>>
>>> My bigger issue with this whole series is what would this all look
>>> like if every SoC vendor upstreamed their own custom dumping
>>> mechanism. That would be a mess. (I have similar opinions on the
>>> $soc-vendor hypervisors.)
>>
>> I agree with Rob's stance.
>>
>> I think it would be useful to get input from the hwtracing developers
>> (Alexander and Mathieu) who faced this "necessarily different" issue
>> with all the hwtrace mechanisms and found a way out of it. I suspect
>> they can have an idea of how this should be abstracted.
>
> Any mailing list you suggest we expand to so that we get inputs from the
> hwtracing developers and maintainers or just look into the MAINTAINERS
> file and start an email thread?
>
> We are fine to submit the abstract for the LPC in next two weeks, but
> prefer to have lot of good discussion before it on the mailing list, so
> that we have code to talk about in LPC.

We have submitted abstract at LPC MC. Let's continue the discussion here
though.

Mukesh, do you want to expand the lists as necessary to see if other
soc-vendors are having any inputs here? May be add Exynos or MTK kernel
mailing lists + linux-kernel? I don't know if anyone will respond or
not, but let's try.

---Trilok Soni


2023-07-18 06:03:52

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

+ [email protected]
+ [email protected]



On 7/15/2023 5:15 AM, Trilok Soni wrote:
> On 7/5/2023 10:29 AM, Trilok Soni wrote:
>> On 7/4/2023 2:27 AM, Linus Walleij wrote:
>>> On Thu, Jun 29, 2023 at 1:12 AM Rob Herring <[email protected]> wrote:
>>>
>>>> My bigger issue with this whole series is what would this all look
>>>> like if every SoC vendor upstreamed their own custom dumping
>>>> mechanism. That would be a mess. (I have similar opinions on the
>>>> $soc-vendor hypervisors.)
>>>
>>> I agree with Rob's stance.
>>>
>>> I think it would be useful to get input from the hwtracing developers
>>> (Alexander and Mathieu) who faced this "necessarily different" issue
>>> with all the hwtrace mechanisms and found a way out of it. I suspect
>>> they can have an idea of how this should be abstracted.
>>
>> Any mailing list you suggest we expand to so that we get inputs from
>> the hwtracing developers and maintainers or just look into the
>> MAINTAINERS file and start an email thread?
>>
>> We are fine to submit the abstract for the LPC in next two weeks, but
>> prefer to have lot of good discussion before it on the mailing list,
>> so that we have code to talk about in LPC.
>
> We have submitted abstract at LPC MC. Let's continue the discussion here
> though.
>
> Mukesh, do you want to expand the lists as necessary to see if other
> soc-vendors are having any inputs here? May be add Exynos or MTK kernel
> mailing lists + linux-kernel? I don't know if anyone will respond or
> not, but let's try.

Sure.

-Mukesh
>
> ---Trilok Soni
>

2023-07-18 13:46:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Tue, Jul 18, 2023 at 11:17:12AM +0530, Mukesh Ojha wrote:
> + [email protected]
> + [email protected]

What does that do?

confused,

greg k-h

2023-07-18 14:34:36

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support



On 7/18/2023 7:05 PM, Greg KH wrote:
> On Tue, Jul 18, 2023 at 11:17:12AM +0530, Mukesh Ojha wrote:
>> + [email protected]
>> + [email protected]
>
> What does that do?

This is to seek their feedback, if they have something similar
requirement to debug end user device crashes.

-Mukesh
>
> confused,
>
> greg k-h

2023-07-18 15:29:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On Tue, Jul 18, 2023 at 07:25:15PM +0530, Mukesh Ojha wrote:
>
>
> On 7/18/2023 7:05 PM, Greg KH wrote:
> > On Tue, Jul 18, 2023 at 11:17:12AM +0530, Mukesh Ojha wrote:
> > > + [email protected]
> > > + [email protected]
> >
> > What does that do?
>
> This is to seek their feedback, if they have something similar requirement
> to debug end user device crashes.

Feedback to what? There is no context here and no content either at
all.

Just adding a mailing list to the top of a message doesn't actually send
the thread there.

confused,

greg k-h

2023-07-18 15:35:46

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support



On 7/6/2023 10:50 PM, Mathieu Poirier wrote:
> On Mon, Jul 03, 2023 at 02:05:58PM -0700, Trilok Soni wrote:
>> On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
>>> On 30/06/2023 18:04, Mukesh Ojha wrote:
>>>>>
>>>>>> We don't add layers when they are not needed, and never when there is no
>>>>>> actual user. If you need the extra "complexity" later, then add it
>>>>>> later when it is needed as who knows when that will ever be.
>>>>>>
>>>>>> Please redo this series based on that, thanks.
>>>>>
>>>>> My bigger issue with this whole series is what would this all look
>>>>> like if every SoC vendor upstreamed their own custom dumping
>>>>> mechanism. That would be a mess. (I have similar opinions on the
>>>>> $soc-vendor hypervisors.)
>>>
>>> Mukesh,
>>>
>>> LPC CFP is still open. There will be also Android and Kernel Debugging
>>> LPC microconference tracks. Coming with a unified solution could be a
>>> great topic for LPC. Solutions targeting only one user are quite often
>>> frowned upon.
>>
>> LPC is far out and in November. Can we not have others speak up if they have
>> the similar solution now? We can expand this to linux-kernel and ask for the
>> other SOC vendors to chime in. I am sure that we may have existing solutions
>> which came in for the one user first like Intel RDT if I remember. I am sure
>> ARM MPAM usecase was present at that time but Intel RDT based solution which
>> was x86 specific but accepted.
>
> I am not familiar with Intel RDT and Arm MPAM but the community is always
> improving on the way it does things.
>
> LPC is indeed far out in November but it is an opportunity to cover the
> groundwork needed to have this discussion. It is always best to improve on
> something then introduce something new. Even better if something specific such
> as Intel RDT and Arm MPAM can be made more generic. A perfect example is
> hwtracing Linus referred to. The perf framework wasn't a perfect fit but it was
> enhanced to accommodate our requirements. I suggest to look at what is currently
> available and come up with a strategy to be presented at LPC - event better if
> you have a prototype. If you can't find anything or the drawbacks inherent to
> each avenue outweigh the benefits then we can have that conversation at LPC.

I was checking hwtracing[1] and pmu interface introduction of
address filtering[3] from analogy point of view, which i think you
meant that perf framework was extended to accommodate this.

Minidump is quite different and simple in its way to address the problem
of debugging on end user devices with minimum data captured to debug
crashes and this patch series is inline with similar (core + backend)
implementation done for stm patches[1] where stm core was developed
and intel trace hub get hooked into it and later it got reused in [2] by
coresight-stm driver.

I am still exploring if something available we can reuse but it seems
unlikely at the moment to already available something in the kernel with
similar use case.

-Mukesh

[1]
https://lwn.net/Articles/650245/

[2]
https://lwn.net/Articles/674201/

[3]
https://lore.kernel.org/lkml/1461771888-10409-1-git-send-email-alexander.shishkin@linux.intel.com/


>
>>
>> ---Trilok Soni

2023-07-18 16:42:38

by Trilok Soni

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support

On 7/18/2023 7:41 AM, Greg KH wrote:
> On Tue, Jul 18, 2023 at 07:25:15PM +0530, Mukesh Ojha wrote:
>>
>>
>> On 7/18/2023 7:05 PM, Greg KH wrote:
>>> On Tue, Jul 18, 2023 at 11:17:12AM +0530, Mukesh Ojha wrote:
>>>> + [email protected]
>>>> + [email protected]
>>>
>>> What does that do?
>>
>> This is to seek their feedback, if they have something similar requirement
>> to debug end user device crashes.
>
> Feedback to what? There is no context here and no content either at
> all.
>
> Just adding a mailing list to the top of a message doesn't actually send
> the thread there.
>
> confused,

Mukesh, instead of adding the mailing lists here, we should send either
the refreshed revision of this patchset (if there are enough changes) w/
MLs CCed or start a new discussion with these mailing list with the
context of the minidump and refer these patches from the mailing list
archives.

---Trilok Soni

2023-07-19 11:35:09

by Luca Stefani

[permalink] [raw]
Subject: Re: [PATCH v4 08/21] dt-bindings: reserved-memory: Add qcom,ramoops binding


On 04/07/23 07:57, Krzysztof Kozlowski wrote:
> On 03/07/2023 17:55, Mukesh Ojha wrote:
>>
>> On 7/3/2023 12:50 PM, Krzysztof Kozlowski wrote:
>>> On Mon, 3 Jul 2023 at 08:22, Mukesh Ojha <[email protected]> wrote:
>>>> On 7/2/2023 1:42 PM, Krzysztof Kozlowski wrote:
>>>>>>> The big difference is if firmware is not deciding where this log
>>>>>>> lives, then it doesn't need to be in DT. How does anything except the
>>>>>>> kernel that allocates the log find the logs?
>>>>>> Yes, you are correct, firmware is not deciding where the logs lives
>>>>>> instead here, Kernel has reserved the region where the ramoops region
>>>>>> lives and later with the minidump registration where, physical
>>>>>> address/size/virtual address(for parsing) are passed and that is how
>>>>>> firmware is able to know and dump those region before triggering system
>>>>>> reset.
>>>>> Your explanation does not justify storing all this in DT. Kernel can
>>>>> allocate any memory it wishes, store there logs and pass the address to
>>>>> the firmware. That's it, no need for DT.
>>>> If you go through the driver, you will know that what it does, is
>>> We talk about bindings and I should not be forced to look at the
>>> driver to be able to understand them. Bindings should stand on their
>>> own.
>> Why can't ramoops binding have one more feature where it can add a flag
>> *dynamic* to indicate the regions are dynamic and it is for platforms
>> where there is another entity 'minidump' who is interested in these
>> regions.
> Because we do not define dynamic stuff in Devicetree. Dynamic means
> defined by SW or runtime configurable. It is against the entire idea of
> Devicetree which is for non-discoverable hardware.
>
>>>> just create platform device for actual ramoops driver to probe and to
>>> Not really justification for Devicetree anyway. Whatever your driver
>>> is doing, is driver's business, not bindings.
>>>
>>>> provide this it needs exact set of parameters of input what original
>>>> ramoops DT provides, we need to keep it in DT as maintaining this in
>>>> driver will not scale well with different size/parameter size
>>>> requirement for different targets.
>>> Really? Why? I don't see a problem in scaling. At all.
>> I had attempted it here,
>>
>> https://lore.kernel.org/lkml/[email protected]/
>>
>> but got comments related to hard coding and some in favor of having
>> the same set of properties what ramoops has/provides
>>
>> https://lore.kernel.org/lkml/[email protected]/
>>
>> https://lore.kernel.org/lkml/202305161347.80204C1A0E@keescook/
> Then you were tricked. I don't get why someone else suggests that
> non-hardware property should be part of Devicetree, but anyway it's the
> call of Devicetree binding maintainers, not someone else. DT is not
> dumping ground for all the system configuration variables.

Sorry for that, I assumed the interface should be as close as possible
to pstore, but apparently that's not the case. Why does it have to be
different from it? It provides the same functionality and is
configurable even if it doesn't explicitly configure non discoverable
hardware properties.

Assuming we make the driver picks the values, how would it do that? 
Hardcoding a configuration could lead to a few problems, such as the
allocated region being smaller than the driver defaults or driver
defaults not fully utilizing the allocated region, possibly wasting more
memory than it'll ever use. On top of that what happens if we want to
configure it differently than the hardcoded default values? Via cmdline
options? For example in the previous version it allocated the whole
region for the console alone, while other entries, such as pmsg that
could be useful on devices using minidump to store Android logs, was
zero-sized.

>
>>>>>> A part of this registration code you can find in 11/21
>>>>>>
>>>>>>> I'm pretty sure I already said all this before.
>>>>>> Yes, you said this before but that's the reason i came up with vendor
>>>>>> ramoops instead of changing traditional ramoops binding.
>>>>> That's unexpected conclusion. Adding more bindings is not the answer to
>>>>> comment that it should not be in the DTS in the first place.
>>>> Please suggest, what is the other way being above text as requirement..
>>> I do not see any requirement for us there. Forcing me to figure out
>>> how to add non-hardware property to DT is not the way to convince
>>> reviewers. But if you insist - we have ABI for this, called sysfs. If
>>> it is debugging feature, then debugfs.
>> ramoops already support module params and a way to pass these parameters
>> from bootargs but it also need to know the hard-codes addresses, so,
>> doing something in sysfs will be again duplication with ramoops driver..
> Why do you need hard-coded addresses?
>
>> If this can be accommodated under ramoops, this will be very small
>> change, like this
>>
>> https://lore.kernel.org/lkml/[email protected]/
> That's also funny patch - missing bindings updated, missing CC DT
> maintainers.
>
> Best regards,
> Krzysztof
>
>
>

2023-08-07 14:23:01

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support


On 7/18/2023 8:33 PM, Mukesh Ojha wrote:
>
>
> On 7/6/2023 10:50 PM, Mathieu Poirier wrote:
>> On Mon, Jul 03, 2023 at 02:05:58PM -0700, Trilok Soni wrote:
>>> On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
>>>> On 30/06/2023 18:04, Mukesh Ojha wrote:
>>>>>>
>>>>>>> We don't add layers when they are not needed, and never when
>>>>>>> there is no
>>>>>>> actual user.  If you need the extra "complexity" later, then add it
>>>>>>> later when it is needed as who knows when that will ever be.
>>>>>>>
>>>>>>> Please redo this series based on that, thanks.
>>>>>>
>>>>>> My bigger issue with this whole series is what would this all look
>>>>>> like if every SoC vendor upstreamed their own custom dumping
>>>>>> mechanism. That would be a mess. (I have similar opinions on the
>>>>>> $soc-vendor hypervisors.)
>>>>
>>>> Mukesh,
>>>>
>>>> LPC CFP is still open. There will be also Android and Kernel Debugging
>>>> LPC microconference tracks. Coming with a unified solution could be a
>>>> great topic for LPC. Solutions targeting only one user are quite often
>>>> frowned upon.
>>>
>>> LPC is far out and in November. Can we not have others speak up if
>>> they have
>>> the similar solution now? We can expand this to linux-kernel and ask
>>> for the
>>> other SOC vendors to chime in. I am sure that we may have existing
>>> solutions
>>> which came in for the one user first like Intel RDT if I remember. I
>>> am sure
>>> ARM MPAM usecase was present at that time but Intel RDT based
>>> solution which
>>> was x86 specific but accepted.
>>
>> I am not familiar with Intel RDT and Arm MPAM but the community is always
>> improving on the way it does things.
>>
>> LPC is indeed far out in November but it is an opportunity to cover the
>> groundwork needed to have this discussion.  It is always best to
>> improve on
>> something then introduce something new.  Even better if something
>> specific such
>> as Intel RDT and Arm MPAM can be made more generic.  A perfect example is
>> hwtracing Linus referred to.  The perf framework wasn't a perfect fit
>> but it was
>> enhanced to accommodate our requirements.  I suggest to look at what
>> is currently
>> available and come up with a strategy to be presented at LPC - event
>> better if
>> you have a prototype.  If you can't find anything or the drawbacks
>> inherent to
>> each avenue outweigh the benefits then we can have that conversation
>> at LPC.
>
> I was checking hwtracing[1] and pmu interface introduction of
> address filtering[3] from analogy point of view, which i think you
> meant that perf framework was extended to accommodate this.
>
> Minidump is quite different and simple in its way to address the problem
> of debugging on end user devices with minimum data captured to debug
> crashes and this patch series is inline with similar (core + backend)
> implementation done for stm patches[1] where stm core was developed
> and intel trace hub get hooked into it and later it got reused in [2] by
> coresight-stm driver.
>
> I am still exploring if something available we can reuse but it seems
> unlikely at the moment to already available something in the kernel with
> similar use case.


I explored about kdump and fadump(PPC) but they take another route to
boot capture kernel, FAdump is even optimized and do not even boot
capture kernel instead reboot the same kernel with minimal
memory and once the reading of crash kernel region is complete
from user space it release the memories.

Latency in booting another kernel may not acceptable to boot time
sensitive devices like mobile, tablet etc., also to boot another
kernel will have security implication with user data..

So, Minidump is limited in its capability and uses firmware infra
to dump registered entries and boot afresh. So, it need to be
presented what need to be dumped via Minidump registration API.

If there is no comments, would like to go ahead with the next
revision of the patchset.

-Mukesh
>
> -Mukesh
>
> [1]
> https://lwn.net/Articles/650245/
>
> [2]
> https://lwn.net/Articles/674201/
>
> [3]
> https://lore.kernel.org/lkml/1461771888-10409-1-git-send-email-alexander.shishkin@linux.intel.com/
>
>
>>
>>>
>>> ---Trilok Soni

2023-08-10 18:21:07

by Mukesh Ojha

[permalink] [raw]
Subject: Re: [PATCH v4 00/21] Add Qualcomm Minidump kernel driver related support



On 7/6/2023 11:10 PM, Rob Herring wrote:
> On Mon, Jul 3, 2023 at 3:06 PM Trilok Soni <[email protected]> wrote:
>>
>> On 7/2/2023 1:29 AM, Krzysztof Kozlowski wrote:
>>> On 30/06/2023 18:04, Mukesh Ojha wrote:
>>>>>
>>>>>> We don't add layers when they are not needed, and never when there is no
>>>>>> actual user. If you need the extra "complexity" later, then add it
>>>>>> later when it is needed as who knows when that will ever be.
>>>>>>
>>>>>> Please redo this series based on that, thanks.
>>>>>
>>>>> My bigger issue with this whole series is what would this all look
>>>>> like if every SoC vendor upstreamed their own custom dumping
>>>>> mechanism. That would be a mess. (I have similar opinions on the
>>>>> $soc-vendor hypervisors.)
>>>
>>> Mukesh,
>>>
>>> LPC CFP is still open. There will be also Android and Kernel Debugging
>>> LPC microconference tracks. Coming with a unified solution could be a
>>> great topic for LPC. Solutions targeting only one user are quite often
>>> frowned upon.
>>
>> LPC is far out and in November. Can we not have others speak up if they
>> have the similar solution now? We can expand this to linux-kernel and
>> ask for the other SOC vendors to chime in. I am sure that we may have
>> existing solutions which came in for the one user first like Intel RDT
>> if I remember. I am sure ARM MPAM usecase was present at that time but
>> Intel RDT based solution which was x86 specific but accepted.
>
> RDT predated MPAM. resctrl is the kernel feature, and it supports
> Intel and AMD which are not identical. resctrl is being (extensively)
> refactored to add in MPAM support.
>
> You are not the first here like Intel RDT, so I fail to see the
> parallel with minidump. We have an existing logging to persistent
> storage mechanism which is pstore. You should integrate into that
> rather than grafting something on to the side or underneath.

Most of the Qualcomm SoCs does not support *warm boot* and that is the
base requirement for pstore(ram) to work to preserve the content of
fixed region during the reboot. So, it will not work those SOCs.

Minidump in its capability can do more than what is available
through pstore, it can dump ramoops region as one of data point
for debugging but it can dump anything given the size and address.

We can make minidump it another backend of pstore(ram), and improve
pstore with more debug data collection during panic like timer data or
irqstats etc. which was our final goal with minidump that way pstore
also gets benefit and minidump will just collect what is there in
pstore(ram). but for that we need base infrastructure driver to
merge.

One of the proposal made here..
https://lore.kernel.org/lkml/[email protected]/

-Mukesh
>
> Rob