2024-01-03 15:28:54

by Mukesh Ojha

Subject: Re: RESEND: Re: [Patch v6 03/12] docs: qcom: Add qualcomm minidump guide



On 12/25/2023 7:25 PM, Ruipeng Qi wrote:
> <+How a kernel client driver can register region with minidump
> <+------------------------------------------------------------
> <+
> <+Client driver can use ``qcom_minidump_region_register`` API's to register
> <+and ``qcom_minidump_region_unregister`` to unregister their region from
> <+minidump driver.
> <+
> <+Client needs to fill their region by filling ``qcom_minidump_region``
> <+structure object which consists of the region name, region's virtual
> <+and physical address and its size.
>
> Hi, Mukesh, wish you a good holiday :)

Hope you had the same..:-)

>
> I have the following idea; please help me assess whether it can be
> implemented. As we all know, most kernel objects are allocated by
> the slab sub-system. I wonder if we can dump all memory kept by the
> slab sub-system? If so, we would get most of the kernel objects,
> which would help fix problems when we run into system issues.
>
> How can we do this? From the description above, I think we would have
> to register one region per slab, since each slab has some pages and
> the memory between slabs is non-contiguous. As we all know, there are
> millions of slabs in the system, so dumping slabs this way would
> introduce a heavy overhead.
>
> I am not very familiar with Qualcomm minidump, so maybe my thinking
> is wrong. Looking forward to your reply!

In its current state, and in simple terms, Qualcomm Minidump cannot do
this. Minidump is more of a consumer driver, so it can dump whatever
gets registered with it. Qualcomm Minidump serves a bigger purpose:
dumping content in any kind of crash, whether kernel or non-kernel
(NOC errors, XPUs, etc.). Both kernel and non-kernel entities can
register with it, so we get a dump in any kind of system crash.

One more thing: the kernel part of minidump, which we call APSS
Minidump, has a limit on the number of entries, so it becomes difficult
to dump non-contiguous regions beyond a certain number of registrations
(~200). However, we do have a solution for this in the downstream
kernel: create a big CMA buffer and register it with Minidump, then
fill this buffer and create an ELF during panic, so that whatever gets
dumped into the buffer is captured during the crash. I think you are
doing something similar with your OS-minidump.
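
To make the entry limit concrete, here is a minimal userspace sketch; it
is not the driver code. It assumes a fixed table capped at 200 entries
(taken from the "~200" figure above; the real limit may differ), and all
names in it (region_table, region_register, big_buffer,
big_buffer_append) are hypothetical. It shows why registering one region
per slab runs out of entries, while copying scattered data into one big
buffer needs only a single registration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed cap, from the "~200" figure above; the real limit may differ. */
#define MAX_REGIONS 200

/* Illustrative stand-in for a registered region; not the driver's struct. */
struct region {
    char name[16];
    uintptr_t addr;
    size_t size;
};

/* Hypothetical fixed-size table of registered regions. */
struct region_table {
    struct region entries[MAX_REGIONS];
    size_t count;
};

/* Returns 0 on success, -1 once the table is full. */
static int region_register(struct region_table *t, const char *name,
                           uintptr_t addr, size_t size)
{
    assert(t != NULL && name != NULL);
    if (t->count >= MAX_REGIONS)
        return -1;

    struct region *r = &t->entries[t->count++];
    strncpy(r->name, name, sizeof(r->name) - 1);
    r->name[sizeof(r->name) - 1] = '\0';
    r->addr = addr;
    r->size = size;
    return 0;
}

/* One large contiguous buffer, standing in for the CMA buffer. */
struct big_buffer {
    unsigned char *base;
    size_t capacity;
    size_t used;
};

/* Copies len bytes to the end of the buffer; -1 if they do not fit. */
static int big_buffer_append(struct big_buffer *b, const void *src, size_t len)
{
    assert(b != NULL && src != NULL && b->used <= b->capacity);
    if (len > b->capacity - b->used)
        return -1;
    memcpy(b->base + b->used, src, len);
    b->used += len;
    return 0;
}
```

In this sketch, a caller that tries to register thousands of separate
regions starts getting -1 after the 200th, whereas appending the same
data to one big_buffer and registering only that buffer consumes a
single table entry.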

I have just glanced at your implementation of OS-minidump. It relies
mostly on the basic concept of RAM content being preserved across boots
and read back later through procfs, but this basic mechanism is common
to pstore(ram) as well, and pstore has file system support. Why don't
you make your driver one of the pstore records? That way Qualcomm
minidump also benefits: the entire OS-minidump record gets registered
with Qualcomm minidump, we get it on panic, and you get it via
pstorefs.

-Mukesh

>
> Best Regards
> Ruipeng


2024-01-08 15:35:23

by Ruipeng Qi

Subject: Re: RESEND: Re: [Patch v6 03/12] docs: qcom: Add qualcomm minidump guide

On Wed, Jan 3, 2024 at 11:27 PM Mukesh Ojha <[email protected]> wrote:
>
>
> One more thing: the kernel part of minidump, which we call APSS
> Minidump, has a limit on the number of entries, so it becomes difficult
> to dump non-contiguous regions beyond a certain number of registrations
> (~200). However, we do have a solution for this in the downstream
> kernel: create a big CMA buffer and register it with Minidump, then
> fill this buffer and create an ELF during panic, so that whatever gets
> dumped into the buffer is captured during the crash. I think you are
> doing something similar with your OS-minidump.
>
> I have just glanced at your implementation of OS-minidump. It relies
> mostly on the basic concept of RAM content being preserved across boots
> and read back later through procfs, but this basic mechanism is common
> to pstore(ram) as well, and pstore has file system support. Why don't
> you make your driver one of the pstore records? That way Qualcomm
> minidump also benefits: the entire OS-minidump record gets registered
> with Qualcomm minidump, we get it on panic, and you get it via
> pstorefs.
>
Thanks, Mukesh! It is a good suggestion for moving OS-minidump forward.
By the way, I have some questions here for which I need your assistance.

Firstly, I can reimplement OS-minidump as one of the pstore records to
dump data. The resulting dump file would contain thousands of
non-contiguous memory regions, each with only the virtual address and
size recorded. As far as I know, Qualcomm's minidump can handle
several memory regions, each with a physical address and size.
This seems to be a difference, and I'm curious as to how you deal with
data dumped by OS-minidump. I would really appreciate it if you could
provide more details on your approach.

Secondly, what tools do you use to analyze the dump data, and does it
support crash tool?

Lastly, is Qualcomm minidump compatible with non-Qualcomm SoCs,
and if so, how can one use it?

Best Regards
Ruipeng Qi

2024-01-09 08:53:47

by Mukesh Ojha

Subject: Re: RESEND: Re: [Patch v6 03/12] docs: qcom: Add qualcomm minidump guide



On 1/8/2024 9:04 PM, Ruipeng Qi wrote:
> On Wed, Jan 3, 2024 at 11:27 PM Mukesh Ojha <[email protected]> wrote:
>>
>>
>> One more thing: the kernel part of minidump, which we call APSS
>> Minidump, has a limit on the number of entries, so it becomes difficult
>> to dump non-contiguous regions beyond a certain number of registrations
>> (~200). However, we do have a solution for this in the downstream
>> kernel: create a big CMA buffer and register it with Minidump, then
>> fill this buffer and create an ELF during panic, so that whatever gets
>> dumped into the buffer is captured during the crash. I think you are
>> doing something similar with your OS-minidump.
>>
>> I have just glanced at your implementation of OS-minidump. It relies
>> mostly on the basic concept of RAM content being preserved across boots
>> and read back later through procfs, but this basic mechanism is common
>> to pstore(ram) as well, and pstore has file system support. Why don't
>> you make your driver one of the pstore records? That way Qualcomm
>> minidump also benefits: the entire OS-minidump record gets registered
>> with Qualcomm minidump, we get it on panic, and you get it via
>> pstorefs.
>>
> Thanks, Mukesh! It is a good suggestion for moving OS-minidump forward.
> By the way, I have some questions here for which I need your assistance.
>
> Firstly, I can reimplement OS-minidump as one of the pstore records to
> dump data. The resulting dump file would contain thousands of
> non-contiguous memory regions, each with only the virtual address and
> size recorded. As far as I know, Qualcomm's minidump can handle
> several memory regions, each with a physical address and size.
> This seems to be a difference, and I'm curious as to how you deal with
> data dumped by OS-minidump. I would really appreciate it if you could
> provide more details on your approach.

My thought was to treat your OS-minidump as one of the pstore records,
similar to existing records like console, ftrace, pmsg, dmesg, etc.
If you follow this series, patches 11/12 and 12/12 get the pstore(ram)
record information and register it with minidump; the physical
addresses there are the ramoops record addresses.

So, once you capture everything inside a record, all that remains is to
get your OS-minidump record's physical address and size and register
them with minidump.
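
As a rough illustration of that idea, here is a minimal userspace
sketch; it is not code from this series. It assumes the record keeps a
small header (virtual address and size) in front of each chunk it
stores, so the many virtual-address ranges live inside the record while
the dump side only sees one physical range. All names (chunk_hdr,
record, dump_region, record_add_chunk, record_to_region) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-chunk header stored inside the record. */
struct chunk_hdr {
    uint64_t vaddr;  /* virtual address the data was copied from */
    uint64_t size;   /* payload bytes that follow this header */
};

/* Hypothetical view of one contiguous record (e.g. a ramoops zone). */
struct record {
    uint64_t paddr;      /* physical base address of the record */
    unsigned char *buf;  /* mapping of the record's memory */
    size_t capacity;     /* total record size in bytes */
    size_t used;         /* bytes written so far */
};

/* All the dump side needs to know: one physical range. */
struct dump_region {
    uint64_t paddr;
    uint64_t size;
};

/* Appends header + payload to the record; -1 if it does not fit. */
static int record_add_chunk(struct record *r, uint64_t vaddr,
                            const void *data, size_t len)
{
    struct chunk_hdr h = { .vaddr = vaddr, .size = len };

    assert(r != NULL && r->buf != NULL && data != NULL);
    assert(r->used <= r->capacity);

    size_t avail = r->capacity - r->used;
    if (avail < sizeof(h) || len > avail - sizeof(h))
        return -1;

    memcpy(r->buf + r->used, &h, sizeof(h));
    memcpy(r->buf + r->used + sizeof(h), data, len);
    r->used += sizeof(h) + len;
    return 0;
}

/* The whole record is described by a single (paddr, size) pair. */
static struct dump_region record_to_region(const struct record *r)
{
    assert(r != NULL);
    struct dump_region d = { .paddr = r->paddr, .size = r->capacity };
    return d;
}
```

However many chunks are added, record_to_region() still yields one
physical address and one size, which is all that has to be registered
in this sketch.
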
>
> Secondly, what tools do you use to analyze the dump data, and does it
> support crash tool?

Currently, we are only trying to capture the pstore ramoops region in
text format and have not been using any tool.

Since Qualcomm minidump is controlled from boot firmware, it cannot be
used on non-Qualcomm SoCs. So here the minidump driver and its use case
are limited to capturing only pstore (ram) records on targets where RAM
content is not guaranteed to be preserved across boots.

So, you can think of minidump as one of the ramoops backends, which
will dump all the ramoops regions/records/zones.

+---------+     +---------+   +--------+     +---------+
| console |     | pmsg    |   | ftrace |     | dmesg   | ...Os-minidump
+---------+     +---------+   +--------+     +---------+
      |             |             |              |
      |             |             |              |
      +------------------------------------------+
                          |
                         \ /
                  +----------------+
            (1)   |pstore frontends|
                  +----------------+
                          |
                         \ /
                 +--------------------+
            (2)  | pstore backend(ram)|
                 +--------------------+
                          |
                         \ /
                 +---------------+
            (3)  | qcom_minidump |
                 +---------------+


>
> Lastly, is Qualcomm minidump compatible with non-Qualcomm SoCs,
> and if so, how can one use it?

I have already answered this above.

-Mukesh

>
> Best Regards
> Ruipeng Qi