2019-09-30 13:35:04

by Prasad Sodagudi

[permalink] [raw]
Subject: Time stamp value in printk records

Hi All,

From Qualcomm side, we would like to check with upstream team about
adding Raw time stamp value to printk records. On Qualcomm soc, there
are various DSPs subsystems are there - for example audio, video and
modem DSPs.
Adding raw timer value(along with sched_clock()) in the printk record
helps in the following use cases –
1) To find out which subsystem crashed first - Whether application
processor crashed first or DSP subsystem?
2) If there are any system stability issues on the DSP side, what is the
activity on the APPS processor side during that time?

Initially during the device boot up, printk shed_clock value can be
matched with timer raw value used on the dsp subsystem, but after APPS
processor suspends several times, we don’t have way to correlate the
time stamp value on the DSP and APPS processor. All timers(both apps
processor timer and dsp timers) are derived from globally always on
timer on Qualcomm soc, So keeping global timer raw values in printk
records and dsp logs help to correlate the activity of all the
processors in SoC.

It would be great if upstream team adds common solution this problem if
all soc vendors would get benefit by adding raw timer value to printk
records.

-Thanks, Prasad

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum,
Linux Foundation Collaborative Project


2019-09-30 13:58:50

by Steven Rostedt

[permalink] [raw]
Subject: Re: Time stamp value in printk records

On Mon, 30 Sep 2019 06:33:42 -0700
Sodagudi Prasad <[email protected]> wrote:

> Hi All,
>
> From Qualcomm side, we would like to check with upstream team about
> adding Raw time stamp value to printk records. On Qualcomm soc, there
> are various DSPs subsystems are there - for example audio, video and
> modem DSPs.
> Adding raw timer value(along with sched_clock()) in the printk record
> helps in the following use cases –
> 1) To find out which subsystem crashed first - Whether application
> processor crashed first or DSP subsystem?
> 2) If there are any system stability issues on the DSP side, what is the
> activity on the APPS processor side during that time?
>
> Initially during the device boot up, printk shed_clock value can be
> matched with timer raw value used on the dsp subsystem, but after APPS
> processor suspends several times, we don’t have way to correlate the
> time stamp value on the DSP and APPS processor. All timers(both apps
> processor timer and dsp timers) are derived from globally always on
> timer on Qualcomm soc, So keeping global timer raw values in printk
> records and dsp logs help to correlate the activity of all the
> processors in SoC.
>
> It would be great if upstream team adds common solution this problem if
> all soc vendors would get benefit by adding raw timer value to printk
>

Hi Prasad,

If you or someone you know would like to present patches for exactly
what you would like to see, that would go a long way.

-- Steve

2019-09-30 15:09:34

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: Time stamp value in printk records

Hi,

On (09/30/19 06:33), Sodagudi Prasad wrote:
> From Qualcomm side, we would like to check with upstream team about adding
> Raw time stamp value to printk records. On Qualcomm soc, there are various
> DSPs subsystems are there - for example audio, video and modem DSPs.
> Adding raw timer value(along with sched_clock()) in the printk record helps
> in the following use cases –
> 1) To find out which subsystem crashed first - Whether application
> processor crashed first or DSP subsystem?
> 2) If there are any system stability issues on the DSP side, what is the
> activity on the APPS processor side during that time?
>
> Initially during the device boot up, printk shed_clock value can be matched
> with timer raw value used on the dsp subsystem, but after APPS processor
> suspends several times, we don’t have way to correlate the time stamp value
> on the DSP and APPS processor. All timers(both apps processor timer and dsp
> timers) are derived from globally always on timer on Qualcomm soc, So
> keeping global timer raw values in printk records and dsp logs help to
> correlate the activity of all the processors in SoC.

Off the top of my head - timestamps are really hard.

Not only because, as of now, we serialize printk() internally. But also
because a lot of things can happen on any CPU between the moment when event
occurs - printk() - and the moment when we read clocks. NMI, IRQ, preemption.

Consider the following case

CPU0 CPU1 CPU2 CPU3
printk() printk() printk() printk()
<<preemption>> <<irq>>
spin_lock(logbuf)
clock()
spin_unlock(logbuf) spin_lock(logbuf)
clock() <<iret>>
spin_unlock(logbuf) spin_lock(logbuf)
clock()
spin_lock(logbuf) spin_unlock(logbuf)
clock()
spin_unlock(logbuf)

-ss

2019-10-01 12:08:05

by Petr Mladek

[permalink] [raw]
Subject: Re: Time stamp value in printk records

On Mon 2019-09-30 06:33:42, Sodagudi Prasad wrote:
> Hi All,
>
> From Qualcomm side, we would like to check with upstream team about adding
> Raw time stamp value to printk records. On Qualcomm soc, there are various
> DSPs subsystems are there - for example audio, video and modem DSPs.
> Adding raw timer value(along with sched_clock()) in the printk record helps
> in the following use cases –
> 1) To find out which subsystem crashed first - Whether application
> processor crashed first or DSP subsystem?
> 2) If there are any system stability issues on the DSP side, what is the
> activity on the APPS processor side during that time?
>
> Initially during the device boot up, printk shed_clock value can be matched
> with timer raw value used on the dsp subsystem, but after APPS processor
> suspends several times, we don’t have way to correlate the time stamp value
> on the DSP and APPS processor. All timers(both apps processor timer and dsp
> timers) are derived from globally always on timer on Qualcomm soc, So
> keeping global timer raw values in printk records and dsp logs help to
> correlate the activity of all the processors in SoC.
>
> It would be great if upstream team adds common solution this problem if all
> soc vendors would get benefit by adding raw timer value to printk records.

There were some proposals in the past. IMHO, the most comprehensive
discussion can be found at
https://lore.kernel.org/lkml/alpine.DEB.2.20.1711131023170.1851@nanos/

The main requirement is:

+ The main timestamp must have the same semantic on all systems

+ User space parses the timestamp. We must not break the format
and semantic.

+ printk() need to get the timestamp a lockless way


Now, different people wanted different clocks in the past. Which
brings several problems:

+ configuration
+ storing
+ output format so that people/tools know what they read

There is a huge risk that it will get over engineered. Also
there is a risk that some userspace tools might parse it
and we would need to maintain compatibility forever.

IMHO, the most acceptable idea was to print a line with "all"
possible clocks every now and then. It can be done regularly
(once a hour/day) or on event (resume).

These are hints and pointers. Feel free to send patches so
that we could discuss them.

Best Regards,
Petr