2021-07-07 10:58:11

by Ani Sinha

[permalink] [raw]
Subject: [PATCH v2] sched/clock: print a log when the sched clock is marked unstable

When the sched clock transitions from stable to unstable and
vice versa, a kernel log is printed. When the sched clock
is marked explicitly as unstable, make __clear_sched_clock_stable()
emit a warning log. It is useful for example in understanding
why a certain feature like NOHZ that depends on availability of a
stable sched clock, is not available.

Signed-off-by: Ani Sinha <[email protected]>
---
kernel/sched/clock.c | 1 +
1 file changed, 1 insertion(+)

Changelog:
V1: original patch
v2: print pr_warn from __clear_sched_clock_stable instead. Commit log
updated with proper prefix and wording.

diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index c2b2859ddd82..99484fec0335 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -183,6 +183,7 @@ static void __clear_sched_clock_stable(void)
return;

tick_dep_set(TICK_DEP_BIT_CLOCK_UNSTABLE);
+ pr_warn("sched_clock: Marking unstable.\n");
schedule_work(&sched_clock_work);
}

--
2.25.1


2021-07-07 11:48:04

by Ani Sinha

[permalink] [raw]
Subject: Re: [PATCH v2] sched/clock: print a log when the sched clock is marked unstable



On Wed, 7 Jul 2021, Peter Zijlstra wrote:

> On Wed, Jul 07, 2021 at 04:26:59PM +0530, Ani Sinha wrote:
> > When the sched clock transitions from stable to unstable and
> > vice versa, a kernel log is printed. When the sched clock
> > is marked explicitly as unstable, make __clear_sched_clock_stable()
> > emit a warning log. It is useful for example in understanding
> > why a certain feature like NOHZ that depends on availability of a
> > stable sched clock, is not available.
> >
> > Signed-off-by: Ani Sinha <[email protected]>
> > ---
> > kernel/sched/clock.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > Changelog:
> > V1: original patch
> > v2: print pr_warn from __clear_sched_clock_stable instead. Commit log
> > updated with proper prefix and wording.
> >
> > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > index c2b2859ddd82..99484fec0335 100644
> > --- a/kernel/sched/clock.c
> > +++ b/kernel/sched/clock.c
> > @@ -183,6 +183,7 @@ static void __clear_sched_clock_stable(void)
> > return;
> >
> > tick_dep_set(TICK_DEP_BIT_CLOCK_UNSTABLE);
> > + pr_warn("sched_clock: Marking unstable.\n");
> > schedule_work(&sched_clock_work);
> > }
>
> I'm confused... that work has a ton of printk()s in, what is this
> pr_warn() adding?
>

Right, but for whatever reason, on latest version of Ubuntu (21.04)
running on hyperV, I do not see those logs:

azureuser@CentOS:~$ uname -a
Linux CentOS 5.11.0-1009-azure #9-Ubuntu SMP Thu Jun 17 02:20:07 UTC 2021
x86_64 x86_64 x86_64 GNU/Linux
azureuser@CentOS:~$ dmesg | grep unstable
dmesg: read kernel buffer failed: Operation not permitted
azureuser@CentOS:~$ sudo su
root@CentOS:/home/azureuser# dmesg | grep unstable
[ 0.000000] tsc: Marking TSC unstable due to running on Hyper-V
root@CentOS:/home/azureuser# dmesg | grep sched
[ 0.384656] rcu: RCU calculated value of scheduler-enlistment delay is
25 jiffies.
[ 1.122977] io scheduler mq-deadline registered
root@CentOS:/home/azureuser# exit

Hence, I was trying the patch.

A

2021-07-07 12:20:07

by Ani Sinha

[permalink] [raw]
Subject: Re: [PATCH v2] sched/clock: print a log when the sched clock is marked unstable



On Wed, 7 Jul 2021, Ani Sinha wrote:

>
>
> On Wed, 7 Jul 2021, Peter Zijlstra wrote:
>
> > On Wed, Jul 07, 2021 at 04:26:59PM +0530, Ani Sinha wrote:
> > > When the sched clock transitions from stable to unstable and
> > > vice versa, a kernel log is printed. When the sched clock
> > > is marked explicitly as unstable, make __clear_sched_clock_stable()
> > > emit a warning log. It is useful for example in understanding
> > > why a certain feature like NOHZ that depends on availability of a
> > > stable sched clock, is not available.
> > >
> > > Signed-off-by: Ani Sinha <[email protected]>
> > > ---
> > > kernel/sched/clock.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > Changelog:
> > > V1: original patch
> > > v2: print pr_warn from __clear_sched_clock_stable instead. Commit log
> > > updated with proper prefix and wording.
> > >
> > > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > > index c2b2859ddd82..99484fec0335 100644
> > > --- a/kernel/sched/clock.c
> > > +++ b/kernel/sched/clock.c
> > > @@ -183,6 +183,7 @@ static void __clear_sched_clock_stable(void)
> > > return;
> > >
> > > tick_dep_set(TICK_DEP_BIT_CLOCK_UNSTABLE);
> > > + pr_warn("sched_clock: Marking unstable.\n");
> > > schedule_work(&sched_clock_work);
> > > }
> >
> > I'm confused... that work has a ton of printk()s in, what is this
> > pr_warn() adding?
> >
>
> Right, but for whatever reason, on latest version of Ubuntu (21.04)
> running on hyperV, I do not see those logs:
>
> azureuser@CentOS:~$ uname -a
> Linux CentOS 5.11.0-1009-azure #9-Ubuntu SMP Thu Jun 17 02:20:07 UTC 2021
> x86_64 x86_64 x86_64 GNU/Linux
> azureuser@CentOS:~$ dmesg | grep unstable
> dmesg: read kernel buffer failed: Operation not permitted
> azureuser@CentOS:~$ sudo su
> root@CentOS:/home/azureuser# dmesg | grep unstable
> [ 0.000000] tsc: Marking TSC unstable due to running on Hyper-V
> root@CentOS:/home/azureuser# dmesg | grep sched
> [ 0.384656] rcu: RCU calculated value of scheduler-enlistment delay is
> 25 jiffies.
> [ 1.122977] io scheduler mq-deadline registered
> root@CentOS:/home/azureuser# exit
>
> Hence, I was trying the patch.

OK never mimd. Seems on hyperV, they use read_hv_sched_clock_tsc() for
pv_sched_clock and therefore using_native_sched_clock() returns false.
Thus, clear_sched_clock_stable() is not called.

I will need to do some instrumentation to confirm this. Maybe we do not
need any patching after all.

2021-07-10 03:50:50

by Ani Sinha

[permalink] [raw]
Subject: Re: [PATCH v2] sched/clock: print a log when the sched clock is marked unstable

On Wed, Jul 7, 2021 at 5:48 PM Ani Sinha <[email protected]> wrote:
>
>
>
> On Wed, 7 Jul 2021, Ani Sinha wrote:
>
> >
> >
> > On Wed, 7 Jul 2021, Peter Zijlstra wrote:
> >
> > > On Wed, Jul 07, 2021 at 04:26:59PM +0530, Ani Sinha wrote:
> > > > When the sched clock transitions from stable to unstable and
> > > > vice versa, a kernel log is printed. When the sched clock
> > > > is marked explicitly as unstable, make __clear_sched_clock_stable()
> > > > emit a warning log. It is useful for example in understanding
> > > > why a certain feature like NOHZ that depends on availability of a
> > > > stable sched clock, is not available.
> > > >
> > > > Signed-off-by: Ani Sinha <[email protected]>
> > > > ---
> > > > kernel/sched/clock.c | 1 +
> > > > 1 file changed, 1 insertion(+)
> > > >
> > > > Changelog:
> > > > V1: original patch
> > > > v2: print pr_warn from __clear_sched_clock_stable instead. Commit log
> > > > updated with proper prefix and wording.
> > > >
> > > > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > > > index c2b2859ddd82..99484fec0335 100644
> > > > --- a/kernel/sched/clock.c
> > > > +++ b/kernel/sched/clock.c
> > > > @@ -183,6 +183,7 @@ static void __clear_sched_clock_stable(void)
> > > > return;
> > > >
> > > > tick_dep_set(TICK_DEP_BIT_CLOCK_UNSTABLE);
> > > > + pr_warn("sched_clock: Marking unstable.\n");
> > > > schedule_work(&sched_clock_work);
> > > > }
> > >
> > > I'm confused... that work has a ton of printk()s in, what is this
> > > pr_warn() adding?
> > >
> >
> > Right, but for whatever reason, on latest version of Ubuntu (21.04)
> > running on hyperV, I do not see those logs:
> >
> > azureuser@CentOS:~$ uname -a
> > Linux CentOS 5.11.0-1009-azure #9-Ubuntu SMP Thu Jun 17 02:20:07 UTC 2021
> > x86_64 x86_64 x86_64 GNU/Linux
> > azureuser@CentOS:~$ dmesg | grep unstable
> > dmesg: read kernel buffer failed: Operation not permitted
> > azureuser@CentOS:~$ sudo su
> > root@CentOS:/home/azureuser# dmesg | grep unstable
> > [ 0.000000] tsc: Marking TSC unstable due to running on Hyper-V
> > root@CentOS:/home/azureuser# dmesg | grep sched
> > [ 0.384656] rcu: RCU calculated value of scheduler-enlistment delay is
> > 25 jiffies.
> > [ 1.122977] io scheduler mq-deadline registered
> > root@CentOS:/home/azureuser# exit
> >
> > Hence, I was trying the patch.
>
> OK never mimd. Seems on hyperV, they use read_hv_sched_clock_tsc() for
> pv_sched_clock and therefore using_native_sched_clock() returns false.
> Thus, clear_sched_clock_stable() is not called.
>
> I will need to do some instrumentation to confirm this. Maybe we do not
> need any patching after all.

Yes, no additional logging is needed here. This patch can be
discarded. There is a potential issue on the platform side code for
hyperV for which we might send a separate patch.