2014-02-11 08:24:45

by Juri Lelli

Subject: [PATCH 0/2] A couple of sched patches

These two patches (on top of today's tip/master) fix bugs
in sched/core. The first is a repost of
http://comments.gmane.org/gmane.linux.kernel/1638425, and it exposed,
at least for me, another bug (fixed by the second patch).

Regards,

- Juri

Juri Lelli (2):
sched/core: fix sched_rt_global_validate
sched/core: make dl_b->lock IRQ safe

kernel/sched/core.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--
1.7.9.5


2014-02-11 08:24:53

by Juri Lelli

Subject: [PATCH 2/2] sched/core: make dl_b->lock IRQ safe

Fix this lockdep warning:

[ 44.804600] =========================================================
[ 44.805746] [ INFO: possible irq lock inversion dependency detected ]
[ 44.805746] 3.14.0-rc2-test+ #14 Not tainted
[ 44.805746] ---------------------------------------------------------
[ 44.805746] bash/3674 just changed the state of lock:
[ 44.805746] (&dl_b->lock){+.....}, at: [<ffffffff8106ad15>] sched_rt_handler+0x132/0x248
[ 44.805746] but this lock was taken by another, HARDIRQ-safe lock in the past:
[ 44.805746] (&rq->lock){-.-.-.}

and interrupts could create inverse lock ordering between them.

[ 44.805746]
[ 44.805746] other info that might help us debug this:
[ 44.805746] Possible interrupt unsafe locking scenario:
[ 44.805746]
[ 44.805746]        CPU0                    CPU1
[ 44.805746]        ----                    ----
[ 44.805746]   lock(&dl_b->lock);
[ 44.805746]                                local_irq_disable();
[ 44.805746]                                lock(&rq->lock);
[ 44.805746]                                lock(&dl_b->lock);
[ 44.805746]   <Interrupt>
[ 44.805746]     lock(&rq->lock);

by making dl_b->lock acquisition always IRQ safe.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
---
kernel/sched/core.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 604dd4e..ed006be 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7412,6 +7412,7 @@ static int sched_dl_global_constraints(void)
u64 period = global_rt_period();
u64 new_bw = to_ratio(period, runtime);
int cpu, ret = 0;
+ unsigned long flags;

/*
* Here we want to check the bandwidth not being set to some
@@ -7425,10 +7426,10 @@ static int sched_dl_global_constraints(void)
for_each_possible_cpu(cpu) {
struct dl_bw *dl_b = dl_bw_of(cpu);

- raw_spin_lock(&dl_b->lock);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
if (new_bw < dl_b->total_bw)
ret = -EBUSY;
- raw_spin_unlock(&dl_b->lock);
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);

if (ret)
break;
@@ -7441,6 +7442,7 @@ static void sched_dl_do_global(void)
{
u64 new_bw = -1;
int cpu;
+ unsigned long flags;

def_dl_bandwidth.dl_period = global_rt_period();
def_dl_bandwidth.dl_runtime = global_rt_runtime();
@@ -7454,9 +7456,9 @@ static void sched_dl_do_global(void)
for_each_possible_cpu(cpu) {
struct dl_bw *dl_b = dl_bw_of(cpu);

- raw_spin_lock(&dl_b->lock);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
dl_b->bw = new_bw;
- raw_spin_unlock(&dl_b->lock);
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);
}
}

--
1.7.9.5

2014-02-11 08:24:51

by Juri Lelli

Subject: [REPOST - PATCH 1/2] sched/core: fix sched_rt_global_validate

Don't compare sysctl_sched_rt_runtime against sysctl_sched_rt_period if
the former is equal to RUNTIME_INF, otherwise disabling -rt bandwidth
management (with CONFIG_RT_GROUP_SCHED=n) fails.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
---
kernel/sched/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 104c816..604dd4e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7465,7 +7465,8 @@ static int sched_rt_global_validate(void)
if (sysctl_sched_rt_period <= 0)
return -EINVAL;

- if (sysctl_sched_rt_runtime > sysctl_sched_rt_period)
+ if ((sysctl_sched_rt_runtime != RUNTIME_INF) &&
+ (sysctl_sched_rt_runtime > sysctl_sched_rt_period))
return -EINVAL;

return 0;
--
1.7.9.5

2014-02-11 08:50:02

by Ingo Molnar

Subject: Re: [PATCH 0/2] A couple of sched patches


* Juri Lelli <[email protected]> wrote:

> These two patches (on top of today's tip/master) fix bugs
> in sched/core. The first is a repost of
> http://comments.gmane.org/gmane.linux.kernel/1638425, and it exposed,
> at least for me, another bug (fixed by the second patch).

With today's -tip there's a new warning on UP builds:

/home/mingo/tip/kernel/sched/deadline.c:993: warning: 'pull_dl_task' declared 'static' but never defined

And in general the #ifdef happiness of deadline.c is rather sad:

comet:~/tip> grep '#ifdef' kernel/sched/deadline.c
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SCHED_HRTICK
#ifdef CONFIG_SMP
#ifdef CONFIG_SCHED_HRTICK
#ifdef CONFIG_SMP
#ifdef CONFIG_SCHED_HRTICK
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP
#ifdef CONFIG_SMP

All that needs to be cleaned up.
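
For the UP warning in particular, one common pattern would be to keep the
declaration under CONFIG_SMP and give !SMP builds an empty inline stub, so
call sites need no #ifdef of their own. A minimal sketch, assuming
pull_dl_task() takes the runqueue as its only argument and returns int (the
signature is guessed from the warning, not taken from any actual fix):

#ifdef CONFIG_SMP
static int pull_dl_task(struct rq *this_rq);
#else
static inline int pull_dl_task(struct rq *this_rq)
{
	/* Nothing to pull on UP: there is only one runqueue. */
	return 0;
}
#endif /* CONFIG_SMP */

Applied consistently, the same stub trick would also let several of the
#ifdef CONFIG_SMP blocks above collapse into plain code.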

Thanks,

Ingo

Subject: [tip:sched/urgent] sched/core: Fix sched_rt_global_validate

Commit-ID: e9e7cb38c21c80c82af4b16608bb4c8c5ec6a28e
Gitweb: http://git.kernel.org/tip/e9e7cb38c21c80c82af4b16608bb4c8c5ec6a28e
Author: Juri Lelli <[email protected]>
AuthorDate: Tue, 11 Feb 2014 09:24:26 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:27:10 +0100

sched/core: Fix sched_rt_global_validate

Don't compare sysctl_sched_rt_runtime against sysctl_sched_rt_period if
the former is equal to RUNTIME_INF, otherwise disabling -rt bandwidth
management (with CONFIG_RT_GROUP_SCHED=n) fails.

Cc: Ingo Molnar <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/sched/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2491448..98d33c1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7475,7 +7475,8 @@ static int sched_rt_global_validate(void)
if (sysctl_sched_rt_period <= 0)
return -EINVAL;

- if (sysctl_sched_rt_runtime > sysctl_sched_rt_period)
+ if ((sysctl_sched_rt_runtime != RUNTIME_INF) &&
+ (sysctl_sched_rt_runtime > sysctl_sched_rt_period))
return -EINVAL;

return 0;

Subject: [tip:sched/urgent] sched/core: Make dl_b->lock IRQ safe

Commit-ID: 495163420ab5398c84af96ca3eae2c6aa4a140da
Gitweb: http://git.kernel.org/tip/495163420ab5398c84af96ca3eae2c6aa4a140da
Author: Juri Lelli <[email protected]>
AuthorDate: Tue, 11 Feb 2014 09:24:27 +0100
Committer: Thomas Gleixner <[email protected]>
CommitDate: Fri, 21 Feb 2014 21:27:10 +0100

sched/core: Make dl_b->lock IRQ safe

Fix this lockdep warning:

[ 44.804600] =========================================================
[ 44.805746] [ INFO: possible irq lock inversion dependency detected ]
[ 44.805746] 3.14.0-rc2-test+ #14 Not tainted
[ 44.805746] ---------------------------------------------------------
[ 44.805746] bash/3674 just changed the state of lock:
[ 44.805746] (&dl_b->lock){+.....}, at: [<ffffffff8106ad15>] sched_rt_handler+0x132/0x248
[ 44.805746] but this lock was taken by another, HARDIRQ-safe lock in the past:
[ 44.805746] (&rq->lock){-.-.-.}

and interrupts could create inverse lock ordering between them.

[ 44.805746]
[ 44.805746] other info that might help us debug this:
[ 44.805746] Possible interrupt unsafe locking scenario:
[ 44.805746]
[ 44.805746]        CPU0                    CPU1
[ 44.805746]        ----                    ----
[ 44.805746]   lock(&dl_b->lock);
[ 44.805746]                                local_irq_disable();
[ 44.805746]                                lock(&rq->lock);
[ 44.805746]                                lock(&dl_b->lock);
[ 44.805746]   <Interrupt>
[ 44.805746]     lock(&rq->lock);

by making dl_b->lock acquisition always IRQ safe.

Cc: Ingo Molnar <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
---
kernel/sched/core.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 98d33c1..33d030a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7422,6 +7422,7 @@ static int sched_dl_global_constraints(void)
u64 period = global_rt_period();
u64 new_bw = to_ratio(period, runtime);
int cpu, ret = 0;
+ unsigned long flags;

/*
* Here we want to check the bandwidth not being set to some
@@ -7435,10 +7436,10 @@ static int sched_dl_global_constraints(void)
for_each_possible_cpu(cpu) {
struct dl_bw *dl_b = dl_bw_of(cpu);

- raw_spin_lock(&dl_b->lock);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
if (new_bw < dl_b->total_bw)
ret = -EBUSY;
- raw_spin_unlock(&dl_b->lock);
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);

if (ret)
break;
@@ -7451,6 +7452,7 @@ static void sched_dl_do_global(void)
{
u64 new_bw = -1;
int cpu;
+ unsigned long flags;

def_dl_bandwidth.dl_period = global_rt_period();
def_dl_bandwidth.dl_runtime = global_rt_runtime();
@@ -7464,9 +7466,9 @@ static void sched_dl_do_global(void)
for_each_possible_cpu(cpu) {
struct dl_bw *dl_b = dl_bw_of(cpu);

- raw_spin_lock(&dl_b->lock);
+ raw_spin_lock_irqsave(&dl_b->lock, flags);
dl_b->bw = new_bw;
- raw_spin_unlock(&dl_b->lock);
+ raw_spin_unlock_irqrestore(&dl_b->lock, flags);
}
}