-stable review patch. If anyone has any objections, please let us know.
------------------
From: Thomas Gleixner <[email protected]>
The integer divisions in the timer accounting code can round the result
down to 0. Adding 0 is without effect and the signal delivery stops.
Clamp the division result to minimum 1 to avoid this.
Problem was reported by Seongbae Park <[email protected]>, who provided
also an inital patch.
Roland sayeth:
I have had some more time to think about the problem, and to reproduce it
using Toyo's test case. For the record, if my understanding of the problem
is correct, this happens only in one very particular case. First, the
expiry time has to be so soon that in cputime_t units (usually 1s/HZ ticks)
it's < nthreads so the division yields zero. Second, it only affects each
thread that is so new that its CPU time accumulation is zero so now+0 is
still zero and ->it_*_expires winds up staying zero. For the VIRT and PROF
clocks when cputime_t is tick granularity (or the SCHED clock on
configurations where sched_clock's value only advances on clock ticks), this
is not hard to arrange with new threads starting up and blocking before they
accumulate a whole tick of CPU time. That's what happens in Toyo's test
case.
Note that in general it is fine for that division to round down to zero,
and set each thread's expiry time to its "now" time. The problem only
arises with thread's whose "now" value is still zero, so that now+0 winds up
0 and is interpreted as "not set" instead of ">= now". So it would be a
sufficient and more precise fix to just use max(ticks, 1) inside the loop
when setting each it_*_expires value.
But, it does no harm to round the division up to one and always advance
every thread's expiry time. If the thread didn't already fire timers for
the expiry time of "now", there is no expectation that it will do so before
the next tick anyway. So I followed Thomas's patch in lifting the max out
of the loops.
This patch also covers the reload cases, which are harder to write a test
for (and I didn't try). I've tested it with Toyo's case and it fixes that.
[[email protected]: fix: min_t -> max_t]
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Roland McGrath <[email protected]>
Cc: Daniel Walker <[email protected]>
Cc: Toyo Abe <[email protected]>
Cc: john stultz <[email protected]>
Cc: Roman Zippel <[email protected]>
Cc: Seongbae Park <[email protected]>
Cc: Peter Mattis <[email protected]>
Cc: Rohit Seth <[email protected]>
Cc: Martin Bligh <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Chris Wright <[email protected]>
---
kernel/posix-cpu-timers.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)
--- linux-2.6.18.1.orig/kernel/posix-cpu-timers.c
+++ linux-2.6.18.1/kernel/posix-cpu-timers.c
@@ -88,6 +88,19 @@ static inline union cpu_time_count cpu_t
}
/*
+ * Divide and limit the result to res >= 1
+ *
+ * This is necessary to prevent signal delivery starvation, when the result of
+ * the division would be rounded down to 0.
+ */
+static inline cputime_t cputime_div_non_zero(cputime_t time, unsigned long div)
+{
+ cputime_t res = cputime_div(time, div);
+
+ return max_t(cputime_t, res, 1);
+}
+
+/*
* Update expiry time from increment, and increase overrun count,
* given the current clock sample.
*/
@@ -483,8 +496,8 @@ static void process_timer_rebalance(stru
BUG();
break;
case CPUCLOCK_PROF:
- left = cputime_div(cputime_sub(expires.cpu, val.cpu),
- nthreads);
+ left = cputime_div_non_zero(cputime_sub(expires.cpu, val.cpu),
+ nthreads);
do {
if (likely(!(t->flags & PF_EXITING))) {
ticks = cputime_add(prof_ticks(t), left);
@@ -498,8 +511,8 @@ static void process_timer_rebalance(stru
} while (t != p);
break;
case CPUCLOCK_VIRT:
- left = cputime_div(cputime_sub(expires.cpu, val.cpu),
- nthreads);
+ left = cputime_div_non_zero(cputime_sub(expires.cpu, val.cpu),
+ nthreads);
do {
if (likely(!(t->flags & PF_EXITING))) {
ticks = cputime_add(virt_ticks(t), left);
@@ -515,6 +528,7 @@ static void process_timer_rebalance(stru
case CPUCLOCK_SCHED:
nsleft = expires.sched - val.sched;
do_div(nsleft, nthreads);
+ nsleft = max_t(unsigned long long, nsleft, 1);
do {
if (likely(!(t->flags & PF_EXITING))) {
ns = t->sched_time + nsleft;
@@ -1159,12 +1173,13 @@ static void check_process_timers(struct
prof_left = cputime_sub(prof_expires, utime);
prof_left = cputime_sub(prof_left, stime);
- prof_left = cputime_div(prof_left, nthreads);
+ prof_left = cputime_div_non_zero(prof_left, nthreads);
virt_left = cputime_sub(virt_expires, utime);
- virt_left = cputime_div(virt_left, nthreads);
+ virt_left = cputime_div_non_zero(virt_left, nthreads);
if (sched_expires) {
sched_left = sched_expires - sched_time;
do_div(sched_left, nthreads);
+ sched_left = max_t(unsigned long long, sched_left, 1);
} else {
sched_left = 0;
}
--