2006-11-08 18:05:13

by Mauricio Lin

[permalink] [raw]
Subject: Jiffies wraparound is not treated in the schedstats

Hi Balbir,

Do you know why in the sched_info_arrive() and sched_info_depart()
functions the calculation of delta_jiffies does not use the time_after
or time_before macro to prevent the miscalculation when jiffies
overflow?

For instance the delta_jiffies variable is simply calculated as:

delta_jiffies = now - t->sched_info.last_queued;

Do not you think the more logical way should be

if (time_after(now, t->sched_info.last_queued))
delta_jiffies = now - t->sched_info.last_queued;
else
delta_jiffies = (MAX_JIFFIES - t->sched_info.last_queued) + now

I have included more variables to measure some issues of schedule in
the kernel (following schedstat idea) and I noticed that jiffies
wraparound has led to wrong values, since the user space tool when
collecting the values is producing negative values.

Any comments?

Can I provide a patch for that?

BR,

Mauricio Lin.


2006-11-08 19:01:24

by Kaz Kylheku

[permalink] [raw]
Subject: Re: Jiffies wraparound is not treated in the schedstats

Mauricio Lin wrote:
> For instance the delta_jiffies variable is simply calculated as:
>
> delta_jiffies = now - t->sched_info.last_queued;
>
> Do not you think the more logical way should be

No. Learn about the modulo arithmetic properties of the unsigned types.

> if (time_after(now, t->sched_info.last_queued))

Have you looked at time_after? It just performs a subtraction, after
casting both operands to long. It does not care which one is bigger than
the other. Basically, it performs the same calculation as the
delta_jiffies above, and then checks the sign bit.

> delta_jiffies = now - t->sched_info.last_queued;
> else
> delta_jiffies = (MAX_JIFFIES - t->sched_info.last_queued) + now

I'm looking at a 2.6.17 kernel, and I can't find MAX_JIFFIES, or
JIFFIES_MAX. There is a MAX_JIFFY_OFFSET, which is something else. It's
the biggest delta that you can add to a value X so that time_after(X, X
+ delta) is still true. If you try to schedule something beyond now +
MAX_JIFFY_OFFSET, it will be put back in time.

Also, your calculation is buggy. MAX_JIFFIES would correspond to
ULONG_MAX. What you want is:

(ULONG_MAX + 1 - last_queued + now)

This expression is now mathematically congruent to

-last_queued + now (mod ULONG_MAX + 1)

or

now - last_queued (mod ULONG_MAX + 1).

The arithmetic of the unsigned long type is /already/ carried out modulo
ULONG_MAX + 1, so the ULONG_MAX + 1 term can be dropped from your
expression, leaving it as

now - last_queued

> I have included more variables to measure some issues of schedule in
> the kernel (following schedstat idea) and I noticed that jiffies
> wraparound has led to wrong values, since the user space tool when
> collecting the values is producing negative values.

What user space tool?

> Any comments?

The user space tool is perhaps buggy, or you are misinterpreting its
output.

> Can I provide a patch for that?

Some day, when you learn how to program.

2006-11-09 06:30:33

by Balbir Singh

[permalink] [raw]
Subject: Re: Jiffies wraparound is not treated in the schedstats

Mauricio Lin wrote:
> Hi Balbir,
>
> Do you know why in the sched_info_arrive() and sched_info_depart()
> functions the calculation of delta_jiffies does not use the time_after
> or time_before macro to prevent the miscalculation when jiffies
> overflow?
>
> For instance the delta_jiffies variable is simply calculated as:
>
> delta_jiffies = now - t->sched_info.last_queued;
>
> Do not you think the more logical way should be
>
> if (time_after(now, t->sched_info.last_queued))
> delta_jiffies = now - t->sched_info.last_queued;
> else
> delta_jiffies = (MAX_JIFFIES - t->sched_info.last_queued) + now
>

What's MAX_JIFFIES? Is it MAX_ULONG? jiffies is unsigned long
so you'll have to be careful with unsigned long arithmetic.

Consider that now is 5 and t->sched_info.last_queued is 10.

On my system

perl -e '{printf("%lu\n", -5 + (1<<32) - 1);}'
4294967291

perl -e '{printf("%lu\n", -5 );}'
4294967291


> I have included more variables to measure some issues of schedule in
> the kernel (following schedstat idea) and I noticed that jiffies
> wraparound has led to wrong values, since the user space tool when
> collecting the values is producing negative values.
>

hmm.. jiffies wrapped around in sched_info_depart()? I've never seen
that happen. Could you post the additions and user space tool to look at?
What additional features are you planning to measure in the scheduler?

> Any comments?
>
> Can I provide a patch for that?
>

Please feel free to provide patches, this is open source!!

> BR,
>
> Mauricio Lin.


--

Balbir Singh,
Linux Technology Center,
IBM Software Labs

2006-11-09 06:46:26

by Balbir Singh

[permalink] [raw]
Subject: Re: Jiffies wraparound is not treated in the schedstats

Balbir Singh wrote:
>
> hmm.. jiffies wrapped around in sched_info_depart()? I've never seen
> that happen.

I meant beyond the edge of the boundary twice or more. One overflow is handled
as explained in the previous mail.

--

Balbir Singh,
Linux Technology Center,
IBM Software Labs

2006-11-09 14:35:56

by Mauricio Lin

[permalink] [raw]
Subject: Re: Jiffies wraparound is not treated in the schedstats

Hi Balbir,

I did not know about arithmetic properties of the unsigned types, so
the misundertanding about jiffies calculation. It was my mistake.

On 11/9/06, Balbir Singh <[email protected]> wrote:
> Mauricio Lin wrote:
> > Hi Balbir,
> >
> > Do you know why in the sched_info_arrive() and sched_info_depart()
> > functions the calculation of delta_jiffies does not use the time_after
> > or time_before macro to prevent the miscalculation when jiffies
> > overflow?
> >
> > For instance the delta_jiffies variable is simply calculated as:
> >
> > delta_jiffies = now - t->sched_info.last_queued;
> >
> > Do not you think the more logical way should be
> >
> > if (time_after(now, t->sched_info.last_queued))
> > delta_jiffies = now - t->sched_info.last_queued;
> > else
> > delta_jiffies = (MAX_JIFFIES - t->sched_info.last_queued) + now
> >
>
> What's MAX_JIFFIES? Is it MAX_ULONG? jiffies is unsigned long
> so you'll have to be careful with unsigned long arithmetic.

It is ULONG_MAX. :)

>
> Consider that now is 5 and t->sched_info.last_queued is 10.
>
> On my system
>
> perl -e '{printf("%lu\n", -5 + (1<<32) - 1);}'
> 4294967291
>
> perl -e '{printf("%lu\n", -5 );}'
> 4294967291
>
>
> > I have included more variables to measure some issues of schedule in
> > the kernel (following schedstat idea) and I noticed that jiffies
> > wraparound has led to wrong values, since the user space tool when
> > collecting the values is producing negative values.
> >
>
> hmm.. jiffies wrapped around in sched_info_depart()? I've never seen
> that happen. Could you post the additions and user space tool to look at?
> What additional features are you planning to measure in the scheduler?

Well, one of the behaviour we would like to check its the time that
the system remains idle. As the schedstat already provide the
sched_goidle (number of time the system gets idle), during the
measurements we would like to know also the time spent in the idle
state.

Checking the results, perhaps such new info is not necessary to be
included if we want the percentage of use in our measurements. For
instance if we want the cpu_load calculated as:

cpu_load = (cpu_time2 - cpu_time1)/sample_period

where "cpu_time" corresponds to the rq->rq_sched_info.cpu_time in the
kernel and "sample period" is the time spent running the use case.

So the "cpu_idle_load" can be calculated as (1-cpu_load).

If we include in the __sched_info_switch() (kernel) something like:

...
if (prev != rq->idle)
sched_info_depart(prev);
else
rq->rq_sched_info.idle_time +=
jiffies - prev->sched_info.last_arrival;

if (next != rq->idle)
sched_info_arrive(next);
else
next->sched_info.last_arrival = jiffies;
...

We can calculated the cpu_idle_load at user space as:

cpu_idle_load = (cpu_idle_time2 - cpu_idle_time1)/sample_period

where cpu_idle_time corresponds to rq->rq_sched_info.idle_time

This info leads to the same result if we just base the calculation on
the (1-cpu_load) as mentioned previously.

BR,

Mauricio Lin.

2007-05-30 04:10:43

by Oleg Verych

[permalink] [raw]
Subject: Re: Jiffies wraparound is not treated in the schedstats

On 2006-11-09, Balbir Singh <[email protected]> wrote:
> Path: news.gmane.org!not-for-mail
> From: Balbir Singh <[email protected]>
> Newsgroups: gmane.linux.kernel
> Subject: Re: Jiffies wraparound is not treated in the schedstats
> Date: Thu, 09 Nov 2006 11:59:45 +0530
> Organization: IBM
> Lines: 61
> Approved: [email protected]
> Message-ID: <[email protected]>
> References: <[email protected]>
> Reply-To: [email protected]
> NNTP-Posting-Host: main.gmane.org
> Mime-Version: 1.0
> Content-Type: text/plain; charset=ISO-8859-1
> Content-Transfer-Encoding: 7bit
> X-Trace: sea.gmane.org 1163053871 25652 80.91.229.2 (9 Nov 2006 06:31:11 GMT)
> X-Complaints-To: [email protected]
> NNTP-Posting-Date: Thu, 9 Nov 2006 06:31:11 +0000 (UTC)
> Cc: linux-kernel <[email protected]>
> Original-X-From: linux-kernel-owner+glk-linux-kernel-3=40m.gmane.org-S1754746AbWKIGad@vger.kernel.org Thu Nov 09 07:31:09 2006
> Return-path: <linux-kernel-owner+glk-linux-kernel-3=40m.gmane.org-S1754746AbWKIGad@vger.kernel.org>
> Envelope-to: [email protected]
> Original-Received: from vger.kernel.org ([209.132.176.167]) by ciao.gmane.org with esmtp (Exim 4.43) id 1Gi3RH-0005lG-4D for [email protected]; Thu, 09 Nov 2006 07:31:03 +0100
> Original-Received: ([email protected]) by vger.kernel.org via listexpand id S1754746AbWKIGad (ORCPT <rfc822;[email protected]>); Thu, 9 Nov 2006 01:30:33 -0500
> Original-Received: ([email protected]) by vger.kernel.org id S1754747AbWKIGad (ORCPT <rfc822;linux-kernel-outgoing>); Thu, 9 Nov 2006 01:30:33 -0500
> Original-Received: from ausmtp05.au.ibm.com ([202.81.18.154]:27366 "EHLO ausmtp05.au.ibm.com") by vger.kernel.org with ESMTP id S1754746AbWKIGac (ORCPT <rfc822;[email protected]>); Thu, 9 Nov 2006 01:30:32 -0500
> Original-Received: from sd0208e0.au.ibm.com (d23rh904.au.ibm.com [202.81.18.202]) by ausmtp05.au.ibm.com (8.13.8/8.13.6) with ESMTP id kA9IWR8m2142218 for <[email protected]>; Thu, 9 Nov 2006 17:32:29 -0100
> Original-Received: from d23av01.au.ibm.com (d23av01.au.ibm.com [9.190.250.242]) by sd0208e0.au.ibm.com (8.13.6/8.13.6/NCO v8.1.1) with ESMTP id kA96XQUu233974 for <[email protected]>; Thu, 9 Nov 2006 17:33:46 +1100
> Original-Received: from d23av01.au.ibm.com (loopback [127.0.0.1]) by d23av01.au.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id kA96TxcF020262 for <[email protected]>; Thu, 9 Nov 2006 17:29:59 +1100
> Original-Received: from [9.124.96.199] ([9.124.96.199]) by d23av01.au.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id kA96Tuua020086; Thu, 9 Nov 2006 17:29:58 +1100
> User-Agent: Thunderbird 1.5.0.7 (X11/20060922)
> Original-To: Mauricio Lin <[email protected]>
> In-Reply-To: <[email protected]>
> Original-Sender: [email protected]
> Precedence: bulk
> X-Mailing-List: [email protected]
> Xref: news.gmane.org gmane.linux.kernel:465046
> Archived-At: <http://permalink.gmane.org/gmane.linux.kernel/465046>
>
> Mauricio Lin wrote:
>> Hi Balbir,
>>
>> Do you know why in the sched_info_arrive() and sched_info_depart()
>> functions the calculation of delta_jiffies does not use the time_after
>> or time_before macro to prevent the miscalculation when jiffies
>> overflow?
>>
>> For instance the delta_jiffies variable is simply calculated as:
>>
>> delta_jiffies = now - t->sched_info.last_queued;
>>
>> Do not you think the more logical way should be
>>
>> if (time_after(now, t->sched_info.last_queued))
>> delta_jiffies = now - t->sched_info.last_queued;
>> else
>> delta_jiffies = (MAX_JIFFIES - t->sched_info.last_queued) + now
>>
>
> What's MAX_JIFFIES? Is it MAX_ULONG? jiffies is unsigned long
> so you'll have to be careful with unsigned long arithmetic.
>
> Consider that now is 5 and t->sched_info.last_queued is 10.
>
> On my system
>
> perl -e '{printf("%lu\n", -5 + (1<<32) - 1);}'
> 4294967291

So, according to this
> perl -e '{printf("%lu\n", -5 );}'
> 4294967291
you have 32bit perl (and OS).

I have same result in
"This is perl, v5.8.4 built for i386-linux-thread-multi"

(knoppix 3.2.3 in qemu on intel's amd64).

That means we are on the same question i asked before about integer
overflow. In this case "(1<<32) = 1",

>> I have included more variables to measure some issues of schedule in
>> the kernel (following schedstat idea) and I noticed that jiffies
>> wraparound has led to wrong values, since the user space tool when
>> collecting the values is producing negative values.
>>
>
> hmm.. jiffies wrapped around in sched_info_depart()? I've never seen
> that happen. Could you post the additions and user space tool to look at?
> What additional features are you planning to measure in the scheduler?
>
>> Any comments?
>>
>> Can I provide a patch for that?
>>
>
> Please feel free to provide patches, this is open source!!
>
>> BR,
>>
>> Mauricio Lin.
>
>
> --
>
> Balbir Singh,
> Linux Technology Center,
> IBM Software Labs