2003-05-08 08:49:46

by Ming Lei

Subject: linux rt priority thread corrupt global variable?

Platform:
Intel Pentium II; RedHat 7.2 with kernel version 2.4.7-10, libc 2.2.4-13 and
gcc 2.96.

Problem description:

A program has three threads with priorities 12, 10 and 6, and the main
process at priority 0. All the threads except the main process are created
with pthread_create and use SCHED_FIFO as the real-time scheduling policy.

There is a global variable defined as 'int cpl'. All the threads and the main
process may alter cpl at any time. cpl takes one of these values: {0,
0xf000006e, 0xf0000068, 0xe0000000, 0xe0000060}. Every access to cpl is
protected by a mutex.
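
For readers who want to reproduce the setup, here is a minimal sketch of what
the description above amounts to. It is an assumption-laden reconstruction,
not the original program: the helper names (set_cpl, get_cpl, spawn_fifo,
run_demo, struct worker_arg), the value each thread writes, and the fallback
to default scheduling when SCHED_FIFO is refused (it normally requires root)
are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define N_WORKERS 3
#define N_WRITES  1000

static int cpl;                       /* the shared global from the post */
static pthread_mutex_t cpl_lock = PTHREAD_MUTEX_INITIALIZER;

/* Every access goes through these two helpers (hypothetical names). */
static void set_cpl(int v)
{
    pthread_mutex_lock(&cpl_lock);
    cpl = v;
    pthread_mutex_unlock(&cpl_lock);
}

static int get_cpl(void)
{
    pthread_mutex_lock(&cpl_lock);
    int v = cpl;
    pthread_mutex_unlock(&cpl_lock);
    return v;
}

struct worker_arg { int prio; int value; };

/* Each worker repeatedly stores its own value; the choice is arbitrary. */
static void *worker(void *p)
{
    const struct worker_arg *a = p;
    for (int i = 0; i < N_WRITES; i++)
        set_cpl(a->value);
    return NULL;
}

/* Create a SCHED_FIFO thread; fall back to default attributes if the
 * real-time request is rejected (assumption: typically an unprivileged run). */
static int spawn_fifo(pthread_t *t, struct worker_arg *a)
{
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = a->prio };
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &sp);
    rc = pthread_create(t, &attr, worker, a);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        rc = pthread_create(t, NULL, worker, a);
    return rc;
}

/* Start the three workers, wait for them, report the final cpl value. */
static int run_demo(int *final_cpl)
{
    static struct worker_arg args[N_WORKERS] = {
        { 12, (int)0xf000006eU },
        { 10, (int)0xf0000068U },
        {  6, (int)0xe0000060U },
    };
    pthread_t tid[N_WORKERS];
    int started = 0, rc = 0;

    assert(final_cpl != NULL);
    set_cpl(0);
    while (started < N_WORKERS) {
        rc = spawn_fifo(&tid[started], &args[started]);
        if (rc != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    *final_cpl = get_cpl();
    return rc;
}
```

Calling run_demo(&v) should leave v equal to one of the three worker values.
The point of the sketch is that the mutex only protects accesses that go
through set_cpl and get_cpl; any read or write that bypasses them is
unprotected.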

<Problem> At some point in execution cpl should hold one value, say
0xe0000060, but the value actually found in cpl is another, say 0xe0000000;
that is, the value changes without the program having done anything to it.
The value I observe is always a historic one (one of the values in the set
above), never an arbitrary value. The problem has occurred just after a
context switch and also in the middle of a thread's execution.

<Confirm> I used the Intel debug registers to track every write to cpl's
memory address globally, which is how GDB implements x86 hardware
watchpoints. I could see all the writes my program made to cpl, but not the
write that caused the problem. So I don't know what causes the problem.

Can anyone listening give me a direction or hint on this annoying situation?

PS. Please cc this email address.
-Ming


Related questions:

Is Linux kernel 2.4.10 considered strictly preemptive, like VxWorks or other
RTOSes? I guess 2.4.10 may simulate preemption by running the scheduler on
every return from a syscall or interrupt. Am I right?

Is printf() real-time priority thread safe?


2003-05-08 09:31:02

by Jörn Engel

Subject: Re: linux rt priority thread corrupt global variable?

On Thu, 8 May 2003 02:03:35 -0700, Ming Lei wrote:
>
> Platform:
> Intel Pentium II; RedHat 7.2 with kernel version 2.4.7-10, libc 2.2.4-13 and
> gcc 2.96.

You should either upgrade to 2.4.20 or similar or post the question to
RedHat for their kernels. If the problem can be reproduced with
2.4.20, come back here. :)

> Problem description:
>
> A program has three threads with priorities 12, 10 and 6, and the main
> process at priority 0. All the threads except the main process are created
> with pthread_create and use SCHED_FIFO as the real-time scheduling policy.
>
> There is a global variable defined as 'int cpl'. All the threads and the main
> process may alter cpl at any time. cpl takes one of these values: {0,
> 0xf000006e, 0xf0000068, 0xe0000000, 0xe0000060}. Every access to cpl is
> protected by a mutex.
>
> <Problem> At some point in execution cpl should hold one value, say
> 0xe0000060, but the value actually found in cpl is another, say 0xe0000000;
> that is, the value changes without the program having done anything to it.
> The value I observe is always a historic one (one of the values in the set
> above), never an arbitrary value. The problem has occurred just after a
> context switch and also in the middle of a thread's execution.
>
> <Confirm> I used the Intel debug registers to track every write to cpl's
> memory address globally, which is how GDB implements x86 hardware
> watchpoints. I could see all the writes my program made to cpl, but not the
> write that caused the problem. So I don't know what causes the problem.
>
> Can anyone listening give me a direction or hint on this annoying situation?

Sounds a bit like a caching problem. Old value in cache, new value
written to memory, cache line dirty => flushed, old value written to
memory again. But it could also be something else.

Jörn

--
Simplicity is prerequisite for reliability.
-- Edsger W. Dijkstra

2003-05-08 09:40:06

by Bill Huey

Subject: Re: linux rt priority thread corrupt global variable?

On Thu, May 08, 2003 at 02:03:35AM -0700, Ming Lei wrote:
> Related questions:
>
> Is Linux kernel 2.4.10 considered strictly preemptive, like VxWorks or other
> RTOSes? I guess 2.4.10 may simulate preemption by running the scheduler on
> every return from a syscall or interrupt. Am I right?

No, it's not a fully preemptive kernel, but spreads preemption points
throughout the source tree, both directly and indirectly, instead. Spinlocks
are the primary mutex of choice in Linux and create atomic critical sections
that can't be preempted with respect to the normal Linux scheduler. Fully
preemptive systems tend to use sleepable locks with relaxed preemptability
within critical sections and add the possible option of priority inheritance
depending on the system.

If you're going to do RT Linux related stuff, use RTLinux, RTAI or other
commercial options instead.

> Is printf() real-time priority thread safe?

Stock Linux is definitely not if I understand what you're saying and
if I understand the code correctly. :)

bill

2003-05-08 09:39:14

by Arjan van de Ven

Subject: Re: linux rt priority thread corrupt global variable?

On Thu, 2003-05-08 at 11:03, Ming Lei wrote:
> Platform:
> Intel Pentium II; RedHat 7.2 with kernel version 2.4.7-10,
eeep that's an old one; it has been superseded by like 10 or more
errata kernels.



2003-05-08 09:46:38

by Bill Huey

Subject: Re: linux rt priority thread corrupt global variable?

On Thu, May 08, 2003 at 02:52:38AM -0700, Bill Huey wrote:
> No, it's not a fully preemptive kernel, but spreads preemption points
> throughout the source tree, both directly and indirectly, instead. Spinlocks
> are the primary mutex of choice in Linux and create atomic critical sections
> that can't be preempted with respect to the normal Linux scheduler. Fully

Geez, this isn't exactly right either, my brain is failing me at the moment.

> preemptive systems tend to use sleepable locks with relaxed preemptability
> within critical sections and add the possible option of priority inheritance
> depending on the system.

/me thinks

bill

2003-05-08 10:30:21

by Bill Huey

Subject: Re: linux rt priority thread corrupt global variable?

On Thu, May 08, 2003 at 02:59:11AM -0700, Bill Huey wrote:
> On Thu, May 08, 2003 at 02:52:38AM -0700, Bill Huey wrote:
> > No, it's not a fully preemptive kernel, but spreads preemption points
> > throughout the source tree, both directly and indirectly, instead. Spinlocks
> > are the primary mutex of choice in Linux and create atomic critical sections
> > that can't be preempted with respect to the normal Linux scheduler. Fully
>
> Geez, this isn't exactly right either, my brain is failing me at the moment.

I was right the first time. :) Just remember why breaking a spinlock is
a bad thing to do. :)

bill

2003-05-08 16:46:04

by Ming Lei

Subject: Re: linux rt priority thread corrupt global variable?


Does anyone know how the Intel x86 debug registers monitor write access to
a specified memory address? I looked at the gdb code and found that only the
process's virtual address of the watched variable is written to the debug
register. Does that mean the x86 debug registers only watch the virtual
address? I want to know whether the Intel hardware watches the real physical
address, the virtual address, or the CPU cache. Where can I find this info?
I didn't find it in the Intel manual.
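
For reference, the Intel architecture manuals describe DR0-DR3 as holding
linear addresses (the address after segmentation, before paging), so the
match is made on the address the instruction used, not on the physical
address or the cache. A write that reaches the same memory through a
different linear address would therefore not trigger the watchpoint. The
sketch below only shows how the DR7 control bits for such a watchpoint are
laid out; the function and enum names are hypothetical, and actually loading
the registers needs ring 0 or an interface such as ptrace, which is not
shown.

```c
#include <assert.h>
#include <stdint.h>

/* R/Wn field of DR7: what kind of access triggers the breakpoint. */
enum dr_rw {
    DR_RW_EXEC      = 0,   /* instruction execution */
    DR_RW_WRITE     = 1,   /* data writes only */
    DR_RW_IO        = 2,   /* I/O access (only with CR4.DE set) */
    DR_RW_READWRITE = 3    /* data reads or writes */
};

/* LENn field of DR7: size of the watched location. */
enum dr_len {
    DR_LEN_1 = 0,
    DR_LEN_2 = 1,
    DR_LEN_8 = 2,          /* 8 bytes on newer CPUs; undefined on older ones */
    DR_LEN_4 = 3
};

/* Return dr7 with breakpoint `slot` (0..3) locally enabled for the given
 * access type and length. Bit layout: Ln at bit 2*slot, R/Wn at bits
 * 16+4*slot .. 17+4*slot, LENn at bits 18+4*slot .. 19+4*slot. */
static uint32_t dr7_enable(uint32_t dr7, int slot, enum dr_rw rw, enum dr_len len)
{
    assert(slot >= 0 && slot <= 3);

    uint32_t ctrl  = ((uint32_t)len << 2) | (uint32_t)rw;  /* LENn:R/Wn nibble */
    unsigned shift = 16u + 4u * (unsigned)slot;

    dr7 &= ~(UINT32_C(0xF) << shift);          /* clear old control nibble */
    dr7 |= ctrl << shift;                      /* install new one */
    dr7 |= UINT32_C(1) << (2u * (unsigned)slot);  /* set local-enable bit Ln */
    return dr7;
}
```

For a 4-byte write watchpoint on an int such as cpl in slot 0,
dr7_enable(0, 0, DR_RW_WRITE, DR_LEN_4) gives 0xD0001; the variable's
linear address would go into DR0.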


> > a program has 3 threads of priority 12, 10, 6 respectively, and the main
> > process at priority 0. All the threads except main process is created
with
> > pthread_create, and defined SCHED_FIFO as real time scheduler policy.
> >
> > There is a global variable I define with 'int cpl'. All the threads and
main
> > process may alter cpl at any time. cpl may have one of these values {0,
> > 0xf000006e, 0xf0000068, 0xe0000000, 0xe0000060}. cpl is protected by
mutex
> > for any access.
> >
> > <Problem=> at some point of execution which cpl should be a value say
> > e0000060, but the actual value retained at cpl is another say e0000000;
that
> > is, the value is changed without the program actually done anything on
it.
> > The retained value I observed is kind of historic value(one of these
value
> > in the above set), not the arbituary value. The problem had occured just
> > after context switch, also occured during a thread execution.
> >
> > <Confirm> I used Intel debug register to track any writing to the cpl
memory
> > address globally, which is the way GDB use for x86 hardware watchpoint
> > implementation. I could see all the writing from my program to change
cpl,
> > but failed to see the source from which the problem occured. So I dont
know
> > what cause the problem.
> >
> > Can anyone listening give me a direction or hint on this annoying
situation?
>
> Sounds a bit like a caching problem. Old value in cache, new value
> written to memory, chache line dirty => flushed, old value written to
> memory again. But it could also be something else.

2003-05-08 20:33:05

by Roger Larsson

Subject: Re: linux rt priority thread corrupt global variable?

On Thursday 08 May 2003 11:03, Ming Lei wrote:
>
> Is Linux kernel 2.4.10 considered strictly preemptive, like VxWorks or other
> RTOSes? I guess 2.4.10 may simulate preemption by running the scheduler on
> every return from a syscall or interrupt. Am I right?
>

Yes, but what else is there?
- A timer interrupt that ends a sleep for a RT process.
- A device interrupt that notifies a RT process about new data.
- A process that wakes up another process.
The problem with 2.4.10 is that while the current process is
executing IN the kernel, the woken RT process has to wait
until the current process leaves the kernel or goes to sleep.
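
One rough way to see this delay from user space is to measure how much later
than requested a sleeping thread actually wakes up. The sketch below is only
an illustration under assumptions: the function names are hypothetical, it
uses clock_nanosleep and CLOCK_MONOTONIC (which may not be available on a
2.4-era libc), and it runs at whatever priority the caller has, so it has to
be called from a SCHED_FIFO thread to say anything about RT wake-ups.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Convert a timespec to nanoseconds. */
static long long ts_to_ns(const struct timespec *t)
{
    return (long long)t->tv_sec * NS_PER_SEC + t->tv_nsec;
}

/* Sleep for sleep_ns nanoseconds and return how many nanoseconds beyond
 * the requested interval passed before the caller was running again. */
static long long wakeup_overshoot_ns(long long sleep_ns)
{
    struct timespec req, rem, t0, t1;
    int rc;

    assert(sleep_ns >= 0);
    req.tv_sec  = (time_t)(sleep_ns / NS_PER_SEC);
    req.tv_nsec = (long)(sleep_ns % NS_PER_SEC);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* clock_nanosleep returns the error number directly; on EINTR,
     * continue with the remaining time. */
    while ((rc = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &rem)) == EINTR)
        req = rem;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return ts_to_ns(&t1) - ts_to_ns(&t0) - sleep_ns;
}
```

Calling wakeup_overshoot_ns(1000000) repeatedly while another process is
busy in long system calls, and recording the maximum, gives a crude picture
of the worst-case wake-up delay; it does not separate timer granularity from
time spent waiting for the kernel to become preemptible.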

This is not a huge problem since there are patches for 2.4.10 that add
explicit checks at the problem spots found in the kernel (loops over long
lists).

Later kernels got some of these improvements. There are patches for
these as well.

In the 2.5 series you can configure a preemptive kernel.
With that, preemption can happen in the kernel, but not
while inside a spinlock. There are patches for this case
as well.

/RogerL

--
Roger Larsson
Skellefteå
Sweden