Hi all,
I am using the software Cinelerra and I noticed that it was
systematically crashing (segfaults) under 2.6.9 where using a 2.6.7 was
ok. After looking more in depth at the problem it seems that the main
issue comes from the fact that several objects (it's C++) are tested in
one thread while they are freed in another. Changing from 2.6.7 to 2.6.9
seems to make it possible to have the free(object) occurring before the
test which leads to a segfault.
So, my little question is: "Did something changed recently in the kernel
about mutexes, phtreads, and so on ???"
I didn't find anything obvious about this by browsing the web.
Regards
--
Emmanuel Fleury
Computer Science Department, | Office: B1-201
Aalborg University, | Phone: +45 96 35 72 23
Fredriks Bajersvej 7E, | Fax: +45 98 15 98 89
9220 Aalborg East, Denmark | Email: [email protected]
On Wednesday 10 November 2004 15.24, you wrote:
> Hi all,
>
> I am using the software Cinelerra and I noticed that it was
> systematically crashing (segfaults) under 2.6.9 where using a 2.6.7 was
> ok. After looking more in depth at the problem it seems that the main
> issue comes from the fact that several objects (it's C++) are tested in
> one thread while they are freed in another. Changing from 2.6.7 to 2.6.9
> seems to make it possible to have the free(object) occurring before the
> test which leads to a segfault.
>
> So, my little question is: "Did something changed recently in the kernel
> about mutexes, phtreads, and so on ???"
A slight change in how the kernel schedules thread can cause failures in
already broken programs. Such changes can come from any source. A
consequence of preemptive scheduling is that you never when code in different
threads run relative to each other except when you explicitly synchronize
them.
It's very easy to create multithreaded programs that work fine by pure luck,
until one day they stop working. Testing alone is not likely to find bugs
caused by broken or missing synchronization, although multiprocessor machines
triggers such bugs more easily.
This doesn't rule out kernel bugs, but that's not where I'd start looking.
First check that access to the objects in question is synchronized properly.
-- robin