Hi all,
Just pulled latest bk of 2.4 and it appears to be broken. The recent
rwsem race fixes seem to be the culprit (see
http://linux.bkbits.net:8080/linux-2.4/cset@40cee86dCLGhZc1lEOWZV6K7FysQlw?nav=index.html|
ChangeSet@-1d). Reversing it fixes the problem.
gcc -D__KERNEL__ -I/usr/local/src/linux-2.4/include -Wall -Wstrict-
prototypes -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -fomit-
frame-pointer -pipe -mpreferred-stack-boundary=2 -march=athlon -
nostdinc -iwithprefix include -DKBUILD_BASENAME=rwsem -DEXPORT_SYMTAB -c
rwsem.c
rwsem.c: In function `__rwsem_do_wake':
rwsem.c:67: warning: implicit declaration of function `put_task_struct'
rwsem.c: In function `rwsem_down_failed_common':
rwsem.c:131: error: `mem_map' undeclared (first use in this function)
rwsem.c:131: error: (Each undeclared identifier is reported only once
rwsem.c:131: error: for each function it appears in.)
make[2]: *** [rwsem.o] Error 1
make[2]: Leaving directory `/usr/local/src/linux-2.4/lib'
make[1]: *** [first_rule] Error 2
make[1]: Leaving directory `/usr/local/src/linux-2.4/lib'
make: *** [_dir_lib] Error 2
x86 target, gcc 3.3.2 (or 2.95.4 prerelease). Ping me if you need .config
or anything else.
[nuno@calvin] /usr/local/src/linux-2.4$ patch -Rp1 < fix-rwsem-race
patching file lib/rwsem-spinlock.c
patching file lib/rwsem.c
[nuno@calvin] /usr/local/src/linux-2.4$ cd lib
[nuno@calvin] ..src/linux-2.4/lib$ gcc -D__KERNEL__ -I/usr/local/src/
linux-2.4/include -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-
strict-aliasing -fno-common -fomit-frame-pointer -pipe -mpreferred-stack-
boundary=2 -march=athlon -nostdinc -iwithprefix include -
DKBUILD_BASENAME=rwsem -DEXPORT_SYMTAB -c rwsem.c
[nuno@calvin] ..src/linux-2.4/lib$ echo $?
0
Regards,
Nuno
Nuno Monteiro wrote:
>
> Hi all,
>
>
> Just pulled latest bk of 2.4 and it appears to be broken. The recent
> rwsem race fixes seem to be the culprit (see
> http://linux.bkbits.net:8080/linux-2.4/cset@40cee86dCLGhZc1lEOWZV6K7FysQlw?nav=index.html|
> ChangeSet@-1d). Reversing it fixes the problem.
>
Sorry, that was stupid of me.
Does the attached patch look acceptable? In particular, should
task_lock be used in this manner? (ie. to guarantee the task doesn't
go away).
> - put_task_struct(tsk);
> + task_unlock(tsk);
Ummm... that doesn't look right.
> - get_task_struct(tsk);
This is necessary to stop someone deallocating the task structure, can the
task structure be deallocated whilst locked?
David
David Howells wrote:
>>- put_task_struct(tsk);
>>+ task_unlock(tsk);
>
>
> Ummm... that doesn't look right.
>
>
>>- get_task_struct(tsk);
>
>
> This is necessary to stop someone deallocating the task structure, can the
> task structure be deallocated whilst locked?
>
Ooh maybe it can. Should that be a read_lock of the tasklist lock then?
Nick Piggin writes:
> David Howells wrote:
> >>- put_task_struct(tsk);
> >>+ task_unlock(tsk);
> >
> >
> > Ummm... that doesn't look right.
> >
> >
> >>- get_task_struct(tsk);
> >
> >
> > This is necessary to stop someone deallocating the task structure, can the
> > task structure be deallocated whilst locked?
> >
>
> Ooh maybe it can. Should that be a read_lock of the tasklist lock then?
For 2.4 kernels, use get_task_struct() and free_task_struct() [not put]
for locking and unlocking a task.
Mikael Pettersson wrote:
> Nick Piggin writes:
> > David Howells wrote:
> > >>- put_task_struct(tsk);
> > >>+ task_unlock(tsk);
> > >
> > >
> > > Ummm... that doesn't look right.
> > >
> > >
> > >>- get_task_struct(tsk);
> > >
> > >
> > > This is necessary to stop someone deallocating the task structure, can the
> > > task structure be deallocated whilst locked?
> > >
> >
> > Ooh maybe it can. Should that be a read_lock of the tasklist lock then?
>
> For 2.4 kernels, use get_task_struct() and free_task_struct() [not put]
> for locking and unlocking a task.
>
Sorry I'm an idiot. I'm sure Marcelo has already fixed it.
Just simply replace put_task_struct with free_task_struct.
On 2004.06.16 10:04, Nick Piggin wrote:
> Just simply replace put_task_struct with free_task_struct.
Like this, maybe? It applies on top of what's currently in BK -- it fixed
it for me, compiled and boot tested, running for the last 20 minutes.
Also, linux/mm.h is needed because of free_pages().
Marcelo, please apply.
--- linux-2.4-BK/lib/rwsem.c~fix-rwsem-race-fix 2004-06-16 14:14:02.187159768 +0100
+++ linux-2.4-BK/lib/rwsem.c 2004-06-16 14:14:28.905098024 +0100
@@ -5,6 +5,7 @@
*/
#include <linux/rwsem.h>
#include <linux/sched.h>
+#include <linux/mm.h>
#include <linux/module.h>
struct rwsem_waiter {
@@ -64,7 +65,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
goto out;
/* grant an infinite number of read locks to the readers at the front of the queue
@@ -96,7 +97,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
}
sem->wait_list.next = next;
On 2004.06.16 10:04, Nick Piggin wrote:
> Just simply replace put_task_struct with free_task_struct.
Sorry, only sent half of the patch the first time around. Here it is,
with the second hunk touching rwsem-spinlock.c.
This applies on top of what's currently in BK, and fixes the problems for
me -- it is compiled and boot tested, running for the last 20 minutes.
Also, linux/mm.h is needed because of free_pages().
I don't have any arch around here that actually needs rwsem-spinlock.c so
that part is untested, although it should be ok.
Marcelo, please apply.
--- linux-2.4-BK/lib/rwsem.c~fix-rwsem-race-fix 2004-06-16 14:14:02.000000000 +0100
+++ linux-2.4-BK/lib/rwsem.c 2004-06-16 14:14:28.000000000 +0100
@@ -5,6 +5,7 @@
*/
#include <linux/rwsem.h>
#include <linux/sched.h>
+#include <linux/mm.h>
#include <linux/module.h>
struct rwsem_waiter {
@@ -64,7 +65,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
goto out;
/* grant an infinite number of read locks to the readers at the front of the queue
@@ -96,7 +97,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
}
sem->wait_list.next = next;
--- linux-2.4-BK/lib/rwsem-spinlock.c~fix-rwsem-race-fix 2004-06-16 15:47:26.982774224 +0100
+++ linux-2.4-BK/lib/rwsem-spinlock.c 2004-06-16 15:51:17.522726824 +0100
@@ -9,6 +9,7 @@
*/
#include <linux/rwsem.h>
#include <linux/sched.h>
+#include <linux/mm.h>
#include <linux/module.h>
struct rwsem_waiter {
@@ -69,7 +70,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
goto out;
}
@@ -81,7 +82,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
woken++;
if (list_empty(&sem->wait_list))
break;
@@ -111,7 +112,7 @@ static inline struct rw_semaphore *__rws
mb();
waiter->task = NULL;
wake_up_process(tsk);
- put_task_struct(tsk);
+ free_task_struct(tsk);
return sem;
}
On Wed, Jun 16, 2004 at 02:40:36PM +0100, Nuno Monteiro wrote:
> On 2004.06.16 10:04, Nick Piggin wrote:
> > Just simply replace put_task_struct with free_task_struct.
>
> Like this, maybe? It applies on top of what's currently in BK -- it fixed
> it for me, compiled and boot tested, running for the last 20 minutes.
> Also, linux/mm.h is needed because of free_pages().
Ok, applied, thanks everyone. Should be releasing -pre6 in a few moments with
this.