Hi all
I got bitten by this problem on sparc64 (a blade 1000)
http://ubuntuforums.org/showthread.php?t=297474
summary :
modprobe bbc
runs kenvctrld which uses 100% of a CPU for 5 seconds,
then 0% for 5 seconds, then 100% .. and so on. The author
cited above suggests removing the line
remove_wait_queue(&bp->wq, &wait);
in the function
static int wait_for_pin(struct bbc_i2c_bus *bp, u8 *status)
Is there a better way?
I can test patches if that would be helpful.
Cheers
Jim
--
J.J. Green, Dept. Applied Mathematics, Hicks Bld.,
University of Sheffield, UK. +44 (0114) 222 3742
http://pdfb.wiredworkplace.net/pub/jjg
> On Tue, 20 Feb 2007 13:27:12 +0000 "J.J. Green" <[email protected]> wrote:
> Hi all
>
> I got bitten by this problem on sparc64 (a blade 1000)
>
> http://ubuntuforums.org/showthread.php?t=297474
>
> summary :
>
> modprobe bbc
>
> runs kenvctrld which uses 100% of a CPU for 5 seconds,
> then 0% for 5 seconds, then 100% .. and so on. The author
> cited above suggests removing the line
>
> remove_wait_queue(&bp->wq, &wait);
>
> in the function
>
> static int wait_for_pin(struct bbc_i2c_bus *bp, u8 *status)
>
> Is there a better way?
>
> I can test patches if that would be helpful.
>
The code around there looks relatively unbuggy to me. Removing that
remove_wait_queue() would be very bad - it would cause later stack
corruption.
msleep_interruptible() certainly shouldn't consume CPU like that. Do we
know where the CPU time is being spent? The output of:
readprofile -r
sleep 10
readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
would tell us.
* Andrew Morton <[email protected]>, [2007-02-25 4:47 -0800]:
> > On Tue, 20 Feb 2007 13:27:12 +0000 "J.J. Green" <[email protected]> wrote:
> > I got bitten by this problem on sparc64 (a blade 1000)
> >
> > http://ubuntuforums.org/showthread.php?t=297474
> The code around there looks relatively unbuggy to me. Removing that
> remove_wait_queue() would be very bad - it would cause later stack
> corruption.
The following patch by J?rg Friedrich fixes the issue without removing
the call to remove_wait_queue():
http://lists.debian.org/debian-sparc/2007/02/msg00045.html
ciao,
ema
Hi Andrew
> The code around there looks relatively unbuggy to me. Removing that
> remove_wait_queue() would be very bad - it would cause later stack
> corruption.
>
> msleep_interruptible() certainly shouldn't consume CPU like that. Do we
> know where the CPU time is being spent? The output of:
>
> readprofile -r
> sleep 10
> readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
>
> would tell us.
As was mentioned in another reply, this message by
Joerg Friedrich
http://lists.debian.org/debian-sparc/2007/02/msg00045.html
gives a possible explanantion of where the time is going.
I applied the patch to the debian kernel sources for 2.6.18,
it applied cleanly and fixed the problem.
I have the upatched kernel in /boot so I can run the tests
you mentioned fairly easily -- please let me know if you'd
still like me to do that.
Jim
--
J.J. Green, Dept. Applied Mathematics, Hicks Bld.,
University of Sheffield, UK. +44 (0114) 222 3742
http://pdfb.wiredworkplace.net/pub/jjg
From: "J.J.Green" <[email protected]>
Date: Sun, 25 Feb 2007 23:58:48 +0000 (GMT)
> Hi Andrew
>
> > The code around there looks relatively unbuggy to me. Removing that
> > remove_wait_queue() would be very bad - it would cause later stack
> > corruption.
> >
> > msleep_interruptible() certainly shouldn't consume CPU like that. Do we
> > know where the CPU time is being spent? The output of:
> >
> > readprofile -r
> > sleep 10
> > readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
> >
> > would tell us.
>
> As was mentioned in another reply, this message by
> Joerg Friedrich
>
> http://lists.debian.org/debian-sparc/2007/02/msg00045.html
>
> gives a possible explanantion of where the time is going.
> I applied the patch to the debian kernel sources for 2.6.18,
> it applied cleanly and fixed the problem.
I've added Joerg's patch to my tree and will push it into
-stable as well.
Reviewing this patch had been sitting deep in my backlog for weeks, I
just never got around to it, sorry.
From: Joerg Friedrich <[email protected]>
Date: Tue, 27 Feb 2007 06:22:39 +0100
> Can you just tell me if it's sufficient to check for a return value >0
> of wait_event_interruptible_timeout? I was not sure so I extended the
> check to
> if ((val != -ERESTARTSYS) && (val > 0))
I changed the check to just "val > 0".
The comments in the kernel around the implementation and
declaration of the function wait_event_interruptible()
VERY CLEARLY state that the possible return values are:
1) Negative error code on interrupt
2) Zero if timeout expired
3) Positive non-zero value if condition became true before
timeout expired
So there is no doubt that "val > 0" is sufficient.
Hi David,
David Miller schrieb am Montag, 26. Februar 2007 um 10:12:19 -0800:
> From: "J.J.Green" <[email protected]>
> Date: Sun, 25 Feb 2007 23:58:48 +0000 (GMT)
>
> > Hi Andrew
> >
> > > The code around there looks relatively unbuggy to me. Removing that
> > > remove_wait_queue() would be very bad - it would cause later stack
> > > corruption.
> > >
> > > msleep_interruptible() certainly shouldn't consume CPU like that. Do we
> > > know where the CPU time is being spent? The output of:
> > >
> > > readprofile -r
> > > sleep 10
> > > readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
> > >
> > > would tell us.
> >
> > As was mentioned in another reply, this message by
> > Joerg Friedrich
> >
> > http://lists.debian.org/debian-sparc/2007/02/msg00045.html
> >
> > gives a possible explanantion of where the time is going.
> > I applied the patch to the debian kernel sources for 2.6.18,
> > it applied cleanly and fixed the problem.
>
> I've added Joerg's patch to my tree and will push it into
> -stable as well.
>
> Reviewing this patch had been sitting deep in my backlog for weeks, I
> just never got around to it, sorry.
Can you just tell me if it's sufficient to check for a return value >0
of wait_event_interruptible_timeout? I was not sure so I extended the
check to
if ((val != -ERESTARTSYS) && (val > 0))
--
Yours,
Jörg Friedrich
There are only 10 types of people:
Those who understand binary and those who don't.