2007-02-20 14:00:53

by J.J.Green

Subject: sparc64 / bbc_i2c.c

Hi all

I got bitten by this problem on sparc64 (a blade 1000)

http://ubuntuforums.org/showthread.php?t=297474

summary :

modprobe bbc

runs kenvctrld which uses 100% of a CPU for 5 seconds,
then 0% for 5 seconds, then 100% .. and so on. The author
cited above suggests removing the line

remove_wait_queue(&bp->wq, &wait);

in the function

static int wait_for_pin(struct bbc_i2c_bus *bp, u8 *status)

Is there a better way?

I can test patches if that would be helpful.

Cheers

Jim
--
J.J. Green, Dept. Applied Mathematics, Hicks Bld.,
University of Sheffield, UK. +44 (0114) 222 3742
http://pdfb.wiredworkplace.net/pub/jjg




2007-02-25 12:48:45

by Andrew Morton

Subject: Re: sparc64 / bbc_i2c.c

> On Tue, 20 Feb 2007 13:27:12 +0000 "J.J. Green" <[email protected]> wrote:
> Hi all
>
> I got bitten by this problem on sparc64 (a blade 1000)
>
> http://ubuntuforums.org/showthread.php?t=297474
>
> summary :
>
> modprobe bbc
>
> runs kenvctrld which uses 100% of a CPU for 5 seconds,
> then 0% for 5 seconds, then 100% .. and so on. The author
> cited above suggests removing the line
>
> remove_wait_queue(&bp->wq, &wait);
>
> in the function
>
> static int wait_for_pin(struct bbc_i2c_bus *bp, u8 *status)
>
> Is there a better way?
>
> I can test patches if that would be helpful.
>

The code around there looks relatively unbuggy to me. Removing that
remove_wait_queue() would be very bad - it would cause later stack
corruption.
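
To illustrate the stack-corruption point, here is a minimal userspace
sketch (not kernel code). It assumes, as the &wait argument suggests, that
the wait entry is a local variable of wait_for_pin(). The names wq_head,
wq_entry, wq_add, wq_remove, wq_empty and wait_once are hypothetical
stand-ins for the kernel's wait queue machinery, not the real API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's wait queue types. */
struct wq_entry {
	struct wq_entry *prev, *next;
};

struct wq_head {
	struct wq_entry list;	/* sentinel of a circular doubly linked list */
};

static void wq_init(struct wq_head *h)
{
	h->list.prev = h->list.next = &h->list;
}

/* Link an entry in right after the sentinel. */
static void wq_add(struct wq_head *h, struct wq_entry *e)
{
	e->next = h->list.next;
	e->prev = &h->list;
	h->list.next->prev = e;
	h->list.next = e;
}

/* Unlink an entry; its neighbours no longer point at it afterwards. */
static void wq_remove(struct wq_entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->prev = e->next = e;
}

static bool wq_empty(const struct wq_head *h)
{
	return h->list.next == &h->list;
}

/* Models the shape of wait_for_pin(): the entry lives in this stack frame. */
static void wait_once(struct wq_head *h)
{
	struct wq_entry wait;	/* local variable, gone when we return */

	wq_add(h, &wait);
	/* ... sleep until woken or timed out ... */
	wq_remove(&wait);	/* without this, h keeps a pointer into a dead frame */
}
```

If the wq_remove() call is dropped, the long-lived queue head still points
at memory that the next function call will reuse, so a later wakeup or list
walk would scribble over someone else's stack.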

msleep_interruptible() certainly shouldn't consume CPU like that. Do we
know where the CPU time is being spent? The output of:

readprofile -r
sleep 10
readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40

would tell us.

2007-02-25 14:00:10

by Emanuele Rocca

Subject: Re: sparc64 / bbc_i2c.c

* Andrew Morton <[email protected]>, [2007-02-25 4:47 -0800]:
> > On Tue, 20 Feb 2007 13:27:12 +0000 "J.J. Green" <[email protected]> wrote:
> > I got bitten by this problem on sparc64 (a blade 1000)
> >
> > http://ubuntuforums.org/showthread.php?t=297474

> The code around there looks relatively unbuggy to me. Removing that
> remove_wait_queue() would be very bad - it would cause later stack
> corruption.

The following patch by Jörg Friedrich fixes the issue without removing
the call to remove_wait_queue():

http://lists.debian.org/debian-sparc/2007/02/msg00045.html

ciao,
ema

2007-02-25 23:58:51

by J.J.Green

Subject: Re: sparc64 / bbc_i2c.c

Hi Andrew

> The code around there looks relatively unbuggy to me. Removing that
> remove_wait_queue() would be very bad - it would cause later stack
> corruption.
>
> msleep_interruptible() certainly shouldn't consume CPU like that. Do we
> know where the CPU time is being spent? The output of:
>
> readprofile -r
> sleep 10
> readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
>
> would tell us.

As was mentioned in another reply, this message by
Joerg Friedrich

http://lists.debian.org/debian-sparc/2007/02/msg00045.html

gives a possible explanation of where the time is going.
I applied the patch to the debian kernel sources for 2.6.18,
it applied cleanly and fixed the problem.

I have the unpatched kernel in /boot so I can run the tests
you mentioned fairly easily -- please let me know if you'd
still like me to do that.

Jim
--
J.J. Green, Dept. Applied Mathematics, Hicks Bld.,
University of Sheffield, UK. +44 (0114) 222 3742
http://pdfb.wiredworkplace.net/pub/jjg


2007-02-26 18:12:22

by David Miller

Subject: Re: sparc64 / bbc_i2c.c

From: "J.J.Green" <[email protected]>
Date: Sun, 25 Feb 2007 23:58:48 +0000 (GMT)

> Hi Andrew
>
> > The code around there looks relatively unbuggy to me. Removing that
> > remove_wait_queue() would be very bad - it would cause later stack
> > corruption.
> >
> > msleep_interruptible() certainly shouldn't consume CPU like that. Do we
> > know where the CPU time is being spent? The output of:
> >
> > readprofile -r
> > sleep 10
> > readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
> >
> > would tell us.
>
> As was mentioned in another reply, this message by
> Joerg Friedrich
>
> http://lists.debian.org/debian-sparc/2007/02/msg00045.html
>
> gives a possible explanation of where the time is going.
> I applied the patch to the debian kernel sources for 2.6.18,
> it applied cleanly and fixed the problem.

I've added Joerg's patch to my tree and will push it into
-stable as well.

Reviewing this patch had been sitting deep in my backlog for weeks, I
just never got around to it, sorry.

2007-02-27 05:29:05

by David Miller

Subject: Re: sparc64 / bbc_i2c.c

From: Joerg Friedrich <[email protected]>
Date: Tue, 27 Feb 2007 06:22:39 +0100

> Can you just tell me if it's sufficient to check for a return value >0
> of wait_event_interruptible_timeout? I was not sure so I extended the
> check to
> if ((val != -ERESTARTSYS) && (val > 0))

I changed the check to just "val > 0".

The comments in the kernel around the implementation and
declaration of the function wait_event_interruptible_timeout()
VERY CLEARLY state that the possible return values are:

1) Negative error code on interrupt
2) Zero if timeout expired
3) Positive non-zero value if condition became true before
timeout expired

So there is no doubt that "val > 0" is sufficient.
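
The three cases above can be sketched in userspace C (this is a model of
the return-value contract, not the kernel macro). It assumes val is held
in a signed long. The names wait_outcome, classify_wait, check_verbose and
check_simple are made up for illustration, and ERESTARTSYS is defined
locally because userspace <errno.h> does not normally expose it.

```c
#include <stdbool.h>

/* Kernel-internal error code; defined here only for this userspace model. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

enum wait_outcome {
	WAIT_INTERRUPTED,	/* negative error code on interrupt */
	WAIT_TIMED_OUT,		/* zero: timeout expired */
	WAIT_CONDITION_MET	/* positive: condition became true in time */
};

/* Map a return value onto the three documented cases. */
static enum wait_outcome classify_wait(long val)
{
	if (val < 0)
		return WAIT_INTERRUPTED;
	if (val == 0)
		return WAIT_TIMED_OUT;
	return WAIT_CONDITION_MET;
}

/* The more defensive check from the question. */
static bool check_verbose(long val)
{
	return (val != -ERESTARTSYS) && (val > 0);
}

/* The simplified check: a negative value can never be > 0. */
static bool check_simple(long val)
{
	return val > 0;
}
```

Since -ERESTARTSYS is negative, the first half of the verbose check is
already implied by val > 0, so the two predicates agree for every signed
value.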

2007-02-27 05:54:42

by Joerg Friedrich

Subject: Re: sparc64 / bbc_i2c.c

Hi David,

David Miller schrieb am Montag, 26. Februar 2007 um 10:12:19 -0800:
> From: "J.J.Green" <[email protected]>
> Date: Sun, 25 Feb 2007 23:58:48 +0000 (GMT)
>
> > Hi Andrew
> >
> > > The code around there looks relatively unbuggy to me. Removing that
> > > remove_wait_queue() would be very bad - it would cause later stack
> > > corruption.
> > >
> > > msleep_interruptible() certainly shouldn't consume CPU like that. Do we
> > > know where the CPU time is being spent? The output of:
> > >
> > > readprofile -r
> > > sleep 10
> > > readprofile -n -v -m /boot/System.map | sort -n -k 3 | tail -40
> > >
> > > would tell us.
> >
> > As was mentioned in another reply, this message by
> > Joerg Friedrich
> >
> > http://lists.debian.org/debian-sparc/2007/02/msg00045.html
> >
> > gives a possible explanation of where the time is going.
> > I applied the patch to the debian kernel sources for 2.6.18,
> > it applied cleanly and fixed the problem.
>
> I've added Joerg's patch to my tree and will push it into
> -stable as well.
>
> Reviewing this patch had been sitting deep in my backlog for weeks, I
> just never got around to it, sorry.

Can you just tell me if it's sufficient to check for a return value >0
of wait_event_interruptible_timeout? I was not sure so I extended the
check to
if ((val != -ERESTARTSYS) && (val > 0))

--
Yours,
Jörg Friedrich

There are only 10 types of people:
Those who understand binary and those who don't.