2003-01-03 17:34:54

by Stephen Evanchik

[permalink] [raw]
Subject: [PATCH 2.5.54] hermes: serialization fixes

The hermes MAC controller module wasn't serializing BAP seek calls. This
caused an eventual Rx/Tx failure which equates tens of thousands of errors on
the wireless interface. A simple spinlock is used to keep things in line.

Any comments are appreciated, I don't believe this is the best solution, but
it is working well. Patches can be downloaded from here:

http://www.clarkson.edu/~evanchsa/software/kernel/patches/hermes.patch-2.4.20
http://www.clarkson.edu/~evanchsa/software/kernel/patches/hermes.patch-2.5.54



--- linux-2.5.54/drivers/net/wireless/hermes.h 2003-01-01 22:21:02.000000000
-0500
+++ linux-2.5.54-devel/drivers/net/wireless/hermes.h 2003-01-03
11:38:25.000000000 -0500
@@ -288,6 +288,9 @@

u16 inten; /* Which interrupts should be enabled? */

+ /* Lock to avoid tripping over ourselves */
+ spinlock_t baplock;
+
#ifdef HERMES_DEBUG_BUFFER
struct hermes_debug_entry dbuf[HERMES_DEBUG_BUFSIZE];
unsigned long dbufp;
--- linux-2.5.54/drivers/net/wireless/hermes.c 2003-01-01 22:21:13.000000000
-0500
+++ linux-2.5.54-devel/drivers/net/wireless/hermes.c 2003-01-03
11:41:51.000000000 -0500
@@ -407,11 +407,18 @@
{
int dreg = bap ? HERMES_DATA1 : HERMES_DATA0;
int err = 0;
+ unsigned long flags;

if ( (len < 0) || (len % 2) )
return -EINVAL;

+ /* Without this, system instability occurs */
+ spin_lock_irqsave((&hw->baplock), flags);
+
err = hermes_bap_seek(hw, bap, id, offset);
+
+ spin_unlock_irqrestore((&hw->baplock), flags);
+
if (err)
goto out;

@@ -433,11 +440,17 @@
{
int dreg = bap ? HERMES_DATA1 : HERMES_DATA0;
int err = 0;
+ unsigned long flags;

if ( (len < 0) || (len % 2) )
return -EINVAL;

+ spin_lock_irqsave((&hw->baplock), flags);
+
err = hermes_bap_seek(hw, bap, id, offset);
+
+ spin_unlock_irqrestore((&hw->baplock), flags);
+
if (err)
goto out;

@@ -463,6 +476,7 @@
int dreg = bap ? HERMES_DATA1 : HERMES_DATA0;
u16 rlength, rtype;
int nwords;
+ unsigned long flags;

if ( (bufsize < 0) || (bufsize % 2) )
return -EINVAL;
@@ -471,7 +485,12 @@
if (err)
goto out;

+ spin_lock_irqsave((&hw->baplock), flags);
+
err = hermes_bap_seek(hw, bap, rid, 0);
+
+ spin_unlock_irqrestore((&hw->baplock), flags);
+
if (err)
goto out;

@@ -505,8 +524,14 @@
int dreg = bap ? HERMES_DATA1 : HERMES_DATA0;
int err = 0;
int count;
+ unsigned long flags;

+ spin_lock_irqsave((&hw->baplock), flags);
+
err = hermes_bap_seek(hw, bap, rid, 0);
+
+ spin_unlock_irqrestore((&hw->baplock), flags);
+
if (err)
goto out;


2003-01-03 17:40:52

by Benjamin LaHaise

[permalink] [raw]
Subject: Re: [PATCH 2.5.54] hermes: serialization fixes

On Fri, Jan 03, 2003 at 12:39:29PM -0500, Stephen Evanchik wrote:
> The hermes MAC controller module wasn't serializing BAP seek calls. This
> caused an eventual Rx/Tx failure which equates tens of thousands of errors on
> the wireless interface. A simple spinlock is used to keep things in line.
>
> Any comments are appreciated, I don't believe this is the best solution, but
> it is working well. Patches can be downloaded from here:

Why not put the spinlock/unlock inside hermes_bap_seek()? Smaller, better
contained and more readable.

-ben

2003-01-03 17:52:09

by Stephen Evanchik

[permalink] [raw]
Subject: Re: [PATCH 2.5.54] hermes: serialization fixes

On Friday 03 January 2003 12:47, you wrote:
| Why not put the spinlock/unlock inside hermes_bap_seek()? Smaller, better
| contained and more readable.

That's the better solution, I'm trying to coordinate a bit with the
maintainer. The only reason I didn't do this in the first place is because
there is a (possibly unecessary) delay loop inside hermes_bap_seek that I
believe is trying to combat the same problem. I'm awaiting a response from
the maintainer since he knows a bit more about the hardware than I do.


Stephen

2003-01-06 18:06:59

by Alexander Hoogerhuis

[permalink] [raw]
Subject: Re: [PATCH 2.5.54] hermes: serialization fixes

Stephen Evanchik <[email protected]> writes:

> On Friday 03 January 2003 12:47, you wrote:
> | Why not put the spinlock/unlock inside hermes_bap_seek()? Smaller, better
> | contained and more readable.
>
> That's the better solution, I'm trying to coordinate a bit with the
> maintainer. The only reason I didn't do this in the first place is because
> there is a (possibly unecessary) delay loop inside hermes_bap_seek that I
> believe is trying to combat the same problem. I'm awaiting a response from
> the maintainer since he knows a bit more about the hardware than I do.
>

There is something not quite right about this patch. I used have a ton
of errors in my logs, and this patch seemed to clear this out nicely.

When I run with a patched driver now, I run for about 20 minutes with
various loads and sudenly the ksoftirqd_CPU0 process starts to hog my
CPU and not wanting to let go. As soon as I pull out the card, the
load returns to normal.

Is there any way I can provide more details on what is happening?

>
> Stephen
>

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-01-06 18:32:41

by Stephen Evanchik

[permalink] [raw]
Subject: Re: [PATCH 2.5.54] hermes: serialization fixes

On Monday 06 January 2003 12:40, Alexander Hoogerhuis wrote:

| There is something not quite right about this patch. I used have a ton
| of errors in my logs, and this patch seemed to clear this out nicely.

I belive there were some serialization issues in that version of the driver.
Things have been rewritten substantially since the driver that was included
to 2.4.20. The 2.5.54 driver should be relatively unchanged with my patch
as I believe most of the serialization issues have been resolved.

| When I run with a patched driver now, I run for about 20 minutes with
| various loads and sudenly the ksoftirqd_CPU0 process starts to hog my
| CPU and not wanting to let go. As soon as I pull out the card, the
| load returns to normal.

I noticed this too, but long after 20 minutes. After about 10 hours of continuous
download from a local mirror @ 250kb/sec things started to get interesting. As
you mention, removing the card cleans things out.

| Is there any way I can provide more details on what is happening?

There are a number of problems with the 2.4.20 driver and I recommend using
the latest from David Gibson's site:

http://www.ozlabs.org/people/dgibson/dldwd

Incidently, it's what David recommends as well -- I've personally been using
the testing version with great success. There are still some issues though;
I'm pinning them down with the help from the comments David has provided me.

Most of the work in getting this driver working seems to have been done in the
newest versions -- it's just the little things now.

---
Stephen Evanchik