2001-07-10 07:12:22

by Tim Hockin

[permalink] [raw]
Subject: [PATCH] sym53c8xx timer rework

diff -ruN dist-2.4.6/drivers/scsi/sym53c8xx.c cobalt-2.4.6/drivers/scsi/sym53c8xx.c
--- dist-2.4.6/drivers/scsi/sym53c8xx.c Tue Jun 12 11:06:54 2001
+++ cobalt-2.4.6/drivers/scsi/sym53c8xx.c Mon Jul 9 11:04:14 2001
@@ -2249,7 +2265,6 @@
**----------------------------------------------------------------
*/
struct usrcmd user; /* Command from user */
- volatile u_char release_stage; /* Synchronisation stage on release */

/*----------------------------------------------------------------
** Fields that are used (primarily) for integrity check
@@ -5868,7 +5883,12 @@
** start the timeout daemon
*/
np->lasttime=0;
- ncr_timeout (np);
+#ifdef SCSI_NCR_PCIQ_BROKEN_INTR
+ np->timer.expires = ktime_get((HZ+9)/10);
+#else
+ np->timer.expires = ktime_get(SCSI_NCR_TIMER_INTERVAL);
+#endif
+ add_timer(&np->timer);

/*
** use SIMPLE TAG messages by default
@@ -7227,23 +7247,19 @@
**==========================================================
*/

-#ifdef MODULE
static int ncr_detach(ncb_p np)
{
- int i;
+ unsigned long flags;

printk("%s: detaching ...\n", ncr_name(np));

/*
-** Stop the ncr_timeout process
-** Set release_stage to 1 and wait that ncr_timeout() set it to 2.
+** Stop the ncr_timeout process - lock it to ensure no timer is running
+** on a different CPU, or anything
*/
- np->release_stage = 1;
- for (i = 50 ; i && np->release_stage != 2 ; i--) MDELAY (100);
- if (np->release_stage != 2)
- printk("%s: the timer seems to be already stopped\n",
- ncr_name(np));
- else np->release_stage = 2;
+ NCR_LOCK_NCB(np, flags);
+ del_timer(&np->timer);
+ NCR_UNLOCK_NCB(np, flags);

/*
** Reset NCR chip.
@@ -7273,7 +7289,6 @@

return 1;
}
-#endif

/*==========================================================
**
@@ -8600,23 +8615,11 @@
{
u_long thistime = ktime_get(0);

- /*
- ** If release process in progress, let's go
- ** Set the release stage from 1 to 2 to synchronize
- ** with the release process.
- */
-
- if (np->release_stage) {
- if (np->release_stage == 1) np->release_stage = 2;
- return;
- }
-
#ifdef SCSI_NCR_PCIQ_BROKEN_INTR
- np->timer.expires = ktime_get((HZ+9)/10);
+ mod_timer(&np->timer, ktime_get((HZ+9)/10));
#else
- np->timer.expires = ktime_get(SCSI_NCR_TIMER_INTERVAL);
+ mod_timer(&np->timer, ktime_get(SCSI_NCR_TIMER_INTERVAL));
#endif
- add_timer(&np->timer);

/*
** If we are resetting the ncr, wait for settle_time before


Attachments:
sym_timer.diff (2.38 kB)

Subject: Re: [PATCH] sym53c8xx timer rework

> /*
> -** Stop the ncr_timeout process
> -** Set release_stage to 1 and wait that ncr_timeout() set it to 2.
> +** Stop the ncr_timeout process - lock it to ensure no timer is running
> +** on a different CPU, or anything
> */
> - np->release_stage = 1;
> - for (i = 50 ; i && np->release_stage != 2 ; i--) MDELAY (100);
> - if (np->release_stage != 2)
> - printk("%s: the timer seems to be already stopped\n",
> - ncr_name(np));
> - else np->release_stage = 2;
> + NCR_LOCK_NCB(np, flags);
> + del_timer(&np->timer);
> + NCR_UNLOCK_NCB(np, flags);

I'm only reading the diff, but this change looks wrong.
The simplest solution is del_timer_sync() instead of
LOCK;del_timer;UNLOCK.

Why do you acqurie the NCB spinlock? the _timeout function runs without
it.

--
Manfred

2001-07-10 18:14:56

by Tim Hockin

[permalink] [raw]
Subject: Re: [PATCH] sym53c8xx timer rework

Studierende der Universitaet des Saarlandes wrote:

> > + NCR_LOCK_NCB(np, flags);
> > + del_timer(&np->timer);
> > + NCR_UNLOCK_NCB(np, flags);
>
> I'm only reading the diff, but this change looks wrong.
> The simplest solution is del_timer_sync() instead of
> LOCK;del_timer;UNLOCK.

I didn't even realize there was a del_timer_sync(). That does what is
needed. 3 lines become 1, I guess.

Thanks, CC:ed to Gerard and crew.

--
Tim Hockin
Systems Software Engineer
Sun Microsystems, Cobalt Server Appliances
[email protected]

2001-07-10 19:03:16

by Gérard Roudier

[permalink] [raw]
Subject: Re: [PATCH] sym53c8xx timer rework



On Tue, 10 Jul 2001, Tim Hockin wrote:

> Gerard (and all)
>
> Attached is a small patch to re-work the timer in the sym53c8xx driver. I
> submitted this patch against 2.4.5, but don't see it in 2.4.6, so I am
> re-submitting against 2.4.6.
>
> Please let me know if there are any problems with this patch.

Hmmm... How much are you sure there isn't any race in your patch ?

If the timer handler is spinning on the lock embedded in the driver
instance and you free this instance under its knees, it will just
reference random memory.

That was the reason I preferred to leave the timer die by itself prior to
releasing the HBA instance. The 'release_stage' was the trick, but
probably some memory barriers or atomic operations were missing.

If you want to delete the timer on HBA instance release, then you also
want to check if the pointer to the HBA instance is still valid in the
timer handler and just return if it isn't so.

Btw, is there a simple and clean way to deal with such concurrency (I mean
a timer embedded in a data structure we want to free concurrently ? Given
the current sheme of Linux requiring a synchronous HBA instance release, I
am under the impression that there is no such way.

G?rard.