2002-03-21 05:06:22

by Pete Zaitcev

[permalink] [raw]
Subject: 2 questions about SCSI initialization

Hello:

I've got two questions which I cannot answer just by reading
the code, so I need to refer to the institutional memory of
the hackerdom (Doug G. - I need your memory, too :)

The context is that I got a bug with oops by someone with 68 SCSI
disks, traceable to a scsi_build_commandblocks failure, with a
subsequent oops because the error patch calls scsi_unregister_device,
and scsi_unregister_device aborts with module reference check.

Now the questions:

#1: Why does scsi_build_commandblocks() allocate memory with
GFP_ATOMIC? It's not called from an interrupt or from a swap I/O
path as far as I can see.

#2: What does if (GET_USE_COUNT(tpnt->module) != 0) do in
scsi_unregister_device? The circomstances are truly bizzare:
a) the error code is NEVER used
b) it can be called either from module unload.
I would like to kill that check.

Thanks,
-- Pete


2002-03-21 13:56:43

by Douglas Gilbert

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

Pete Zaitcev wrote:
>
> Hello:
>
> I've got two questions which I cannot answer just by reading
> the code, so I need to refer to the institutional memory of
> the hackerdom (Doug G. - I need your memory, too :)
>
> The context is that I got a bug with oops by someone with 68 SCSI
> disks, traceable to a scsi_build_commandblocks failure, with a
> subsequent oops because the error patch calls scsi_unregister_device,
> and scsi_unregister_device aborts with module reference check.
>
> Now the questions:
>
> #1: Why does scsi_build_commandblocks() allocate memory with
> GFP_ATOMIC? It's not called from an interrupt or from a swap I/O
> path as far as I can see.

Pete,
There has long been a preference in the scsi subsystem
for using its own memory management ( scsi_malloc() )
or the most conservative mm calls. GFP_ATOMIC may well
be overkill in scsi_build_commandblocks(). However it
might be wise to check that the calling context is indeed
user_space since this can be called from all subsystems
that have a scsi pseudo device driver (e.g. ide-scsi,
usb-storage, 1394/sbp2, ...).

> #2: What does if (GET_USE_COUNT(tpnt->module) != 0) do in
> scsi_unregister_device? The circomstances are truly bizzare:
> a) the error code is NEVER used
> b) it can be called either from module unload.
> I would like to kill that check.

Another badly named function since it is unregistering
in upper level driver (e.g. sd). That "if" is to check
if there are open file descriptors (or some other
reason **) on the driver in question. That seems to be
a sensible check. Whether it can every happen in that
context, I'm not sure.


** The sg driver purposely holds its USE_COUNT > 0 even
when all its file descriptors are closed iff there are
outstanding commands for which the response has not
yet arrived. [If this is not done, then a control-C on
something like cdrecord followed by "rmmod sg" may
cause an oops, especially during "fixating" mode.]

Doug Gilbert

2002-03-21 14:17:21

by Alan

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

> for using its own memory management ( scsi_malloc() )
> or the most conservative mm calls. GFP_ATOMIC may well
> be overkill in scsi_build_commandblocks(). However it

(Historically the scsi layer predates kmalloc in Linux so had to use its
own allocator)

2002-03-21 22:27:15

by Patrick Mansfield

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

On Thu, Mar 21, 2002 at 12:05:53AM -0500, Pete Zaitcev wrote:

> #2: What does if (GET_USE_COUNT(tpnt->module) != 0) do in
> scsi_unregister_device? The circomstances are truly bizzare:
> a) the error code is NEVER used
> b) it can be called either from module unload.
> I would like to kill that check.

Is sd (or sg or whatever) *not* a module in your case?

I think the check is buggy, as it doesn't validate tpnt->module,
IMO you ought to add a check for tpnt->module, and call BUG()
rather than completely removing the check, see the patch below.

If the count is really non zero, the function should not be called
(rmmod won't call into it if the module is in use; if calling via
scsi_register_device_module on failure, it should be impossible
to increment count - it should be impossible to call sd_open or
sg_open).

Here's a patch against 2.4.18:

--- scsi.c.orig Thu Mar 21 13:51:27 2002
+++ scsi.c Thu Mar 21 13:52:54 2002
@@ -2331,8 +2331,8 @@
/*
* If we are busy, this is not going to fly.
*/
- if (GET_USE_COUNT(tpnt->module) != 0)
- goto error_out;
+ if (tpnt->module && (GET_USE_COUNT(tpnt->module) != 0))
+ BUG();

/*
* Next, detach the devices from the driver.
@@ -2366,9 +2366,6 @@
* cleanup function.
*/
return 0;
-error_out:
- unlock_kernel();
- return -1;
}


-- Patrick Mansfield

2002-03-22 00:05:19

by Pete Zaitcev

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

> Date: Thu, 21 Mar 2002 14:26:35 -0800
> From: Patrick Mansfield <[email protected]>

> > #2: What does if (GET_USE_COUNT(tpnt->module) != 0) do in
> > scsi_unregister_device? The circomstances are truly bizzare:
> > a) the error code is NEVER used
> > b) it can be called either from module unload.
> > I would like to kill that check.

>[...]
> If the count is really non zero, the function should not be called
> (rmmod won't call into it if the module is in use; if calling via
> scsi_register_device_module on failure, it should be impossible
> to increment count - it should be impossible to call sd_open or
> sg_open).

The last line of reasoning is faulty, because sys_init_module()
does atomic_set(&mod->uc.usecount,1); before calling init_sg()
or init_sd(). Thus, it's not only possible, but it is guaranteed
that the counter is non-zero when control gets
to scsi_register_device_module, and to the failure path.

> --- scsi.c.orig Thu Mar 21 13:51:27 2002
> +++ scsi.c Thu Mar 21 13:52:54 2002
> @@ -2331,8 +2331,8 @@
> /*
> * If we are busy, this is not going to fly.
> */
> - if (GET_USE_COUNT(tpnt->module) != 0)
> - goto error_out;
> + if (tpnt->module && (GET_USE_COUNT(tpnt->module) != 0))
> + BUG();

Guaranteed to trigger BUG() is out_of_memory gets set.

I still think we better kill this check altogether.
Any more objections?

-- Pete

2002-03-22 00:22:36

by Pete Zaitcev

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

> Date: Thu, 21 Mar 2002 08:57:27 -0500
> From: Douglas Gilbert <[email protected]>

> > #1: Why does scsi_build_commandblocks() allocate memory with
> > GFP_ATOMIC? It's not called from an interrupt or from a swap I/O
> > path as far as I can see.
>
> There has long been a preference in the scsi subsystem
> for using its own memory management ( scsi_malloc() )
> or the most conservative mm calls. GFP_ATOMIC may well
> be overkill in scsi_build_commandblocks(). However it
> might be wise to check that the calling context is indeed
> user_space since this can be called from all subsystems
> that have a scsi pseudo device driver (e.g. ide-scsi,
> usb-storage, 1394/sbp2, ...).

Thanks for the explanation. I've set out to prove that it is called
from a user context only. If that fails, I'll come up with something
else, perhaps in_interrupt() check, or an extra parameter.

> > #2: What does if (GET_USE_COUNT(tpnt->module) != 0) do in
> > scsi_unregister_device? The circomstances are truly bizzare:
> > a) the error code is NEVER used
> > b) it can be called either from module unload.
> > I would like to kill that check.
>
> Another badly named function since it is unregistering
> in upper level driver (e.g. sd). That "if" is to check
> if there are open file descriptors (or some other
> reason **) on the driver in question. That seems to be
> a sensible check. Whether it can every happen in that
> context, I'm not sure.
>
> ** The sg driver purposely holds its USE_COUNT > 0 even
> when all its file descriptors are closed iff there are
> outstanding commands for which the response has not
> yet arrived. [If this is not done, then a control-C on
> something like cdrecord followed by "rmmod sg" may
> cause an oops, especially during "fixating" mode.]

I think the above cannot apply, because sys_delete_module()
does this:

spin_lock(&unload_lock);
if (!__MOD_IN_USE(mod)) {
mod->flags |= MOD_DELETED;
spin_unlock(&unload_lock);
free_module(mod, 0);
error = 0;
} else {
spin_unlock(&unload_lock);
}

There's no way even to get near that check if the module
count is nonzero (well, minus "can_unload()", which is not
used by sg, or anyone for that matter).

One more thing that was skipped in the header of my mail
was that the current code in scsi_register_device_module near
out_of_space flag simply DOES NOT WORK. There's an oops for
the show. It is possible to de-factor the scsi_unregister_device
into checking and non-checking version (perhaps one calling
the other), but why? So far it is proven that the check does
nothing (either it is not reached when files are open, or else
it is guaranteed to fail). This is why I suggest to remove
the check.

-- Pete

2002-03-22 01:28:34

by Patrick Mansfield

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

On Thu, Mar 21, 2002 at 07:04:51PM -0500, Pete Zaitcev wrote:
> > Date: Thu, 21 Mar 2002 14:26:35 -0800
> > From: Patrick Mansfield <[email protected]>
> > --- scsi.c.orig Thu Mar 21 13:51:27 2002
> > +++ scsi.c Thu Mar 21 13:52:54 2002
> > @@ -2331,8 +2331,8 @@
> > /*
> > * If we are busy, this is not going to fly.
> > */
> > - if (GET_USE_COUNT(tpnt->module) != 0)
> > - goto error_out;
> > + if (tpnt->module && (GET_USE_COUNT(tpnt->module) != 0))
> > + BUG();
>
> Guaranteed to trigger BUG() is out_of_memory gets set.
>
> I still think we better kill this check altogether.
> Any more objections?

No objection.

The same problem exists in scsi_unregister_host, where it checks
GET_USE_COUNT(SDpnt->host->hostt->module). It looks like we would
hit this with sd and scsi built into the kernel, and an insmod
of an adapter that hits a scsi_build_commandblocks failure. Correct?

-- Patrick Mansfield

>
> -- Pete
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2002-03-22 01:44:48

by Pete Zaitcev

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

> Date: Thu, 21 Mar 2002 17:27:55 -0800
> From: Patrick Mansfield <[email protected]>

> The same problem exists in scsi_unregister_host, where it checks
> GET_USE_COUNT(SDpnt->host->hostt->module). It looks like we would
> hit this with sd and scsi built into the kernel, and an insmod
> of an adapter that hits a scsi_build_commandblocks failure. Correct?

I saw that too, but I am less enthusiastic about it for selfish
reasons: no bug is filed against me. There's also one more small
thing: for a host such a check _may_ make some sense.
Target drivers interface file system from their top, so they
get their module usage incremented (and from there, they may
safely increment their usage more if they, say, have outstanding
commands, as Doug explained previously). This is not the case with
host adapter drivers. I simply do not have a complete analysis.
Obviously, the code is broken, but I do not know how to fix it.

All that code is a hell on Earth, I tell you. I am happy that
Marcelo accepted my "detected" counters, but I think that in a
year someone will step into the very same trap with 2.6. Whole
SCSI needs an overhaul. Wanna be Andre Hendriks of SCSI?

-- Pete

2002-03-22 08:38:09

by Pete Zaitcev

[permalink] [raw]
Subject: Re: 2 questions about SCSI initialization

> Date: Thu, 21 Mar 2002 08:57:27 -0500
> From: Douglas Gilbert <[email protected]>

> There has long been a preference in the scsi subsystem
> for using its own memory management ( scsi_malloc() )
> or the most conservative mm calls. GFP_ATOMIC may well
> be overkill in scsi_build_commandblocks(). However it
> might be wise to check that the calling context is indeed
> user_space since this can be called from all subsystems
> that have a scsi pseudo device driver (e.g. ide-scsi,
> usb-storage, 1394/sbp2, ...).

OK, I did the legwork, and it seems that Doug is right.
Unfortunately, this means that, in pinhead's words, I should be
ashamed to post the fix to a public mailing list. Let's do it
this way: if Alan or Marcelo pick it - good. Real oops, real fix,
everyone's happy. Otherwise, I'll ship it with Red Hat kernel under
my own responsibility [I do not promise a hara-kiri if it blows up,
but they will point fingers at me at meetings]

About the second point - we know the module count check is bogus,
but if the attached band aid gets accepted, it may stay.

-- Pete

--- linux-2.4.19-pre2/drivers/scsi/scsi.c Mon Mar 11 09:20:48 2002
+++ linux-2.4.19-pre2-p3/drivers/scsi/scsi.c Thu Mar 21 23:31:35 2002
@@ -106,6 +106,12 @@
COMMAND_SIZE(SCpnt->cmnd[0]) : SCpnt->cmd_len)

/*
+ * This code cannot be detangled, so we resort to band-aid.
+ * See also gfp_any().
+ */
+#define GFP_ANY() (in_interrupt()? GFP_ATOMIC: GFP_KERNEL)
+
+/*
* Data declarations.
*/
unsigned long scsi_pid;
@@ -1483,12 +1489,16 @@
SDpnt->device_queue = NULL;

for (j = 0; j < SDpnt->queue_depth; j++) {
+ spin_unlock_irqrestore(&device_request_lock, flags);
+
SCpnt = (Scsi_Cmnd *)
kmalloc(sizeof(Scsi_Cmnd),
- GFP_ATOMIC |
+ GFP_ANY() |
(host->unchecked_isa_dma ? GFP_DMA : 0));
- if (NULL == SCpnt)
+ if (NULL == SCpnt) {
+ spin_lock_irqsave(&device_request_lock, flags);
break; /* If not, the next line will oops ... */
+ }
memset(SCpnt, 0, sizeof(Scsi_Cmnd));
SCpnt->host = host;
SCpnt->device = SDpnt;
@@ -1506,10 +1516,12 @@
SCpnt->serial_number = 0;
SCpnt->serial_number_at_timeout = 0;
SCpnt->host_scribble = NULL;
- SCpnt->next = SDpnt->device_queue;
- SDpnt->device_queue = SCpnt;
SCpnt->state = SCSI_STATE_UNUSED;
SCpnt->owner = SCSI_OWNER_NOBODY;
+
+ spin_lock_irqsave(&device_request_lock, flags);
+ SCpnt->next = SDpnt->device_queue;
+ SDpnt->device_queue = SCpnt;
}
if (j < SDpnt->queue_depth) { /* low on space (D.Gilbert 990424) */
printk(KERN_WARNING "scsi_build_commandblocks: want=%d, space for=%d blocks\n",