2002-02-18 21:03:42

by Peter Wong

Subject: [PATCH] Encountered a Null Pointer Problem on the SCSI Layer

A while ago, I reported that I encountered a null pointer problem
on the SCSI layer when I was testing Mingming Cao's diskio patch
"diskio-stat-rq-2414" on 2.4.14.

Mingming's patch is at http://sourceforge.net/projects/lse/.

The code in sd_find_queue() that protects against accessing a
non-existent device is not correct. After my patch was sent out,
Pete Zaitcev of Red Hat identified a similar problem in
sd_init_command of the same file.

Let's consider sd_find_queue().

If the array pointed to by rscsi_disks has been allocated,
dpnt cannot be null.

If rscsi_disks has NOT been allocated, dpnt = &rscsi_disks[target]
may still be non-null, depending on the value of target. Thus,
"if (!dpnt)" is not sufficient anyway.

You can also look at sd_attach(), in which "if (!dpnt->device)" is
tested, not "if (!dpnt)".
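
To make the difference concrete, here is a minimal user-space sketch, not
the actual sd.c code. The types toy_device and toy_disk, the array
toy_disks, and the helper names are all hypothetical stand-ins chosen for
illustration. It shows that testing the address of an array slot tells us
nothing about whether a device is attached, while testing the slot's
device field does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sd.c structures; names are illustrative. */
struct toy_device {
    int request_queue;           /* placeholder for the real queue object */
};

struct toy_disk {
    struct toy_device *device;   /* NULL while no device is attached */
};

#define TOY_MAX_DISKS 4

/* Static storage is zero-initialized, so every slot starts unattached. */
static struct toy_disk toy_disks[TOY_MAX_DISKS];

/* Mirrors the old test: is the address of the slot itself NULL? */
static int old_check_rejects(size_t target)
{
    struct toy_disk *dpnt = &toy_disks[target];
    return dpnt == NULL;
}

/* Mirrors the patched test: is a device attached to the slot? */
static int *toy_find_queue(size_t target)
{
    struct toy_disk *dpnt = &toy_disks[target];
    if (!dpnt->device)
        return NULL;    /* No such device */
    return &dpnt->device->request_queue;
}

/* Small self-check contrasting the two tests. */
static void toy_demo(void)
{
    static struct toy_device dev = { 42 };

    assert(!old_check_rejects(0));       /* old test lets an empty slot through */
    assert(toy_find_queue(0) == NULL);   /* patched test catches it */

    toy_disks[1].device = &dev;
    assert(toy_find_queue(1) == &dev.request_queue);
}
```

Calling toy_demo() exercises both paths: an empty slot passes the old
address test but is rejected by the device test.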

Please check.

The following patch is based on the 2.4.18-pre7 code:
---------------------------------------------------------------------------
--- linux/drivers/scsi/sd.c Mon Feb 18 13:36:42 2002
+++ linux-2.4.17-diskio/drivers/scsi/sd.c Mon Feb 18 13:29:34 2002
@@ -279,7 +279,7 @@
target = DEVICE_NR(dev);

dpnt = &rscsi_disks[target];
- if (!dpnt)
+ if (!dpnt->device)
return NULL; /* No such device */
return &dpnt->device->request_queue;
}
@@ -302,7 +302,7 @@

dpnt = &rscsi_disks[dev];
if (devm >= (sd_template.dev_max << 4) ||
- !dpnt ||
+ !dpnt->device ||
!dpnt->device->online ||
block + SCpnt->request.nr_sectors > sd[devm].nr_sects) {
SCSI_LOG_HLQUEUE(2, printk("Finishing %ld sectors\n", SCpnt->request.nr_sectors));
---------------------------------------------------------------------------

Regards,
Peter

Wai Yee Peter Wong
IBM Linux Technology Center, Performance Analysis
email: [email protected]


2002-02-18 23:03:19

by Stephan von Krawczynski

Subject: Re: [PATCH] Encountered a Null Pointer Problem on the SCSI Layer

> A while ago, I reported that I encountered a null pointer problem
> on the SCSI layer when I was testing Mingming Cao's diskio patch
> "diskio-stat-rq-2414" on 2.4.14.
>
> Mingming's patch is at http://sourceforge.net/projects/lse/.
>
> The code in sd_find_queue() that protects against accessing a
> non-existent device is not correct. After my patch was sent out,
> Pete Zaitcev of Red Hat identified a similar problem in
> sd_init_command of the same file.
>
> Let's consider sd_find_queue().
>
> If the array pointed to by rscsi_disks has been allocated,
> dpnt cannot be null.
>
> If rscsi_disks has NOT been allocated, dpnt = &rscsi_disks[target]
> may still be non-null, depending on the value of target. Thus,
> "if (!dpnt)" is not sufficient anyway.
>
> You can also look at sd_attach(), in which "if (!dpnt->device)" is
> tested, not "if (!dpnt)".
>
> Please check.

Are you 100% sure, that there is no case where
dpnt==NULL? Because if there is such a possibility, your patch will
blow up.
It would be completely safe to check both

(!dpnt || !dpnt->device)

Regards,
Stephan


>
> The following patch is based on the 2.4.18-pre7 code:
> ---------------------------------------------------------------------------
> --- linux/drivers/scsi/sd.c Mon Feb 18 13:36:42 2002
> +++ linux-2.4.17-diskio/drivers/scsi/sd.c Mon Feb 18 13:29:34 2002
> @@ -279,7 +279,7 @@
> target = DEVICE_NR(dev);
>
> dpnt = &rscsi_disks[target];
> - if (!dpnt)
> + if (!dpnt->device)
> return NULL; /* No such device */
> return &dpnt->device->request_queue;
> }
> @@ -302,7 +302,7 @@
>
> dpnt = &rscsi_disks[dev];
> if (devm >= (sd_template.dev_max << 4) ||
> - !dpnt ||
> + !dpnt->device ||
> !dpnt->device->online ||
> block + SCpnt->request.nr_sectors > sd[devm].nr_sects) {
> SCSI_LOG_HLQUEUE(2, printk("Finishing %ld sectors\n", SCpnt->request.nr_sectors));
> ---------------------------------------------------------------------------
>
> Regards,
> Peter
>
> Wai Yee Peter Wong
> IBM Linux Technology Center, Performance Analysis
> email: [email protected]
>

2002-02-19 00:06:25

by Pete Zaitcev

Subject: Re: [PATCH] Encountered a Null Pointer Problem on the SCSI Layer

> Date: Tue, 19 Feb 2002 00:01:39 +0100
> From: Stephan von Krawczynski <[email protected]>

> Are you 100% sure, that there is no case where
> dpnt==NULL? Because if there is such a possibility, your patch will
> blow up.

If there is such a possibility, everything will blow up.

Please bear with me while I am ranting at your expense, but
your example is very educational.

It seems to me that many people consider that putting a check
for NULL in front of every pointer dereference is an answer
to everything, including a missing understanding of the code.
It actually is not the case, IMHO. In well written code,
checks for NULL are not introduced to prevent oopses locally.
Instead, they implement a functionality, according to the
master plan. To distinguish these two conditions, mentally
consider a replacement of the check with something more descriptive.
When my eye sees code that does this:

    if (p->foo == NULL)
        return -EINVAL;

my mind sees either this:

    /*
     * Our brains are too small to wrap around this module,
     * and we saw an oops somewhere. Let's plug it with
     * a check and pray that nobody will notice.
     */
    if (p->foo == NULL)
        return -EINVAL;

or this:

    if (!device_is_attached(dpnt))
        return -EINVAL; /* XXX TODO: -ENODEV, not -EINVAL */

The code may be conducive to such interpretation by the mind,
or it may not be. In the latter case we do what is called "a cleanup".
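
As a rough sketch of such a cleanup, the bare NULL test can be wrapped in
a predicate whose name states the intent. Everything below is
hypothetical: device_is_attached() is only the name used in the
pseudocode above, not an existing kernel function, and the toy_* types,
toy_check_disk(), and the choice of -EIO for an offline device are
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real SCSI structures. */
struct toy_device {
    int online;
};

struct toy_disk {
    struct toy_device *device;   /* NULL while no device is attached */
};

/* Names the condition that the bare NULL test stands for. */
static int device_is_attached(const struct toy_disk *dpnt)
{
    return dpnt->device != NULL;
}

/* The check now reads as functionality rather than as an oops plug. */
static int toy_check_disk(const struct toy_disk *dpnt)
{
    if (!device_is_attached(dpnt))
        return -ENODEV;
    if (!dpnt->device->online)
        return -EIO;             /* assumed error code for this sketch */
    return 0;
}

/* Small self-check covering the three outcomes. */
static void toy_check_demo(void)
{
    struct toy_device up = { 1 };
    struct toy_device down = { 0 };
    struct toy_disk empty = { NULL };
    struct toy_disk good = { &up };
    struct toy_disk offline = { &down };

    assert(toy_check_disk(&empty) == -ENODEV);
    assert(toy_check_disk(&good) == 0);
    assert(toy_check_disk(&offline) == -EIO);
}
```

The behavior is the same as an open-coded pointer test; only the reader's
job gets easier.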

Often, the interpretation can only be done by looking at the
code as a whole, but this particular patch is nearly obvious
by itself:

> > dpnt = &rscsi_disks[target];
> > - if (!dpnt)
> > + if (!dpnt->device)
> > return NULL; /* No such device */
> > return &dpnt->device->request_queue;

The dpnt may be null ONLY if we do I/O to the first partition
of the first disk. Is anything special about that case?
I think not. Also, look at the comment. Obviously, the if() was
meant for something other than a corner case of partition zero.
It seems probable that the data layout was modified
but the check was forgotten.
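
The address arithmetic behind that claim can be sketched as follows. This
is an illustration only: it assumes a flat address space in which a null
pointer is address zero, and it does the arithmetic on integers because
indexing through a null pointer is not valid C. The toy_disk layout and
slot_address() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type; only its size matters here. */
struct toy_disk {
    void *device;
    long capacity;
};

/*
 * Address that &base[index] would evaluate to, computed on integers
 * so that no invalid pointer is ever formed.
 */
static uintptr_t slot_address(uintptr_t base, size_t index)
{
    return base + (uintptr_t)index * sizeof(struct toy_disk);
}

/* Small self-check of the three interesting cases. */
static void slot_demo(void)
{
    /* Unallocated array (base 0), index 0: the only null result. */
    assert(slot_address(0, 0) == 0);

    /* Unallocated array, any other index: a non-null garbage address. */
    assert(slot_address(0, 1) == sizeof(struct toy_disk));

    /* Allocated array: never null, attached or not. */
    assert(slot_address(0x1000, 0) == 0x1000);
}
```

So the old test can fire only for index zero of an unallocated array,
which is not the "no such device" case the comment describes.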

-- Pete

2002-02-19 12:01:19

by Stephan von Krawczynski

Subject: Re: [PATCH] Encountered a Null Pointer Problem on the SCSI Layer

On Mon, 18 Feb 2002 19:04:07 -0500
Pete Zaitcev <[email protected]> wrote:

> > Date: Tue, 19 Feb 2002 00:01:39 +0100
> > From: Stephan von Krawczynski <[email protected]>
>
> > Are you 100% sure, that there is no case where
> > dpnt==NULL? Because if there is such a possibility, your patch will
> > blow up.
>
> If there is such a possibility, everything will blow up.

Re-reading the code, you are right. This patch is fine.

Regards,
Stephan