2001-07-31 04:40:40

by vijay srinath

Subject: SCSI Tape driver problem

hello all,

I noticed a bug in the SCSI tape class driver in kernels 2.2.16 and 2.4.x.
This is the test that I ran:
1. I have two SCSI-FC tape devices.
2. I insert the HBA driver, so that both tape devices are enabled and map to st0 and st1.
3. I remove the device that maps to st0 using
echo "scsi remove-single-device 0 0 0 0" > /proc/scsi/scsi
4. Now, if I try to do an open("/dev/st1",..), the process hangs because the open call never returns.

The problem seems to be in the callback function st_sleep_done(), where the following comparison is made:
if ((st_nbr = TAPE_NR(SCpnt->request.rq_dev)) < st_template.nr_dev)
{
...
}

This comparison fails for the above test, since nr_dev will be 1 after
the removal and TAPE_NR() will also be 1 for /dev/st1, so 1 < 1 is false.
Hence the semaphore SCpnt->request.sem never gets released and open()
waits forever.
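
To illustrate the failure, here is a minimal user-space sketch of the check. It is not the real st.c code: the fake_template and fake_request structures, fake_sleep_done() and open_would_return() are hypothetical stand-ins, and a boolean flag stands in for releasing SCpnt->request.sem.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver state; not the real st.c types. */
struct fake_template { int nr_dev; };      /* number of attached tape devices */
struct fake_request  { int tape_nr; bool sem_released; };

/* Same shape as the quoted check: the waiter is woken only if the
 * tape number is below the current device count. */
static void fake_sleep_done(const struct fake_template *tpl,
                            struct fake_request *req)
{
    if (req->tape_nr < tpl->nr_dev)
        req->sem_released = true;          /* stands in for releasing the semaphore */
}

/* Returns true if an open() on the given tape number would be woken. */
static bool open_would_return(int nr_dev, int tape_nr)
{
    struct fake_template tpl = { nr_dev };
    struct fake_request req = { tape_nr, false };

    fake_sleep_done(&tpl, &req);
    return req.sem_released;
}
```

With both devices attached, open_would_return(2, 1) is true. After st0 is removed, the call becomes open_would_return(1, 1), which is false, and that matches the hang seen on /dev/st1.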


Can somebody please let me know why this comparison is needed?

regards
vijay





2001-07-31 19:33:42

by Kai Mäkisara (Kolumbus)

Subject: Re: SCSI Tape driver problem

On Tue, 31 Jul 2001, vijay srinath wrote:

> hello all,
>
> I noticed a bug in the scsi tape class driver in kernel 2.2.16/2.4.x.
> This is the test that i ran.
> 1. I have two scsi-fc tape devices
> 2. I insert the hba driver, so that both the tape devices are enabled and map to st0 and st1
> 3. I remove the device that maps to st0 using
> echo "scsi remove-single-device 0 0 0 0" > /proc/scsi/scsi
> 4. Now, if i try to do an open("/dev/st1",..), the process hangs as the open call never returns.
>
> The problem seems to be in the callback function st_sleep_done() where the following comparison is being made
> if ((st_nbr = TAPE_NR(SCpnt->request.rq_dev)) < st_template.nr_dev)
> {
> ...
> }
>
> This comparison fails for the above test since nr_dev will be 1 and
> TAPE_NR() will also be 1 for /dev/st1. Hence the semaphore
> SCpnt->request.sem never gets released and open waits forever.
>
>
> Can somebody please let me know why this comparison is needed ?
>
The test was inserted a long time ago to be an additional safeguard
against problems in the code. In normal operation the interrupt function
should never be called with an illegal argument. It seems that the world
around the test has evolved so that the test is not working properly any
more.

You can just remove the test. I will prepare a patch to fix this bug and
send the patch to Linus.

Kai