2001-07-24 17:21:32

by Carsten Otte

[permalink] [raw]
Subject: Design-Question: end_that_request_* and bh->b_end_io hooks

Hi Folks,

as you are deeper into block-devices & filesystems than me,
here are my two simple questions in short:
Is it legal for a filesystem (or whatever) to install a hook into
bh->b_end_io
which calls generic_make_request?
Do block drivers need or are they allowed to hold the io_request_lock or
other (local)
locks when calling end_that_request_*?

Rational / Explanation:
We encountered a problem with our block device driver (dasd on arch=s390)
together with JFS:
In our bottom half ,we grab the io_request_lock & disable interrupts and
(among
other stuff) call end_that_request* for finalized requests. Afterwards we
release
the lock and enable again.
The problem is, that JFS hooks into bh->b_end_io and calls
generic_make_request then.
Note that generic_make_request grabs the io_request_lock, may call schedule
(in __get_request_wait), and may call the do_request function of the block
device driver.
Do you think that this hook is legal (and we have to change the device
driver to
call end_that_request from outside the bottom half -scheduling in
interrupt!- without
holding the io_request_lock) or does the fs need to be changed?
Do other filesystems (esp. ReiserFS) require or grab the io_request_lock in
their
b_end_io hooks?

Please CC: answers directly me since I do not read the FS-Lists regulary.

mit freundlichem Gru? / with kind regards
Carsten Otte

IBM Deutschland Entwicklung GmbH
Linux for 390/zSeries Development - Device Driver Team
Phone: +49/07031/16-4076
IBM internal phone: *120-4076
--
We are Linux.
Resistance indicates that you're missing the point!


2001-07-25 17:25:17

by Tim Pepper

[permalink] [raw]
Subject: Re: Design-Question: end_that_request_* and bh->b_end_io hooks

Are you confusing generic_make_request() and __make_request().
generic_make_request() doesn't itself grab any locks or sleep. It mostly
sets some stuff up and calls the make_request function that was registered
for the given queue. If the driver hasn't done anything special for its
make_request() function then __make_request() will be that function,
in which case your statements about locking and sleeping are correct.
I suppose that either way you're triggering stuff to run which might
sleep so you shouldn't be holding locks.

You bring up an interresting point though aside from locking.
What I've read has given me the indication that a person writing
a b_end_io() function should assume that they could be called from
interrupt context. If that is the case then any b_end_io() wanting to
call generic_make_request() would need to defer that call until it was
outside of interrupt context. Otherwise the b_end_io() could sleep
within interrupt context. Drivers at the "md" level tend to call
generic_make_request() after b_end_io(), but in the kernel proper I
don't see any others. I haven't traced through the md drivers enough
to know but it does look like they do defer.

I think this may be something I'm doing wrong in a driver on which I'm
working...

Tim

2001-07-25 21:05:50

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [Jfs-discussion] Design-Question: end_that_request_* and bh->b_end_io hooks

>
> Hi Folks,
>
> as you are deeper into block-devices & filesystems than me,
> here are my two simple questions in short:
> Is it legal for a filesystem (or whatever) to install a hook into
> bh->b_end_io
> which calls generic_make_request?

I believe that JFS is at fault here. JFS has been using in_interrupt() to
determine whether to initiate I/O or to queue the I/O to a worker thread.
In your case, in_interrupt() will return false, but JFS still should not be
calling generic_make_request.

> Do block drivers need or are they allowed to hold the io_request_lock or
> other (local)
> locks when calling end_that_request_*?

I think you're doing it right. See if this patch fixes the problem.

(I'm currently at OLS in Ottawa, so I apologize if my response is a little
slow.)

Index: linux/fs/jfs/jfs_logmgr.c
===================================================================
RCS file: /share/Open_Source/CVS_JFS/linux/fs/jfs/jfs_logmgr.c,v
retrieving revision 1.27
diff -u -r1.27 jfs_logmgr.c
--- linux/fs/jfs/jfs_logmgr.c 2001/06/29 14:50:21 1.27
+++ linux/fs/jfs/jfs_logmgr.c 2001/07/25 21:57:57
@@ -242,7 +242,7 @@
static void lbmFree(lbuf_t * bp);
static void lbmfree(lbuf_t * bp);
static int lbmRead(log_t * log, int pn, lbuf_t ** bpp);
-static void lbmWrite(log_t * log, lbuf_t * bp, int flag);
+static void lbmWrite(log_t * log, lbuf_t * bp, int flag, int cant_block);
static void lbmDirectWrite(log_t * log, lbuf_t * bp, int flag);
static int lbmIOWait(lbuf_t * bp, int flag);
static void lbmIODone(struct buffer_head * bh, int);
@@ -251,7 +251,7 @@
#endif /* _STILL_TO_PORT */
void lbmStartIO(lbuf_t * bp);
#ifdef _JFS_LAZYCOMMIT
-void lmGCwrite(log_t * log);
+void lmGCwrite(log_t * log, int cant_block);
#endif


@@ -687,7 +687,7 @@
if (bp->l_wqnext == NULL) {
/* bp->l_ceor = bp->l_eor; */
/* lp->h.eor = lp->t.eor = bp->l_ceor; */
- lbmWrite(log, bp, 0);
+ lbmWrite(log, bp, 0, 0);
}
}
/* page is not bound with outstanding tblk:
@@ -697,7 +697,7 @@
/* finalize the page */
bp->l_ceor = bp->l_eor;
lp->h.eor = lp->t.eor = cpu_to_le16(bp->l_ceor);
- lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmFREE);
+ lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmFREE, 0);
}
LOGGC_UNLOCK(log);

@@ -771,7 +771,7 @@
*/
log->cflag |= logGC_PAGEOUT;

- lmGCwrite(log);
+ lmGCwrite(log, 0);
}
/* lmGCwrite gives up LOGGC_LOCK, check again */

@@ -831,7 +831,7 @@
* LOGGC_LOCK must be held by caller.
* N.B. LOG_LOCK is NOT held during lmGroupCommit().
*/
-void lmGCwrite(log_t * log)
+void lmGCwrite(log_t * log, int cant_write)
{
lbuf_t *bp;
logpage_t *lp;
@@ -873,7 +873,7 @@
jEVENT(0,
("gc: tclsn:0x%x, bceor:0x%x\n", tblk->clsn,
bp->l_ceor));
- lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmGC);
+ lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmGC, cant_write);
}
/* page is not yet full */
else {
@@ -882,7 +882,7 @@
jEVENT(0,
("gc: tclsn:0x%x, bceor:0x%x\n", tblk->clsn,
bp->l_ceor));
- lbmWrite(log, bp, lbmWRITE | lbmGC);
+ lbmWrite(log, bp, lbmWRITE | lbmGC, cant_write);
}
}

@@ -962,7 +962,7 @@
bp->l_ceor = bp->l_eor;
lp->h.eor = lp->t.eor = cpu_to_le16(bp->l_eor);
jEVENT(0, ("lmPostGC: calling lbmWrite\n"));
- lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmFREE);
+ lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmFREE, 1);
}

}
@@ -977,7 +977,7 @@
/*
* Call lmGCwrite with new group leader
*/
- lmGCwrite(log);
+ lmGCwrite(log, 1);

/* no transaction are ready yet (transactions are only just
* queued (GC_QUEUE) and not entered for group commit yet).
@@ -1169,7 +1169,7 @@
jFYI(1,
("gc: tclsn:0x%x, bceor:0x%x\n", tblk->clsn,
bp->l_ceor));
- lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmSYNC);
+ lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmSYNC, 0);
}
/* page is not yet full */
else {
@@ -1178,7 +1178,7 @@
jFYI(1,
("gc: tclsn:0x%x, bceor:0x%x\n", tblk->clsn,
bp->l_ceor));
- lbmWrite(log, bp, lbmWRITE | lbmSYNC);
+ lbmWrite(log, bp, lbmWRITE | lbmSYNC, 0);
}

LOGGC_UNLOCK(log);
@@ -1237,7 +1237,7 @@
/* finalize the page */
bp->l_ceor = bp->l_eor;
lp->h.eor = lp->t.eor = cpu_to_le16(bp->l_eor);
- lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmFREE);
+ lbmWrite(log, bp, lbmWRITE | lbmRELEASE | lbmFREE, 0);
}

tblk = tblk->cqnext;
@@ -1780,7 +1780,7 @@
bp->l_ceor = bp->l_eor;
lp = (logpage_t *) bp->l_ldata;
lp->h.eor = lp->t.eor = cpu_to_le16(bp->l_eor);
- lbmWrite(log, bp, lbmWRITE | lbmSYNC);
+ lbmWrite(log, bp, lbmWRITE | lbmSYNC, 0);
if ((rc = lbmIOWait(bp, 0)))
goto errout30;

@@ -1981,7 +1981,7 @@
bp = log->bp;
lp = (logpage_t *) bp->l_ldata;
lp->h.eor = lp->t.eor = cpu_to_le16(bp->l_eor);
- lbmWrite(log, log->bp, lbmWRITE | lbmRELEASE | lbmSYNC);
+ lbmWrite(log, log->bp, lbmWRITE | lbmRELEASE | lbmSYNC, 0);
lbmIOWait(log->bp, lbmFREE);

/*
@@ -2431,7 +2431,7 @@
* LOGGC_LOCK() serializes lbmWrite() by lmNextPage() and lmGroupCommit().
* LCACHE_LOCK() serializes xflag between lbmWrite() and lbmIODone()
*/
-static void lbmWrite(log_t * log, lbuf_t * bp, int flag)
+static void lbmWrite(log_t * log, lbuf_t * bp, int flag, int cant_block)
{
lbuf_t *tail;
unsigned long flags;
@@ -2481,7 +2481,7 @@

LCACHE_UNLOCK(); /* unlock+enable */

- if (in_interrupt()) {
+ if (cant_block) {
spin_lock_irqsave(&async_lock, flags);
bp->l_redrive_next = log_redrive_list;
log_redrive_list = bp;

2001-07-25 21:48:14

by Dave Kleikamp

[permalink] [raw]
Subject: Re: [Jfs-discussion] Design-Question: end_that_request_* and bh->b_end_io hooks

Oops! I shouldn't have sent the patch to everyone on the cc list. I
wasn't paying attention. I'm sorry everyone!

Dave Kleikamp

2001-07-26 06:38:58

by Jens Axboe

[permalink] [raw]
Subject: Re: Design-Question: end_that_request_* and bh->b_end_io hooks

On Tue, Jul 24 2001, Carsten Otte wrote:
> Hi Folks,
>
> as you are deeper into block-devices & filesystems than me,
> here are my two simple questions in short:
> Is it legal for a filesystem (or whatever) to install a hook into
> bh->b_end_io
> which calls generic_make_request?

No, b_end_io might be called from irq context (IDE for instance) which
will break for __make_request (both the _irq spin locks and the schedule
on request slot empty).

You could do bh stacking and defer stuff like this to a thread, that's
probably the way to go.

> Do block drivers need or are they allowed to hold the io_request_lock or
> other (local) locks when calling end_that_request_*?

Yes they may, in fact they _must_ hold it for end_that_request_last.
Look at blkdev_release_request -- it meddles with the queue free and
pending lists and must be protected against reentrancy.

--
Jens Axboe