2008-11-24 21:26:35

by Keith Mannthey

[permalink] [raw]
Subject: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag set

Allow the scsi request REQ_QUIET flag to be propagated to the buffer
file system layer. The basic ideas is to pass the flag from the scsi
request to the bio (block IO) and then to the buffer layer. The buffer
layer can then suppress needless printks.

This patch declutters the kernel log by removed the 40-50 (per lun)
buffer io error messages seen during a boot in my multipath setup . It
is a good chance any real errors will be missed in the "noise" it the
logs without this patch.

During boot I see blocks of messages like
"
__ratelimit: 211 callbacks suppressed
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242847
Buffer I/O error on device sdm, logical block 1
Buffer I/O error on device sdm, logical block 5242878
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242879
Buffer I/O error on device sdm, logical block 5242872
"
in my logs.

My disk environment is multipath fiber channel using the SCSI_DH_RDAC
code and multipathd. This topology includes an "active" and "ghost"
path for each lun. IO's to the "ghost" path will never complete and the
SCSI layer, via the scsi device handler rdac code, quick returns the IOs
to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
layer messages.

I am wanting to extend the QUIET behavior to include the buffer file
system layer to deal with these errors as well. I have been running this
patch for a while now on several boxes without issue. A few runs of
bonnie++ show no noticeable difference in performance in my setup.

Thanks for John Stultz for the quiet_error finalization.

Submitted-by: Keith Mannthey <[email protected]>

---

diff -urN linux-2.6.28-rc4-orig/block/blk-core.c linux-2.6.28-rc4/block/blk-core.c
--- linux-2.6.28-rc4-orig/block/blk-core.c 2008-11-18 11:17:59.000000000 -0800
+++ linux-2.6.28-rc4/block/blk-core.c 2008-11-17 16:48:06.000000000 -0800
@@ -139,6 +139,9 @@
nbytes = bio->bi_size;
}

+ if (unlikely(rq->cmd_flags & REQ_QUIET)) {
+ set_bit(BIO_QUIET, &bio->bi_flags);
+ }
bio->bi_size -= nbytes;
bio->bi_sector += (nbytes >> 9);

diff -urN linux-2.6.28-rc4-orig/fs/buffer.c linux-2.6.28-rc4/fs/buffer.c
--- linux-2.6.28-rc4-orig/fs/buffer.c 2008-11-18 11:18:01.000000000 -0800
+++ linux-2.6.28-rc4/fs/buffer.c 2008-11-18 12:02:59.000000000 -0800
@@ -99,10 +99,18 @@
page_cache_release(page);
}

+
+static int quiet_error(struct buffer_head *bh)
+{
+ if (!test_bit(BH_Quiet, &bh->b_state) && printk_ratelimit())
+ return 0;
+ return 1;
+}
+
+
static void buffer_io_error(struct buffer_head *bh)
{
char b[BDEVNAME_SIZE];
-
printk(KERN_ERR "Buffer I/O error on device %s, logical block %Lu\n",
bdevname(bh->b_bdev, b),
(unsigned long long)bh->b_blocknr);
@@ -144,7 +152,7 @@
if (uptodate) {
set_buffer_uptodate(bh);
} else {
- if (!buffer_eopnotsupp(bh) && printk_ratelimit()) {
+ if (!buffer_eopnotsupp(bh) && !quiet_error(bh)) {
buffer_io_error(bh);
printk(KERN_WARNING "lost page write due to "
"I/O error on %s\n",
@@ -394,7 +402,7 @@
set_buffer_uptodate(bh);
} else {
clear_buffer_uptodate(bh);
- if (printk_ratelimit())
+ if (!quiet_error(bh))
buffer_io_error(bh);
SetPageError(page);
}
@@ -455,7 +463,7 @@
if (uptodate) {
set_buffer_uptodate(bh);
} else {
- if (printk_ratelimit()) {
+ if (!quiet_error(bh)) {
buffer_io_error(bh);
printk(KERN_WARNING "lost page write due to "
"I/O error on %s\n",
@@ -2912,6 +2920,9 @@
set_bit(BH_Eopnotsupp, &bh->b_state);
}

+ if (unlikely (test_bit(BIO_QUIET,&bio->bi_flags)))
+ set_bit(BH_Quiet, &bh->b_state);
+
bh->b_end_io(bh, test_bit(BIO_UPTODATE, &bio->bi_flags));
bio_put(bio);
}
diff -urN linux-2.6.28-rc4-orig/include/linux/bio.h linux-2.6.28-rc4/include/linux/bio.h
--- linux-2.6.28-rc4-orig/include/linux/bio.h 2008-11-18 11:18:02.000000000 -0800
+++ linux-2.6.28-rc4/include/linux/bio.h 2008-11-18 11:32:51.000000000 -0800
@@ -117,6 +117,7 @@
#define BIO_CPU_AFFINE 8 /* complete bio on same CPU as submitted */
#define BIO_NULL_MAPPED 9 /* contains invalid user pages */
#define BIO_FS_INTEGRITY 10 /* fs owns integrity data, not block layer */
+#define BIO_QUIET 11 /* Make BIO Quiet */
#define bio_flagged(bio, flag) ((bio)->bi_flags & (1 << (flag)))

/*
diff -urN linux-2.6.28-rc4-orig/include/linux/buffer_head.h linux-2.6.28-rc4/include/linux/buffer_head.h
--- linux-2.6.28-rc4-orig/include/linux/buffer_head.h 2008-11-18 11:18:02.000000000 -0800
+++ linux-2.6.28-rc4/include/linux/buffer_head.h 2008-11-17 16:48:06.000000000 -0800
@@ -35,6 +35,7 @@
BH_Ordered, /* ordered write */
BH_Eopnotsupp, /* operation not supported (barrier) */
BH_Unwritten, /* Buffer is allocated on disk but not written */
+ BH_Quiet, /* Buffer Error Prinks to be quiet */

BH_PrivateStart,/* not a state bit, but the first bit available
* for private allocation by other entities


2008-11-25 09:21:33

by Jens Axboe

[permalink] [raw]
Subject: Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag set

On Mon, Nov 24 2008, Keith Mannthey wrote:
> Allow the scsi request REQ_QUIET flag to be propagated to the buffer
> file system layer. The basic ideas is to pass the flag from the scsi
> request to the bio (block IO) and then to the buffer layer. The buffer
> layer can then suppress needless printks.
>
> This patch declutters the kernel log by removed the 40-50 (per lun)
> buffer io error messages seen during a boot in my multipath setup . It
> is a good chance any real errors will be missed in the "noise" it the
> logs without this patch.
>
> During boot I see blocks of messages like
> "
> __ratelimit: 211 callbacks suppressed
> Buffer I/O error on device sdm, logical block 5242879
> Buffer I/O error on device sdm, logical block 5242879
> Buffer I/O error on device sdm, logical block 5242847
> Buffer I/O error on device sdm, logical block 1
> Buffer I/O error on device sdm, logical block 5242878
> Buffer I/O error on device sdm, logical block 5242879
> Buffer I/O error on device sdm, logical block 5242879
> Buffer I/O error on device sdm, logical block 5242879
> Buffer I/O error on device sdm, logical block 5242879
> Buffer I/O error on device sdm, logical block 5242872
> "
> in my logs.
>
> My disk environment is multipath fiber channel using the SCSI_DH_RDAC
> code and multipathd. This topology includes an "active" and "ghost"
> path for each lun. IO's to the "ghost" path will never complete and the
> SCSI layer, via the scsi device handler rdac code, quick returns the IOs
> to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
> layer messages.
>
> I am wanting to extend the QUIET behavior to include the buffer file
> system layer to deal with these errors as well. I have been running this
> patch for a while now on several boxes without issue. A few runs of
> bonnie++ show no noticeable difference in performance in my setup.
>
> Thanks for John Stultz for the quiet_error finalization.

Looks good to me. I'll merge it up for 2.6.29.

--
Jens Axboe

2008-12-30 19:36:18

by Andrew Morton

[permalink] [raw]
Subject: Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag set

On Tue, 25 Nov 2008 10:19:18 +0100
Jens Axboe <[email protected]> wrote:

> On Mon, Nov 24 2008, Keith Mannthey wrote:
> > Allow the scsi request REQ_QUIET flag to be propagated to the buffer
> > file system layer. The basic ideas is to pass the flag from the scsi
> > request to the bio (block IO) and then to the buffer layer. The buffer
> > layer can then suppress needless printks.
> >
> > This patch declutters the kernel log by removed the 40-50 (per lun)
> > buffer io error messages seen during a boot in my multipath setup . It
> > is a good chance any real errors will be missed in the "noise" it the
> > logs without this patch.
> >
> > During boot I see blocks of messages like
> > "
> > __ratelimit: 211 callbacks suppressed
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242847
> > Buffer I/O error on device sdm, logical block 1
> > Buffer I/O error on device sdm, logical block 5242878
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242872
> > "
> > in my logs.
> >
> > My disk environment is multipath fiber channel using the SCSI_DH_RDAC
> > code and multipathd. This topology includes an "active" and "ghost"
> > path for each lun. IO's to the "ghost" path will never complete and the
> > SCSI layer, via the scsi device handler rdac code, quick returns the IOs
> > to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
> > layer messages.
> >
> > I am wanting to extend the QUIET behavior to include the buffer file
> > system layer to deal with these errors as well. I have been running this
> > patch for a while now on several boxes without issue. A few runs of
> > bonnie++ show no noticeable difference in performance in my setup.
> >
> > Thanks for John Stultz for the quiet_error finalization.
>
> Looks good to me. I'll merge it up for 2.6.29.

So a month later this turns up in linux-next. During the merge
window, giving a nice pile of rejects to keep me amused.

Can we do better than this, please? A lot?

> +static int quiet_error(struct buffer_head *bh)
> +{
> + if (!test_bit(BH_Quiet, &bh->b_state) && printk_ratelimit())
> + return 0;
> + return 1;
> +}
>

For better of for worse, we have a convention of using cpp-generated
helper functions for the buffer_head flags. There's no reason why this
new code needs to diverge from that. The above should use buffer_quiet().

The functions in fs/buffer.c have been nicely commented.

This function is poorly named. What does "quiet_error" *mean*?

<tries to work it out>

Every caller of this function does `if (!quiet_error(bh))'. Would it
not make more sense to invert the sense of its return value?

static int permit_bh_errors(struct buffer_head *bh)
{
if (buffer_quiet(bh))
return 0; /* IO layer suppressed error messages */
return printk_ratelimit();
}

Did I translate that right? If so, then the addition of the
printk_ratelimit() to the non-buffer_quiet() buffers is an
unchangelogged and unrelated alteration.

The use of printk_ratelimit() needs some thought. It shares
ratelimiting state with all other printk_ratelimit() callsites. Was
that desirable? Would it have been better to create a private
ratelimit_state for buffer_heads? Per physical device? Per
something-else?


> + if (unlikely (test_bit(BIO_QUIET,&bio->bi_flags)))
> + set_bit(BH_Quiet, &bh->b_state);

And the above (which has coding-style errors and has apparently not been
checkpatched) should use set_buffer_quiet().


> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -35,6 +35,7 @@ enum bh_state_bits {
> BH_Ordered, /* ordered write */
> BH_Eopnotsupp, /* operation not supported (barrier) */
> BH_Unwritten, /* Buffer is allocated on disk but not written */
> + BH_Quiet, /* Buffer Error Prinks to be quiet */
>
> BH_PrivateStart,/* not a state bit, but the first bit available
> * for private allocation by other entities

Add

+ BUFFER_FNS(Quiet, quiet)

around line 123 to generate the helper functions.

2008-12-30 19:42:40

by Andrew Morton

[permalink] [raw]
Subject: Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag set

On Tue, 30 Dec 2008 11:35:14 -0800
Andrew Morton <[email protected]> wrote:

> During the merge
> window, giving a nice pile of rejects to keep me amused.

Seems that the rejects are due to other churn. Where the heck did
all the slab defrag code disappear to??

I'll need to revert to yesterday's linux-next I think. Pekka, has
that stuff been dropped?

2008-12-30 20:08:23

by Pekka Enberg

[permalink] [raw]
Subject: Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag set

Hi Andrew,

On Tue, Dec 30, 2008 at 9:41 PM, Andrew Morton
<[email protected]> wrote:
> Seems that the rejects are due to other churn. Where the heck did
> all the slab defrag code disappear to??
>
> I'll need to revert to yesterday's linux-next I think. Pekka, has
> that stuff been dropped?

Yup, there hasn't been any activity after the last merge window so I'm
not expecting to be able to merge it for 2.6.29. As per linux-next
rules, I dropped the 2.6.30 material until the merge window closes.
The code is in topic/slub-defrag of slab.git, though.

Pekka

2008-12-31 23:05:04

by Christoph Lameter

[permalink] [raw]
Subject: Defrag support for inodes / dentries etc

On Tue, 30 Dec 2008, Pekka Enberg wrote:

> > I'll need to revert to yesterday's linux-next I think. Pekka, has
> > that stuff been dropped?
>
> Yup, there hasn't been any activity after the last merge window so I'm
> not expecting to be able to merge it for 2.6.29. As per linux-next
> rules, I dropped the 2.6.30 material until the merge window closes.
> The code is in topic/slub-defrag of slab.git, though.

Well the problem currently is that there is no driver for these features.
There has been a request for this functionality for a long time but both
the primary drivers for this (Dave Chinner and I) are no longer with the
company for which we started the project. I'd be glad to continue this
if there would be an interest by the filesystem developers.

The key problem that would have to be addressed to move this
forward is to clean up the callbacks for dentries and inodes.

Again the problem solved by these patches are mainly filesystem intensive
loads that can cause significant fragmentation for inodes and denties
which may leave lots of memory unrecoverable for other purposes.




2009-01-01 01:10:19

by Keith Mannthey

[permalink] [raw]
Subject: Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag set

On Tue, 2008-12-30 at 11:35 -0800, Andrew Morton wrote:
> On Tue, 25 Nov 2008 10:19:18 +0100
> Jens Axboe <[email protected]> wrote:
>
> > On Mon, Nov 24 2008, Keith Mannthey wrote:

<snip>

> > +static int quiet_error(struct buffer_head *bh)
> > +{
> > + if (!test_bit(BH_Quiet, &bh->b_state) && printk_ratelimit())
> > + return 0;
> > + return 1;
> > +}
> >
>
> For better of for worse, we have a convention of using cpp-generated
> helper functions for the buffer_head flags. There's no reason why this
> new code needs to diverge from that. The above should use buffer_quiet().
>
> The functions in fs/buffer.c have been nicely commented.
>
> This function is poorly named. What does "quiet_error" *mean*?

There were only theses handful or errors that I encounter in my setup.
"quite_error" in my mind was to meant to mean that the error message
could be suppressed by the new BH_Quiet flag.

> <tries to work it out>
>
> Every caller of this function does `if (!quiet_error(bh))'. Would it
> not make more sense to invert the sense of its return value?
>
> static int permit_bh_errors(struct buffer_head *bh)
> {
> if (buffer_quiet(bh))
> return 0; /* IO layer suppressed error messages */
> return printk_ratelimit();
> }
>
> Did I translate that right? If so, then the addition of the
> printk_ratelimit() to the non-buffer_quiet() buffers is an
> unchangelogged and unrelated alteration.

I will remove the mucking with of the printk_ratelimit as to not disturb
its current implementation.

> The use of printk_ratelimit() needs some thought. It shares
> ratelimiting state with all other printk_ratelimit() callsites. Was
> that desirable? Would it have been better to create a private
> ratelimit_state for buffer_heads? Per physical device? Per
> something-else?

It is unclear to me so I will refrain from touching it.

>
> > + if (unlikely (test_bit(BIO_QUIET,&bio->bi_flags)))
> > + set_bit(BH_Quiet, &bh->b_state);
>
> And the above (which has coding-style errors and has apparently not been
> checkpatched) should use set_buffer_quiet().

sorry about that :(

>
> > --- a/include/linux/buffer_head.h
> > +++ b/include/linux/buffer_head.h
> > @@ -35,6 +35,7 @@ enum bh_state_bits {
> > BH_Ordered, /* ordered write */
> > BH_Eopnotsupp, /* operation not supported (barrier) */
> > BH_Unwritten, /* Buffer is allocated on disk but not written */
> > + BH_Quiet, /* Buffer Error Prinks to be quiet */
> >
> > BH_PrivateStart,/* not a state bit, but the first bit available
> > * for private allocation by other entities
>
> Add
>
> + BUFFER_FNS(Quiet, quiet)
>
> around line 123 to generate the helper functions.

Thanks for the review and information about the helper functions. It
will be Monday before I will be able to retest/resend a new patch.

Thanks,
Keith Mannthey

2009-01-02 00:00:27

by Dave Chinner

[permalink] [raw]
Subject: Re: Defrag support for inodes / dentries etc

On Wed, Dec 31, 2008 at 05:04:04PM -0600, Christoph Lameter wrote:
> On Tue, 30 Dec 2008, Pekka Enberg wrote:
>
> > > I'll need to revert to yesterday's linux-next I think. Pekka, has
> > > that stuff been dropped?
> >
> > Yup, there hasn't been any activity after the last merge window so I'm
> > not expecting to be able to merge it for 2.6.29. As per linux-next
> > rules, I dropped the 2.6.30 material until the merge window closes.
> > The code is in topic/slub-defrag of slab.git, though.
>
> Well the problem currently is that there is no driver for these features.
> There has been a request for this functionality for a long time but both
> the primary drivers for this (Dave Chinner and I) are no longer with the
> company for which we started the project. I'd be glad to continue this
> if there would be an interest by the filesystem developers.

Certainly. The problem hasn't gone away and anyone trying to run
a busy fileserver while know of the problem....

Cheers,

Dave.
--
Dave Chinner
[email protected]