2006-03-29 07:26:14

Subject: about ll_rw_blk.c of void generic_make_request(struct bio *bio)

Dear Jens Axboe,

I am an engineer at Areca (a SATA RAID controller manufacturer).
I have been coding the Linux driver "arcmsr" for kernel.org.
I get the dump message "%s: rw=%ld, want=%Lu, limit=%Lu" with the ext2
file system.
Ext3 and all other Linux file systems work fine.
The issue occurs only on read commands.
Could you give me some information on how to fix this bug in my Linux SCSI
RAID driver?

The comment in ll_rw_blk.c says: "This may well happen - the kernel
calls bread() without checking the size of the device, e.g., when mounting a
device."

I hope you have more experience with this and can tell me what I am doing
wrong in my driver.

generic_make_request(struct bio *bio)

    if (maxsector)
    {
        sector_t sector = bio->bi_sector;

        if (maxsector < nr_sectors || maxsector - nr_sectors < sector)
        {
            /*
             * This may well happen - the kernel calls bread()
             * without checking the size of the device, e.g., when
             * mounting a device.
             */
            handle_bad_sector(bio);
            goto end_io;
        }
    }

Best Regards
Erich Chen
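
The range check quoted in the message above can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel's code: the function name beyond_end_of_device and the uint64_t stand-in for sector_t are assumptions made for illustration. It also assumes that maxsector is the device size in 512-byte sectors and that nr_sectors is the length of the request in sectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's sector_t (assumption: 64-bit, unsigned). */
typedef uint64_t sector_t;

/*
 * Returns true when a request of nr_sectors starting at sector would run
 * past the end of a device that is maxsector sectors long.
 *
 * The test is written as two comparisons so that sector + nr_sectors is
 * never computed and therefore cannot wrap around.
 */
static bool beyond_end_of_device(sector_t sector, sector_t nr_sectors,
                                 sector_t maxsector)
{
    /* A size of zero means "unknown"; the quoted code skips the check. */
    if (maxsector == 0)
        return false;

    return maxsector < nr_sectors || maxsector - nr_sectors < sector;
}
```

With a hypothetical 1000-sector device, an 8-sector request starting at sector 992 ends exactly at the last sector and passes, while the same request starting at sector 993 is rejected.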

2006-03-30 15:58:20

by Jens Axboe

Subject: Re: about ll_rw_blk.c of void generic_make_request(struct bio *bio)

On Wed, Mar 29 2006, erich wrote:
> Dear Jens Axboe,
>
> I am an engineer at Areca (a SATA RAID controller manufacturer).
> I have been coding the Linux driver "arcmsr" for kernel.org.
> I get the dump message "%s: rw=%ld, want=%Lu, limit=%Lu" with the ext2
> file system.
> Ext3 and all other Linux file systems work fine.
> The issue occurs only on read commands.
> Could you give me some information on how to fix this bug in my Linux SCSI
> RAID driver?
>
> The comment in ll_rw_blk.c says: "This may well happen - the kernel
> calls bread() without checking the size of the device, e.g., when mounting
> a device."
>
> I hope you have more experience with this and can tell me what I am doing
> wrong in my driver.
>
>
> generic_make_request(struct bio *bio)
>
>     if (maxsector)
>     {
>         sector_t sector = bio->bi_sector;
>
>         if (maxsector < nr_sectors || maxsector - nr_sectors < sector)
>         {
>             /*
>              * This may well happen - the kernel calls bread()
>              * without checking the size of the device, e.g., when
>              * mounting a device.
>              */
>             handle_bad_sector(bio);
>             goto end_io;
>         }
>     }

I can't really say. From my recollection of leafing through lkml emails, I
seem to recall someone saying he hit this with a newer kernel, whereas
the older one did not?

What exactly are the sectors it complains about, e.g. the full line you
see?

--
Jens Axboe

2006-03-31 17:32:57

by Chris Caputo

Subject: Re: about ll_rw_blk.c of void generic_make_request(struct bio *bio)

On Thu, 30 Mar 2006, Jens Axboe wrote:
> I can't really say. From my recollection of leafing through lkml emails, I
> seem to recall someone saying he hit this with a newer kernel, whereas
> the older one did not?
>
> What exactly are the sectors it complains about, e.g. the full line you
> see?

I see:

attempt to access beyond end of device
sdb1: rw=0, want=134744080, limit=128002016

Chris

2006-03-31 18:09:19

by Chris Caputo

Subject: Re: about ll_rw_blk.c of void generic_make_request(struct bio *bio)

On Fri, 31 Mar 2006, Chris Caputo wrote:
> On Thu, 30 Mar 2006, Jens Axboe wrote:
> > I can't really say. From my recollection of leafing through lkml emails, I
> > seem to recall someone saying he hit this with a newer kernel, whereas
> > the older one did not?
> >
> > What exactly are the sectors it complains about, e.g. the full line you
> > see?
>
> I see:
>
> attempt to access beyond end of device
> sdb1: rw=0, want=134744080, limit=128002016

I believe the "rw=0" means that was a simple read request, and not a
read-ahead.

128002016 equals about 62 gigs, which is the correct volume size:

Filesystem  1K-blocks     Used  Available  Use%  Mounted on
/dev/sdb1    62995364  2832696   56962620    5%  /xxx

/dev/sdb1 on /xxx type ext2 (rw,noatime)

I'm at a loss as to why ext2 would want to read 3+ gigs past the end of
the volume or why the arcmsr driver setting max_sectors to be 4096 instead
of 512 makes a difference.

Erich, while using 4096 as the max_sectors count, in your lab can you make
it so ll_rw_blk.c:handle_bad_sector() makes a call to dump_stack() after
the printk's? What does it show as the call trace?

Chris
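
The numbers in the error line above can be checked with a little arithmetic. The sketch below is only an illustration: the helper names are made up, and it assumes that want and limit are counted in 512-byte sectors. Under that assumption the limit works out to 65,537,032,192 bytes (roughly 61 GiB) and the request ends 6,742,064 sectors, or about 3.2 GiB, past the end of the partition, which matches the "3+ gigs" figure. It may also be worth noting that want=134744080 is 0x08080810 in hexadecimal; if the request were eight sectors long (an assumption, one 4 KiB block), the start sector would be 0x08080808, a repeated-byte pattern that could hint at a corrupted block number rather than a legitimate one.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the block layer reports want/limit in 512-byte sectors. */
#define SECTOR_SIZE 512ULL

/* Convert a sector count to bytes. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors * SECTOR_SIZE;
}

/* Number of sectors by which a request ending at want exceeds limit. */
static uint64_t overshoot_sectors(uint64_t want, uint64_t limit)
{
    return want > limit ? want - limit : 0;
}

/*
 * Hypothetical start sector, given the end of the request (want) and an
 * assumed request length in sectors.
 */
static uint64_t start_sector(uint64_t want, uint64_t nr_sectors)
{
    return want - nr_sectors;
}
```

Calling sectors_to_bytes(128002016) gives the partition size in bytes, and overshoot_sectors(134744080, 128002016) gives the distance past the end of the device.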

2006-03-31 20:22:04

by Jens Axboe

Subject: Re: about ll_rw_blk.c of void generic_make_request(struct bio *bio)

On Fri, Mar 31 2006, Chris Caputo wrote:
> On Fri, 31 Mar 2006, Chris Caputo wrote:
> > On Thu, 30 Mar 2006, Jens Axboe wrote:
> > > I can't really say. From my recollection of leafing through lkml emails, I
> > > seem to recall someone saying he hit this with a newer kernel, whereas
> > > the older one did not?
> > >
> > > What exactly are the sectors it complains about, e.g. the full line you
> > > see?
> >
> > I see:
> >
> > attempt to access beyond end of device
> > sdb1: rw=0, want=134744080, limit=128002016
>
> I believe the "rw=0" means that was a simple read request, and not a
> read-ahead.

Correct.

> 128002016 equals about 62 gigs, which is the correct volume size:
>
> Filesystem  1K-blocks     Used  Available  Use%  Mounted on
> /dev/sdb1    62995364  2832696   56962620    5%  /xxx
>
> /dev/sdb1 on /xxx type ext2 (rw,noatime)

How are you reproducing this: through the file system (reading files),
or by reading the device? If the former, is the file system definitely
sound, e.g. does it pass fsck?

> I'm at a loss as to why ext2 would want to read 3+ gigs past the end of
> the volume or why the arcmsr driver setting max_sectors to be 4096 instead
> of 512 makes a difference.

It's truly puzzling why the 4k vs 512 would make a difference, unless
the driver really doesn't support requests that large and corrupts the
data somehow. I'm having an extraordinarily hard time imagining how the
SCSI layer could even come up with such a bug.

So everything seems to point to us getting wrong data from the hardware,
most likely because of a driver bug in either handling the larger
transfers or the hardware just not liking them very much.

> Erich, while using 4096 as the max_sectors count, in your lab can you
> make it so ll_rw_blk.c:handle_bad_sector() makes a call to
> dump_stack() after the printk's? What does it show as the call trace?

Probably won't tell you much.

--
Jens Axboe

2006-03-31 20:38:10

by Chris Caputo

Subject: Re: about ll_rw_blk.c of void generic_make_request(struct bio *bio)

On Fri, 31 Mar 2006, Jens Axboe wrote:
> On Fri, Mar 31 2006, Chris Caputo wrote:
> > On Fri, 31 Mar 2006, Chris Caputo wrote:
> > > On Thu, 30 Mar 2006, Jens Axboe wrote:
> > > > I can't really say. From my recollection of leafing through lkml emails, I
> > > > seem to recall someone saying he hit this with a newer kernel, whereas
> > > > the older one did not?
> > > >
> > > > What exactly are the sectors it complains about, e.g. the full line you
> > > > see?
> > >
> > > I see:
> > >
> > > attempt to access beyond end of device
> > > sdb1: rw=0, want=134744080, limit=128002016
> >
> > I believe the "rw=0" means that was a simple read request, and not a
> > read-ahead.
>
> Correct.
>
> > 128002016 equals about 62 gigs, which is the correct volume size:
> >
> > Filesystem  1K-blocks     Used  Available  Use%  Mounted on
> > /dev/sdb1    62995364  2832696   56962620    5%  /xxx
> >
> > /dev/sdb1 on /xxx type ext2 (rw,noatime)
>
> How are you reproducing this: through the file system (reading files),
> or by reading the device? If the former, is the file system definitely
> sound, e.g. does it pass fsck?

Filesystem-level interaction via bonnie++. The basic repro, run as the
ccaputo user, is:

mke2fs -j -L /xxx /dev/sdb1
mount -t ext2 /dev/sdb1 /xxx
cd /xxx ; mkdir ccaputo ; chown ccaputo ccaputo ; cd ccaputo ; su ccaputo
/usr/sbin/bonnie++

Filesystem is believed to be sound since it is from a fresh mke2fs.

The one strange thing I do is that I format it as ext3 (-j) but mount it
as ext2, but I didn't think that would be an issue and I'd be surprised if
Erich is doing the same in his tests, which also fail, with ext2. (I do
it in case I later decide to mount the volume as ext3.)

Chris