2004-03-01 15:53:00

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: Ibm Serveraid Problem with 2.4.25


On Mon, 1 Mar 2004, Alain Fauconnet wrote:

> Please followup to the list when you've drawn conclusions about this
> issue.
> I'm holding a 2.4.25 upgrade of a busy mail server because of this.

We found out the guilty part, its the readahead modifications done by
Chuck Lever in -pre1.

The error is harmless though, you can safely move your server to 2.4.25.
Its just trying to readahead a page beyond the end device.

The following untested patch on top of 2.4.25 should fix it:

--- mm/filemap.c.orig 2004-03-01 12:47:41.000000000 -0300
+++ mm/filemap.c 2004-03-01 12:48:30.000000000 -0300
@@ -1346,7 +1346,7 @@
while (ahead < max_ahead) {
unsigned long ra_index = raend + ahead + 1;

- if (ra_index > end_index)
+ if (ra_index >= end_index)
break;
if (page_cache_read(filp, ra_index) < 0)
break;


2004-03-01 17:24:06

by Chuck Lever

[permalink] [raw]
Subject: Re: Ibm Serveraid Problem with 2.4.25

hi marcelo-

your "fix" will break readahead again for NFS. with the ">=" as you
propose, the read ahead code will never be able to read the last page of
the file as a coalesced read, it will always be a separate 4KB read.

the problem is not the readahead code, it is the driver code that tries
to read beyond the end of the device. my change merely exposed this
misbehavior.

so there is a broken assumption somewhere about how the index of the last
page of a file/device is computed. i think it is a problem when the file
ends exactly on a page boundary.

alain, if you don't use the NFS client, marcelo's fix should work just
fine for you. but i believe that in general it is incorrect.


On Mon, 1 Mar 2004, Marcelo Tosatti wrote:

>
> On Mon, 1 Mar 2004, Alain Fauconnet wrote:
>
> > Please followup to the list when you've drawn conclusions about this
> > issue.
> > I'm holding a 2.4.25 upgrade of a busy mail server because of this.
>
> We found out the guilty part, its the readahead modifications done by
> Chuck Lever in -pre1.
>
> The error is harmless though, you can safely move your server to 2.4.25.
> Its just trying to readahead a page beyond the end device.
>
> The following untested patch on top of 2.4.25 should fix it:
>
> --- mm/filemap.c.orig 2004-03-01 12:47:41.000000000 -0300
> +++ mm/filemap.c 2004-03-01 12:48:30.000000000 -0300
> @@ -1346,7 +1346,7 @@
> while (ahead < max_ahead) {
> unsigned long ra_index = raend + ahead + 1;
>
> - if (ra_index > end_index)
> + if (ra_index >= end_index)
> break;
> if (page_cache_read(filp, ra_index) < 0)
> break;
>
>

- Chuck Lever
--
corporate: <cel at netapp dot com>
personal: <chucklever at bigfoot dot com>

2004-03-01 18:24:35

by Marcelo Tosatti

[permalink] [raw]
Subject: Re: Ibm Serveraid Problem with 2.4.25



On Mon, 1 Mar 2004, Chuck Lever wrote:

> hi marcelo-
>
> your "fix" will break readahead again for NFS. with the ">=" as you
> propose, the read ahead code will never be able to read the last page of
> the file as a coalesced read, it will always be a separate 4KB read.
>
> the problem is not the readahead code, it is the driver code that tries
> to read beyond the end of the device. my change merely exposed this
> misbehavior.
>
> so there is a broken assumption somewhere about how the index of the last
> page of a file/device is computed. i think it is a problem when the file
> ends exactly on a page boundary.
>
> alain, if you don't use the NFS client, marcelo's fix should work just
> fine for you. but i believe that in general it is incorrect.

Okey, most drivers do no exhibit this problem indeed.

We should try to fix the problematic drivers, then.

If we can't do it easily and in a straightforward manner, I'm afraid we
will have to undo your change because even if the "read end beyond of
device" accesses are harmless (are they really harmless?), they must
fixed.

Agreed?

I'll take a look at them later today, but I'm no expert, so help is very
appreciated.

We know that these have problems:

- Promise ATA
- ips (serveraid)



2004-03-01 18:38:17

by Enrico Demarin

[permalink] [raw]
Subject: Re: Ibm Serveraid Problem with 2.4.25

Hi Marcelo,

Add mpt fusion driver to the list, it exhibits the problem as well.

- Enrico

On Mon, 2004-03-01 at 13:23, Marcelo Tosatti wrote:
> On Mon, 1 Mar 2004, Chuck Lever wrote:
>
> > hi marcelo-
> >
> > your "fix" will break readahead again for NFS. with the ">=" as you
> > propose, the read ahead code will never be able to read the last page of
> > the file as a coalesced read, it will always be a separate 4KB read.
> >
> > the problem is not the readahead code, it is the driver code that tries
> > to read beyond the end of the device. my change merely exposed this
> > misbehavior.
> >
> > so there is a broken assumption somewhere about how the index of the last
> > page of a file/device is computed. i think it is a problem when the file
> > ends exactly on a page boundary.
> >
> > alain, if you don't use the NFS client, marcelo's fix should work just
> > fine for you. but i believe that in general it is incorrect.
>
> Okey, most drivers do no exhibit this problem indeed.
>
> We should try to fix the problematic drivers, then.
>
> If we can't do it easily and in a straightforward manner, I'm afraid we
> will have to undo your change because even if the "read end beyond of
> device" accesses are harmless (are they really harmless?), they must
> fixed.
>
> Agreed?
>
> I'll take a look at them later today, but I'm no expert, so help is very
> appreciated.
>
> We know that these have problems:
>
> - Promise ATA
> - ips (serveraid)
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2004-03-01 18:40:36

by Chuck Lever

[permalink] [raw]
Subject: Re: Ibm Serveraid Problem with 2.4.25

On Mon, 1 Mar 2004, Marcelo Tosatti wrote:

> On Mon, 1 Mar 2004, Chuck Lever wrote:
>
> > hi marcelo-
> >
> > your "fix" will break readahead again for NFS. with the ">=" as you
> > propose, the read ahead code will never be able to read the last page of
> > the file as a coalesced read, it will always be a separate 4KB read.
> >
> > the problem is not the readahead code, it is the driver code that tries
> > to read beyond the end of the device. my change merely exposed this
> > misbehavior.
> >
> > so there is a broken assumption somewhere about how the index of the last
> > page of a file/device is computed. i think it is a problem when the file
> > ends exactly on a page boundary.
> >
> > alain, if you don't use the NFS client, marcelo's fix should work just
> > fine for you. but i believe that in general it is incorrect.
>
> Okey, most drivers do no exhibit this problem indeed.
>
> We should try to fix the problematic drivers, then.

it may very well be that most drivers don't complain about reading past
the end of the device.

some have suggested that the end_index computation at the top of
generic_file_readahead is incorrect. that is not beyond the pale, and i
can take a look at that.

this calculation:

end_index = inode->i_size >> PAGE_CACHE_SHIFT;

i think the case where there are bytes in that last page work as i
intended. the problem is that sometimes the end_index value points to a
page with zero bytes (for example when the file is exactly an integral
number of pages in length; this is always the case for most devices).

i_size = 4095 end_index = 0 total_pages_in_file = 1
i_size = 4096 end_index = 1 total_pages_in_file = 1
i_size = 4097 end_index = 1 total_pages_in_file = 2

so right at the boundary we have a little problem.

> If we can't do it easily and in a straightforward manner, I'm afraid we
> will have to undo your change because even if the "read end beyond of
> device" accesses are harmless (are they really harmless?), they must
> fixed.
>
> Agreed?

understood, and thanks for the giving me the opportunity to try to help
you fix it.

> I'll take a look at them later today, but I'm no expert, so help is very
> appreciated.
>
> We know that these have problems:
>
> - Promise ATA
> - ips (serveraid)

i don't have either of those. can anyone who is willing to test a patch
on this hardware e-mail me so we can try this out?

- Chuck Lever
--
corporate: <cel at netapp dot com>
personal: <chucklever at bigfoot dot com>

2004-03-02 05:18:27

by Chuck Lever

[permalink] [raw]
Subject: Re: Ibm Serveraid Problem with 2.4.25

hi marcelo-

this patch seems to address the extra read at the end of files or devices
that end exactly on a page boundary, while retaining the good readahead
characteristics of my original patch. i'd like to hear about any testing
from folks who have one of:

mpt fusion
ips (ibm serveraid)
promise ata (and maybe sata too)

this patch is against 2.4.26-pre1.


diff -X ../dont-diff -Naurp 00-stock/mm/filemap.c 01-readahead/mm/filemap.c
--- 00-stock/mm/filemap.c 2004-02-18 08:36:32.000000000 -0500
+++ 01-readahead/mm/filemap.c 2004-03-02 00:07:59.000000000 -0500
@@ -1285,7 +1285,8 @@ static void generic_file_readahead(int r
unsigned long raend;
int max_readahead = get_max_readahead(inode);

- end_index = inode->i_size >> PAGE_CACHE_SHIFT;
+ end_index = ((inode->i_size + ~PAGE_CACHE_MASK) >>
+ PAGE_CACHE_SHIFT) - 1;

raend = filp->f_raend;
max_ahead = 0;


- Chuck Lever
--
corporate: <cel at netapp dot com>
personal: <chucklever at bigfoot dot com>