2002-07-29 18:23:05

by Badari Pulavarty

[permalink] [raw]
Subject: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)

Hi,

Here is a patch to fix small bug in for vfs_read/vfs_write.

Please apply.

Thanks,
Badari

--- linux-2.5.29/fs/read_write.c Mon Jul 29 11:07:32 2002
+++ linux.new/fs/read_write.c Mon Jul 29 11:07:57 2002
@@ -184,7 +184,7 @@
return -EBADF;
if (!file->f_op || !file->f_op->read)
return -EINVAL;
- if (pos < 0)
+ if (*pos < 0)
return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_READ, inode, file, *pos, count);
@@ -209,7 +209,7 @@
return -EBADF;
if (!file->f_op || !file->f_op->write)
return -EINVAL;
- if (pos < 0)
+ if (*pos < 0)
return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_WRITE, inode, file, *pos, count);


2002-07-29 19:02:48

by Paul Larson

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)

On Mon, 2002-07-29 at 13:25, Badari Pulavarty wrote:
> Hi,
>
> Here is a patch to fix small bug in for vfs_read/vfs_write.
>
> Please apply.
This is actually one of the same issues I was looking at this morning.
I noticed that the Linux Test Project tests pread02 and pwrite02 were
failing because of this. However I also had to do some typecasting to
make it work correctly. I'm not sure this is the best way to do it, but
without the typecasting, the tests still fail.

-Paul Larson

diff -Nru a/fs/read_write.c b/fs/read_write.c
--- a/fs/read_write.c Mon Jul 29 14:48:45 2002
+++ b/fs/read_write.c Mon Jul 29 14:48:45 2002
@@ -185,7 +185,7 @@
return -EBADF;
if (!file->f_op || !file->f_op->read)
return -EINVAL;
- if (pos < 0)
+ if ((int)*pos < 0)
return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_READ, inode, file, *pos, count);
@@ -210,7 +210,7 @@
return -EBADF;
if (!file->f_op || !file->f_op->write)
return -EINVAL;
- if (pos < 0)
+ if ((int)*pos < 0)
return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_WRITE, inode, file, *pos, count);

2002-07-29 20:08:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)

In article <[email protected]>,
Badari Pulavarty <[email protected]> wrote:
>
>Here is a patch to fix small bug in for vfs_read/vfs_write.

Hmm. The patch is bogus, but that's not your fault, looking at the code
the patch is no more bogus than the existing code already is.

The fact is, the test for a negative "pos" should not be in
vfs_read/write at all, since it can only happen for pread/pwrite.

And pread/pwrite do not even _take_ a "loff_t" argument, they take a
"off_t", and yet we've just happily claiming they do a loff_t, which
means that they shouldn't work at all unless by pure changce user space
happens to put a zero in memory in the right place.

Cristoph, I think you're the one that did this re-org. I think the code
is wrong, and the right fix is something along these lines (untested,
you get brownie-points for testing against some standards test).

Linus


----
===== fs/read_write.c 1.12 vs edited =====
--- 1.12/fs/read_write.c Sat Jul 27 08:21:19 2002
+++ edited/fs/read_write.c Mon Jul 29 12:51:09 2002
@@ -185,8 +185,6 @@
return -EBADF;
if (!file->f_op || !file->f_op->read)
return -EINVAL;
- if (pos < 0)
- return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_READ, inode, file, *pos, count);
if (!ret) {
@@ -210,8 +208,6 @@
return -EBADF;
if (!file->f_op || !file->f_op->write)
return -EINVAL;
- if (pos < 0)
- return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_WRITE, inode, file, *pos, count);
if (!ret) {
@@ -255,14 +251,18 @@
}

asmlinkage ssize_t sys_pread(unsigned int fd, char *buf,
- size_t count, loff_t pos)
+ size_t count, off_t pos)
{
struct file *file;
ssize_t ret = -EBADF;

+ if (pos < 0)
+ return -EINVAL;
+
file = fget(fd);
if (file) {
- ret = vfs_read(file, buf, count, &pos);
+ loff_t lpos = pos;
+ ret = vfs_read(file, buf, count, &lpos);
fput(file);
}

@@ -270,14 +270,18 @@
}

asmlinkage ssize_t sys_pwrite(unsigned int fd, const char *buf,
- size_t count, loff_t pos)
+ size_t count, off_t pos)
{
struct file *file;
ssize_t ret = -EBADF;

+ if (pos < 0)
+ return -EINVAL;
+
file = fget(fd);
if (file) {
- ret = vfs_write(file, buf, count, &pos);
+ loff_t lpos = pos;
+ ret = vfs_write(file, buf, count, &lpos);
fput(file);
}


2002-07-29 21:06:56

by Paul Larson

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)

On Mon, 2002-07-29 at 15:11, Linus Torvalds wrote:

> The fact is, the test for a negative "pos" should not be in
> vfs_read/write at all, since it can only happen for pread/pwrite.
That's where LTP has been seeing it fail.

> And pread/pwrite do not even _take_ a "loff_t" argument, they take a
> "off_t", and yet we've just happily claiming they do a loff_t, which
> means that they shouldn't work at all unless by pure changce user space
> happens to put a zero in memory in the right place.
>
> Cristoph, I think you're the one that did this re-org. I think the code
> is wrong, and the right fix is something along these lines (untested,
> you get brownie-points for testing against some standards test).
>
> Linus

This passes all the LTP pread and pwrite tests.

Thanks,
Paul Larson
Linux Test Project
http://ltp.sourceforge.net

2002-07-29 21:19:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)


On 29 Jul 2002, Paul Larson wrote:
>
> This passes all the LTP pread and pwrite tests.

Christoph claims that the kernel pread/pwrite has never been a
32-bitinterface at all, but has always been really a pread64/pwrite64.

Which would make my patch do the wrong thing on big-endian machines, and
would also break any apps that actually used it with loff_t.

If so, the bug is actually in user space, and the real fix on a kernel
level is to _document_ the fact that "sys_pread()" isn't actually the same
as the regular pread() system call. Done by renaming it to "pread64()"
internally, like this..

Does this work for you? If not, that implies that glibc may be missing a

if (pos < 0) {
errno = EINVAL;
return -1;
}

in its implementation of the pread/pwrite shim layer.

(Or maybe glibc doesn't know that the kernel pread/pwrite system calls
were always 64-bit clean, and it just happened to work).

Linus

-----
===== arch/i386/kernel/entry.S 1.38 vs edited =====
--- 1.38/arch/i386/kernel/entry.S Fri Jul 26 00:57:48 2002
+++ edited/arch/i386/kernel/entry.S Mon Jul 29 14:09:51 2002
@@ -689,8 +689,8 @@
.long sys_rt_sigtimedwait
.long sys_rt_sigqueueinfo
.long sys_rt_sigsuspend
- .long sys_pread /* 180 */
- .long sys_pwrite
+ .long sys_pread64 /* 180 */
+ .long sys_pwrite64
.long sys_chown16
.long sys_getcwd
.long sys_capget
===== fs/read_write.c 1.12 vs edited =====
--- 1.12/fs/read_write.c Sat Jul 27 08:21:19 2002
+++ edited/fs/read_write.c Mon Jul 29 14:08:55 2002
@@ -185,8 +185,6 @@
return -EBADF;
if (!file->f_op || !file->f_op->read)
return -EINVAL;
- if (pos < 0)
- return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_READ, inode, file, *pos, count);
if (!ret) {
@@ -210,8 +208,6 @@
return -EBADF;
if (!file->f_op || !file->f_op->write)
return -EINVAL;
- if (pos < 0)
- return -EINVAL;

ret = locks_verify_area(FLOCK_VERIFY_WRITE, inode, file, *pos, count);
if (!ret) {
@@ -254,12 +250,15 @@
return ret;
}

-asmlinkage ssize_t sys_pread(unsigned int fd, char *buf,
+asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf,
size_t count, loff_t pos)
{
struct file *file;
ssize_t ret = -EBADF;

+ if (pos < 0)
+ return -EINVAL;
+
file = fget(fd);
if (file) {
ret = vfs_read(file, buf, count, &pos);
@@ -269,11 +268,14 @@
return ret;
}

-asmlinkage ssize_t sys_pwrite(unsigned int fd, const char *buf,
+asmlinkage ssize_t sys_pwrite64(unsigned int fd, const char *buf,
size_t count, loff_t pos)
{
struct file *file;
ssize_t ret = -EBADF;
+
+ if (pos < 0)
+ return -EINVAL;

file = fget(fd);
if (file) {

2002-07-29 22:07:06

by Paul Larson

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)

On Mon, 2002-07-29 at 16:22, Linus Torvalds wrote:
> Does this work for you? If not, that implies that glibc may be missing a
>
> if (pos < 0) {
> errno = EINVAL;
> return -1;
> }
>
> in its implementation of the pread/pwrite shim layer.
>
> (Or maybe glibc doesn't know that the kernel pread/pwrite system calls
> were always 64-bit clean, and it just happened to work).
>
> Linus
No, with just this patch it still fails on pread02 and pwrite02.

# ./pread02
pread02 1 PASS : pread() fails, file descriptor is a PIPE or
FIFO, errno:29
pread02 2 FAIL : pread() returned 0, expected -1, errno:22

# ./pwrite02
pwrite02 1 PASS : file descriptor is a PIPE or FIFO, errno:29
caught SIGXFSZ
pwrite02 2 FAIL : specified offset is -ve or invalid, unexpected
errno:27, expected:22
pwrite02 3 PASS : file descriptor is bad, errno:9

Paul Larson
Linux Test Project
http://ltp.sourceforge.net

2002-07-29 22:15:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)


On 29 Jul 2002, Paul Larson wrote:
> >
> > (Or maybe glibc doesn't know that the kernel pread/pwrite system calls
> > were always 64-bit clean, and it just happened to work).
?
> No, with just this patch it still fails on pread02 and pwrite02.

Ok, that just means that it's a library issue now - having looked at
historical kernel behaviour (and a lot of 64/bit architectures emulating
their old 32/bit system calls), the kernel system call interface is
clearly a 64-bit value, ie the kernel only export pread64/pwrite64, not a
"traditional" pread/pwrite at all.

So the question is what the library should do with a 32-bit negative
"offset_t" passed in to the user-level "pread()" implementation.

Looking at the disassembly of glibc pread, the implementation seems to be
to just clear %edx on x86 (which are the high bits of the 64-bit offset
value passed into sys_pread64()).

And equally clearly your test wants to get EINVAL.

Your test would pass if glibc just sign-extended the 32-bit value to 64
bits, instead of zero-extending it.

Alternativealy, your test would pass if glibc just internally checked for
a negative offset.

I think the sign-extending sounds like the right idea, but that will
obviously break applications that _want_ to pread() in the 2GB-4GB address
without using pread64(). Something that sounds unlikely to be an issue, of
course (and which should have failed with -EBIG at open time anyway)

Linus

2002-07-30 07:34:36

by Andreas Schwab

[permalink] [raw]
Subject: Re: [PATCH] vfs_read/vfs_write small bug fix (2.5.29)

Linus Torvalds <[email protected]> writes:

|> On 29 Jul 2002, Paul Larson wrote:
|> > >
|> > > (Or maybe glibc doesn't know that the kernel pread/pwrite system calls
|> > > were always 64-bit clean, and it just happened to work).
|> ?
|> > No, with just this patch it still fails on pread02 and pwrite02.
|>
|> Ok, that just means that it's a library issue now - having looked at
|> historical kernel behaviour (and a lot of 64/bit architectures emulating
|> their old 32/bit system calls), the kernel system call interface is
|> clearly a 64-bit value, ie the kernel only export pread64/pwrite64, not a
|> "traditional" pread/pwrite at all.
|>
|> So the question is what the library should do with a 32-bit negative
|> "offset_t" passed in to the user-level "pread()" implementation.
|>
|> Looking at the disassembly of glibc pread, the implementation seems to be
|> to just clear %edx on x86 (which are the high bits of the 64-bit offset
|> value passed into sys_pread64()).
|>
|> And equally clearly your test wants to get EINVAL.
|>
|> Your test would pass if glibc just sign-extended the 32-bit value to 64
|> bits, instead of zero-extending it.

Yes, this was a bug in glibc, it has been fixed already in CVS.

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 N?rnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."