2002-10-20 04:25:45

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH] work around duff ABIs


*sigh*. i hate this kind of bullshit. please, don't anyone ever try
to pass 64-bit args through the syscall interface again.

Index: fs/read_write.c
===================================================================
RCS file: /var/cvs/linux-2.5/fs/read_write.c,v
retrieving revision 1.5
diff -u -p -r1.5 read_write.c
--- fs/read_write.c 17 Oct 2002 20:43:22 -0000 1.5
+++ fs/read_write.c 20 Oct 2002 04:22:15 -0000
@@ -14,6 +14,7 @@
#include <linux/security.h>
#include <linux/module.h>

+#include <asm/byteorder.h>
#include <asm/uaccess.h>

struct file_operations generic_ro_fops = {
@@ -285,9 +286,15 @@ asmlinkage ssize_t sys_write(unsigned in
return ret;
}

-asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf,
- size_t count, loff_t pos)
+#ifdef __BIG_ENDIAN
+asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
+ unsigned int high, unsigned int low)
+#else
+asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
+ unsigned int low, unsigned int high)
+#endif
{
+ loff_t pos = (loff_t)high << 32 | low;
struct file *file;
ssize_t ret = -EBADF;

@@ -303,9 +310,15 @@ asmlinkage ssize_t sys_pread64(unsigned
return ret;
}

-asmlinkage ssize_t sys_pwrite64(unsigned int fd, const char *buf,
- size_t count, loff_t pos)
+#ifdef __BIG_ENDIAN
+asmlinkage ssize_t sys_pwrite64(unsigned int fd, const char *buf, size_t count,
+ unsigned int high, unsigned int low)
+#else
+asmlinkage ssize_t sys_pwrite64(unsigned int fd, const char *buf, size_t count,
+ unsigned int low, unsigned int high)
+#endif
{
+ loff_t pos = (loff_t)high << 32 | low;
struct file *file;
ssize_t ret = -EBADF;

Index: fs/open.c
===================================================================
RCS file: /var/cvs/linux-2.5/fs/open.c,v
retrieving revision 1.4
diff -u -p -r1.4 open.c
--- fs/open.c 17 Oct 2002 20:43:22 -0000 1.4
+++ fs/open.c 20 Oct 2002 04:22:16 -0000
@@ -18,6 +18,7 @@
#include <linux/backing-dev.h>
#include <linux/security.h>

+#include <asm/byteorder.h>
#include <asm/uaccess.h>

#define special_file(m) (S_ISCHR(m)||S_ISBLK(m)||S_ISFIFO(m)||S_ISSOCK(m))
@@ -205,18 +206,34 @@ asmlinkage long sys_ftruncate(unsigned i
return do_sys_ftruncate(fd, length, 1);
}

-/* LFS versions of truncate are only needed on 32 bit machines */
+/*
+ * LFS versions of truncate are only needed on 32 bit machines. PA-RISC
+ * and MIPS ABIs specify 64-bit alignment for 64-bit quantities, but glibc
+ * ignores this and passes 64-bit quantities in misaligned registers.
+ */
#if BITS_PER_LONG == 32
-asmlinkage long sys_truncate64(const char * path, loff_t length)
+#ifdef __BIG_ENDIAN
+asmlinkage long sys_truncate64(const char * path, unsigned int high, unsigned int low)
{
- return do_sys_truncate(path, length);
+ return do_sys_truncate(path, (loff_t)high << 32 | low);
}

-asmlinkage long sys_ftruncate64(unsigned int fd, loff_t length)
+asmlinkage long sys_ftruncate64(unsigned int fd, unsigned int high, unsigned int low)
{
- return do_sys_ftruncate(fd, length, 0);
+ return do_sys_ftruncate(fd, (loff_t)high << 32 | low, 0);
}
-#endif
+#else /* !__BIG_ENDIAN */
+asmlinkage long sys_truncate64(const char * path, unsigned int low, unsigned int high)
+{
+ return do_sys_truncate(path, (loff_t)high << 32 | low);
+}
+
+asmlinkage long sys_ftruncate64(unsigned int fd, unsigned int low, unsigned int high)
+{
+ return do_sys_ftruncate(fd, (loff_t)high << 32 | low, 0);
+}
+#endif /* !__BIG_ENDIAN */
+#endif /* BITS_PER_LONG == 32 */

#if !(defined(__alpha__) || defined(__ia64__))


--
Revolutions do not require corporate support.


2002-10-20 04:45:43

by Erik Andersen

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sun Oct 20, 2002 at 05:31:47AM +0100, Matthew Wilcox wrote:
>
> *sigh*. i hate this kind of bullshit. please, don't anyone ever try
> to pass 64-bit args through the syscall interface again.

I agree it can be a pain.

[-----------snip-------------]
> -asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf,
> - size_t count, loff_t pos)
> +#ifdef __BIG_ENDIAN
> +asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
> + unsigned int high, unsigned int low)
> +#else
> +asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
> + unsigned int low, unsigned int high)
> +#endif

Nonono. Please see __LONG_LONG_PAIR in /usr/include/endian.h.
Your user space code should be doing something like this:

static inline _syscall5(ssize_t, __syscall_pread, int, fd, void *, buf,
size_t, count, off_t, offset_hi, off_t, offset_lo);

ssize_t __libc_pread(int fd, void *buf, size_t count, off_t offset)
{
return(__syscall_pread(fd,buf,count,__LONG_LONG_PAIR((off_t)0,offset)));
}

ssize_t __libc_pread64(int fd, void *buf, size_t count, off64_t offset)
{
return(__syscall_pread(fd, buf, count,
__LONG_LONG_PAIR((off_t)(offset>>32),
(off_t)(offset&0xffffffff))));
}

Your patch is going to break GNU libc, uClibc, and anyone else in
userspace that is doing pread and pread64 correctly....

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

2002-10-20 05:43:26

by John Levon

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sun, Oct 20, 2002 at 05:31:47AM +0100, Matthew Wilcox wrote:

> *sigh*. i hate this kind of bullshit. please, don't anyone ever try
> to pass 64-bit args through the syscall interface again.

I was about to do exactly that ...

> + * LFS versions of truncate are only needed on 32 bit machines. PA-RISC
> + * and MIPS ABIs specify 64-bit alignment for 64-bit quantities, but glibc
> + * ignores this and passes 64-bit quantities in misaligned registers.

Isn't glibc broken ?

regards
john

--
"It's a cardboard universe ... and if you lean too hard against it, you fall
through."
- Philip K. Dick

2002-10-20 12:45:07

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sat, Oct 19, 2002 at 10:51:49PM -0600, Erik Andersen wrote:
> Nonono. Please see __LONG_LONG_PAIR in /usr/include/endian.h.
> Your user space code should be doing something like this:
>
> static inline _syscall5(ssize_t, __syscall_pread, int, fd, void *, buf,
> size_t, count, off_t, offset_hi, off_t, offset_lo);
>
> ssize_t __libc_pread(int fd, void *buf, size_t count, off_t offset)
> {
> return(__syscall_pread(fd,buf,count,__LONG_LONG_PAIR((off_t)0,offset)));
> }
>
> ssize_t __libc_pread64(int fd, void *buf, size_t count, off64_t offset)
> {
> return(__syscall_pread(fd, buf, count,
> __LONG_LONG_PAIR((off_t)(offset>>32),
> (off_t)(offset&0xffffffff))));
> }
>
> Your patch is going to break GNU libc, uClibc, and anyone else in
> userspace that is doing pread and pread64 correctly....

Well... since you just proved that you don't understand the problem,
I'd hazard a guess that uClibc is also broken, as well as glibc.

Here's how the ABI specifies that pread() shall take its arguments:

asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
loff_t pos)

fd arg0
buf arg1
count arg2
pos arg4 & arg5

Here's what __LONG_LONG_PAIR does:

fd arg0
buf arg1
count arg2
pos(HI) arg3
pos(LO) arg4

--
Revolutions do not require corporate support.

2002-10-21 11:53:01

by Alan

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sun, 2002-10-20 at 13:51, Matthew Wilcox wrote:
> asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
> loff_t pos)
>

Which ABI. If its the hppa ABI then its however you specified it and
however your call setup/return code handles it. If that needs glue and
your own syscall vectors calling sys_pread64 indirectly BFD.

2002-10-21 11:52:04

by Alan

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sun, 2002-10-20 at 05:31, Matthew Wilcox wrote:
>
> *sigh*. i hate this kind of bullshit. please, don't anyone ever try
> to pass 64-bit args through the syscall interface again.

Please bury this crap in arch/hppa/


2002-10-21 13:45:00

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Mon, Oct 21, 2002 at 01:13:39PM +0100, Alan Cox wrote:
> On Sun, 2002-10-20 at 05:31, Matthew Wilcox wrote:
> >
> > *sigh*. i hate this kind of bullshit. please, don't anyone ever try
> > to pass 64-bit args through the syscall interface again.
>
> Please bury this crap in arch/hppa/

and arch/mips?

--
Revolutions do not require corporate support.

2002-10-21 16:04:36

by Alan

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Mon, 2002-10-21 at 14:48, Matthew Wilcox wrote:
> On Mon, Oct 21, 2002 at 01:13:39PM +0100, Alan Cox wrote:
> > On Sun, 2002-10-20 at 05:31, Matthew Wilcox wrote:
> > >
> > > *sigh*. i hate this kind of bullshit. please, don't anyone ever try
> > > to pass 64-bit args through the syscall interface again.
> >
> > Please bury this crap in arch/hppa/
>
> and arch/mips?

IMHO yes

2002-10-22 04:37:00

by Erik Andersen

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sun Oct 20, 2002 at 01:51:09PM +0100, Matthew Wilcox wrote:
> On Sat, Oct 19, 2002 at 10:51:49PM -0600, Erik Andersen wrote:
> > Nonono. Please see __LONG_LONG_PAIR in /usr/include/endian.h.
> > Your user space code should be doing something like this:
> >
> > static inline _syscall5(ssize_t, __syscall_pread, int, fd, void *, buf,
> > size_t, count, off_t, offset_hi, off_t, offset_lo);
> >
> > ssize_t __libc_pread(int fd, void *buf, size_t count, off_t offset)
> > {
> > return(__syscall_pread(fd,buf,count,__LONG_LONG_PAIR((off_t)0,offset)));
> > }
> >
> > ssize_t __libc_pread64(int fd, void *buf, size_t count, off64_t offset)
> > {
> > return(__syscall_pread(fd, buf, count,
> > __LONG_LONG_PAIR((off_t)(offset>>32),
> > (off_t)(offset&0xffffffff))));
> > }
> >
> > Your patch is going to break GNU libc, uClibc, and anyone else in
> > userspace that is doing pread and pread64 correctly....
>
> Well... since you just proved that you don't understand the problem,
> I'd hazard a guess that uClibc is also broken, as well as glibc.

I understand the problem very well. Passing 64 bit stuff via
syscalls is a major pain in the butt. But your patch is not just
changing hppa and mips -- you are breaking the ABI on x86, arm,
powerpc, etc, etc. etc where it is currently working. Look very
closely at your patch. See those endianness ifdefs? You are
adding endianness specific ifdefs into pread, truncate, and
ftruncate to switch the argument order. User space is already
doing that too. At no time on any architecture is the low stuff
passed into arg3. Ergo, your patch is going to break userspace
where pread and pread64 are now working correctly....

If you want to change the kernel to passing eliminate 64 bit
stuff via syscalls, and instead pass pairs of 32bit entities --
I'm all for that as that would make explicit what user space is
doing anyways. But don't break binary compatibility for no
reason. Why make both user-space _and_ kernel space add ifdefs
for endianness? Make arg3 _always_ contain the hi-bits.

-Erik

--
Erik B. Andersen http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

2002-10-22 12:58:21

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Mon, Oct 21, 2002 at 10:43:10PM -0600, Erik Andersen wrote:
> I understand the problem very well. Passing 64 bit stuff via
> syscalls is a major pain in the butt. But your patch is not just
> changing hppa and mips -- you are breaking the ABI on x86, arm,
> powerpc, etc, etc. etc where it is currently working. Look very
> closely at your patch. See those endianness ifdefs? You are
> adding endianness specific ifdefs into pread, truncate, and
> ftruncate to switch the argument order. User space is already
> doing that too. At no time on any architecture is the low stuff
> passed into arg3. Ergo, your patch is going to break userspace
> where pread and pread64 are now working correctly....

Nope. Some architectures _do not_ pad 64-bit arguments, others _do_.
On ARM/x86, we currently do use arg3 for the low part of the argument,
but on PPC we use it for the high part because of this sexswapping
crap being done in userspace.

> If you want to change the kernel to passing eliminate 64 bit
> stuff via syscalls, and instead pass pairs of 32bit entities --
> I'm all for that as that would make explicit what user space is
> doing anyways. But don't break binary compatibility for no
> reason. Why make both user-space _and_ kernel space add ifdefs
> for endianness? Make arg3 _always_ contain the hi-bits.

I'd love to move to that model. But I suspect we need a consensus to
_never_ pass 64-bit quantities across the syscall boundary, and we
aren't going to get it. So we're going to add more crap every time
someone adds a loff_t ;-(

--
Revolutions do not require corporate support.

2002-11-04 05:04:26

by Richard Henderson

[permalink] [raw]
Subject: Re: [PATCH] work around duff ABIs

On Sun, Oct 20, 2002 at 05:31:47AM +0100, Matthew Wilcox wrote:
> -asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf,
> - size_t count, loff_t pos)
> +#ifdef __BIG_ENDIAN
> +asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
> + unsigned int high, unsigned int low)
> +#else
> +asmlinkage ssize_t sys_pread64(unsigned int fd, char *buf, size_t count,
> + unsigned int low, unsigned int high)
> +#endif

Broken for actual 64-bit targets.

I'd suggest frobbing the arguments in arch specific code
for those 32-bit targets that do pad 64-bit arguments to
the next even/odd register. I disbelieve you can do
anything cleanly here.


r~