Btw, thinking more about the autofs patch, I realized that despite it
all working well for Thomas in his case, it's fundamentally wrong.
And it's not fundamentally wrong because of any ambiguities about the
size of the structure: that structure is clearly 304 bytes on x86-64
(and most other platforms, buth 32-bit and 64-bit), but it's 300 bytes
on x86-32 and m68k.
No, the problem is that "is_compat_task()" is not the right check.
It's not the task that *waits* for autofs that matters, it's that damn
autofs daemon task.
IOW, what we actually want to test is whether the other end of that
autofs sbi->pipe is a compat task or not.
And I have no idea how to do that. Can I assume that whoever does the
original "mount()" system call is the daemon? It needs to have that
pipe somehow.. Is there something that the daemon does early on that
we can use to capture whether the daemon is a compat task or not?
Ian, Peter, anybody who knows autofs? Is perhaps one of the ioctl's
always done by the daemon, where we could then use "is_compat_task()"
at that point to figure out whether it is going to expect the 300-byte
packet or the 304-byte packet?
We could just initialize sbi->v5_packet_size to the plain sizeof(),
but when we see that ioctl and realize that the daemon is a x86-32
binary we'd reset the packet size to 300.
Anyway, here's the patch again with a long explanation, but with a
"THIS IS WRONG" comment in the code, and an explanation in the commit
log. It works for Thomas, but it works for the wrong reasons - in his
setup, all binaries are compat binaries, so "is_compat_task()" just
happens to get the right value for the daemon too. But if you have a
mixture of binaries, you might get the autofs *request* in a compat
binary while the daemon is a 64-bit native x86-64 binary, or the other
way around, and then this patch would use the wrong packet size to
communicate with the daemon.
Hmm?
Linus
From: Linus Torvalds <[email protected]>
Date: Tue, 21 Feb 2012 18:24:10 -0800
> No, the problem is that "is_compat_task()" is not the right check.
> It's not the task that *waits* for autofs that matters, it's that damn
> autofs daemon task.
>
> IOW, what we actually want to test is whether the other end of that
> autofs sbi->pipe is a compat task or not.
>
> And I have no idea how to do that.
It's just a real fd, and there is no way to tell the compat'ness for
that. The mount operation literally passes in an integer attribute as
the pipefd mount option, and that's what it seems to use to send these
events.
And that fd can be dup()'d, exec()'d into compat and non-compat tasks,
passed around as SCM credentials between arbitrary processes, etc.
It's a real mess.
The only way to fix this cess pool completely is to override the
read() fop on that pipe, and translate the event stream in-situ.
What we could do is just manage the autofs messages as a linked list
of events, f.e. the packets in native format, then in the overridden
read() handler we either pass it along as is (for non-compat tasks) or
translate to compat format and copy that to userspace instead.
On Tue, Feb 21, 2012 at 7:16 PM, David Miller <[email protected]> wrote:
>
> It's just a real fd, and there is no way to tell the compat'ness for
> that. ?The mount operation literally passes in an integer attribute as
> the pipefd mount option, and that's what it seems to use to send these
> events.
Sure.
But I'm sure that the autofs daemon does something simple that we can
depend on in practice. It's fine if we default to the current "native
size" but then just have some heuristic that notices "oh, but we seem
to be running a compat daemon". After all, this is very much a special
case, I don't think we need to worry about people doing crazy things
in *general*, the only thing we need to worry about is a legacy x86-32
install that has been updated with a 64-bit kernel.
Where is that autofs daemon source code so that I can see what it does?
> The only way to fix this cess pool completely is to override the
> read() fop on that pipe, and translate the event stream in-situ.
>
> What we could do is just manage the autofs messages as a linked list
> of events, f.e. the packets in native format, then in the overridden
> read() handler we either pass it along as is (for non-compat tasks) or
> translate to compat format and copy that to userspace instead.
Sure. But I just don't think we need to fix the general case.
Linus
On Tue, Feb 21, 2012 at 7:33 PM, Linus Torvalds
<[email protected]> wrote:
>
> But I'm sure that the autofs daemon does something simple that we can
> depend on in practice.
I didn't find the most recent source code, but the first thing that
automount.c does after calling "mount" (which is a for+exec, so we
cannot take the mounting is_compat_task() into account is to open the
new root for the ioctlfd, and then it does a stat.
And then it does a AUTOFS_IOC_PROTOVER ioctl to see what the protocol
version is.
So we could just decide that
(a) we add a mount option for the packet size (or just "v6" - which
would be "v5 with a fixed packet size")
(b) in the absence of an explicit mount option, we look at
is_compat_task() for the first AUTOFS_IOC_PROTOVER ioctl we get.
That looks fairly straightforward and safe. Hmm?
Where are the automount sources supposed to be, anyway? kernel.org has
a v5 directory, but it's empty.
Linus
On Tue, 2012-02-21 at 19:47 -0800, Linus Torvalds wrote:
> On Tue, Feb 21, 2012 at 7:33 PM, Linus Torvalds
> <[email protected]> wrote:
> >
> > But I'm sure that the autofs daemon does something simple that we can
> > depend on in practice.
>
> I didn't find the most recent source code, but the first thing that
> automount.c does after calling "mount" (which is a for+exec, so we
> cannot take the mounting is_compat_task() into account is to open the
> new root for the ioctlfd, and then it does a stat.
>
> And then it does a AUTOFS_IOC_PROTOVER ioctl to see what the protocol
> version is.
>
> So we could just decide that
>
> (a) we add a mount option for the packet size (or just "v6" - which
> would be "v5 with a fixed packet size")
>
> (b) in the absence of an explicit mount option, we look at
> is_compat_task() for the first AUTOFS_IOC_PROTOVER ioctl we get.
>
> That looks fairly straightforward and safe. Hmm?
>
> Where are the automount sources supposed to be, anyway? kernel.org has
> a v5 directory, but it's empty.
I'm repopulating the autofs directory with content right now.
I had some difficulty with getting appropriate signatures for my PGP
key, consequently my account wasn't activated for quite. The upshot of
that is I wasn't aware of announcements about kup either. Anyway, I've
sorted that out and I'm trying to upload the content right now.
The git repo has been available for a while now though @
/pub/scm/linux/kernel/git/raven/autofs
at least that is the place I thought it should go?
Ian
On Tue, Feb 21, 2012 at 7:47 PM, Linus Torvalds
<[email protected]> wrote:
>
> And then it does a AUTOFS_IOC_PROTOVER ioctl to see what the protocol
> version is.
>
> So we could just decide that
>
> ?(a) we add a mount option for the packet size (or just "v6" - which
> would be "v5 with a fixed packet size")
>
> ?(b) in the absence of an explicit mount option, we look at
> is_compat_task() for the first AUTOFS_IOC_PROTOVER ioctl we get.
Ok, so here's a patch that does that. Well, it doesn't do that (a)
part, but it does the auto-detection of whether the daemon is a compat
process or not.
THIS IS UNTESTED. Thomas - could you test this in your environment? I
added your "Tested-by:", but that's really for the previous patch that
tested is_compat_task() in the wrong context.
Linus
On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
> On Tue, Feb 21, 2012 at 7:47 PM, Linus Torvalds
> <[email protected]> wrote:
> >
> > And then it does a AUTOFS_IOC_PROTOVER ioctl to see what the protocol
> > version is.
> >
> > So we could just decide that
> >
> > (a) we add a mount option for the packet size (or just "v6" - which
> > would be "v5 with a fixed packet size")
> >
> > (b) in the absence of an explicit mount option, we look at
> > is_compat_task() for the first AUTOFS_IOC_PROTOVER ioctl we get.
>
> Ok, so here's a patch that does that. Well, it doesn't do that (a)
> part, but it does the auto-detection of whether the daemon is a compat
> process or not.
>
> THIS IS UNTESTED. Thomas - could you test this in your environment? I
> added your "Tested-by:", but that's really for the previous patch that
> tested is_compat_task() in the wrong context.
>
> Linus
How about .... ummm .. I haven't updated the patch description in case
this is wrong, just put together what I think should work ... TOTALLY
UNTESTED also.
autofs: work around unhappy compat problem on x86-64
From: Ian Kent <[email protected]>
When the autofs protocol version 5 packet type was added in commit
5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <[email protected]>
Cc: Ian Kent <[email protected]>
---
fs/autofs4/autofs_i.h | 1 +
fs/autofs4/dev-ioctl.c | 3 +++
fs/autofs4/inode.c | 1 +
fs/autofs4/root.c | 8 ++++++++
fs/autofs4/waitq.c | 23 ++++++++++++++++++++---
5 files changed, 33 insertions(+), 3 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index d8d8e7b..eb1cc92 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -110,6 +110,7 @@ struct autofs_sb_info {
int sub_version;
int min_proto;
int max_proto;
+ int compat_daemon;
unsigned long exp_timeout;
unsigned int type;
int reghost_enabled;
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 76741d8..868fd57 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -265,6 +265,9 @@ static int autofs_dev_ioctl_open_mountpoint(const char *name, dev_t devid)
}
autofs_dev_ioctl_fd_install(fd, filp);
+
+ if (sbi->compat_daemon < 0)
+ sbi->compat_daemon = is_compat_task();
}
return fd;
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index e16980b..2920bbd 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -224,6 +224,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
set_autofs_type_indirect(&sbi->type);
sbi->min_proto = 0;
sbi->max_proto = 0;
+ sbi->compat_daemon = -1; /* unknown */
mutex_init(&sbi->wq_mutex);
mutex_init(&sbi->pipe_mutex);
spin_lock_init(&sbi->fs_lock);
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index 75e5f1c..9c620bf 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -71,6 +71,14 @@ const struct dentry_operations autofs4_dentry_operations = {
.d_release = autofs4_dentry_release,
};
+static int autofs4_root_dir_open(struct inode *inode, struct file *file)
+{
+ struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
+ if (sbi->compat_daemon < 0)
+ sbi->compat_daemon = is_compat_task();
+ return dcache_dir_open(inode, file);
+}
+
static void autofs4_add_active(struct dentry *dentry)
{
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index da8876d..8ec8507 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -15,6 +15,7 @@
#include <linux/time.h>
#include <linux/signal.h>
#include <linux/file.h>
+#include <linux/compat.h>
#include "autofs_i.h"
/* We make this a static variable rather than a part of the superblock; it
@@ -91,7 +92,24 @@ static int autofs4_write(struct autofs_sb_info *sbi,
return (bytes > 0);
}
-
+
+/*
+ * The autofs_v5 packet was misdesigned.
+ *
+ * The packets are identical on x86-32 and x86-64, but have different
+ * alignment. Which means that 'sizeof()' will give different results.
+ * Fix it up for the case of running 32-bit user mode on a 64-bit kernel.
+ */
+static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
+{
+ size_t pktsz = sizeof(struct autofs_v5_packet);
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
+ if (sbi->compat_daemon > 0)
+ pktsz -= 4;
+#endif
+ return pktsz;
+}
+
static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
struct autofs_wait_queue *wq,
int type)
@@ -155,8 +173,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
{
struct autofs_v5_packet *packet = &pkt.v5_pkt.v5_packet;
- pktsz = sizeof(*packet);
-
+ pktsz = autofs_v5_packet_size(sbi);
packet->wait_queue_token = wq->wait_queue_token;
packet->len = wq->name.len;
memcpy(packet->name, wq->name.name, wq->name.len);
On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
Ahh ... forgot to set the file_operations structure member .. oops
>
> +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> +{
> + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> + if (sbi->compat_daemon < 0)
> + sbi->compat_daemon = is_compat_task();
> + return dcache_dir_open(inode, file);
> +}
> +
On Wed, 2012-02-22 at 13:53 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> > On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
>
> Ahh ... forgot to set the file_operations structure member .. oops
>
> >
> > +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> > +{
> > + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> > + if (sbi->compat_daemon < 0)
> > + sbi->compat_daemon = is_compat_task();
> > + return dcache_dir_open(inode, file);
> > +}
> > +
>
Lets try that again.
autofs: work around unhappy compat problem on x86-64
From: Ian Kent <[email protected]>
When the autofs protocol version 5 packet type was added in commit
5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <[email protected]>
Cc: Ian Kent <[email protected]>
---
fs/autofs4/autofs_i.h | 1 +
fs/autofs4/dev-ioctl.c | 3 +++
fs/autofs4/inode.c | 1 +
fs/autofs4/root.c | 11 ++++++++++-
fs/autofs4/waitq.c | 23 ++++++++++++++++++++---
5 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index d8d8e7b..eb1cc92 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -110,6 +110,7 @@ struct autofs_sb_info {
int sub_version;
int min_proto;
int max_proto;
+ int compat_daemon;
unsigned long exp_timeout;
unsigned int type;
int reghost_enabled;
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 76741d8..868fd57 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -265,6 +265,9 @@ static int autofs_dev_ioctl_open_mountpoint(const char *name, dev_t devid)
}
autofs_dev_ioctl_fd_install(fd, filp);
+
+ if (sbi->compat_daemon < 0)
+ sbi->compat_daemon = is_compat_task();
}
return fd;
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index e16980b..2920bbd 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -224,6 +224,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
set_autofs_type_indirect(&sbi->type);
sbi->min_proto = 0;
sbi->max_proto = 0;
+ sbi->compat_daemon = -1; /* unknown */
mutex_init(&sbi->wq_mutex);
mutex_init(&sbi->pipe_mutex);
spin_lock_init(&sbi->fs_lock);
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index 75e5f1c..ea0dc27 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -31,6 +31,7 @@ static long autofs4_root_ioctl(struct file *,unsigned int,unsigned long);
#ifdef CONFIG_COMPAT
static long autofs4_root_compat_ioctl(struct file *,unsigned int,unsigned long);
#endif
+static int autofs4_root_dir_open(struct inode *inode, struct file *file);
static int autofs4_dir_open(struct inode *inode, struct file *file);
static struct dentry *autofs4_lookup(struct inode *,struct dentry *, struct nameidata *);
static struct vfsmount *autofs4_d_automount(struct path *);
@@ -38,7 +39,7 @@ static int autofs4_d_manage(struct dentry *, bool);
static void autofs4_dentry_release(struct dentry *);
const struct file_operations autofs4_root_operations = {
- .open = dcache_dir_open,
+ .open = autofs4_root_dir_open,
.release = dcache_dir_close,
.read = generic_read_dir,
.readdir = dcache_readdir,
@@ -71,6 +72,14 @@ const struct dentry_operations autofs4_dentry_operations = {
.d_release = autofs4_dentry_release,
};
+static int autofs4_root_dir_open(struct inode *inode, struct file *file)
+{
+ struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
+ if (sbi->compat_daemon < 0)
+ sbi->compat_daemon = is_compat_task();
+ return dcache_dir_open(inode, file);
+}
+
static void autofs4_add_active(struct dentry *dentry)
{
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index da8876d..8ec8507 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -15,6 +15,7 @@
#include <linux/time.h>
#include <linux/signal.h>
#include <linux/file.h>
+#include <linux/compat.h>
#include "autofs_i.h"
/* We make this a static variable rather than a part of the superblock; it
@@ -91,7 +92,24 @@ static int autofs4_write(struct autofs_sb_info *sbi,
return (bytes > 0);
}
-
+
+/*
+ * The autofs_v5 packet was misdesigned.
+ *
+ * The packets are identical on x86-32 and x86-64, but have different
+ * alignment. Which means that 'sizeof()' will give different results.
+ * Fix it up for the case of running 32-bit user mode on a 64-bit kernel.
+ */
+static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
+{
+ size_t pktsz = sizeof(struct autofs_v5_packet);
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
+ if (sbi->compat_daemon > 0)
+ pktsz -= 4;
+#endif
+ return pktsz;
+}
+
static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
struct autofs_wait_queue *wq,
int type)
@@ -155,8 +173,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
{
struct autofs_v5_packet *packet = &pkt.v5_pkt.v5_packet;
- pktsz = sizeof(*packet);
-
+ pktsz = autofs_v5_packet_size(sbi);
packet->wait_queue_token = wq->wait_queue_token;
packet->len = wq->name.len;
memcpy(packet->name, wq->name.name, wq->name.len);
On 02/21/2012 09:43 PM, Ian Kent wrote:
>
> However, with the final "char name[NAME_MAX+1]" array at the end, the
> actual size of the structure ends up being not very well defined:
> because the struct isn't marked 'packed', doing a "sizeof()" on it will
> align the size of the struct up to the biggest alignment of the members
> it has.
>
Note that it's really rather unfortunate that this transmits the entire
packet no matter what, which also means that the "len" field is actually
completely and totally pointless.
I have mentioned in the past that I consider it a design mistake on my
part to have used a pipe in the first place: it was "so easy", but
required an extra read for no good reason, or padding to fixed size
resulting in this problem. The right thing really should have been to
use a Unix datagram socket; these days using SOCK_SEQPACKET - that way
the kernel would give you the right semantics by design.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Wed, 2012-02-22 at 13:57 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 13:53 +0800, Ian Kent wrote:
> > On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> > > On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
> >
> > Ahh ... forgot to set the file_operations structure member .. oops
> >
> > >
> > > +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> > > +{
> > > + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> > > + if (sbi->compat_daemon < 0)
> > > + sbi->compat_daemon = is_compat_task();
> > > + return dcache_dir_open(inode, file);
> > > +}
> > > +
> >
>
> Lets try that again.
Oh, DOH ...
The process mounting the autofs mount is passing the pipe it will use
for communication via a mount option so we can do this right in
fill_super.
I'm out for a while, I'll update the patch when I'm back.
Ian
On Wed, 2012-02-22 at 17:32 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 13:57 +0800, Ian Kent wrote:
> > On Wed, 2012-02-22 at 13:53 +0800, Ian Kent wrote:
> > > On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> > > > On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
> > >
> > > Ahh ... forgot to set the file_operations structure member .. oops
> > >
> > > >
> > > > +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> > > > +{
> > > > + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> > > > + if (sbi->compat_daemon < 0)
> > > > + sbi->compat_daemon = is_compat_task();
> > > > + return dcache_dir_open(inode, file);
> > > > +}
> > > > +
> > >
> >
> > Lets try that again.
>
> Oh, DOH ...
>
> The process mounting the autofs mount is passing the pipe it will use
> for communication via a mount option so we can do this right in
> fill_super.
>
> I'm out for a while, I'll update the patch when I'm back.
autofs: work around unhappy compat problem on x86-64
From: Ian Kent <[email protected]>
When the autofs protocol version 5 packet type was added in commit
5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <[email protected]>
Cc: Ian Kent <[email protected]>
---
fs/autofs4/autofs_i.h | 1 +
fs/autofs4/inode.c | 2 ++
fs/autofs4/waitq.c | 22 +++++++++++++++++++---
3 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index d8d8e7b..eb1cc92 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -110,6 +110,7 @@ struct autofs_sb_info {
int sub_version;
int min_proto;
int max_proto;
+ int compat_daemon;
unsigned long exp_timeout;
unsigned int type;
int reghost_enabled;
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index e16980b..06858d9 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -19,6 +19,7 @@
#include <linux/parser.h>
#include <linux/bitops.h>
#include <linux/magic.h>
+#include <linux/compat.h>
#include "autofs_i.h"
#include <linux/module.h>
@@ -224,6 +225,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
set_autofs_type_indirect(&sbi->type);
sbi->min_proto = 0;
sbi->max_proto = 0;
+ sbi->compat_daemon = is_compat_task();
mutex_init(&sbi->wq_mutex);
mutex_init(&sbi->pipe_mutex);
spin_lock_init(&sbi->fs_lock);
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index da8876d..9c098db 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -91,7 +91,24 @@ static int autofs4_write(struct autofs_sb_info *sbi,
return (bytes > 0);
}
-
+
+/*
+ * The autofs_v5 packet was misdesigned.
+ *
+ * The packets are identical on x86-32 and x86-64, but have different
+ * alignment. Which means that 'sizeof()' will give different results.
+ * Fix it up for the case of running 32-bit user mode on a 64-bit kernel.
+ */
+static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
+{
+ size_t pktsz = sizeof(struct autofs_v5_packet);
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
+ if (sbi->compat_daemon > 0)
+ pktsz -= 4;
+#endif
+ return pktsz;
+}
+
static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
struct autofs_wait_queue *wq,
int type)
@@ -155,8 +172,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
{
struct autofs_v5_packet *packet = &pkt.v5_pkt.v5_packet;
- pktsz = sizeof(*packet);
-
+ pktsz = autofs_v5_packet_size(sbi);
packet->wait_queue_token = wq->wait_queue_token;
packet->len = wq->name.len;
memcpy(packet->name, wq->name.name, wq->name.len);
On Wed, 2012-02-22 at 20:15 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 17:32 +0800, Ian Kent wrote:
> > On Wed, 2012-02-22 at 13:57 +0800, Ian Kent wrote:
> > > On Wed, 2012-02-22 at 13:53 +0800, Ian Kent wrote:
> > > > On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> > > > > On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
> > > >
> > > > Ahh ... forgot to set the file_operations structure member .. oops
> > > >
> > > > >
> > > > > +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> > > > > +{
> > > > > + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> > > > > + if (sbi->compat_daemon < 0)
> > > > > + sbi->compat_daemon = is_compat_task();
> > > > > + return dcache_dir_open(inode, file);
> > > > > +}
> > > > > +
> > > >
> > >
> > > Lets try that again.
> >
> > Oh, DOH ...
> >
> > The process mounting the autofs mount is passing the pipe it will use
> > for communication via a mount option so we can do this right in
> > fill_super.
> >
> > I'm out for a while, I'll update the patch when I'm back.
This is *still* not right.
The daemon could be SIGKILLed and then re-connect to the existing mounts
and set a new pipe file handle. But that new task may or may not be a
compat task.
Let me fix that too.
On Wed, 2012-02-22 at 20:39 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 20:15 +0800, Ian Kent wrote:
> > On Wed, 2012-02-22 at 17:32 +0800, Ian Kent wrote:
> > > On Wed, 2012-02-22 at 13:57 +0800, Ian Kent wrote:
> > > > On Wed, 2012-02-22 at 13:53 +0800, Ian Kent wrote:
> > > > > On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> > > > > > On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
> > > > >
> > > > > Ahh ... forgot to set the file_operations structure member .. oops
> > > > >
> > > > > >
> > > > > > +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> > > > > > +{
> > > > > > + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> > > > > > + if (sbi->compat_daemon < 0)
> > > > > > + sbi->compat_daemon = is_compat_task();
> > > > > > + return dcache_dir_open(inode, file);
> > > > > > +}
> > > > > > +
> > > > >
> > > >
> > > > Lets try that again.
> > >
> > > Oh, DOH ...
> > >
> > > The process mounting the autofs mount is passing the pipe it will use
> > > for communication via a mount option so we can do this right in
> > > fill_super.
> > >
> > > I'm out for a while, I'll update the patch when I'm back.
>
> This is *still* not right.
>
> The daemon could be SIGKILLed and then re-connect to the existing mounts
> and set a new pipe file handle. But that new task may or may not be a
> compat task.
>
> Let me fix that too.
I've added a Signed-off-by this time, so I think it's right...
autofs: work around unhappy compat problem on x86-64
From: Ian Kent <[email protected]>
When the autofs protocol version 5 packet type was added in commit
5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <[email protected]>
Signed-off-by: Ian Kent <[email protected]>
---
fs/autofs4/autofs_i.h | 1 +
fs/autofs4/dev-ioctl.c | 1 +
fs/autofs4/inode.c | 2 ++
fs/autofs4/waitq.c | 22 +++++++++++++++++++---
4 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index d8d8e7b..eb1cc92 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -110,6 +110,7 @@ struct autofs_sb_info {
int sub_version;
int min_proto;
int max_proto;
+ int compat_daemon;
unsigned long exp_timeout;
unsigned int type;
int reghost_enabled;
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 76741d8..85f1fcd 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -385,6 +385,7 @@ static int autofs_dev_ioctl_setpipefd(struct file *fp,
sbi->pipefd = pipefd;
sbi->pipe = pipe;
sbi->catatonic = 0;
+ sbi->compat_daemon = is_compat_task();
}
out:
mutex_unlock(&sbi->wq_mutex);
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index e16980b..06858d9 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -19,6 +19,7 @@
#include <linux/parser.h>
#include <linux/bitops.h>
#include <linux/magic.h>
+#include <linux/compat.h>
#include "autofs_i.h"
#include <linux/module.h>
@@ -224,6 +225,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
set_autofs_type_indirect(&sbi->type);
sbi->min_proto = 0;
sbi->max_proto = 0;
+ sbi->compat_daemon = is_compat_task();
mutex_init(&sbi->wq_mutex);
mutex_init(&sbi->pipe_mutex);
spin_lock_init(&sbi->fs_lock);
diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
index da8876d..9c098db 100644
--- a/fs/autofs4/waitq.c
+++ b/fs/autofs4/waitq.c
@@ -91,7 +91,24 @@ static int autofs4_write(struct autofs_sb_info *sbi,
return (bytes > 0);
}
-
+
+/*
+ * The autofs_v5 packet was misdesigned.
+ *
+ * The packets are identical on x86-32 and x86-64, but have different
+ * alignment. Which means that 'sizeof()' will give different results.
+ * Fix it up for the case of running 32-bit user mode on a 64-bit kernel.
+ */
+static noinline size_t autofs_v5_packet_size(struct autofs_sb_info *sbi)
+{
+ size_t pktsz = sizeof(struct autofs_v5_packet);
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
+ if (sbi->compat_daemon > 0)
+ pktsz -= 4;
+#endif
+ return pktsz;
+}
+
static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
struct autofs_wait_queue *wq,
int type)
@@ -155,8 +172,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
{
struct autofs_v5_packet *packet = &pkt.v5_pkt.v5_packet;
- pktsz = sizeof(*packet);
-
+ pktsz = autofs_v5_packet_size(sbi);
packet->wait_queue_token = wq->wait_queue_token;
packet->len = wq->name.len;
memcpy(packet->name, wq->name.name, wq->name.len);
On 02/22/2012 01:32 AM, Ian Kent wrote:
>
> Oh, DOH ...
>
> The process mounting the autofs mount is passing the pipe it will use
> for communication via a mount option so we can do this right in
> fill_super.
>
Does that mean that the autofs v5 daemon calls mount() by itself, unlike
the older autofs daemon? Otherwise you're reading the compatness of
/bin/mount...
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Tue, Feb 21, 2012 at 10:02 PM, H. Peter Anvin <[email protected]> wrote:
>
> Note that it's really rather unfortunate that this transmits the entire
> packet no matter what, which also means that the "len" field is actually
> completely and totally pointless.
Yes.
> I have mentioned in the past that I consider it a design mistake on my
> part to have used a pipe in the first place: it was "so easy", but
> required an extra read for no good reason, or padding to fixed size
> resulting in this problem. ?The right thing really should have been to
> use a Unix datagram socket; these days using SOCK_SEQPACKET - that way
> the kernel would give you the right semantics by design.
Well, the kernel gives the right semantics for pipes too - writes are
guaranteed to be "atomic", so even in the presence of multiple writers
you can trivially do packetized data.
You just have to (a) add the length to the packet and (b) do the
length+packet write as a single write (which is limited to PIPE_SIZE -
4kB - for the atomicity guarantee).
If you don't have multiple concurrent writers without locking, the (b)
part falls away entirely, of course.
Yes, for the reader side you need to be able to handle the fact that
you can get more than one packet in one read() call, but sorting that
out isn't hard either.
But yeah, writing fixed-size data and then having a reader that reads
fixed-size data is just not a very good approach. It's doubly bad when
the "fixed size" isn't an explicit size that is documented in the
protocol, but depends on data structures.
Linus
On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]> wrote:
>
> Oh, DOH ...
>
> The process mounting the autofs mount is passing the pipe it will use
> for communication via a mount option so we can do this right in
> fill_super.
No, we really cannot.
The autofs "mount" is done by fork + execve("mount").
So fill_super doesn't actually see the *daemon*. It sees the mount binary.
So if you have a 64-bit mount binary, and a 32-bit daemon, you're screwed again.
Linus
On Wed, Feb 22, 2012 at 4:45 AM, Ian Kent <[email protected]> wrote:
> diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
> index 76741d8..85f1fcd 100644
> --- a/fs/autofs4/dev-ioctl.c
> +++ b/fs/autofs4/dev-ioctl.c
> @@ -385,6 +385,7 @@ static int autofs_dev_ioctl_setpipefd(struct file *fp,
> ? ? ? ? ? ? ? ?sbi->pipefd = pipefd;
> ? ? ? ? ? ? ? ?sbi->pipe = pipe;
> ? ? ? ? ? ? ? ?sbi->catatonic = 0;
> + ? ? ? ? ? ? ? sbi->compat_daemon = is_compat_task();
> ? ? ? ?}
> ?out:
> ? ? ? ?mutex_unlock(&sbi->wq_mutex);
Yes, this part should do the reconnect correctly (assuming it's the
daemon that does that itself, but I assume it is).
But
> diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
> index e16980b..06858d9 100644
> --- a/fs/autofs4/inode.c
> +++ b/fs/autofs4/inode.c
> @@ -224,6 +225,7 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
> ? ? ? ?set_autofs_type_indirect(&sbi->type);
> ? ? ? ?sbi->min_proto = 0;
> ? ? ? ?sbi->max_proto = 0;
> + ? ? ? sbi->compat_daemon = is_compat_task();
> ? ? ? ?mutex_init(&sbi->wq_mutex);
> ? ? ? ?mutex_init(&sbi->pipe_mutex);
> ? ? ? ?spin_lock_init(&sbi->fs_lock);
the above really is wrong, at least for the autofs sources I looked at.
Because autofs does "spawn()" to fork+exec the mount() system call,
the is_compat_task() check really will be checking the wrong binary.
Now, it's true that in *most* cases you'd hope that 'mount' and
'autofs' both come from the same distro, and have been the compiled
for the same model. So it probably works in practice, and might be
good enough.
But I could easily see somebody building their own 'autofs' tools (or
building their own 'mount', although that sounds less likely to me)
and if they started out with a 32-bit distro but are slowly upgrading
(like they've upgraded the kernel) the above would get that wrong.
I could also imagine a 64-bit environment wanting to "cross" compile
(and test) a 32-bit binary, and trying that. In fact, I suspect that
is something that you could test yourself: I assume you build your own
'autofs' daemon on a system where you don't build everything else, and
just adding "-m32" to the build should do this and show the problem.
Linus
On 02/22/2012 08:10 AM, Linus Torvalds wrote:
>
> Well, the kernel gives the right semantics for pipes too - writes are
> guaranteed to be "atomic", so even in the presence of multiple writers
> you can trivially do packetized data.
>
> You just have to (a) add the length to the packet and (b) do the
> length+packet write as a single write (which is limited to PIPE_SIZE -
> 4kB - for the atomicity guarantee).
>
> If you don't have multiple concurrent writers without locking, the (b)
> part falls away entirely, of course.
>
> Yes, for the reader side you need to be able to handle the fact that
> you can get more than one packet in one read() call, but sorting that
> out isn't hard either.
>
What you describe above is pretty much how autofs 3 used to work; except
it would do one read() for the header including length and then another
read() for the body. Of course, it could just have read ahead -- if you
read part of the next packet, it wouldn't really matter since at least
at that time the daemon was single-threaded and would have to loop back
anyway.
The PIPE_SIZE guarantee took care of the fact that this was a multiple
writer/single reader situation (since the writes happens in the context
of the process requesting a mount.) Either way, SOCK_DGRAM and
SOCK_SEQPACKET would solve all of the problems and would Just Work, and
packet boundaries would then be explicit.
> But yeah, writing fixed-size data and then having a reader that reads
> fixed-size data is just not a very good approach. It's doubly bad when
> the "fixed size" isn't an explicit size that is documented in the
> protocol, but depends on data structures.
Indeed.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On 02/22/2012 08:10 AM, Linus Torvalds wrote:
>
> But yeah, writing fixed-size data and then having a reader that reads
> fixed-size data is just not a very good approach. It's doubly bad when
> the "fixed size" isn't an explicit size that is documented in the
> protocol, but depends on data structures.
>
Incidentally, a valid user space workaround for this problem would be to
just read all the data off the pipe but consider the packet length to be
NAME_MAX rather than NAME_MAX+1; then simply discard all-zero ints
received on input since a packet always starts with the autofs version
number.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Wed, Feb 22, 2012 at 9:45 AM, H. Peter Anvin <[email protected]> wrote:
>
> Incidentally, a valid user space workaround for this problem would be to
> just read all the data off the pipe but consider the packet length to be
> NAME_MAX rather than NAME_MAX+1; then simply discard all-zero ints
> received on input since a packet always starts with the autofs version
> number.
Nope, won't work.
Why?
Because that padding word for size is just random data.
In fact, we probably should clear it. I suspect we leak kernel stack
contents to autofs. Not that it matters (system daemon with root
privileges and all that), but it's another case of the whole "packing
data structures" issue.
Of course, right now autofs leaks kernel stack way more than that one
possible alignment word: the whole "name[NAME_MAX+1]" array is leaking
stuff after the name length (and final zero). So the padding is the
least of the leaking worries.
But the point is that because of very similar issues, the user-space
daemon cannot rely on things being zero.
Unless you fix the kernel, and increment the version number. But once
you do that, you're better off just fixing the size anyway.
Linus
On Wed, Feb 22, 2012 at 10:16 AM, Linus Torvalds
<[email protected]> wrote:
>
> Of course, right now autofs leaks kernel stack way more than that one
> possible alignment word: the whole "name[NAME_MAX+1]" array is leaking
> stuff after the name length (and final zero). So the padding is the
> least of the leaking worries.
Oh, we do have that "memset()" there, and it seems to have been there
since the beginning.
I take all that back. Fixing user space is actually possible.
However, the whole "no regressions" part is that we shouldn't require
updated user space, so we should fix the compat mess in the kernel
regardless. At which point "fixing user space" becomes irrelevant
again, so ..
Linus
On 02/22/2012 10:16 AM, Linus Torvalds wrote:
>
> Because that padding word for size is just random data.
>
> In fact, we probably should clear it. I suspect we leak kernel stack
> contents to autofs. Not that it matters (system daemon with root
> privileges and all that), but it's another case of the whole "packing
> data structures" issue.
>
Fortunately this is not true -- there is a memset(0) of the entire
packet before the packet is built in kernel space. Otherwise we'd have
a security hole.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Wed, 2012-02-22 at 07:13 -0800, H. Peter Anvin wrote:
> On 02/22/2012 01:32 AM, Ian Kent wrote:
> >
> > Oh, DOH ...
> >
> > The process mounting the autofs mount is passing the pipe it will use
> > for communication via a mount option so we can do this right in
> > fill_super.
> >
>
> Does that mean that the autofs v5 daemon calls mount() by itself, unlike
> the older autofs daemon? Otherwise you're reading the compatness of
> /bin/mount...
Version 5 does, yes.
There was nothing to be gained by the extra overhead of calling mount(8)
since I wanted to minimize mtab usage and contention, and didn't want
these showing up in mtab either.
>
> -hpa
>
On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
> On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]> wrote:
> >
> > Oh, DOH ...
> >
> > The process mounting the autofs mount is passing the pipe it will use
> > for communication via a mount option so we can do this right in
> > fill_super.
>
> No, we really cannot.
Sorry, I think your wrong this time.
>
> The autofs "mount" is done by fork + execve("mount").
It's done like this when mounting things inside an already mounted
indirect autofs mount or when mounting things on autofs direct mount
triggers but, in version 5, mount(2) has always used to mount autofs
file systems.
Ian
On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
> > On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]> wrote:
> > >
> > > Oh, DOH ...
> > >
> > > The process mounting the autofs mount is passing the pipe it will use
> > > for communication via a mount option so we can do this right in
> > > fill_super.
> >
> > No, we really cannot.
>
> Sorry, I think your wrong this time.
On second thought, I suppose other user space users don't "have" to use
mount(2) ....
>
> >
> > The autofs "mount" is done by fork + execve("mount").
>
> It's done like this when mounting things inside an already mounted
> indirect autofs mount or when mounting things on autofs direct mount
> triggers but, in version 5, mount(2) has always used to mount autofs
> file systems.
>
> Ian
On Wed, Feb 22, 2012 at 5:48 PM, Ian Kent <[email protected]> wrote:
>
> Sorry, I think your wrong this time.
Well, that would be good, actually. Doing the test itself at mount
time is certainly the simpler approach.
>> The autofs "mount" is done by fork + execve("mount").
>
> It's done like this when mounting things inside an already mounted
> indirect autofs mount or when mounting things on autofs direct mount
> triggers but, in version 5, mount(2) has always used to mount autofs
> file systems.
Is that true for legacy autofs daemons too that distros ship? Because
those are the ones we'd be fighting..
Because when I do
git grep '\<mount[ ]*(' -- '*.[ch]'
(that's a space and a tab in that pattern) on the autofs-4.1.4 sources
I downloaded, I don't see a single call to mount. But I do see
spawning of PATH_MOUNT. And one of them is with "-t", "autofs".
So at least in the last version of autofs4, it was executing the
external "mount" program, not using mount(2).
Linus
On Wed, 2012-02-22 at 17:56 -0800, Linus Torvalds wrote:
> On Wed, Feb 22, 2012 at 5:48 PM, Ian Kent <[email protected]> wrote:
> >
> > Sorry, I think your wrong this time.
>
> Well, that would be good, actually. Doing the test itself at mount
> time is certainly the simpler approach.
>
> >> The autofs "mount" is done by fork + execve("mount").
> >
> > It's done like this when mounting things inside an already mounted
> > indirect autofs mount or when mounting things on autofs direct mount
> > triggers but, in version 5, mount(2) has always used to mount autofs
> > file systems.
>
> Is that true for legacy autofs daemons too that distros ship? Because
> those are the ones we'd be fighting..
That's a good point.
>
> Because when I do
>
> git grep '\<mount[ ]*(' -- '*.[ch]'
>
> (that's a space and a tab in that pattern) on the autofs-4.1.4 sources
> I downloaded, I don't see a single call to mount. But I do see
> spawning of PATH_MOUNT. And one of them is with "-t", "autofs".
Sure, that's true, and you'll see it uses the mount option maxproto with
the value of AUTOFS_MAX_PROTO_VERSION. But autofs uses it's own copy of
the headers so AUTOFS_MAX_PROTO_VERSION is 4 not 5 so it won't be
affected by this change.
Ian
On Thu, 2012-02-23 at 10:09 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 17:56 -0800, Linus Torvalds wrote:
> > On Wed, Feb 22, 2012 at 5:48 PM, Ian Kent <[email protected]> wrote:
> > >
> > > Sorry, I think your wrong this time.
> >
> > Well, that would be good, actually. Doing the test itself at mount
> > time is certainly the simpler approach.
> >
> > >> The autofs "mount" is done by fork + execve("mount").
> > >
> > > It's done like this when mounting things inside an already mounted
> > > indirect autofs mount or when mounting things on autofs direct mount
> > > triggers but, in version 5, mount(2) has always used to mount autofs
> > > file systems.
> >
> > Is that true for legacy autofs daemons too that distros ship? Because
> > those are the ones we'd be fighting..
>
> That's a good point.
>
> >
> > Because when I do
> >
> > git grep '\<mount[ ]*(' -- '*.[ch]'
> >
> > (that's a space and a tab in that pattern) on the autofs-4.1.4 sources
> > I downloaded, I don't see a single call to mount. But I do see
> > spawning of PATH_MOUNT. And one of them is with "-t", "autofs".
>
> Sure, that's true, and you'll see it uses the mount option maxproto with
> the value of AUTOFS_MAX_PROTO_VERSION. But autofs uses it's own copy of
> the headers so AUTOFS_MAX_PROTO_VERSION is 4 not 5 so it won't be
> affected by this change.
Not to mention that v4 just won't work with a v5 packet.
>
> Ian
On Thu, 2012-02-23 at 09:54 +0800, Ian Kent wrote:
> On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
> > On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
> > > On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]> wrote:
> > > >
> > > > Oh, DOH ...
> > > >
> > > > The process mounting the autofs mount is passing the pipe it will use
> > > > for communication via a mount option so we can do this right in
> > > > fill_super.
> > >
> > > No, we really cannot.
> >
> > Sorry, I think your wrong this time.
>
> On second thought, I suppose other user space users don't "have" to use
> mount(2) ....
Thomas, what does systemd use to mount the autofs mounts that it uses?
>
> >
> > >
> > > The autofs "mount" is done by fork + execve("mount").
> >
> > It's done like this when mounting things inside an already mounted
> > indirect autofs mount or when mounting things on autofs direct mount
> > triggers but, in version 5, mount(2) has always used to mount autofs
> > file systems.
> >
> > Ian
>
On Wed, Feb 22, 2012 at 6:09 PM, Ian Kent <[email protected]> wrote:
>
> Sure, that's true, and you'll see it uses the mount option maxproto with
> the value of AUTOFS_MAX_PROTO_VERSION. But autofs uses it's own copy of
> the headers so AUTOFS_MAX_PROTO_VERSION is 4 not 5 so it won't be
> affected by this change.
Ok.
I fetched your git tree, and it seems this changed to use mount()
directly in commit a74f68c99d9f ("integrated master map parsing into
daemon") back in 2006.
Which seems to be part of autofs_5_0_0_beta1, so I guess we're safe. I
hope no distro ever used any early pre-beta versions of autofs that
had the new packet support but still used the external 'mount'
command.
So looks good. I can't imagine that systemd mounts things directly,
because then you need to play with the whole pipe thing to get it to
work with the daemon. And that would just be odd.
Of course, "odd" is often par-for-the-course in user land ;)
Linus
On Wed, 2012-02-22 at 18:25 -0800, Linus Torvalds wrote:
> On Wed, Feb 22, 2012 at 6:09 PM, Ian Kent <[email protected]> wrote:
> >
> > Sure, that's true, and you'll see it uses the mount option maxproto with
> > the value of AUTOFS_MAX_PROTO_VERSION. But autofs uses it's own copy of
> > the headers so AUTOFS_MAX_PROTO_VERSION is 4 not 5 so it won't be
> > affected by this change.
>
> Ok.
>
> I fetched your git tree, and it seems this changed to use mount()
> directly in commit a74f68c99d9f ("integrated master map parsing into
> daemon") back in 2006.
>
> Which seems to be part of autofs_5_0_0_beta1, so I guess we're safe. I
> hope no distro ever used any early pre-beta versions of autofs that
> had the new packet support but still used the external 'mount'
> command.
If there are they have bigger automount problem than the kernel packet
size, ;)
It really didn't become usable until at least the rc releases and the
actual first non-test release was 5.0.1, we never had a 5.0.0.
Ian
On Thu, 2012-02-23 at 10:21 +0800, Ian Kent wrote:
> On Thu, 2012-02-23 at 09:54 +0800, Ian Kent wrote:
> > On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
> > > On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
> > > > On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]> wrote:
> > > > >
> > > > > Oh, DOH ...
> > > > >
> > > > > The process mounting the autofs mount is passing the pipe it will use
> > > > > for communication via a mount option so we can do this right in
> > > > > fill_super.
> > > >
> > > > No, we really cannot.
> > >
> > > Sorry, I think your wrong this time.
> >
> > On second thought, I suppose other user space users don't "have" to use
> > mount(2) ....
>
> Thomas, what does systemd use to mount the autofs mounts that it uses?
Mmm ... AFAICS systemd uses mount(2) so that looks OK too.
Ian
However, I liked Linus' approach.of using the version ioctl; any reason not to?
Ian Kent <[email protected]> wrote:
>On Thu, 2012-02-23 at 10:21 +0800, Ian Kent wrote:
>> On Thu, 2012-02-23 at 09:54 +0800, Ian Kent wrote:
>> > On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
>> > > On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
>> > > > On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]>
>wrote:
>> > > > >
>> > > > > Oh, DOH ...
>> > > > >
>> > > > > The process mounting the autofs mount is passing the pipe it
>will use
>> > > > > for communication via a mount option so we can do this right
>in
>> > > > > fill_super.
>> > > >
>> > > > No, we really cannot.
>> > >
>> > > Sorry, I think your wrong this time.
>> >
>> > On second thought, I suppose other user space users don't "have" to
>use
>> > mount(2) ....
>>
>> Thomas, what does systemd use to mount the autofs mounts that it
>uses?
>
>Mmm ... AFAICS systemd uses mount(2) so that looks OK too.
>
>Ian
--
Sent from my mobile phone. Please excuse my brevity and lack of formatting.
Zitat von Ian Kent <[email protected]>:
> On Thu, 2012-02-23 at 09:54 +0800, Ian Kent wrote:
>> On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
>>
>> On second thought, I suppose other user space users don't "have" to use
>> mount(2) ....
>
> Thomas, what does systemd use to mount the autofs mounts that it uses?
>
Judging from
http://cgit.freedesktop.org/systemd/systemd/tree/src/automount.c#n515
- mount is called via library/system call.
On Wed, 2012-02-22 at 22:31 -0800, H. Peter Anvin wrote:
> However, I liked Linus' approach.of using the version ioctl; any reason not to?
Not really, although that ioctl is used it doesn't have to be but you
must mount the autofs file system to use it. Setting a timeout is
another ioctl that is almost mandatory although not using it amounts to
saying you don't want anything to expire which is valid.
>
> Ian Kent <[email protected]> wrote:
>
> >On Thu, 2012-02-23 at 10:21 +0800, Ian Kent wrote:
> >> On Thu, 2012-02-23 at 09:54 +0800, Ian Kent wrote:
> >> > On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
> >> > > On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
> >> > > > On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]>
> >wrote:
> >> > > > >
> >> > > > > Oh, DOH ...
> >> > > > >
> >> > > > > The process mounting the autofs mount is passing the pipe it
> >will use
> >> > > > > for communication via a mount option so we can do this right
> >in
> >> > > > > fill_super.
> >> > > >
> >> > > > No, we really cannot.
> >> > >
> >> > > Sorry, I think your wrong this time.
> >> >
> >> > On second thought, I suppose other user space users don't "have" to
> >use
> >> > mount(2) ....
> >>
> >> Thomas, what does systemd use to mount the autofs mounts that it
> >uses?
> >
> >Mmm ... AFAICS systemd uses mount(2) so that looks OK too.
> >
> >Ian
>
On Thu, 2012-02-23 at 19:20 +0800, Ian Kent wrote:
> On Wed, 2012-02-22 at 22:31 -0800, H. Peter Anvin wrote:
> > However, I liked Linus' approach.of using the version ioctl; any reason not to?
>
> Not really, although that ioctl is used it doesn't have to be but you
> must mount the autofs file system to use it. Setting a timeout is
> another ioctl that is almost mandatory although not using it amounts to
> saying you don't want anything to expire which is valid.
The open, to get an ioctl is another that pretty much must be done to
use the mount.
>
> >
> > Ian Kent <[email protected]> wrote:
> >
> > >On Thu, 2012-02-23 at 10:21 +0800, Ian Kent wrote:
> > >> On Thu, 2012-02-23 at 09:54 +0800, Ian Kent wrote:
> > >> > On Thu, 2012-02-23 at 09:48 +0800, Ian Kent wrote:
> > >> > > On Wed, 2012-02-22 at 08:12 -0800, Linus Torvalds wrote:
> > >> > > > On Wed, Feb 22, 2012 at 1:32 AM, Ian Kent <[email protected]>
> > >wrote:
> > >> > > > >
> > >> > > > > Oh, DOH ...
> > >> > > > >
> > >> > > > > The process mounting the autofs mount is passing the pipe it
> > >will use
> > >> > > > > for communication via a mount option so we can do this right
> > >in
> > >> > > > > fill_super.
> > >> > > >
> > >> > > > No, we really cannot.
> > >> > >
> > >> > > Sorry, I think your wrong this time.
> > >> >
> > >> > On second thought, I suppose other user space users don't "have" to
> > >use
> > >> > mount(2) ....
> > >>
> > >> Thomas, what does systemd use to mount the autofs mounts that it
> > >uses?
> > >
> > >Mmm ... AFAICS systemd uses mount(2) so that looks OK too.
> > >
> > >Ian
> >
>
Am Mittwoch, den 22.02.2012, 13:57 +0800 schrieb Ian Kent:
> On Wed, 2012-02-22 at 13:53 +0800, Ian Kent wrote:
> > On Wed, 2012-02-22 at 13:43 +0800, Ian Kent wrote:
> > > On Tue, 2012-02-21 at 20:56 -0800, Linus Torvalds wrote:
> >
> > Ahh ... forgot to set the file_operations structure member .. oops
> >
> > >
> > > +static int autofs4_root_dir_open(struct inode *inode, struct file *file)
> > > +{
> > > + struct autofs_sb_info *sbi= autofs4_sbi(file->f_path.dentry->d_sb);
> > > + if (sbi->compat_daemon < 0)
> > > + sbi->compat_daemon = is_compat_task();
> > > + return dcache_dir_open(inode, file);
> > > +}
> > > +
> >
>
> Lets try that again.
>
> autofs: work around unhappy compat problem on x86-64
>
> From: Ian Kent <[email protected]>
>
> When the autofs protocol version 5 packet type was added in commit
> 5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
> obvously tried quite hard to be word-size agnostic, and uses explicitly
> sized fields that are all correctly aligned.
>
> However, with the final "char name[NAME_MAX+1]" array at the end, the
> actual size of the structure ends up being not very well defined:
> because the struct isn't marked 'packed', doing a "sizeof()" on it will
> align the size of the struct up to the biggest alignment of the members
> it has.
>
> And despite all the members being the same, the alignment of them is
> different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
> alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
> number (256), the name[] array starts out a 4-byte aligned.
>
> End result: the "packed" size of the structure is 300 bytes: 4-byte, but
> not 8-byte aligned.
>
> As a result, despite all the fields being in the same place on all
> architectures, sizeof() will round up that size to 304 bytes on
> architectures that have 8-byte alignment for u64.
>
> Note that this is *not* a problem for 32-bit compat mode on POWER, since
> there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
> and 64-bit alignment is different for 64-bit entities, and as a result
> the structure that has exactly the same layout has different sizes.
>
> So on x86-64, but no other architecture, we will just subtract 4 from
> the size of the structure when running in a compat task. That way we
> will write the properly sized packet that user mode expects.
>
> Not pretty. Sadly, this very subtle, and unnecessary, size difference
> has been encoded in user space that wants to read packets of *exactly*
> the right size, and will refuse to touch anything else.
>
> Reported-and-tested-by: Thomas Meyer <[email protected]>
> Cc: Ian Kent <[email protected]>
> ---
works for me on top of 3.2.7.
The compat_daemon stuff is only needed on x86-64, and is_compat_task is
only defined with CONFIG_COMPAT, so disable it for all other
configurations.
Signed-off-by: Andreas Schwab <[email protected]>
---
fs/autofs4/autofs_i.h | 2 ++
fs/autofs4/dev-ioctl.c | 2 ++
fs/autofs4/inode.c | 2 ++
3 files changed, 6 insertions(+)
diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index eb1cc92..60439c2 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -110,7 +110,9 @@ struct autofs_sb_info {
int sub_version;
int min_proto;
int max_proto;
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
int compat_daemon;
+#endif
unsigned long exp_timeout;
unsigned int type;
int reghost_enabled;
diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index 85f1fcd..3ffff72 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -385,7 +385,9 @@ static int autofs_dev_ioctl_setpipefd(struct file *fp,
sbi->pipefd = pipefd;
sbi->pipe = pipe;
sbi->catatonic = 0;
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
sbi->compat_daemon = is_compat_task();
+#endif
}
out:
mutex_unlock(&sbi->wq_mutex);
diff --git a/fs/autofs4/inode.c b/fs/autofs4/inode.c
index 06858d9..5f3fb68 100644
--- a/fs/autofs4/inode.c
+++ b/fs/autofs4/inode.c
@@ -225,7 +225,9 @@ int autofs4_fill_super(struct super_block *s, void *data, int silent)
set_autofs_type_indirect(&sbi->type);
sbi->min_proto = 0;
sbi->max_proto = 0;
+#if defined(CONFIG_X86_64) && defined(CONFIG_COMPAT)
sbi->compat_daemon = is_compat_task();
+#endif
mutex_init(&sbi->wq_mutex);
mutex_init(&sbi->pipe_mutex);
spin_lock_init(&sbi->fs_lock);
--
1.7.9.2
--
Andreas Schwab, [email protected]
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
On Sat, Feb 25, 2012 at 2:10 PM, Andreas Schwab <[email protected]> wrote:
> The compat_daemon stuff is only needed on x86-64, and is_compat_task is
> only defined with CONFIG_COMPAT, so disable it for all other
> configurations.
Ugh, I hate the proliferation of #ifdefs in code when they aren't
really necessary.
How about this patch instead? It's small, simple, and clean. Maybe it
would even allow a few other #ifdef's to be removed (I see one in
kernel/seccomp.c, for example, although I suspect that
'mode1_syscalls_32' might also be a compat-only thing, so who knows)
And if you worry about the size of autofs_sb_info, I think that could
be made denser by using 'char' or bitfields instead of the various
ints that hold small values or flags instead.
Linus
On 02/25/2012 05:31 PM, Linus Torvalds wrote:
> On Sat, Feb 25, 2012 at 2:10 PM, Andreas Schwab <[email protected]> wrote:
>> The compat_daemon stuff is only needed on x86-64, and is_compat_task is
>> only defined with CONFIG_COMPAT, so disable it for all other
>> configurations.
>
> Ugh, I hate the proliferation of #ifdefs in code when they aren't
> really necessary.
>
> How about this patch instead? It's small, simple, and clean. Maybe it
> would even allow a few other #ifdef's to be removed (I see one in
> kernel/seccomp.c, for example, although I suspect that
> 'mode1_syscalls_32' might also be a compat-only thing, so who knows)
>
> And if you worry about the size of autofs_sb_info, I think that could
> be made denser by using 'char' or bitfields instead of the various
> ints that hold small values or flags instead.
>
I think this patch could help clean up other places, too.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Sat, Feb 25, 2012 at 5:46 PM, H. Peter Anvin <[email protected]> wrote:
>
> I think this patch could help clean up other places, too.
Grepping for CONFIG_COMPAT shows that the most common test is for some
local variable that indicates "compatness".
The "is_compat_task()" test is actually surprisingly rare.
Linus
Fortunately it is, because it always indicates a major design facepalm.
Linus Torvalds <[email protected]> wrote:
>On Sat, Feb 25, 2012 at 5:46 PM, H. Peter Anvin <[email protected]> wrote:
>>
>> I think this patch could help clean up other places, too.
>
>Grepping for CONFIG_COMPAT shows that the most common test is for some
>local variable that indicates "compatness".
>
>The "is_compat_task()" test is actually surprisingly rare.
>
> Linus
--
Sent from my mobile phone. Please excuse my brevity and lack of formatting.
Linus Torvalds <[email protected]> writes:
> How about this patch instead?
Works as well.
> And if you worry about the size of autofs_sb_info, I think that could
> be made denser by using 'char' or bitfields instead of the various
> ints that hold small values or flags instead.
Since it's local to autofs4 it doesn't really matter that much.
Andreas.
--
Andreas Schwab, [email protected]
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
On 26/02/12 02:31, Linus Torvalds wrote:
>
> +#else
> +
> +#define is_compat_task() (0)
> +
Linus,
this breaks 32bit builds of s390 (and maybe others), since several platforms already
define a is_compat_task. This macro then destroys the definition of the function
making
static inline int is_compat_task(void)
{
return 0;
}
into
static inline int 0
{
return 0;
}
e.g.
In file included from arch/s390/mm/fault.c:39:0:
/home/autobuild/BUILD/linux-3.3.0-rc5.00060.g203738e.49.x.20120227/arch/s390/include/asm/compat.h:177:38: error: macro "is_compat_task" passed 1 arguments, but takes just 0
/home/autobuild/BUILD/linux-3.3.0-rc5.00060.g203738e.49.x.20120227/arch/s390/include/asm/compat.h:178:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
CC arch/s390/kernel/time.o
Christian
On Mon, Feb 27, 2012 at 08:29:25AM +0100, Christian Borntraeger wrote:
> On 26/02/12 02:31, Linus Torvalds wrote:
> >
> > +#else
> > +
> > +#define is_compat_task() (0)
> > +
>
> Linus,
>
> this breaks 32bit builds of s390 (and maybe others), since several platforms already
> define a is_compat_task.
It breaks only !COMPAT builds on s390, since only we have the is_compat_task()
function defined for !COMPAT. The reason for that was simply to get rid of a
couple of ugly #ifdefs.
Note, that we still need to include asm/compat.h in some file since we need the
compat_ptr typedef.
That might be ugly, but I preferred including that header file so we could get
rid of the #ifdefs.
Anyway... the patch below fixes the build issues:
>From 5f96c00c9ace3935eb52ce6bdf25a4171f867f4a Mon Sep 17 00:00:00 2001
From: Heiko Carstens <[email protected]>
Date: Mon, 27 Feb 2012 10:01:52 +0100
Subject: [PATCH] compat: fix compile breakage on s390
The new is_compat_task() define for the !COMPAT case in include/linux/compat.h
conflicts with a similar define in arch/s390/include/asm/compat.h.
This is the minimal patch which fixes the build issues.
Signed-off-by: Heiko Carstens <[email protected]>
---
arch/s390/include/asm/compat.h | 7 -------
arch/s390/kernel/process.c | 1 -
arch/s390/kernel/ptrace.c | 2 +-
arch/s390/kernel/setup.c | 2 +-
arch/s390/kernel/signal.c | 1 -
arch/s390/mm/fault.c | 1 -
arch/s390/mm/mmap.c | 2 +-
drivers/s390/block/dasd_eckd.c | 2 +-
drivers/s390/block/dasd_ioctl.c | 1 +
drivers/s390/char/fs3270.c | 1 +
drivers/s390/char/vmcp.c | 1 +
drivers/s390/cio/chsc_sch.c | 1 +
drivers/s390/scsi/zfcp_cfdc.c | 1 +
13 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index 2e49748..234f1d8 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -172,13 +172,6 @@ static inline int is_compat_task(void)
return is_32bit_task();
}
-#else
-
-static inline int is_compat_task(void)
-{
- return 0;
-}
-
#endif
static inline void __user *arch_compat_alloc_user_space(long len)
diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index 4261aa7..e795933 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -29,7 +29,6 @@
#include <asm/irq.h>
#include <asm/timer.h>
#include <asm/nmi.h>
-#include <asm/compat.h>
#include <asm/smp.h>
#include "entry.h"
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c
index 9d82ed4..61f9548 100644
--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -20,8 +20,8 @@
#include <linux/regset.h>
#include <linux/tracehook.h>
#include <linux/seccomp.h>
+#include <linux/compat.h>
#include <trace/syscall.h>
-#include <asm/compat.h>
#include <asm/segment.h>
#include <asm/page.h>
#include <asm/pgtable.h>
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index 354de07..3b2efc8 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -46,6 +46,7 @@
#include <linux/kexec.h>
#include <linux/crash_dump.h>
#include <linux/memory.h>
+#include <linux/compat.h>
#include <asm/ipl.h>
#include <asm/uaccess.h>
@@ -59,7 +60,6 @@
#include <asm/ptrace.h>
#include <asm/sections.h>
#include <asm/ebcdic.h>
-#include <asm/compat.h>
#include <asm/kvm_virtio.h>
#include <asm/diag.h>
diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
index a8ba840..2d421d9 100644
--- a/arch/s390/kernel/signal.c
+++ b/arch/s390/kernel/signal.c
@@ -30,7 +30,6 @@
#include <asm/ucontext.h>
#include <asm/uaccess.h>
#include <asm/lowcore.h>
-#include <asm/compat.h>
#include "entry.h"
#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 354dd39..e8fcd92 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -36,7 +36,6 @@
#include <asm/pgtable.h>
#include <asm/irq.h>
#include <asm/mmu_context.h>
-#include <asm/compat.h>
#include "../kernel/entry.h"
#ifndef CONFIG_64BIT
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index f09c748..a0155c0 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -29,8 +29,8 @@
#include <linux/mman.h>
#include <linux/module.h>
#include <linux/random.h>
+#include <linux/compat.h>
#include <asm/pgalloc.h>
-#include <asm/compat.h>
static unsigned long stack_maxrandom_size(void)
{
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 70880be..2617b1e 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -18,12 +18,12 @@
#include <linux/hdreg.h> /* HDIO_GETGEO */
#include <linux/bio.h>
#include <linux/module.h>
+#include <linux/compat.h>
#include <linux/init.h>
#include <asm/debug.h>
#include <asm/idals.h>
#include <asm/ebcdic.h>
-#include <asm/compat.h>
#include <asm/io.h>
#include <asm/uaccess.h>
#include <asm/cio.h>
diff --git a/drivers/s390/block/dasd_ioctl.c b/drivers/s390/block/dasd_ioctl.c
index f1a2016..792c69e 100644
--- a/drivers/s390/block/dasd_ioctl.c
+++ b/drivers/s390/block/dasd_ioctl.c
@@ -13,6 +13,7 @@
#define KMSG_COMPONENT "dasd"
#include <linux/interrupt.h>
+#include <linux/compat.h>
#include <linux/major.h>
#include <linux/fs.h>
#include <linux/blkpg.h>
diff --git a/drivers/s390/char/fs3270.c b/drivers/s390/char/fs3270.c
index e712981..9117045 100644
--- a/drivers/s390/char/fs3270.c
+++ b/drivers/s390/char/fs3270.c
@@ -11,6 +11,7 @@
#include <linux/console.h>
#include <linux/init.h>
#include <linux/interrupt.h>
+#include <linux/compat.h>
#include <linux/module.h>
#include <linux/list.h>
#include <linux/slab.h>
diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
index 75bde6a..89c03e6 100644
--- a/drivers/s390/char/vmcp.c
+++ b/drivers/s390/char/vmcp.c
@@ -13,6 +13,7 @@
#include <linux/fs.h>
#include <linux/init.h>
+#include <linux/compat.h>
#include <linux/kernel.h>
#include <linux/miscdevice.h>
#include <linux/slab.h>
diff --git a/drivers/s390/cio/chsc_sch.c b/drivers/s390/cio/chsc_sch.c
index 0c87b0f..8f9a1a3 100644
--- a/drivers/s390/cio/chsc_sch.c
+++ b/drivers/s390/cio/chsc_sch.c
@@ -8,6 +8,7 @@
*/
#include <linux/slab.h>
+#include <linux/compat.h>
#include <linux/device.h>
#include <linux/module.h>
#include <linux/uaccess.h>
diff --git a/drivers/s390/scsi/zfcp_cfdc.c b/drivers/s390/scsi/zfcp_cfdc.c
index 303dde0..fab2c25 100644
--- a/drivers/s390/scsi/zfcp_cfdc.c
+++ b/drivers/s390/scsi/zfcp_cfdc.c
@@ -11,6 +11,7 @@
#define KMSG_COMPONENT "zfcp"
#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt
+#include <linux/compat.h>
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/miscdevice.h>
--
1.7.9.1
On Mon, 2012-02-27 at 08:29 +0100, Christian Borntraeger wrote:
> On 26/02/12 02:31, Linus Torvalds wrote:
> >
> > +#else
> > +
> > +#define is_compat_task() (0)
> > +
>
> Linus,
>
> this breaks 32bit builds of s390 (and maybe others), since several platforms already
> define a is_compat_task. This macro then destroys the definition of the function
> making
It looks like s390 is the only arch that uses a #else (CONFIG_COMPAT) so
maybe it is the only breakage.
Perhaps using a function instead of a define in include/linux/compat.h
and removing the else from arch/s390/include/asm/compat.h is the
sensible thing to do here or maybe just removing the #else from
arch/s390/include/asm/compat.h since it just returns 0 anyway?
>
> static inline int is_compat_task(void)
> {
> return 0;
> }
>
>
> into
>
> static inline int 0
> {
> return 0;
> }
>
> e.g.
>
> In file included from arch/s390/mm/fault.c:39:0:
> /home/autobuild/BUILD/linux-3.3.0-rc5.00060.g203738e.49.x.20120227/arch/s390/include/asm/compat.h:177:38: error: macro "is_compat_task" passed 1 arguments, but takes just 0
> /home/autobuild/BUILD/linux-3.3.0-rc5.00060.g203738e.49.x.20120227/arch/s390/include/asm/compat.h:178:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
> CC arch/s390/kernel/time.o
>
>
>
>
> Christian
>
On 02/27/2012 01:09 AM, Heiko Carstens wrote:
> On Mon, Feb 27, 2012 at 08:29:25AM +0100, Christian Borntraeger wrote:
>> On 26/02/12 02:31, Linus Torvalds wrote:
>>>
>>> +#else
>>> +
>>> +#define is_compat_task() (0)
>>> +
>>
>> Linus,
>>
>> this breaks 32bit builds of s390 (and maybe others), since several platforms already
>> define a is_compat_task.
>
> It breaks only !COMPAT builds on s390, since only we have the is_compat_task()
> function defined for !COMPAT. The reason for that was simply to get rid of a
> couple of ugly #ifdefs.
> Note, that we still need to include asm/compat.h in some file since we need the
> compat_ptr typedef.
> That might be ugly, but I preferred including that header file so we could get
> rid of the #ifdefs.
> Anyway... the patch below fixes the build issues:
>
This patch would seem to be The Right Thing; the combination of this
really takes what s390 has done in arch space and globalizes it.
-hpa
On Mon, Feb 27, 2012 at 8:22 AM, H. Peter Anvin <[email protected]> wrote:
>
> This patch would seem to be The Right Thing; the combination of this really
> takes what s390 has done in arch space and globalizes it.
Yeah, I already applied that s390 fixup and pushed it out. And yes,
the fact that s390 already did the whole "is_compat_task()" dance on
its own just shows that we should have done this in <linux/compat.h>
from the very beginning.
Linus