2009-12-03 21:09:26

by Ingo Molnar

[permalink] [raw]
Subject: [GIT PULL] x86/paravirt for v2.6.33

Linus,

Please pull the latest x86-paravirt-for-linus git tree from:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86-paravirt-for-linus

Thanks,

Ingo

------------------>
Jeremy Fitzhardinge (1):
x86: unify sys_iopl


arch/x86/include/asm/syscalls.h | 8 +++-----
arch/x86/kernel/ioport.c | 11 ++---------
2 files changed, 5 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 372b76e..5336ce2 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
asmlinkage int sys_set_thread_area(struct user_desc __user *);
asmlinkage int sys_get_thread_area(struct user_desc __user *);

-/* X86_32 only */
-#ifdef CONFIG_X86_32
/* kernel/ioport.c */
-long sys_iopl(struct pt_regs *);
+asmlinkage long sys_iopl(unsigned int);

+/* X86_32 only */
+#ifdef CONFIG_X86_32
/* kernel/process_32.c */
int sys_clone(struct pt_regs *);
int sys_execve(struct pt_regs *);
@@ -70,8 +70,6 @@ int sys_vm86(struct pt_regs *);
#else /* CONFIG_X86_32 */

/* X86_64 only */
-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int, struct pt_regs *);

/* kernel/process_64.c */
asmlinkage long sys_clone(unsigned long, unsigned long,
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index 99c4d30..bad4f22 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -119,11 +119,10 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
return 0;
}

-#ifdef CONFIG_X86_32
-long sys_iopl(struct pt_regs *regs)
+asmlinkage long sys_iopl(unsigned int level)
{
- unsigned int level = regs->bx;
struct thread_struct *t = &current->thread;
+ struct pt_regs *regs = task_pt_regs(current);
int rc;

rc = do_iopl(level, regs);
@@ -135,9 +134,3 @@ long sys_iopl(struct pt_regs *regs)
out:
return rc;
}
-#else
-asmlinkage long sys_iopl(unsigned int level, struct pt_regs *regs)
-{
- return do_iopl(level, regs);
-}
-#endif


2009-12-08 21:34:43

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33



On Thu, 3 Dec 2009, Ingo Molnar wrote:
>
> Please pull the latest x86-paravirt-for-linus git tree from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86-paravirt-for-linus

I _really_ don't like this:

> -long sys_iopl(struct pt_regs *regs)
> +asmlinkage long sys_iopl(unsigned int level)
> {
> - unsigned int level = regs->bx;
> struct thread_struct *t = &current->thread;
> + struct pt_regs *regs = task_pt_regs(current);

I do _not_ want to add any more task_pt_regs() crap, please.

Why? It's wrong for at least vm86 mode (and from kernel system calls).
Maybe we can't get into system calls from vm86 mode, and the kernel
hopefully doesn't do those things anyway, but the point is, you chose the
wrong way to go.

The old version that actually passed the stack frame was better. Why pick
the inferior version?

Linus

2009-12-09 07:36:45

by Ingo Molnar

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33


* Linus Torvalds <[email protected]> wrote:

> On Thu, 3 Dec 2009, Ingo Molnar wrote:
> >
> > Please pull the latest x86-paravirt-for-linus git tree from:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git x86-paravirt-for-linus
>
> I _really_ don't like this:
>
> > -long sys_iopl(struct pt_regs *regs)
> > +asmlinkage long sys_iopl(unsigned int level)
> > {
> > - unsigned int level = regs->bx;
> > struct thread_struct *t = &current->thread;
> > + struct pt_regs *regs = task_pt_regs(current);
>
> I do _not_ want to add any more task_pt_regs() crap, please.
>
> Why? It's wrong for at least vm86 mode (and from kernel system calls).
> Maybe we can't get into system calls from vm86 mode, and the kernel
> hopefully doesn't do those things anyway, but the point is, you chose
> the wrong way to go.
>
> The old version that actually passed the stack frame was better. Why
> pick the inferior version?

Yeah, agreed. I missed that detail.

Jeremy, mind sending a patch that updates this code to use the less
obfuscated 32-bit version, not the 64-bit version? (a delta patch
against tip:master would be nice, as there's a fair amount of testing in
the unification change itself already, which we dont want to discard.)

Thanks,

Ingo

2009-12-09 18:18:37

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/08/09 13:34, Linus Torvalds wrote:
> I do _not_ want to add any more task_pt_regs() crap, please.
>
> Why? It's wrong for at least vm86 mode (and from kernel system calls).
>

Would the stack frame version work in these cases?

> Maybe we can't get into system calls from vm86 mode, and the kernel
> hopefully doesn't do those things anyway, but the point is, you chose the
> wrong way to go.
>

iopl doesn't make much sense as a kernel-called syscall, unless the
caller is intending to change the usermode iopl. In which case, won't
task_pt_regs() do the right thing - by pointing to the saved usermode
register set - vs modifying the ptregs the caller may pass in?

iopl is also one of the special set of syscalls which get special
handing in entry_*.S, so I don't think doing a direct call from within
the kernel is ever sensible, and it should always be possible to make
task_pt_regs return meaningful results.

I agree with you that vm86 would be a problem if its possible to call iopl.

> The old version that actually passed the stack frame was better. Why pick
> the inferior version?
>

Mainly because it exposes the difference between the 32 and 64-bit ABIs,
requiring separate code for each case; it seemed like an opportunity to
remove the differences.

Anyway, I'll post a patch to revert to the pt_regs-based version shortly.

J

2009-12-09 18:19:45

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/08/09 23:36, Ingo Molnar wrote:
>> The old version that actually passed the stack frame was better. Why
>> pick the inferior version?
>>
> Yeah, agreed. I missed that detail.
>

Which detail is that? The whole patch? ;)

> Jeremy, mind sending a patch that updates this code to use the less
> obfuscated 32-bit version, not the 64-bit version? (a delta patch
> against tip:master would be nice, as there's a fair amount of testing in
> the unification change itself already, which we dont want to discard.)
>

Sure.

But I'm not sure I understand the objection to task_pt_regs(); is it
considered deprecated? This patch received quite a lot of discussion
with no mention of it. Should we consider all its uses as suspect?

Would it be better to have something similar which just returns a
pointer to the saved [re]flags, since that's all we care about? That
should be easier to make robust against

J

2009-12-09 18:30:39

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

Linus clearly prefers the style with pt_regs passed on the stack as the sole form. Since we *have* to use that form for things like clone(), it makes sense to use it as the only form.

For what it's worth I did look at this when the patch first came up; it does make the individual patch a fair bit uglier, but I can understand Linus' consistency argument.

As far as I know, we don't allow any system calls from inside v86 mode.


-hpa

"Jeremy Fitzhardinge" <[email protected]> wrote:

>On 12/08/09 23:36, Ingo Molnar wrote:
>>> The old version that actually passed the stack frame was better. Why
>>> pick the inferior version?
>>>
>> Yeah, agreed. I missed that detail.
>>
>
>Which detail is that? The whole patch? ;)
>
>> Jeremy, mind sending a patch that updates this code to use the less
>> obfuscated 32-bit version, not the 64-bit version? (a delta patch
>> against tip:master would be nice, as there's a fair amount of testing in
>> the unification change itself already, which we dont want to discard.)
>>
>
>Sure.
>
>But I'm not sure I understand the objection to task_pt_regs(); is it
>considered deprecated? This patch received quite a lot of discussion
>with no mention of it. Should we consider all its uses as suspect?
>
>Would it be better to have something similar which just returns a
>pointer to the saved [re]flags, since that's all we care about? That
>should be easier to make robust against
>
> J

--
Sent from my mobile phone. Please excuse any lack of formatting.

2009-12-09 18:31:41

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/08/09 23:36, Ingo Molnar wrote:
> Jeremy, mind sending a patch that updates this code to use the less
> obfuscated 32-bit version, not the 64-bit version? (a delta patch
> against tip:master would be nice, as there's a fair amount of testing in
> the unification change itself already, which we dont want to discard.)
>

How does this look?

git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git fix-iopl

From: Jeremy Fitzhardinge<[email protected]>
Date: Wed, 9 Dec 2009 10:26:59 -0800
Subject: [PATCH] x86: Make sys_iopl use passed-in pt_regs

Rather than using task_pt_regs, use the pt_regs * passed into the syscall.
The ABI differences are handled in small 32/64-bit specific functions,
and everything else is handled in the common do_iopl().

Signed-off-by: Jeremy Fitzhardinge<[email protected]>

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 5336ce2..70497f0 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
asmlinkage int sys_set_thread_area(struct user_desc __user *);
asmlinkage int sys_get_thread_area(struct user_desc __user *);

-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int);
-
/* X86_32 only */
#ifdef CONFIG_X86_32
+/* kernel/ioport.c */
+asmlinkage long sys_iopl(struct pt_regs *);
+
/* kernel/process_32.c */
int sys_clone(struct pt_regs *);
int sys_execve(struct pt_regs *);
@@ -71,6 +71,9 @@ int sys_vm86(struct pt_regs *);

/* X86_64 only */

+/* kernel/ioport.c */
+asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
+
/* kernel/process_64.c */
asmlinkage long sys_clone(unsigned long, unsigned long,
void __user *, void __user *,
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index bad4f22..ac3cf88 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -105,6 +105,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
*/
static int do_iopl(unsigned int level, struct pt_regs *regs)
{
+ struct thread_struct *t =&current->thread;
unsigned int old = (regs->flags>> 12)& 3;

if (level> 3)
@@ -116,21 +117,20 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)
}
regs->flags = (regs->flags& ~X86_EFLAGS_IOPL) | (level<< 12);

+ t->iopl = level<< 12;
+ set_iopl_mask(t->iopl);
+
return 0;
}

-asmlinkage long sys_iopl(unsigned int level)
+#ifdef CONFIG_X86_64
+asmlinkage long sys_iopl(unsigned int level, struct pt_regs *regs)
{
- struct thread_struct *t =&current->thread;
- struct pt_regs *regs = task_pt_regs(current);
- int rc;
-
- rc = do_iopl(level, regs);
- if (rc< 0)
- goto out;
-
- t->iopl = level<< 12;
- set_iopl_mask(t->iopl);
-out:
- return rc;
+ return do_iopl(level, regs);
+}
+#else
+asmlinkage long sys_iopl(struct pt_regs *regs)
+{
+ return do_iopl(regs->bx, regs);
}
+#endif

2009-12-09 18:39:38

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33



On Wed, 9 Dec 2009, H. Peter Anvin wrote:
> "Jeremy Fitzhardinge" <[email protected]> wrote:
> >
> >But I'm not sure I understand the objection to task_pt_regs(); is it
> >considered deprecated? This patch received quite a lot of discussion
> >with no mention of it. Should we consider all its uses as suspect?

I personally would be much happier without ever seeing a single
task_pt_regs() call outside of pure ptrace() users (and quite frankly,
even there I'd be happier if it was basically done once, and then 'struct
pt_regs' being passed around as an argument as much as possible).

In the case of 'ptrace' we control the stack of the tracee.

In other cases, we do _not_.

For example, think about us ever implementing 'tasklets' or async system
calls in general. It's _very_ possible that we'd have a special stolen
stack for them, and run the low-level system call function from that
stack. Then something like task_pt_regs() migth very well do the wrong
thing.

Now, I'm not saying that 'sys_iopl()' would ever be a valid target for
async system calls, but I _am_ saying that system calls that depend on
task_pt_regs() are fundamnetally fragile and broken, and have subtle
interactions.

In contrast, if you make it very explicit that the system call gets passed
a pointer to its pt_regs, then it still has all the same issues, but now
it's not subtle any more - now it's explicitly encoded in the call
sequence on both the caller and callee sides, and somebody doing tasklets
would hopefully see that "oh, iopl() has that same special thing that
fork/clone/vfork/execve has, and touches the stack frame register set
explicitly".

Linus

2009-12-09 18:48:02

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33



On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:
>
> How does this look?

I would actually prefer it if the calling convention was just made to
match on both x86 and x86-64. Wouldn't it be nice if they both just had

> +/* kernel/ioport.c */
> +asmlinkage long sys_iopl(unsigned int, struct pt_regs *);

as the prototype, and looked the same?

I realize that right now the 32-bit PTREGSCALL() thing doesn't support
that (very different macros for entry.S x86-32 and -64), but isn't that
just another thing we should try to fix too?

IOW, maybe something like this would be good, and would change the x86-32
calling convention to match the x86-64 one?

NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
don't remember, I didn't check, I'm just throwing this out as a "hey,
maybe something _like_ this can work" patch, and will be immediately
removing it from my machine after sending this email.

Linus

---
arch/x86/kernel/entry_32.S | 24 ++++++++++++------------
1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 50b9c22..22b4431 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -725,22 +725,22 @@ END(syscall_badsys)
/*
* System calls that need a pt_regs pointer.
*/
-#define PTREGSCALL(name) \
+#define PTREGSCALL(name, reg) \
ALIGN; \
ptregs_##name: \
- leal 4(%esp),%eax; \
+ leal 4(%esp),reg; \
jmp sys_##name;

-PTREGSCALL(iopl)
-PTREGSCALL(fork)
-PTREGSCALL(clone)
-PTREGSCALL(vfork)
-PTREGSCALL(execve)
-PTREGSCALL(sigaltstack)
-PTREGSCALL(sigreturn)
-PTREGSCALL(rt_sigreturn)
-PTREGSCALL(vm86)
-PTREGSCALL(vm86old)
+PTREGSCALL(iopl,%edx)
+PTREGSCALL(fork,%eax)
+PTREGSCALL(clone,%eax)
+PTREGSCALL(vfork,%eax)
+PTREGSCALL(execve,%eax)
+PTREGSCALL(sigaltstack,%eax)
+PTREGSCALL(sigreturn,%eax)
+PTREGSCALL(rt_sigreturn,%eax)
+PTREGSCALL(vm86,%eax)
+PTREGSCALL(vm86old,%eax)

.macro FIXUP_ESPFIX_STACK
/*

2009-12-09 18:50:10

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/09/2009 10:31 AM, Jeremy Fitzhardinge wrote:
>
> How does this look?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git fix-iopl
>
> From: Jeremy Fitzhardinge<[email protected]>
> Date: Wed, 9 Dec 2009 10:26:59 -0800
> Subject: [PATCH] x86: Make sys_iopl use passed-in pt_regs
>
> Rather than using task_pt_regs, use the pt_regs * passed into the syscall.
> The ABI differences are handled in small 32/64-bit specific functions,
> and everything else is handled in the common do_iopl().
>
> Signed-off-by: Jeremy Fitzhardinge<[email protected]>
>
> diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
> index 5336ce2..70497f0 100644
> --- a/arch/x86/include/asm/syscalls.h
> +++ b/arch/x86/include/asm/syscalls.h
> @@ -33,11 +33,11 @@ long sys_rt_sigreturn(struct pt_regs *);
> asmlinkage int sys_set_thread_area(struct user_desc __user *);
> asmlinkage int sys_get_thread_area(struct user_desc __user *);
>
> -/* kernel/ioport.c */
> -asmlinkage long sys_iopl(unsigned int);
> -
> /* X86_32 only */
> #ifdef CONFIG_X86_32
> +/* kernel/ioport.c */
> +asmlinkage long sys_iopl(struct pt_regs *);
> +
> /* kernel/process_32.c */

This is the main ugliness that led me to pass up on this in the first
place. What it really comes down to is that on 32 bits we need the
analogue with what 64 bits have with different stubs for different
number of arguments.

-hpa

2009-12-09 18:55:15

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/09/2009 10:47 AM, Linus Torvalds wrote:
>
> NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
> don't remember, I didn't check, I'm just throwing this out as a "hey,
> maybe something _like_ this can work" patch, and will be immediately
> removing it from my machine after sending this email.
>

The second argument is in %edx, but unlike 64 bits, it is not loaded
into that register a priory ("asmlinkage" means arguments are on the stack.)

As such, we need something looking like:

#define PTREGSCALL0(name) \
ptregs_##name: \
leal 4(%esp),%eax; \
jmp sys_##name

#define PTREGSCALL1(name) \
ptregs_##name: \
movl 4(%esp),%eax; \
leal 4(%esp),%edx; \
jmp sys_##name

#define PTREGSCALL2(name) \
ptregs_##name: \
movl 4(%esp),%eax; \
movl 8(%esp),%edx; \
leal 4(%esp),%ecx; \
jmp sys_##name

If we need more than two arguments + pt_regs, then we have to set up a
temporary stack frame.

[Sorry, I'm sitting in a meeting so I can't actually write up a real patch]

-hpa

2009-12-09 19:09:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33



On Wed, 9 Dec 2009, H. Peter Anvin wrote:
>
> The second argument is in %edx, but unlike 64 bits, it is not loaded
> into that register a priory ("asmlinkage" means arguments are on the stack.)

Oh, I missed the fact that we don't actually use asmlinkage on
sys_iopl() at all on x86-32 for this very reason.

And on x86-64, I think asmlinkage is a no-op, so that's ok - we should
just make the prototype be

long sys_iopl(long level, struct pt_regs *regs);

and your fancier macros.

Linus

2009-12-09 19:25:20

by Brian Gerst

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On Wed, Dec 9, 2009 at 1:54 PM, H. Peter Anvin <[email protected]> wrote:
> On 12/09/2009 10:47 AM, Linus Torvalds wrote:
>>
>> NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
>> don't remember, I didn't check, I'm just throwing this out as a "hey,
>> maybe something _like_ this can work" patch, and will be immediately
>> removing it from my machine after sending this email.
>>
>
> The second argument is in %edx, but unlike 64 bits, it is not loaded
> into that register a priory ("asmlinkage" means arguments are on the stack.)
>
> As such, we need something looking like:
>
> #define PTREGSCALL0(name)       \
> ptregs_##name:                  \
>        leal    4(%esp),%eax;   \
>        jmp sys_##name
>
> #define PTREGSCALL1(name)       \
> ptregs_##name:                  \
>        movl    4(%esp),%eax;   \
>        leal    4(%esp),%edx;   \
>        jmp sys_##name
>
> #define PTREGSCALL2(name)       \
> ptregs_##name:                  \
>        movl    4(%esp),%eax;   \
>        movl    8(%esp),%edx;   \
>        leal    4(%esp),%ecx;   \
>        jmp sys_##name
>
> If we need more than two arguments + pt_regs, then we have to set up a
> temporary stack frame.
>
> [Sorry, I'm sitting in a meeting so I can't actually write up a real patch]
>
>        -hpa

I had started working on a patchset like this when this issue came up
the first time. I'll get it up to date and send it out. It would be
nice if this could be done for all syscalls, so that
asmlinkage_protect() isn't needed.

--
Brian Gerst

2009-12-09 19:32:37

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/09/09 10:47, Linus Torvalds wrote:
>
> On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:
>
>> How does this look?
>>
> I would actually prefer it if the calling convention was just made to
> match on both x86 and x86-64. Wouldn't it be nice if they both just had
>
>
>> +/* kernel/ioport.c */
>> +asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
>>
> as the prototype, and looked the same?
>
> I realize that right now the 32-bit PTREGSCALL() thing doesn't support
> that (very different macros for entry.S x86-32 and -64), but isn't that
> just another thing we should try to fix too?
>
> IOW, maybe something like this would be good, and would change the x86-32
> calling convention to match the x86-64 one?
>
> NOTE NOTE NOTE! Totally untested. Is the second argument even in %edx? I
> don't remember, I didn't check, I'm just throwing this out as a "hey,
> maybe something _like_ this can work" patch, and will be immediately
> removing it from my machine after sending this email.
>

I came up with this:

From: Jeremy Fitzhardinge<[email protected]>
Date: Wed, 9 Dec 2009 11:17:52 -0800
Subject: [PATCH] x86/iopl: make 32bit iopl also get level argument

This makes it match the 64-bit prototype, and simplifies the whole thing.

Signed-off-by: Jeremy Fitzhardinge<[email protected]>

diff --git a/arch/x86/include/asm/syscalls.h b/arch/x86/include/asm/syscalls.h
index 70497f0..4e567d5 100644
--- a/arch/x86/include/asm/syscalls.h
+++ b/arch/x86/include/asm/syscalls.h
@@ -18,6 +18,7 @@
/* Common in X86_32 and X86_64 */
/* kernel/ioport.c */
asmlinkage long sys_ioperm(unsigned long, unsigned long, int);
+long sys_iopl(unsigned int, struct pt_regs *);

/* kernel/process.c */
int sys_fork(struct pt_regs *);
@@ -36,7 +37,6 @@ asmlinkage int sys_get_thread_area(struct user_desc __user *);
/* X86_32 only */
#ifdef CONFIG_X86_32
/* kernel/ioport.c */
-asmlinkage long sys_iopl(struct pt_regs *);

/* kernel/process_32.c */
int sys_clone(struct pt_regs *);
@@ -71,9 +71,6 @@ int sys_vm86(struct pt_regs *);

/* X86_64 only */

-/* kernel/ioport.c */
-asmlinkage long sys_iopl(unsigned int, struct pt_regs *);
-
/* kernel/process_64.c */
asmlinkage long sys_clone(unsigned long, unsigned long,
void __user *, void __user *,
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 50b9c22..737b81f 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -725,13 +725,15 @@ END(syscall_badsys)
/*
* System calls that need a pt_regs pointer.
*/
-#define PTREGSCALL(name) \
+#define PTREGSCALL_ARG(name,arg) \
ALIGN; \
ptregs_##name: \
- leal 4(%esp),%eax; \
+ leal 4(%esp),arg; \
jmp sys_##name;
-
-PTREGSCALL(iopl)
+#define PTREGSCALL(name) \
+ PTREGSCALL_ARG(name, %eax)
+
+PTREGSCALL_ARG(iopl,%edx)
PTREGSCALL(fork)
PTREGSCALL(clone)
PTREGSCALL(vfork)
diff --git a/arch/x86/kernel/ioport.c b/arch/x86/kernel/ioport.c
index b1cbac5..f435e42 100644
--- a/arch/x86/kernel/ioport.c
+++ b/arch/x86/kernel/ioport.c
@@ -103,7 +103,7 @@ asmlinkage long sys_ioperm(unsigned long from, unsigned long num, int turn_on)
* on system-call entry - see also fork() and the signal handling
* code.
*/
-static int do_iopl(unsigned int level, struct pt_regs *regs)
+long sys_iopl(unsigned int level, struct pt_regs *regs)
{
struct thread_struct *t =&current->thread;
unsigned int old = (regs->flags>> 12)& 3;
@@ -122,15 +122,3 @@ static int do_iopl(unsigned int level, struct pt_regs *regs)

return 0;
}
-
-#ifdef CONFIG_X86_64
-long sys_iopl(unsigned int level, struct pt_regs *regs)
-{
- return do_iopl(level, regs);
-}
-#else
-long sys_iopl(struct pt_regs *regs)
-{
- return do_iopl(regs->bx, regs);
-}
-#endif

2009-12-09 19:37:19

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/09/2009 11:25 AM, Brian Gerst wrote:
>
> I had started working on a patchset like this when this issue came up
> the first time. I'll get it up to date and send it out. It would be
> nice if this could be done for all syscalls, so that
> asmlinkage_protect() isn't needed.
>

asmlinkage_protect() is evil in the extreme and really should be killed.

In fact, I started working on a patchset to eliminate asmlinkage completely.

-hpa

2009-12-09 20:07:36

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33

On 12/09/2009 11:32 AM, Jeremy Fitzhardinge wrote:
> /*
> * System calls that need a pt_regs pointer.
> */
> -#define PTREGSCALL(name) \
> +#define PTREGSCALL_ARG(name,arg) \
> ALIGN; \
> ptregs_##name: \
> - leal 4(%esp),%eax; \
> + leal 4(%esp),arg; \
> jmp sys_##name;
> -

I believe this will not work for the reason outlined in the other post.

-hpa

2009-12-09 21:58:51

by Linus Torvalds

[permalink] [raw]
Subject: Re: [GIT PULL] x86/paravirt for v2.6.33



On Wed, 9 Dec 2009, Jeremy Fitzhardinge wrote:

> On 12/08/09 13:34, Linus Torvalds wrote:
> > I do _not_ want to add any more task_pt_regs() crap, please.
> >
> > Why? It's wrong for at least vm86 mode (and from kernel system calls).
> >
>
> Would the stack frame version work in these cases?

It would "work" in the sense that at least it wouldn't corrupt the "outer"
stack frame - it would only change the inner one. For vm86 mode, that
would actually matter (iopl is meaningful), but as Peter also said, I
don't think we actually allow direct system calls from vm86 mode.

For me it's actually more of a conceptual complaint: I really think
'task_pt_regs()' is only reliable for ptrace and is simply _wrong_ in
other situations. On other architectures, you literally need to set up the
stack _differently_ on the signal handling path - which is what ptrace
does - than on regular system call paths.

So conceptually, the system call stack layout is simply _different_ than
the ptrace stack. And I'd hate to have x86 code that teaches people to do
things that really don't work in general.

Linus