2004-06-30 02:44:44

by Jamie Lokier

Subject: A question about PROT_NONE on ARM and ARM26

Hi folks,

I'm doing a survey of the different architectural implementations of
PROT_* flags for mmap() and mprotect(). I'm looking at linux-2.6.5.

The ARM and ARM26 implementations are very similar to plain x86: read
implies exec, exec implies read and write implies read.

But I see a potential bug with PROT_NONE. I'm not sure if it's real,
so could you please confirm?

In include/asm-arm26/pgtable.h, I see this (reindented for mail):

#define PAGE_NONE \
        __pgprot(_PAGE_PRESENT | _PAGE_CLEAN | _PAGE_READONLY | _PAGE_NOT_USER)
#define PAGE_READONLY \
        __pgprot(_PAGE_PRESENT | _PAGE_CLEAN | _PAGE_READONLY)

In include/asm-arm/pgtable.h, I see this (reindented for mail):

#define _L_PTE_DEFAULT \
        L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_CACHEABLE | L_PTE_BUFFERABLE
#define _L_PTE_READ \
        L_PTE_USER | L_PTE_EXEC
#define PAGE_NONE \
        __pgprot(_L_PTE_DEFAULT)
#define PAGE_READONLY \
        __pgprot(_L_PTE_DEFAULT | _L_PTE_READ)

Apparently the difference between PAGE_NONE and PAGE_READONLY, in each
case, is that PAGE_NONE is not readable from userspace but _is_
readable from kernel space.

Therefore all user accesses to a PROT_NONE page will cause a fault.

My question is: if the _kernel_ reads a PROT_NONE page, will it fault?
It looks to me as though it won't.

This means that calling write() with a PROT_NONE region would succeed,
wouldn't it?

If so, this is a bug. A minor bug, perhaps, but nonetheless I wish to
document it.

I don't know if you would be able to rearrange the pte bits so that a
PROT_NONE page is not accessible to the kernel either. E.g. on i386
this is done by making PROT_NONE not set the hardware's present bit
but a different bit, and "pte_present()" tests both of those bits to
test the virtual present bit.
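
To make the i386 scheme described above concrete, here is a minimal
sketch that models a pte as a plain integer. The bit values and all
names (PTE_HW_PRESENT, PTE_PROTNONE, hw_present, sw_pte_present) are
assumptions picked for illustration; they are not copied from the
kernel headers.

```c
#include <stdint.h>

/* Hypothetical bit assignments, chosen for illustration only. */
#define PTE_HW_PRESENT 0x001u  /* the bit the MMU itself checks */
#define PTE_PROTNONE   0x080u  /* software-only marker for PROT_NONE */

/* What the hardware sees: a PROT_NONE pte looks "not present", so
   every access, from user or kernel mode, takes a fault. */
static int hw_present(uint32_t pte)
{
    return (pte & PTE_HW_PRESENT) != 0;
}

/* What the VM sees: the page exists if either bit is set. */
static int sw_pte_present(uint32_t pte)
{
    return (pte & (PTE_HW_PRESENT | PTE_PROTNONE)) != 0;
}
```

With this layout a PROT_NONE pte carries only PTE_PROTNONE, so
hw_present() is false while sw_pte_present() is still true.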

Thanks,
-- Jamie


2004-06-30 03:39:12

by William Lee Irwin III

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Wed, Jun 30, 2004 at 03:44:34AM +0100, Jamie Lokier wrote:
> Apparently the difference between PAGE_NONE and PAGE_READONLY, in each
> case, is that PAGE_NONE is not readable from userspace but _is_
> readable from kernel space.
> Therefore all user accesses to a PROT_NONE page will cause a fault.
> My question is: if the _kernel_ reads a PROT_NONE page, will it fault?
> It looks to me as though it won't.
> This means that calling write() with a PROT_NONE region would succeed,
> wouldn't it?
> If so, this is a bug. A minor bug, perhaps, but nonetheless I wish to
> document it.
> I don't know if you would be able to rearrange the pte bits so that a
> PROT_NONE page is not accessible to the kernel either. E.g. on i386
> this is done by making PROT_NONE not set the hardware's present bit
> but a different bit, and "pte_present()" tests both of those bits to
> test the virtual present bit.

It would be a bug if copy_to_user()/copy_from_user() failed to return
errors on attempted copies to/from areas with PROT_NONE protection.

I recommend writing a testcase and submitting it to LTP. I'll follow up
with an additional suggestion.

-- wli

2004-06-30 08:16:26

by Russell King

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Wed, Jun 30, 2004 at 03:44:34AM +0100, Jamie Lokier wrote:
> My question is: if the _kernel_ reads a PROT_NONE page, will it fault?
> It looks to me as though it won't.

There are two different types of privileged accesses on ARM. One is the
standard load/store instruction, which checks the permissions for the
current processor mode. The other is one which simulates a user mode
access to the address.

We use the latter for get_user/put_user/copy_to_user/copy_from_user.

> This means that calling write() with a PROT_NONE region would succeed,
> wouldn't it?

No, because the uaccess.h function will fault, and we'll end up returning
-EFAULT.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-06-30 14:59:50

by Jamie Lokier

Subject: Re: A question about PROT_NONE on ARM and ARM26

Russell King wrote:
> There are two different types of privileged accesses on ARM. One is the
> standard load/store instruction, which checks the permissions for the
> current processor mode. The other is one which simulates a user mode
> access to the address.
>
> We use the latter for get_user/put_user/copy_to_user/copy_from_user.
>
> > This means that calling write() with a PROT_NONE region would succeed,
> > wouldn't it?
>
> No, because the uaccess.h function will fault, and we'll end up returning
> -EFAULT.

Ok, that answers my question, thanks. ARM and ARM26 are fine with PROT_NONE.

Those are the "ldrlst" instructions in getuser.S, right?

Here's a question, for ARM only (not ARM26):
...........................................

getuser.S uses "ldrlst", but unlike ARM26 has no TASK_SIZE check and
matching "ldrge". If kernel C code uses set_fs(), then get_user()
_should_ permit reading from kernel addresses. Will that work on ARM?

I ask because it's interesting to see that ARM and ARM26 have quite
different code in getuser.S and putuser.S. The ARM code is shorter.

Here's an optimisation idea, for ARM26 only:
...........................................

Do you need the "strlst" instructions in putuser.S? They're followed
by "strge" instructions.

For storing, it looks as though the protections set in pgtable.h will
trigger a write fault whether it's a user mode access or not. Thus
you _might_ be able to shave an instruction or two off each put_user,
by simply doing a single unconditional kernel mode store. (The check
against TASK_SIZE has already been done).

Just an idea, I don't know ARM26 well enough to know if that'd work.

-- Jamie

2004-06-30 15:22:52

by Ian Molton

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Wed, 30 Jun 2004 15:59:42 +0100
Jamie Lokier wrote:

>
> Here's an optimisation idea, for ARM26 only:
> ...........................................
>
> Do you need the "strlst" instructions in putuser.S? They're followed
> by "strge" instructions.

ARM26 is special compared to some other architectures.

The CPU has a 64MB address space, and in all known ARM26 + MMU
configurations, the bottom 32MB are the logical addresses. The upper
32MB (where kernel, physical RAM (16MB max) and IO live) are physically
addressable ONLY.

The kernel isn't mapped into the virtual address space on ARM26. It could
be, but with only 512 logical pages maximum on a normal machine (1024 on
a machine with very little RAM) it would cripple the system even more
than it already is.

The tests in ARM26 determine whether to use a translated access or a
nontranslated one depending on whether we access kernel or user space.

2004-06-30 18:27:28

by Russell King

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Wed, Jun 30, 2004 at 03:59:42PM +0100, Jamie Lokier wrote:
> Russell King wrote:
> > There are two different types of privileged accesses on ARM. One is the
> > standard load/store instruction, which checks the permissions for the
> > current processor mode. The other is one which simulates a user mode
> > access to the address.
> >
> > We use the latter for get_user/put_user/copy_to_user/copy_from_user.
> >
> > > This means that calling write() with a PROT_NONE region would succeed,
> > > wouldn't it?
> >
> > No, because the uaccess.h function will fault, and we'll end up returning
> > -EFAULT.
>
> Ok, that answers my question, thanks. ARM and ARM26 are fine with PROT_NONE.
>
> Those are the "ldrlst" instructions in getuser.S, right?
>
> Here's a question, for ARM only (not ARM26):
> ...........................................
>
> getuser.S uses "ldrlst", but unlike ARM26 has no TASK_SIZE check and
> matching "ldrge". If kernel C code uses set_fs(), then get_user()
> _should_ permit reading from kernel addresses. Will that work on ARM?

Indeed it does - it's all magic. Firstly, let me explain "ldrlst". This
is "ldr" + "ls" + "t". "ldr" = load register. "ls" = unsigned lower or
same (all instructions are conditional on ARM.) "t" = the magic which
turns this access into a user mode access.

If the address is larger than the value in TI_ADDR_LIMIT, there's no
point in even trying the access - it will fail, so we just do the "bad
access" handling. This also happens if the instruction faults and the
fault can not be fixed up.

However, when we have set_fs(KERNEL_DS) in effect, we modify two things.
First is the TI_ADDR_LIMIT, which allows any access through the assembly
check. The other is the magic - we fiddle with the domain register.

Every translation has a "domain" index associated with it, and each
domain can be in one of three modes: no access, client or manager.

If it's in "no access" mode, nothing can access translations in this
domain. "client" mode means that the page level permissions are checked
and faults are generated depending on the access mode vs the permission
mode. "manager" means the page level permissions are not checked at
all, and any access will succeed irrespective of the page level
permissions.

We use three domains - one for user, one for kernel and one for IO.
Normally all three are in client mode. However, on set_fs(KERNEL_DS)
we switch the kernel domain to manager mode.

This means that the user-mode LDR instructions (ldrt / ldrlst etc)
will not have their page permissions checked, and therefore the access
will succeed - exactly as we require.
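
The three domain modes described above can be sketched as a small
decision function. This is only a model of the rule as stated in this
mail; the enum and function names are hypothetical, and page_permits
stands for whatever the page-level check would decide for the access
being attempted.

```c
/* Hypothetical names; models only the rule described in the text. */
enum domain_mode { DOMAIN_NOACCESS, DOMAIN_CLIENT, DOMAIN_MANAGER };

/* Returns 1 if the access goes through, 0 if it faults.
   page_permits: outcome of the page-level permission check. */
static int domain_allows(enum domain_mode mode, int page_permits)
{
    switch (mode) {
    case DOMAIN_NOACCESS:
        return 0;               /* nothing may touch this domain */
    case DOMAIN_CLIENT:
        return page_permits;    /* page permissions decide */
    case DOMAIN_MANAGER:
        return 1;               /* page permissions are ignored */
    }
    return 0;
}
```

In this model, switching the kernel domain from client to manager is
what lets a user-mode style load reach a kernel page that the page
permissions alone would refuse.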

> I ask because it's interesting to see that ARM and ARM26 have quite
> different code in getuser.S and putuser.S. The ARM code is shorter.

ARM26 is completely different - it doesn't have the ability to bypass
permission checks in the "kernel" area of memory. Therefore, ARM26
has to rely solely on the TI_ADDR_LIMIT check and select the appropriate
instruction to use based upon the succeeding address.

> Here's an optimisation idea, for ARM26 only:
> ...........................................
>
> Do you need the "strlst" instructions in putuser.S? They're followed
> by "strge" instructions.

The outcomes of the page permission checks are slightly different for the
strt vs str instructions in both ARM26 cases:

Privileged  T-bit  00   01    10         11
Y           0      r/w  r/w   r/w        r/w
Y           1      r/w  read  no access  no access
N           X      r/w  read  no access  no access

Note: if PAGE_NOT_USER and PAGE_OLD are both clear (iow, young + user
page) we use bit pattern 0x. If PAGE_NOT_USER, PAGE_OLD, PAGE_READONLY
and PAGE_CLEAN are all clear, we use bit pattern 00. Otherwise we use
bit pattern 11.

We have a similar difference in kernel-mode vs user-mode accesses for
the ARM case as well - so it's all complicated and unless you really
understand this... 8)

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-06-30 19:14:39

by Jamie Lokier

Subject: Re: A question about PROT_NONE on ARM and ARM26

Russell King wrote:
> We use three domains - one for user, one for kernel and one for IO.
> Normally all three are in client mode. However, on set_fs(KERNEL_DS)
> we switch the kernel domain to manager mode.
>
> This means that the user-mode LDR instructions (ldrt / ldrlst etc)
> will not have their page permissions checked, and therefore the access
> will succeed - exactly as we require.

Protection permissions (i.e. read-only, PROT_NONE) should still be
checked after set_fs(KERNEL_DS). It's only the kernel page vs. user
page distinction that should be relaxed.

From your description, it's not obvious that it'll do the right thing
in that circumstance.

Hopefully,

> [Tables]
> We have a similar difference in kernel-mode vs user-mode accesses for
> the ARM case as well - so it's all complicated and unless you really
> understand this... 8)

...this is alluding to a mechanism such that exactly the right thing
happens for PROT_NONE and PROT_READONLY pages after set_fs(KERNEL_DS), yes?

> Privileged  T-bit  00   01    10         11
> Y           0      r/w  r/w   r/w        r/w
> Y           1      r/w  read  no access  no access
> N           X      r/w  read  no access  no access
>
> Note: if PAGE_NOT_USER and PAGE_OLD are both clear (iow, young + user
> page) we use bit pattern 0x. If PAGE_NOT_USER, PAGE_OLD, PAGE_READONLY
> and PAGE_CLEAN are all clear, we use bit pattern 00. Otherwise we use
> bit pattern 11.

Ok, that explains nicely and should do the right thing on ARM26 with
PROT_NONE pages, even with set_fs(KERNEL_DS).

Because set_fs() is rarely used, I think you can optimise getuser.S
and putuser.S on ARM26. Instead of comparing the address against
TI_ADDR_LIMIT, compare it against the hard-coded userspace limit.

If that succeeds, continue with ldrt et al. Note the improvements in
the common case (fs == USER_DS and no fault): (1) you only compare
against one limit, not two; (2) no load of TI_ADDR_LIMIT; (3) one less
ldr instruction.

If that comparison fails, then branch to a version which checks
TI_ADDR_LIMIT.

Here's an example. It's probably wrong as I haven't written ARM in a
long time, but illustrates the idea. Note how the common case takes 4
instructions instead of 12 in the current code:

__get_user_4:
        cmp     r0, #0x02000000
4:      ldrlst  r1, [r0]
        movls   r0, #0
        movls   pc, lr
        bic     r1, sp, #0x1f00
        bic     r1, r1, #0x00ff
        str     lr, [sp, #-4]!
        ldr     r1, [r1, #TI_ADDR_LIMIT]
        sub     r1, r1, #4
        cmp     r0, r1
14:     ldrls   r1, [r0]
        movls   r0, #0
        ldmfdls sp!, {pc}^
        b       __get_user_bad

-- Jamie

2004-06-30 19:23:24

by Russell King

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Wed, Jun 30, 2004 at 08:14:28PM +0100, Jamie Lokier wrote:
> Russell King wrote:
> > We use three domains - one for user, one for kernel and one for IO.
> > Normally all three are in client mode. However, on set_fs(KERNEL_DS)
> > we switch the kernel domain to manager mode.
> >
> > This means that the user-mode LDR instructions (ldrt / ldrlst etc)
> > will not have their page permissions checked, and therefore the access
> > will succeed - exactly as we require.
>
> Protection permissions (i.e. read-only, PROT_NONE) should still be
> checked after set_fs(KERNEL_DS). It's only the kernel page vs. user
> page distinction that should be relaxed.
>
> From your description, it's not obvious that it'll do the right thing
> in that circumstance.

Trust me, it does. Unless you fully understand how the MMU and domains
work on ARM, you've little chance of working it out from the code.

Really, I see it's pointless trying to discuss the details of this any
further - I presently have very little time to educate people in the
details, sorry.

> Because set_fs() is rarely used, I think you can optimise getuser.S
> and putuser.S on ARM26. Instead of comparing the address against
> TI_ADDR_LIMIT, compare it against the hard-coded userspace limit.

Wrong. That means that if userspace passes an address above the hard
coded limit, we _WILL_ bypass all protections and access that memory.

However, ARM26 is not under my control anymore, so it isn't something
I care about, and I doubt there are many people who do. We're talking
about a 20 year old architecture which hasn't had any conforming devices
produced for at least 10 years.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-06-30 20:17:55

by Jamie Lokier

Subject: Re: A question about PROT_NONE on ARM and ARM26

Russell King wrote:
> Trust me, it does. Unless you fully understand how the MMU and domains
> work on ARM, you've little chance of working it out from the code.

Thanks, that's fine. I just wanted you to confirm PROT_NONE works
with set_fs(KERNEL_DS), as it's not apparent from your earlier
description. I don't need to know _how_ it works - I can read manuals
too - although your description was interesting.

> > Instead of comparing the address against TI_ADDR_LIMIT, compare it
> > against the hard-coded userspace limit.
>
> Wrong. That means that if userspace passes an address above the hard
> coded limit, we _WILL_ bypass all protections and access that memory.

No - it does check against TI_ADDR_LIMIT in the case that the address
is above the hard-coded limit, so prevents that.

The optimisation is valid on all architectures, actually, including
current ARM where it saves a few instructions in the common path.
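
As a sketch of the control flow being proposed (not of the real
assembly), the two-tier check looks like this in C. USER_LIMIT, the
enum and the function name are hypothetical, and the value 0xbf000000
is an assumed user-space limit used only for the example. A limit of 0
is used here to stand for "no limit", relying on unsigned wraparound
of the subtraction; that too is an assumption of this sketch.

```c
#include <stdint.h>

#define USER_LIMIT 0xbf000000u  /* assumed hard-coded user-space limit */

enum uaccess_path { PATH_FAST, PATH_SLOW, PATH_BAD };

/* Classify a 4-byte user access at addr.
   addr_limit: the per-thread limit that set_fs() changes. */
static enum uaccess_path classify(uint32_t addr, uint32_t addr_limit)
{
    /* Common case: one compare against a constant, no load. */
    if (addr <= USER_LIMIT - 4)
        return PATH_FAST;
    /* Rare case: fall back to the per-thread limit. */
    if (addr <= addr_limit - 4)
        return PATH_SLOW;
    return PATH_BAD;
}
```

An address above the hard-coded limit never reaches the fast path, so
it is still checked against the per-thread limit before any access.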

Here's the potential improvement to current 32-bit ARM. It's
4 instructions instead of 8 and one less load, in the common case:

__get_user_4:
        cmp     r0, #TASK_SIZE-4
4:      ldrlet  r1, [r0]
        movle   r0, #0
        movle   pc, lr
        bic     r1, sp, #0x1f00
        bic     r1, r1, #0x00ff
        ldr     r1, [r1, #TI_ADDR_LIMIT]
        sub     r1, r1, #4
        cmp     r0, r1
14:     ldrlet  r1, [r0]
        movle   r0, #0
        movle   pc, lr
        b       __get_user_bad

Finally, I think I see a bug in current ARM. Shouldn't this use
ldrlet instead of ldrlst? Think about accesses to addresses
TASK_SIZE-4 and 0xfffffffc.

        ldr     r1, [r1, #TI_ADDR_LIMIT]
        sub     r1, r1, #4
        cmp     r0, r1
4:      ldrlst  r1, [r0]

Thanks,
-- Jamie

2004-06-30 22:59:32

by Russell King

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Wed, Jun 30, 2004 at 09:15:46PM +0100, Jamie Lokier wrote:
> Russell King wrote:
> > Trust me, it does. Unless you fully understand how the MMU and domains
> > work on ARM, you've little chance of working it out from the code.
>
> Thanks, that's fine. I just wanted you to confirm PROT_NONE works
> with set_fs(KERNEL_DS), as it's not apparent from your earlier
> description. I don't need to know _how_ it works - I can read manuals
> too - although your description was interesting.

Ok, to fill in for just this bit, the domain covering user space mappings
always remains in "client" mode, so page protections are always checked.
PAGE_NONE does not have the "user" bit set, so both user space accesses
and ldrt/strt instructions will be unable to access the pages, which is
the desired behaviour.

However, plain ldr and str instructions will access the page, but
get_user/put_user doesn't use them, and copy_from_user/copy_to_user
are carefully crafted to ensure that we hit the necessary permission
checks for each page it touches on the first access.

> > > Instead of comparing the address against TI_ADDR_LIMIT, compare it
> > > against the hard-coded userspace limit.
> >
> > Wrong. That means that if userspace passes an address above the hard
> > coded limit, we _WILL_ bypass all protections and access that memory.
>
> No - it does check against TI_ADDR_LIMIT in the case that the address
> is above the hard-coded limit, so prevents that.

Ok.

> Here's the potential improvement to current 32-bit ARM. It's
> 4 instructions instead of 8 and one less load, in the common case:
>
> __get_user_4:
>         cmp     r0, #TASK_SIZE-4
> 4:      ldrlet  r1, [r0]
>         movle   r0, #0
>         movle   pc, lr
>         bic     r1, sp, #0x1f00
>         bic     r1, r1, #0x00ff
>         ldr     r1, [r1, #TI_ADDR_LIMIT]
>         sub     r1, r1, #4
>         cmp     r0, r1
> 14:     ldrlet  r1, [r0]
>         movle   r0, #0
>         movle   pc, lr
>         b       __get_user_bad

Ok, this could work, but there's one gotcha - TASK_SIZE-4 doesn't fit
in an 8-bit rotated constant, so we need 2 extra instructions:

__get_user_4:
        mov     r1, #TASK_SIZE
        sub     r1, r1, #4
        cmp     r0, r1
4:      ldrlet  r1, [r0]
        movle   r0, #0
        movle   pc, lr
        ...
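
For anyone who wants to check the "8-bit rotated constant" gotcha
above, here is a small sketch of the encodability rule for ARM
data-processing immediates: an 8-bit value rotated right by an even
amount. The function name is hypothetical, and the example value
0xbf000000 for TASK_SIZE is an assumption of the sketch, not something
stated in this thread.

```c
#include <stdint.h>

/* Returns 1 if v can be written as an 8-bit value rotated right by an
   even number of bits (0, 2, ..., 30), else 0. */
static int arm_imm_encodable(uint32_t v)
{
    unsigned rot;

    for (rot = 0; rot < 32; rot += 2) {
        /* Rotating left by rot undoes a rotate right by rot. */
        uint32_t r = rot ? ((v << rot) | (v >> (32 - rot))) : v;
        if (r <= 0xffu)
            return 1;
    }
    return 0;
}
```

Under the assumed TASK_SIZE, arm_imm_encodable(0xbf000000u) holds but
arm_imm_encodable(0xbf000000u - 4) does not, which is why the limit
has to be built with mov plus sub.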

> Finally, I think I see a bug in current ARM. Shouldn't this use
> ldrlet instead of ldrlst? Think about accesses to addresses
> TASK_SIZE-4 and 0xfffffffc.

LS = unsigned less than or same. LE = signed less than or equal. You
need the unsigned compare because addresses are unsigned.
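
The difference is easy to see in C, modelling the two conditions on
the result of "cmp a, b". The function names are hypothetical. The
casts to int32_t rely on the usual two's complement conversion, which
is implementation-defined in C but behaves as expected on common
compilers.

```c
#include <stdint.h>

/* LS: unsigned lower or same. */
static int cond_ls(uint32_t a, uint32_t b)
{
    return a <= b;
}

/* LE: signed less than or equal. */
static int cond_le(uint32_t a, uint32_t b)
{
    return (int32_t)a <= (int32_t)b;
}
```

With a limit such as 0xbefffffc, cond_ls(0x1000, 0xbefffffc) is true,
but cond_le(0x1000, 0xbefffffc) is false because the limit is negative
when read as a signed value.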

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-06-30 23:30:24

by Jamie Lokier

Subject: Re: A question about PROT_NONE on ARM and ARM26

Russell King wrote:
> > Here's the potential improvement to current 32-bit ARM. It's
> > 4 instructions instead of 8 and one less load, in the common case:
> >
> > __get_user_4:
> >         cmp     r0, #TASK_SIZE-4
> > 4:      ldrlet  r1, [r0]
> >         movle   r0, #0
> >         movle   pc, lr
> >         bic     r1, sp, #0x1f00
> >         bic     r1, r1, #0x00ff
> >         ldr     r1, [r1, #TI_ADDR_LIMIT]
> >         sub     r1, r1, #4
> >         cmp     r0, r1
> > 14:     ldrlet  r1, [r0]
> >         movle   r0, #0
> >         movle   pc, lr
> >         b       __get_user_bad
>
> Ok, this could work, but there's one gotcha - TASK_SIZE-4 doesn't fit
> in an 8-bit rotated constant, so we need 2 extra instructions:
>
> __get_user_4:
>         mov     r1, #TASK_SIZE
>         sub     r1, r1, #4
>         cmp     r0, r1
> 4:      ldrlet  r1, [r0]
>         movle   r0, #0
>         movle   pc, lr
>         ...

One more possibility:

cmp r0, #(TASK_SIZE - (1<<24))

I.e. just compare against the largest constant that can be
represented. For accesses to the last part of userspace, it's a
penalty of 4 instructions -- but it might work out to be a net gain.

Actually, since the shortest path is only three instructions in the
fast case, not counting control flow, it might be good to inline those
3 in uaccess.h, and change the "bl" to a conditional "blhi" there.

> > Finally, I think I see a bug in current ARM. Shouldn't this use
> > ldrlet instead of ldrlst? Think about accesses to addresses
> > TASK_SIZE-4 and 0xfffffffc.
>
> LS = unsigned less than or same. LE = signed less than or equal. You
> need the unsigned compare because addresses are unsigned.

Ah. I was guessing the mnemonic.

That's because of the way "ge" is used on ARM26 in places, which
therefore look buggy or subtly clever:

        ldr     r1, [r1, #TI_ADDR_LIMIT]
        sub     r1, r1, #4
        cmp     r0, r1
        bge     __get_user_bad
        cmp     r0, #0x02000000
4:      ldrlst  r1, [r0]
        ldrge   r1, [r0]

"ge" is a signed comparison, and unsigned is needed here, unless I
missed something subtle. So "bge" and "ldrge" should be "bhi" and "ldrhi".

Thanks,
-- Jamie

2004-06-30 23:49:19

by Ian Molton

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Thu, 1 Jul 2004 00:30:14 +0100
Jamie Lokier wrote:

> "ge" is a signed comparison, and unsigned is needed here, unless I
> missed something subtle. So "bge" and "ldrge" should be "bhi" and "ldrhi".

Technically, I think you're right here.

In practice, the arm26 address space is too small (64MB) for this to
ever cause a problem.

2004-07-01 01:05:31

by Nicolas Pitre

Subject: Re: A question about PROT_NONE on ARM and ARM26

On Thu, 1 Jul 2004, Jamie Lokier wrote:

> Russell King wrote:
> > Ok, this could work, but there's one gotcha - TASK_SIZE-4 doesn't fit
> > in an 8-bit rotated constant, so we need 2 extra instructions:
> >
> > __get_user_4:
> >         mov     r1, #TASK_SIZE
> >         sub     r1, r1, #4
> >         cmp     r0, r1
> > 4:      ldrlet  r1, [r0]
> >         movle   r0, #0
> >         movle   pc, lr
> >         ...
>
> One more possibility:
>
> cmp r0, #(TASK_SIZE - (1<<24))
>
> I.e. just compare against the largest constant that can be
> represented. For accesses to the last part of userspace, it's a
> penalty of 4 instructions -- but it might work out to be a net gain.

Maybe not. The user stack is located at the top so any user buffer
allocated on the stack would be penalized.


Nicolas

2004-07-01 03:35:35

by William Lee Irwin III

Subject: Re: Testing PROT_NONE and other protections, and a surprise

William Lee Irwin III wrote:
>> It would be a bug if copy_to_user()/copy_from_user() failed to return
>> errors on attempted copies to/from areas with PROT_NONE protection.
>> I recommend writing a testcase and submitting it to LTP. I'll follow up
>> with an additional suggestion.

On Thu, Jul 01, 2004 at 04:26:06AM +0100, Jamie Lokier wrote:
> I've just written a thorough test. The attached program tries every
> combination of PROT_* flags, and tells you what protection you really get.
> I don't know how tests get into LTP; perhaps I can leave that to you?
> When running it on i386, I got a *huge* surprise (to me). A
> PROT_WRITE-only page can sometimes fault on read or exec. This is the
> output:

This is unsurprising. The permissions can't be represented in pagetables,
but can opportunistically be enforced when exceptions are taken for other
reasons (e.g. TLB invalidations related to page replacement).


-- wli

2004-07-01 01:59:17

by Jamie Lokier

Subject: Re: A question about PROT_NONE on ARM and ARM26

Ian Molton wrote:
> > "ge" is a signed comparison, and unsigned is needed here, unless I
> > missed something subtle. So "bge" and "ldrge" should be "bhi" and "ldrhi".
>
> technically, I think you're right here.
>
> in practise, the arm26 address space is too small (64MB) for this to
> ever cause a problem.

No -- there is still a bug.

The bug is that userspace can pass an address like 0x90000000 to the
kernel. This is possible even on arm26.

If you follow the logic in getuser.S, it won't branch to
__get_user_bad, and it won't execute _either_ of the "ldrlst" or
"ldrge" instructions.

So it'll end up returning the value that happens to be in r1 and/or
r2, and using that for the syscall, instead of the syscall returning
-EFAULT as it should.

In rare cases, that's a security information leakage. Usually it's
just rubbish.
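
Here is a minimal C model of the arm26 sequence quoted earlier in the
thread (bge, then ldrlst, then ldrge), to show the fall-through. All
names are hypothetical, addr_limit is whatever TI_ADDR_LIMIT holds
(0x02000000 is used below purely as an assumed example), and the
int32_t casts assume the usual two's complement conversion.

```c
#include <stdint.h>

enum get_user_outcome { OUT_BAD, OUT_LDRT, OUT_LDR, OUT_NO_LOAD };

/* Which instruction, if any, ends up loading from addr.
   Simplification: when addr == 0x02000000 both conditional loads
   would execute; the model reports only the first. */
static enum get_user_outcome arm26_get_user_path(uint32_t addr,
                                                 uint32_t addr_limit)
{
    uint32_t lim = addr_limit - 4;

    if ((int32_t)addr >= (int32_t)lim)          /* bge __get_user_bad */
        return OUT_BAD;
    if (addr <= 0x02000000u)                    /* ldrlst: unsigned */
        return OUT_LDRT;
    if ((int32_t)addr >= (int32_t)0x02000000)   /* ldrge: signed */
        return OUT_LDR;
    return OUT_NO_LOAD;                         /* r1 left untouched */
}
```

In this model arm26_get_user_path(0x90000000, 0x02000000) yields
OUT_NO_LOAD: the address is negative as a signed value, so it passes
the bge check and then fails both load conditions.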

-- Jamie

2004-07-01 03:26:46

by Jamie Lokier

Subject: Testing PROT_NONE and other protections, and a surprise

William Lee Irwin III wrote:
> It would be a bug if copy_to_user()/copy_from_user() failed to return
> errors on attempted copies to/from areas with PROT_NONE protection.
>
> I recommend writing a testcase and submitting it to LTP. I'll follow up
> with an additional suggestion.

I've just written a thorough test. The attached program tries every
combination of PROT_* flags, and tells you what protection you really get.
I don't know how tests get into LTP; perhaps I can leave that to you?

When running it on i386, I got a *huge* surprise (to me). A
PROT_WRITE-only page can sometimes fault on read or exec. This is the
output:

Requested PROT | --- R-- -W- RW- --X R-X -WX RWX
========================================================================
MAP_SHARED | --- r-x !w! rwx r-x r-x rwx rwx
MAP_PRIVATE | --- r-x !w! rwx r-x r-x rwx rwx

The "!" means that a read or exec *sometimes* raises a signal.

(In general you cannot predict when it will or won't, because that can
depend on background paging decisions.)

Now, this makes complete sense when you think about how the page fault
path works. But it's quite surprising behaviour from an application
point of view. It's widely said that "PROT_WRITE implies PROT_READ on
i386" (and in fact on all architectures except IA64). This shows that
it isn't quite true.

This program should hopefully run on all architectures, however it
does depend on an empty function working when relocated. That might
not be the case. A file is written and then mapped, to ensure that
i-cache coherency isn't a problem.

It'll be interesting to see the results on other architectures.

-- Jamie

==== test_mmap_prot.c ====
/* Test actual page permissions for PROT_* combinations with mmap()

Copyright (C) 2004 Jamie Lokier

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA */


#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <stdio.h>
#include <setjmp.h>
#include <signal.h>

static sigjmp_buf buf;
static int fd;
static size_t size;

/* Hopefully this function is so simple it executes correctly even
   when relocated. */
void void_function (void) {}
void void_function_end (void) {}

/* Fault handler: jump back to the most recent sigsetjmp(). */
void sigsegv (int sig)
{
    (void) sig;
    siglongjmp (buf, 1);
}

/* Create a one-page file holding a copy of void_function, and
   install the fault handlers. */
void setup (void)
{
    char * buffer;
    size = getpagesize ();
    buffer = calloc (1, size); /* All zeros. */
    if (!buffer) {
        perror ("calloc");
        exit (1);
    }
    fd = open ("/tmp/test_mmap_file", O_CREAT | O_TRUNC | O_RDWR, 0666);
    if (fd == -1) {
        perror ("open");
        exit (1);
    }
    if (unlink ("/tmp/test_mmap_file") != 0) {
        perror ("unlink");
        exit (1);
    }
    memcpy (buffer, (const char *) &void_function,
            ((const char *) &void_function_end
             - (const char *) &void_function));
    if (write (fd, buffer, size) != (ssize_t) size) {
        perror ("write");
        exit (1);
    }
    signal (SIGSEGV, sigsegv);
    signal (SIGBUS, sigsegv); /* For those that don't use SIGSEGV... */
}

/* Map the test file with the requested protection bits. */
char * map (int private, int r, int w, int x)
{
    char * addr = mmap (0, size, ((r ? PROT_READ : 0)
                                  | (w ? PROT_WRITE : 0)
                                  | (x ? PROT_EXEC : 0)),
                        MAP_FILE | (private ? MAP_PRIVATE : MAP_SHARED),
                        fd, 0);
    if (addr == (char *) MAP_FAILED) {
        perror ("mmap");
        exit (1);
    }
    return addr;
}

/* Try read, write and exec ten times each and print what worked:
   a letter for always, '-' for never, '!' for sometimes. */
void test (int private, int R, int W, int X)
{
    int i, dummy;
    char * addr = map (private, R, W, X);
    int r = 0, w = 0, x = 0;

    for (i = 0; i < 10; i++) {
        /* Test read. */
        if (!sigsetjmp (buf, 1)) {
            dummy = *(volatile char *) addr;
            r++;
        }

        /* Ensure page is fresh, if necessary. */
        if (i == 0) {
            munmap (addr, size);
            addr = map (private, R, W, X);
        }

        /* Test write. */
        if (!sigsetjmp (buf, 1)) {
            /* Don't clobber the executable code. */
            *((volatile char *) (addr + size - 1)) = 1;
            w++;
        }

        /* Ensure page is fresh, if necessary. */
        if (i == 0) {
            munmap (addr, size);
            addr = map (private, R, W, X);
        }

        /* Test exec. */
        if (!sigsetjmp (buf, 1)) {
            ((void (*) (void)) addr) ();
            x++;
        }
    }
    munmap (addr, size);
    (void) dummy; /* Only read to force the access. */

    printf ("%c%c%c", (r == 10 ? 'r' : r == 0 ? '-' : '!'),
            (w == 10 ? 'w' : w == 0 ? '-' : '!'),
            (x == 10 ? 'x' : x == 0 ? '-' : '!'));
}

int main (void)
{
    int private, R, W, X;

    setup ();
    printf ("Requested PROT | --- R-- -W- RW- --X R-X -WX RWX\n"
            "========================================================================\n");

    for (private = 0; private <= 1; private++) {
        printf ("%s | ", private ? "MAP_PRIVATE" : "MAP_SHARED ");

        /* R varies fastest, matching the column order of the header. */
        for (X = 0; X <= 1; X++) {
            for (W = 0; W <= 1; W++) {
                for (R = 0; R <= 1; R++) {
                    if (R | W | X)
                        printf (" ");
                    test (private, R, W, X);
                }
            }
        }

        printf ("\n");
    }
    return 0;
}

2004-07-01 01:50:55

by Jamie Lokier

Subject: Re: A question about PROT_NONE on ARM and ARM26

Nicolas Pitre wrote:
> > cmp r0, #(TASK_SIZE - (1<<24))
> >
> > I.e. just compare against the largest constant that can be
> > represented. For accesses to the last part of userspace, it's a
> > penalty of 4 instructions -- but it might work out to be a net gain.
>
> Maybe not. The user stack is located at the top so any user buffer
> allocated on the stack would be penalized.

I agree. I don't know if it would work out to be a net gain on
average or a net loss.

It saves a couple of instructions, but when it fails the cost is only
a few instructions anyway.

Probably for get_user & put_user, the common case _is_ to be on the
user's stack, so Russell's code would be better.

-- Jamie