From: Thomas Gleixner <[email protected]>
In order to make the LDT mapping RO the access bit needs to be forced by
the kernel. Adjust the test case so it handles that gracefully.
Signed-off-by: Thomas Gleixner <[email protected]>
---
tools/testing/selftests/x86/ldt_gdt.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -122,8 +122,7 @@ static void check_valid_segment(uint16_t
* NB: Different Linux versions do different things with the
* accessed bit in set_thread_area().
*/
- if (ar != expected_ar &&
- (ldt || ar != (expected_ar | AR_ACCESSED))) {
+ if (ar != expected_ar && ar != (expected_ar | AR_ACCESSED)) {
printf("[FAIL]\t%s entry %hu has AR 0x%08X but expected 0x%08X\n",
(ldt ? "LDT" : "GDT"), index, ar, expected_ar);
nerrs++;
On Thu, Dec 14, 2017 at 3:27 AM, Peter Zijlstra <[email protected]> wrote:
> From: Thomas Gleixner <[email protected]>
>
> In order to make the LDT mapping RO the access bit needs to be forced by
> the kernel. Adjust the test case so it handles that gracefully.
If this turns out to need reverting because it breaks Wine or
something, we're really going to regret it.
On Thu, Dec 14, 2017 at 8:20 AM, Andy Lutomirski <[email protected]> wrote:
>
> If this turns out to need reverting because it breaks Wine or
> something, we're really going to regret it.
I really don't see that as very likely. We already play other (much
more fundamental) games with segments.
But I do agree that it would be good to consider this "turn LDT
read-only" a separate series just in case.
Linus
On Thu, Dec 14, 2017 at 11:43 AM, Linus Torvalds
<[email protected]> wrote:
> On Thu, Dec 14, 2017 at 8:20 AM, Andy Lutomirski <[email protected]> wrote:
>>
>> If this turns out to need reverting because it breaks Wine or
>> something, we're really going to regret it.
>
> I really don't see that as very likely. We already play other (much
> more fundamental) games with segments.
>
I dunno. Maybe Wine or DOSEMU apps expect to be able to create a
non-accessed segment and then read out the accessed bit using LAR or
modify_ldt() later.
> But I do agree that it would be good to consider this "turn LDT
> read-only" a separate series just in case.
Which kind of kills the whole thing. There's no way the idea of
putting the LDT in a VMA is okay if it's RW. You just get the kernel
to put_user() a call gate into it and it's game over.
I have a competing patch that just aliases the LDT high up in kernel
land and shares it in the user tables. I like a lot of the cleanups
in this series, but I don't like the actual LDT-in-a-VMA part.
On Thu, Dec 14, 2017 at 1:22 PM, Andy Lutomirski <[email protected]> wrote:
>
> Which kind of kills the whole thing. There's no way the idea of
> putting the LDT in a VMA is okay if it's RW.
Sure there is.
I really don't understand why you guys think it has to be RO.
All it has to be is not _user_ accessible. And that's a requirement
regardless, because no way in hell should users be able to read the
damn thing.
So it clearly needs to have the PAGE_USER bit clear (to avoid users
accessing it directly), and it needs to be marked somehow for
get_user_pages() to refuse it too, and access_ok() needs to fail it so
that we can't do get_user/put_user on it.
But the whole RO vs RW is not fundamentally critical.
Now, I do agree that RO is much much better in general, and it avoids
the requirement to play games with "access_ok()" and friends (assuming
we're just ok with users reading it), but I disagree with the whole
"this is fundamental".
Linus
On Thu, Dec 14, 2017 at 1:44 PM, Linus Torvalds
<[email protected]> wrote:
>
> So it clearly needs to have the PAGE_USER bit clear (to avoid users
> accessing it directly), and it needs to be marked somehow for
> get_user_pages() to refuse it too, and access_ok() needs to fail it so
> that we can't do get_user/put_user on it.
Actually, just clearing PAGE_USER should make gup avoid it automatically.
So really the only other thing it needs is to have access_ok() avoid
it so that the kernel can't be fooled into accessing it for the user.
That does probably mean having to put it at the top of the user
address space and playing games with user_addr_max(). Which is not
wonderful, but certainly not rocket surgery either.
Linus
On Thu, Dec 14, 2017 at 01:48:50PM -0800, Linus Torvalds wrote:
> Actually, just clearing PAGE_USER should make gup avoid it automatically.
_Should_ being the operative word, because I cannot currently see it
DTRT. But maybe I'm missing the obvious -- I tend to do that at times.
We don't appear to have pte_bad() and pte_access_permitted() is
currently all sorts of wrong.
> On Dec 14, 2017, at 1:48 PM, Linus Torvalds <[email protected]> wrote:
>
> On Thu, Dec 14, 2017 at 1:44 PM, Linus Torvalds
> <[email protected]> wrote:
>>
>> So it clearly needs to have the PAGE_USER bit clear (to avoid users
>> accessing it directly), and it needs to be marked somehow for
>> get_user_pages() to refuse it too, and access_ok() needs to fail it so
>> that we can't do get_user/put_user on it.
>
> Actually, just clearing PAGE_USER should make gup avoid it automatically.
>
> So really the only other thing it needs is to have access_ok() avoid
> it so that the kernel can't be fooled into accessing it for the user.
>
> That does probably mean having to put it at the top of the user
> address space and playing games with user_addr_max(). Which is not
> wonderful, but certainly not rocket surgery either.
That seems to rather defeat the point of using a VMA, though. And it means we still have to do a full cmp instead of just checking a sign bit in access_ok if we ever manage to kill set_fs().
Again, I have an apparently fully functional patch to alias the LDT at a high (kernel) address where we can cleanly map it in the user pagetables without any of this VMA stuff. It's much less code.
On Thu, Dec 14, 2017 at 2:02 PM, Peter Zijlstra <[email protected]> wrote:
>
> _Should_ being the operative word, because I cannot currently see it
> DTRT. But maybe I'm missing the obvious -- I tend to do that at times.
At least the old get_user_pages_fast() code used to check the USER bit:
unsigned long need_pte_bits = _PAGE_PRESENT|_PAGE_USER;
if (write)
need_pte_bits |= _PAGE_RW;
but that may have been lost when we converted over to the generic code.
It shouldn't actually _matter_, since we'd need to change access_ok()
anyway (and gup had better check that!)
Linus
On Thu, Dec 14, 2017 at 2:11 PM, Andy Lutomirski <[email protected]> wrote:
>
> That seems to rather defeat the point of using a VMA, though.
There never was any point in using a VMA per se.
The point was always to just map the damn thing in the user page
tables, wasn't it?
The vma bit was just an implementation detail.
Linus
On Thu, 14 Dec 2017, Linus Torvalds wrote:
> On Thu, Dec 14, 2017 at 1:22 PM, Andy Lutomirski <[email protected]> wrote:
> >
> > Which kind of kills the whole thing. There's no way the idea of
> > putting the LDT in a VMA is okay if it's RW.
>
> Sure there is.
>
> I really don't understand why you guys think it has to be RO.
>
> All it has to be is not _user_ accessible. And that's a requirement
> regardless, because no way in hell should users be able to read the
> damn thing.
The user knows the LDT contents because he put it there and it can be read
via modify_ldt(0, ) anyway. Or am I misunderstanding what you are trying to
say?
Thanks,
tglx
On Thu, Dec 14, 2017 at 02:14:00PM -0800, Linus Torvalds wrote:
> On Thu, Dec 14, 2017 at 2:02 PM, Peter Zijlstra <[email protected]> wrote:
> >
> > _Should_ being the operative word, because I cannot currently see it
> > DTRT. But maybe I'm missing the obvious -- I tend to do that at times.
>
> At least the old get_user_pages_fast() code used to check the USER bit:
>
> unsigned long need_pte_bits = _PAGE_PRESENT|_PAGE_USER;
>
> if (write)
> need_pte_bits |= _PAGE_RW;
>
> but that may have been lost when we converted over to the generic code.
The generic gup_pte_range() has pte_access_permitted() (which has the
above test) in the right place.
> It shouldn't actually _matter_, since we'd need to change access_ok()
> anyway (and gup had better check that!)
get_user_pages_fast() (both of them) do indeed test access_ok(), but the
regular get_user_pages() does not, I suspect because it can operate on a
foreign mm.
And its the regular old get_user_pages() that's all sorts of broken wrt
!PAGE_USER too.
> On Dec 14, 2017, at 2:15 PM, Linus Torvalds <[email protected]> wrote:
>
>> On Thu, Dec 14, 2017 at 2:11 PM, Andy Lutomirski <[email protected]> wrote:
>>
>> That seems to rather defeat the point of using a VMA, though.
>
> There never was any point in using a VMA per se.
>
> The point was always to just map the damn thing in the user page
> tables, wasn't it?
>
> The vma bit was just an implementation detail.
And all this is why I dislike using a VMA. My patch puts it at a negative address. We could just as easily put it just above TASK_SIZE_MAX, but I'm a bit nervous about bugs that overrun an access_ok check by a small amount. IIRC I found one of those in the net code once, and I didn't look very hard.
On Thu, Dec 14, 2017 at 2:23 PM, Thomas Gleixner <[email protected]> wrote:
>
> The user knows the LDT contents because he put it there and it can be read
> via modify_ldt(0, ) anyway. Or am I misunderstanding what you are trying to
> say?
I don't think they are secret, it's more of a "if they can read it,
they can write it" kind of thing.
The whole "it should be RO" makes no sense. The first choice should be
"it should be inaccessible".
And that actually seems the _simpler_ choice.
Linus
On Thu, Dec 14, 2017 at 2:24 PM, Peter Zijlstra <[email protected]> wrote:
>
> get_user_pages_fast() (both of them) do indeed test access_ok(), but the
> regular get_user_pages() does not, I suspect because it can operate on a
> foreign mm.
That sounds wrong.
We actually had some very serious reasons why get_user_pages_fast()
needed to check access_ok().
I happen to forget what those reasons were, though.
My mind may be going.
But I think it was something like "you could walk off the page tables
because the undefined address range generates nonsensical values for
the pgd_offset() functions" etc.
But maybe the regular get_user_pages() has some other way to protect
against that.
Linus