2018-02-27 13:17:27

by Ilya Smith

Subject: [RFC PATCH] Randomization of address chosen by mmap.

This is more of a proof of concept. The current implementation doesn't randomize
the address returned by mmap. All the entropy ends with the choice of mmap_base_addr
at process creation. After that, mmap builds a very predictable layout of the
address space, which allows ASLR to be bypassed in many cases.
This patch randomizes the address on every mmap call.
It works well on 64-bit systems, but its use on 32-bit systems is not
recommended. The approach reuses the current implementation to simplify the
address search.

Here I would like to discuss this approach.

Signed-off-by: Ilya Smith <[email protected]>
---
include/linux/mm.h | 4 ++
mm/mmap.c | 171 +++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 175 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ad06d42adb1a..f81b6c8a0bc5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -25,6 +25,7 @@
#include <linux/err.h>
#include <linux/page_ref.h>
#include <linux/memremap.h>
+#include <linux/sched.h>

struct mempolicy;
struct anon_vma;
@@ -2253,6 +2254,7 @@ struct vm_unmapped_area_info {
unsigned long align_offset;
};

+extern unsigned long unmapped_area_random(struct vm_unmapped_area_info *info);
extern unsigned long unmapped_area(struct vm_unmapped_area_info *info);
extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info);

@@ -2268,6 +2270,8 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info);
static inline unsigned long
vm_unmapped_area(struct vm_unmapped_area_info *info)
{
+ if (current->flags & PF_RANDOMIZE)
+ return unmapped_area_random(info);
if (info->flags & VM_UNMAPPED_AREA_TOPDOWN)
return unmapped_area_topdown(info);
else
diff --git a/mm/mmap.c b/mm/mmap.c
index 9efdc021ad22..58110e065417 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -45,6 +45,7 @@
#include <linux/moduleparam.h>
#include <linux/pkeys.h>
#include <linux/oom.h>
+#include <linux/random.h>

#include <linux/uaccess.h>
#include <asm/cacheflush.h>
@@ -1780,6 +1781,176 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
return error;
}

+unsigned long unmapped_area_random(struct vm_unmapped_area_info *info)
+{
+ // first lets find right border with unmapped_area_topdown
+ struct mm_struct *mm = current->mm;
+ struct vm_area_struct *vma;
+ struct vm_area_struct *right_vma = 0;
+ unsigned long entropy;
+ unsigned int entropy_count;
+ unsigned long length, low_limit, high_limit, gap_start, gap_end;
+ unsigned long addr, low, high;
+
+ /* Adjust search length to account for worst case alignment overhead */
+ length = info->length + info->align_mask;
+ if (length < info->length)
+ return -ENOMEM;
+
+ /*
+ * Adjust search limits by the desired length.
+ * See implementation comment at top of unmapped_area().
+ */
+ gap_end = info->high_limit;
+ if (gap_end < length)
+ return -ENOMEM;
+ high_limit = gap_end - length;
+
+ info->low_limit = 0x10000;
+ if (info->low_limit > high_limit)
+ return -ENOMEM;
+ low_limit = info->low_limit + length;
+
+ /* Check highest gap, which does not precede any rbtree node */
+ gap_start = mm->highest_vm_end;
+ if (gap_start <= high_limit)
+ goto found;
+
+ /* Check if rbtree root looks promising */
+ if (RB_EMPTY_ROOT(&mm->mm_rb))
+ return -ENOMEM;
+ vma = rb_entry(mm->mm_rb.rb_node, struct vm_area_struct, vm_rb);
+ if (vma->rb_subtree_gap < length)
+ return -ENOMEM;
+
+ while (true) {
+ /* Visit right subtree if it looks promising */
+ gap_start = vma->vm_prev ? vm_end_gap(vma->vm_prev) : 0;
+ if (gap_start <= high_limit && vma->vm_rb.rb_right) {
+ struct vm_area_struct *right =
+ rb_entry(vma->vm_rb.rb_right,
+ struct vm_area_struct, vm_rb);
+ if (right->rb_subtree_gap >= length) {
+ vma = right;
+ continue;
+ }
+ }
+
+check_current_down:
+ /* Check if current node has a suitable gap */
+ gap_end = vm_start_gap(vma);
+ if (gap_end < low_limit)
+ return -ENOMEM;
+ if (gap_start <= high_limit &&
+ gap_end > gap_start && gap_end - gap_start >= length)
+ goto found;
+
+ /* Visit left subtree if it looks promising */
+ if (vma->vm_rb.rb_left) {
+ struct vm_area_struct *left =
+ rb_entry(vma->vm_rb.rb_left,
+ struct vm_area_struct, vm_rb);
+ if (left->rb_subtree_gap >= length) {
+ vma = left;
+ continue;
+ }
+ }
+
+ /* Go back up the rbtree to find next candidate node */
+ while (true) {
+ struct rb_node *prev = &vma->vm_rb;
+
+ if (!rb_parent(prev))
+ return -ENOMEM;
+ vma = rb_entry(rb_parent(prev),
+ struct vm_area_struct, vm_rb);
+ if (prev == vma->vm_rb.rb_right) {
+ gap_start = vma->vm_prev ?
+ vm_end_gap(vma->vm_prev) : 0;
+ goto check_current_down;
+ }
+ }
+ }
+
+found:
+ right_vma = vma;
+ low = gap_start;
+ high = gap_end - length;
+
+ entropy = get_random_long();
+ entropy_count = 0;
+
+ // from left node to right we check if node is fine and
+ // randomly select it.
+ vma = mm->mmap;
+ while (vma != right_vma) {
+ /* Visit left subtree if it looks promising */
+ gap_end = vm_start_gap(vma);
+ if (gap_end >= low_limit && vma->vm_rb.rb_left) {
+ struct vm_area_struct *left =
+ rb_entry(vma->vm_rb.rb_left,
+ struct vm_area_struct, vm_rb);
+ if (left->rb_subtree_gap >= length) {
+ vma = left;
+ continue;
+ }
+ }
+
+ gap_start = vma->vm_prev ? vm_end_gap(vma->vm_prev) : low_limit;
+check_current_up:
+ /* Check if current node has a suitable gap */
+ if (gap_start > high_limit)
+ break;
+ if (gap_end >= low_limit &&
+ gap_end > gap_start && gap_end - gap_start >= length) {
+ if (entropy & 1) {
+ low = gap_start;
+ high = gap_end - length;
+ }
+ entropy >>= 1;
+ if (++entropy_count == 64) {
+ entropy = get_random_long();
+ entropy_count = 0;
+ }
+ }
+
+ /* Visit right subtree if it looks promising */
+ if (vma->vm_rb.rb_right) {
+ struct vm_area_struct *right =
+ rb_entry(vma->vm_rb.rb_right,
+ struct vm_area_struct, vm_rb);
+ if (right->rb_subtree_gap >= length) {
+ vma = right;
+ continue;
+ }
+ }
+
+ /* Go back up the rbtree to find next candidate node */
+ while (true) {
+ struct rb_node *prev = &vma->vm_rb;
+
+ if (!rb_parent(prev))
+ BUG(); // this should not happen
+ vma = rb_entry(rb_parent(prev),
+ struct vm_area_struct, vm_rb);
+ if (prev == vma->vm_rb.rb_left) {
+ gap_start = vm_end_gap(vma->vm_prev);
+ gap_end = vm_start_gap(vma);
+ if (vma == right_vma)
+ break;
+ goto check_current_up;
+ }
+ }
+ }
+
+ if (high == low)
+ return low;
+
+ addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
+ addr = low + (addr << PAGE_SHIFT);
+ return addr;
+}
+
unsigned long unmapped_area(struct vm_unmapped_area_info *info)
{
/*
--
2.14.1
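
To see the effect of the patch from userspace, a small test program (not part of the patch, shown only for illustration) can print the addresses returned by consecutive anonymous mmap() calls; on a stock kernel they normally come out adjacent, while with this patch and PF_RANDOMIZE set the distances between them should vary:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
	/* Map a few anonymous regions and print where they landed. */
	for (int i = 0; i < 8; i++) {
		void *p = mmap(NULL, 16 * 4096, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return EXIT_FAILURE;
		}
		printf("mapping %d at %p\n", i, p);
	}
	return 0;
}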



2018-02-27 20:53:57

by Kees Cook

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Tue, Feb 27, 2018 at 5:13 AM, Ilya Smith <[email protected]> wrote:
> This is more of a proof of concept. The current implementation doesn't randomize
> the address returned by mmap. All the entropy ends with the choice of mmap_base_addr
> at process creation. After that, mmap builds a very predictable layout of the
> address space, which allows ASLR to be bypassed in many cases.

I'd like more details on the threat model here; if it's just a matter
of .so loading order, I wonder if load order randomization would get a
comparable level of uncertainty without the memory fragmentation,
like:
https://android-review.googlesource.com/c/platform/bionic/+/178130/2
If glibc, for example, could do this too, it would go a long way to
improving things. Obviously, it's not as extreme as loading stuff all
over the place, but it seems like the effect for an attack would be
similar. The search _area_ remains small, but the ordering wouldn't be
deterministic any more.
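
For reference, the load-order idea boils down to shuffling the list of needed libraries before they are mapped, so their relative order in memory stops being deterministic. A toy sketch (not the bionic implementation; the library names and the entropy source here are only illustrative):

#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Toy illustration of load-order randomization: shuffle the list of
 * needed libraries before mapping them, so their relative order in the
 * address space is no longer deterministic. */
static void shuffle(const char **v, size_t n)
{
	for (size_t i = n - 1; i > 0; i--) {	/* Fisher-Yates shuffle */
		size_t j = (size_t)random() % (i + 1);
		const char *tmp = v[i];
		v[i] = v[j];
		v[j] = tmp;
	}
}

int main(void)
{
	const char *libs[] = { "libc.so.6", "libm.so.6", "libpthread.so.0",
			       "libdl.so.2", "librt.so.1" };
	size_t n = sizeof(libs) / sizeof(libs[0]);

	srandom((unsigned)time(NULL));	/* a real linker would use better entropy */
	shuffle(libs, n);
	for (size_t i = 0; i < n; i++)
		printf("map %s\n", libs[i]);	/* then map/dlopen in this order */
	return 0;
}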

> This patch randomizes the address on every mmap call.
> It works well on 64-bit systems, but its use on 32-bit systems is not
> recommended. The approach reuses the current implementation to simplify the
> address search.

It would be worth spelling out the "not recommended" bit some more
too: this fragments the mmap space, which has some serious issues on
smaller address spaces if you get into a situation where you cannot
allocate a hole large enough between the other allocations.

>
> Here I would like to discuss this approach.
>
> Signed-off-by: Ilya Smith <[email protected]>
> ---
> include/linux/mm.h | 4 ++
> mm/mmap.c | 171 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 175 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index ad06d42adb1a..f81b6c8a0bc5 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -25,6 +25,7 @@
> #include <linux/err.h>
> #include <linux/page_ref.h>
> #include <linux/memremap.h>
> +#include <linux/sched.h>
>
> struct mempolicy;
> struct anon_vma;
> @@ -2253,6 +2254,7 @@ struct vm_unmapped_area_info {
> unsigned long align_offset;
> };
>
> +extern unsigned long unmapped_area_random(struct vm_unmapped_area_info *info);
> extern unsigned long unmapped_area(struct vm_unmapped_area_info *info);
> extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info);
>
> @@ -2268,6 +2270,8 @@ extern unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info);
> static inline unsigned long
> vm_unmapped_area(struct vm_unmapped_area_info *info)
> {
> + if (current->flags & PF_RANDOMIZE)
> + return unmapped_area_random(info);

I think this will need a larger knob -- doing this by default is
likely to break stuff, I'd imagine? Bikeshedding: I'm not sure if this
should be setting "3" for /proc/sys/kernel/randomize_va_space, or a
separate one like /proc/sys/mm/randomize_mmap_allocation.
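
For illustration, the gating described here might look roughly like the sketch below; a randomize_va_space level 3 (or a separate sysctl) is an assumption, not an existing interface:

static inline unsigned long
vm_unmapped_area(struct vm_unmapped_area_info *info)
{
	/* Hypothetical opt-in: only use the randomizing allocator when the
	 * process has PF_RANDOMIZE and the (assumed) sysctl level asks for it. */
	if ((current->flags & PF_RANDOMIZE) && randomize_va_space >= 3)
		return unmapped_area_random(info);
	if (info->flags & VM_UNMAPPED_AREA_TOPDOWN)
		return unmapped_area_topdown(info);
	return unmapped_area(info);
}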

> if (info->flags & VM_UNMAPPED_AREA_TOPDOWN)
> return unmapped_area_topdown(info);
> else
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 9efdc021ad22..58110e065417 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -45,6 +45,7 @@
> #include <linux/moduleparam.h>
> #include <linux/pkeys.h>
> #include <linux/oom.h>
> +#include <linux/random.h>
>
> #include <linux/uaccess.h>
> #include <asm/cacheflush.h>
> @@ -1780,6 +1781,176 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
> return error;
> }
>
> +unsigned long unmapped_area_random(struct vm_unmapped_area_info *info)
> +{
> + // first lets find right border with unmapped_area_topdown

Nit: kernel comments are /* */. (It's a good idea to run patches
through scripts/checkpatch.pl first.)

> + struct mm_struct *mm = current->mm;
> + struct vm_area_struct *vma;
> + struct vm_area_struct *right_vma = 0;
> + unsigned long entropy;
> + unsigned int entropy_count;
> + unsigned long length, low_limit, high_limit, gap_start, gap_end;
> + unsigned long addr, low, high;
> +
> + /* Adjust search length to account for worst case alignment overhead */
> + length = info->length + info->align_mask;
> + if (length < info->length)
> + return -ENOMEM;
> +
> + /*
> + * Adjust search limits by the desired length.
> + * See implementation comment at top of unmapped_area().
> + */
> + gap_end = info->high_limit;
> + if (gap_end < length)
> + return -ENOMEM;
> + high_limit = gap_end - length;
> +
> + info->low_limit = 0x10000;
> + if (info->low_limit > high_limit)
> + return -ENOMEM;
> + low_limit = info->low_limit + length;
> +
> + /* Check highest gap, which does not precede any rbtree node */
> + gap_start = mm->highest_vm_end;
> + if (gap_start <= high_limit)
> + goto found;
> +
> + /* Check if rbtree root looks promising */
> + if (RB_EMPTY_ROOT(&mm->mm_rb))
> + return -ENOMEM;
> + vma = rb_entry(mm->mm_rb.rb_node, struct vm_area_struct, vm_rb);
> + if (vma->rb_subtree_gap < length)
> + return -ENOMEM;
> +
> + while (true) {
> + /* Visit right subtree if it looks promising */
> + gap_start = vma->vm_prev ? vm_end_gap(vma->vm_prev) : 0;
> + if (gap_start <= high_limit && vma->vm_rb.rb_right) {
> + struct vm_area_struct *right =
> + rb_entry(vma->vm_rb.rb_right,
> + struct vm_area_struct, vm_rb);
> + if (right->rb_subtree_gap >= length) {
> + vma = right;
> + continue;
> + }
> + }
> +
> +check_current_down:
> + /* Check if current node has a suitable gap */
> + gap_end = vm_start_gap(vma);
> + if (gap_end < low_limit)
> + return -ENOMEM;
> + if (gap_start <= high_limit &&
> + gap_end > gap_start && gap_end - gap_start >= length)
> + goto found;
> +
> + /* Visit left subtree if it looks promising */
> + if (vma->vm_rb.rb_left) {
> + struct vm_area_struct *left =
> + rb_entry(vma->vm_rb.rb_left,
> + struct vm_area_struct, vm_rb);
> + if (left->rb_subtree_gap >= length) {
> + vma = left;
> + continue;
> + }
> + }
> +
> + /* Go back up the rbtree to find next candidate node */
> + while (true) {
> + struct rb_node *prev = &vma->vm_rb;
> +
> + if (!rb_parent(prev))
> + return -ENOMEM;
> + vma = rb_entry(rb_parent(prev),
> + struct vm_area_struct, vm_rb);
> + if (prev == vma->vm_rb.rb_right) {
> + gap_start = vma->vm_prev ?
> + vm_end_gap(vma->vm_prev) : 0;
> + goto check_current_down;
> + }
> + }
> + }

Everything from here up is identical to the existing
unmapped_area_topdown(), yes? This likely needs to be refactored
instead of copy/pasted, and adjust to handle both unmapped_area() and
unmapped_area_topdown().

> +
> +found:
> + right_vma = vma;
> + low = gap_start;
> + high = gap_end - length;
> +
> + entropy = get_random_long();
> + entropy_count = 0;
> +
> + // from left node to right we check if node is fine and
> + // randomly select it.
> + vma = mm->mmap;
> + while (vma != right_vma) {
> + /* Visit left subtree if it looks promising */
> + gap_end = vm_start_gap(vma);
> + if (gap_end >= low_limit && vma->vm_rb.rb_left) {
> + struct vm_area_struct *left =
> + rb_entry(vma->vm_rb.rb_left,
> + struct vm_area_struct, vm_rb);
> + if (left->rb_subtree_gap >= length) {
> + vma = left;
> + continue;
> + }
> + }
> +
> + gap_start = vma->vm_prev ? vm_end_gap(vma->vm_prev) : low_limit;
> +check_current_up:
> + /* Check if current node has a suitable gap */
> + if (gap_start > high_limit)
> + break;
> + if (gap_end >= low_limit &&
> + gap_end > gap_start && gap_end - gap_start >= length) {
> + if (entropy & 1) {
> + low = gap_start;
> + high = gap_end - length;
> + }
> + entropy >>= 1;
> + if (++entropy_count == 64) {
> + entropy = get_random_long();
> + entropy_count = 0;
> + }
> + }
> +
> + /* Visit right subtree if it looks promising */
> + if (vma->vm_rb.rb_right) {
> + struct vm_area_struct *right =
> + rb_entry(vma->vm_rb.rb_right,
> + struct vm_area_struct, vm_rb);
> + if (right->rb_subtree_gap >= length) {
> + vma = right;
> + continue;
> + }
> + }
> +
> + /* Go back up the rbtree to find next candidate node */
> + while (true) {
> + struct rb_node *prev = &vma->vm_rb;
> +
> + if (!rb_parent(prev))
> + BUG(); // this should not happen
> + vma = rb_entry(rb_parent(prev),
> + struct vm_area_struct, vm_rb);
> + if (prev == vma->vm_rb.rb_left) {
> + gap_start = vm_end_gap(vma->vm_prev);
> + gap_end = vm_start_gap(vma);
> + if (vma == right_vma)

mm/mmap.c: In function ‘unmapped_area_random’:
mm/mmap.c:1939:8: warning: ‘vma’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
if (vma == right_vma)
^

> + break;
> + goto check_current_up;
> + }
> + }
> + }

What are the two phases here? Could this second one get collapsed into
the first?

> +
> + if (high == low)
> + return low;
> +
> + addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
> + addr = low + (addr << PAGE_SHIFT);
> + return addr;
> +}
> +
> unsigned long unmapped_area(struct vm_unmapped_area_info *info)
> {
> /*

How large are the gaps intended to be? Looking at the gaps on
something like Xorg they differ a lot.

Otherwise, looks interesting!

-Kees

--
Kees Cook
Pixel Security

2018-02-27 21:32:57

by lazytyped

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.



On 2/27/18 9:52 PM, Kees Cook wrote:
> I'd like more details on the threat model here; if it's just a matter
> of .so loading order, I wonder if load order randomization would get a
> comparable level of uncertainty without the memory fragmentation,

This also seems to assume that leaking the address of one single library
isn't enough to mount a ROP attack to either gain enough privileges or
generate a primitive that can leak further information. Is this really
the case? Do you have some further data around this?


       -  twiz

2018-02-28 17:14:29

by Ilya Smith

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

Hello Kees,

Thanks for your time spent on that!

> On 27 Feb 2018, at 23:52, Kees Cook <[email protected]> wrote:
>
> I'd like more details on the threat model here; if it's just a matter
> of .so loading order, I wonder if load order randomization would get a
> comparable level of uncertainty without the memory fragmentation,
> like:
> https://android-review.googlesource.com/c/platform/bionic/+/178130/2
> If glibc, for example, could do this too, it would go a long way to
> improving things. Obviously, it's not as extreme as loading stuff all
> over the place, but it seems like the effect for an attack would be
> similar. The search _area_ remains small, but the ordering wouldn't be
> deterministic any more.
>

I’m afraid library load-order randomization wouldn’t help much; there are several
cases described in chapter 2 here:
http://www.openwall.com/lists/oss-security/2018/02/27/5
where it is still possible to bypass ASLR.

I agree library randomization is a good improvement, but after my patch
I don't think it adds much value. On my GitHub https://github.com/blackzert/aslur
I provide tests and will combine them into an 'all in one' chain later.

> It would be worth spelling out the "not recommended" bit some more
> too: this fragments the mmap space, which has some serious issues on
> smaller address spaces if you get into a situation where you cannot
> allocate a hole large enough between the other allocations.
>

I agree, that's the point.

>> vm_unmapped_area(struct vm_unmapped_area_info *info)
>> {
>> + if (current->flags & PF_RANDOMIZE)
>> + return unmapped_area_random(info);
>
> I think this will need a larger knob -- doing this by default is
> likely to break stuff, I'd imagine? Bikeshedding: I'm not sure if this
> should be setting "3" for /proc/sys/kernel/randomize_va_space, or a
> separate one like /proc/sys/mm/randomize_mmap_allocation.

I will improve it like you said. It looks like a better option.

>> + // first lets find right border with unmapped_area_topdown
>
> Nit: kernel comments are /* */. (It's a good idea to run patches
> through scripts/checkpatch.pl first.)
>

Sorry, I will fix it. Thanks!


>> + if (!rb_parent(prev))
>> + return -ENOMEM;
>> + vma = rb_entry(rb_parent(prev),
>> + struct vm_area_struct, vm_rb);
>> + if (prev == vma->vm_rb.rb_right) {
>> + gap_start = vma->vm_prev ?
>> + vm_end_gap(vma->vm_prev) : 0;
>> + goto check_current_down;
>> + }
>> + }
>> + }
>
> Everything from here up is identical to the existing
> unmapped_area_topdown(), yes? This likely needs to be refactored
> instead of copy/pasted, and adjust to handle both unmapped_area() and
> unmapped_area_topdown().
>

This part also keeps ‘right_vma’ as a border. If it is OK for the combined version
to return a vma struct, I’ll do it.

>> + /* Go back up the rbtree to find next candidate node */
>> + while (true) {
>> + struct rb_node *prev = &vma->vm_rb;
>> +
>> + if (!rb_parent(prev))
>> + BUG(); // this should not happen
>> + vma = rb_entry(rb_parent(prev),
>> + struct vm_area_struct, vm_rb);
>> + if (prev == vma->vm_rb.rb_left) {
>> + gap_start = vm_end_gap(vma->vm_prev);
>> + gap_end = vm_start_gap(vma);
>> + if (vma == right_vma)
>
> mm/mmap.c: In function ‘unmapped_area_random’:
> mm/mmap.c:1939:8: warning: ‘vma’ may be used uninitialized in this
> function [-Wmaybe-uninitialized]
> if (vma == right_vma)
> ^

Thanks, fixed!

>> + break;
>> + goto check_current_up;
>> + }
>> + }
>> + }
>
> What are the two phases here? Could this second one get collapsed into
> the first?
>

Let me explain.
1. We use the current implementation to find the highest suitable gap; remember its
bounding vma as ‘right_vma’.
2. We walk the tree starting from mm->mmap, which is the lowest vma.
3. We check whether the current vma's gap satisfies the length and the low/high constraints.
4. If so, we call random() to decide whether to choose it. This is how we randomly
choose the vma and the gap.
5. We walk the tree from the lowest vma to the highest, ignoring subtrees with smaller
gaps, until we reach ‘right_vma’.

Once we have found a gap, we randomly choose an address inside it.

>> + addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
>> + addr = low + (addr << PAGE_SHIFT);
>> + return addr;
>>
>
> How large are the gaps intended to be? Looking at the gaps on
> something like Xorg they differ a lot.

Sorry, I don't quite follow. What's the context? Did you try the patch, or what case do you mean?

Thanks,
Ilya




2018-02-28 18:34:59

by Matthew Wilcox

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Wed, Feb 28, 2018 at 08:13:00PM +0300, Ilya Smith wrote:
> > It would be worth spelling out the "not recommended" bit some more
> > too: this fragments the mmap space, which has some serious issues on
> > smaller address spaces if you get into a situation where you cannot
> > allocate a hole large enough between the other allocations.
> >
>
> I agree, that's the point.

Would it be worth randomising the address returned just ever so slightly?
ie instead of allocating exactly the next address, put in a guard hole
of (configurable, by default maybe) 1-15 pages? Is that enough extra
entropy to foil an interesting number of attacks, or do we need the full
randomise-the-address-space approach in order to be useful?


2018-02-28 19:56:18

by Kees Cook

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Wed, Feb 28, 2018 at 9:13 AM, Ilya Smith <[email protected]> wrote:
>> On 27 Feb 2018, at 23:52, Kees Cook <[email protected]> wrote:
>> What are the two phases here? Could this second one get collapsed into
>> the first?
>>
>
> Let me explain.
> 1. We use the current implementation to find the highest suitable gap; remember its
> bounding vma as ‘right_vma’.
> 2. We walk the tree starting from mm->mmap, which is the lowest vma.
> 3. We check whether the current vma's gap satisfies the length and the low/high constraints.
> 4. If so, we call random() to decide whether to choose it. This is how we randomly choose the vma and the gap.
> 5. We walk the tree from the lowest vma to the highest, ignoring subtrees with smaller gaps,
> until we reach ‘right_vma’.
>
> Once we have found a gap, we randomly choose an address inside it.
>
>>> + addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
>>> + addr = low + (addr << PAGE_SHIFT);
>>> + return addr;
>>>
>>
>> How large are the gaps intended to be? Looking at the gaps on
>> something like Xorg they differ a lot.
>
> Sorry, I don't quite follow. What's the context? Did you try the patch, or what case do you mean?

I was trying to understand the target entropy level, and I'm worried
it's a bit biased. For example, if the first allocation lands at 1/4th
of the memory space, the next allocation (IIUC) has a 50% chance of
falling on either side of it. If it goes on the small side, it then
has much less entropy than if it had gone on the other side. I think
this may be less entropy than choosing a random address and just
seeing if it fits or not. Dealing with collisions could be done either
by pushing the address until it doesn't collide or picking another
random address, etc. This is probably more expensive, though, since it
would need to walk the vma tree repeatedly. Anyway, I was ultimately
curious about your measured entropy and what alternatives you
considered.

-Kees

--
Kees Cook
Pixel Security

2018-02-28 21:03:49

by Daniel Micay

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

The option to add at least one guard page would be useful whether or
not it's tied to randomization. It's not feasible to do that in
userspace for mmap as a whole, only specific users of mmap like malloc
and it adds significant overhead vs. a kernel implementation. It could
optionally let you choose a minimum and maximum guard region size with
it picking random sizes if they're not equal. It's important for it to
be an enforced gap rather than something that can be filled in by
another allocation. It will obviously help a lot more when it's being
used with a hardened allocator designed to take advantage of this
rather than glibc malloc or jemalloc.
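
For comparison, the userspace workaround looks roughly like the sketch below (an illustration, not the allocator code referred to above): reserve an over-sized PROT_NONE region, then enable access only on the interior, at the cost of two system calls per allocation and only for callers that go through the wrapper:

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch of a userspace guard-page allocation: map size bytes plus
 * guard_pages trailing inaccessible pages.  Costs two syscalls per
 * allocation, and only helps code that actually uses this wrapper. */
static void *mmap_with_guard(size_t size, size_t guard_pages)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t total = size + guard_pages * page;

	void *base = mmap(NULL, total, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;
	if (mprotect(base, size, PROT_READ | PROT_WRITE) != 0) {
		munmap(base, total);
		return NULL;
	}
	return base;	/* bytes [size, total) remain an inaccessible guard */
}

int main(void)
{
	void *p = mmap_with_guard(4 * 4096, 1);
	printf("usable region at %p\n", p);
	return 0;
}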

I don't think it makes sense for the kernel to attempt mitigations to
hide libraries. The best way to do that is in userspace, by having the
linker reserve a large PROT_NONE region for mapping libraries (both at
initialization and for dlopen) including a random gap to act as a
separate ASLR base. If an attacker has library addresses, it's hard to
see much point in hiding the other libraries from them. It does make
sense to keep them from knowing the location of any executable code if
they leak non-library addresses. An isolated library region + gap is a
feature we implemented in CopperheadOS and it works well, although we
haven't ported it to Android 7.x or 8.x. I don't think the kernel can
bring much / anything to the table for it. It's inherently the
responsibility of libc to randomize the lower bits for secondary
stacks too.
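
A rough sketch of that linker-side scheme (purely illustrative; the region size, gap bound and helper names here are assumptions, not the CopperheadOS code): reserve one large PROT_NONE region up front, skip a random gap inside it, and carve library mappings out of the rest with MAP_FIXED so they get an ASLR base independent of other mmap activity:

#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/mman.h>

#define LIB_REGION_SIZE	(1UL << 30)	/* 1 GiB reservation, assumption */
#define PAGE		4096UL

static uint8_t *lib_region_next;
static uint8_t *lib_region_end;

/* Reserve the library region once, with a random gap acting as a
 * separate ASLR base for everything mapped inside it. */
static int lib_region_init(void)
{
	void *base = mmap(NULL, LIB_REGION_SIZE, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	if (base == MAP_FAILED)
		return -1;

	size_t gap = ((size_t)random() % 65536) * PAGE;	/* up to ~256 MiB */
	lib_region_next = (uint8_t *)base + gap;
	lib_region_end = (uint8_t *)base + LIB_REGION_SIZE;
	return 0;
}

/* Carve the next library mapping out of the reservation; MAP_FIXED over
 * our own PROT_NONE pages replaces them with the new mapping. */
static void *lib_region_alloc(size_t len, int prot, int flags, int fd, off_t off)
{
	len = (len + PAGE - 1) & ~(PAGE - 1);
	if (lib_region_next + len > lib_region_end)
		return MAP_FAILED;
	void *p = mmap(lib_region_next, len, prot, flags | MAP_FIXED, fd, off);
	if (p != MAP_FAILED)
		lib_region_next += len;
	return p;
}

int main(void)
{
	if (lib_region_init() != 0)
		return 1;
	/* A real linker would map library images; use anonymous memory here. */
	void *lib = lib_region_alloc(40 * PAGE, PROT_READ | PROT_WRITE,
				     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return lib == MAP_FAILED;
}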

Fine-grained randomized mmap isn't going to be used if it causes
unpredictable levels of fragmentation or has a high / unpredictable
performance cost. I don't think it makes sense to approach it
aggressively in a way that people can't use. The OpenBSD randomized
mmap is a fairly conservative implementation to avoid causing
excessive fragmentation. I think they do a bit more than adding random
gaps by switching between different 'pivots' but that isn't very high
benefit. The main benefit is having random bits of unmapped space all
over the heap when combined with their hardened allocator which
heavily uses small mmap mappings and has a fair bit of malloc-level
randomization (it's a bitmap / hash table based slab allocator using
4k regions with a page span cache and we use a port of it to Android
with added hardening features but we're missing the fine-grained mmap
rand it's meant to have underneath what it does itself).

The default vm.max_map_count = 65530 is also a major problem for doing
fine-grained mmap randomization of any kind and there's the 32-bit
reference count overflow issue on high memory machines with
max_map_count * pid_max which isn't resolved yet.

2018-03-01 13:53:47

by Ilya Smith

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.


> On 28 Feb 2018, at 22:54, Kees Cook <[email protected]> wrote:
>
> I was trying to understand the target entropy level, and I'm worried
> it's a bit biased. For example, if the first allocation lands at 1/4th
> of the memory space, the next allocation (IIUC) has a 50% chance of
> falling on either side of it. If it goes on the small side, it then
> has much less entropy than if it had gone on the other side. I think
> this may be less entropy than choosing a random address and just
> seeing if it fits or not. Dealing with collisions could be done either
> by pushing the address until it doesn't collide or picking another
> random address, etc. This is probably more expensive, though, since it
> would need to walk the vma tree repeatedly. Anyway, I was ultimately
> curious about your measured entropy and what alternatives you
> considered.

Let me start with the options we have here.
Suppose we need to choose a random address from the pool of free memory, and
suppose we keep an array of gaps sorted by gap size in descending order. First we
find the highest index that satisfies the requested length. For each suitable gap
(with a lower index) we count how many pages in that gap satisfy the request, and
compute the total count of pages that satisfy it. Then we take a random number
modulo that total and subtract each gap's count of suitable pages from it until the
remainder fits inside a gap; that gives us the gap and the offset inside it. Adding
the gap start to the offset yields a randomly chosen suitable address.
In this scheme we have to keep an array of gaps, and every time the address space
changes we have to keep that array consistent. That is a very big overhead on any
change.

Pure random selection looks really expensive. Let's try to improve on it.

We can't just choose a random address and retry again and again until we find a
free spot - that approach has non-deterministic behaviour; nobody knows when it
stops. The same goes for walking the tree in a random direction.

We can walk the tree, build an array of suitable gaps and choose something from
there. In my current approach (proof of concept) the length of that array is 1,
which is why the last gaps are chosen with higher probability - I agree. It is
possible to enlarge the array at the cost of some memory. For example, struct mm
could hold an array of 1024 gaps. We do the same: walk the tree and randomly fill
this array (everything locked under the mmap write semaphore). When we have filled
it or walked the whole tree, we choose a gap randomly. What do you think about it?
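
One more option worth mentioning (a sketch of a standard technique, weighted reservoir sampling, not something the patch currently does): a single pass over the suitable gaps can pick a start page exactly uniformly without keeping any gap array, by letting each gap replace the current candidate with probability proportional to how many start pages it offers:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Model gaps as (start pfn, page count).  Visit each suitable gap once,
 * keep a running total of usable start pages, and replace the current
 * candidate with probability usable/total.  The final pick is uniform
 * over all suitable start pages, with no gap array kept around. */
struct gap { unsigned long start_pfn, pages; };

static unsigned long pick_uniform(const struct gap *gaps, int n,
				  unsigned long need)
{
	unsigned long total = 0, choice = 0;
	int found = 0;

	for (int i = 0; i < n; i++) {
		if (gaps[i].pages < need)
			continue;
		unsigned long usable = gaps[i].pages - need + 1;
		total += usable;
		/* take this gap with probability usable / total */
		if ((unsigned long)random() % total < usable) {
			choice = gaps[i].start_pfn +
				 (unsigned long)random() % usable;
			found = 1;
		}
	}
	return found ? choice : 0;
}

int main(void)
{
	struct gap gaps[] = { { 0x1000, 16 }, { 0x9000, 4 }, { 0x20000, 64 } };

	srandom((unsigned)time(NULL));
	printf("chosen start pfn: 0x%lx\n",
	       pick_uniform(gaps, 3, 4));	/* request of 4 pages */
	return 0;
}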

Thanks,
Ilya




2018-03-02 07:19:05

by Fengguang Wu

Subject: 097eb0af45: kernel_BUG_at_mm/hugetlb.c

FYI, we noticed the following commit (built with gcc-7):

commit: 097eb0af45c0010f9d5cbbc5f623058b3a275950 ("Randomization of address chosen by mmap.")
url: https://github.com/0day-ci/linux/commits/Ilya-Smith/Randomization-of-address-chosen-by-mmap/20180302-092859
base: git://git.cmpxchg.org/linux-mmotm.git master

in testcase: trinity
with following parameters:

runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 1G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+------------------------------------------+------------+------------+
| | 745388a346 | 097eb0af45 |
+------------------------------------------+------------+------------+
| boot_successes | 6 | 9 |
| boot_failures | 0 | 4 |
| kernel_BUG_at_mm/hugetlb.c | 0 | 4 |
| invalid_opcode:#[##] | 0 | 4 |
| RIP:__unmap_hugepage_range | 0 | 4 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 4 |
+------------------------------------------+------------+------------+



[ 21.297686] kernel BUG at mm/hugetlb.c:3329!
[ 21.299026] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 21.300197] CPU: 1 PID: 507 Comm: trinity-c3 Not tainted 4.16.0-rc2-mm1-00153-g097eb0a #101
[ 21.304957] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 21.306766] RIP: 0010:__unmap_hugepage_range+0x5f/0x274
[ 21.308305] RSP: 0018:ffffa333c0bf7d20 EFLAGS: 00010206
[ 21.309410] RAX: 00000000001fffff RBX: ffff8d51ff3a1170 RCX: 0000000000000009
[ 21.310950] RDX: 00007f6e7bf10000 RSI: ffff8d51ff3a1170 RDI: ffffa333c0bf7df0
[ 21.312471] RBP: 00007f6e7c110000 R08: 0000000000000000 R09: 00007f6e7c110000
[ 21.313961] R10: ffffa333c0bf7cc0 R11: 0000000000000000 R12: 00007f6e7bf10000
[ 21.315541] R13: ffffa333c0bf7df0 R14: ffff8d51fe8e06f8 R15: ffffffffa4ad4d20
[ 21.317080] FS: 0000000000000000(0000) GS:ffff8d51f5800000(0000) knlGS:0000000000000000
[ 21.318828] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.320055] CR2: 0000560f12c38000 CR3: 000000002a816000 CR4: 00000000000006e0
[ 21.322177] DR0: 00007f66fb684000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.324102] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 21.325642] Call Trace:
[ 21.326268] __unmap_hugepage_range_final+0x9/0x13
[ 21.327314] unmap_single_vma+0x8d/0xcd
[ 21.328143] unmap_vmas+0x30/0x3d
[ 21.328840] exit_mmap+0x93/0x13d
[ 21.329553] mmput+0x64/0xe5
[ 21.330227] do_exit+0x3f1/0x995
[ 21.330908] do_group_exit+0xad/0xad
[ 21.331691] SyS_exit_group+0xb/0xb
[ 21.332450] do_syscall_64+0x6d/0x103
[ 21.333246] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 21.334358] RIP: 0033:0x6f45afc331c8
[ 21.335126] RSP: 002b:00007ffd436fcaa8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
[ 21.336525] RAX: ffffffffffffffda RBX: 4a4a4a4a4a4a4a4a RCX: 00006f45afc331c8
[ 21.337836] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[ 21.339366] RBP: 00006bd156196064 R08: 00000000000000e7 R09: ffffffffffffff98
[ 21.340895] R10: 0000000000000207 R11: 0000000000000202 R12: 0000000000000045
[ 21.342406] R13: 000000000000001a R14: 0000560f120153a0 R15: 00000000cccccccd
[ 21.343895] Code: 07 00 00 4c 8b 78 58 b8 00 10 00 00 41 8b 4f 08 48 d3 e0 f6 46 52 40 48 89 04 24 75 02 0f 0b 49 8b 47 10 48 f7 d0 48 85 d0 74 02 <0f> 0b 4c 85 c8 74 02 0f 0b 8b 04 24 48 8b 6e 40 49 89 fc 4c 89
[ 21.346557] RIP: __unmap_hugepage_range+0x5f/0x274 RSP: ffffa333c0bf7d20
[ 21.348945] 01 00 00 00 48 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 39 f2 07 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 21.348955]
[ 21.350744] ---[ end trace 685bd0bde9f67ae5 ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
lkp


Attachments:
(No filename) (4.45 kB)
config-4.16.0-rc2-mm1-00153-g097eb0a (130.57 kB)
job-script (3.83 kB)
dmesg.xz (15.32 kB)

2018-03-02 21:53:07

by Matthew Wilcox

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Fri, Mar 02, 2018 at 11:30:28PM +0300, Ilya Smith wrote:
> This is a really good question. Let's say we choose an address with a random-length
> guard hole, where the length is limited by some configuration as you described - for
> instance, 1MB. With the current implementation we may still fill this gap with small
> allocations smaller than 1MB. An attacker will build the attack on this predictable
> behaviour - he just needs to spray with 1MB chunks (or smaller ones, with some
> expectation). The attack is harder but not
> impossible.

Ah, I didn't mean that. I was thinking that we can change the
implementation to reserve 1-N pages after the end of the mapping.
So you can't map anything else in there, and any load/store into that
region will segfault.


2018-03-03 03:29:05

by Ilya Smith

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

> On 28 Feb 2018, at 21:33, Matthew Wilcox <[email protected]> wrote:
>
> On Wed, Feb 28, 2018 at 08:13:00PM +0300, Ilya Smith wrote:
>>> It would be worth spelling out the "not recommended" bit some more
>>> too: this fragments the mmap space, which has some serious issues on
>>> smaller address spaces if you get into a situation where you cannot
>>> allocate a hole large enough between the other allocations.
>>>
>>
>> I agree, that's the point.
>
> Would it be worth randomising the address returned just ever so slightly?
> ie instead of allocating exactly the next address, put in a guard hole
> of (configurable, by default maybe) 1-15 pages? Is that enough extra
> entropy to foil an interesting number of attacks, or do we need the full
> randomise-the-address-space approach in order to be useful?
>

This is a really good question. Let's say we choose an address with a random-length
guard hole, where the length is limited by some configuration as you described - for
instance, 1MB. With the current implementation we may still fill this gap with small
allocations smaller than 1MB. An attacker will build the attack on this predictable
behaviour - he just needs to spray with 1MB chunks (or smaller ones, with some
expectation). The attack is harder but not impossible.

Now let's say we increase this 1MB to 128MB. The attack is the same; the success
rate is lower and more regions are needed. Increase the value to 48 bits of entropy
and you get my patch (in some form ;))

I hope the full randomise-the-address-space approach will work for a long time.

Thanks,
Ilya


2018-03-03 14:00:11

by Ilya Smith

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

Hello Daniel, thanks for sharing your experience!

> On 1 Mar 2018, at 00:02, Daniel Micay <[email protected]> wrote:
>
> I don't think it makes sense for the kernel to attempt mitigations to
> hide libraries. The best way to do that is in userspace, by having the
> linker reserve a large PROT_NONE region for mapping libraries (both at
> initialization and for dlopen) including a random gap to act as a
> separate ASLR base.
Why is this the best approach, and what is the limit of this large region?
Let's think outside the box.
What you describe means you made a separate memory region for libraries without
changing the kernel. But the basic idea is that you have a separate region for libraries
only. You probably also want separate regions for each thread stack, for
mmapped files, shared memory, etc. That would protect memory regions of
different types from each other. It is impossible to implement this without
keeping track of the whole memory map, and that map must be safe from any leak attack to
prevent an ASLR bypass. The only way to do that is to implement it in the kernel
and provide different syscalls like uselib or allocstack, etc., which is
really hard in the current kernel implementation.

My approach was to hide memory regions from the attacker and from each other.

> If an attacker has library addresses, it's hard to
> see much point in hiding the other libraries from them.

In some cases the attacker has only one leak for the whole attack, and we should do our
best to make even that leak useless.

> It does make
> sense to keep them from knowing the location of any executable code if
> they leak non-library addresses. An isolated library region + gap is a
> feature we implemented in CopperheadOS and it works well, although we
> haven't ported it to Android 7.x or 8.x.
This is interesting to know and I would like to try to attack it, but it's out of the
scope of the current conversation.

> I don't think the kernel can
> bring much / anything to the table for it. It's inherently the
> responsibility of libc to randomize the lower bits for secondary
> stacks too.

I think every bit of a secondary stack address should be randomized, to give the
attacker as little information as we can.

> Fine-grained randomized mmap isn't going to be used if it causes
> unpredictable levels of fragmentation or has a high / unpredictable
> performance cost.

Let's pretend any chosen address is purely random and always satisfies the request.
At some point we fail to mmap a new chunk of size N. What does this mean? It means
that all regions of size N are occupied and we can't even find a place between them.
Now let's count the memory that is already allocated. Assume each of these occupied
regions holds at least one page, so the number of such regions is TASK_SIZE / N and
the total already allocated is at least PAGE_SIZE * TASK_SIZE / N. Now we can
calculate: TASK_SIZE is 2^48 bytes and PAGE_SIZE is 4096. If N is 1MB, the minimum
allocated memory is about 1TB, which is a very big number. If N is 256MB, we have
already consumed 4GB of memory, which is still fine. If N is 1GB, we have allocated
only 1GB, and that looks like a problem: having allocated just 1GB of memory, we
can't mmap a 1GB chunk. Sounds scary, but this is the absolute worst case, where we
consume only one page per 1GB region. In reality the number would be much bigger
and random, according to this patch.
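
For concreteness, a quick way to check those numbers (TASK_SIZE taken as 2^48 as above; this is just arithmetic, not kernel code):

#include <stdio.h>

/* Quick check of the estimate above: with at least one resident page per
 * N-sized region, a failing mmap of size N implies roughly
 * PAGE_SIZE * TASK_SIZE / N bytes already allocated. */
int main(void)
{
	const unsigned long long task_size = 1ULL << 48, page = 4096;
	const unsigned long long n[] = { 1ULL << 20, 256ULL << 20, 1ULL << 30 };

	for (int i = 0; i < 3; i++)
		printf("N = %10llu bytes -> at least %llu bytes in use\n",
		       n[i], page * task_size / n[i]);
	return 0;
}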

Here let's stop and think: suppose we know the application is going to consume that
much memory. The question then is whether we can protect it at all. The attacker
knows he has a good probability of guessing an address with read permissions, so in
this case ASLR may not work at all. For such applications we could turn off address
randomization or decrease the entropy level, since it won't help much anyway.

It would be good to know what performance costs you see here. Could you please tell
me?

> I don't think it makes sense to approach it
> aggressively in a way that people can't use. The OpenBSD randomized
> mmap is a fairly conservative implementation to avoid causing
> excessive fragmentation. I think they do a bit more than adding random
> gaps by switching between different 'pivots' but that isn't very high
> benefit. The main benefit is having random bits of unmapped space all
> over the heap when combined with their hardened allocator which
> heavily uses small mmap mappings and has a fair bit of malloc-level
> randomization (it's a bitmap / hash table based slab allocator using
> 4k regions with a page span cache and we use a port of it to Android
> with added hardening features but we're missing the fine-grained mmap
> rand it's meant to have underneath what it does itself).
>

So you think the OpenBSD implementation is even better? It seems like you like it
after all.

> The default vm.max_map_count = 65530 is also a major problem for doing
> fine-grained mmap randomization of any kind and there's the 32-bit
> reference count overflow issue on high memory machines with
> max_map_count * pid_max which isn't resolved yet.

I’ve read a thread about it. That one should be fixed anyway.

Thanks,
Ilya


2018-03-03 15:14:27

by Ilya Smith

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

> On 2 Mar 2018, at 23:48, Matthew Wilcox <[email protected]> wrote:
> Ah, I didn't mean that. I was thinking that we can change the
> implementation to reserve 1-N pages after the end of the mapping.
> So you can't map anything else in there, and any load/store into that
> region will segfault.
>

I’m afraid it will still allow many attacks. The formula for the new address would
be something like address_next = address_prev - mmap_size - random(N), as you suggested.
To prevent brute-force attacks, N should be big enough - more than 2^32, for
example. That 2^32 is just an example; right now I don't know the
exact value. What I'm trying to say is that the address computation formula
depends on a concrete, predictable address. In my scheme even address_prev is
chosen randomly.

Best regards,
Ilya

2018-03-03 21:01:48

by Daniel Micay

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On 3 March 2018 at 08:58, Ilya Smith <[email protected]> wrote:
> Hello Daniel, thanks for sharing your experience!
>
>> On 1 Mar 2018, at 00:02, Daniel Micay <[email protected]> wrote:
>>
>> I don't think it makes sense for the kernel to attempt mitigations to
>> hide libraries. The best way to do that is in userspace, by having the
>> linker reserve a large PROT_NONE region for mapping libraries (both at
>> initialization and for dlopen) including a random gap to act as a
>> separate ASLR base.
> Why is this the best approach, and what is the limit of this large region?
> Let's think outside the box.
> What you describe means you made a separate memory region for libraries without
> changing the kernel. But the basic idea is that you have a separate region for libraries
> only. You probably also want separate regions for each thread stack, for
> mmapped files, shared memory, etc. That would protect memory regions of
> different types from each other. It is impossible to implement this without
> keeping track of the whole memory map, and that map must be safe from any leak attack to
> prevent an ASLR bypass. The only way to do that is to implement it in the kernel
> and provide different syscalls like uselib or allocstack, etc., which is
> really hard in the current kernel implementation.

There's the option of reserving PROT_NONE regions and managing memory
within them using a similar best-fit allocation scheme to get separate
random bases. The kernel could offer something like that but it's
already possible to do it for libc mmap usage within libc as we did
for libraries.

The kernel's help is needed to cover non-libc users of mmap, i.e. not
the linker, malloc, etc. It's not possible for libc to assume that
everything goes through the libc mmap/mremap/munmap wrappers and it
would be a mess so I'm not saying the kernel doesn't have a part to
play. I'm only saying it makes sense to look at the whole picture and
if something can be done better in libc or the linker, to do it there
instead. There isn't an API for dividing stuff up into regions, so it
has to be done in userspace right now and I think it works a lot
better when it's an option.

>
> My approach was to hide memory regions from the attacker and from each other.
>
>> If an attacker has library addresses, it's hard to
>> see much point in hiding the other libraries from them.
>
> In some cases the attacker has only one leak for the whole attack, and we should do our
> best to make even that leak useless.
>
>> It does make
>> sense to keep them from knowing the location of any executable code if
>> they leak non-library addresses. An isolated library region + gap is a
>> feature we implemented in CopperheadOS and it works well, although we
>> haven't ported it to Android 7.x or 8.x.
> This is interesting to know and I would like to try to attack it, but it's out of the
> scope of the current conversation.

I don't think it's out-of-scope. There are different approaches to
this kind of finer-grained randomization and they can be done
together.

>> I don't think the kernel can
>> bring much / anything to the table for it. It's inherently the
>> responsibility of libc to randomize the lower bits for secondary
>> stacks too.
>
> I think every bit of a secondary stack address should be randomized, to give the
> attacker as little information as we can.

The issue is that the kernel is only providing a mapping so it can add
a random gap or randomize it in other ways but it's ultimately up to
libc and other userspace code to do randomization without those
mappings.

A malloc implementation is similarly going to request fairly large
mappings from the kernel to manage a bunch of stuff within them
itself. The kernel can't protect against stuff like heap spray attacks
very well all by itself. It definitely has a part to play in that but
is a small piece of it (unless the malloc impl actually manages
virtual memory regions itself, which is already done by
performance-oriented allocators for very different reasons).

>> Fine-grained randomized mmap isn't going to be used if it causes
>> unpredictable levels of fragmentation or has a high / unpredictable
>> performance cost.
>
> Let's pretend any chosen address is purely random and always satisfies the request.
> At some point we fail to mmap a new chunk of size N. What does this mean? It means
> that all regions of size N are occupied and we can't even find a place between them.
> Now let's count the memory that is already allocated. Assume each of these occupied
> regions holds at least one page, so the number of such regions is TASK_SIZE / N and
> the total already allocated is at least PAGE_SIZE * TASK_SIZE / N. Now we can
> calculate: TASK_SIZE is 2^48 bytes and PAGE_SIZE is 4096. If N is 1MB, the minimum
> allocated memory is about 1TB, which is a very big number. If N is 256MB, we have
> already consumed 4GB of memory, which is still fine. If N is 1GB, we have allocated
> only 1GB, and that looks like a problem: having allocated just 1GB of memory, we
> can't mmap a 1GB chunk. Sounds scary, but this is the absolute worst case, where we
> consume only one page per 1GB region. In reality the number would be much bigger
> and random, according to this patch.
>
> Here let's stop and think: suppose we know the application is going to consume that
> much memory. The question then is whether we can protect it at all. The attacker
> knows he has a good probability of guessing an address with read permissions, so in
> this case ASLR may not work at all. For such applications we could turn off address
> randomization or decrease the entropy level, since it won't help much anyway.
>
> It would be good to know what performance costs you see here. Could you please tell
> me?

Fragmenting the virtual address space means having more TLB cache
misses, etc. Spreading out the mappings more also increases memory
usage and overhead for anything tied to the number of VMAs, which is
hopefully all O(log n) where it matters but O(log n) doesn't mean
increasing `n` is free.

>> I don't think it makes sense to approach it
>> aggressively in a way that people can't use. The OpenBSD randomized
>> mmap is a fairly conservative implementation to avoid causing
>> excessive fragmentation. I think they do a bit more than adding random
>> gaps by switching between different 'pivots' but that isn't very high
>> benefit. The main benefit is having random bits of unmapped space all
>> over the heap when combined with their hardened allocator which
>> heavily uses small mmap mappings and has a fair bit of malloc-level
>> randomization (it's a bitmap / hash table based slab allocator using
>> 4k regions with a page span cache and we use a port of it to Android
>> with added hardening features but we're missing the fine-grained mmap
>> rand it's meant to have underneath what it does itself).
>>
>
> So you think the OpenBSD implementation is even better? It seems like you like it
> after all.

I think they found a good compromise between low fragmentation vs.
some security benefits.

The main thing I'd like to see is just the option to get a guarantee
of enforced gaps around mappings, without necessarily even having
randomization of the gap size. It's possible to add guard pages in
userspace but it adds overhead by doubling the number of system calls
to map memory (mmap PROT_NONE region, mprotect the inner portion to
PROT_READ|PROT_WRITE) and *everything* using mmap would need to
cooperate which is unrealistic.

>> The default vm.max_map_count = 65530 is also a major problem for doing
>> fine-grained mmap randomization of any kind and there's the 32-bit
>> reference count overflow issue on high memory machines with
>> max_map_count * pid_max which isn't resolved yet.
>
> I’ve read a thread about it. That one should be fixed anyway.
>
> Thanks,
> Ilya
>

Yeah, the correctness issue should definitely be fixed. The default
value would *really* need to be raised if the number of VMAs is greatly
increased by moving away from a performance / low-fragmentation focused
best-fit algorithm to one with randomization or enforced gaps.

2018-03-04 03:50:15

by Matthew Wilcox

Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Sat, Mar 03, 2018 at 04:00:45PM -0500, Daniel Micay wrote:
> The main thing I'd like to see is just the option to get a guarantee
> of enforced gaps around mappings, without necessarily even having
> randomization of the gap size. It's possible to add guard pages in
> userspace but it adds overhead by doubling the number of system calls
> to map memory (mmap PROT_NONE region, mprotect the inner portion to
> PROT_READ|PROT_WRITE) and *everything* using mmap would need to
> cooperate which is unrealistic.

So something like this?

To use it, OR in PROT_GUARD(n) to the PROT flags of mmap, and it should
pad the map by n pages. I haven't tested it, so I'm sure it's buggy,
but it seems like a fairly cheap way to give us padding after every
mapping.

Running it on an old kernel will result in no padding, so to see if it
worked or not, try mapping something immediately after it.

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4ef7fb1726ab..9da6df7f62fc 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2183,8 +2183,8 @@ extern int install_special_mapping(struct mm_struct *mm,
extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);

extern unsigned long mmap_region(struct file *file, unsigned long addr,
- unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
- struct list_head *uf);
+ unsigned long len, unsigned long pad_len, vm_flags_t vm_flags,
+ unsigned long pgoff, struct list_head *uf);
extern unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long len, unsigned long prot, unsigned long flags,
vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1c5dea402501..9c2b66fa0561 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -299,6 +299,7 @@ struct vm_area_struct {
struct mm_struct *vm_mm; /* The address space we belong to. */
pgprot_t vm_page_prot; /* Access permissions of this VMA. */
unsigned long vm_flags; /* Flags, see mm.h. */
+ unsigned int vm_guard; /* Number of trailing guard pages */

/*
* For areas with an address space and backing store,
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index f8b134f5608f..d88babdf97f9 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -12,6 +12,7 @@
#define PROT_EXEC 0x4 /* page can be executed */
#define PROT_SEM 0x8 /* page may be used for atomic ops */
#define PROT_NONE 0x0 /* page can not be accessed */
+#define PROT_GUARD(x) ((x) & 0xffff) << 4 /* guard pages */
#define PROT_GROWSDOWN 0x01000000 /* mprotect flag: extend change to start of growsdown vma */
#define PROT_GROWSUP 0x02000000 /* mprotect flag: extend change to end of growsup vma */

diff --git a/mm/memory.c b/mm/memory.c
index 1cfc4699db42..5b0f87afa0af 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4125,6 +4125,9 @@ int handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
flags & FAULT_FLAG_REMOTE))
return VM_FAULT_SIGSEGV;

+ if (DIV_ROUND_UP(vma->vm_end - address, PAGE_SIZE) < vma->vm_guard)
+ return VM_FAULT_SIGSEGV;
+
/*
* Enable the memcg OOM handling for faults triggered in user
* space. Kernel faults are handled more gracefully.
diff --git a/mm/mmap.c b/mm/mmap.c
index 575766ec02f8..b9844b810ee7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1433,6 +1433,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
unsigned long pgoff, unsigned long *populate,
struct list_head *uf)
{
+ unsigned int guard_len = ((prot >> 4) & 0xffff) << PAGE_SHIFT;
struct mm_struct *mm = current->mm;
int pkey = 0;

@@ -1458,6 +1459,8 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
len = PAGE_ALIGN(len);
if (!len)
return -ENOMEM;
+ if (len + guard_len < len)
+ return -ENOMEM;

/* offset overflow? */
if ((pgoff + (len >> PAGE_SHIFT)) < pgoff)
@@ -1472,7 +1475,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
/* Obtain the address to map to. we verify (or select) it and ensure
* that it represents a valid section of the address space.
*/
- addr = get_unmapped_area(file, addr, len, pgoff, flags);
+ addr = get_unmapped_area(file, addr, len + guard_len, pgoff, flags);
if (offset_in_page(addr))
return addr;

@@ -1591,7 +1594,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
vm_flags |= VM_NORESERVE;
}

- addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+ addr = mmap_region(file, addr, len, len + guard_len, vm_flags, pgoff, uf);
if (!IS_ERR_VALUE(addr) &&
((vm_flags & VM_LOCKED) ||
(flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1727,8 +1730,8 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
}

unsigned long mmap_region(struct file *file, unsigned long addr,
- unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
- struct list_head *uf)
+ unsigned long len, unsigned long pad_len, vm_flags_t vm_flags,
+ unsigned long pgoff, struct list_head *uf)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma, *prev;
@@ -1737,24 +1740,24 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
unsigned long charged = 0;

/* Check against address space limit. */
- if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT)) {
+ if (!may_expand_vm(mm, vm_flags, pad_len >> PAGE_SHIFT)) {
unsigned long nr_pages;

/*
* MAP_FIXED may remove pages of mappings that intersects with
* requested mapping. Account for the pages it would unmap.
*/
- nr_pages = count_vma_pages_range(mm, addr, addr + len);
+ nr_pages = count_vma_pages_range(mm, addr, addr + pad_len);

if (!may_expand_vm(mm, vm_flags,
- (len >> PAGE_SHIFT) - nr_pages))
+ (pad_len >> PAGE_SHIFT) - nr_pages))
return -ENOMEM;
}

/* Clear old maps */
- while (find_vma_links(mm, addr, addr + len, &prev, &rb_link,
+ while (find_vma_links(mm, addr, addr + pad_len, &prev, &rb_link,
&rb_parent)) {
- if (do_munmap(mm, addr, len, uf))
+ if (do_munmap(mm, addr, pad_len, uf))
return -ENOMEM;
}

@@ -1771,7 +1774,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
/*
* Can we just expand an old mapping?
*/
- vma = vma_merge(mm, prev, addr, addr + len, vm_flags,
+ vma = vma_merge(mm, prev, addr, addr + pad_len, vm_flags,
NULL, file, pgoff, NULL, NULL_VM_UFFD_CTX);
if (vma)
goto out;
@@ -1789,9 +1792,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,

vma->vm_mm = mm;
vma->vm_start = addr;
- vma->vm_end = addr + len;
+ vma->vm_end = addr + pad_len;
vma->vm_flags = vm_flags;
vma->vm_page_prot = vm_get_page_prot(vm_flags);
+ vma->vm_guard = (pad_len - len) >> PAGE_SHIFT;
vma->vm_pgoff = pgoff;
INIT_LIST_HEAD(&vma->anon_vma_chain);
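
For illustration, here is a minimal userspace sketch of how the padding above
could be exercised. PROT_GUARD(n) is the macro this patch proposes; the
encoding below (page count in bits 4..19 of prot) is only inferred from the
((prot >> 4) & 0xffff) decode in the do_mmap() hunk and is an assumption, not
the patch's actual header change:

#include <stdio.h>
#include <sys/mman.h>

/* Assumed encoding, mirroring the decode in do_mmap() above. */
#ifndef PROT_GUARD
#define PROT_GUARD(n)	(((n) & 0xffff) << 4)
#endif

int main(void)
{
	/* Map 16 pages (assuming 4 KiB pages) and ask for 4 pages of padding. */
	void *p = mmap(NULL, 16 * 4096,
		       PROT_READ | PROT_WRITE | PROT_GUARD(4),
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;

	/*
	 * An unpatched kernel silently ignores the extra prot bits, so the
	 * only way to see whether padding happened is to try mapping
	 * something immediately after this region.
	 */
	printf("mapped at %p\n", p);
	return 0;
}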


2018-03-04 21:48:54

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Sat, Mar 03, 2018 at 07:47:04PM -0800, Matthew Wilcox wrote:
> On Sat, Mar 03, 2018 at 04:00:45PM -0500, Daniel Micay wrote:
> > The main thing I'd like to see is just the option to get a guarantee
> > of enforced gaps around mappings, without necessarily even having
> > randomization of the gap size. It's possible to add guard pages in
> > userspace but it adds overhead by doubling the number of system calls
> > to map memory (mmap PROT_NONE region, mprotect the inner portion to
> > PROT_READ|PROT_WRITE) and *everything* using mmap would need to
> > cooperate which is unrealistic.
>
> So something like this?
>
> To use it, OR in PROT_GUARD(n) to the PROT flags of mmap, and it should
> pad the map by n pages. I haven't tested it, so I'm sure it's buggy,
> but it seems like a fairly cheap way to give us padding after every
> mapping.
>
> Running it on an old kernel will result in no padding, so to see if it
> worked or not, try mapping something immediately after it.

Thinking about this more ...

- When you call munmap, if you pass in the same (addr, length) that were
used for mmap, then it should unmap the guard pages as well (that
wasn't part of the patch, so it would have to be added)
- If 'addr' is higher than the mapped address, and length at least
reaches the end of the mapping, then I would expect the guard pages to
"move down" and be after the end of the newly-shortened mapping.
- If 'addr' is higher than the mapped address, and the length doesn't
reach the end of the old mapping, we split the old mapping into two.
I would expect the guard pages to apply to both mappings, insofar as
they'll fit. For an example, suppose we have a five-page mapping with
two guard pages (MMMMMGG), and then we unmap the fourth page. Now we
have a three-page mapping with one guard page followed immediately
by a one-page mapping with two guard pages (MMMGMGG).

I would say that mremap cannot change the number of guard pages.
Although I'm a little tempted to add an mremap flag to permit the mapping
to expand into the guard pages. That would give us a nice way to reserve
address space for a mapping we think is going to expand.
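
For concreteness, a sketch of those semantics in code (illustrative only: it
assumes the PROT_GUARD(n) encoding implied by the patch and, like the patch
itself, is untested):

#include <sys/mman.h>

#ifndef PROT_GUARD
#define PROT_GUARD(n)	(((n) & 0xffff) << 4)
#endif

int main(void)
{
	long page = 4096;	/* assuming 4 KiB pages for brevity */

	/* Five-page mapping plus two guard pages: MMMMMGG. */
	char *m = mmap(NULL, 5 * page,
		       PROT_READ | PROT_WRITE | PROT_GUARD(2),
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (m == MAP_FAILED)
		return 1;

	/*
	 * Unmap the fourth page.  Per the rules above, the VMA splits into
	 * a three-page mapping keeping one guard page, followed by a
	 * one-page mapping keeping the original two: MMMGMGG.
	 */
	return munmap(m + 3 * page, page) ? 1 : 0;
}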


2018-03-05 14:18:46

by Ilya Smith

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.


> On 4 Mar 2018, at 23:56, Matthew Wilcox <[email protected]> wrote:
> Thinking about this more ...
>
> - When you call munmap, if you pass in the same (addr, length) that were
> used for mmap, then it should unmap the guard pages as well (that
> wasn't part of the patch, so it would have to be added)
> - If 'addr' is higher than the mapped address, and length at least
> reaches the end of the mapping, then I would expect the guard pages to
> "move down" and be after the end of the newly-shortened mapping.
> - If 'addr' is higher than the mapped address, and the length doesn't
> reach the end of the old mapping, we split the old mapping into two.
> I would expect the guard pages to apply to both mappings, insofar as
> they'll fit. For an example, suppose we have a five-page mapping with
> two guard pages (MMMMMGG), and then we unmap the fourth page. Now we
> have a three-page mapping with one guard page followed immediately
> by a one-page mapping with two guard pages (MMMGMGG).

I’m analysing that approach and see much more problems:
- each time you call mmap like this, you still increase count of vmas as my
patch did
- now feature vma_merge shouldn’t work at all, until MAP_FIXED is set or
PROT_GUARD(0)
- the entropy you provide is like 16 bit, that is really not so hard to brute
- in your patch you don’t use vm_guard at address searching, I see many roots
of bugs here
- if you unmap/remap one page inside region, field vma_guard will show head
or tail pages for vma, not both; kernel don’t know how to handle it
- user mode now choose entropy with PROT_GUARD macro, where did he gets it?
User mode shouldn’t be responsible for entropy at all

I can’t understand what direction this conversation is going to. I was talking
about weak implementation in Linux kernel but got many comments about ASLR
should be implemented in user mode what is really weird to me.

I think it is possible to add GUARD pages into my implementations, but initially
problem was about entropy of address choosing. I would like to resolve it step by
step.

Thanks,
Ilya

2018-03-05 14:25:12

by Daniel Micay

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On 5 March 2018 at 08:09, Ilya Smith <[email protected]> wrote:
>
>> On 4 Mar 2018, at 23:56, Matthew Wilcox <[email protected]> wrote:
>> Thinking about this more ...
>>
>> - When you call munmap, if you pass in the same (addr, length) that were
>> used for mmap, then it should unmap the guard pages as well (that
>> wasn't part of the patch, so it would have to be added)
>> - If 'addr' is higher than the mapped address, and length at least
>> reaches the end of the mapping, then I would expect the guard pages to
>> "move down" and be after the end of the newly-shortened mapping.
>> - If 'addr' is higher than the mapped address, and the length doesn't
>> reach the end of the old mapping, we split the old mapping into two.
>> I would expect the guard pages to apply to both mappings, insofar as
>> they'll fit. For an example, suppose we have a five-page mapping with
>> two guard pages (MMMMMGG), and then we unmap the fourth page. Now we
>> have a three-page mapping with one guard page followed immediately
>> by a one-page mapping with two guard pages (MMMGMGG).
>
> I’m analysing that approach and see much more problems:
> - each time you call mmap like this, you still increase count of vmas as my
> patch did
> - now feature vma_merge shouldn’t work at all, until MAP_FIXED is set or
> PROT_GUARD(0)
> - the entropy you provide is like 16 bit, that is really not so hard to brute
> - in your patch you don’t use vm_guard at address searching, I see many roots
> of bugs here
> - if you unmap/remap one page inside region, field vma_guard will show head
> or tail pages for vma, not both; kernel don’t know how to handle it
> - user mode now choose entropy with PROT_GUARD macro, where did he gets it?
> User mode shouldn’t be responsible for entropy at all

I didn't suggest this as the way of implementing fine-grained
randomization but rather a small starting point for hardening address
space layout further. I don't think it should be tied to a mmap flag
but rather something like a personality flag or a global sysctl. It
doesn't need to be random at all to be valuable, and it's just a first
step. It doesn't mean there can't be switches between random pivots
like OpenBSD mmap, etc. I'm not so sure that randomly switching around
is going to result in isolating things very well though.

The VMA count issue is at least something very predictable with a
performance cost only for kernel operations.

> I can’t understand what direction this conversation is going to. I was talking
> about weak implementation in Linux kernel but got many comments about ASLR
> should be implemented in user mode what is really weird to me.

That's not what I said. I was saying that splitting things into
regions based on the type of allocation works really well and allows
for high entropy bases, but that the kernel can't really do that right
now. It could split up code that starts as PROT_EXEC into a region but
that's generally not how libraries are mapped in so it won't know
until mprotect which is obviously too late. Unless it had some kind of
type key passed from userspace, it can't really do that.

> I think it is possible to add GUARD pages into my implementations, but initially
> problem was about entropy of address choosing. I would like to resolve it step by
> step.

Starting with fairly aggressive fragmentation of the address space is
going to be a really hard sell. The costs of a very spread out address
space in terms of TLB misses, etc. are unclear. Starting with enforced
gaps (1 page) and randomization for those wouldn't rule out having
finer-grained randomization, like randomly switching between different
regions. This needs to be cheap enough that people want to enable it,
and the goals need to be clearly spelled out. The goal needs to be
clearer than "more randomization == good" and then accepting a high
performance cost for that.

I'm not dictating how things should be done, I don't have any say
about that. I'm just trying to discuss it.

2018-03-05 16:07:45

by Ilya Smith

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

> On 5 Mar 2018, at 17:23, Daniel Micay <[email protected]> wrote:
> I didn't suggest this as the way of implementing fine-grained
> randomization but rather a small starting point for hardening address
> space layout further. I don't think it should be tied to a mmap flag
> but rather something like a personality flag or a global sysctl. It
> doesn't need to be random at all to be valuable, and it's just a first
> step. It doesn't mean there can't be switches between random pivots
> like OpenBSD mmap, etc. I'm not so sure that randomly switching around
> is going to result in isolating things very well though.
>

Here I like the idea of Kees Cook:
> I think this will need a larger knob -- doing this by default is
> likely to break stuff, I'd imagine? Bikeshedding: I'm not sure if this
> should be setting "3" for /proc/sys/kernel/randomize_va_space, or a
> separate one like /proc/sys/mm/randomize_mmap_allocation.
I mean there should be a way to turn the randomization off, since some
applications really need huge amounts of memory.
If you have suggestions here, they would be really helpful to discuss.
I think one switch could be global, for system administrators, like
/proc/sys/mm/randomize_mmap_allocation, and another one could be an ioctl to
switch it off in case the application knows what it is doing.

I would like to implement it in v2 of the patch.
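
As a rough sketch of what the global switch could look like on the kernel side
(everything below is an assumption for discussion: the knob name follows Kees'
suggestion, while the variable name and the registration point are made up):

#include <linux/cache.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/sysctl.h>

/* 1 = randomize addresses chosen by mmap, 0 = legacy behaviour. */
static int sysctl_randomize_mmap_allocation __read_mostly = 1;

static int rnd_mmap_min;
static int rnd_mmap_max = 1;

static struct ctl_table randomize_mmap_table[] = {
	{
		.procname	= "randomize_mmap_allocation",
		.data		= &sysctl_randomize_mmap_allocation,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_dointvec_minmax,
		.extra1		= &rnd_mmap_min,
		.extra2		= &rnd_mmap_max,
	},
	{ }
};

static int __init randomize_mmap_sysctl_init(void)
{
	/* Shows up as /proc/sys/mm/randomize_mmap_allocation. */
	if (!register_sysctl("mm", randomize_mmap_table))
		return -ENOMEM;
	return 0;
}
late_initcall(randomize_mmap_sysctl_init);

A per-process opt-out could then be a prctl or ioctl that clears a flag checked
in the same place, but that part I leave for v2.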

>> I can’t understand what direction this conversation is going to. I was talking
>> about weak implementation in Linux kernel but got many comments about ASLR
>> should be implemented in user mode what is really weird to me.
>
> That's not what I said. I was saying that splitting things into
> regions based on the type of allocation works really well and allows
> for high entropy bases, but that the kernel can't really do that right
> now. It could split up code that starts as PROT_EXEC into a region but
> that's generally not how libraries are mapped in so it won't know
> until mprotect which is obviously too late. Unless it had some kind of
> type key passed from userspace, it can't really do that.

Yes, that's really true. I wrote about this earlier. This is the issue: the kernel
can't provide such an interface, which is why I try to get the maximum out of the
current mmap design. Maybe later we could split mmap into different actions for
the different types of memory it handles, but that will be a very long road, I think.

>> I think it is possible to add GUARD pages into my implementations, but initially
>> problem was about entropy of address choosing. I would like to resolve it step by
>> step.
>
> Starting with fairly aggressive fragmentation of the address space is
> going to be a really hard sell. The costs of a very spread out address
> space in terms of TLB misses, etc. are unclear. Starting with enforced
> gaps (1 page) and randomization for those wouldn't rule out having
> finer-grained randomization, like randomly switching between different
> regions. This needs to be cheap enough that people want to enable it,
> and the goals need to be clearly spelled out. The goal needs to be
> clearer than "more randomization == good" and then accepting a high
> performance cost for that.
>

I want to clarify. As far as I know, the TLB doesn't care about the distance
between pages, since it caches individual page translations. So in theory TLB
misses are not an issue here. I agree that I need to show the performance costs
here. I will, just give me some time please.

The enforced gaps, in my case:
+ addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
+ addr = low + (addr << PAGE_SHIFT);
but as you are saying, the entropy here should be reduced.

How about something like this:
+ addr = get_random_long() % min(((high - low) >> PAGE_SHIFT),
+                                 MAX_SECURE_GAP);
+ addr = high - (addr << PAGE_SHIFT);
where MAX_SECURE_GAP is configurable. Probably with sysctl.

How do you like it?
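
Roughly, as a kernel-side sketch (the function name here is illustrative;
MAX_SECURE_GAP counts pages and would come from the sysctl mentioned above):

#include <linux/kernel.h>	/* min() */
#include <linux/mm.h>		/* PAGE_SHIFT */
#include <linux/random.h>	/* get_random_long() */

static unsigned long choose_capped_gap_addr(unsigned long low,
					    unsigned long high,
					    unsigned long max_secure_gap)
{
	unsigned long span = (high - low) >> PAGE_SHIFT;
	unsigned long gap;

	if (!span)
		return high;

	/* Stay within max_secure_gap pages below the upper limit. */
	gap = get_random_long() % min(span, max_secure_gap);
	return high - (gap << PAGE_SHIFT);
}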

> I'm not dictating how things should be done, I don't have any say
> about that. I'm just trying to discuss it.

Sorry, and thanks for your involvement. I really appreciate it.

Thanks,
Ilya


2018-03-05 16:26:10

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Mon, Mar 05, 2018 at 04:09:31PM +0300, Ilya Smith wrote:
> > On 4 Mar 2018, at 23:56, Matthew Wilcox <[email protected]> wrote:
> > Thinking about this more ...
> >
> > - When you call munmap, if you pass in the same (addr, length) that were
> > used for mmap, then it should unmap the guard pages as well (that
> > wasn't part of the patch, so it would have to be added)
> > - If 'addr' is higher than the mapped address, and length at least
> > reaches the end of the mapping, then I would expect the guard pages to
> > "move down" and be after the end of the newly-shortened mapping.
> > - If 'addr' is higher than the mapped address, and the length doesn't
> > reach the end of the old mapping, we split the old mapping into two.
> > I would expect the guard pages to apply to both mappings, insofar as
> > they'll fit. For an example, suppose we have a five-page mapping with
> > two guard pages (MMMMMGG), and then we unmap the fourth page. Now we
> > have a three-page mapping with one guard page followed immediately
> > by a one-page mapping with two guard pages (MMMGMGG).
>
> I’m analysing that approach and see much more problems:
> - each time you call mmap like this, you still increase count of vmas as my
> patch did

Umm ... yes, each time you call mmap, you get a VMA. I'm not sure why
that's a problem with my patch. I was trying to solve the problem Daniel
pointed out, that mapping a guard region after each mmap cost twice as
many VMAs, and it solves that problem.

> - now feature vma_merge shouldn’t work at all, until MAP_FIXED is set or
> PROT_GUARD(0)

That's true.

> - the entropy you provide is like 16 bit, that is really not so hard to brute

It's 16 bits per mapping. I think that'll make enough attacks harder
to be worthwhile.

> - in your patch you don’t use vm_guard at address searching, I see many roots
> of bugs here

Don't need to. vm_end includes the guard pages.

> - if you unmap/remap one page inside region, field vma_guard will show head
> or tail pages for vma, not both; kernel don’t know how to handle it

There are no head pages. The guard pages are only placed after the real end.

> - user mode now choose entropy with PROT_GUARD macro, where did he gets it?
> User mode shouldn’t be responsible for entropy at all

I can't agree with that. The user has plenty of opportunities to get
randomness; from /dev/random is the easiest, but you could also do timing
attacks on your own cachelines, for example.
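
As a small illustration, userspace could grab its padding entropy itself; a
sketch (reading /dev/urandom here rather than /dev/random, purely for
illustration):

#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Returns up to 16 bits of randomness to pass to mmap() as PROT_GUARD(n). */
static uint16_t random_guard_pages(void)
{
	uint16_t n = 0;
	int fd = open("/dev/urandom", O_RDONLY);

	if (fd >= 0) {
		if (read(fd, &n, sizeof(n)) != sizeof(n))
			n = 0;
		close(fd);
	}
	return n;
}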


2018-03-05 19:28:44

by Ilya Smith

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

> On 5 Mar 2018, at 19:23, Matthew Wilcox <[email protected]> wrote:
>
> On Mon, Mar 05, 2018 at 04:09:31PM +0300, Ilya Smith wrote:
>>
>> I’m analysing that approach and see much more problems:
>> - each time you call mmap like this, you still increase count of vmas as my
>> patch did
>
> Umm ... yes, each time you call mmap, you get a VMA. I'm not sure why
> that's a problem with my patch. I was trying to solve the problem Daniel
> pointed out, that mapping a guard region after each mmap cost twice as
> many VMAs, and it solves that problem.
>
The issue was the VMA count, as Daniel mentioned: the more VMAs there are,
the more expensive the tree walk. I think this is fine.

>> - the entropy you provide is like 16 bit, that is really not so hard to brute
>
> It's 16 bits per mapping. I think that'll make enough attacks harder
> to be worthwhile.

Well yes, its ok, sorry. I just would like to have 32 bit entropy maximum some day :)

>> - in your patch you don’t use vm_guard at address searching, I see many roots
>> of bugs here
>
> Don't need to. vm_end includes the guard pages.
>
>> - if you unmap/remap one page inside region, field vma_guard will show head
>> or tail pages for vma, not both; kernel don’t know how to handle it
>
> There are no head pages. The guard pages are only placed after the real end.
>

Ok, we have MG where G = vm_guard, right? so when you do vm_split,
you may come to situation - m1g1m2G, how to handle it? I mean when M is
split with only one page inside this region. How to handle it?

>> - user mode now choose entropy with PROT_GUARD macro, where did he gets it?
>> User mode shouldn’t be responsible for entropy at all
>
> I can't agree with that. The user has plenty of opportunities to get
> randomness; from /dev/random is the easiest, but you could also do timing
> attacks on your own cachelines, for example.

I think the usual case to use randomization for any mmap or not use it at all
for whole process. So here I think would be nice to have some variable
changeable with sysctl (root only) and ioctl (for greedy processes).

Well, let me summary:
My approach chose random gap inside gap range with following strings:

+ addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
+ addr = low + (addr << PAGE_SHIFT);

Could be improved limiting maximum possible entropy in this shift.
To prevent situation when attacker may massage allocations and
predict chosen address, I randomly choose memory region. I’m still
like my idea, but not going to push it anymore, since you have yours now.

Your idea just provide random non-mappable and non-accessable offset
from best-fit region. This consumes memory (1GB gap if random value
is 0xffff). But it works and should work faster and should resolve the issue.

My point was that current implementation need to be changed and you
have your own approach for that. :)
Lets keep mine in the mind till better times (or worse?) ;)
Will you finish your approach and upstream it?

Best regards,
Ilya


2018-03-05 19:48:48

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

On Mon, Mar 05, 2018 at 10:27:32PM +0300, Ilya Smith wrote:
> > On 5 Mar 2018, at 19:23, Matthew Wilcox <[email protected]> wrote:
> > On Mon, Mar 05, 2018 at 04:09:31PM +0300, Ilya Smith wrote:
> >> I’m analysing that approach and see much more problems:
> >> - each time you call mmap like this, you still increase count of vmas as my
> >> patch did
> >
> > Umm ... yes, each time you call mmap, you get a VMA. I'm not sure why
> > that's a problem with my patch. I was trying to solve the problem Daniel
> > pointed out, that mapping a guard region after each mmap cost twice as
> > many VMAs, and it solves that problem.
> >
> The issue was the VMA count, as Daniel mentioned: the more VMAs there are,
> the more expensive the tree walk. I think this is fine.

The performance problem Daniel was mentioning with your patch was not
with the number of VMAs but with the scattering of addresses across the
page table tree.
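
As a back-of-the-envelope illustration of one part of that cost (assuming
x86-64 with 4 KiB pages, where a last-level page table covers 2 MiB of address
space; the numbers only show the shape of the problem, they are not a
measurement):

#include <stdio.h>

int main(void)
{
	unsigned long mappings = 1000;            /* one-page mappings        */
	unsigned long pte_page = 4096;            /* size of one PTE table    */
	unsigned long span_per_pte = 2UL << 20;   /* 2 MiB covered per table  */
	unsigned long mapped = mappings * 4096;

	/* Clustered: the mappings share a handful of PTE tables. */
	printf("clustered: ~%lu bytes of PTE tables\n",
	       ((mapped + span_per_pte - 1) / span_per_pte) * pte_page);

	/* Fully scattered: each mapping can land in its own 2 MiB slot. */
	printf("scattered: up to %lu bytes of PTE tables\n",
	       mappings * pte_page);
	return 0;
}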

> >> - the entropy you provide is like 16 bit, that is really not so hard to brute
> >
> > It's 16 bits per mapping. I think that'll make enough attacks harder
> > to be worthwhile.
>
> Well yes, its ok, sorry. I just would like to have 32 bit entropy maximum some day :)

We could put 32 bits of padding into the prot argument on 64-bit systems
(and obviously you need a 64-bit address space to use that many bits). The
thing is that you can't then put anything else into those pages (without
using MAP_FIXED).

> >> - if you unmap/remap one page inside region, field vma_guard will show head
> >> or tail pages for vma, not both; kernel don’t know how to handle it
> >
> > There are no head pages. The guard pages are only placed after the real end.
>
> Ok, we have MG where G = vm_guard, right? so when you do vm_split,
> you may come to situation - m1g1m2G, how to handle it? I mean when M is
> split with only one page inside this region. How to handle it?

I thought I covered that in my earlier email. Using one letter per page,
and a five-page mapping with two guard pages: MMMMMGG. Now unmap the
fourth page, and the VMA gets split into two. You get: MMMGMGG.

> > I can't agree with that. The user has plenty of opportunities to get
> > randomness; from /dev/random is the easiest, but you could also do timing
> > attacks on your own cachelines, for example.
>
> I think the usual case to use randomization for any mmap or not use it at all
> for whole process. So here I think would be nice to have some variable
> changeable with sysctl (root only) and ioctl (for greedy processes).

I think this functionality can just as well live inside libc as in
the kernel.

> Well, let me summary:
> My approach chose random gap inside gap range with following strings:
>
> + addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
> + addr = low + (addr << PAGE_SHIFT);
>
> Could be improved limiting maximum possible entropy in this shift.
> To prevent situation when attacker may massage allocations and
> predict chosen address, I randomly choose memory region. I’m still
> like my idea, but not going to push it anymore, since you have yours now.
>
> Your idea just provide random non-mappable and non-accessable offset
> from best-fit region. This consumes memory (1GB gap if random value
> is 0xffff). But it works and should work faster and should resolve the issue.

umm ... 64k * 4k is a 256MB gap, not 1GB. And it consumes address space,
not memory.

> My point was that current implementation need to be changed and you
> have your own approach for that. :)
> Lets keep mine in the mind till better times (or worse?) ;)
> Will you finish your approach and upstream it?

I'm just putting it out there for discussion. If people think this is
the right approach, then I'm happy to finish it off. If the consensus
is that we should randomly pick addresses instead, I'm happy if your
approach gets merged.

2018-03-05 20:21:46

by Ilya Smith

[permalink] [raw]
Subject: Re: [RFC PATCH] Randomization of address chosen by mmap.

> On 5 Mar 2018, at 22:47, Matthew Wilcox <[email protected]> wrote:
>>>> - the entropy you provide is like 16 bit, that is really not so hard to brute
>>>
>>> It's 16 bits per mapping. I think that'll make enough attacks harder
>>> to be worthwhile.
>>
>> Well yes, its ok, sorry. I just would like to have 32 bit entropy maximum some day :)
>
> We could put 32 bits of padding into the prot argument on 64-bit systems
> (and obviously you need a 64-bit address space to use that many bits). The
> thing is that you can't then put anything else into those pages (without
> using MAP_FIXED).
>

This one sounds good to me. In my approach it is still possible to map into that area, but OK.

>>>> - if you unmap/remap one page inside region, field vma_guard will show head
>>>> or tail pages for vma, not both; kernel don’t know how to handle it
>>>
>>> There are no head pages. The guard pages are only placed after the real end.
>>
>> Ok, we have MG where G = vm_guard, right? so when you do vm_split,
>> you may come to situation - m1g1m2G, how to handle it? I mean when M is
>> split with only one page inside this region. How to handle it?
>
> I thought I covered that in my earlier email. Using one letter per page,
> and a five-page mapping with two guard pages: MMMMMGG. Now unmap the
> fourth page, and the VMA gets split into two. You get: MMMGMGG.
>
I was just curious, it's not an issue for me. Now it's clear, thanks.

>>> I can't agree with that. The user has plenty of opportunities to get
>>> randomness; from /dev/random is the easiest, but you could also do timing
>>> attacks on your own cachelines, for example.
>>
>> I think the usual case to use randomization for any mmap or not use it at all
>> for whole process. So here I think would be nice to have some variable
>> changeable with sysctl (root only) and ioctl (for greedy processes).
>
> I think this functionality can just as well live inside libc as in
> the kernel.
>

Good news for them :)

>> Well, let me summary:
>> My approach chose random gap inside gap range with following strings:
>>
>> + addr = get_random_long() % ((high - low) >> PAGE_SHIFT);
>> + addr = low + (addr << PAGE_SHIFT);
>>
>> Could be improved limiting maximum possible entropy in this shift.
>> To prevent situation when attacker may massage allocations and
>> predict chosen address, I randomly choose memory region. I’m still
>> like my idea, but not going to push it anymore, since you have yours now.
>>
>> Your idea just provide random non-mappable and non-accessable offset
>> from best-fit region. This consumes memory (1GB gap if random value
>> is 0xffff). But it works and should work faster and should resolve the issue.
>
> umm ... 64k * 4k is a 256MB gap, not 1GB. And it consumes address space,
> not memory.
>

hmm, yes… I found 8 bits somewhere.. 256MB should be enough for everyone.

>> My point was that current implementation need to be changed and you
>> have your own approach for that. :)
>> Lets keep mine in the mind till better times (or worse?) ;)
>> Will you finish your approach and upstream it?
>
> I'm just putting it out there for discussion. If people think this is
> the right approach, then I'm happy to finish it off. If the consensus
> is that we should randomly pick addresses instead, I'm happy if your
> approach gets merged.

So now, is it time to call in more people? Sorry, I'm new here.

Thanks,
Ilya