2005-10-21 12:46:55

by Vincent W. Freeh

Subject: Understanding Linux addr space, malloc, and heap

I am trying to understand the Linux addr space. I figured someone might
be able to shed some light on it. Or at least point me to some sources
that will help.

I don't understand what is happening with malloc and the heap in my
process. According to /proc/<pid>/maps, the memory from heap to stack
initially looks like this. I show only the four "maps" from the heap
and above. (This is a slightly altered form consisting of start_addr,
end_addr, size_in_pgs, permissions, and path_if_one):

0x08d42000 - 0x08d63000 (33 pgs) rw-p path `[heap]'
0xb7ef8000 - 0xb7ef9000 (1 pgs) rw-p
0xb7f09000 - 0xb7f0b000 (2 pgs) rw-p
0xbfaf5000 - 0xbfb0b000 (22 pgs) rw-p path `[stack]'

I cannot touch (rd, wr, or even mprotect) the map immediately above the
heap--must be a sandboxing page. Before any malloc, brk = 0x8d42000,
the first page of the heap.

If I malloc <= 33 pages the memory comes from the first map above and
the brk changes as appropriate. However, some new maps appear between
the two above, and the 2nd one above gets bigger. Still, all data
comes from the heap, and brk remains below the top of the heap, as shown below.

0x08d42000 - 0x08d63000 (33 pgs) ---p path `[heap]'
0xb7d00000 - 0xb7d01000 (1 pgs) rw-p
0xb7d01000 - 0xb7d21000 (32 pgs) ---p
0xb7d21000 - 0xb7e00000 (223 pgs) ---p
0xb7ef7000 - 0xb7ef9000 (2 pgs) rw-p
0xb7f09000 - 0xb7f0b000 (2 pgs) rw-p
0xbfaf5000 - 0xbfb0b000 (22 pgs) rw-p path `[stack]'

Now if I malloc > 33 pages, the data comes from the heap and the next
map(s). That is, the 34th page is 0xb7d01000 in the above example. What
is going on?

Another thing I don't understand is that I can touch maps 3 & 4 above
(0xb7d01000 & 0xb7d21000), both rd and wr. However, I cannot mprotect
the 4th map: mprotect does not fail, it just doesn't change the
permissions. I can mprotect the 32 pages in map 3. This is my
initial problem: I can only mprotect 65 pages. The 66th page (from map
4) silently doesn't mprotect.

Looking around at other processes, they seem very different. Both tcsh
and emacs (appear to) have the 1 pg sandbox just below the stack (good
place) and much larger heaps.

First, please fix any erroneous statements/assumptions above. Next I
have many questions. A few follow.

* How does the heap work? I learned/teach that heap is a contiguous
chunk of memory that holds dynamically-allocated memory. Doesn't appear
to be the case.

* Man pg says can only mprotect mmap-able pages. But what are these?
How can I tell?

* Why does mprotect silently fail?

* I thought brk indicated the top of the heap and that all dynamic
memory would be between bss end and brk. That's not true. What is brk
for then?

Thanks,
v.
--
Vincent (Vince) W. Freeh
Dept of Computer Science
North Carolina State University
http://www.csc.ncsu.edu/faculty/freeh
919-513-7196


2005-10-21 13:00:18

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, 2005-10-21 at 08:46 -0400, Vincent W. Freeh wrote:
> I am trying to understand the Linux addr space. I figured someone might
> be able to shed some light on it. Or at least point me to some sources
> that will help.
>
> I don't understand what is happening with malloc and the heap in my
> process. According to /proc/<pid>/maps the memory from heap to stack
> initially looks like that. I only show the four "maps" from the heap
> and above. (This is a slightly altered form consisting of start_addr,
> end_addr, size_in_pgs, permissions, and path_if_one):
>
> 0x08d42000 - 0x08d63000 (33 pgs) rw-p path `[heap]'
> 0xb7ef8000 - 0xb7ef9000 (1 pgs) rw-p
> 0xb7f09000 - 0xb7f0b000 (2 pgs) rw-p
> 0xbfaf5000 - 0xbfb0b000 (22 pgs) rw-p path `[stack]'
>
> First, please fix any erroneous statements/assumptions above. Next I
> have many questions. A few follow.
>
> * How does the heap work? I learned/teach that heap is a contiguous
> chunk of memory that holds dynamically-allocated memory. Doesn't appear
> to be the case.

that's the old school 1970's stuff

the "heap" is still brk in linux, however there is no 1:1 relation
between heap and malloc. malloc in glibc is implemented both using brk
and mmap, depending on the size of your allocation.
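
To make the brk/mmap split concrete, here is a minimal sketch, assuming
Linux with glibc. The helper name `in_brk_heap` is made up for this
illustration, and `end` is the linker-provided symbol described in end(3).
It reports whether a pointer lies in the brk-managed region. With glibc's
default settings a small allocation typically does, while one above the
mmap threshold (128 KiB by default) typically does not.

```c
#define _GNU_SOURCE          /* for sbrk() under strict -std= modes */
#include <stdint.h>
#include <unistd.h>

/* Linker-provided symbol marking the end of bss; see end(3). */
extern char end;

/* Hypothetical helper: returns 1 if p lies between the end of bss and
 * the current program break (the classic "heap"), 0 otherwise.
 * Memory that malloc obtained through mmap lands outside this range. */
int in_brk_heap(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&end;
    uintptr_t hi = (uintptr_t)sbrk(0);   /* sbrk(0) returns the current break */

    return addr >= lo && addr < hi;
}
```

Comparing `in_brk_heap(malloc(64))` with `in_brk_heap(malloc(1 << 20))`
would be expected to give different answers under glibc defaults, but that
is allocator behavior, not something any specification promises.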


>
> * Man pg says can only mprotect mmap-able pages. But what are these?
> How can I tell?

you need to mmap these yourself to be sure.. eg you cannot mprotect the
output of malloc, at least not reliably. Only of mmap.

>
> * Why does mprotect silently fail?

no it has side effects; eg it most likely affects more memory than just
your malloc()'d part


> * I thought brk indicated the top of the heap and that all dynamic
> memory would be between bss end and brk. That's not true. What is brk
> for then?

see definition of heap vs malloc above


2005-10-21 13:45:09

by Vincent W. Freeh

Subject: Re: Understanding Linux addr space, malloc, and heap

Thanks for your quick response. It basically confirmed that I observed
what I thought I did. However, I am no closer to solving my problem. I
cannot mprotect data that I malloc beyond the first 65 pages. Why is
that? Can that be fixed? Second, why does mprotect silently fail? I
could live with it failing--but I cannot deal with a call that "works"
but doesn't work.

Thanks,
vince.


2005-10-21 14:03:27

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, 2005-10-21 at 09:45 -0400, Vincent W. Freeh wrote:
> Thanks for your quick response. It basically confirmed that I observed
> what I thought I did. However, I am no closer to solving my problem. I
> cannot mprotect data that I malloc beyond the first 65 pages.

you can't mprotect malloc() memory period ..
> Why is
> that? Can that be fixed? Second, why does mprotect silently fail? I
> could live with it failing--but I cannot deal with a call the "works"
> but doesn't work.

need more info :)

2005-10-21 15:11:30

by Vincent W. Freeh

Subject: Re: Understanding Linux addr space, malloc, and heap

Arjan van de Ven wrote:
> On Fri, 2005-10-21 at 09:45 -0400, Vincent W. Freeh wrote:
>
>>Thanks for your quick response. It basically confirmed that I observed
>>what I thought I did. However, I am no closer to solving my problem. I
>>cannot mprotect data that I malloc beyond the first 65 pages.
>
>
> you can't mprotect malloc() memory period ..

Actually, I can and do. Simple program at end.

>
>> Why is
>>that? Can that be fixed? Second, why does mprotect silently fail? I
>>could live with it failing--but I cannot deal with a call the "works"
>>but doesn't work.
>
>
> need more info :)
>

I call mprotect and it returns 0--meaning it succeeded. But the
permissions on the page remain rw. So it fails to change the
permissions, but doesn't give any indication of this.
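
One way to check from inside the process whether a protection change took
effect is to read the permissions back from /proc/self/maps. The sketch
below assumes Linux; `perms_at` and `protect_and_check` are names invented
for this example, and it uses an mmap'd page so that the protected range
is exactly one page.

```c
#define _GNU_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copy the permission field (e.g. "rw-p") of the
 * mapping containing addr into out. Returns 0 on success, -1 if no
 * mapping contains addr or /proc/self/maps cannot be read. */
int perms_at(const void *addr, char out[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    char line[512];
    int rc = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        unsigned long start, stop;
        char perms[5];

        /* Each line starts with "start-end perms ...". */
        if (sscanf(line, "%lx-%lx %4s", &start, &stop, perms) != 3)
            continue;
        if ((uintptr_t)addr >= start && (uintptr_t)addr < stop) {
            memcpy(out, perms, sizeof perms);
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

/* Map one page, drop all access, and report whether the kernel now
 * lists it as "---p". Returns 1 if so, 0 if not, -1 on error. */
int protect_and_check(void)
{
    size_t pg = (size_t)sysconf(_SC_PAGESIZE);
    char perms[5];
    int result = -1;
    void *p = mmap(NULL, pg, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    if (mprotect(p, pg, PROT_NONE) == 0 && perms_at(p, perms) == 0)
        result = (strcmp(perms, "---p") == 0);
    munmap(p, pg);
    return result;
}
```

The same `perms_at` call can be pointed at any address of interest, which
makes it possible to tell "mprotect changed a different page" apart from
"mprotect changed nothing".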

Thanks,
vince.

------------------
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    void *p;
    int pgsize = getpagesize();

    p = malloc(1024);
    mprotect((void*)((unsigned)p & ~(pgsize-1)), 1024, PROT_NONE);
    printf("\t*p = %d\n", *(int *)p);
    return 0;
}

2005-10-21 15:20:16

by Anton Altaparmakov

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, 2005-10-21 at 11:11 -0400, Vincent W. Freeh wrote:
> Arjan van de Ven wrote:
> > On Fri, 2005-10-21 at 09:45 -0400, Vincent W. Freeh wrote:
> >
> >>Thanks for your quick response. It basically confirmed that I observed
> >>what I thought I did. However, I am no closer to solving my problem. I
> >>cannot mprotect data that I malloc beyond the first 65 pages.
> >
> >
> > you can't mprotect malloc() memory period ..
>
> Actually, I can and do. Simple program at end.
>
> >
> >> Why is
> >>that? Can that be fixed? Second, why does mprotect silently fail? I
> >>could live with it failing--but I cannot deal with a call the "works"
> >>but doesn't work.
> >
> >
> > need more info :)
> >
>
> I call mprotect and it return 0--meaning it succeeded. But the
> permissions on the page remain rw. So it fails to change the
> permissions, but doesn't give any indication of this.
>
> Thanks,
> vince.
>
> ------------------
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[])
> {
> void *p;
> int pgsize = getpagesize();
>
> p = malloc(1024);
> mprotect((void*)((unsigned)p & ~(pgsize-1)), 1024, PROT_NONE);
> printf("\t*p = %d\n", *(int *)p);
> return 0;
> }

This program is completely screwed. Read the mprotect man page; in
particular, the examples section shows how to write your program
correctly.

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2005-10-21 15:23:20

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, 2005-10-21 at 11:11 -0400, Vincent W. Freeh wrote:
> Arjan van de Ven wrote:
> > On Fri, 2005-10-21 at 09:45 -0400, Vincent W. Freeh wrote:
> >
> >>Thanks for your quick response. It basically confirmed that I observed
> >>what I thought I did. However, I am no closer to solving my problem. I
> >>cannot mprotect data that I malloc beyond the first 65 pages.
> >
> >
> > you can't mprotect malloc() memory period ..
>
> Actually, I can and do. Simple program at end.

Ok I meant in the "while adhering to the standard" :)


> I call mprotect and it return 0--meaning it succeeded. But the
> permissions on the page remain rw. So it fails to change the
> permissions, but doesn't give any indication of this.
>
> Thanks,
> vince.
>
> ------------------
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[])
> {
> void *p;
> int pgsize = getpagesize();
>
> p = malloc(1024);
> mprotect((void*)((unsigned)p & ~(pgsize-1)), 1024, PROT_NONE);
> printf("\t*p = %d\n", *(int *)p);
> return 0;
>}

this has a bug, the 1024 is wrong... what if the memory "p" points to
actually spans 2 pages?

but it does have "some effect", even for malloc-falling-back-to-mmap..
there's just a bunch of collateral damage since you mprotect more than
just the memory you got from malloc. mprotect works on page size.. so if
p spans 2 pages (why wouldn't it ;) you mprotect either the wrong memory
(as in your example) or too much (eg both pages)...


2005-10-21 15:23:48

by Paulo Marques

Subject: Re: Understanding Linux addr space, malloc, and heap

>> On Fri, 2005-10-21 at 09:45 -0400, Vincent W. Freeh wrote:
>> I cannot mprotect data that I malloc ...

> Arjan van de Ven wrote:
>> you can't mprotect malloc() memory period ..

Vincent W. Freeh wrote:
> Actually, I can and do. Simple program at end.

Am I the only one who finds this conversation weird? :)

This reminds me of a student I had who called "main" to return to the
start of the application. No matter how I explained that it was simply
wrong and that the stack was growing because of that, he just kept replying:
"but it works!"...

--
Paulo Marques - http://www.grupopie.com

The rule is perfect: in all matters of opinion our
adversaries are insane.
Mark Twain

Subject: Re: Understanding Linux addr space, malloc, and heap



--On 21 October 2005 17:22 +0200 Arjan van de Ven <[email protected]>
wrote:

> Ok I meant in the "while adhering to the standard" :)

More precisely, as per the man page:
> POSIX.1b says that mprotect can be used only on regions of memory
> obtained from mmap(2).

But what is interesting (if anything) is this:
> ERRORS
> EINVAL addr is not a valid pointer, or not a multiple of
> PAGESIZE.

So if he calls mprotect with memory allocated by malloc (which should
fail), why doesn't he get EINVAL? He says it returns 0 (meaning it
succeeded). Which it shouldn't (unless he is stupendously lucky in
malloc's allocation, in which case it should work).
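
For reference, the posted program rounds the pointer down to a page
boundary before calling mprotect, so the kernel never sees an unaligned
address and EINVAL does not apply. A minimal sketch of the unaligned case,
assuming Linux (the function name is invented for this example):

```c
#define _GNU_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demo: call mprotect() on an address that is deliberately
 * not page aligned and return the resulting errno (0 if the call
 * unexpectedly succeeded). */
int unaligned_mprotect_errno(void)
{
    size_t pg = (size_t)sysconf(_SC_PAGESIZE);
    int err = 0;
    char *base = mmap(NULL, pg, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return errno;
    /* base is page aligned, so base + 16 is not. */
    if (mprotect(base + 16, 64, PROT_READ) != 0)
        err = errno;
    munmap(base, pg);
    return err;
}
```

On Linux this is expected to return EINVAL, which is the error the man
page excerpt describes.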

--
Alex Bligh

2005-10-21 15:37:10

by Vincent W. Freeh

Subject: Re: Understanding Linux addr space, malloc, and heap

The point of the code is to show that one can protect malloc code. It
is a short example. It is not my application. The comments criticising
the code are beside the point.

One can mprotect malloc'd pages. My code does it. The mprotect man
page does it.

But I can't mprotect the 66th page I malloc. And mprotect fails SILENTLY!

Is this interesting or not? Does anyone understand why?

Thanks,
vince.

Arjan van de Ven wrote:
> On Fri, 2005-10-21 at 11:11 -0400, Vincent W. Freeh wrote:
>
>>Arjan van de Ven wrote:
>>
>>>On Fri, 2005-10-21 at 09:45 -0400, Vincent W. Freeh wrote:
>>>
>>>
>>>>Thanks for your quick response. It basically confirmed that I observed
>>>>what I thought I did. However, I am no closer to solving my problem. I
>>>>cannot mprotect data that I malloc beyond the first 65 pages.
>>>
>>>
>>>you can't mprotect malloc() memory period ..
>>
>>Actually, I can and do. Simple program at end.
>
>
> Ok I meant in the "while adhering to the standard" :)
>
>
>
>>I call mprotect and it return 0--meaning it succeeded. But the
>>permissions on the page remain rw. So it fails to change the
>>permissions, but doesn't give any indication of this.
>>
>>Thanks,
>>vince.
>>
>>------------------
>>#include <stdio.h>
>>#include <stdlib.h>
>>#include <sys/mman.h>
>>#include <unistd.h>
>>
>>int main(int argc, char *argv[])
>>{
>> void *p;
>> int pgsize = getpagesize();
>>
>> p = malloc(1024);
>> mprotect((void*)((unsigned)p & ~(pgsize-1)), 1024, PROT_NONE);
>> printf("\t*p = %d\n", *(int *)p);
>> return 0;
>>}
>
>
> this has a bug, the 1024 is wrong... what if your "p" point actually
> spans 2 pages?
>
> but to have "some effect" even for malloc-falling-back-to-mmap..
> just there's a bunch of collateral damage since you mprotect more than
> just the memory you got from malloc. mprotect works on page size.. so if
> p spans 2 pages (why wouldn't it ;) you mprotect either the wrong memory
> (as in your example) or too much (eg both pages)...
>
>

2005-10-21 15:47:42

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, 2005-10-21 at 16:37 +0100, Alex Bligh - linux-kernel wrote:
>
> --On 21 October 2005 17:22 +0200 Arjan van de Ven <[email protected]>
> wrote:
>
> > Ok I meant in the "while adhering to the standard" :)
>
> More precisely, as per the man page:
> > POSIX.1b says that mprotect can be used only on regions of memory
> > obtained from mmap(2).
>
> But what is interesting (if anything) is this:
> > ERRORS
> > EINVAL addr is not a valid pointer, or not a multiple of
> > PAGESIZE.
>
> So if he calls mprotect with memory allocated by malloc (which should
> fail), why doesn't he get EINVAL? He says it returns 0 (meaning it
> succeeded). Which it shouldn't (unless he is stupendously lucky in
> malloc's allocation, in which case it should work).

it succeeds all right; it just does other things than you expect
perhaps ;)

your alignment code had a bug, so it would align potentially to the
wrong piece of memory


2005-10-21 15:48:43

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap


>
> But I can't mprotect the 66th page I malloc. And mprotect fails SILENTLY!

I'm not convinced it does that.. not until the bugs are out of the
code.... since right now it mprotects the wrong stuff, which sometimes
overlaps with what you malloced, sometimes not.


2005-10-21 15:52:47

by Kyle Moffett

Subject: Re: Understanding Linux addr space, malloc, and heap

On Oct 21, 2005, at 11:37:07, Vincent W. Freeh wrote:
> The point of the code is to show that one can protect malloc code.

You may be able to protect malloced memory, but it's not reliable, it
doesn't follow the standard, and if it breaks you get to keep both
pieces. You tried to protect malloced memory, it wasn't reliable, it
didn't follow the standard, therefore you get to keep both pieces.

> But I can't mprotect the 66th page I malloc. And mprotect fails
> SILENTLY!
You must understand that malloc does not necessarily return a
_page_. It returns a random hunk of memory aligned to a 16 byte
boundary, which is *not* the same as aligned to a page boundary
(usually 4096 bytes). Your code is buggy:

>>> mprotect((void*)((unsigned)p & ~(pgsize-1)), 1024, PROT_NONE);

You malloced 1024 bytes (1024 != page size (usually 4096)). You then
round down to a multiple of a page size, and mprotect that page
(since you used a size < 1 page, it rounds up to a page). If malloc
returns an offset 4080 bytes into a page (aligned on a 16-byte
boundary), and you round down and protect that page, then only the
first 16 bytes of the memory you got will be protected.
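
The arithmetic above can be checked with a few lines of C. This is only a
sketch of the calculation; `bytes_covered` is a name made up for this
example, and it models the single page that results from rounding the
pointer down and passing a length of less than one page.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes of the object [addr, addr + len)
 * fall inside the one page obtained by rounding addr down to a page
 * boundary. pgsize must be a power of two. */
size_t bytes_covered(uintptr_t addr, size_t len, size_t pgsize)
{
    uintptr_t page_start = addr & ~((uintptr_t)pgsize - 1);
    uintptr_t page_end = page_start + pgsize;
    uintptr_t obj_end = addr + len;

    /* The covered part ends at whichever comes first. */
    return (size_t)((obj_end < page_end ? obj_end : page_end) - addr);
}
```

With a 4096-byte page, a 1024-byte object starting 4080 bytes into a page
gives 16, matching the figure above; an object starting at the page
boundary gives the full 1024.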

You *cannot* reliably expect to mprotect() the results of malloc().
If you want to mprotect() things, you _must_ do it on mmap()ed memory.
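
A minimal sketch of that approach, assuming Linux and anonymous private
mappings; `page_alloc` is a hypothetical helper name, not an existing
library call:

```c
#define _GNU_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: allocate at least len bytes of zeroed,
 * page-aligned memory directly from mmap. The real (page-rounded) size
 * is stored in *mapped_len; pass that size to mprotect() and munmap().
 * Returns NULL on failure. */
void *page_alloc(size_t len, size_t *mapped_len)
{
    size_t pg = (size_t)sysconf(_SC_PAGESIZE);
    size_t rounded = (len + pg - 1) / pg * pg;   /* round up to whole pages */
    void *p = mmap(NULL, rounded, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    *mapped_len = rounded;
    return p;
}
```

Because the region starts on a page boundary and covers whole pages,
`mprotect(p, mapped_len, PROT_READ)` affects exactly this allocation and
nothing that belongs to malloc. Release it with `munmap(p, mapped_len)`,
not `free()`.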

Cheers,
Kyle Moffett

--
Someone asked me why I work on this free (http://www.fsf.org/philosophy/)
software stuff and not get a real job. Charles Schulz had the best answer:

"Why do musicians compose symphonies and poets write poems? They do
it because life wouldn't have any meaning for them if they didn't.
That's why I draw cartoons. It's my life."
-- Charles Schulz


2005-10-21 15:58:51

by Paulo Marques

Subject: Re: Understanding Linux addr space, malloc, and heap

Arjan van de Ven wrote:
> On Fri, 2005-10-21 at 16:37 +0100, Alex Bligh - linux-kernel wrote:
>
>>--On 21 October 2005 17:22 +0200 Arjan van de Ven <[email protected]>
>>wrote:
>>
>>
>>>Ok I meant in the "while adhering to the standard" :)
>>
>>More precisely, as per the man page:
>>
>>>POSIX.1b says that mprotect can be used only on regions of memory
>>>obtained from mmap(2).
>>
>>But what is interesting (if anything) is this:
>>
>>>ERRORS
>>> EINVAL addr is not a valid pointer, or not a multiple of
>>> PAGESIZE.
>>
>>So if he calls mprotect with memory allocated by malloc (which should
>>fail), why doesn't he get EINVAL? He says it returns 0 (meaning it
>>succeeded). Which it shouldn't (unless he is stupendously lucky in
>>malloc's allocation, in which case it should work).
>
>
> it succeeds all right; it just does other things than you expect
> perhaps ;)
>
> your alignment code had a bug, so it would align potentially to the
> wrong piece of memory

Actually, it should give a SIGSEGV for the first byte allocated,
although I agree that it might not work for any other byte allocated in
the case where malloc returns a pointer just at the end of a page.

I just tested the sample code and in fact I do get a SIGSEGV.

What kernel version / architecture are you testing this on?

--
Paulo Marques
Software Development Department - Grupo PIE, S.A.
Phone: +351 252 290600, Fax: +351 252 290601
Web: http://www.grupopie.com

The rule is perfect: in all matters of opinion our
adversaries are insane.
Mark Twain

2005-10-21 16:04:48

by Vincent W. Freeh

Subject: Re: Understanding Linux addr space, malloc, and heap

Clearly, it was a mistake to post that code. I had no idea so many
people would point out the bleeding obvious.

Here is a more elaborate version--that does the same thing, but more
lines of code. In it malloc'd memory is mprotect'd. The program
generates a SIGSEGV, a page fault.

----------------
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h> /* for uintptr_t */
#include <errno.h>
#include <sys/mman.h>

#include <limits.h> /* for PAGESIZE */
#ifndef PAGESIZE
#define PAGESIZE 4096
#endif

int
main(void)
{
    char *p;
    char c;

    /* Allocate a buffer; it will have the default
       protection of PROT_READ|PROT_WRITE. */
    p = malloc(1024+PAGESIZE-1);
    if (!p) {
        perror("Couldn't malloc(1024)");
        exit(errno);
    }

    /* Align to a multiple of PAGESIZE, assumed to be a power of two.
       Cast through uintptr_t so the pointer is not truncated. */
    p = (char *)(((uintptr_t) p + PAGESIZE-1) & ~(uintptr_t)(PAGESIZE-1));

    c = p[666]; /* Read; ok */
    p[666] = 42; /* Write; ok */

    /* Mark the buffer read-only. */
    if (mprotect(p, 1024, PROT_READ)) {
        perror("Couldn't mprotect");
        exit(errno);
    }

    c = p[666]; /* Read; ok */
    p[666] = 42; /* Write; program dies on SIGSEGV */

    exit(0);
}


Arjan van de Ven wrote:
>>But I can't mprotect the 66th page I malloc. And mprotect fails SILENTLY!
>
>
> I'm not convinced it does that.. not until the bugs are out of the
> code.... since right now it mprotects the wrong stuff, which sometimes
> overlaps with what you malloced, sometimes not.
>
>

2005-10-21 16:10:50

by Vincent W. Freeh

Subject: Re: Understanding Linux addr space, malloc, and heap

First, thanks for all the help and attention. I am learning much.

I think the focus of this discussion should be on mprotect. I
understand that spec says it only works on mmap'd memory. So does
malloc use mmap? If not why does it work at all?

Probably the most problematic issue, tho, is why does mprotect return 0
even though it failed to change permissions on the 66th page?

Kyle Moffett wrote:
> On Oct 21, 2005, at 11:37:07, Vincent W. Freeh wrote:
> You *cannot* reliably expect to mprotect() the results of malloc(). If
> you want to mprotect() things, you _must_ do it on mmap()ed memory.
>
> Cheers,
> Kyle Moffett
>

2005-10-21 16:14:22

by Andreas Schwab

Subject: Re: Understanding Linux addr space, malloc, and heap

"Vincent W. Freeh" <[email protected]> writes:

> The point of the code is to show that one can protect malloc code.

You "can" do many things. But that does not mean that you always get any
sensible behaviour.

Andreas.

--
Andreas Schwab, SuSE Labs, [email protected]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2005-10-21 16:19:23

by Theodore Ts'o

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, Oct 21, 2005 at 12:10:47PM -0400, Vincent W. Freeh wrote:
> First, thanks for all the help and attention. I am learning much.
>
> I think the focus of this discussion should be on mprotect. I
> understand that spec says it only works on mmap'd memory. So does
> malloc use mmap? If not why does it work at all?

_Sometimes_ malloc() uses mmap, and _sometimes_ it doesn't (it will
obtain memory by adjusting the brk pointer). This can be adjusted via
various tuning parameters to malloc. I suggest you read the info
documentation for glibc's malloc() and mallopt() calls. (Note that
mallopt parameters are non-portable, and may not apply if you are
using another OS, another version of glibc, or if you have replaced
the malloc with another implementation --- as some application writers
might do.)
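
As a small illustration of such a tuning parameter, the sketch below
assumes glibc, whose <malloc.h> provides mallopt() and M_MMAP_THRESHOLD.
The wrapper name `set_mmap_threshold` is invented for this example.

```c
#include <malloc.h>   /* glibc-specific: mallopt(), M_MMAP_THRESHOLD */

/* Hypothetical wrapper: ask glibc's malloc to serve requests of at
 * least `bytes` bytes with mmap instead of brk. mallopt() returns 1 on
 * success and 0 on error. */
int set_mmap_threshold(int bytes)
{
    return mallopt(M_MMAP_THRESHOLD, bytes);
}
```

A call such as `set_mmap_threshold(64 * 1024)` early in main would lower
the cutoff from its default. Note that this changes where the memory comes
from, not its alignment: pointers returned by malloc are generally still
not page-aligned, since glibc keeps a bookkeeping header in front of them.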

Bottom line is you must not count on the type of memory returned by
malloc(). It is given the freedom in the specifications to use
whatever free memory it deems most likely to provide better
application performance. An application which is willing to be
malloc()-specific can use various interfaces to tune malloc()'s
behavior for performance reasons.

- Ted

2005-10-21 16:24:12

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap

On Fri, 2005-10-21 at 12:04 -0400, Vincent W. Freeh wrote:
> Clearly, it was a mistake to post that code. I had no idea so many
> people would point out the bleeding obvious.
>
> Here is a more elaborate version--that does the same thing, but more
> lines of code. In it malloc'd memory is mprotect'd. The program
> generates a SIGSEGV, a page fault.


... and what is the problem ?

2005-10-21 16:24:52

by Vincent W. Freeh

Subject: Re: Understanding Linux addr space, malloc, and heap

I guess I live in a different world. I do lots of things I'm not
"supposed" to do.

Moreover, it is very sensible and usable to mprotect malloc pages. I
have implemented simple sandboxing this way. For my dissertation I
implemented a DSM by mprotect'g malloc'd memory. This system worked
for >6 on several versions of Linux and SunOS. I actually have a
better track record for this technique than for some things that are
within the specifications.

Andreas Schwab wrote:
> "Vincent W. Freeh" <[email protected]> writes:
>
>
>>The point of the code is to show that one can protect malloc code.
>
>
> You "can" do many things. But that does not mean that you always get any
> sensible behaviour.
>
> Andreas.
>

2005-10-21 16:27:01

by Paulo Marques

Subject: Re: Understanding Linux addr space, malloc, and heap

Vincent W. Freeh wrote:
> First, thanks for all the help and attention. I am learning much.
>
> I think the focus of this discussion should be on mprotect. I
> understand that spec says it only works on mmap'd memory. So does
> malloc use mmap? If not why does it work at all?
>
> Probably the most problematic issue, tho, is why does mprotect return 0
> even though it failed to change permissions on the 66th page?

Do you have code that shows this?

I tried to change the example in the mprotect man page to loop N times
(N given on the command line) malloc'ing and mprotect'ing N pages and
then accessing the N'th page and it always gave SIGSEGV, for any N from
1 to 100.

--
Paulo Marques - http://www.grupopie.com

The rule is perfect: in all matters of opinion our
adversaries are insane.
Mark Twain

2005-10-22 19:27:16

by Kyle Moffett

Subject: Re: Understanding Linux addr space, malloc, and heap

On Oct 21, 2005, at 12:24:50, Vincent W. Freeh wrote:
> I guess I live in a different world. I do lots of things I'm not
> "supposed" to do.

So why are you complaining that it doesn't work? "Doctor, it hurts
when I use my toes to hold a nail as I hammer it in!" "Well don't do
that then!"

> Moreover, it is very sensible and usable to mprotect malloc pages.

DANGER! DANGER WILL ROBINSON! DANGER! malloc() is *NOT* guaranteed
or even theoretically implemented to return pages. It might return
all memory at some random 16-byte offset into a page. If you make
malloc'ed memory read only, you might make malloc()-internal data
read-only too and cause malloc() to crash. YOU CANNOT RELY ON THIS
TO WORK!!! Is that sufficiently clear? It may work for you, and it
may not, but when it breaks, don't whine on the LKML.

> I have implemented simple sandboxing this way. For my dissertation
> I implemented a DSM by mprotect'g malloc'd memory. This system
> worked for >6 on several version of Linux and SunOS. I actually
> have a better track record for this technique than for some things
> that are within the specifications.

If it works for you, good luck, but don't try to tell us that it's
wrong when it breaks in a very documented way.

Cheers,
Kyle Moffett

--
Premature optimization is the root of all evil in programming
-- C.A.R. Hoare



2005-10-23 10:41:48

by Bodo Eggert

Subject: Re: Understanding Linux addr space, malloc, and heap

Kyle Moffett <[email protected]> wrote:
> On Oct 21, 2005, at 12:24:50, Vincent W. Freeh wrote:

>> I guess I live in a different world. I do lots of things I'm not
>> "supposed" to do.
>
> So why are you complaining that it doesn't work? "Doctor, it hurts
> when I use my toes to hold a nail as I hammer it in!" "Well don't do
> that then!"

I'm not supposed to run linux on i386, ask Bill. Why do I do it then?

>> Moreover, it is very sensible and usable to mprotect malloc pages.
>
> DANGER! DANGER WILL ROBINSON! DANGER! malloc() is *NOT* guaranteed
> or even theoretically implemented to return pages.

If you allocate a block of 2*PAGESIZE-1 bytes, *any* allocation method is
*guaranteed* to return at least one complete page. (BTW: The example from the
manpage is wrong since it only makes sure the starting address is on a
page boundary, but not that the end of the protected memory is within the
allocated area. Whom should I contact?)
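
Both halves of that claim can be sketched in a few lines of C. The helper
name `first_full_page` is made up for this example; it only does address
arithmetic and does not call mprotect itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given a block [p, p + size), return the address
 * of the first page that lies entirely inside it, or NULL if the block
 * contains no complete page. pgsize must be a power of two. */
void *first_full_page(void *p, size_t size, size_t pgsize)
{
    uintptr_t start = (uintptr_t)p;
    uintptr_t aligned = (start + pgsize - 1) & ~((uintptr_t)pgsize - 1);
    size_t offset = (size_t)(aligned - start);   /* bytes skipped to align */

    if (offset > size || size - offset < pgsize)
        return NULL;
    return (void *)aligned;
}
```

For any block of 2*PAGESIZE-1 bytes this returns a non-NULL page, however
the block is aligned. For the 1024+PAGESIZE-1 byte buffer from the program
posted earlier in the thread it returns NULL unless the block happens to
start within 1023 bytes of a page boundary, so the page that mprotect
rounds up to generally extends past the end of the allocation.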

But even if Vincent makes the next malloc/free/whatever fubar,
or makes the world explode, mprotect is still required to report
an error if the requested action failed. If it doesn't do that for
mprotecting _any_ range, no matter how strange it may be, it is broken.
--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

2005-10-23 10:45:04

by Arjan van de Ven

Subject: Re: Understanding Linux addr space, malloc, and heap


> But even if Vincend makes the next malloc/free/whatever to be fubar,
> or if he made the world explode, mprotect is still required to report
> an error if the requested action failed.

but.. there's no proof yet that it failed...


2005-10-23 21:30:17

by Kyle Moffett

Subject: Re: Understanding Linux addr space, malloc, and heap

On Oct 23, 2005, at 06:44:47, Arjan van de Ven wrote:
>> But even if Vincend makes the next malloc/free/whatever to be
>> fubar, or if he made the world explode, mprotect is still required
>> to report an error if the requested action failed.
>
> but.. there's no proof yet that it failed...

Precisely. The only code sample he's sent that exhibits this
"problem" is buggy because it checks the wrong addresses for
protected status. In any case, if you _were_ going to try to change
protection bits on malloc()ed memory, you would need to make
_damn_sure_ that you didn't change the protection bits on internal
data structures that malloc uses to keep track of allocations. If
you remove read or write privs on malloc-internal linked-list
pointers, an attempt to malloc() or free() memory might (and probably
will) crash.

Cheers,
Kyle Moffett

--
I have yet to see any problem, however complicated, which, when you
looked at it in the right way, did not become still more complicated.
-- Poul Anderson