2006-10-16 19:19:16

by mfbaustx

Subject: copy_from_user / copy_to_user with no swap space

I've been trying to find or derive a definitive answer to this question
for a while now but can't quite get over the hump.

I understand when/why copy_<to|from>_user (and siblings) are required
(address validation, guaranteeing a process is paged in, etc...). The
question is: if you have no swap space (or virtual memory or whatever),
can there ever be a case in which any valid pointer to a buffer in
user-space would be incorrect as a result of another process's PTE being
present? Put another way: can a process be partially paged?

My reasoning (in which I obviously have no confidence, or else I wouldn't be
asking this question) is as follows:

All processes share the same logical address space starting at 0 and
(usually) ending at 3GB, right? Text sections start low and build up,
stacks start high and grow down. Somewhere in there you get your heap and
shared memory regions. Since nothing about a logical address can identify
a specific process, then copy_to/from_user can do nothing to guarantee
that the CORRECT process is paged in. True? So you're absolutely
obligated to DO the copy at the time the kernel is executing on behalf of
that process. Once your process/thread is context swapped, you've lost
the [correct] information on the address mapping.

So, IF you MUST copy_from/to_user when in the context of the process, AND
IF you have no virtual memory/swapping, THEN must it not be true that you
can ALWAYS dereference your user space pointers?


TIA!



2006-10-16 19:27:53

by Oliver Neukum

Subject: Re: copy_from_user / copy_to_user with no swap space

On Monday, 16 October 2006 21:19, mfbaustx wrote:
> So, IF you MUST copy_from/to_user when in the context of the process, AND
> IF you have no virtual memory/swapping, THEN must it not be true that you
> can ALWAYS dereference your user space pointers?

No. Your code may be only partially paged into RAM.
The same can happen for any mmaped data.

Regards
Oliver

2006-10-16 19:34:21

by Al Viro

Subject: Re: copy_from_user / copy_to_user with no swap space

On Mon, Oct 16, 2006 at 02:19:03PM -0500, mfbaustx wrote:
> stacks start high and grow down. Somewhere in there you get your heap and
> shared memory regions. Since nothing about a logical address can identify
> a specific process, then copy_to/from_user can do nothing to guarantee
> that the CORRECT process is paged in. True? So you're absolutely
> obligated to DO the copy at the time the kernel is executing on behalf of
> that process. Once your process/thread is context swapped, you've lost
> the [correct] information on the address mapping.
>
> So, IF you MUST copy_from/to_user when in the context of the process, AND
> IF you have no virtual memory/swapping, THEN must it not be true that you
> can ALWAYS dereference your user space pointers?

First of all, kernel and userland don't have to be in the same address
space at all; not even on x86 in some configurations. So dereferencing
a user pointer as if it were a normal pointer simply won't work - what
you'll get might have nothing to do with any user memory.

But even aside from that, even on architectures where kernel and userland
_do_ share an address space, there's nothing to guarantee that any given
piece of user address space is currently present or has ever been paged
in to start with.

Dereference that and you'll get an exception. If you take a look at
the guts of e.g. arch/i386/lib/usercopy.c, you'll see stuff going to
the .fixup section; when you call e.g. get_user() on an address in a page
that is currently not paged in, an exception *is* generated and handled;
then control is returned back to where we'd taken it.

IOW, even low-level code on such targets has to be careful; blind dereferencing
would simply get you an oops. On something like ppc it's simply out of
the question - there you would be able to trigger reads from memory-mapped
registers of hell knows what hardware. From userland. Confusing the
living fsck out of hardware and drivers... _And_ you'd get access to
genuine kernel data.
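
The effect of that fixup path can be observed from userland, at least roughly.
The following is a minimal sketch, assuming Linux and a pipe as the target of
the write; the helper name errno_for_unreadable_buffer is made up for this
example. It hands write(2) a buffer that is mapped but has no access rights.
The kernel's user-copy routine faults on it, the fault is handled, and the
syscall reports EFAULT instead of the kernel oopsing.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno reported by write(2) when the
 * source buffer is an inaccessible user page, 0 if the write unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int errno_for_unreadable_buffer(void)
{
    int fds[2];
    int result = -1;
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    if (pipe(fds) != 0)
        return -1;

    /* One page with no access rights: a plain dereference would fault. */
    void *page = mmap(NULL, len, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page != MAP_FAILED) {
        /* The kernel has to copy from 'page' on our behalf. */
        ssize_t n = write(fds[1], page, 16);
        result = (n < 0) ? errno : 0;
        munmap(page, len);
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Calling errno_for_unreadable_buffer() from a small main() should yield EFAULT;
a kernel that dereferenced the pointer blindly would have nothing to return.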

2006-10-16 19:39:46

by Kyle Moffett

Subject: Re: copy_from_user / copy_to_user with no swap space

On Oct 16, 2006, at 15:19:03, mfbaustx wrote:
> So you're absolutely obligated to DO the copy at the time the
> kernel is executing on behalf of that process. Once your process/
> thread is context swapped, you've lost the [correct] information on
> the address mapping.

Yes, this is correct.

> So, IF you MUST copy_from/to_user when in the context of the
> process, AND IF you have no virtual memory/swapping, THEN must it
> not be true that you can ALWAYS dereference your user space pointers?

I'm not sure I entirely understand what you're asking here; perhaps
you could rephrase or explain what you're trying to do? From what I
can pick up from your description, you may be missing that program
text pages and memory-mapped files may be "swapped-out" even
*without* a swap device. As an example, when I first start /bin/bash
(ignoring readahead for the moment), very little of the binary and
shared libraries is actually in memory (the rest is left on disk).
When I use data or call a function that hasn't been loaded from disk
yet, a major fault occurs, the kernel loads data from the bash
executable file or a shared library, and then maps it into the
process address space.
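
Populate-on-first-touch can be watched from userspace with mincore(2). This is
only a sketch under stated assumptions: it uses an anonymous mapping rather
than a file-backed executable, so the fault it provokes is a minor (zero-fill)
fault, not the major fault described above, and the names page_resident and
demand_paging_demo are invented for the example. No swap device is involved
at any point.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the first page at addr is resident, 0 if not, -1 on error. */
static int page_resident(void *addr, size_t len)
{
    unsigned char vec = 0;          /* one status byte per page */

    if (mincore(addr, len, &vec) != 0)
        return -1;
    return vec & 1;
}

/* Hypothetical demo: records residency of a fresh page before and after
 * the first access to it. Returns 0 on success, -1 on setup failure. */
int demand_paging_demo(int *before, int *after)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;

    *before = page_resident(p, len);   /* mapped, but not yet touched */
    *(volatile char *)p = 1;           /* first touch: page fault, page filled in */
    *after = page_resident(p, len);

    munmap(p, len);
    return 0;
}
```

On a typical Linux system the page is expected to be reported as not resident
before the store and resident afterwards, although some sandboxed environments
report every page as resident.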

Cheers,
Kyle Moffett

2006-10-16 19:47:30

by mfbaustx

Subject: Re: copy_from_user / copy_to_user with no swap space

>>> No. Your code may be only partially paged into RAM.
>>> The same can happen for any mmaped data.

That's what I thought I read. But then my question is: with on-demand
paging, is it possible to have two processes partially paged? Surely, it
MUST be the case that any processes with overlapping logical address
spaces must be paged coherently. So, while on-demand "paging-in" allows
for partial paging of a process, is it the case that, on a context switch,
the user-space PTE's are completely erased (so that you get page-faults
and can then on-demand page them in...)?





2006-10-16 20:21:31

by Horst H. von Brand

Subject: Re: copy_from_user / copy_to_user with no swap space

mfbaustx wrote:
> I've been trying to find or derive a definitive answer to this
> question for a while now but can't quite get over the hump.
>
> I understand when/why copy_<to|from>_user (and siblings) are required
> (address validation, guaranteeing a process is paged in, etc...). The
> question is: if you have no swap space (or virtual memory or
> whatever), can there ever be a case in which any valid pointer to a
> buffer in user-space would be incorrect as a result of another
> process's PTE being present? Put another way: can a process be
> partially paged?

Yes. The executable (including data areas) and shared libraries are demand
paged in (and read-only areas could also be evicted), so they can very well
be only partially in memory.

In any case, relying on "this kernel will never have swap" isn't wise...
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria +56 32 2654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513

2006-10-16 20:26:57

by mfbaustx

Subject: Re: copy_from_user / copy_to_user with no swap space

> pages and memory-mapped files may be "swapped-out" even *without* a swap
> device. As an example, when I first start /bin/bash (ignoring readahead

Fair enough. The kernel can reclaim pieces of RAM knowing that certain
text sections will be available on the storage medium from which they were
originally loaded. Right?

Also, I suppose one of the less obvious side-effects of using copy_to_user
would be to cause a copy-on-write of a data section?
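
A small userspace experiment bears on that copy-on-write question. It is a
sketch only, assuming Linux, a writable /tmp, and a made-up function name
(kernel_write_triggers_cow). read(2) makes the kernel store into a
MAP_PRIVATE file mapping; if the store goes through copy-on-write, the
mapping shows the new bytes while the file on disk keeps the old ones.

```c
#define _DEFAULT_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if a kernel-side store into a private file
 * mapping changed the mapping but not the file, 0 if not, -1 on setup
 * failure. */
int kernel_write_triggers_cow(void)
{
    char path[] = "/tmp/cow-demo-XXXXXX";
    char on_disk[4] = {0};
    int fds[2];
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                      /* file lives until fd is closed */

    if (write(fd, "AAAA", 4) != 4 || pipe(fds) != 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (map != MAP_FAILED) {
        /* The kernel, not this process, writes "BBBB" into the mapping. */
        if (write(fds[1], "BBBB", 4) == 4 &&
            read(fds[0], map, 4) == 4 &&
            pread(fd, on_disk, 4, 0) == 4)
            result = memcmp(map, "BBBB", 4) == 0 &&
                     memcmp(on_disk, "AAAA", 4) == 0;
        munmap(map, 4);
    }

    close(fds[0]);
    close(fds[1]);
    close(fd);
    return result;
}
```

A return value of 1 means the process got its own private copy of the page as
a side effect of the kernel's copy, and the underlying file was left alone.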

2006-10-17 12:04:24

by Helge Hafting

Subject: Re: copy_from_user / copy_to_user with no swap space

mfbaustx wrote:
>>>> No. Your code may be only partially paged into RAM.
>>>> The same can happen for any mmaped data.
>
> That's what I thought I read. But then my question is: with
> on-demand paging, is it possible to have two processes partially
> paged? Surely, it MUST be the case that any processes with
> overlapping logical address spaces must be paged coherently. So,
> while on-demand "paging-in" allows for partial paging of a process, is
> it the case that, on a context switch, the user-space PTE's are
> completely erased (so that you get page-faults and can then on-demand
> page them in...)?

You can surely have two or more processes partially paged.
Or some processes more or less paged out, while others are not.

The kernel never loses track of the address spaces, and knows very well
which block on the swap device maps to what address. And of course
it knows which process the block belongs to, too.

Several processes can all have their own address 4096 swapped out
at the same time, for example, obviously to different blocks on the
swap disk. There is no need for any special care when several
processes are swapped at the same time.

Demand paging happens when a process tries to use memory but the
memory isn't there. The processor then raises an exception (a page
fault), and the kernel schedules read-in of the missing memory. When the
memory eventually gets there, the process is allowed to continue.

Helge Hafting

2006-10-17 13:23:13

by Horst H. von Brand

Subject: Re: copy_from_user / copy_to_user with no swap space

mfbaustx wrote:
> >>> No. Your code may be only partially paged into RAM.
> >>> The same can happen for any mmaped data.

> That's what I thought I read. But then my question is: with
> on-demand paging, is it possible to have two processes partially
> paged?

Why shouldn't they be? The whole idea is having just /parts/ (hopefully the
ones in active use) in memory.

> Surely, it MUST be the case that any processes with
> overlapping logical address spaces must be paged coherently.

I don't know what this is supposed to mean...

> So,
> while on-demand "paging-in" allows for partial paging of a process,
> is it the case that, on a context switch, the user-space PTE's are
> completely erased (so that you get page-faults and can then on-demand
> page them in...)?

Each process has its own page tables; they don't get in each other's
hair. And managing exactly this is what page tables are for: several
processes get access to the /same/ logical addresses, but at /different/
physical addresses.
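
The point about same logical address, different physical memory can be
illustrated with fork(). This is a rough sketch, not anything taken from the
kernel: the names same_address_private_contents, struct report and
private_value are invented for the example. Parent and child both look at the
same virtual address, the child stores a new value there, and the parent's
copy stays as it was.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives at one virtual address that both processes will use. */
static volatile int private_value = 1;

/* What the child reports back: the address it used and the value it saw. */
struct report {
    uintptr_t addr;
    int value;
};

/* Hypothetical check: returns 1 if the child changed the variable at the
 * same virtual address without affecting the parent, 0 if not, -1 on
 * failure. */
int same_address_private_contents(void)
{
    int fds[2];
    struct report child = {0, 0};

    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: same logical address, but its own page after this store. */
        private_value = 2;
        struct report r = { (uintptr_t)&private_value, private_value };
        ssize_t n = write(fds[1], &r, sizeof r);
        _exit(n == (ssize_t)sizeof r ? 0 : 1);
    }

    close(fds[1]);
    ssize_t n = read(fds[0], &child, sizeof child);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    if (n != (ssize_t)sizeof child)
        return -1;

    return child.addr == (uintptr_t)&private_value &&
           child.value == 2 &&
           private_value == 1;
}
```

A result of 1 says the two processes used the identical address and still saw
different contents, which is only possible because each has its own mapping.
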
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria +56 32 2654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 2797513