2003-03-06 16:41:39

by Hugh Dickins

[permalink] [raw]
Subject: [PATCH] nonlinear oddities

1. Revert MAP_NONLINEAR and VM_NONLINEAR: I can easily imagine wanting
VM_NONLINEAR in future, warning that vma is unusual, but currently it's
not useful: install_page just needs to SetPageAnon if the page is put
somewhere try_to_unmap_obj_one wouldn't be able to find it.

2. filemap_populate and shmem_populate expect an absolute pgoff, but
try_to_unmap_one is forgetting to add in vm_pgoff when doing set_pte.
Could be done the other way round, with relative pgoff in the pte?
No, that would make splitting a vma tedious.

3. No patch included, but I believe 2.5.64-mm1 is testing Ingo's
file-offset-in-pte very much less than you imagine (I've yet to hit
a breakpoint on do_file_page, and I don't think that's down to the
patches above): Dave's work means that the file pages don't arrive
at Ingo's code to set the pte with file offset (unless you actually
use Ingo's syscall) - difficult to test both at once, I think.

Hugh

--- 2.5.64-mm1/include/asm-i386/mman.h Thu Mar 6 08:24:23 2003
+++ linux/include/asm-i386/mman.h Thu Mar 6 15:59:49 2003
@@ -20,7 +20,6 @@
#define MAP_NORESERVE 0x4000 /* don't check for reservations */
#define MAP_POPULATE 0x8000 /* populate (prefault) pagetables */
#define MAP_NONBLOCK 0x10000 /* do not block on IO */
-#define MAP_NONLINEAR 0x20000 /* will be used for remap_file_pages */

#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
--- 2.5.64-mm1/include/asm-ppc64/mman.h Thu Mar 6 08:24:23 2003
+++ linux/include/asm-ppc64/mman.h Thu Mar 6 15:59:49 2003
@@ -36,7 +36,6 @@

#define MAP_POPULATE 0x8000 /* populate (prefault) pagetables */
#define MAP_NONBLOCK 0x10000 /* do not block on IO */
-#define MAP_NONLINEAR 0x20000 /* Mapping may use remap_file_pages */

#define MADV_NORMAL 0x0 /* default page-in behavior */
#define MADV_RANDOM 0x1 /* page-in minimum required */
--- 2.5.64-mm1/include/linux/mm.h Thu Mar 6 08:24:24 2003
+++ linux/include/linux/mm.h Thu Mar 6 15:59:49 2003
@@ -107,7 +107,6 @@
#define VM_RESERVED 0x00080000 /* Don't unmap it from swap_out */
#define VM_ACCOUNT 0x00100000 /* Is a VM accounted object */
#define VM_HUGETLB 0x00400000 /* Huge TLB Page VM */
-#define VM_NONLINEAR 0x00800000 /* Nonlinear area */

#ifdef CONFIG_STACK_GROWSUP
#define VM_STACK_FLAGS (VM_GROWSUP | VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT)
--- 2.5.64-mm1/mm/fremap.c Thu Mar 6 08:24:24 2003
+++ linux/mm/fremap.c Thu Mar 6 15:59:49 2003
@@ -1,5 +1,5 @@
/*
- * linux/mm/mpopulate.c
+ * linux/mm/fremap.c
*
* Explicit pagetable population and nonlinear (random) mappings support.
*
@@ -57,6 +57,7 @@
pgd_t *pgd;
pmd_t *pmd;
struct pte_chain *pte_chain;
+ unsigned long pgidx;

pte_chain = pte_chain_alloc(GFP_KERNEL);
if (!pte_chain)
@@ -79,7 +80,10 @@
flush_icache_page(vma, page);
entry = mk_pte(page, prot);
set_pte(pte, entry);
- if (vma->vm_flags & VM_NONLINEAR)
+ pgidx = (addr - vma->vm_start) >> PAGE_SHIFT;
+ pgidx += vma->vm_pgoff;
+ pgidx >>= PAGE_CACHE_SHIFT - PAGE_SHIFT;
+ if (page->index != pgidx)
SetPageAnon(page);
pte_chain = page_add_rmap(page, pte, pte_chain);
pte_unmap(pte);
@@ -139,8 +143,7 @@
* and that the remapped range is valid and fully within
* the single existing vma:
*/
- if (vma &&
- ((vma->vm_flags & (VM_SHARED|VM_NONLINEAR)) == (VM_SHARED|VM_NONLINEAR)) &&
+ if (vma && (vma->vm_flags & VM_SHARED) &&
vma->vm_ops && vma->vm_ops->populate &&
end > start && start >= vma->vm_start &&
end <= vma->vm_end)
--- 2.5.64-mm1/mm/mmap.c Thu Mar 6 08:24:24 2003
+++ linux/mm/mmap.c Thu Mar 6 15:59:49 2003
@@ -219,7 +219,6 @@
flag_bits =
_trans(flags, MAP_GROWSDOWN, VM_GROWSDOWN) |
_trans(flags, MAP_DENYWRITE, VM_DENYWRITE) |
- _trans(flags, MAP_NONLINEAR, VM_NONLINEAR) |
_trans(flags, MAP_EXECUTABLE, VM_EXECUTABLE);
return prot_bits | flag_bits;
#undef _trans
--- 2.5.64-mm1/mm/rmap.c Thu Mar 6 08:24:24 2003
+++ linux/mm/rmap.c Thu Mar 6 15:59:49 2003
@@ -598,6 +598,8 @@
* in the pte.
*/
pgidx = (address - vma->vm_start) >> PAGE_SHIFT;
+ pgidx += vma->vm_pgoff;
+ pgidx >>= PAGE_CACHE_SHIFT - PAGE_SHIFT;
if (1 || page->index != pgidx) {
set_pte(ptep, pgoff_to_pte(page->index));
BUG_ON(!pte_file(*ptep));


2003-03-06 19:02:31

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] nonlinear oddities

On Thu, 6 Mar 2003, Hugh Dickins wrote:
> 1. Revert MAP_NONLINEAR and VM_NONLINEAR: I can easily imagine wanting
> VM_NONLINEAR in future, warning that vma is unusual, but currently it's
> not useful: install_page just needs to SetPageAnon if the page is put
> somewhere try_to_unmap_obj_one wouldn't be able to find it.

Now I think about it more, install_page's SetPageAnon is not good at all.
That (unlocked) page may already be mapped into other vmas as a shared
file page, non-zero mapcount, we can't suddenly switch it to Anon
(pte_chained) without doing the work to handle that case.

Before Ingo's file-offset-in-pte, it would have been consistent without
any SetPageAnon there, because the remapped pages would be unreliable
unless locked, and having them unfindable is equivalent to being locked.

But if we're to bother with file-offset-in-pte, then we have to bother
with finding the remapped pages, handling mapped-earlier case properly.

Hugh

2003-03-06 19:27:52

by Dave McCracken

[permalink] [raw]
Subject: Re: [PATCH] nonlinear oddities


--On Thursday, March 06, 2003 19:14:47 +0000 Hugh Dickins
<[email protected]> wrote:

> Now I think about it more, install_page's SetPageAnon is not good at all.
> That (unlocked) page may already be mapped into other vmas as a shared
> file page, non-zero mapcount, we can't suddenly switch it to Anon
> (pte_chained) without doing the work to handle that case.

Ouch. You're right. I'll go stare at it for awhile and see if any
solutions jump out at me. I suppose at worst I could write a function to
convert an object page to use pte_chains, but it'd be nice if that weren't
necessary.

Dave McCracken

======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
[email protected] T/L 678-3059

2003-03-10 04:26:25

by Ed Tomlinson

[permalink] [raw]
Subject: Re: [PATCH] nonlinear oddities

Dave McCracken wrote:

>
> --On Thursday, March 06, 2003 19:14:47 +0000 Hugh Dickins
> <[email protected]> wrote:
>
>> Now I think about it more, install_page's SetPageAnon is not good at all.
>> That (unlocked) page may already be mapped into other vmas as a shared
>> file page, non-zero mapcount, we can't suddenly switch it to Anon
>> (pte_chained) without doing the work to handle that case.
>
> Ouch. You're right. I'll go stare at it for awhile and see if any
> solutions jump out at me. I suppose at worst I could write a function to
> convert an object page to use pte_chains, but it'd be nice if that weren't
> necessary.
>
> Dave McCracken

Does this look like the above happening?

Mar 8 18:44:31 oscar kernel: printing eip:
Mar 8 18:44:31 oscar kernel: c012a29c
Mar 8 18:44:31 oscar kernel: Oops: 0002
Mar 8 18:44:31 oscar kernel: CPU: 0
Mar 8 18:44:31 oscar kernel: EIP: 0060:[find_get_page+40/72] Not tainted
Mar 8 18:44:31 oscar kernel: EFLAGS: 00010206
Mar 8 18:44:31 oscar kernel: EIP is at find_get_page+0x28/0x48
Mar 8 18:44:31 oscar kernel: eax: 6e2f6572 ebx: d3c7e000 ecx: 00000000 edx: d88a2288
Mar 8 18:44:31 oscar kernel: esi: 6e2f6572 edi: 00000000 ebp: d3c7fe34 esp: d3c7fe2c
Mar 8 18:44:31 oscar kernel: ds: 007b es: 007b ss: 0068
Mar 8 18:44:31 oscar kernel: Process startkde (pid: 1514, threadinfo=d3c7e000 task=d9446720)
Mar 8 18:44:31 oscar kernel: Stack: d3c7fea0 00000000 d3c7fe60 c012a61b d74e3b68 00000000 d3c7fea0 00000000
Mar 8 18:44:31 oscar kernel: 00000000 00001000 00000000 00000000 d74e3ad4 d3c7feb0 c012ab3f d74e3b68
Mar 8 18:44:31 oscar kernel: d6b1366c d6b13620 d6b13640 d3c7fea0 c012a8c8 d3c7fed8 d6b13640 d6b13620
Mar 8 18:44:31 oscar kernel: Call Trace:
Mar 8 18:44:31 oscar kernel: [do_generic_mapping_read+119/804] do_generic_mapping_read+0x77/0x324
Mar 8 18:44:31 oscar kernel: [__generic_file_aio_read+379/408] __generic_file_aio_read+0x17b/0x198
Mar 8 18:44:31 oscar kernel: [file_read_actor+0/252] file_read_actor+0x0/0xfc
Mar 8 18:44:31 oscar kernel: [generic_file_read+127/156] generic_file_read+0x7f/0x9c
Mar 8 18:44:31 oscar kernel: [handle_mm_fault+107/292] handle_mm_fault+0x6b/0x124
Mar 8 18:44:31 oscar kernel: [do_page_fault+0/1029] do_page_fault+0x0/0x405
Mar 8 18:44:31 oscar kernel: [open_namei+726/1020] open_namei+0x2d6/0x3fc
Mar 8 18:44:31 oscar kernel: [dentry_open+231/480] dentry_open+0xe7/0x1e0
Mar 8 18:44:31 oscar kernel: [file_ioctl+331/352] file_ioctl+0x14b/0x160
Mar 8 18:44:31 oscar kernel: [vfs_read+171/336] vfs_read+0xab/0x150
Mar 8 18:44:31 oscar kernel: [sys_read+40/60] sys_read+0x28/0x3c
Mar 8 18:44:31 oscar kernel: [syscall_call+7/11] syscall_call+0x7/0xb
Mar 8 18:44:31 oscar kernel:
Mar 8 18:44:31 oscar kernel: Code: ff 46 04 90 ff 4b 10 8b 43 08 a8 08 74 06 e8 19 96 fe ff 90

This was with 64-mm2 after stoping X, changing some font parms, and restarting it

Ed Tomlinson

2003-03-10 11:55:15

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] nonlinear oddities

On Sat, 8 Mar 2003, Ed Tomlinson wrote:
> Dave McCracken wrote:
> > --On Thursday, March 06 2003, Hugh Dickins wrote:
> >
> >> Now I think about it more, install_page's SetPageAnon is not good at all.
> >> That (unlocked) page may already be mapped into other vmas as a shared
> >> file page, non-zero mapcount, we can't suddenly switch it to Anon
> >> (pte_chained) without doing the work to handle that case.
> >
> > Ouch. You're right. I'll go stare at it for awhile and see if any
> > solutions jump out at me. I suppose at worst I could write a function to
> > convert an object page to use pte_chains, but it'd be nice if that weren't
> > necessary.
>
> Does this look like the above happening?

I very much doubt it. You'd have to be running an application using Ingo's
new (since 2.5.46) remap_file_pages system call to suffer from Dave's ouch
e.g. something like an experimental version of a major database app. That
said, I've given no thought to what that ouch might look like in practice.

Hugh