2002-08-29 23:29:53

by Benjamin LaHaise

Subject: weirdness with ->mm vs ->active_mm handling

Hello,

In trying to track down a bug, I found routines like generic_file_read
getting called with current->mm == NULL. This seems to be a valid state
for lazy tlb tasks, but the code throughout the kernel doesn't seem to
assume that. For starters, I suspect that the fault handling path in
arch/i386/mm/fault.c should probably be using ->active_mm, or else the
correct page tables will not get updated on a vmalloc fault in some cases.
I'm surprised nobody has encountered this before, hence I'd like comments
on the below (untested) approach of starting to convert ->mm users into
using ->active_mm. As a benefit, we should be able to safely eliminate
the if (mm) check in find_vma once all the stragglers are caught.

-ben

:r ~/patches/v2.5/v2.5.32-active_mm.diff
diff -urN v2.5.32/arch/i386/mm/fault.c active_mm-v2.5.32/arch/i386/mm/fault.c
--- v2.5.32/arch/i386/mm/fault.c Tue Aug 27 16:00:08 2002
+++ active_mm-v2.5.32/arch/i386/mm/fault.c Thu Aug 29 19:27:17 2002
@@ -35,13 +35,14 @@
  */
 int __verify_write(const void * addr, unsigned long size)
 {
+	struct mm_struct *mm = current->active_mm;
 	struct vm_area_struct * vma;
 	unsigned long start = (unsigned long) addr;

 	if (!size)
 		return 1;

-	vma = find_vma(current->mm, start);
+	vma = find_vma(mm, start);
 	if (!vma)
 		goto bad_area;
 	if (vma->vm_start > start)
@@ -57,7 +58,7 @@

 	for (;;) {
 	survive:
-		switch (handle_mm_fault(current->mm, vma, start, 1)) {
+		switch (handle_mm_fault(mm, vma, start, 1)) {
 			case VM_FAULT_SIGBUS:
 				goto bad_area;
 			case VM_FAULT_OOM:
@@ -177,7 +178,7 @@
 	if (address >= TASK_SIZE && !(error_code & 5))
 		goto vmalloc_fault;

-	mm = tsk->mm;
+	mm = tsk->active_mm;
 	info.si_code = SEGV_MAPERR;

 	/*
diff -urN v2.5.32/mm/mmap.c active_mm-v2.5.32/mm/mmap.c
--- v2.5.32/mm/mmap.c Tue Aug 20 19:22:36 2002
+++ active_mm-v2.5.32/mm/mmap.c Thu Aug 29 19:28:02 2002
@@ -693,32 +693,30 @@
 {
 	struct vm_area_struct *vma = NULL;

-	if (mm) {
-		/* Check the cache first. */
-		/* (Cache hit rate is typically around 35%.) */
-		vma = mm->mmap_cache;
-		if (!(vma && vma->vm_end > addr && vma->vm_start <= addr)) {
-			rb_node_t * rb_node;
+	/* Check the cache first. */
+	/* (Cache hit rate is typically around 35%.) */
+	vma = mm->mmap_cache;
+	if (!(vma && vma->vm_end > addr && vma->vm_start <= addr)) {
+		rb_node_t * rb_node;

-			rb_node = mm->mm_rb.rb_node;
-			vma = NULL;
+		rb_node = mm->mm_rb.rb_node;
+		vma = NULL;

-			while (rb_node) {
-				struct vm_area_struct * vma_tmp;
+		while (rb_node) {
+			struct vm_area_struct * vma_tmp;

-				vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);
+			vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);

-				if (vma_tmp->vm_end > addr) {
-					vma = vma_tmp;
-					if (vma_tmp->vm_start <= addr)
-						break;
-					rb_node = rb_node->rb_left;
-				} else
-					rb_node = rb_node->rb_right;
-			}
-			if (vma)
-				mm->mmap_cache = vma;
+			if (vma_tmp->vm_end > addr) {
+				vma = vma_tmp;
+				if (vma_tmp->vm_start <= addr)
+					break;
+				rb_node = rb_node->rb_left;
+			} else
+				rb_node = rb_node->rb_right;
 		}
+		if (vma)
+			mm->mmap_cache = vma;
 	}
 	return vma;
 }


2002-08-29 23:41:27

by Alexander Viro

Subject: Re: weirdness with ->mm vs ->active_mm handling



On Thu, 29 Aug 2002, Benjamin LaHaise wrote:

> Hello,
>
> In trying to track down a bug, I found routines like generic_file_read
> getting called with current->mm == NULL. This seems to be a valid state
> for lazy tlb tasks, but the code throughout the kernel doesn't seem to
> assume that.

Lazy-TLB == "promise not to use a lot of stuff in the kernel". In particular,
any page fault in that state is a bug.
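
For readers unfamiliar with the ->mm / ->active_mm split, here is a rough
userspace sketch of the idea, not the kernel's actual code. The names
mm_model, task_model and switch_to_model are hypothetical, and the counter
only loosely stands in for the kernel's reference counting. It assumes the
usual arrangement in which a task without its own address space (->mm ==
NULL) borrows the previous task's page tables via ->active_mm while it runs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mm_model {
    int users;                   /* loosely models a reference count */
};

struct task_model {
    struct mm_model *mm;         /* NULL for a task with no user address space */
    struct mm_model *active_mm;  /* page tables actually loaded while running */
};

/* Model of a context switch from prev to next (assumption: prev != next). */
void switch_to_model(struct task_model *prev, struct task_model *next)
{
    assert(prev != NULL && next != NULL && prev != next);
    assert(prev->active_mm != NULL);

    if (next->mm == NULL) {
        /* Lazy TLB: keep the old page tables loaded and just borrow them. */
        next->active_mm = prev->active_mm;
        next->active_mm->users++;
    } else {
        /* Ordinary task: run on its own address space. */
        next->active_mm = next->mm;
    }

    if (prev->mm == NULL) {
        /* prev was only borrowing; hand the address space back. */
        prev->active_mm->users--;
        prev->active_mm = NULL;
    }
}
```

In this model a lazy-TLB task always has a usable ->active_mm while it is
running, but its ->mm stays NULL, which is the state being discussed in this
thread.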

2002-08-29 23:47:56

by Benjamin LaHaise

Subject: Re: weirdness with ->mm vs ->active_mm handling

On Thu, Aug 29, 2002 at 07:45:49PM -0400, Alexander Viro wrote:
> Lazy-TLB == "promise not to use a lot of stuff in the kernel". In particular,
> any page fault in that state is a bug.

In that case the lazy vmalloc faulting code is busted, as accessing a vmalloc
page may need to fill in a pgd/pmd entry from a lazy tlb task. Got an idea
for a better fix?

-ben
--
"You will be reincarnated as a toad; and you will be much happier."

2002-08-30 04:59:49

by Linus Torvalds

Subject: Re: weirdness with ->mm vs ->active_mm handling


On Thu, 29 Aug 2002, Benjamin LaHaise wrote:
>
> In trying to track down a bug, I found routines like generic_file_read
> getting called with current->mm == NULL. This seems to be a valid state
> for lazy tlb tasks, but the code throughout the kernel doesn't seem to
> assume that.

Hmm.. Have you actually ever seen this?

When tsk->mm is NULL, you should never EVER get a page fault, except for
the one special case of the vmalloc'ed area (which is tested for in
do_page_fault() before we even _look_ at "tsk->mm").

In fact, do_page_fault() very much checks

	if (in_atomic() || !mm)
		goto no_context;

which says that a page fault when in a lazy TLB context should always
cause a trap, killing the thing (or, if the access has a fixup, calling
the fixup - although I don't think that should happen in any normal code).

In other words: I think your patch is "functionally correct", in that it
should work fine, but on the other hand having a NULL tsk->mm and trying
to do any user-level access is _so_ wrong that I'd much rather take a NULL
pointer fault than try to do something "sane" about it.

Linus
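
The ordering described in the message above can be sketched as a small
userspace model. This is an illustration only, not the real do_page_fault():
classify_fault, fault_task, fault_mm and TASK_SIZE_MODEL are hypothetical
names, and the 0xC0000000 boundary assumes the common 3G/1G i386 split. The
two conditions themselves are the ones quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TASK_SIZE_MODEL 0xC0000000UL  /* assumption: 3G/1G i386 split */

enum fault_path {
    FAULT_VMALLOC,     /* handled without ever looking at tsk->mm */
    FAULT_NO_CONTEXT,  /* fixup if one exists, otherwise an oops */
    FAULT_USER_MM      /* normal path: find_vma() / handle_mm_fault() */
};

struct fault_mm { int unused; };                 /* hypothetical placeholder */
struct fault_task { const struct fault_mm *mm; };

/*
 * Mirrors the order of the checks quoted in the thread:
 *   1. a kernel-address fault with neither bit 0 (protection) nor
 *      bit 2 (user mode) set in the error code goes to the vmalloc path;
 *   2. only afterwards is tsk->mm examined.
 */
enum fault_path classify_fault(const struct fault_task *tsk,
                               unsigned long address,
                               unsigned long error_code,
                               bool in_atomic)
{
    assert(tsk != NULL);

    if (address >= TASK_SIZE_MODEL && !(error_code & 5))
        return FAULT_VMALLOC;

    if (in_atomic || tsk->mm == NULL)
        return FAULT_NO_CONTEXT;

    return FAULT_USER_MM;
}
```

Under this model a task with a NULL ->mm can still take a vmalloc fault, but
any fault on a user address ends up on the no_context path.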