2012-08-24 14:38:16

by Wanpeng Li

Subject: [PATCH 1/2] mm/mmu_notifier: init notifier if necessary

From: Gavin Shan <[email protected]>

While registering an MMU notifier, a new mmu_notifier_mm instance is
allocated and then freed again if the current mm_struct's
mmu_notifier_mm has already been initialized. That causes some
overhead. The patch eliminates that.

Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
mm/mmu_notifier.c | 22 +++++++++++-----------
1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 862b608..fb4067f 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -192,22 +192,23 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,

BUG_ON(atomic_read(&mm->mm_users) <= 0);

- ret = -ENOMEM;
- mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
- if (unlikely(!mmu_notifier_mm))
- goto out;
-
if (take_mmap_sem)
down_write(&mm->mmap_sem);
ret = mm_take_all_locks(mm);
if (unlikely(ret))
- goto out_cleanup;
+ goto out;

if (!mm_has_notifiers(mm)) {
+ mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm),
+ GFP_ATOMIC);
+ if (unlikely(!mmu_notifier_mm)) {
+ ret = -ENOMEM;
+ goto out_of_mem;
+ }
INIT_HLIST_HEAD(&mmu_notifier_mm->list);
spin_lock_init(&mmu_notifier_mm->lock);
+
mm->mmu_notifier_mm = mmu_notifier_mm;
- mmu_notifier_mm = NULL;
}
atomic_inc(&mm->mm_count);

@@ -223,13 +224,12 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
hlist_add_head(&mn->hlist, &mm->mmu_notifier_mm->list);
spin_unlock(&mm->mmu_notifier_mm->lock);

+out_of_mem:
mm_drop_all_locks(mm);
-out_cleanup:
+out:
if (take_mmap_sem)
up_write(&mm->mmap_sem);
- /* kfree() does nothing if mmu_notifier_mm is NULL */
- kfree(mmu_notifier_mm);
-out:
+
BUG_ON(atomic_read(&mm->mm_users) <= 0);
return ret;
}
--
1.7.7.6
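A note on the gfp flag change buried in the hunk above: because the
patch moves the allocation to after mm_take_all_locks(), it also
switches from GFP_KERNEL to GFP_ATOMIC, a change the review below picks
up on. As a rough sketch of the difference between the two flags (per
<linux/gfp.h>; the variable names here are illustrative only):

	/* GFP_KERNEL: may sleep and enter direct reclaim. Reliable,
	 * but only legal in process context where sleeping is safe. */
	buf = kmalloc(size, GFP_KERNEL);

	/* GFP_ATOMIC: never sleeps and may dip into emergency
	 * reserves. Legal in atomic context, but can fail outright
	 * under memory pressure. */
	buf = kmalloc(size, GFP_ATOMIC);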


2012-08-24 14:38:39

by Wanpeng Li

Subject: [PATCH 2/2] mm/vmscan: fix error number for failed kthread

From: Gavin Shan <[email protected]>

The patch fixes the return value when creation of the kswapd kernel
thread fails: return the errno encoded by kthread_run() via PTR_ERR()
instead of a bare -1. Also, the error message is now printed at
KERN_ERR priority via pr_err().

Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
mm/vmscan.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8d01243..ddf00a7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3101,9 +3101,10 @@ int kswapd_run(int nid)
if (IS_ERR(pgdat->kswapd)) {
/* failure at boot is fatal */
BUG_ON(system_state == SYSTEM_BOOTING);
- printk("Failed to start kswapd on node %d\n",nid);
- ret = -1;
+ pr_err("Failed to start kswapd on node %d\n", nid);
+ ret = PTR_ERR(pgdat->kswapd);
}
+
return ret;
}

--
1.7.7.6
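The fix leans on the kernel's ERR_PTR convention: kthread_run() returns
either a valid task_struct pointer or a negative errno encoded in the
pointer itself, which IS_ERR()/PTR_ERR() from <linux/err.h> decode. A
minimal sketch of the idiom as used here (abridged; kswapd_run()'s
surrounding setup is elided):

	pgdat->kswapd = kthread_run(kswapd, pgdat, "kswapd%d", nid);
	if (IS_ERR(pgdat->kswapd)) {
		/* Propagate the real errno (e.g. -ENOMEM) that
		 * kthread_run() encoded, instead of a bare -1. */
		ret = PTR_ERR(pgdat->kswapd);
	}
	return ret;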

2012-08-24 21:51:56

by Andrew Morton

Subject: Re: [PATCH 1/2] mm/mmu_notifier: init notifier if necessary

On Fri, 24 Aug 2012 22:37:55 +0800
Wanpeng Li <[email protected]> wrote:

> From: Gavin Shan <[email protected]>
>
> While registering an MMU notifier, a new mmu_notifier_mm instance is
> allocated and then freed again if the current mm_struct's
> mmu_notifier_mm has already been initialized. That causes some
> overhead. The patch eliminates that.
>
> Signed-off-by: Gavin Shan <[email protected]>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> mm/mmu_notifier.c | 22 +++++++++++-----------
> 1 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 862b608..fb4067f 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -192,22 +192,23 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
>
> BUG_ON(atomic_read(&mm->mm_users) <= 0);
>
> - ret = -ENOMEM;
> - mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
> - if (unlikely(!mmu_notifier_mm))
> - goto out;
> -
> if (take_mmap_sem)
> down_write(&mm->mmap_sem);
> ret = mm_take_all_locks(mm);
> if (unlikely(ret))
> - goto out_cleanup;
> + goto out;
>
> if (!mm_has_notifiers(mm)) {
> + mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm),
> + GFP_ATOMIC);

Why was the code switched to the far weaker GFP_ATOMIC? We can still
perform sleeping allocations inside mmap_sem.

> + if (unlikely(!mmu_notifier_mm)) {
> + ret = -ENOMEM;
> + goto out_of_mem;
> + }
> INIT_HLIST_HEAD(&mmu_notifier_mm->list);
> spin_lock_init(&mmu_notifier_mm->lock);
> +
> mm->mmu_notifier_mm = mmu_notifier_mm;
> - mmu_notifier_mm = NULL;
> }
> atomic_inc(&mm->mm_count);
>

2012-08-30 19:13:07

by Andrew Morton

Subject: Re: [PATCH 1/2] mm/mmu_notifier: init notifier if necessary

On Sat, 25 Aug 2012 17:47:50 +0800
Gavin Shan <[email protected]> wrote:

> >> --- a/mm/mmu_notifier.c
> >> +++ b/mm/mmu_notifier.c
> >> @@ -192,22 +192,23 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
> >>
> >> BUG_ON(atomic_read(&mm->mm_users) <= 0);
> >>
> >> - ret = -ENOMEM;
> >> - mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
> >> - if (unlikely(!mmu_notifier_mm))
> >> - goto out;
> >> -
> >> if (take_mmap_sem)
> >> down_write(&mm->mmap_sem);
> >> ret = mm_take_all_locks(mm);
> >> if (unlikely(ret))
> >> - goto out_cleanup;
> >> + goto out;
> >>
> >> if (!mm_has_notifiers(mm)) {
> >> + mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm),
> >> + GFP_ATOMIC);
> >
> >Why was the code switched to the far weaker GFP_ATOMIC? We can still
> >perform sleeping allocations inside mmap_sem.
> >
>
> Yes, we can sleep while allocating memory, but we're holding
> mmap_sem. GFP_KERNEL could possibly block somebody else who is also
> waiting on mmap_sem for a long time, even though that case should be
> rare :-)

GFP_ATOMIC allocations are unreliable. If the allocation attempt fails
here, an entire kernel subsystem will have failed, quite probably
requiring a reboot. It's a bad tradeoff.

Please fix this and retest. With lockdep enabled, of course.

And please do not attempt to sneak changes like this into the kernel
without even mentioning them in the changelog. If I hadn't happened
to notice this, we'd have ended up with a less reliable kernel.
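
For reference, the arrangement Andrew is defending is the one the patch
removed: do the sleeping GFP_KERNEL allocation up front, before any
locks are taken, and simply kfree() the buffer in the common case where
it turns out not to be needed. An abridged sketch of the pre-patch
do_mmu_notifier_register(), reconstructed from the removed lines of the
diff above, with the notifier list insertion and refcounting elided:

static int do_mmu_notifier_register(struct mmu_notifier *mn,
				    struct mm_struct *mm,
				    int take_mmap_sem)
{
	struct mmu_notifier_mm *mmu_notifier_mm;
	int ret;

	/* No locks held yet, so a sleeping allocation is safe. */
	ret = -ENOMEM;
	mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
	if (unlikely(!mmu_notifier_mm))
		goto out;

	if (take_mmap_sem)
		down_write(&mm->mmap_sem);
	ret = mm_take_all_locks(mm);
	if (unlikely(ret))
		goto out_cleanup;

	if (!mm_has_notifiers(mm)) {
		INIT_HLIST_HEAD(&mmu_notifier_mm->list);
		spin_lock_init(&mmu_notifier_mm->lock);
		mm->mmu_notifier_mm = mmu_notifier_mm;
		mmu_notifier_mm = NULL;		/* now owned by mm */
	}

	/* ... hlist_add_head() of mn under the notifier lock elided ... */

	mm_drop_all_locks(mm);
out_cleanup:
	if (take_mmap_sem)
		up_write(&mm->mmap_sem);
	/* kfree() does nothing if mmu_notifier_mm is NULL */
	kfree(mmu_notifier_mm);
out:
	return ret;
}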