2013-09-03 07:02:00

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v4 1/4] mm/vmalloc: don't set area->caller twice

Changelog:
*v1 -> v2: rebase against mmotm tree

The caller address has already been set in set_vmalloc_vm(), there's no need
to set it again in __vmalloc_area_node.

Reviewed-by: Zhang Yanfei <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
---
mm/vmalloc.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 1074543..d78d117 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1566,7 +1566,6 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
pages = kmalloc_node(array_size, nested_gfp, node);
}
area->pages = pages;
- area->caller = caller;
if (!area->pages) {
remove_vm_area(area->addr);
kfree(area);
--
1.8.1.2


2013-09-03 07:02:04

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v4 2/4] mm/vmalloc: revert "mm/vmalloc.c: emit the failure message before return"

Changelog:
*v2 -> v3: revert commit 46c001a2 directly

Don't warning twice in __vmalloc_area_node and __vmalloc_node_range if
__vmalloc_area_node allocation failure. This patch revert commit 46c001a2
(mm/vmalloc.c: emit the failure message before return).

Signed-off-by: Wanpeng Li <[email protected]>
---
mm/vmalloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d78d117..e3ec8b4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1635,7 +1635,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,

addr = __vmalloc_area_node(area, gfp_mask, prot, node, caller);
if (!addr)
- goto fail;
+ return NULL;

/*
* In this function, newly allocated vm_struct has VM_UNINITIALIZED
--
1.8.1.2

2013-09-03 07:02:08

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v4 4/4] mm/vmalloc: don't assume vmap_area w/o VM_VM_AREA flag is vm_map_ram allocation

There is a race window between vmap_area free and show vmap_area information.

A B

remove_vm_area
spin_lock(&vmap_area_lock);
va->flags &= ~VM_VM_AREA;
spin_unlock(&vmap_area_lock);
spin_lock(&vmap_area_lock);
if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEZING))
return 0;
if (!(va->flags & VM_VM_AREA)) {
seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
(void *)va->va_start, (void *)va->va_end,
va->va_end - va->va_start);
return 0;
}
free_unmap_vmap_area(va);
flush_cache_vunmap
free_unmap_vmap_area_noflush
unmap_vmap_area
free_vmap_area_noflush
va->flags |= VM_LAZY_FREE

The assumption is introduced by commit: d4033afd(mm, vmalloc: iterate vmap_area_list,
instead of vmlist, in vmallocinfo()). This patch fix it by drop the assumption and
keep not dump vm_map_ram allocation information as the logic before that commit.

Signed-off-by: Wanpeng Li <[email protected]>
---
mm/vmalloc.c | 7 -------
1 file changed, 7 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5368b17..62b7932 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2586,13 +2586,6 @@ static int s_show(struct seq_file *m, void *p)
if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEING))
return 0;

- if (!(va->flags & VM_VM_AREA)) {
- seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
- (void *)va->va_start, (void *)va->va_end,
- va->va_end - va->va_start);
- return 0;
- }
-
v = va->vm;

seq_printf(m, "0x%pK-0x%pK %7ld",
--
1.8.1.2

2013-09-03 07:02:26

by Wanpeng Li

[permalink] [raw]
Subject: [PATCH v4 3/4] mm/vmalloc: revert "mm/vmalloc.c: check VM_UNINITIALIZED flag in s_show instead of show_numa_info"

Changelog:
*v2 -> v3: revert commit d157a558 directly

The VM_UNINITIALIZED/VM_UNLIST flag introduced by commit f5252e00(mm: avoid
null pointer access in vm_struct via /proc/vmallocinfo) is used to avoid
accessing the pages field with unallocated page when show_numa_info() is
called. This patch move the check just before show_numa_info in order that
some messages still can be dumped via /proc/vmallocinfo. This patch revert
commit d157a558 (mm/vmalloc.c: check VM_UNINITIALIZED flag in s_show instead
of show_numa_info);

Signed-off-by: Wanpeng Li <[email protected]>
---
mm/vmalloc.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e3ec8b4..5368b17 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2562,6 +2562,11 @@ static void show_numa_info(struct seq_file *m, struct vm_struct *v)
if (!counters)
return;

+ /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
+ smp_rmb();
+ if (v->flags & VM_UNINITIALIZED)
+ return;
+
memset(counters, 0, nr_node_ids * sizeof(unsigned int));

for (nr = 0; nr < v->nr_pages; nr++)
@@ -2590,11 +2595,6 @@ static int s_show(struct seq_file *m, void *p)

v = va->vm;

- /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
- smp_rmb();
- if (v->flags & VM_UNINITIALIZED)
- return 0;
-
seq_printf(m, "0x%pK-0x%pK %7ld",
v->addr, v->addr + v->size, v->size);

--
1.8.1.2

2013-09-03 07:24:31

by Zhang Yanfei

[permalink] [raw]
Subject: Re: [PATCH v4 2/4] mm/vmalloc: revert "mm/vmalloc.c: emit the failure message before return"

On 09/03/2013 03:01 PM, Wanpeng Li wrote:
> Changelog:
> *v2 -> v3: revert commit 46c001a2 directly
>
> Don't warning twice in __vmalloc_area_node and __vmalloc_node_range if
> __vmalloc_area_node allocation failure. This patch revert commit 46c001a2
> (mm/vmalloc.c: emit the failure message before return).
>
> Signed-off-by: Wanpeng Li <[email protected]>

Reviewed-by: Zhang Yanfei <[email protected]>

> ---
> mm/vmalloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d78d117..e3ec8b4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1635,7 +1635,7 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
>
> addr = __vmalloc_area_node(area, gfp_mask, prot, node, caller);
> if (!addr)
> - goto fail;
> + return NULL;
>
> /*
> * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>


--
Thanks.
Zhang Yanfei

2013-09-03 07:24:45

by Zhang Yanfei

[permalink] [raw]
Subject: Re: [PATCH v4 3/4] mm/vmalloc: revert "mm/vmalloc.c: check VM_UNINITIALIZED flag in s_show instead of show_numa_info"

On 09/03/2013 03:01 PM, Wanpeng Li wrote:
> Changelog:
> *v2 -> v3: revert commit d157a558 directly
>
> The VM_UNINITIALIZED/VM_UNLIST flag introduced by commit f5252e00(mm: avoid
> null pointer access in vm_struct via /proc/vmallocinfo) is used to avoid
> accessing the pages field with unallocated page when show_numa_info() is
> called. This patch move the check just before show_numa_info in order that
> some messages still can be dumped via /proc/vmallocinfo. This patch revert
> commit d157a558 (mm/vmalloc.c: check VM_UNINITIALIZED flag in s_show instead
> of show_numa_info);
>
> Signed-off-by: Wanpeng Li <[email protected]>

Reviewed-by: Zhang Yanfei <[email protected]>

> ---
> mm/vmalloc.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e3ec8b4..5368b17 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2562,6 +2562,11 @@ static void show_numa_info(struct seq_file *m, struct vm_struct *v)
> if (!counters)
> return;
>
> + /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
> + smp_rmb();
> + if (v->flags & VM_UNINITIALIZED)
> + return;
> +
> memset(counters, 0, nr_node_ids * sizeof(unsigned int));
>
> for (nr = 0; nr < v->nr_pages; nr++)
> @@ -2590,11 +2595,6 @@ static int s_show(struct seq_file *m, void *p)
>
> v = va->vm;
>
> - /* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
> - smp_rmb();
> - if (v->flags & VM_UNINITIALIZED)
> - return 0;
> -
> seq_printf(m, "0x%pK-0x%pK %7ld",
> v->addr, v->addr + v->size, v->size);
>
>


--
Thanks.
Zhang Yanfei

2013-09-03 07:41:30

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] mm/vmalloc: don't assume vmap_area w/o VM_VM_AREA flag is vm_map_ram allocation

On Tue, Sep 03, 2013 at 03:01:46PM +0800, Wanpeng Li wrote:
> There is a race window between vmap_area free and show vmap_area information.
>
> A B
>
> remove_vm_area
> spin_lock(&vmap_area_lock);
> va->flags &= ~VM_VM_AREA;
> spin_unlock(&vmap_area_lock);
> spin_lock(&vmap_area_lock);
> if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEZING))
> return 0;
> if (!(va->flags & VM_VM_AREA)) {
> seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
> (void *)va->va_start, (void *)va->va_end,
> va->va_end - va->va_start);
> return 0;
> }
> free_unmap_vmap_area(va);
> flush_cache_vunmap
> free_unmap_vmap_area_noflush
> unmap_vmap_area
> free_vmap_area_noflush
> va->flags |= VM_LAZY_FREE
>
> The assumption is introduced by commit: d4033afd(mm, vmalloc: iterate vmap_area_list,
> instead of vmlist, in vmallocinfo()). This patch fix it by drop the assumption and
> keep not dump vm_map_ram allocation information as the logic before that commit.
>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> mm/vmalloc.c | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 5368b17..62b7932 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2586,13 +2586,6 @@ static int s_show(struct seq_file *m, void *p)
> if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEING))
> return 0;
>
> - if (!(va->flags & VM_VM_AREA)) {
> - seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
> - (void *)va->va_start, (void *)va->va_end,
> - va->va_end - va->va_start);
> - return 0;
> - }
> -
> v = va->vm;
>
> seq_printf(m, "0x%pK-0x%pK %7ld",

Hello, Wanpeng.

Did you test this patch?

I guess that, With this patch, if there are some vm_map areas,
null pointer deference would occurs, since va->vm may be null for it.

And with this patch, if this race really occur, null pointer deference
would occurs too, since va->vm is set to null in remove_vm_area().

I think that this is not a right fix for this possible race.

Thanks.

2013-09-03 07:42:10

by Zhang Yanfei

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] mm/vmalloc: don't assume vmap_area w/o VM_VM_AREA flag is vm_map_ram allocation

On 09/03/2013 03:01 PM, Wanpeng Li wrote:
> There is a race window between vmap_area free and show vmap_area information.
>
> A B
>
> remove_vm_area
> spin_lock(&vmap_area_lock);
> va->flags &= ~VM_VM_AREA;

Here we also do: va->vm = NULL; And see below....

> spin_unlock(&vmap_area_lock);
> spin_lock(&vmap_area_lock);
> if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEZING))
> return 0;
> if (!(va->flags & VM_VM_AREA)) {
> seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
> (void *)va->va_start, (void *)va->va_end,
> va->va_end - va->va_start);
> return 0;
> }
> free_unmap_vmap_area(va);
> flush_cache_vunmap
> free_unmap_vmap_area_noflush
> unmap_vmap_area
> free_vmap_area_noflush
> va->flags |= VM_LAZY_FREE
>
> The assumption is introduced by commit: d4033afd(mm, vmalloc: iterate vmap_area_list,
> instead of vmlist, in vmallocinfo()). This patch fix it by drop the assumption and
> keep not dump vm_map_ram allocation information as the logic before that commit.
>
> Signed-off-by: Wanpeng Li <[email protected]>
> ---
> mm/vmalloc.c | 7 -------
> 1 file changed, 7 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 5368b17..62b7932 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2586,13 +2586,6 @@ static int s_show(struct seq_file *m, void *p)
> if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEING))
> return 0;
>
> - if (!(va->flags & VM_VM_AREA)) {
> - seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
> - (void *)va->va_start, (void *)va->va_end,
> - va->va_end - va->va_start);
> - return 0;
> - }
> -
> v = va->vm;

If we remove the if test above, in the window you said above, va->vm is NULL,
but below we will still try to access the members of this vm_struct, which
will cause problems...

Correct me if I am wrong, please.

>
> seq_printf(m, "0x%pK-0x%pK %7ld",
>


--
Thanks.
Zhang Yanfei

2013-09-03 08:59:09

by Joonsoo Kim

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] mm/vmalloc: don't assume vmap_area w/o VM_VM_AREA flag is vm_map_ram allocation

On Tue, Sep 03, 2013 at 03:51:39PM +0800, Wanpeng Li wrote:
> On Tue, Sep 03, 2013 at 04:42:21PM +0900, Joonsoo Kim wrote:
> >On Tue, Sep 03, 2013 at 03:01:46PM +0800, Wanpeng Li wrote:
> >> There is a race window between vmap_area free and show vmap_area information.
> >>
> >> A B
> >>
> >> remove_vm_area
> >> spin_lock(&vmap_area_lock);
> >> va->flags &= ~VM_VM_AREA;
> >> spin_unlock(&vmap_area_lock);
> >> spin_lock(&vmap_area_lock);
> >> if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEZING))
> >> return 0;
> >> if (!(va->flags & VM_VM_AREA)) {
> >> seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
> >> (void *)va->va_start, (void *)va->va_end,
> >> va->va_end - va->va_start);
> >> return 0;
> >> }
> >> free_unmap_vmap_area(va);
> >> flush_cache_vunmap
> >> free_unmap_vmap_area_noflush
> >> unmap_vmap_area
> >> free_vmap_area_noflush
> >> va->flags |= VM_LAZY_FREE
> >>
> >> The assumption is introduced by commit: d4033afd(mm, vmalloc: iterate vmap_area_list,
> >> instead of vmlist, in vmallocinfo()). This patch fix it by drop the assumption and
> >> keep not dump vm_map_ram allocation information as the logic before that commit.
> >>
> >> Signed-off-by: Wanpeng Li <[email protected]>
> >> ---
> >> mm/vmalloc.c | 7 -------
> >> 1 file changed, 7 deletions(-)
> >>
> >> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >> index 5368b17..62b7932 100644
> >> --- a/mm/vmalloc.c
> >> +++ b/mm/vmalloc.c
> >> @@ -2586,13 +2586,6 @@ static int s_show(struct seq_file *m, void *p)
> >> if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEING))
> >> return 0;
> >>
> >> - if (!(va->flags & VM_VM_AREA)) {
> >> - seq_printf(m, "0x%pK-0x%pK %7ld vm_map_ram\n",
> >> - (void *)va->va_start, (void *)va->va_end,
> >> - va->va_end - va->va_start);
> >> - return 0;
> >> - }
> >> -
> >> v = va->vm;
> >>
> >> seq_printf(m, "0x%pK-0x%pK %7ld",
> >
> >Hello, Wanpeng.
> >
>
> Hi Joonsoo and Yanfei,
>
> >Did you test this patch?
> >
> >I guess that, With this patch, if there are some vm_map areas,
> >null pointer deference would occurs, since va->vm may be null for it.
> >
> >And with this patch, if this race really occur, null pointer deference
> >would occurs too, since va->vm is set to null in remove_vm_area().
> >
> >I think that this is not a right fix for this possible race.
> >
>
> How about append below to this patch?
>
> if (va->vm)
> v = va->vm;
> else
> return 0;

Hello,

I think that appending below code is better to represent it's purpose.
Maybe some comment is needed.

/* blablabla */
if (!(va->flags & VM_VM_AREA))
return 0;

And maybe we can remove below code snippet, since
either VM_LAZY_FREE or VM_LAZY_FREEING is not possible for !VM_VM_AREA case.

if (va->flags & (VM_LAZY_FREE | VM_LAZY_FREEING))
return 0;

Thanks.