Subject: Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Roman Peniaev
To: Chris Wilson
Cc: intel-gfx@lists.freedesktop.org, Joonas Lahtinen, Tvrtko Ursulin,
    Daniel Vetter, Andrew Morton, David Rientjes, Joonsoo Kim,
    Mel Gorman, Toshi Kani, Shawn Lin, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Date: Thu, 14 Apr 2016 15:13:26 +0200
In-Reply-To: <1460444239-22475-1-git-send-email-chris@chris-wilson.co.uk>

Hi, Chris.

Is it on purpose that __purge_vmap_area_lazy() no longer drops the
VM_LAZY_FREE flag?  With your patch va->flags ends up with two bits
set: VM_LAZY_FREE | VM_LAZY_FREEING.  That seems harmless, since no
other code path cares about the combination, but the intent is not
clear from the change.

Also, did you consider avoiding the static purge_lock in
__purge_vmap_area_lazy() altogether?  With your change it looks like
the lock is no longer needed at all; just be careful when you observe
the llist as empty, i.e. nr == 0.  (A rough sketch of both points
follows the quoted changelog below.)

And one comment is below:

On Tue, Apr 12, 2016 at 8:57 AM, Chris Wilson wrote:
> When mixing lots of vmallocs and set_memory_*() (which calls
> vm_unmap_aliases()) I encountered situations where the performance
> degraded severely due to the walking of the entire vmap_area list each
> invocation. One simple improvement is to add the lazily freed vmap_area
> to a separate lockless free list, such that we then avoid having to walk
> the full list on each purge.
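To be concrete about both points, here is a rough, untested sketch of
what I mean.  It reuses the names from your patch and is purely
illustrative, not something I have run:

static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
				   int sync, int force_flush)
{
	struct llist_node *valist;
	struct vmap_area *va;
	struct vmap_area *n_va;
	int nr = 0;

	if (sync)
		purge_fragmented_blocks_allcpus();

	/*
	 * Atomically steal the whole pending list; no lock is needed,
	 * two concurrent purgers can never see the same node.
	 */
	valist = llist_del_all(&vmap_purge_list);
	llist_for_each_entry(va, valist, purge_list) {
		if (va->va_start < *start)
			*start = va->va_start;
		if (va->va_end > *end)
			*end = va->va_end;
		nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
		va->flags &= ~VM_LAZY_FREE;	/* drop the stale bit */
		va->flags |= VM_LAZY_FREEING;
	}

	if (nr)
		atomic_sub(nr, &vmap_lazy_nr);

	/*
	 * Careful: an empty llist (nr == 0) does not mean nothing is
	 * pending -- a concurrent purger may still be freeing the
	 * entries it stole, so force_flush must still flush.
	 */
	if (nr || force_flush)
		flush_tlb_kernel_range(*start, *end);

	if (nr) {
		spin_lock(&vmap_area_lock);
		llist_for_each_entry_safe(va, n_va, valist, purge_list)
			__free_vmap_area(va);
		spin_unlock(&vmap_area_lock);
	}
}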
>
> Signed-off-by: Chris Wilson
> Cc: Joonas Lahtinen
> Cc: Tvrtko Ursulin
> Cc: Daniel Vetter
> Cc: Andrew Morton
> Cc: David Rientjes
> Cc: Joonsoo Kim
> Cc: Roman Pen
> Cc: Mel Gorman
> Cc: Toshi Kani
> Cc: Shawn Lin
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  include/linux/vmalloc.h |  3 ++-
>  mm/vmalloc.c            | 29 ++++++++++++++---------------
>  2 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 8b51df3ab334..3d9d786a943c 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -4,6 +4,7 @@
>  #include <linux/spinlock.h>
>  #include <linux/init.h>
>  #include <linux/list.h>
> +#include <linux/llist.h>
>  #include <asm/page.h>		/* pgprot_t */
>  #include <linux/rbtree.h>
>
> @@ -45,7 +46,7 @@ struct vmap_area {
>  	unsigned long flags;
>  	struct rb_node rb_node;		/* address sorted rbtree */
>  	struct list_head list;		/* address sorted list */
> -	struct list_head purge_list;	/* "lazy purge" list */
> +	struct llist_node purge_list;	/* "lazy purge" list */
>  	struct vm_struct *vm;
>  	struct rcu_head rcu_head;
>  };
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 293889d7f482..5388bf64dc32 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -21,6 +21,7 @@
>  #include <linux/debugobjects.h>
>  #include <linux/kallsyms.h>
>  #include <linux/list.h>
> +#include <linux/llist.h>
>  #include <linux/notifier.h>
>  #include <linux/rbtree.h>
>  #include <linux/radix-tree.h>
> @@ -282,6 +283,7 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
>  static DEFINE_SPINLOCK(vmap_area_lock);
>  /* Export for kexec only */
>  LIST_HEAD(vmap_area_list);
> +static LLIST_HEAD(vmap_purge_list);
>  static struct rb_root vmap_area_root = RB_ROOT;
>
>  /* The vmap cache globals are protected by vmap_area_lock */
> @@ -628,7 +630,7 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
>  					int sync, int force_flush)
>  {
>  	static DEFINE_SPINLOCK(purge_lock);
> -	LIST_HEAD(valist);
> +	struct llist_node *valist;
>  	struct vmap_area *va;
>  	struct vmap_area *n_va;
>  	int nr = 0;
> @@ -647,20 +649,15 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
>  	if (sync)
>  		purge_fragmented_blocks_allcpus();
>
> -	rcu_read_lock();
> -	list_for_each_entry_rcu(va, &vmap_area_list, list) {
> -		if (va->flags & VM_LAZY_FREE) {
> -			if (va->va_start < *start)
> -				*start = va->va_start;
> -			if (va->va_end > *end)
> -				*end = va->va_end;
> -			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
> -			list_add_tail(&va->purge_list, &valist);
> -			va->flags |= VM_LAZY_FREEING;
> -			va->flags &= ~VM_LAZY_FREE;
> -		}
> +	valist = llist_del_all(&vmap_purge_list);
> +	llist_for_each_entry(va, valist, purge_list) {
> +		if (va->va_start < *start)
> +			*start = va->va_start;
> +		if (va->va_end > *end)
> +			*end = va->va_end;
> +		nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
> +		va->flags |= VM_LAZY_FREEING;
>  	}
> -	rcu_read_unlock();
>
>  	if (nr)
>  		atomic_sub(nr, &vmap_lazy_nr);
> @@ -670,7 +667,7 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
>
>  	if (nr) {
>  		spin_lock(&vmap_area_lock);
> -		list_for_each_entry_safe(va, n_va, &valist, purge_list)
> +		llist_for_each_entry_safe(va, n_va, valist, purge_list)
>  			__free_vmap_area(va);
>  		spin_unlock(&vmap_area_lock);
>  	}
> @@ -706,6 +703,8 @@ static void purge_vmap_area_lazy(void)
>  static void free_vmap_area_noflush(struct vmap_area *va)
>  {
>  	va->flags |= VM_LAZY_FREE;
> +	llist_add(&va->purge_list, &vmap_purge_list);
> +
>  	atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);

It seems to me that this is a very long-standing problem: once you mark
va->flags with VM_LAZY_FREE (and, with this patch, once you add va to
vmap_purge_list), va can be freed immediately from another CPU.
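To make the window concrete, here is an illustrative interleaving --
this is not code from the patch, and the CPU labels are hypothetical:

	/* CPU 0: free_vmap_area_noflush() */
	va->flags |= VM_LAZY_FREE;
	llist_add(&va->purge_list, &vmap_purge_list);	/* va is published */

		/* CPU 1: __purge_vmap_area_lazy() can run right here:	*/
		/*   valist = llist_del_all(&vmap_purge_list);		*/
		/*   ...						*/
		/*   __free_vmap_area(va);	<-- va is freed		*/

	/* CPU 0 continues, dereferencing the freed va: */
	atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);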
If the purge wins that race, the line:

	atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);

is a use-after-free access.

So I would also fix it with careful line reordering and a barrier
(the barrier is probably redundant here, because llist_add() implies a
cmpxchg, but I want to be explicit and show that marking va as
VM_LAZY_FREE and adding it to the list should come last):

-	va->flags |= VM_LAZY_FREE;
-	llist_add(&va->purge_list, &vmap_purge_list);
-
 	atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
+	smp_mb__after_atomic();
+	va->flags |= VM_LAZY_FREE;
+	llist_add(&va->purge_list, &vmap_purge_list);

What do you think?

--
Roman