2007-10-24 20:40:12

by Doug Reiland

[permalink] [raw]
Subject: 2.6.xxx race condition in x86_64's global_flush_tlb???

I have seen some hangs in 2.6-x86_64 in flush_kernel_map(). The tests
cause a lot of ioremap()/iounmap() calls to occur concurrently across
many processor threads.

Looking at the hung processors, they are looping in
flush_kernel_map(), and the list they get from smp_call_function()
appears to be corrupt. In fact, I see deferred_pages itself as an
entry, and that isn't supposed to happen.

I am questioning the locking in global_flush_tlb(), listed below. The
down_read()/up_read() protection doesn't seem safe. If several threads
are rushing through here, deferred_pages could be getting changed
while they look at it. I don't think there is any protection when
list_replace_init() calls INIT_LIST_HEAD().

I changed the down_read()/up_read() around list_replace_init() to
down_write()/up_write() and my test runs fine.


void global_flush_tlb(void)
{
	struct page *pg, *next;
	struct list_head l;

	down_read(&init_mm.mmap_sem); // XXX should be down_write()???
	list_replace_init(&deferred_pages, &l);
	up_read(&init_mm.mmap_sem); // XXX should be up_write()????
	flush_map(&l);

	list_for_each_entry_safe(pg, next, &l, lru) {
		ClearPagePrivate(pg);
		__free_page(pg);
	}
}


2007-10-24 21:14:36

by Randy Dunlap

[permalink] [raw]
Subject: Re: 2.6.xxx race condition in x86_64's global_flush_tlb???

On Wed, 24 Oct 2007 16:39:57 -0400 Doug Reiland wrote:

> I have seen some hangs in 2.6-x86_64 in flush_kernel_map(). The tests
> cause a lot of ioremap()/iounmap() calls to occur concurrently across
> many processor threads.
>
> Looking at the hung processors, they are looping in
> flush_kernel_map(), and the list they get from smp_call_function()
> appears to be corrupt. In fact, I see deferred_pages itself as an
> entry, and that isn't supposed to happen.
>
> I am questioning the locking in global_flush_tlb(), listed below. The
> down_read()/up_read() protection doesn't seem safe. If several threads
> are rushing through here, deferred_pages could be getting changed
> while they look at it. I don't think there is any protection when
> list_replace_init() calls INIT_LIST_HEAD().
>
> I changed the down_read()/up_read() around list_replace_init() to
> down_write()/up_write() and my test runs fine.
>
>
> void global_flush_tlb(void)
> {
> 	struct page *pg, *next;
> 	struct list_head l;
>
> 	down_read(&init_mm.mmap_sem); // XXX should be down_write()???
> 	list_replace_init(&deferred_pages, &l);
> 	up_read(&init_mm.mmap_sem); // XXX should be up_write()????
> 	flush_map(&l);
>
> 	list_for_each_entry_safe(pg, next, &l, lru) {
> 		ClearPagePrivate(pg);
> 		__free_page(pg);
> 	}
> }

Seems to be already fixed in current git tree.

---
~Randy

2007-10-24 22:07:27

by Doug Reiland

[permalink] [raw]
Subject: Re: 2.6.xxx race condition in x86_64's global_flush_tlb???

You're right. I thought I was up to date, but I was at 23-rc9.
Sorry!

On 10/24/07, Randy Dunlap <[email protected]> wrote:
> On Wed, 24 Oct 2007 16:39:57 -0400 Doug Reiland wrote:
>
> > I have seen some hangs in 2.6-x86_64 in flush_kernel_map(). The tests
> > cause a lot of ioremap()/iounmap() calls to occur concurrently across
> > many processor threads.
> >
> > Looking at the hung processors, they are looping in
> > flush_kernel_map(), and the list they get from smp_call_function()
> > appears to be corrupt. In fact, I see deferred_pages itself as an
> > entry, and that isn't supposed to happen.
> >
> > I am questioning the locking in global_flush_tlb(), listed below. The
> > down_read()/up_read() protection doesn't seem safe. If several threads
> > are rushing through here, deferred_pages could be getting changed
> > while they look at it. I don't think there is any protection when
> > list_replace_init() calls INIT_LIST_HEAD().
> >
> > I changed the down_read()/up_read() around list_replace_init() to
> > down_write()/up_write() and my test runs fine.
> >
> >
> > void global_flush_tlb(void)
> > {
> > 	struct page *pg, *next;
> > 	struct list_head l;
> >
> > 	down_read(&init_mm.mmap_sem); // XXX should be down_write()???
> > 	list_replace_init(&deferred_pages, &l);
> > 	up_read(&init_mm.mmap_sem); // XXX should be up_write()????
> > 	flush_map(&l);
> >
> > 	list_for_each_entry_safe(pg, next, &l, lru) {
> > 		ClearPagePrivate(pg);
> > 		__free_page(pg);
> > 	}
> > }
>
> Seems to be already fixed in current git tree.
>
> ---
> ~Randy
>