Date: Mon, 17 Nov 2014 23:43:14 +0100 (CET)
From: Thomas Gleixner
To: Linus Torvalds
cc: Jens Axboe, Ingo Molnar, Dave Jones, Linux Kernel, the arch/x86 maintainers
Subject: Re: frequent lockups in 3.18rc4
References: <20141114213124.GB3344@redhat.com>

On Mon, 17 Nov 2014, Thomas Gleixner wrote:
> On Mon, 17 Nov 2014, Linus Torvalds wrote:
> >  		llist_for_each_entry_safe(csd, csd_next, entry, llist) {
> > -			csd->func(csd->info);
> > +			smp_call_func_t func = csd->func;
> > +			void *info = csd->info;
> >  			csd_unlock(csd);
> > +
> > +			func(info);
>
> No, that won't work for synchronous calls:
>
>     CPU 0                      CPU 1
>
>     csd_lock(csd);
>     queue_csd();
>     ipi();
>                                func = csd->func;
>                                info = csd->info;
>                                csd_unlock(csd);
>     csd_lock_wait();
>                                func(info);
>
> The csd_lock_wait() side will succeed and therefor assume that the
> call has been completed while the function has not been called at
> all. Interesting explosions to follow.
>
> The proper solution is to revert that commit and properly analyze the
> problem which Jens was trying to solve and work from there.

So a combo of both (Jens and yours) might do the trick. Patch below.

I think what Jens was trying to solve is:

     CPU 0                      CPU 1

     csd_lock(csd);
     queue_csd();
     ipi();
                                csd->func(csd->info);
     wait_for_completion(csd);
                                  complete(csd);
     reuse_csd(csd);
                                csd_unlock(csd);

Thanks,

	tglx

Index: linux/kernel/smp.c
===================================================================
--- linux.orig/kernel/smp.c
+++ linux/kernel/smp.c
@@ -126,7 +126,7 @@ static void csd_lock(struct call_single_
 
 static void csd_unlock(struct call_single_data *csd)
 {
-	WARN_ON((csd->flags & CSD_FLAG_WAIT) && !(csd->flags & CSD_FLAG_LOCK));
+	WARN_ON(!(csd->flags & CSD_FLAG_LOCK));
 
 	/*
 	 * ensure we're all done before releasing data:
@@ -250,8 +250,23 @@ static void flush_smp_call_function_queu
 	}
 
 	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
-		csd->func(csd->info);
-		csd_unlock(csd);
+
+		/*
+		 * For synchronous calls we are not allowed to unlock
+		 * before the callback returned. For the async case
+		 * its the responsibility of the caller to keep
+		 * csd->info consistent while the callback runs.
+		 */
+		if (csd->flags & CSD_FLAG_WAIT) {
+			csd->func(csd->info);
+			csd_unlock(csd);
+		} else {
+			smp_call_func_t func = csd->func;
+			void *info = csd->info;
+
+			csd_unlock(csd);
+			func(info);
+		}
 	}
 
 	/*
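
For anyone who wants to check the ordering outside the kernel, here is a
minimal single-threaded userspace sketch of the sync/async split in the
flush loop. It is a model under stated assumptions, not kernel code: the
flag values, the reduced struct layout, the event log, flush_one(),
demo_callback() and run_entry() are all made up for illustration, and
there is no real IPI or second CPU involved. It only records whether the
callback ('C') or the unlock ('U') happens first.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed flag values; only the names follow kernel/smp.c. */
#define CSD_FLAG_LOCK 0x01
#define CSD_FLAG_WAIT 0x02

typedef void (*smp_call_func_t)(void *info);

/* Reduced stand-in for the kernel structure (no llist node). */
struct call_single_data {
	smp_call_func_t func;
	void *info;
	unsigned int flags;
};

/* Hypothetical event log: 'C' = callback ran, 'U' = csd unlocked. */
static char event_log[16];
static size_t event_count;

static void log_event(char ev)
{
	assert(event_count < sizeof(event_log) - 1);
	event_log[event_count++] = ev;
	event_log[event_count] = '\0';
}

static void csd_unlock(struct call_single_data *csd)
{
	/* Models the unconditional WARN_ON from the patch. */
	assert(csd->flags & CSD_FLAG_LOCK);
	csd->flags &= ~CSD_FLAG_LOCK;
	log_event('U');
}

/* Body of the flush loop for a single entry. */
static void flush_one(struct call_single_data *csd)
{
	if (csd->flags & CSD_FLAG_WAIT) {
		/* Sync: a waiter must not see the unlock before func ran. */
		csd->func(csd->info);
		csd_unlock(csd);
	} else {
		/* Async: copy out first, the csd may be reused after unlock. */
		smp_call_func_t func = csd->func;
		void *info = csd->info;

		csd_unlock(csd);
		func(info);
	}
}

/* Example callback: bumps the counter passed via info and logs 'C'. */
static void demo_callback(void *info)
{
	(*(int *)info)++;
	log_event('C');
}

/* Flush one locked csd with the given extra flags; return the log. */
static const char *run_entry(unsigned int extra_flags, int *counter)
{
	struct call_single_data csd = {
		.func = demo_callback,
		.info = counter,
		.flags = CSD_FLAG_LOCK | extra_flags,
	};

	event_count = 0;
	event_log[0] = '\0';
	flush_one(&csd);
	return event_log;
}
```

With run_entry(CSD_FLAG_WAIT, &n) the log shows the callback before the
unlock, which is what the csd_lock_wait() side relies on. With
run_entry(0, &n) the unlock comes first, so the caller owns the lifetime
of whatever info points to while the callback runs.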