Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751446AbdGRL0I (ORCPT ); Tue, 18 Jul 2017 07:26:08 -0400 Received: from bombadil.infradead.org ([65.50.211.133]:42019 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751365AbdGRLZy (ORCPT ); Tue, 18 Jul 2017 07:25:54 -0400 Date: Tue, 18 Jul 2017 13:25:49 +0200 From: Peter Zijlstra To: Prateek Sood Cc: mingo@redhat.com, sramana@codeaurora.org, linux-kernel@vger.kernel.org, will.deacon@arm.com Subject: Re: [PATCH] osq_lock: fix osq_lock queue corruption Message-ID: <20170718112549.wujrvoxnsuzij4cw@hirez.programming.kicks-ass.net> References: <1500040076-27626-1-git-send-email-prsood@codeaurora.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1500040076-27626-1-git-send-email-prsood@codeaurora.org> User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2736 Lines: 123 I added a few pictures, just the text didn't want to make sense to me. On Fri, Jul 14, 2017 at 07:17:56PM +0530, Prateek Sood wrote: > Fix ordering of link creation between node->prev and prev->next in > osq_lock(). A case in which the status of optimistic spin queue is > CPU6->CPU2 in which CPU6 has acquired the lock. tail v ,-. <- ,-. |6| |2| `-' -> `-' > At this point if CPU0 comes in to acquire osq_lock, it will update the > tail count. CPU2 CPU0 ---------------------------------- tail v ,-. <- ,-. ,-. |6| |2| |0| `-' -> `-' `-' > After tail count update if CPU2 starts to unqueue itself from > optimistic spin queue, it will find updated tail count with CPU0 and > update CPU2 node->next to NULL in osq_wait_next(). unqueue-A tail v ,-. <- ,-. ,-. |6| |2| |0| `-' `-' `-' unqueue-B ->tail != curr && !node->next > If reordering of following stores happen then > prev->next where prev being CPU2 would be updated to point to CPU0 node: tail v ,-. <- ,-. ,-. |6| |2| |0| `-' -> `-' -> `-' osq_wait_next() node->next <- 0 xchg(node->next, NULL) tail v ,-. <- ,-. ,-. |6| |2| |0| `-' `-' `-' unqueue-C > At this point if next instruction > WRITE_ONCE(next->prev, prev); > in CPU2 path is committed before the update of CPU0 node->prev = prev then > CPU0 node->prev will point to CPU6 node. tail v----------. v ,-. <- ,-. ,-. |6| |2| |0| `-' `-' `-' `----------^ > At this point if CPU0 path's node->prev = prev is committed resulting > in change of CPU0 prev back to CPU2 node. CPU2 node->next is NULL > currently, tail v ,-. <- ,-. <- ,-. |6| |2| |0| `-' `-' `-' `----------^ > so if CPU0 gets into unqueue path of osq_lock it will keep spinning > in infinite loop as condition prev->next == node will never be true. Also updated the comment.. --- kernel/locking/osq_lock.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) --- a/kernel/locking/osq_lock.c +++ b/kernel/locking/osq_lock.c @@ -109,6 +109,19 @@ bool osq_lock(struct optimistic_spin_que prev = decode_cpu(old); node->prev = prev; + + /* + * osq_lock() unqueue + * + * node->prev = prev osq_wait_next() + * WMB MB + * prev->next = node next->prev = prev // unqueue-C + * + * Here 'node->prev' and 'next->prev' are the same variable and we need + * to ensure these stores happen in-order to avoid corrupting the list. + */ + smp_wmb(); + WRITE_ONCE(prev->next, node); /*