Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932068AbdGNNsP (ORCPT ); Fri, 14 Jul 2017 09:48:15 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:59972 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753634AbdGNNsN (ORCPT ); Fri, 14 Jul 2017 09:48:13 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 9375F60F67 Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=prsood@codeaurora.org From: Prateek Sood To: peterz@infradead.org, mingo@redhat.com Cc: sramana@codeaurora.org, linux-kernel@vger.kernel.org, Prateek Sood Subject: [PATCH] osq_lock: fix osq_lock queue corruption Date: Fri, 14 Jul 2017 19:17:56 +0530 Message-Id: <1500040076-27626-1-git-send-email-prsood@codeaurora.org> X-Mailer: git-send-email 1.9.1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2913 Lines: 64 Fix ordering of link creation between node->prev and prev->next in osq_lock(). A case in which the status of optimistic spin queue is CPU6->CPU2 in which CPU6 has acquired the lock. At this point if CPU0 comes in to acquire osq_lock, it will update the tail count. After tail count update if CPU2 starts to unqueue itself from optimistic spin queue, it will find updated tail count with CPU0 and update CPU2 node->next to NULL in osq_wait_next(). If reordering of following stores happen then prev->next where prev being CPU2 would be updated to point to CPU0 node: node->prev = prev; WRITE_ONCE(prev->next, node); At this point if next instruction WRITE_ONCE(next->prev, prev); in CPU2 path is committed before the update of CPU0 node->prev = prev then CPU0 node->prev will point to CPU6 node. At this point if CPU0 path's node->prev = prev is committed resulting in change of CPU0 prev back to CPU2 node. CPU2 node->next is NULL currently, so if CPU0 gets into unqueue path of osq_lock it will keep spinning in infinite loop as condition prev->next == node will never be true. Signed-off-by: Prateek Sood --- kernel/locking/osq_lock.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c index 05a3785..1d371a0 100644 --- a/kernel/locking/osq_lock.c +++ b/kernel/locking/osq_lock.c @@ -104,6 +104,32 @@ bool osq_lock(struct optimistic_spin_queue *lock) prev = decode_cpu(old); node->prev = prev; + + /* + * We need to avoid reordering of link updation sequence of osq. + * A case in which the status of optimistic spin queue is + * CPU6->CPU2 in which CPU6 has acquired the lock. At this point + * if CPU0 comes in to acquire osq_lock, it will update the tail + * count. After tail count update if CPU2 starts to unqueue itself + * from optimistic spin queue, it will find updated tail count with + * CPU0 and update CPU2 node->next to NULL in osq_wait_next(). If + * reordering of following stores happen then prev->next where prev + * being CPU2 would be updated to point to CPU0 node: + * node->prev = prev; + * WRITE_ONCE(prev->next, node); + * + * At this point if next instruction + * WRITE_ONCE(next->prev, prev); + * in CPU2 path is committed before the update of CPU0 node->prev = + * prev then CPU0 node->prev will point to CPU6 node. At this point + * if CPU0 path's node->prev = prev is committed resulting in change + * of CPU0 prev back to CPU2 node. CPU2 node->next is NULL, so if + * CPU0 gets into unqueue path of osq_lock it will keep spinning + * in infinite loop as condition prev->next == node will never be + * true. + */ + smp_wmb(); + WRITE_ONCE(prev->next, node); /* -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.