Subject: Re: [PATCH v2 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock
From: "liwei (GF)"
To: Alex Kogan
References: <20190329152006.110370-1-alex.kogan@oracle.com>
 <20190329152006.110370-4-alex.kogan@oracle.com>
Date: Tue, 11 Jun 2019 12:22:05 +0800
In-Reply-To: <20190329152006.110370-4-alex.kogan@oracle.com>

Hi Alex,

On 2019/3/29 23:20, Alex Kogan wrote:
> In CNA, spinning threads are organized in two queues, a main queue for
> threads running on the same node as the current lock holder, and a
> secondary queue for threads running on other nodes. At the unlock time,
> the lock holder scans the main queue looking for a thread running on
> the same node. If found (call it thread T), all threads in the main queue
> between the current lock holder and T are moved to the end of the
> secondary queue, and the lock is passed to T. If such T is not found, the
> lock is passed to the first node in the secondary queue. Finally, if the
> secondary queue is empty, the lock is passed to the next thread in the
> main queue. For more details, see https://arxiv.org/abs/1810.05600.
>
> Note that this variant of CNA may introduce starvation by continuously
> passing the lock to threads running on the same node. This issue
> will be addressed later in the series.
>
> Enabling CNA is controlled via a new configuration option
> (NUMA_AWARE_SPINLOCKS), which is enabled by default if NUMA is enabled.
>
> Signed-off-by: Alex Kogan
> Reviewed-by: Steve Sistare
> ---
>  arch/x86/Kconfig                      |  14 +++
>  include/asm-generic/qspinlock_types.h |  13 +++
>  kernel/locking/mcs_spinlock.h         |  10 ++
>  kernel/locking/qspinlock.c            |  29 +++++-
>  kernel/locking/qspinlock_cna.h        | 173 ++++++++++++++++++++++++++++++++++
>  5 files changed, 236 insertions(+), 3 deletions(-)
>  create mode 100644 kernel/locking/qspinlock_cna.h
>
> (SNIP)
> +
> +static __always_inline int get_node_index(struct mcs_spinlock *node)
> +{
> +	return decode_count(node->node_and_count++);

When the nesting level is greater than 4, this won't return an index >= 4 here;
instead, the numa node number is changed by mistake. Execution then takes the
wrong path instead of falling into the following branch:

	/*
	 * 4 nodes are allocated based on the assumption that there will
	 * not be nested NMIs taking spinlocks. That may not be true in
	 * some architectures even though the chance of needing more than
	 * 4 nodes will still be extremely unlikely. When that happens,
	 * we fall back to spinning on the lock directly without using
	 * any MCS node. This is not the most elegant solution, but is
	 * simple enough.
	 */
	if (unlikely(idx >= MAX_NODES)) {
		while (!queued_spin_trylock(lock))
			cpu_relax();
		goto release;
	}

> +}
> +
> +static __always_inline void release_mcs_node(struct mcs_spinlock *node)
> +{
> +	__this_cpu_dec(node->node_and_count);
> +}
> +
> +static __always_inline void cna_init_node(struct mcs_spinlock *node, int cpuid,
> +					   u32 tail)
> +{

Thanks,
Wei
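To make the wrap-around above concrete, here is a small stand-alone user-space
sketch. It is illustrative only: the 2-bit count field, the bit layout of
node_and_count, and the decode helpers are assumptions suggested by the helper
names in the quoted hunk, not the patch's actual definitions.

#include <stdio.h>

/*
 * Illustrative sketch only -- not the patch's code. It assumes the nesting
 * count occupies the low 2 bits of node_and_count and the NUMA node sits in
 * the bits above it. MAX_NODES mirrors the 4 per-CPU MCS nodes in
 * qspinlock.c.
 */
#define IDX_BITS	2
#define IDX_MASK	((1U << IDX_BITS) - 1)
#define MAX_NODES	4

static unsigned int decode_count(unsigned int val)
{
	return val & IDX_MASK;
}

static unsigned int decode_numa_node(unsigned int val)
{
	return val >> IDX_BITS;
}

int main(void)
{
	/* start on NUMA node 1 with no locks held */
	unsigned int node_and_count = 1U << IDX_BITS;
	unsigned int level, idx;

	for (level = 1; level <= 6; level++) {
		/* same pattern as get_node_index() in the quoted hunk */
		idx = decode_count(node_and_count++);
		printf("nesting level %u: idx = %u, numa node now %u%s\n",
		       level, idx, decode_numa_node(node_and_count),
		       idx >= MAX_NODES ? " (fallback taken)" : "");
	}

	/*
	 * The loop prints idx = 0,1,2,3,0,1 -- idx never reaches MAX_NODES,
	 * so the trylock fallback quoted above is never taken, and the carry
	 * out of the 2-bit count silently bumps the stored NUMA node from 1
	 * to 2.
	 */
	return 0;
}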
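Separately, for anyone reading along, here is a rough sketch of the unlock-time
hand-off policy described in the commit message. The names (waiter,
pick_successor, and the explicit secondary-queue head/tail pointers) are
hypothetical and purely illustrative; they are not the helpers in
qspinlock_cna.h.

#include <stddef.h>

struct waiter {
	struct waiter *next;
	int numa_node;
};

/*
 * Hypothetical illustration of the policy from the commit message: scan the
 * main queue for the first waiter on the lock holder's node, splice every
 * waiter skipped over onto the tail of the secondary queue, and return the
 * waiter that should receive the lock.
 */
static struct waiter *pick_successor(struct waiter *main_head,
				     struct waiter **secondary_head,
				     struct waiter **secondary_tail,
				     int holder_node)
{
	struct waiter *w = main_head;
	struct waiter *skipped_head = NULL, *skipped_tail = NULL;

	while (w && w->numa_node != holder_node) {
		/* remember the waiters we are skipping over */
		if (!skipped_head)
			skipped_head = w;
		skipped_tail = w;
		w = w->next;
	}

	if (w) {
		if (skipped_head) {
			/* move the skipped waiters to the end of the secondary queue */
			skipped_tail->next = NULL;
			if (*secondary_tail)
				(*secondary_tail)->next = skipped_head;
			else
				*secondary_head = skipped_head;
			*secondary_tail = skipped_tail;
		}
		return w;		/* pass the lock to the same-node waiter */
	}

	if (*secondary_head)
		return *secondary_head;	/* no same-node waiter: head of secondary queue */

	return main_head;		/* secondary queue empty: next waiter in order */
}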