From: David Laight
To: linux-kernel@vger.kernel.org, peterz@infradead.org, longman@redhat.com
CC: mingo@redhat.com, will@kernel.org, boqun.feng@gmail.com, Linus Torvalds,
    virtualization@lists.linux-foundation.org, Zeng Heng
Subject: [PATCH next v2 0/5] locking/osq_lock: Optimisations to osq_lock code.
Date: Sun, 31 Dec 2023 21:49:53 +0000
Message-ID: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>

This is an updated series of optimisations to osq_lock.c
Patches #1 and #3 from v1 have been applied by Linus.
Some of the generated code issues I was getting were caused by
CONFIG_DEBUG_PREEMPT being set. No idea why, it isn't any more.

Patch #1 is the node->locked part of the old #2.

Patch #2 removes the pretty much guaranteed cache line reload getting
the cpu number (from node->prev) for the vcpu_is_preempted() check.
It is (basically) the old #5 with the addition of a READ_ONCE()
and leaving the '+ 1' offset (for patch 3).

Patch #3 ends up removing both node->cpu and node->prev.
This saves issues initialising node->cpu.
Basically node->cpu was only ever read as node->prev->cpu in the unqueue code.
Most of the time it is the value read from lock->tail that was used to
obtain 'prev' in the first place.
The only time it is different is in the unlock race path where 'prev'
is re-read from node->prev - updated right at the bottom of osq_lock().
So the updated node->prev_cpu can be used (and prev obtained from it) without
worrying about only one of node->prev and node->prev_cpu being updated.

Linus did suggest just saving the cpu numbers instead of pointers.
It actually works for 'prev' but not 'next'.

Patch #4 removes the 'should be unnecessary' node->next = NULL
assignment from the top of osq_lock().
Since longman was worried about race conditions, I've added a
WARN_ON_ONCE() check that ensures it is NULL.
This saves dirtying the 'node' cache line in the fast path, but the
check still requires the cache line be loaded.

Patch #5 just stops gcc using two separate instructions to decrement
the offset cpu number and then convert it to 64 bits.
Linus got annoyed with it, and I'd spotted it as well.
I don't seem to be able to get gcc to convert __per_cpu_offset[cpu - 1]
to (__per_cpu_offset - 1)[cpu] (cpu is offset by one) but, in any case,
it would still need zero extending in the common case.

David Laight (5):
  1) Defer clearing node->locked until the slow osq_lock() path.
  2) Optimise vcpu_is_preempted() check.
  3) Use node->prev_cpu instead of saving node->prev.
  4) Avoid writing to node->next in the osq_lock() fast path.
  5) Optimise decode_cpu() and per_cpu_ptr().

 kernel/locking/osq_lock.c | 59 +++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 27 deletions(-)

-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
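
For readers unfamiliar with the 'cpu is offset by one' encoding mentioned
for patches #2, #3 and #5, here is a minimal userspace sketch of the idea.
It is not the kernel code: NR_CPUS_DEMO, offsets_storage, offsets,
encode_cpu_demo() and both lookup helpers are hypothetical names made up
for illustration, and the table merely stands in for __per_cpu_offset.
It only shows that indexing with 'encoded - 1' and indexing a base biased
by one element give the same result.

```c
#include <assert.h>

#define NR_CPUS_DEMO 4	/* hypothetical cpu count, illustration only */

/*
 * Stand-in for a per-cpu offset table. One spare slot is kept in front
 * so that (offsets - 1) still points inside the array, which keeps the
 * pointer arithmetic below well defined in plain C.
 */
static unsigned long offsets_storage[NR_CPUS_DEMO + 1] = {
	0, 100, 200, 300, 400
};
static unsigned long *const offsets = offsets_storage + 1; /* indexed by cpu */

/* Encode a cpu number so that 0 can mean 'no cpu' (unlocked / no tail). */
static unsigned int encode_cpu_demo(unsigned int cpu)
{
	assert(cpu < NR_CPUS_DEMO);
	return cpu + 1;
}

/* Decrement as a 32-bit value, then widen it to index the table. */
static unsigned long lookup_dec_then_widen(unsigned int encoded)
{
	unsigned int cpu;

	assert(encoded != 0);	/* 0 is not a valid encoded cpu */
	cpu = encoded - 1;
	return offsets[cpu];
}

/* Fold the '- 1' into the base and index with the encoded value. */
static unsigned long lookup_biased_base(unsigned int encoded)
{
	assert(encoded != 0);
	return (offsets - 1)[encoded];
}
```

Both helpers return the same entry for every valid encoded value, e.g.
lookup_dec_then_widen(encode_cpu_demo(2)) and
lookup_biased_base(encode_cpu_demo(2)) both give 300 with the table above.
Either way the 32-bit index still has to be widened before it is used as
an array index on a 64-bit target.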