From: Logan Gunthorpe
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, Song Liu
Cc: Shaohua Li, Guoqing Jiang, Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe
Subject: [PATCH v1 0/8] Improve Raid5 Lock Contention
Date: Thu, 7 Apr 2022 10:45:03 -0600
Message-Id: <20220407164511.8472-1-logang@deltatee.com>

Hi,

I've been doing some work trying to improve the bulk write performance of raid5 on large systems with fast NVMe drives. The bottleneck appears largely to be lock contention on the hash_lock and device_lock.
This series improves the situation slightly by addressing some low-hanging fruit: two ways to take the locks fewer times in the request path.

Patch 5 adjusts how batching works by keeping a reference to the previous stripe_head in raid5_make_request(). In most situations, this removes the need to take the hash_lock in stripe_add_to_batch_list(), which should reduce the number of times the lock is taken by a factor of about 2.

Patch 8 pivots the way raid5_make_request() works. Before the patch, the code must find the stripe_head for every 4KB page in the request, so each stripe_head must be found once for every data disk. The patch changes this so that all the data disks can be added to a stripe_head at once. The number of times the stripe_head must be found (and thus the number of times the hash_lock is taken) should be reduced by a factor roughly equal to the number of data disks.

The remaining patches are just cleanup and prep patches for those two patches.

In apples-to-apples testing of this series on a small VM with 5 ram disks, I saw a bandwidth increase of roughly 14%, and lock contentions on the hash_lock (as reported by lock_stat) were reduced by more than a factor of 5 (though the lock is still significantly contended).

Testing on larger systems with NVMe drives showed similar small bandwidth increases, from 3% to 20% depending on the parameters. Oddly, small arrays had larger gains, likely because they start from lower bandwidths; I would have expected larger gains with larger arrays (since even fewer locks should have been taken in raid5_make_request()).
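To make the "factor roughly equal to the number of data disks" estimate for patch 8 concrete, here is a rough userspace sketch. It is not code from the series: lookups_per_page() and lookups_pivoted() are hypothetical helpers, and the layout is simplified (parity rotation is ignored, chunks are laid out round-robin across the data disks). One "lookup" stands for one search for a stripe_head, and so for one acquisition of the hash_lock.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model only; these helpers are hypothetical and do not exist in
 * drivers/md/raid5.c. A stripe_head is assumed to hold one 4KB page
 * per data disk, all at the same offset within their chunks.
 */

/* Before the pivot: every 4KB page of the request does its own lookup. */
static size_t lookups_per_page(size_t nr_pages)
{
	return nr_pages;
}

/*
 * After the pivot: one lookup per distinct stripe_head touched by the
 * request [first_page, first_page + nr_pages).
 */
static size_t lookups_pivoted(size_t first_page, size_t nr_pages,
			      size_t chunk_pages, size_t data_disks)
{
	size_t lookups = 0;

	assert(chunk_pages > 0 && data_disks > 0);

	for (size_t page = first_page; page < first_page + nr_pages; page++) {
		/* Data disk that holds this logical page (no parity rotation). */
		size_t disk = (page / chunk_pages) % data_disks;

		/*
		 * Unless the page sits on data disk 0, the same stripe_head
		 * also holds page - chunk_pages on the previous data disk.
		 * Count a lookup only for the first page of the request that
		 * lands in a given stripe_head.
		 */
		if (disk == 0 || page - chunk_pages < first_page)
			lookups++;
	}

	return lookups;
}
```

In this model, with 4 data disks and 128 pages per chunk, a 1024-page request aligned to a full stripe costs 1024 lookups per page but 256 pivoted lookups, a factor of 4. A request that starts in the middle of a stripe gets a somewhat smaller reduction, which is why the estimate is only "roughly" the number of data disks.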
Logan

--

Logan Gunthorpe (8):
  md/raid5: Refactor raid5_make_request loop
  md/raid5: Move stripe_add_to_batch_list() call out of add_stripe_bio()
  md/raid5: Move common stripe count increment code into __find_stripe()
  md/raid5: Make common label for schedule/retry in raid5_make_request()
  md/raid5: Keep a reference to last stripe_head for batch
  md/raid5: Refactor add_stripe_bio()
  md/raid5: Check all disks in a stripe_head for reshape progress
  md/raid5: Pivot raid5_make_request()

 drivers/md/raid5.c | 442 +++++++++++++++++++++++++++------------------
 drivers/md/raid5.h |   1 +
 2 files changed, 270 insertions(+), 173 deletions(-)

base-commit: 3123109284176b1532874591f7c81f3837bbdc17
--
2.30.2