Message-ID: <340ff8ef-9ff5-7175-c234-4132bbdfc5f7@cybernetics.com>
Date: Tue, 7 Jun 2022 14:38:34 -0400
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: iommu@lists.linux-foundation.org, kernel-team@fb.com,
    Matthew Wilcox, Keith Busch, Andy Shevchenko, Robin Murphy,
    Tony Lindgren
From: Tony Battersby
Subject: [PATCH v6 00/11] mpt3sas and dmapool scalability

This patch series improves dmapool scalability by replacing linear scans
with red-black trees.

Note that Keith Busch is also working on improving dmapool scalability,
so for now I would recommend not merging my scalability patches until
Keith's approach can be evaluated.  In the meantime, my patches can
serve as a benchmark comparison.
I also have a number of cleanup patches in my series that could be
useful on their own.

Changes since v5:
1. inline pool_free_page() into dma_pool_destroy() to avoid adding
   unused code
2. convert scnprintf() to sysfs_emit()
3. avoid adding a hole in struct dma_pool
4. fix big O usage in description

References:

v5
https://lore.kernel.org/linux-mm/9b08ab7c-b80b-527d-9adf-7716b0868fbc@cybernetics.com/

Keith Busch's dmapool performance enhancements
https://lore.kernel.org/linux-mm/20220428202714.17630-1-kbusch@kernel.org/

Below is my original description of the motivation for these patches.

drivers/scsi/mpt3sas is running into a scalability problem with the
kernel's DMA pool implementation.  With an LSI/Broadcom SAS 9300-8i
12Gb/s HBA and max_sgl_entries=256, during modprobe, mpt3sas does the
equivalent of:

    chain_dma_pool = dma_pool_create(size = 128);
    for (i = 0; i < 373959; i++) {
        dma_addr[i] = dma_pool_alloc(chain_dma_pool);
    }

And at rmmod, system shutdown, or system reboot, mpt3sas does the
equivalent of:

    for (i = 0; i < 373959; i++) {
        dma_pool_free(chain_dma_pool, dma_addr[i]);
    }
    dma_pool_destroy(chain_dma_pool);

With this usage, both dma_pool_alloc() and dma_pool_free() exhibit O(n)
complexity, although dma_pool_free() is much worse due to implementation
details.  On my system, the dma_pool_free() loop above takes about
9 seconds to run.  Note that the problem was even worse before commit
74522a92bbf0 ("scsi: mpt3sas: Optimize I/O memory consumption in
driver."), where the dma_pool_free() loop could take ~30 seconds.
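To illustrate why a per-call linear scan hurts at this scale, here is a minimal user-space sketch, not the kernel code. It contrasts finding the page that owns a DMA address by walking every page with finding it by binary search over pages sorted by base address. The sorted array is a simplified stand-in for the red-black tree used in the series, and `struct sketch_page`, `SKETCH_PAGE_SIZE`, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for the sketch; real pools may use other sizes. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for one pool page, identified by its DMA base. */
struct sketch_page {
    uint64_t dma_base;
};

/* Linear scan: test pages in order until one contains addr. O(n). */
static long find_page_linear(const struct sketch_page *pages, size_t n,
                             uint64_t addr, size_t *steps)
{
    assert(steps != NULL);
    for (size_t i = 0; i < n; i++) {
        (*steps)++;
        if (addr >= pages[i].dma_base &&
            addr - pages[i].dma_base < SKETCH_PAGE_SIZE)
            return (long)i;
    }
    return -1; /* addr does not belong to any page */
}

/* Binary search over pages sorted by dma_base. O(log n). */
static long find_page_sorted(const struct sketch_page *pages, size_t n,
                             uint64_t addr, size_t *steps)
{
    size_t lo = 0, hi = n;

    assert(steps != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        (*steps)++;
        if (addr < pages[mid].dma_base)
            hi = mid;               /* addr lies below this page */
        else if (addr - pages[mid].dma_base >= SKETCH_PAGE_SIZE)
            lo = mid + 1;           /* addr lies above this page */
        else
            return (long)mid;       /* addr falls inside this page */
    }
    return -1; /* addr does not belong to any page */
}
```

With 12064 contiguous pages (the page count the chain pool reports in the pools listing in this message), looking up an address in the last page costs 12064 comparisons with the linear scan and at most 14 with the sorted lookup. Repeating that for every one of the 373959 frees is where the difference adds up.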
mpt3sas also has some other DMA pools, but chain_dma_pool is the only
one with so many allocations:

cat /sys/devices/pci0000:80/0000:80:07.0/0000:85:00.0/pools
(manually cleaned up column alignment)
poolinfo - 0.1
reply_post_free_array pool  1      21     192     1
reply_free pool             1      1      41728   1
reply pool                  1      1      1335296 1
sense pool                  1      1      970272  1
chain pool                  373959 386048 128     12064
reply_post_free pool        12     12     166528  12

The patches in this series improve the scalability of the DMA pool
implementation, which significantly reduces the running time of the
DMA alloc/free loops.  With the patches applied, "modprobe mpt3sas",
"rmmod mpt3sas", and system shutdown/reboot with mpt3sas loaded are
significantly faster.  Here are some benchmarks (of DMA alloc/free only,
not the entire modprobe/rmmod):

dma_pool_create() + dma_pool_alloc() loop, size = 128, count = 373959
  original:        350 ms ( 1x)
  dmapool patches:  18 ms (19x)

dma_pool_free() loop + dma_pool_destroy(), size = 128, count = 373959
  original:        8901 ms (  1x)
  dmapool patches:   19 ms (477x)
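As a reading aid for the chain pool row in the pools listing above, the following sketch derives a few quantities from it. It assumes the four numeric columns are blocks in use, total blocks, block size in bytes, and number of pages. That column order is an interpretation rather than something stated in this message, and `struct pool_row` and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the pools listing; field names are illustrative only. */
struct pool_row {
    size_t blocks_in_use;
    size_t blocks_total;
    size_t block_size;   /* bytes */
    size_t pages;
};

/* Number of blocks carved out of each page. */
static size_t blocks_per_page(const struct pool_row *r)
{
    assert(r->pages != 0);
    return r->blocks_total / r->pages;
}

/* Bytes of each page that are divided into blocks. */
static size_t bytes_per_page(const struct pool_row *r)
{
    return blocks_per_page(r) * r->block_size;
}

/* Blocks that exist in the pool but are not currently allocated. */
static size_t free_blocks(const struct pool_row *r)
{
    return r->blocks_total - r->blocks_in_use;
}
```

For the chain pool row (373959, 386048, 128, 12064) this gives 32 blocks per page, 4096 bytes per page, and 12089 unused blocks. Under the same assumption, any lookup that walks pages one by one has up to 12064 pages to visit.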