Received: by 2002:a05:6358:700f:b0:131:369:b2a3 with SMTP id 15csp136874rwo; Tue, 1 Aug 2023 14:21:17 -0700 (PDT) X-Google-Smtp-Source: APBJJlFObtv9vMFB4ysvFZOCCfnuIvn+UKQRD77hBqv1oW9V5gh/lQcZ3EFFqYuYfU38Suybt3p7 X-Received: by 2002:aa7:c458:0:b0:51e:2664:e6e7 with SMTP id n24-20020aa7c458000000b0051e2664e6e7mr3357108edr.38.1690924877061; Tue, 01 Aug 2023 14:21:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690924877; cv=none; d=google.com; s=arc-20160816; b=gDNTJ6vrE+kw2oHzdDHMDdbFqWk4NMk7CTZxTTs/O5JnaP4RK2ndGn9zyyFfi7SN1X ZyU3qyvgmeSVaaEE3UkBFwT+1pXJKfTJ2hnNIbXoaXXa5wTbnmNNov12A7EL09cquOcy aXXX53JwyKe+yq0FGnqlJAoL1xnFcqCE6nFayNY1+In41fPzsOsN00HMogtYaAHtA0Zv D+0mAcGRQiH6Bh8IlhhlONumxXeGwlobelzjVbxLLEB7MKJNS53AV1Ob6AWk64lHwNrV 7jfgVXfKzLKuJqaa/NmJgB36KbJae+Xbz3+RpJ/adlzHol//DD9+NGyJUs1uwPS+3Cox nUIw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=EHYSwcovOuMESUTA5bYv/hl0ljd1OC0AxmWQtxJBqmI=; fh=WGC78jz/1H/VpK+kBJAytB4GHP0idszWhzYJ4OEqlyc=; b=peZuX9s7A3/WCttCsKOtiU5fM6kDLXjPCp156QRkOBd/A1Ig/0ybzvSFhlili17Fea pQ+AlvxGFT53JoUcRFIGhlq0nM9XXW6FYgn/AKIPfuKQ8RY7ozuui9ETRo2+wyDNoexL SoNpUamZ1oGd6ZPq3SipsCqu/99W5Sl0Hb5iWQKOGCp7v+AG+koMgO5ZSi7ZuZk5A0ws TWgonwzXKwnWTToG3GTcjBzePwpD9Thzq6a5EUiC65g/cKiqH3u6qzHm17LZMfUwsHc5 c5F3tSZrgRKsKjoTUjCJGECdTl5gzIGD42T4wTgBYfLdcAQ5INz+KSMhXPf2sAm1CPsw oJQA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="K/nSGvQs"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t15-20020aa7d70f000000b005220ee1fe53si4037203edq.510.2023.08.01.14.20.50; Tue, 01 Aug 2023 14:21:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="K/nSGvQs"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232607AbjHAUk0 (ORCPT + 99 others); Tue, 1 Aug 2023 16:40:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52954 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232592AbjHAUkZ (ORCPT ); Tue, 1 Aug 2023 16:40:25 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 699FA198A for ; Tue, 1 Aug 2023 13:39:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1690922383; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EHYSwcovOuMESUTA5bYv/hl0ljd1OC0AxmWQtxJBqmI=; b=K/nSGvQsNjiJ3vuyhstxTWBIiiIWY1c8j7FALFrb6/jY3Jwt0cajWmAnWQXUVcrKzYtuXA 7lQse9z8adcfIe89CM1KDAx5TnikkbK02uHPr4G5bj0W3RJDvOQIYc2FkGgRqS9udfb5hf EsBVhIcHcLVAk3yF0r9/SXMpWe/Md2o= Received: from mail-il1-f200.google.com (mail-il1-f200.google.com [209.85.166.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-464-5sySfxKfPkCpLryIWOitdA-1; Tue, 01 Aug 2023 16:39:41 -0400 X-MC-Unique: 5sySfxKfPkCpLryIWOitdA-1 Received: by mail-il1-f200.google.com with SMTP id e9e14a558f8ab-3492ef8860cso12458645ab.2 for ; Tue, 01 Aug 2023 13:39:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690922380; x=1691527180; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=EHYSwcovOuMESUTA5bYv/hl0ljd1OC0AxmWQtxJBqmI=; b=Y1Fq5S9c2+K3/P5X9xffVoRqzccLsuQY8PzWZjh8V3+VI3uzqSPn1QH6FXJMumQ8hQ doCQro4BRbrcTVFiffmAV1+zy9Iz+ECkgSLL/OvpRF6BDq/IMTfRdMNSsnGMIU3dI1K7 XoktBG1E59k5PRvvcomij/HChcK7mQCyQWz9sDArBx0kNBrFvkVHCvI5lbMUZ5BCbmMn qiPgS63kwpFQQrxnB+BjfMXuwUo+bSSfi1mLz0XMaAs1Wgw6U0XxsJ8VhWMfucqsswbz uRTqU3pWZkOUh9Jnw44hW3dTKJrpAnQoY7k1Br73WLDS0vc2wH4EkBK1IlP7ODcZQDGL 1Whw== X-Gm-Message-State: ABy/qLbH1cDg/HnoN6JnPVl+2df48aWIOyazZOz90IGHhrBh8j2Zoxu4 kyB7K6cVqnVDqrPQscmkmYsIgh1/6l5y8UNjFTtt1zb++5I5b65jcD78uUHsHvh6xQqQCfxVAUF 1tWoVpeVchjNGmpNhFxYg9rCYR4h44yF4 X-Received: by 2002:a05:6e02:20ee:b0:349:191:af05 with SMTP id q14-20020a056e0220ee00b003490191af05mr15049949ilv.16.1690922380421; Tue, 01 Aug 2023 13:39:40 -0700 (PDT) X-Received: by 2002:a05:6e02:20ee:b0:349:191:af05 with SMTP id q14-20020a056e0220ee00b003490191af05mr15049938ilv.16.1690922380121; Tue, 01 Aug 2023 13:39:40 -0700 (PDT) Received: from redhat.com ([38.15.60.12]) by smtp.gmail.com with ESMTPSA id o7-20020a02cc27000000b0042b1cd4c096sm3887565jap.74.2023.08.01.13.39.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Aug 2023 13:39:39 -0700 (PDT) Date: Tue, 1 Aug 2023 14:39:38 -0600 From: Alex Williamson To: Like Xu Cc: Paolo Bonzini , Zhu Lingshan , linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: Re: [PATCH] KVM: irqbypass: Convert producers/consumers single linked list to XArray Message-ID: <20230801143938.3d27a199.alex.williamson@redhat.com> In-Reply-To: <20230801115646.33990-1-likexu@tencent.com> References: <20230801115646.33990-1-likexu@tencent.com> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.35; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 1 Aug 2023 19:56:46 +0800 Like Xu wrote: > From: Like Xu > > Replace producers/consumers linked list with XArray. There are no changes > in functionality, but lookup performance has been improved. > > The producers and consumers in current IRQ bypass manager are stored in > simple linked lists, and a single mutex is held while traversing the lists > and connecting a consumer to a producer (and vice versa). With this design > and implementation, if there are a large number of KVM agents concurrently > creating irqfds and all requesting to register their irqfds in the global > consumers list, the global mutex contention will exponentially increase > the avg wait latency, which is no longer tolerable in modern systems with > a large number of CPU cores. For example: > > the wait time latency to acquire the mutex in a stress test where 174000 > irqfds were created concurrently on an 2.70GHz ICX w/ 144 cores: > > - avg = 117.855314 ms > - min = 20 ns > - max = 11428.340858 ms > > To reduce latency introduced by the irq_bypass_register_consumer() in > the above usage scenario, the data structure XArray and its normal API > is applied to track the producers and consumers so that lookups don't > require a linear walk since the "tokens" used to match producers and > consumers are just kernel pointers. > > Thanks to the nature of XArray (more memory-efficient, parallelisable > and cache friendly), the latecny is significantly reduced (compared to > list and hlist proposal) under the same environment and testing: > > - avg = 314 ns > - min = 124 ns > - max = 47637 ns > > In this conversion, the non-NULL opaque token to match between producer > and consumer () is used as the XArray index. The list_for_each_entry() is > replaced by xa_load(), and list_add/del() is replaced by xa_store/erase(). > The list_head member for linked list is removed, along with comments. > > Cc: Alex Williamson > Reported-by: Yong He > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217379 > Suggested-by: Sean Christopherson > Suggested-by: Paolo Bonzini > Signed-off-by: Like Xu > --- > Prerequisite: > - https://lore.kernel.org/kvm/20230801085408.69597-1-likexu@tencent.com Perhaps send it as a series? > Test Requests: > - Please rant to me if it causes a negative impact on vdpa/vfio testing. > include/linux/irqbypass.h | 8 +-- > virt/lib/irqbypass.c | 123 +++++++++++++++++++------------------- > 2 files changed, 61 insertions(+), 70 deletions(-) > > diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h > index 9bdb2a781841..dbcc1b4d0ccf 100644 > --- a/include/linux/irqbypass.h > +++ b/include/linux/irqbypass.h > @@ -8,14 +8,12 @@ > #ifndef IRQBYPASS_H > #define IRQBYPASS_H > > -#include > - > struct irq_bypass_consumer; > > /* > * Theory of operation > * > - * The IRQ bypass manager is a simple set of lists and callbacks that allows > + * The IRQ bypass manager is a simple set of xarrays and callbacks that allows > * IRQ producers (ex. physical interrupt sources) to be matched to IRQ > * consumers (ex. virtualization hardware that allows IRQ bypass or offload) > * via a shared token (ex. eventfd_ctx). Producers and consumers register > @@ -30,7 +28,6 @@ struct irq_bypass_consumer; > > /** > * struct irq_bypass_producer - IRQ bypass producer definition > - * @node: IRQ bypass manager private list management > * @token: opaque token to match between producer and consumer (non-NULL) > * @irq: Linux IRQ number for the producer device > * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional) > @@ -43,7 +40,6 @@ struct irq_bypass_consumer; > * for a physical device assigned to a VM. > */ > struct irq_bypass_producer { > - struct list_head node; > void *token; > int irq; > int (*add_consumer)(struct irq_bypass_producer *, > @@ -56,7 +52,6 @@ struct irq_bypass_producer { > > /** > * struct irq_bypass_consumer - IRQ bypass consumer definition > - * @node: IRQ bypass manager private list management > * @token: opaque token to match between producer and consumer (non-NULL) > * @add_producer: Connect the IRQ consumer to an IRQ producer > * @del_producer: Disconnect the IRQ consumer from an IRQ producer > @@ -69,7 +64,6 @@ struct irq_bypass_producer { > * portions of the interrupt handling to the VM. > */ > struct irq_bypass_consumer { > - struct list_head node; > void *token; > int (*add_producer)(struct irq_bypass_consumer *, > struct irq_bypass_producer *); > diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c > index e0aabbbf27ec..78238c0fa83f 100644 > --- a/virt/lib/irqbypass.c > +++ b/virt/lib/irqbypass.c > @@ -15,15 +15,15 @@ > */ > > #include > -#include > #include > #include > +#include > > MODULE_LICENSE("GPL v2"); > MODULE_DESCRIPTION("IRQ bypass manager utility module"); > > -static LIST_HEAD(producers); > -static LIST_HEAD(consumers); > +static DEFINE_XARRAY(producers); > +static DEFINE_XARRAY(consumers); > static DEFINE_MUTEX(lock); > > /* @lock must be held when calling connect */ > @@ -78,11 +78,12 @@ static void __disconnect(struct irq_bypass_producer *prod, > * irq_bypass_register_producer - register IRQ bypass producer > * @producer: pointer to producer structure > * > - * Add the provided IRQ producer to the list of producers and connect > - * with any matching token found on the IRQ consumers list. > + * Add the provided IRQ producer to the xarray of producers and connect > + * with any matching token found on the IRQ consumers xarray. > */ > int irq_bypass_register_producer(struct irq_bypass_producer *producer) > { > + unsigned long token = (unsigned long)producer->token; > struct irq_bypass_producer *tmp; > struct irq_bypass_consumer *consumer; > int ret; > @@ -97,23 +98,22 @@ int irq_bypass_register_producer(struct irq_bypass_producer *producer) > > mutex_lock(&lock); > > - list_for_each_entry(tmp, &producers, node) { > - if (tmp->token == producer->token || tmp == producer) { > - ret = -EBUSY; > + tmp = xa_load(&producers, token); > + if (tmp || tmp == producer) { > + ret = -EBUSY; > + goto out_err; > + } > + > + consumer = xa_load(&consumers, token); > + if (consumer) { > + ret = __connect(producer, consumer); > + if (ret) > goto out_err; > - } > } > > - list_for_each_entry(consumer, &consumers, node) { > - if (consumer->token == producer->token) { > - ret = __connect(producer, consumer); > - if (ret) > - goto out_err; > - break; > - } > - } > - > - list_add(&producer->node, &producers); > + ret = xa_err(xa_store(&producers, token, producer, GFP_KERNEL)); > + if (ret) > + goto out_err; This leaves the producer and consumer connected but not tracked in the producers xarray. > > mutex_unlock(&lock); > > @@ -129,11 +129,12 @@ EXPORT_SYMBOL_GPL(irq_bypass_register_producer); > * irq_bypass_unregister_producer - unregister IRQ bypass producer > * @producer: pointer to producer structure > * > - * Remove a previously registered IRQ producer from the list of producers > + * Remove a previously registered IRQ producer from the xarray of producers > * and disconnect it from any connected IRQ consumer. > */ > void irq_bypass_unregister_producer(struct irq_bypass_producer *producer) > { > + unsigned long token = (unsigned long)producer->token; > struct irq_bypass_producer *tmp; > struct irq_bypass_consumer *consumer; > > @@ -143,24 +144,18 @@ void irq_bypass_unregister_producer(struct irq_bypass_producer *producer) > might_sleep(); > > if (!try_module_get(THIS_MODULE)) > - return; /* nothing in the list anyway */ > + return; /* nothing in the xarray anyway */ > > mutex_lock(&lock); > > - list_for_each_entry(tmp, &producers, node) { > - if (tmp != producer) > - continue; > + tmp = xa_load(&producers, token); > + if (tmp == producer) { > + consumer = xa_load(&consumers, token); > + if (consumer) > + __disconnect(producer, consumer); > > - list_for_each_entry(consumer, &consumers, node) { > - if (consumer->token == producer->token) { > - __disconnect(producer, consumer); > - break; > - } > - } > - > - list_del(&producer->node); > + xa_erase(&producers, token); > module_put(THIS_MODULE); > - break; > } > > mutex_unlock(&lock); > @@ -173,11 +168,12 @@ EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer); > * irq_bypass_register_consumer - register IRQ bypass consumer > * @consumer: pointer to consumer structure > * > - * Add the provided IRQ consumer to the list of consumers and connect > - * with any matching token found on the IRQ producer list. > + * Add the provided IRQ consumer to the xarray of consumers and connect > + * with any matching token found on the IRQ producer xarray. > */ > int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer) > { > + unsigned long token = (unsigned long)consumer->token; > struct irq_bypass_consumer *tmp; > struct irq_bypass_producer *producer; > int ret; > @@ -193,23 +189,22 @@ int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer) > > mutex_lock(&lock); > > - list_for_each_entry(tmp, &consumers, node) { > - if (tmp->token == consumer->token || tmp == consumer) { > - ret = -EBUSY; > + tmp = xa_load(&consumers, token); > + if (tmp || tmp == consumer) { > + ret = -EBUSY; > + goto out_err; > + } > + > + producer = xa_load(&producers, token); > + if (producer) { > + ret = __connect(producer, consumer); > + if (ret) > goto out_err; > - } > } > > - list_for_each_entry(producer, &producers, node) { > - if (producer->token == consumer->token) { > - ret = __connect(producer, consumer); > - if (ret) > - goto out_err; > - break; > - } > - } > - > - list_add(&consumer->node, &consumers); > + ret = xa_err(xa_store(&consumers, token, consumer, GFP_KERNEL)); > + if (ret) > + goto out_err; Same as above. Thanks, Alex > > mutex_unlock(&lock); > > @@ -225,11 +220,12 @@ EXPORT_SYMBOL_GPL(irq_bypass_register_consumer); > * irq_bypass_unregister_consumer - unregister IRQ bypass consumer > * @consumer: pointer to consumer structure > * > - * Remove a previously registered IRQ consumer from the list of consumers > + * Remove a previously registered IRQ consumer from the xarray of consumers > * and disconnect it from any connected IRQ producer. > */ > void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer) > { > + unsigned long token = (unsigned long)consumer->token; > struct irq_bypass_consumer *tmp; > struct irq_bypass_producer *producer; > > @@ -239,24 +235,18 @@ void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer) > might_sleep(); > > if (!try_module_get(THIS_MODULE)) > - return; /* nothing in the list anyway */ > + return; /* nothing in the xarray anyway */ > > mutex_lock(&lock); > > - list_for_each_entry(tmp, &consumers, node) { > - if (tmp != consumer) > - continue; > + tmp = xa_load(&consumers, token); > + if (tmp == consumer) { > + producer = xa_load(&producers, token); > + if (producer) > + __disconnect(producer, consumer); > > - list_for_each_entry(producer, &producers, node) { > - if (producer->token == consumer->token) { > - __disconnect(producer, consumer); > - break; > - } > - } > - > - list_del(&consumer->node); > + xa_erase(&consumers, token); > module_put(THIS_MODULE); > - break; > } > > mutex_unlock(&lock); > @@ -264,3 +254,10 @@ void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer) > module_put(THIS_MODULE); > } > EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer); > + > +static void __exit irqbypass_exit(void) > +{ > + xa_destroy(&producers); > + xa_destroy(&consumers); > +} > +module_exit(irqbypass_exit); > > base-commit: b580148824057ef8e3cc3a459082ebcb99716880