Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp629257ybl; Fri, 31 Jan 2020 05:16:18 -0800 (PST) X-Google-Smtp-Source: APXvYqyY7I9sg2ExpZqXPkI9DiGFPEvHtsCzDpmw5I55fBGfxCkquUgTVe8/MpjUooKRWxa+jXXm X-Received: by 2002:a05:6830:4a7:: with SMTP id l7mr7117751otd.372.1580476578104; Fri, 31 Jan 2020 05:16:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580476578; cv=none; d=google.com; s=arc-20160816; b=F3+kzi0KEGDTKscmaF5n1FhQjflnfrXS+9Urb20D0t9R2tlUVC/vK2Sb+vuhoDHb+C oE7q4oEY/XvU0W3dqvvyF32j+cUQrecBkVglAlLVrnSu+z4SM1E6SdW12nwzZkxNcWXE 5TawP6zzbRGHxdbGuvVYg18DviwXF9XP12ln60OjQceSAU69d4LImmHe+tpzUyqfNYtX tWRhbxlmQ2l35KmP0+ZSq6B2fbKgbrRYdiMASnXy7KxflEA9y6F3N6qxfzLWAin5PtKc d7YEe4hRSgsKBhHeYPGF1Dvjfk8HpYz+yVMSUlt3W6Sm4GKvFzL5A2v5cmcBTWUKDAon YGbA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=1wdUycQduqenzuZBGYYPIbcN8Bfs3JMpg3P3r1VrT2Q=; b=xtuiCW7hSl89Xyyvu++aTqLqBRuukWVZ9zRNeNg+mGglbHGRIihNLxbG39E66lWxWZ YETp8sSHe98fmMoCti1Wv3aGYzUqBNDfnFgIYL5PuLx2aDhK+Pertl4WLfe+LbrK4Fl2 FQvg9keNyK/aqIOhGBpQAmY03MpLcokqLlU2h8JDLe4hFc+RPRmIwWuI+DqlQ7p+VbpD hg9/JA2R+N+U4r/dHoA0xvr02UxQI+03J6NehSsv/srH61lL9rGrKYKIG6faEqf+gmYc 9qO9gWi5xdmWfr3GCoImQtbODb+hWVakOmjTB9pLaB1EmJITYd6HVOoiJB0WE+2IQ/M0 t7gw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m24si4324988oic.11.2020.01.31.05.16.05; Fri, 31 Jan 2020 05:16:18 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728702AbgAaNOg (ORCPT + 99 others); Fri, 31 Jan 2020 08:14:36 -0500 Received: from foss.arm.com ([217.140.110.172]:35298 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728544AbgAaNOg (ORCPT ); Fri, 31 Jan 2020 08:14:36 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6E18C106F; Fri, 31 Jan 2020 05:14:35 -0800 (PST) Received: from [192.168.1.123] (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 7A4A63F67D; Fri, 31 Jan 2020 05:14:34 -0800 (PST) Subject: Re: [PATCH] dma-debug: add a per-cpu cache to avoid lock contention To: Eric Dumazet , Christoph Hellwig , Joerg Roedel Cc: iommu@lists.linux-foundation.org, Eric Dumazet , linux-kernel References: <20200130201451.253115-1-edumazet@google.com> From: Robin Murphy Message-ID: <87bde968-0103-c90d-380d-e0793cd39aa5@arm.com> Date: Fri, 31 Jan 2020 13:14:36 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Thunderbird/68.4.2 MIME-Version: 1.0 In-Reply-To: <20200130201451.253115-1-edumazet@google.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2020-01-30 8:14 pm, Eric Dumazet via iommu wrote: > Networking drivers very often have to replace one page with > another for their RX ring buffers. > > A multi-queue NIC will severly hit a contention point > in dma-debug while grabbing free_entries_lock spinlock. > > Adding a one entry per-cpu cache removes the need > to grab this spinlock twice per page replacement. > > Tested on a 40Gbit mlx4 NIC, with 16 RX queues and about > 1,000,000 replacements per second. Looks like a decent benefit for minimal collateral impact - from what I've seen, networking appears to be a particularly easy place to run into awkward races, so mitigating even a little of the debug-induced serialisation for the common case can only be good. Reviewed-by: Robin Murphy Thanks, Robin. > Signed-off-by: Eric Dumazet > Cc: Christoph Hellwig > --- > kernel/dma/debug.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c > index a310dbb1515e92c081f8f3f9a7290dd5e53fc889..b7221426ef49cf640db5bcb261b0817d714a3033 100644 > --- a/kernel/dma/debug.c > +++ b/kernel/dma/debug.c > @@ -97,6 +97,8 @@ static LIST_HEAD(free_entries); > /* Lock for the list above */ > static DEFINE_SPINLOCK(free_entries_lock); > > +static DEFINE_PER_CPU(struct dma_debug_entry *, dma_debug_entry_cache); > + > /* Global disable flag - will be set in case of an error */ > static bool global_disable __read_mostly; > > @@ -676,6 +678,10 @@ static struct dma_debug_entry *dma_entry_alloc(void) > struct dma_debug_entry *entry; > unsigned long flags; > > + entry = this_cpu_xchg(dma_debug_entry_cache, NULL); > + if (entry) > + goto end; > + > spin_lock_irqsave(&free_entries_lock, flags); > if (num_free_entries == 0) { > if (dma_debug_create_entries(GFP_ATOMIC)) { > @@ -690,7 +696,7 @@ static struct dma_debug_entry *dma_entry_alloc(void) > entry = __dma_entry_alloc(); > > spin_unlock_irqrestore(&free_entries_lock, flags); > - > +end: > #ifdef CONFIG_STACKTRACE > entry->stack_len = stack_trace_save(entry->stack_entries, > ARRAY_SIZE(entry->stack_entries), > @@ -705,6 +711,9 @@ static void dma_entry_free(struct dma_debug_entry *entry) > > active_cacheline_remove(entry); > > + if (!this_cpu_cmpxchg(dma_debug_entry_cache, NULL, entry)) > + return; > + > /* > * add to beginning of the list - this way the entries are > * more likely cache hot when they are reallocated. >