Date: Wed, 27 Mar 2024 08:42:58 +0900
From: Masami Hiramatsu (Google)
To: Jonthan Haslam
Cc: linux-trace-kernel@vger.kernel.org, andrii@kernel.org, bpf@vger.kernel.org, rostedt@goodmis.org, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] uprobes: reduce contention on uprobes_tree access
Message-Id: <20240327084258.c385e997782a97fef07ba084@kernel.org>
References: <20240321145736.2373846-1-jonathan.haslam@gmail.com> <20240325120323.ec3248d330b2755e73a6571e@kernel.org>

On Mon, 25 Mar 2024 19:04:59 +0000
Jonthan Haslam wrote:

> Hi Masami,
>
> > > This change has been tested against production workloads that exhibit
> > > significant contention on the spinlock and an almost order of magnitude
> > > reduction for mean uprobe execution time is observed (28 -> 3.5 microsecs).
> >
> > Looks good to me.
> >
> > Acked-by: Masami Hiramatsu (Google)
> >
> > BTW, how did you measure the overhead? I think spinlock overhead
> > will depend on how much lock contention happens.
>
> Absolutely. I have the original production workload to test this with and
> a derived one that mimics this test case. The production case has ~24
> threads running on a 192 core system which access 14 USDTs around 1.5
> million times per second in total (across all USDTs). My test case is
> similar but can drive a higher rate of USDT access across more threads and
> therefore generate higher contention.

Thanks for the info. So this result was measured on a sufficiently large
machine with high parallelism, where lock contention matters.
Can you also include this information, along with the numbers, in the next
version?

Thank you,

>
> All measurements are done using bpftrace scripts around relevant parts of
> code in uprobes.c and application code.
>
> Jon.
>
> >
> > Thank you,
> >
> > >
> > > [0] https://docs.kernel.org/locking/spinlocks.html
> > >
> > > Signed-off-by: Jonathan Haslam
> > > ---
> > >  kernel/events/uprobes.c | 22 +++++++++++-----------
> > >  1 file changed, 11 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> > > index 929e98c62965..42bf9b6e8bc0 100644
> > > --- a/kernel/events/uprobes.c
> > > +++ b/kernel/events/uprobes.c
> > > @@ -39,7 +39,7 @@ static struct rb_root uprobes_tree = RB_ROOT;
> > >   */
> > >  #define no_uprobe_events() RB_EMPTY_ROOT(&uprobes_tree)
> > >
> > > -static DEFINE_SPINLOCK(uprobes_treelock); /* serialize rbtree access */
> > > +static DEFINE_RWLOCK(uprobes_treelock); /* serialize rbtree access */
> > >
> > >  #define UPROBES_HASH_SZ 13
> > >  /* serialize uprobe->pending_list */
> > > @@ -669,9 +669,9 @@ static struct uprobe *find_uprobe(struct inode *inode, loff_t offset)
> > >  {
> > >  	struct uprobe *uprobe;
> > >
> > > -	spin_lock(&uprobes_treelock);
> > > +	read_lock(&uprobes_treelock);
> > >  	uprobe = __find_uprobe(inode, offset);
> > > -	spin_unlock(&uprobes_treelock);
> > > +	read_unlock(&uprobes_treelock);
> > >
> > >  	return uprobe;
> > >  }
> > > @@ -701,9 +701,9 @@ static struct uprobe *insert_uprobe(struct uprobe *uprobe)
> > >  {
> > >  	struct uprobe *u;
> > >
> > > -	spin_lock(&uprobes_treelock);
> > > +	write_lock(&uprobes_treelock);
> > >  	u = __insert_uprobe(uprobe);
> > > -	spin_unlock(&uprobes_treelock);
> > > +	write_unlock(&uprobes_treelock);
> > >
> > >  	return u;
> > >  }
> > > @@ -935,9 +935,9 @@ static void delete_uprobe(struct uprobe *uprobe)
> > >  	if (WARN_ON(!uprobe_is_active(uprobe)))
> > >  		return;
> > >
> > > -	spin_lock(&uprobes_treelock);
> > > +	write_lock(&uprobes_treelock);
> > >  	rb_erase(&uprobe->rb_node, &uprobes_tree);
> > > -	spin_unlock(&uprobes_treelock);
> > > +	write_unlock(&uprobes_treelock);
> > >  	RB_CLEAR_NODE(&uprobe->rb_node); /* for uprobe_is_active() */
> > >  	put_uprobe(uprobe);
> > >  }
> > > @@ -1298,7 +1298,7 @@ static void build_probe_list(struct inode *inode,
> > >  	min = vaddr_to_offset(vma, start);
> > >  	max = min + (end - start) - 1;
> > >
> > > -	spin_lock(&uprobes_treelock);
> > > +	read_lock(&uprobes_treelock);
> > >  	n = find_node_in_range(inode, min, max);
> > >  	if (n) {
> > >  		for (t = n; t; t = rb_prev(t)) {
> > > @@ -1316,7 +1316,7 @@ static void build_probe_list(struct inode *inode,
> > >  			get_uprobe(u);
> > >  		}
> > >  	}
> > > -	spin_unlock(&uprobes_treelock);
> > > +	read_unlock(&uprobes_treelock);
> > >  }
> > >
> > >  /* @vma contains reference counter, not the probed instruction. */
> > > @@ -1407,9 +1407,9 @@ vma_has_uprobes(struct vm_area_struct *vma, unsigned long start, unsigned long e
> > >  	min = vaddr_to_offset(vma, start);
> > >  	max = min + (end - start) - 1;
> > >
> > > -	spin_lock(&uprobes_treelock);
> > > +	read_lock(&uprobes_treelock);
> > >  	n = find_node_in_range(inode, min, max);
> > > -	spin_unlock(&uprobes_treelock);
> > > +	read_unlock(&uprobes_treelock);
> > >
> > >  	return !!n;
> > >  }
> > > --
> > > 2.43.0
> > >
> >
> >
> > --
> > Masami Hiramatsu (Google)


--
Masami Hiramatsu (Google)