Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp2328614ybv; Fri, 21 Feb 2020 13:24:04 -0800 (PST) X-Google-Smtp-Source: APXvYqxLAnhK8L+12iT9vpWX3eDfoCNkjK1NAD20CA4ppOBwBc/SHNa2zHveJohjvJz36Zdo+PIT X-Received: by 2002:a9d:664a:: with SMTP id q10mr28624490otm.298.1582320244384; Fri, 21 Feb 2020 13:24:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1582320244; cv=none; d=google.com; s=arc-20160816; b=lDn3oz9AGIomNpuxS0L1atrgt0s09dx33nMHBI86A7zyp7TliXTAVr9tD1MI23SLNa MouxG+phGy06WxtZ5x7OvugUNDyu1c76Ntdyt1hIebfKEkBTjZUVddrtinKr1vJwKHm/ LaQYku0SyOXfimZfDkzvwUtq9USDisTKIy5og+sJrktQBUvpfJW2rUovSgS/9zHzTYzL PJ2JlVrMHwju8GVbdaejq3ewitJFbng8+yFFtsXgXrjDn14ZTAD29BoOkAbX1TlvCXV0 JGNJutruaa1FzBcYjLRN0PHwf4kZAsLz+Hy/QuIB06thuPJkR9W7oy2ZD/crU72PrU+y mVtg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=z0UtmZPeT56XfWRuDUA7oBTkz9ZjiDJ1oBCH4qpkMQ0=; b=IyuUxSOcXmpSUg2BxuJoumU+xSYdLwHoBlMwlpqeY1+Nb0BewJZcQtkqWCJaSRxUYe 5pzived0QXykTrWz8tNLAFOlUMVCWKwyLhSWBwgCifSrund4Dn63vV6P2vUCDXh0H5OU pVkIoCF4ISolYxbmBA9YiMIxmVfVp8nf7jsb3s/7rHBruyOU60GaKsoXDG9mQnazRdrK Z+pAohXPAd1ZILS2CCzy+Do7RJxq3N+iq8RQTMk9NbwP83dF/ffm31vxsukkeYgGeJFj VBNvJUFtrjBEsYKDZpDNrcaJ8ibOVq4PhApM/mDbxfn4E3zdpBJGD4qfQq2Jbb8Ul8yb VD1A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=c9HY3eS5; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r66si1303785oie.255.2020.02.21.13.23.51; Fri, 21 Feb 2020 13:24:04 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=c9HY3eS5; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726989AbgBUVXl (ORCPT + 99 others); Fri, 21 Feb 2020 16:23:41 -0500 Received: from mail-ed1-f68.google.com ([209.85.208.68]:38389 "EHLO mail-ed1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726707AbgBUVXl (ORCPT ); Fri, 21 Feb 2020 16:23:41 -0500 Received: by mail-ed1-f68.google.com with SMTP id p23so4080121edr.5 for ; Fri, 21 Feb 2020 13:23:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=z0UtmZPeT56XfWRuDUA7oBTkz9ZjiDJ1oBCH4qpkMQ0=; b=c9HY3eS5WKljqU/p0tZYMuxAzdtRfJ6HN3lf2Wth3vfg4DT06ipl3CpoYJRP77uD7r ZrtMXzZHQgfyam/bMlnj2W60j8GclkUygHSQjUADqT946pzosXTD1dJh82qL+LTOw1Fi dP6IFmFaje/wVpl/eCfKFYB08Chn0tSAtcWvx6jDL9j5NW5hDR+VjWNtm2Lg5dWPRQVg h8G89ykiTz67ZWe+fGESVdqaOOvVJSDQagR267ej413Uo3+0Qs81VhF8M1kKr5mZnxQd 577ztcTKhVNda4GZSR2F0mssWbccIPQ7xR94If4FnjrwlB6c76NDY0FWu700lk566xAb kQrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=z0UtmZPeT56XfWRuDUA7oBTkz9ZjiDJ1oBCH4qpkMQ0=; b=fW2SSHsph0qxzE/m1APyBGeIYpGFikpxzjcUrFvgDrDWFTCrzhMOnf7HT7r7VctEYU D8lJP6wC6N0V+wKr3xRjwSkhi/Xuf3Bv/+9u+HNs4Q9Cql0z8C/SXr8pIc7pBW9gwzQ1 LJW3IwqkvnMnKyboUmv4ws5vua6elv/ggOy0IsaqLlxiqwz267HBMcKvHF+HDFY94XoE FXPrJNR41ij+zN3ZWdwYrJQHSRuhsEAknLj7HvY/UlJRB2ElWWivSZmEP2K/Ofr++nUB //K7KG+7jdiZ6c9RaKGnw3dQIvs0e+TJJe3lohpHbEiR46FGTtf65VJfCVNvK0wfFhH1 4UgQ== X-Gm-Message-State: APjAAAXaGC5V2Vkn3ssYiBhfbNJvpg/DcUP+CjSflD3/muKQ23WPprVW wlB7ocUemj+0WXYsK2k+0uGdvh24VecTBjT2BRw24A== X-Received: by 2002:a50:fd15:: with SMTP id i21mr35541410eds.12.1582320219090; Fri, 21 Feb 2020 13:23:39 -0800 (PST) MIME-Version: 1.0 References: <20200218005659.91318-1-lrizzo@google.com> <20200218165529.39e761c4be828285cc060279@kernel.org> <20200218205025.4047cf0506f56b18f9a989c4@kernel.org> In-Reply-To: <20200218205025.4047cf0506f56b18f9a989c4@kernel.org> From: Luigi Rizzo Date: Fri, 21 Feb 2020 13:23:28 -0800 Message-ID: Subject: Re: [PATCH v3] kretprobe: percpu support To: Masami Hiramatsu Cc: linux-kernel@vger.kernel.org, naveen.n.rao@linux.ibm.com, anil.s.keshavamurthy@intel.com, David Miller , gregkh@linuxfoundation.org, Ingo Molnar , Steven Rostedt Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 18, 2020 at 3:50 AM Masami Hiramatsu wrote: > > On Tue, 18 Feb 2020 01:39:40 -0800 > Luigi Rizzo wrote: > > > On Mon, Feb 17, 2020 at 11:55 PM Masami Hiramatsu wrote: > > > > > > Hi Luigi, > > > > > > On Mon, 17 Feb 2020 16:56:59 -0800 > > > Luigi Rizzo wrote: > > > > > > > kretprobe uses a list protected by a single lock to allocate a > > > > kretprobe_instance in pre_handler_kretprobe(). This works poorly with > > > > concurrent calls. > > > > > > Yes, there are several potential performance issue and the recycle > > > instance is one of them. However, I think this spinlock is not so racy, > > > but noisy (especially on many core machine) right? > > > > correct, it is especially painful on 2+ sockets and many-core systems > > when attaching kretprobes on otherwise uncontended paths. > > > > > > > > Racy lock is the kretprobe_hash_lock(), I would like to replace it > > > with ftrace's per-task shadow stack. But that will be available > > > only if CONFIG_FUNCTION_GRAPH_TRACER=y (and instance has no own > > > payload). > > > > > > > This patch offers a simplified fix: the percpu_instance flag indicates > > > > that we allocate one instance per CPU, and the allocation is contention > > > > free, but we allow only have one pending entry per CPU (this could be > > > > extended to a small constant number without much trouble). > > > > > > OK, the percpu instance idea is good to me, and I think it should be > > > default option. Unless user specifies the number of instances, it should > > > choose percpu instance by default. > > > > That was my initial implementation, which would not even need the > > percpu_instance > > flag in struct kretprobe. However, I felt that changing the default > > would have subtle > > side effects (e.g., only one outstanding call per CPU) so I thought it > > would be better > > to leave the default unchanged and make the flag explicit. > > > > > Moreover, this makes things a bit complicated, can you add per-cpu > > > instance array? If it is there, we can remove the old recycle rp insn > > > code. > > > > Can you clarify what you mean by "per-cpu instance array" ? > > Do you mean allowing multiple outstanding entries per cpu? > > Yes, either allocating it on percpu area or allocating arraies > on percpu pointer is OK. e.g. > > instance_size = sizeof(*rp->pcpu) + rp->data_size; > rp->pcpu = __alloc_percpu(instance_size * array_size, > __alignof__(*rp->pcpu)); > > And we will search free ri on the percpu array by checking ri->rp == NULL. I have posted a v4 patch with the refactoring you suggested, but still defaulting to non percpu allocation, and only one entry per cpu. The former to avoid potential regressions, the latter because I worry that the search in the array may incur several cache misses especially if the traced function is allowed to block or the caller can migrate. (Maybe I am over cautious, but I want to measure that cost first; once that is clear perhaps we can move forward with another patch that defaults to percpu and removes the reclaim code). cheers luigi