Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp5297720ybv; Mon, 17 Feb 2020 16:57:28 -0800 (PST) X-Google-Smtp-Source: APXvYqxNSGCdWotXtRclqIm9hklgwtzCkvdVKzK7nfcx/jfymlc3w/tzXSxW1kMaoV86E2UY3+Yw X-Received: by 2002:a9d:4e93:: with SMTP id v19mr12057140otk.200.1581987447917; Mon, 17 Feb 2020 16:57:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581987447; cv=none; d=google.com; s=arc-20160816; b=AWBH5mzefjpFmMOKg6+1sVWYeRSstH6Oqan5m6hv6HN0mHmiKB/cc01IhAGJi26x4w MvOpuyWuLTsX+4zn6KipvJMSxgsYsySlvaHYdjpD02uZtKhVbIUT4jOekd9sMcFjJ7h3 THfGtuLYUmtOc/+VUnmbzyOT4oXysC350aYcV/PfCEHPhdI9UFRC553dX5XE6K8T2VTJ b7k5PImHB8g9bfT/kvoYtel6J4jDOvvf46K6272TLw7zc4WH/5DBREe6Olxvv/FEzvhQ JYyCkiF/ynwx+FaX9aGEbuPhl1++LeHOgPpJ/njGeVAwgRr0rkTXbkcnTFu40UIcZX2W WLDg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:from:subject:mime-version :message-id:date:dkim-signature; bh=UWOivOqvvKnnG9YpQAwFVSpkmhSp9gpofcG/6/xewVk=; b=aLkoKUE18PW8yLIdVucmsA/B/ElpZaRMgiyTaJo2z9zd69zIfmMEn60x5b5U+32qgM 4IdDCnv7nuFdjJN1KIFg0zY0vt65vncymvEcITOJ0htTSOGYLkWQeTsedeHj7bXrFQnf 7MiCKtXk/mWtmC0LBX20ja9FKbKTb4gJaEl/NSMo62I3z0o9rHJ7w7tWn9mczZz2MdEy HZzbMEHbNzgp7VD7jkTJ5TSVIWbfSDSa7lTC9ZLgGJ3NDuknAoNcKZ/9CYwiYRoPcLVr V0g7n61uC5MGysBz2dROdFEuy2YA9De7oEmosa30n8joCOQpeFUaIQq+PuiUappqDxBi Uxxg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=dFdrVx8K; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i2si950224otc.130.2020.02.17.16.57.15; Mon, 17 Feb 2020 16:57:27 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=dFdrVx8K; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726155AbgBRA5F (ORCPT + 99 others); Mon, 17 Feb 2020 19:57:05 -0500 Received: from mail-pj1-f73.google.com ([209.85.216.73]:54412 "EHLO mail-pj1-f73.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726069AbgBRA5F (ORCPT ); Mon, 17 Feb 2020 19:57:05 -0500 Received: by mail-pj1-f73.google.com with SMTP id a31so389802pje.4 for ; Mon, 17 Feb 2020 16:57:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:message-id:mime-version:subject:from:to:cc; bh=UWOivOqvvKnnG9YpQAwFVSpkmhSp9gpofcG/6/xewVk=; b=dFdrVx8KZ5kYYfVfjrA7D44Z9E/BRkzVZbmxU0Dan5MH1YoxkgQlU8BPU373KWuC6I Qr7vV6zEPS9uOdjtgJuIZsO6CGrf0QIdzujpHXXGZjvunOqzhDPcE50bJscbQ9jv3eBo QCMJwgo4T2OgiWSjZqWoTtLHU3RdEHG+Ox8EJqF2p/P86KB9BDuXeEfYw4eOWfG5okSh fH7Wf7o91LzZl8bO1b6nCOJvRzVrpN3bBtPlhPgElE1Ae1G1/RnCIS+3LIiwRUdrR5rK GwRFXOo62bEfQOO/X2N/d8LmisGBArPIjaqh8YnWOEc4yFPbHoBVNR0SpEUV2XkMzdT8 LQ5g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:message-id:mime-version:subject:from:to:cc; bh=UWOivOqvvKnnG9YpQAwFVSpkmhSp9gpofcG/6/xewVk=; b=Lwhu8iT767Hwgv4R6J6xcVvYG3f6AhagerAdvxrw16vMdu1cwBDFmiATYkD2zyZGzG LOZG9nw+WEIJErBUDFjE/Xak8u7SghRUzzlJulv02N7Xou160C8d16BCdLSyWqe1ZOr5 /mmervC5BgviPNkW5qNodldDkEMg9jV9eErp/4QvH9w1LRjJ9Q8hcbdmiIkQwlA9zqOl +s27g/U3QCFFwnTyKbWgBgzf8ptOr3pJbjdEQupQ3Snc6J1MKL//U4psJpWWvwIz+Wqf h8XZjJMhTv4O0BH3kEo5nQbgkiCour6nRdn8wfESACkxyYAn1vuLTR/L3Of0vzwbWUZl 7yww== X-Gm-Message-State: APjAAAVbZ7yTMgW+rNYIY2YYs29XeYlfREDV6UjkszqbgyaSStm83bC7 nHpCqaCw8ppnipAWwOKwqYnmKik/xGMcudeABnvhcbgJfuiUiPt9tmm+D8chLb3+de9AuKYOzQ1 qMlWEZR4UA2tUiHfN7Rj8/JcPAnPndFGAU0VDjTT1LRS/8zxWoZGdR+YIvbq9fNLRpmJb0g== X-Received: by 2002:a65:40c8:: with SMTP id u8mr16965521pgp.188.1581987424436; Mon, 17 Feb 2020 16:57:04 -0800 (PST) Date: Mon, 17 Feb 2020 16:56:59 -0800 Message-Id: <20200218005659.91318-1-lrizzo@google.com> Mime-Version: 1.0 X-Mailer: git-send-email 2.25.0.265.gbab2e86ba0-goog Subject: [PATCH v3] kretprobe: percpu support From: Luigi Rizzo To: linux-kernel@vger.kernel.org, naveen.n.rao@linux.ibm.com, anil.s.keshavamurthy@intel.com, davem@davemloft.net, mhiramat@kernel.org, gregkh@linuxfoundation.org Cc: Luigi Rizzo Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org kretprobe uses a list protected by a single lock to allocate a kretprobe_instance in pre_handler_kretprobe(). This works poorly with concurrent calls. This patch offers a simplified fix: the percpu_instance flag indicates that we allocate one instance per CPU, and the allocation is contention free, but we allow only have one pending entry per CPU (this could be extended to a small constant number without much trouble). Signed-off-by: Luigi Rizzo --- include/linux/kprobes.h | 10 ++++++++-- kernel/kprobes.c | 41 +++++++++++++++++++++++++++++++++++++++++ kernel/test_kprobes.c | 18 +++++++++++++----- 3 files changed, 62 insertions(+), 7 deletions(-) diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 04bdaf01112cb..5c549d6a599b9 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -142,7 +142,8 @@ static inline int kprobe_ftrace(struct kprobe *p) * can be active concurrently. * nmissed - tracks the number of times the probed function's return was * ignored, due to maxactive being too low. - * + * percpu_instance - if set, uses one instance per cpu instead of allocating + * from the list protected by lock. */ struct kretprobe { struct kprobe kp; @@ -151,8 +152,13 @@ struct kretprobe { int maxactive; int nmissed; size_t data_size; - struct hlist_head free_instances; + union { + struct kretprobe_instance __percpu *pcpu; + struct hlist_head free_instances; + }; raw_spinlock_t lock; + u32 percpu_instance:1; + u32 unused:31; }; struct kretprobe_instance { diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 2625c241ac00f..12ca682083ffc 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1184,6 +1184,10 @@ void recycle_rp_inst(struct kretprobe_instance *ri, hlist_del(&ri->hlist); INIT_HLIST_NODE(&ri->hlist); if (likely(rp)) { + if (rp->percpu_instance) { + ri->rp = NULL; + return; + } raw_spin_lock(&rp->lock); hlist_add_head(&ri->hlist, &rp->free_instances); raw_spin_unlock(&rp->lock); @@ -1274,6 +1278,11 @@ static inline void free_rp_inst(struct kretprobe *rp) struct kretprobe_instance *ri; struct hlist_node *next; + if (rp->percpu_instance) { + free_percpu(rp->pcpu); + return; + } + hlist_for_each_entry_safe(ri, next, &rp->free_instances, hlist) { hlist_del(&ri->hlist); kfree(ri); @@ -1874,6 +1883,27 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs) /* TODO: consider to only swap the RA after the last pre_handler fired */ hash = hash_ptr(current, KPROBE_HASH_BITS); + if (rp->percpu_instance) { + unsigned long flags; + + local_irq_save(flags); + ri = this_cpu_ptr(rp->pcpu); + if (!ri || ri->rp) { /* already in use */ + local_irq_restore(flags); + rp->nmissed++; + return 0; + } + INIT_HLIST_NODE(&ri->hlist); + ri->rp = rp; + ri->task = current; + local_irq_restore(flags); + + if (rp->entry_handler && rp->entry_handler(ri, regs)) { + ri->rp = NULL; /* failed */ + return 0; + } + goto good; + } raw_spin_lock_irqsave(&rp->lock, flags); if (!hlist_empty(&rp->free_instances)) { ri = hlist_entry(rp->free_instances.first, @@ -1891,6 +1921,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs) return 0; } +good: arch_prepare_kretprobe(ri, regs); /* XXX(hch): why is there no hlist_move_head? */ @@ -1950,6 +1981,15 @@ int register_kretprobe(struct kretprobe *rp) rp->kp.post_handler = NULL; rp->kp.fault_handler = NULL; + if (rp->percpu_instance) { + rp->pcpu = __alloc_percpu(sizeof(*rp->pcpu) + rp->data_size, + __alignof__(*rp->pcpu)); + if (rp->pcpu) + goto finalize; + free_rp_inst(rp); + return -ENOMEM; + } + /* Pre-allocate memory for max kretprobe instances */ if (rp->maxactive <= 0) { #ifdef CONFIG_PREEMPTION @@ -1971,6 +2011,7 @@ int register_kretprobe(struct kretprobe *rp) hlist_add_head(&inst->hlist, &rp->free_instances); } +finalize: rp->nmissed = 0; /* Establish function entry probe point */ ret = register_kprobe(&rp->kp); diff --git a/kernel/test_kprobes.c b/kernel/test_kprobes.c index 76c997fdbc9da..22b734fb8bcb2 100644 --- a/kernel/test_kprobes.c +++ b/kernel/test_kprobes.c @@ -236,31 +236,36 @@ static struct kretprobe rp2 = { .kp.symbol_name = "kprobe_target2" }; -static int test_kretprobes(void) +static int test_kretprobes(bool percpu_instance) { int ret; struct kretprobe *rps[2] = {&rp, &rp2}; + const char *mode = percpu_instance ? "percpu " : "normal"; /* addr and flags should be cleard for reusing kprobe. */ rp.kp.addr = NULL; rp.kp.flags = 0; + rp.percpu_instance = percpu_instance; + rp2.kp.addr = NULL; + rp2.kp.flags = 0; + rp2.percpu_instance = percpu_instance; ret = register_kretprobes(rps, 2); if (ret < 0) { - pr_err("register_kretprobe returned %d\n", ret); + pr_err("register_kretprobe mode %s returned %d\n", mode, ret); return ret; } krph_val = 0; ret = target(rand1); if (krph_val != rand1) { - pr_err("kretprobe handler not called\n"); + pr_err("kretprobe handler mode %s not called\n", mode); handler_errors++; } krph_val = 0; ret = target2(rand1); if (krph_val != rand1) { - pr_err("kretprobe handler2 not called\n"); + pr_err("kretprobe handler2 mode %s not called\n", mode); handler_errors++; } unregister_kretprobes(rps, 2); @@ -297,7 +302,10 @@ int init_test_probes(void) errors++; num_tests++; - ret = test_kretprobes(); + ret = test_kretprobes(false); + if (ret < 0) + errors++; + ret = test_kretprobes(true); if (ret < 0) errors++; #endif /* CONFIG_KRETPROBES */ -- 2.25.0.265.gbab2e86ba0-goog