Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp8754ybt; Tue, 23 Jun 2020 13:53:22 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx3kIRR+E9Ok13l74vYC0eE+0nErsikVU4h6sD7aXyyyuvt5Ub6nCRzH7RblpxQtUBLCIyK X-Received: by 2002:a17:906:8542:: with SMTP id h2mr12151605ejy.517.1592945601817; Tue, 23 Jun 2020 13:53:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1592945601; cv=none; d=google.com; s=arc-20160816; b=hoO+cffdNErcuB5xfvOqGSVKPFIkUqCImvrgIauCH1nuLq0tDfX/+uWQEP7OIPDPvR DStYgq0y1mDSaAFvQyMF3FtLJ4GrvjBncPhZIsgxDJInudHmTk658UXne0ziE6xmkyxF xkTgxaIOVdp/U7Y8oO+/jOFZhN2wVHdr1W8UCxg2kHUusRlzgyi7BRjIK1AxOW7zCzP2 k/CNMLxa8nt6FIFpOmR57PyIu/znfgBgcjupYYd5Z+9ZFZgOmAyod7CnEh/5Js18OVa+ MXjEbj0BuHWT5+kqxGfzDSNBD2tA0Hsl3eqG5EIFQVbd7KjByuD6AaaGxcqn49vT/q5R 3eiw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=q5CoiVPFC/++TRnGRsxZclxtjd8pWpTxignsBtlBYyc=; b=OivmQWxIUmIO1F1fNEj8/MN9k20SZWt3a/nQ/O9pewM5ix3eAM7V0fpeRGPClThOed SOV5S6j0qJS+Ati591OBDtlJDKCu0/C1Ftj5IOuOo6Ib7ZuQ0dpBP9kLXqHnrtXsvQmC 1nzF/v4Fc1N+rZQirvOKqeHAOiOZisbs+HgAvnxuB/6ghfvS2YfnidRXCiNqQSh9Iy/D KzwO+jmIJC++95tLlWQ2OSg+8KhyzgnGct+L3VXIQs25kWwpSrysO3rVuiXQ+X6jOz8+ p1m6488FVDbMDdnBwN6Q7U8i/brqX/GH2ZVjvbZtpNelRxLmrQrKgs9f0hLHeyxwLNGY jNpw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=pqFZg6cv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id cc10si11596939edb.236.2020.06.23.13.52.58; Tue, 23 Jun 2020 13:53:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=pqFZg6cv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2392982AbgFWUtg (ORCPT + 99 others); Tue, 23 Jun 2020 16:49:36 -0400 Received: from mail.kernel.org ([198.145.29.99]:48948 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2392958AbgFWUtW (ORCPT ); Tue, 23 Jun 2020 16:49:22 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 03342217D9; Tue, 23 Jun 2020 20:49:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1592945361; bh=cRS/HVzEC3wcnDLYPbPIgPryvo+1rerFWJVeneDRZaw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pqFZg6cvXOVcvymdq6mm2uFEDYmK1fM3CK6scCns4GFBlvDnErj4tym19WwlHV3Oc IflCFwEQaiaDwW8BSTjmsPibQhYGpcSfOfmlBZoh8BCSQ63Glh61T4NWIS2pBuLyNm izSZsCmUAFvCSEsK9sdEJHut/bi++EVHsjGpgtek= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Ingo Molnar , "Gustavo A . R . Silva" , Anders Roxell , "Naveen N . Rao" , Anil S Keshavamurthy , David Miller , Ingo Molnar , Peter Zijlstra , "Ziqian SUN (Zamir)" , Masami Hiramatsu , Jiri Olsa , "Steven Rostedt (VMware)" , Sasha Levin Subject: [PATCH 4.14 134/136] kretprobe: Prevent triggering kretprobe from within kprobe_flush_task Date: Tue, 23 Jun 2020 21:59:50 +0200 Message-Id: <20200623195310.570239639@linuxfoundation.org> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20200623195303.601828702@linuxfoundation.org> References: <20200623195303.601828702@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Jiri Olsa [ Upstream commit 9b38cc704e844e41d9cf74e647bff1d249512cb3 ] Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave. My test was also able to trigger lockdep output: ============================================ WARNING: possible recursive locking detected 5.6.0-rc6+ #6 Not tainted -------------------------------------------- sched-messaging/2767 is trying to acquire lock: ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0 but task is already holding lock: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(kretprobe_table_locks[i].lock)); lock(&(kretprobe_table_locks[i].lock)); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by sched-messaging/2767: #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50 stack backtrace: CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6 Call Trace: dump_stack+0x96/0xe0 __lock_acquire.cold.57+0x173/0x2b7 ? native_queued_spin_lock_slowpath+0x42b/0x9e0 ? lockdep_hardirqs_on+0x590/0x590 ? __lock_acquire+0xf63/0x4030 lock_acquire+0x15a/0x3d0 ? kretprobe_hash_lock+0x52/0xa0 _raw_spin_lock_irqsave+0x36/0x70 ? kretprobe_hash_lock+0x52/0xa0 kretprobe_hash_lock+0x52/0xa0 trampoline_handler+0xf8/0x940 ? kprobe_fault_handler+0x380/0x380 ? find_held_lock+0x3a/0x1c0 kretprobe_trampoline+0x25/0x50 ? lock_acquired+0x392/0xbc0 ? _raw_spin_lock_irqsave+0x50/0x70 ? __get_valid_kprobe+0x1f0/0x1f0 ? _raw_spin_unlock_irqrestore+0x3b/0x40 ? finish_task_switch+0x4b9/0x6d0 ? __switch_to_asm+0x34/0x70 ? __switch_to_asm+0x40/0x70 The code within the kretprobe handler checks for probe reentrancy, so we won't trigger any _raw_spin_lock_irqsave probe in there. The problem is in outside kprobe_flush_task, where we call: kprobe_flush_task kretprobe_table_lock raw_spin_lock_irqsave _raw_spin_lock_irqsave where _raw_spin_lock_irqsave triggers the kretprobe and installs kretprobe_trampoline handler on _raw_spin_lock_irqsave return. The kretprobe_trampoline handler is then executed with already locked kretprobe_table_locks, and first thing it does is to lock kretprobe_table_locks ;-) the whole lockup path like: kprobe_flush_task kretprobe_table_lock raw_spin_lock_irqsave _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed ---> kretprobe_table_locks locked kretprobe_trampoline trampoline_handler kretprobe_hash_lock(current, &head, &flags); <--- deadlock Adding kprobe_busy_begin/end helpers that mark code with fake probe installed to prevent triggering of another kprobe within this code. Using these helpers in kprobe_flush_task, so the probe recursion protection check is hit and the probe is never set to prevent above lockup. Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2 Fixes: ef53d9c5e4da ("kprobes: improve kretprobe scalability with hashed locking") Cc: Ingo Molnar Cc: "Gustavo A . R . Silva" Cc: Anders Roxell Cc: "Naveen N . Rao" Cc: Anil S Keshavamurthy Cc: David Miller Cc: Ingo Molnar Cc: Peter Zijlstra Cc: stable@vger.kernel.org Reported-by: "Ziqian SUN (Zamir)" Acked-by: Masami Hiramatsu Signed-off-by: Jiri Olsa Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin --- arch/x86/kernel/kprobes/core.c | 16 +++------------- include/linux/kprobes.h | 4 ++++ kernel/kprobes.c | 24 ++++++++++++++++++++++++ 3 files changed, 31 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c index 9d7bb8de2917e..02665ffef0506 100644 --- a/arch/x86/kernel/kprobes/core.c +++ b/arch/x86/kernel/kprobes/core.c @@ -744,16 +744,11 @@ asm( NOKPROBE_SYMBOL(kretprobe_trampoline); STACK_FRAME_NON_STANDARD(kretprobe_trampoline); -static struct kprobe kretprobe_kprobe = { - .addr = (void *)kretprobe_trampoline, -}; - /* * Called from kretprobe_trampoline */ __visible __used void *trampoline_handler(struct pt_regs *regs) { - struct kprobe_ctlblk *kcb; struct kretprobe_instance *ri = NULL; struct hlist_head *head, empty_rp; struct hlist_node *tmp; @@ -763,16 +758,12 @@ __visible __used void *trampoline_handler(struct pt_regs *regs) void *frame_pointer; bool skipped = false; - preempt_disable(); - /* * Set a dummy kprobe for avoiding kretprobe recursion. * Since kretprobe never run in kprobe handler, kprobe must not * be running at this point. */ - kcb = get_kprobe_ctlblk(); - __this_cpu_write(current_kprobe, &kretprobe_kprobe); - kcb->kprobe_status = KPROBE_HIT_ACTIVE; + kprobe_busy_begin(); INIT_HLIST_HEAD(&empty_rp); kretprobe_hash_lock(current, &head, &flags); @@ -851,7 +842,7 @@ __visible __used void *trampoline_handler(struct pt_regs *regs) __this_cpu_write(current_kprobe, &ri->rp->kp); ri->ret_addr = correct_ret_addr; ri->rp->handler(ri, regs); - __this_cpu_write(current_kprobe, &kretprobe_kprobe); + __this_cpu_write(current_kprobe, &kprobe_busy); } recycle_rp_inst(ri, &empty_rp); @@ -867,8 +858,7 @@ __visible __used void *trampoline_handler(struct pt_regs *regs) kretprobe_hash_unlock(current, &flags); - __this_cpu_write(current_kprobe, NULL); - preempt_enable(); + kprobe_busy_end(); hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) { hlist_del(&ri->hlist); diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 520702b821340..be7a49f437ea3 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -385,6 +385,10 @@ static inline struct kprobe_ctlblk *get_kprobe_ctlblk(void) return this_cpu_ptr(&kprobe_ctlblk); } +extern struct kprobe kprobe_busy; +void kprobe_busy_begin(void); +void kprobe_busy_end(void); + kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset); int register_kprobe(struct kprobe *p); void unregister_kprobe(struct kprobe *p); diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 484670370a6a1..f2d2194b51ca3 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1218,6 +1218,26 @@ __releases(hlist_lock) } NOKPROBE_SYMBOL(kretprobe_table_unlock); +struct kprobe kprobe_busy = { + .addr = (void *) get_kprobe, +}; + +void kprobe_busy_begin(void) +{ + struct kprobe_ctlblk *kcb; + + preempt_disable(); + __this_cpu_write(current_kprobe, &kprobe_busy); + kcb = get_kprobe_ctlblk(); + kcb->kprobe_status = KPROBE_HIT_ACTIVE; +} + +void kprobe_busy_end(void) +{ + __this_cpu_write(current_kprobe, NULL); + preempt_enable(); +} + /* * This function is called from finish_task_switch when task tk becomes dead, * so that we can recycle any function-return probe instances associated @@ -1235,6 +1255,8 @@ void kprobe_flush_task(struct task_struct *tk) /* Early boot. kretprobe_table_locks not yet initialized. */ return; + kprobe_busy_begin(); + INIT_HLIST_HEAD(&empty_rp); hash = hash_ptr(tk, KPROBE_HASH_BITS); head = &kretprobe_inst_table[hash]; @@ -1248,6 +1270,8 @@ void kprobe_flush_task(struct task_struct *tk) hlist_del(&ri->hlist); kfree(ri); } + + kprobe_busy_end(); } NOKPROBE_SYMBOL(kprobe_flush_task); -- 2.25.1