Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp1834084ybb; Thu, 9 Apr 2020 09:51:35 -0700 (PDT) X-Google-Smtp-Source: APiQypKMZbSKx/vJKiHlJodnmjSAd5XdS7JrB6NQDz1phfS3MAxupmAe4nn8ah3xOcP9ZnCLiZ4o X-Received: by 2002:ac8:18c3:: with SMTP id o3mr260575qtk.49.1586451095520; Thu, 09 Apr 2020 09:51:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1586451095; cv=none; d=google.com; s=arc-20160816; b=YWFTrr/MumPGRD5bd8WgMNGLwkueBD2g66KV2XuUc1MQ7x2snHKiEdX3pNbazX+fiB 0+1l6CbsWz7djyVb6aWbhEBQwYIQHMTCT2F/rmBhsjbMatBjTCVplYoRIujeP8nEuQ/P ddzOrdqzeuD4o/pf2YfvIrgpZ72wfxd6XNK9IAayGH0tYOBzZGFGepodnrxduQiy88hG MbX37OVFjeSCbPqVHa+Q9N3bk3VvM+6Mt/HftuERBNPoNP0F7Q2M4dtX1ym5idVChptR I6kU7dodw+z4nJi39onkwjMblDyJBODHBJ8uEjqsFndXsCyH4qAb8gdRgCsG5zJZKew4 KIBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=OSRKFQBmODQfHxlC9DY67nz8JkVe3BGquwaDAc0a7xI=; b=bZoGMe8lyx+DtIBpCaLgFDHm4rZytfgyVLbnAWqQNLpNpPnMtIeEA4WkifGh5w5bI9 IYE/GJvjZDOl0L9d/7TXTgW3FxrOsGGTUyhDxclfsMDstyc1jJlo13mUmmt26TWReuoY gV/NpU0YaZGVUEN3zZVCJFkZDpLSqrwDRUsSKM7wlBvHJpfnBtSdXbIjA/9BD6QPmwbA 4Dft5aGi+DdLkveev3V17gQJomCRBGv7e8AIDpuBsDXXA+hc73GStpVwxNfCuwKupKGD IeMYjaAdK7jPsWuzgbrhXdmRqjaoK4w96J9SSTczcvu0ykw7lKS5ObzcDz9arUIs2apF cXFw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=I6kL4V8j; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v1si6077297qkd.327.2020.04.09.09.51.18; Thu, 09 Apr 2020 09:51:35 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=I6kL4V8j; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727822AbgDIOlH (ORCPT + 99 others); Thu, 9 Apr 2020 10:41:07 -0400 Received: from mail.kernel.org ([198.145.29.99]:47888 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727327AbgDIOlG (ORCPT ); Thu, 9 Apr 2020 10:41:06 -0400 Received: from devnote2 (NE2965lan1.rev.em-net.ne.jp [210.141.244.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 429C42078E; Thu, 9 Apr 2020 14:41:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1586443266; bh=ubVb0I2ai8nKLC8GF70Hzzlt+BYOLspkiVSoi7kI8V0=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=I6kL4V8jneQArdv0/VR5+Ajsnrqdl4EeY07NFGAr9eIvOnksVtB7e4t4p1znNEEgf pkKJm8Xmn04IxwMUK5zcvFi61+79ueFvIgMCnV3cBx5KljLtN1cwTESlAzRPN6dwbQ bF5+ivv/pC1cBKmPfMj13/q6+9T8thU4gHtISoc0= Date: Thu, 9 Apr 2020 23:41:01 +0900 From: Masami Hiramatsu To: Jiri Olsa Cc: "Naveen N. Rao" , Anil S Keshavamurthy , "David S. Miller" , Peter Zijlstra , lkml , "bibo,mao" , "Ziqian SUN (Zamir)" Subject: Re: [RFC] kretprobe: Prevent triggering kretprobe from within kprobe_flush_task Message-Id: <20200409234101.8814f3cbead69337ac5a33fa@kernel.org> In-Reply-To: <20200408164641.3299633-1-jolsa@kernel.org> References: <20200408164641.3299633-1-jolsa@kernel.org> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.32; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 8 Apr 2020 18:46:41 +0200 Jiri Olsa wrote: > hi, > Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave. > My test was also able to trigger lockdep output: > > ============================================ > WARNING: possible recursive locking detected > 5.6.0-rc6+ #6 Not tainted > -------------------------------------------- > sched-messaging/2767 is trying to acquire lock: > ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0 > > but task is already holding lock: > ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50 > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(&(kretprobe_table_locks[i].lock)); > lock(&(kretprobe_table_locks[i].lock)); > > *** DEADLOCK *** > > May be due to missing lock nesting notation > > 1 lock held by sched-messaging/2767: > #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50 > > stack backtrace: > CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6 > Call Trace: > dump_stack+0x96/0xe0 > __lock_acquire.cold.57+0x173/0x2b7 > ? native_queued_spin_lock_slowpath+0x42b/0x9e0 > ? lockdep_hardirqs_on+0x590/0x590 > ? __lock_acquire+0xf63/0x4030 > lock_acquire+0x15a/0x3d0 > ? kretprobe_hash_lock+0x52/0xa0 > _raw_spin_lock_irqsave+0x36/0x70 > ? kretprobe_hash_lock+0x52/0xa0 > kretprobe_hash_lock+0x52/0xa0 > trampoline_handler+0xf8/0x940 > ? kprobe_fault_handler+0x380/0x380 > ? find_held_lock+0x3a/0x1c0 > kretprobe_trampoline+0x25/0x50 > ? lock_acquired+0x392/0xbc0 > ? _raw_spin_lock_irqsave+0x50/0x70 > ? __get_valid_kprobe+0x1f0/0x1f0 > ? _raw_spin_unlock_irqrestore+0x3b/0x40 > ? finish_task_switch+0x4b9/0x6d0 > ? __switch_to_asm+0x34/0x70 > ? __switch_to_asm+0x40/0x70 > > The code within the kretprobe handler checks for probe reentrancy, > so we won't trigger any _raw_spin_lock_irqsave probe in there. > > The problem is in outside kprobe_flush_task, where we call: > > kprobe_flush_task > kretprobe_table_lock > raw_spin_lock_irqsave > _raw_spin_lock_irqsave > > where _raw_spin_lock_irqsave triggers the kretprobe and installs > kretprobe_trampoline handler on _raw_spin_lock_irqsave return. > > The kretprobe_trampoline handler is then executed with already > locked kretprobe_table_locks, and first thing it does is to > lock kretprobe_table_locks ;-) the whole lockup path like: > > kprobe_flush_task > kretprobe_table_lock > raw_spin_lock_irqsave > _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed > > ---> kretprobe_table_locks locked > > kretprobe_trampoline > trampoline_handler > kretprobe_hash_lock(current, &head, &flags); <--- deadlock > > The change below sets current_kprobe in kprobe_flush_task, so the probe > recursion protection check is hit and the probe is never set. It seems > to fix the deadlock. > > I'm not sure this is the best fix, any ideas are welcome ;-) OK, I just have 1 comment. :) > > thanks, > jirka > > > --- > kernel/kprobes.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > index 2625c241ac00..b13247cae752 100644 > --- a/kernel/kprobes.c > +++ b/kernel/kprobes.c > @@ -1236,6 +1236,10 @@ __releases(hlist_lock) > } > NOKPROBE_SYMBOL(kretprobe_table_unlock); > > +static struct kprobe kretprobe_dummy = { > + .addr = (void *)kretprobe_trampoline, > +}; > + > /* > * This function is called from finish_task_switch when task tk becomes dead, > * so that we can recycle any function-return probe instances associated > @@ -1256,12 +1260,14 @@ void kprobe_flush_task(struct task_struct *tk) > INIT_HLIST_HEAD(&empty_rp); > hash = hash_ptr(tk, KPROBE_HASH_BITS); > head = &kretprobe_inst_table[hash]; > + __this_cpu_write(current_kprobe, &kretprobe_dummy); Can you also set the kcb->kprobe_state = KPROBE_HIT_ACTIVE? BTW, we may be better to introduce a common kprobe_reject_section_start() and kprobe_reject_section_end() so that the user don't need to prepare dummy kprobes. Thank you, > kretprobe_table_lock(hash, &flags); > hlist_for_each_entry_safe(ri, tmp, head, hlist) { > if (ri->task == tk) > recycle_rp_inst(ri, &empty_rp); > } > kretprobe_table_unlock(hash, &flags); > + __this_cpu_write(current_kprobe, NULL); > hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) { > hlist_del(&ri->hlist); > kfree(ri); > -- > 2.25.2 > -- Masami Hiramatsu