Date: Tue, 9 Apr 2024 22:49:22 +0900
From: Masami Hiramatsu (Google)
To: Zheng Yejian
Subject: Re: [PATCH v2] kprobes: Avoid possible warn in __arm_kprobe_ftrace()
Message-Id: <20240409224922.5f192e8ace5f7a90937bfa69@kernel.org>
References: <20240407035904.2556645-1-zhengyejian1@huawei.com> <20240408083403.3302274-1-zhengyejian1@huawei.com>
 <20240408214102.be792c5cefd5ab757ef32a14@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 9 Apr 2024 14:20:45 +0800
Zheng Yejian wrote:

> On 2024/4/8 20:41, Masami Hiramatsu (Google) wrote:
> > Hi Zheng,
> >
> > On Mon, 8 Apr 2024 16:34:03 +0800
> > Zheng Yejian wrote:
> >
> >> There is once warn in __arm_kprobe_ftrace() on:
> >>
> >>   ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 0, 0);
> >>   if (WARN_ONCE(..., "Failed to arm kprobe-ftrace at %pS (error %d)\n", ...))
> >>     return ret;
> >>
> >> This warning is generated because 'p->addr' is detected to be not a valid
> >> ftrace location in ftrace_set_filter_ip(). The ftrace address check is done
> >> by check_ftrace_location() at the beginning of check_kprobe_address_safe().
> >> At that point, ftrace_location(addr) == addr should return true if the
> >> module is loaded. Then the module is searched twice:
> >>   1. in is_module_text_address(), we find that 'p->addr' is in a module;
> >>   2. in __module_text_address(), we find the module;
> >>
> >> If the module has just been unloaded before the second search, then
> >> '*probed_mod' is NULL and we would not go to get the module refcount,
> >> then the return value of check_kprobe_address_safe() would be 0, but
> >> actually we need to return -EINVAL.
> >
> > OK, so you found a race window in check_kprobe_address_safe().
> >
> > It does something like below.
> >
> > check_kprobe_address_safe() {
> > 	...
> >
> > 	/* Timing [A] */
> >
> > 	if (!(core_kernel_text(p->addr) ||
> > 	      is_module_text_address(p->addr)) ||
> > 	    ...(other reserved address check)) {
> > 		return -EINVAL;
> > 	}
> >
> > 	/* Timing [B] */
> >
> > 	*probed_mod = __module_text_address(p->addr);
> > 	if (*probed_mod) {
> > 		if (!try_module_get(*probed_mod)) {
> > 			return -ENOENT;
> > 		}
> > 		...
> > 	}
> > }
> >
> > So, if p->addr is in a module which is alive at the timing [A], but
> > unloaded at timing [B], 'p->addr' is passed the
> > 'is_module_text_address(p->addr)' check, but *probed_mod becomes NULL.
> > Thus the corresponding module is not referenced and kprobe_arm(p) will
> > access a wrong address (use after free).
> > This happens either kprobe on ftrace is enabled or not.
>
> Yes, This is the problem. And for this case, check_kprobe_address_safe()
> still return 0, and then going on to arm kprobe may cause problems. So
> we should make check_kprobe_address_safe() return -EINVAL when refcount
> of the module is not got.

Yes,

>
> >
> > To fix this problem, we should move the mutex_lock(kprobe_mutex) before
> > check_kprobe_address_safe() because kprobe_module_callback() also lock it
> > so it can stop module unloading.
> >
> > Can you ensure this will fix your problem?
>
> It seems not, the warning in __arm_kprobe_ftrace() still occurs. I
> contrived following simple test:
>
> #!/bin/bash
> sysctl -w kernel.panic_on_warn=1
> while [ True ]; do
>     insmod mod.ko    # contain function 'foo'
>     rmmod mod.ko
> done &
> while [ True ]; do
>     insmod kprobe.ko  # register kprobe on function 'foo'
>     rmmod kprobe.ko
> done &
>
> I think holding kprobe_mutex cannot make sure we get the refcount of the
> module.

Aah, yes, it cannot, because the kallsyms of a module are removed
after module->state becomes MODULE_STATE_UNFORMED. Before UNFORMED,
the state is MODULE_STATE_GOING, and kprobe_module_callback() is called
at that point. Thus, the following scenario can happen.

CPU1                                    CPU2
mod->state = MODULE_STATE_GOING
kprobe_module_callback() {
  mutex_lock(&kprobe_mutex)
  loop on kprobe_table
    to disable kprobe in the module.
  mutex_unlock(&kprobe_mutex)
}
                                        register_kprobe(p) {
                                          mutex_lock(&kprobe_mutex)
                                          check_kprobe_address_safe(p->addr) {
                                            [A''] is_module_text_address() return true
                                                  until mod->state == UNFORMED.
mod->state = MODULE_STATE_UNFORMED
                                            [B''] __module_text_address() returns NULL.
                                          }
                                          p is on the kprobe_table.
                                          mutex_unlock(&kprobe_mutex)

So, with your fix, if we save the module at [A''] and use it at [B''],
the mod cannot be gotten (try_module_get() fails) because
mod->state != MODULE_STATE_LIVE.

>
> > I think your patch is just optimizing but not fixing the fundamental
> > problem, which is we don't have an atomic search symbol and get module
>
> Sorry, this patch is a little confusing, but it is not just optimizing :)
>
> As shown below, after my patch, if p->addr is in a module which is alive
> at the timing [A'] but unloaded at timing [B'], then *probed_mod must
> not be NULL. Then after timing [B'], it will go to try_module_get() and
> expected to fail and return -ENOENT. So this is the different.
>
>   check_kprobe_address_safe() {
>     ...
>     *probed_mod = NULL;
>     if (!core_kernel_text((unsigned long) p->addr)) {
>
>       /* Timing [A'] */
>
>       *probed_mod = __module_text_address((unsigned long) p->addr);
>       if (!(*probed_mod)) {
>         return -EINVAL;
>       }
>     }
>     ...
>
>     /* Timing [B'] */
>
>     if (*probed_mod) {
>       if (!try_module_get(*probed_mod)) {
>         return -ENOENT;
>       }
>       ...
>     }

OK, I got it. Hmm, but this is a rather long story to explain; the root
cause is the delay in the module unloading process. So, more precisely,
we can explain it as below.

----
When unloading a module, its state changes MODULE_STATE_LIVE ->
MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each transition takes
some time.
`is_module_text_address()` and `__module_text_address()` work with
MODULE_STATE_LIVE and MODULE_STATE_GOING.
If we use `is_module_text_address()` and `__module_text_address()`
separately, there is a chance that the first one succeeds but the
next one fails because module->state becomes MODULE_STATE_UNFORMED
between those operations.

In `check_kprobe_address_safe()`, if the second `__module_text_address()`
fails, the failure is ignored because the address is then assumed to be
a kernel text address. But it may have failed simply because
module->state has changed to MODULE_STATE_UNFORMED. In this case,
arm_kprobe() will try to modify a non-existent module text address
(use-after-free).

To fix this problem, we should not call `is_module_text_address()`
and `__module_text_address()` separately, but call only
`__module_text_address()` once and then do `try_module_get(module)`,
which succeeds only with MODULE_STATE_LIVE.
----

Does this explanation work for you too?

The code itself looks good to me now :-)

Thank you!

>
> > API. In that case, we should stop a whole module unloading system until
> > registering a new kprobe on a module. (After registering the kprobe,
> > the callback can mark it gone and disarm_kprobe does not work anymore.)
> >
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 9d9095e81792..94eaefd1bc51 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1633,11 +1633,11 @@ int register_kprobe(struct kprobe *p)
> >  	p->nmissed = 0;
> >  	INIT_LIST_HEAD(&p->list);
> >
> > +	mutex_lock(&kprobe_mutex);
> > +
> >  	ret = check_kprobe_address_safe(p, &probed_mod);
> >  	if (ret)
> > -		return ret;
> > -
> > -	mutex_lock(&kprobe_mutex);
> > +		goto out;
> >
> >  	if (on_func_entry)
> >  		p->flags |= KPROBE_FLAG_ON_FUNC_ENTRY;
> >
> > ----
> >
> > Thank you,
> >
> >>
> >> To fix it, originally we can simply check 'p->addr' is out of text again,
> >> like below. But that would check twice respectively in kernel text and
> >> module text, so finally I reduce them to be once.
> >>
> >>   if (!(core_kernel_text((unsigned long) p->addr) ||
> >>       is_module_text_address((unsigned long) p->addr)) || ...) {
> >> 	ret = -EINVAL;
> >> 	goto out;
> >>   }
> >>   ...
> >>   *probed_mod = __module_text_address((unsigned long) p->addr);
> >>   if (*probed_mod) {
> >> 	...
> >>   } else if (!core_kernel_text((unsigned long) p->addr)) { // check again!
> >> 	ret = -EINVAL;
> >> 	goto out;
> >>   }
> >>
> >> Signed-off-by: Zheng Yejian
> >> ---
> >>  kernel/kprobes.c | 18 ++++++++++++------
> >>  1 file changed, 12 insertions(+), 6 deletions(-)
> >>
> >> v2:
> >>   - Update commit messages and comments as suggested by Masami.
> >>     Link: https://lore.kernel.org/all/20240408115038.b0c85767bf1f249eccc32fff@kernel.org/
> >>
> >> v1:
> >>   - Link: https://lore.kernel.org/all/20240407035904.2556645-1-zhengyejian1@huawei.com/
> >>
> >> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> >> index 9d9095e81792..65adc815fc6e 100644
> >> --- a/kernel/kprobes.c
> >> +++ b/kernel/kprobes.c
> >> @@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >>  	jump_label_lock();
> >>  	preempt_disable();
> >>
> >> -	/* Ensure it is not in reserved area nor out of text */
> >> -	if (!(core_kernel_text((unsigned long) p->addr) ||
> >> -	    is_module_text_address((unsigned long) p->addr)) ||
> >> -	    in_gate_area_no_mm((unsigned long) p->addr) ||
> >> +	/* Ensure the address is in a text area, and find a module if exists. */
> >> +	*probed_mod = NULL;
> >> +	if (!core_kernel_text((unsigned long) p->addr)) {
> >> +		*probed_mod = __module_text_address((unsigned long) p->addr);
> >> +		if (!(*probed_mod)) {
> >> +			ret = -EINVAL;
> >> +			goto out;
> >> +		}
> >> +	}
> >> +	/* Ensure it is not in reserved area. */
> >> +	if (in_gate_area_no_mm((unsigned long) p->addr) ||
> >>  	    within_kprobe_blacklist((unsigned long) p->addr) ||
> >>  	    jump_label_text_reserved(p->addr, p->addr) ||
> >>  	    static_call_text_reserved(p->addr, p->addr) ||
> >> @@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p,
> >>  		goto out;
> >>  	}
> >>
> >> -	/* Check if 'p' is probing a module. */
> >> -	*probed_mod = __module_text_address((unsigned long) p->addr);
> >> +	/* Get module refcount and reject __init functions for loaded modules. */
> >>  	if (*probed_mod) {
> >>  		/*
> >>  		 * We must hold a refcount of the probed module while updating
> >> --
> >> 2.25.1
> >>
> >
> --
> Thanks
> Zheng Yejian
>

--
Masami Hiramatsu (Google)
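
For readers who want to step through the state race discussed in this thread, here is a minimal user-space sketch. It is not kernel code. Every `sim_*` name, the three-value state enum and the `at_b` hook are hypothetical stand-ins invented for illustration; they only model the behaviour described above (lookups succeed for LIVE and GOING, a refcount can be taken only for LIVE), and the sketch covers only addresses inside module text.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the module states named in the thread. */
enum sim_state { SIM_LIVE, SIM_GOING, SIM_UNFORMED };

struct sim_module {
    enum sim_state state;
};

/* Hook that lets the caller change the module state at timing [B]/[B']. */
typedef void (*sim_hook)(struct sim_module *mod);

/* Models __module_text_address(): finds LIVE and GOING modules only. */
static struct sim_module *sim_module_text_address(struct sim_module *mod)
{
    return mod->state != SIM_UNFORMED ? mod : NULL;
}

/* Models is_module_text_address(): same visibility rule, boolean result. */
static bool sim_is_module_text_address(struct sim_module *mod)
{
    return sim_module_text_address(mod) != NULL;
}

/* Models try_module_get(): a reference is granted only while LIVE. */
static bool sim_try_module_get(struct sim_module *mod)
{
    return mod->state == SIM_LIVE;
}

/* Old flow: two separate lookups with a window between them. */
static int sim_check_two_lookups(struct sim_module *mod, sim_hook at_b)
{
    struct sim_module *probed;

    if (!sim_is_module_text_address(mod))   /* timing [A] */
        return -EINVAL;
    if (at_b)
        at_b(mod);                          /* timing [B] */
    probed = sim_module_text_address(mod);
    if (probed && !sim_try_module_get(probed))
        return -ENOENT;
    return 0;   /* a NULL second lookup is silently accepted here */
}

/* Patched flow: one lookup, then try to take the reference. */
static int sim_check_one_lookup(struct sim_module *mod, sim_hook at_b)
{
    struct sim_module *probed = sim_module_text_address(mod); /* [A'] */

    if (!probed)
        return -EINVAL;
    if (at_b)
        at_b(mod);                          /* timing [B'] */
    if (!sim_try_module_get(probed))
        return -ENOENT;
    return 0;
}

/* Example hook: the unload completes inside the window. */
static void sim_finish_unload(struct sim_module *mod)
{
    mod->state = SIM_UNFORMED;
}
```

With a module in the GOING state and `sim_finish_unload` as the hook, `sim_check_two_lookups()` returns 0 (the unsafe result), while `sim_check_one_lookup()` returns -ENOENT, which matches the behaviour Zheng describes for the patched code.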