Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp4593990rdb; Fri, 15 Sep 2023 07:01:27 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEjW9vHMJr8ycnm2KiTCBdP4i9j6iwRsXh7TXOohzehT7TG9FJ5AJ/A/UmsXCSeitClQUJE X-Received: by 2002:a05:6a20:734c:b0:140:cb66:73aa with SMTP id v12-20020a056a20734c00b00140cb6673aamr2136609pzc.3.1694786487215; Fri, 15 Sep 2023 07:01:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694786487; cv=none; d=google.com; s=arc-20160816; b=RJjY2wqxbnWWvDDWcWA3Pi6yrdvM7Pwc2sAnoxwi3X4e7X4+BgH5qwr/Lp6oNKdrQ9 6X4jzwv8xhuOmS7inp1/rqT7fYNi6kluwUOmJZ/MeWUZMGe6zlbLtTDM8vpSlRaW1+YR 8ezPHCeNC7h86ze14N49wIxubyWrzd0Q+3VXs0UsvPJ0gUdqZXr9sWgB4Fa5praIGWBX iUrWxVzBkjfyK6dirLhG+rDTkfu3l+pUScyEWe0pvJ9q50eO9uBjC4dkBFX06Um3dmid gcZpBG4tmJtHigqDLaOTQcai6duiKvKZRCzhvT9BgAo/DTttrxgEQMnJQ3gdE9EAv2LT I7IQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:subject:user-agent:mime-version:date:message-id :dkim-signature; bh=/CZRQ4CQ6CK+YkV5XceF78ucYRwapU0IqEIE5QvZWaU=; fh=FOko/EAq07QFIZh2Mq9H/o1EUScmxYR/A2EJiAH7nzs=; b=XIQqCRXe2GZIukOOWEqMEiH8CA432Urdz3x318Yzt7j3QyjRWmGZn/6NedpqeINxQS qVlhOg57JAnmK0SV79GvyoKgdP2eX2ctD99SFQBeBJ25LZpN0DW0v7j6Iij1VFx6dYrh pIffRodj/huDpYdqg9FxK06/LIqnMNfvtr935WCo0+pX+SOYdGwA0ghLST3jK4hNZeTi zVBMI+vpbfY8of6Ew2yydvB7ray774TZHD4DMoIbKgONmIQ9l9fr8jtyaF/7NwPAEUA3 iyBLobAop/sVHSacQ6GoRxqQJg13b+fTj7q3a7iE+/ZAy1mYTapTu1i00ihUWzXGH/wI +Uqw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=Zz+uMINP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Return-Path: Received: from howler.vger.email (howler.vger.email. [2620:137:e000::3:4]) by mx.google.com with ESMTPS id cm10-20020a056a00338a00b0068fbace5bfcsi3362784pfb.149.2023.09.15.07.01.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Sep 2023 07:01:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) client-ip=2620:137:e000::3:4; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=Zz+uMINP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id 0895D834F749; Thu, 14 Sep 2023 22:47:28 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232120AbjIOFrP (ORCPT + 99 others); Fri, 15 Sep 2023 01:47:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43840 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231997AbjIOFrO (ORCPT ); Fri, 15 Sep 2023 01:47:14 -0400 Received: from mail-ot1-x329.google.com (mail-ot1-x329.google.com [IPv6:2607:f8b0:4864:20::329]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A18712723 for ; Thu, 14 Sep 2023 22:46:50 -0700 (PDT) Received: by mail-ot1-x329.google.com with SMTP id 46e09a7af769-6c0bbbbad81so1028180a34.3 for ; Thu, 14 Sep 2023 22:46:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1694756810; x=1695361610; darn=vger.kernel.org; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=/CZRQ4CQ6CK+YkV5XceF78ucYRwapU0IqEIE5QvZWaU=; b=Zz+uMINPeTq/PinHIQ6ze0FrNAmyBRdXSSBPWVQKf4y+ALbKu2w4HqPU/p3ae9qmGb VzJ6xWNvRQnmeXYvL4V941hWdV8OUORhreFef9ybd3XWnagY4kFHG2lp4HN7amAzoXmd NipFyr2+4dP3RH6MW1x58aU0Fzg+FYWnwuF2qjCpS1wz+TIjmXMaEmnsem5snXTuAUrW o34gcjGhU5fB+e4zNZ/q7P2U1d/pzfgsU91e2arzXSjUSFul1XiXzGwo+JarxiIFjhlk 2ZJNKUFTCHmkz9MEmZa/LK4ljxOLA8iVn0W+kONL9UMP9sO5xjyX+DP0/AM0MMS05Z7W vdVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694756810; x=1695361610; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=/CZRQ4CQ6CK+YkV5XceF78ucYRwapU0IqEIE5QvZWaU=; b=exdjlECcIYIjztkidq7DuSVzN/2JoIVNpRRVe+m0b3Zxi5pGdn7M3nlcUDLOrFYa8T P44n4iQdecpKOx6KFzU6QdHL/derXBm+j7HmJW2g0IFwXcDzr4hQ5RaNvubml76knqtA thZp799EFH7mecTKWa9QbQ+/1ruglgROXzyeyE3dPGuTFTA20nSfIj4w6Tohnh/aP/3r JUlo27/7D0/pd1WbRVg354ALfpoJ3Po2us7isjWPBWLtzRtH1IraykG2tUfs83nft6/x 3g/jPEOcQKx1McicoKrgNwXzZFkD0SZE3DlTJt0GwH7EOWc4jThRpoka9yDwkWsogf1T 1jBw== X-Gm-Message-State: AOJu0YypPFNskomVOstD0Xwe6AtYuAVOavDEK2SENvPa7LVBcq6R4hAl PYLel7HuXZvVbO9Ha9JSI7CY/g== X-Received: by 2002:a9d:7316:0:b0:6bc:63ca:a245 with SMTP id e22-20020a9d7316000000b006bc63caa245mr579623otk.10.1694756809901; Thu, 14 Sep 2023 22:46:49 -0700 (PDT) Received: from [10.84.145.144] ([203.208.167.146]) by smtp.gmail.com with ESMTPSA id j4-20020aa78dc4000000b00686fe7b7b48sm2190209pfr.121.2023.09.14.22.46.45 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Sep 2023 22:46:49 -0700 (PDT) Message-ID: <8f388b8f-bc19-5ad1-00ee-e67cdcdd9d4f@bytedance.com> Date: Fri, 15 Sep 2023 13:46:43 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: Re: [External] Re: [PATCH bpf-next v2 5/6] bpf: teach the verifier to enforce css_iter and process_iter in RCU CS To: Andrii Nakryiko , Alexei Starovoitov Cc: bpf@vger.kernel.org, daniel@iogearbox.net, andrii@kernel.org, martin.lau@kernel.org, tj@kernel.org, linux-kernel@vger.kernel.org, ast@kernel.org References: <20230912070149.969939-1-zhouchuyi@bytedance.com> <20230912070149.969939-6-zhouchuyi@bytedance.com> <4c15c9fc-7c9f-9695-5c67-d3f214d04bd9@bytedance.com> <1f9cae15-979c-c049-78a9-f89d5cd1b53e@bytedance.com> From: Chuyi Zhou In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Thu, 14 Sep 2023 22:47:28 -0700 (PDT) Hello. 在 2023/9/15 07:26, Andrii Nakryiko 写道: > On Thu, Sep 14, 2023 at 1:56 AM Chuyi Zhou wrote: >> >> >> >> 在 2023/9/13 21:53, Chuyi Zhou 写道: >>> Hello. >>> >>> 在 2023/9/12 15:01, Chuyi Zhou 写道: >>>> css_iter and process_iter should be used in rcu section. Specifically, in >>>> sleepable progs explicit bpf_rcu_read_lock() is needed before use these >>>> iters. In normal bpf progs that have implicit rcu_read_lock(), it's OK to >>>> use them directly. >>>> >>>> This patch checks whether we are in rcu cs before we want to invoke >>>> bpf_iter_process_new and bpf_iter_css_{pre, post}_new in >>>> mark_stack_slots_iter(). If the rcu protection is guaranteed, we would >>>> let st->type = PTR_TO_STACK | MEM_RCU. is_iter_reg_valid_init() will >>>> reject if reg->type is UNTRUSTED. >>> >>> I use the following BPF Prog to test this patch: >>> >>> SEC("?fentry.s/" SYS_PREFIX "sys_getpgid") >>> int iter_task_for_each_sleep(void *ctx) >>> { >>> struct task_struct *task; >>> struct task_struct *cur_task = bpf_get_current_task_btf(); >>> >>> if (cur_task->pid != target_pid) >>> return 0; >>> bpf_rcu_read_lock(); >>> bpf_for_each(process, task) { >>> bpf_rcu_read_unlock(); >>> if (task->pid == target_pid) >>> process_cnt += 1; >>> bpf_rcu_read_lock(); >>> } >>> bpf_rcu_read_unlock(); >>> return 0; >>> } >>> >>> Unfortunately, we can pass the verifier. >>> >>> Then I add some printk-messages before setting/clearing state to help >>> debug: >>> >>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>> index d151e6b43a5f..35f3fa9471a9 100644 >>> --- a/kernel/bpf/verifier.c >>> +++ b/kernel/bpf/verifier.c >>> @@ -1200,7 +1200,7 @@ static int mark_stack_slots_iter(struct >>> bpf_verifier_env *env, >>> __mark_reg_known_zero(st); >>> st->type = PTR_TO_STACK; /* we don't have dedicated reg >>> type */ >>> if (is_iter_need_rcu(meta)) { >>> + printk("mark reg_addr : %px", st); >>> if (in_rcu_cs(env)) >>> st->type |= MEM_RCU; >>> else >>> @@ -11472,8 +11472,8 @@ static int check_kfunc_call(struct >>> bpf_verifier_env *env, struct bpf_insn *insn, >>> return -EINVAL; >>> } else if (rcu_unlock) { >>> bpf_for_each_reg_in_vstate(env->cur_state, >>> state, reg, ({ >>> + printk("clear reg_addr : %px MEM_RCU : >>> %d PTR_UNTRUSTED : %d\n ", reg, reg->type & MEM_RCU, reg->type & >>> PTR_UNTRUSTED); >>> if (reg->type & MEM_RCU) { >>> - printk("clear reg addr : %lld", >>> reg); >>> reg->type &= ~(MEM_RCU | >>> PTR_MAYBE_NULL); >>> reg->type |= PTR_UNTRUSTED; >>> } >>> >>> >>> The demsg log: >>> >>> [ 393.705324] mark reg_addr : ffff88814e40e200 >>> >>> [ 393.706883] clear reg_addr : ffff88814d5f8000 MEM_RCU : 0 >>> PTR_UNTRUSTED : 0 >>> >>> [ 393.707353] clear reg_addr : ffff88814d5f8078 MEM_RCU : 0 >>> PTR_UNTRUSTED : 0 >>> >>> [ 393.708099] clear reg_addr : ffff88814d5f80f0 MEM_RCU : 0 >>> PTR_UNTRUSTED : 0 >>> .... >>> .... >>> >>> I didn't see ffff88814e40e200 is cleared as expected because >>> bpf_for_each_reg_in_vstate didn't find it. >>> >>> It seems when we are doing bpf_read_unlock() in the middle of iteration >>> and want to clearing state through bpf_for_each_reg_in_vstate, we can >>> not find the previous reg which we marked MEM_RCU/PTR_UNTRUSTED in >>> mark_stack_slots_iter(). >>> >> >> bpf_get_spilled_reg will skip slots if they are not STACK_SPILL, but in >> mark_stack_slots_iter() we has marked the slots *STACK_ITER* >> >> With the following change, everything seems work OK. >> >> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h >> index a3236651ec64..83c5ecccadb4 100644 >> --- a/include/linux/bpf_verifier.h >> +++ b/include/linux/bpf_verifier.h >> @@ -387,7 +387,7 @@ struct bpf_verifier_state { >> >> #define bpf_get_spilled_reg(slot, frame) \ >> (((slot < frame->allocated_stack / BPF_REG_SIZE) && \ >> - (frame->stack[slot].slot_type[0] == STACK_SPILL)) \ >> + (frame->stack[slot].slot_type[0] == STACK_SPILL || >> frame->stack[slot].slot_type[0] == STACK_ITER)) \ >> ? &frame->stack[slot].spilled_ptr : NULL) >> >> I am not sure whether this would harm some logic implicitly when using >> bpf_get_spilled_reg/bpf_for_each_spilled_reg in other place. If so, >> maybe we should add a extra parameter to control the picking behaviour. >> >> #define bpf_get_spilled_reg(slot, frame, stack_type) >> \ >> (((slot < frame->allocated_stack / BPF_REG_SIZE) && \ >> (frame->stack[slot].slot_type[0] == stack_type)) \ >> ? &frame->stack[slot].spilled_ptr : NULL) >> >> Thanks. > > I don't think it's safe to just make bpf_get_spilled_reg, and > subsequently bpf_for_each_reg_in_vstate and bpf_for_each_spilled_reg > just suddenly start iterating iterator states and/or dynptrs. At least > some of existing uses of those assume they are really working just > with registers. IIUC, when we are doing bpf_rcu_unlock, we do need to clear the state of reg including STACK_ITER. Maybe here we only need change the logic when using bpf_for_each_reg_in_vstate to clear state in bpf_rcu_unlock and keep everything else unchanged ? Thanks.