Received: by 2002:a05:7412:b995:b0:f9:9502:5bb8 with SMTP id it21csp3879392rdb; Thu, 28 Dec 2023 03:10:02 -0800 (PST) X-Google-Smtp-Source: AGHT+IHQw32kR0VlzW0zK/QBk4cwRjKazqeIewQS1hVuTP+SaXSH4W5+mN67tW9BXhUpuGe257sl X-Received: by 2002:a05:6e02:b48:b0:35f:ef55:6abf with SMTP id f8-20020a056e020b4800b0035fef556abfmr14609473ilu.0.1703761801959; Thu, 28 Dec 2023 03:10:01 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1703761801; cv=none; d=google.com; s=arc-20160816; b=xKXJF6U5B3M3dy07fdY0AM22K/NasPtofg/BfwF3AEubtPoB6X6lsYbGiOuNavRzCu xyN1dBwiyi1ViMa/H0HIBLByAiPTOgci79GcGjduSFvqZLK5uW+P8k1PDXKTXQfT2IgW Tzit7XcTbOpl1FLOJXknhwVOp5NteNfSP0PqqJOTXNLrp/pQt311IdXvbLJVTdSZ+h4I gNMCb0NLkyhg5I3vVl9PaPqFu0XxMrYtOkQERg3ramt9Thk4mUxGpkJ9PGKrK5AHsyZN MYZ1KsTrIR8MxbROhVE7madRsElB+UjjhXisIRKM+u3CGJTcl0+3dcp9IMdaseqkZYDr NVSw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:date:message-id; bh=REajgy9d9YFyi5LbGgatUdufPYFdnkW4fi+uMLHin0M=; fh=wLyYV7pmOPQdvjiWfvNVR5UZV9Qzovjve02tZyi75M8=; b=UJd8iv3i1T5oDM+AyU5QSNaZLtbaGvuIEcTcyv8XIbBp77B/tOA4v1dw+bC6g29Tmj 9CjR6JJuOa6Tio6J/uFQ+3wE0K0Qwaj6+xgaPU+nRIfcwou2y9loWg9ITMbRCRUQdllG mfL1D+ECMs9FIPrH1zMCsPyKJS/86rx7Tgl2HPXlKcsA7inBd6nuRVDsvDPcjr8DP0+w ROP0cu9MLzMONo4/FVdEbi6kB0OECxMpHAJhxvRdfdNPdNxlViyJSeHhkWMbFLNa6oez O4th81Z6bV8xVJXKO2CpKTFraLTMHaRJAaaxZH0j6NwjFn7uTwxqd3wFvpjsElb0u9eU vIcA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-12603-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-12603-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Return-Path: Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id z64-20020a636543000000b005c6707eedb1si12407589pgb.679.2023.12.28.03.10.01 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 28 Dec 2023 03:10:01 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-12603-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-12603-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-12603-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 0618CB21821 for ; Thu, 28 Dec 2023 11:07:09 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B6DB57468; Thu, 28 Dec 2023 11:06:54 +0000 (UTC) X-Original-To: linux-kernel@vger.kernel.org Received: from out30-101.freemail.mail.aliyun.com (out30-101.freemail.mail.aliyun.com [115.124.30.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB5336AAC; Thu, 28 Dec 2023 11:06:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam:AC=PASS;BC=-1|-1;BR=01201311R981e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045176;MF=alibuda@linux.alibaba.com;NM=1;PH=DS;RN=14;SR=0;TI=SMTPD_---0VzOK8oQ_1703761600; Received: from 30.221.146.89(mailfrom:alibuda@linux.alibaba.com fp:SMTPD_---0VzOK8oQ_1703761600) by smtp.aliyun-inc.com; Thu, 28 Dec 2023 19:06:41 +0800 Message-ID: <5f8ee6e1-8a3c-457c-bbda-5b003e726a7c@linux.alibaba.com> Date: Thu, 28 Dec 2023 19:06:39 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC nf-next v3 1/2] netfilter: bpf: support prog update Content-Language: en-US To: Alexei Starovoitov Cc: Pablo Neira Ayuso , Jozsef Kadlecsik , Florian Westphal , bpf , LKML , Network Development , coreteam@netfilter.org, netfilter-devel , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Alexei Starovoitov References: <1703081351-85579-1-git-send-email-alibuda@linux.alibaba.com> <1703081351-85579-2-git-send-email-alibuda@linux.alibaba.com> <1d3cb7fc-c1dc-a779-8952-cdbaaf696ce3@linux.alibaba.com> From: "D. Wythe" In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 12/28/23 3:00 AM, Alexei Starovoitov wrote: > On Wed, Dec 27, 2023 at 12:20 AM D. Wythe wrote: >> >> Hi Alexei, >> >> >> IMMO, nf_unregister_net_hook does not wait for the completion of the >> execution of the hook that is being removed, >> instead, it allocates a new array without the very hook to replace the >> old arrayvia rcu_assign_pointer() (in __nf_hook_entries_try_shrink), >> then it use call_rcu() to release the old one. >> >> You can find more details in commit >> 8c873e2199700c2de7dbd5eedb9d90d5f109462b. >> >> In other words, when nf_unregister_net_hook returns, there may still be >> contexts executing hooks on the >> old array, which means that the `link` may still be accessed after >> nf_unregister_net_hook returns. >> >> And that's the reason why we use kfree_rcu() to release the `link`. >>>> nf_hook_run_bpf >>>> const struct >>>> bpf_nf_link *nf_link = bpf_link; >>>> >>>> bpf_nf_link_release >>>> nf_unregister_net_hook(nf_link->net, &nf_link->hook_ops); >>>> >>>> bpf_nf_link_dealloc >>>> free(link) >>>> bpf_prog_run(link->prog); > Got it. > Sounds like it's an existing bug. If so it should be an independent > patch with Fixes tag. > > Also please craft a test case to demonstrate UAF. > It is not an existing bug... Accessing the link within the hook was something I introduced here to support updates????, as previously there was no access to the link within the hook. >> I must admit that it is indeed feasible if we eliminate the mutex and >> use cmpxchg to swap the prog (we need to ensure that there is only one >> bpf_prog_put() on the old prog). >> However, when cmpxchg fails, it means that this context has not >> outcompeted the other one, and we have to return a failure. Maybe >> something like this: >> >> if (!cmpxchg(&link->prog, old_prog, new_prog)) { >> /* already replaced by another link_update */ >> return -xxx; >> } >> >> As a comparison, The version with the mutex wouldn't encounter this >> error, every update would succeed. I think that it's too harsh for the >> user to receive a failure >> in that case since they haven't done anything wrong. > Disagree. The mutex doesn't prevent this issue. > There is always a race. > It happens when link_update.old_prog_fd and BPF_F_REPLACE > were specified. > One user space passes an FD of the old prog and > another user space doing the same. They both race and one of them > gets > if (old_prog && link->prog != old_prog) { > err = -EPERM; > > it's no different with dropping the mutex and doing: > if (old_prog) { > if (!cmpxchg(&link->prog, old_prog, new_prog)) > -EPERM > } else { > old_prog = xchg(&link->prog, new_prog); > } Got it!  It's very helpful, Thanks very much! I will modify my patch accordingly. Best wishes, D. Wythe