From: Jiri Olsa
Date: Mon, 12 Dec 2022 16:04:34 +0100
To: Jiri Olsa, Hao Sun
Cc: Alexei Starovoitov, Jakub Kicinski, "Paul E. McKenney", Daniel Borkmann,
    Yonghong Song, Song Liu, Peter Zijlstra, bpf, Alexei Starovoitov,
    John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Yonghong Song,
    KP Singh, Stanislav Fomichev, Hao Luo, David Miller,
    Jesper Dangaard Brouer, Linux Kernel Mailing List, netdev,
    Thorsten Leemhuis
Subject: Re: BUG: unable to handle kernel paging request in bpf_dispatcher_xdp
References: <5c9d77bf-75f5-954a-c691-39869bb22127@meta.com>
    <96b0d9d8-02a7-ce70-de1e-b275a01f5ff3@iogearbox.net>
    <20221209153445.22182ca5@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Dec 10, 2022 at 02:11:34PM +0100, Jiri Olsa wrote:
> On Fri, Dec 09, 2022 at 05:12:03PM -0800, Alexei Starovoitov wrote:
> > On Fri, Dec 9, 2022 at 4:06 PM Jiri Olsa wrote:
> > >
> > > On Fri, Dec 09, 2022 at 03:34:45PM -0800, Jakub Kicinski wrote:
> > > > On Sat, 10 Dec 2022 00:32:07 +0100 Daniel Borkmann wrote:
> > > > > fwiw, these should not be necessary, Documentation/RCU/checklist.rst :
> > > > >
> > > > >    [...] One example of non-obvious pairing is the XDP feature in networking,
> > > > >    which calls BPF programs from network-driver NAPI (softirq) context. BPF
> > > > >    relies heavily on RCU protection for its data structures, but because the
> > > > >    BPF program invocation happens entirely within a single local_bh_disable()
> > > > >    section in a NAPI poll cycle, this usage is safe. The reason that this usage
> > > > >    is safe is that readers can use anything that disables BH when updaters use
> > > > >    call_rcu() or synchronize_rcu(). [...]
> > > >
> > > > FWIW I sent a link to the thread to Paul and he confirmed
> > > > the RCU will wait for just the BH.
> > >
> > > so IIUC we can omit the rcu_read_lock/unlock on bpf_prog_run_xdp side
> > >
> > > Paul,
> > > any thoughts on what we can use in here to synchronize bpf_dispatcher_change_prog
> > > with bpf_prog_run_xdp callers?
> > >
> > > with synchronize_rcu_tasks I'm getting splats like:
> > >   https://lore.kernel.org/bpf/20221209153445.22182ca5@kernel.org/T/#m0a869f93404a2744884d922bc96d497ffe8f579f
> > >
> > > synchronize_rcu_tasks_rude seems to work (patch below), but it also sounds special ;-)
> >
> > Jiri,
> >
> > I haven't tried to repro this yet, but I feel you're on
> > the wrong path here. The splat has this:
> > ? bpf_prog_run_xdp include/linux/filter.h:775 [inline]
> > ? bpf_test_run+0x2ce/0x990 net/bpf/test_run.c:400
> > that test_run logic takes rcu_read_lock.
> > See bpf_test_timer_enter.
> > I suspect the addition of synchronize_rcu_tasks_rude
> > only slows down the race.
> > The synchronize_rcu_tasks_trace also behaves like synchronize_rcu.
> > See our new and fancy rcu_trace_implies_rcu_gp(),
> > but I'm not sure it applies to synchronize_rcu_tasks_rude.
> > Have you tried with just synchronize_rcu() ?
> > If your theory about the race is correct then
> > the vanila sync_rcu should help.
> > If not, the issue is some place else.
>
> synchronize_rcu seems to work as well, I'll keep the test
> running for some time

looks good, Hao Sun, could you please test change below?

thanks,
jirka


---
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index c19719f48ce0..4b0fa5b98137 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -124,6 +124,7 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
 	}

 	__BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func);
+	synchronize_rcu();

 	if (new)
 		d->image_off = noff;
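
For anyone following along, here is a rough userspace sketch of the ordering the change above relies on: publish the new target first, then wait until every reader that may still hold the old target has left its read-side section before touching anything the old target depends on. This is not kernel code and not real RCU. All toy_* names, the reader counter and the two stand-in programs are made up for illustration, and a simple "reader count drained" check stands in for an actual grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef int (*prog_fn)(int);

/* Hypothetical stand-ins for the nop target and a newly attached program. */
int nop_prog(int ctx)    { return ctx; }
int double_prog(int ctx) { return 2 * ctx; }

/* Currently published target, loosely analogous to the dispatcher target. */
static _Atomic(prog_fn) current_prog = nop_prog;

/* Toy read-side bookkeeping; real RCU does not count readers like this. */
static atomic_int active_readers;

void toy_read_lock(void)
{
	atomic_fetch_add(&active_readers, 1);
}

void toy_read_unlock(void)
{
	int prev = atomic_fetch_sub(&active_readers, 1);

	assert(prev > 0); /* unlock without a matching lock is a bug */
	(void)prev;
}

/* Publish a new target and hand back the old one; the old one must not be
 * reused or freed until toy_readers_drained() reports true. */
prog_fn toy_publish(prog_fn new_prog)
{
	return atomic_exchange(&current_prog, new_prog);
}

/* Toy replacement for waiting on a grace period: true once no reader is
 * inside a read-side section. A blocking updater would loop on this. */
bool toy_readers_drained(void)
{
	return atomic_load(&active_readers) == 0;
}

/* Reader path: load the target and call it inside the read-side section. */
int toy_run(int ctx)
{
	toy_read_lock();
	prog_fn fn = atomic_load(&current_prog);
	int ret = fn(ctx);
	toy_read_unlock();
	return ret;
}
```

The point of the sketch is only the ordering: a reader that loaded the old pointer before toy_publish() can still call it afterwards, so the updater has to wait for that reader to finish before moving on.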