From: Toke Høiland-Jørgensen
To: Sebastian Andrzej Siewior
Cc: Alexei Starovoitov, LKML, Network Development, "David S. Miller",
    Boqun Feng, Daniel Borkmann, Eric Dumazet, Frederic Weisbecker,
    Ingo Molnar, Jakub Kicinski, Paolo Abeni, Peter Zijlstra,
    Thomas Gleixner, Waiman Long, Will Deacon, Alexei Starovoitov,
    Andrii Nakryiko, Cong Wang, Hao Luo, Jamal Hadi Salim,
    Jesper Dangaard Brouer, Jiri Olsa, Jiri Pirko, John Fastabend,
    KP Singh, Martin KaFai Lau, Ronak Doshi, Song Liu,
    Stanislav Fomichev, VMware PV-Drivers Reviewers, Yonghong Song, bpf
Subject: Re: [PATCH net-next 15/24] net: Use nested-BH locking for XDP redirect.
In-Reply-To: <20240118073540.GIobmYpD@linutronix.de>
References: <20231215171020.687342-1-bigeasy@linutronix.de>
    <20231215171020.687342-16-bigeasy@linutronix.de>
    <87r0iw524h.fsf@toke.dk>
    <20240112174138.tMmUs11o@linutronix.de>
    <87ttnb6hme.fsf@toke.dk>
    <20240118073540.GIobmYpD@linutronix.de>
Date: Thu, 18 Jan 2024 12:58:07 +0100
Message-ID: <878r4m6egg.fsf@toke.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

Sebastian Andrzej Siewior writes:

> On 2024-01-17 17:37:29 [+0100], Toke Høiland-Jørgensen wrote:
>> This is all back-of-the-envelope calculations, of course. Having some
>> actual numbers to look at would be great; I don't suppose you have a
>> setup where you can run xdp-bench and see how your patches affect the
>> throughput?
>
> No but I probably could set it up.

That would be great!
Feel free to ping me if you need any pointers to how we usually do
the perf measurements :)

>> I chatted with Jesper about this, and he had an idea not too far from
>> this: split up the XDP and regular stack processing in two stages, each
>> with their individual batching. So whereas right now we're doing
>> something like:
>>
>> run_napi()
>>   bh_disable()
>>   for pkt in budget:
>>     act = run_xdp(pkt)
>>     if (act == XDP_PASS)
>>       run_netstack(pkt)  // this is the expensive bit
>>   bh_enable()
>>
>> We could instead do:
>>
>> run_napi()
>>   bh_disable()
>>   for pkt in budget:
>>     act = run_xdp(pkt)
>>     if (act == XDP_PASS)
>>       add_to_list(pkt, to_stack_list)
>>   bh_enable()
>>   // sched point
>>   bh_disable()
>>   for pkt in to_stack_list:
>>     run_netstack(pkt)
>>   bh_enable()
>>
>>
>> This would limit the batching that blocks everything to only the XDP
>> processing itself, which should limit the maximum time spent in the
>> blocking state significantly compared to what we have today. The caveat
>> being that rearranging things like this is potentially a pretty major
>> refactoring task that needs to touch all the drivers (even if some of
>> the logic can be moved into the core code in the process). So not really
>> sure if this approach is feasible, TBH.
>
> This does not work because bh_disable() does not disable scheduling.
> Scheduling may happen. bh_disable() acquires a lock which is currently
> the only synchronisation point between two say network driver doing
> NAPI. And this what I want to get rid of.
> Regarding expensive bit as in XDP_PASS: This doesn't need locking as per
> proposal, just the REDIRECT piece.

Right, well s/bh_disable()/lock()/; my main point was splitting up the
processing so that the XDP processing itself and the stack activation on
XDP_PASS are not interleaved. This will make it possible to hold the lock
around the whole XDP batch, not just individual packets, and so retain
the performance we gain from amortising expensive operations over
multiple packets.

-Toke
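
To make the amortisation argument a bit more concrete, here is a toy
user-space model of the two-stage idea, written in Python rather than
kernel C. Everything in it is an assumption for illustration only:
CountingLock, Action, napi_poll_two_stage() and napi_poll_interleaved()
are hypothetical names, run_xdp() returns a made-up verdict, and a
threading.Lock stands in for whatever lock would replace bh_disable().
It only shows that batching takes the lock once per batch instead of
once per packet while delivering the same packets to the stack.

```python
import threading
from enum import Enum, auto


class Action(Enum):
    """Toy stand-ins for XDP verdicts; only the two needed here."""
    PASS = auto()
    REDIRECT = auto()


class CountingLock:
    """Wrapper around threading.Lock that counts acquisitions."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


def run_xdp(pkt):
    # Hypothetical verdict: even packet ids pass to the stack, odd ones redirect.
    return Action.PASS if pkt % 2 == 0 else Action.REDIRECT


def napi_poll_two_stage(budget, xdp_lock, run_netstack):
    """Stage 1: XDP for the whole batch under one lock. Stage 2: stack, unlocked."""
    to_stack_list = []
    redirected = []
    with xdp_lock:  # held once around the whole XDP batch
        for pkt in budget:
            if run_xdp(pkt) is Action.PASS:
                to_stack_list.append(pkt)
            else:
                redirected.append(pkt)
    # Lock released here; this is where other work could be scheduled.
    for pkt in to_stack_list:
        run_netstack(pkt)
    return redirected


def napi_poll_interleaved(budget, xdp_lock, run_netstack):
    """Comparison case: lock taken per packet, stack work interleaved."""
    redirected = []
    for pkt in budget:
        with xdp_lock:
            act = run_xdp(pkt)
        if act is Action.PASS:
            run_netstack(pkt)
        else:
            redirected.append(pkt)
    return redirected
```

Calling both functions with range(8) as the budget and a list's append
method as run_netstack hands the same packets to the stack in both
cases, but the two-stage variant acquires the lock once where the
interleaved variant acquires it eight times. This says nothing about
real throughput; that still needs xdp-bench numbers.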