From: Ivan Babrou
Date: Tue, 18 Jul 2023 15:10:09 -0700
Subject: Re: [RFC PATCH net-next] tcp: add a tracepoint for tcp_listen_queue_drop
To: David Ahern
Cc: Steven Rostedt, Jakub Kicinski, Yan Zhai, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, kernel-team@cloudflare.com, Eric Dumazet,
    "David S. Miller", Paolo Abeni, Masami Hiramatsu
References: <20230711043453.64095-1-ivan@cloudflare.com>
    <20230711193612.22c9bc04@kernel.org>
    <20230712104210.3b86b779@kernel.org>
    <3cab5936-c696-157f-f3a6-eba8b26df32d@kernel.org>
In-Reply-To: <3cab5936-c696-157f-f3a6-eba8b26df32d@kernel.org>

On Fri, Jul 14, 2023 at 6:30 PM David Ahern wrote:
>
> On 7/14/23 5:38 PM, Ivan Babrou wrote:
> > On Fri, Jul 14, 2023 at 8:09 AM David Ahern wrote:
> >>> We can start a separate discussion to break it down by category if it
> >>> would help. Let me know what kind of information you would like us to
> >>> provide to help with that. I assume you're interested in kernel stacks
> >>> leading to kfree_skb with NOT_SPECIFIED reason, but maybe there's
> >>> something else.
> >>
> >> stack traces would be helpful.
> >
> > Here you go: https://lore.kernel.org/netdev/CABWYdi00L+O30Q=Zah28QwZ_5RU-xcxLFUK2Zj08A8MrLk9jzg@mail.gmail.com/
> >
> >>> Even if I was only interested in one specific reason, I would still
> >>> have to arm the whole tracepoint and route a ton of skbs I'm not
> >>> interested in into my bpf code. This seems like a lot of overhead,
> >>> especially if I'm dropping some attack packets.
> >>
> >> you can add a filter on the tracepoint event to limit what is passed
> >> (although I have not tried the filter with an ebpf program - e.g.,
> >> reason != NOT_SPECIFIED).
> >
> > Absolutely, but isn't there overhead to even do just that for every freed skb?
>
> There is some amount of overhead. If filters can be used with ebpf
> programs, then the differential cost is just the cycles for the filter
> which in this case is an integer compare. Should be low - maybe Steven
> has some data on the overhead?

I updated my benchmarks and added two dimensions:

* Empty probe that just returns immediately (simple and complex map
  increments were already there)
* Tracepoint probe (fprobe and kprobe were already there)

The results are here:

* https://github.com/cloudflare/ebpf_exporter/tree/master/benchmark

It looks like we can expect an empty tracepoint probe to finish in
~15ns. At least that's what I see on my M1 laptop in a VM running
v6.5-rc1.

15ns x 400k calls/s = 6ms/s or 0.6% of a single CPU if all you do is
nothing (which is the likely outcome) for kfree_skb tracepoint.

I guess it's not as terrible as I expected, which is good news.

> >>> If you have an ebpf example that would help me extract the destination
> >>> port from an skb in kfree_skb, I'd be interested in taking a look and
> >>> trying to make it work.
> >>
> >> This is from 2020 and I forget which kernel version (pre-BTF), but it
> >> worked at that time and allowed userspace to summarize drop reasons by
> >> various network data (mac, L3 address, n-tuple, etc):
> >>
> >> https://github.com/dsahern/bpf-progs/blob/master/ksrc/pktdrop.c
> >
> > It doesn't seem to extract the L4 metadata (local port specifically),
> > which is what I'm after.
>
> This program takes the path of copy headers to userspace and does the
> parsing there (there is a netmon program that uses that ebpf program
> which shows drops for varying perspectives). You can just as easily do
> the parsing in ebpf. Once you have the start of packet data, walk the
> protocols of interest -- e.g., see parse_pkt in flow.h.

I see, thanks. I want to do all the aggregation in the kernel, so I
took a stab at that. With a lot of trial and error I was able to come
up with the following:

* https://github.com/cloudflare/ebpf_exporter/pull/235

Some points from my experience doing that:

* It was a lot harder to get it working than the tracepoint I proposed.
  There are few examples of parsing skb in the kernel in bpf and none
  do what I wanted to do.
* It is unclear whether this would work with vlan or other
  encapsulation. Your code has special handling for vlan. As a user I
  just want the L4 port.
* There's a lot more boilerplate code to get to L4 info. Accessing sk
  is a lot easier.
* It's not very useful without adding the reasons that would correspond
  to listen drops in TCP. I'm not sure if I'm up to the task, but I can
  give it a shot.
* It's unclear how to detect which end of the socket is bound locally.
  I want to know which locally listened-on ports are experiencing
  issues, ignoring sockets that connect elsewhere. E.g. I care about my
  http server dropping packets, but not as much about curl doing the
  same.
* UDP drops seem to work okay in my local testing: I can see
  SKB_DROP_REASON_SOCKET_RCVBUFF by port.

As a reminder, the code for my tracepoint is here:

* https://github.com/cloudflare/ebpf_exporter/pull/221

It's a lot simpler. I still feel that it's justified to exist.

Hope this helps.
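
To illustrate the boilerplate mentioned in the list above, here is a
rough sketch of the "walk the protocols of interest" approach in plain
userspace C. It is only an illustration under stated assumptions:
l4_dest_port and read_be16 are hypothetical names, this is not the code
from either pull request, and a bpf program would have to replace the
direct buffer reads with bpf_probe_read_kernel-style helpers and bounds
checks the verifier accepts. It handles up to two VLAN tags and skips
IPv6 extension headers and tunnels entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a big-endian 16-bit value from a buffer. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walk Ethernet -> optional VLAN tags -> IPv4/IPv6 -> TCP/UDP and return
 * the L4 destination port, or -1 if the packet is truncated or uses
 * something this sketch does not handle. */
static int l4_dest_port(const uint8_t *pkt, size_t len)
{
    size_t off = 12;                /* skip destination and source MAC */
    uint16_t ethertype;
    uint8_t l4_proto;

    if (len < off + 2)
        return -1;
    ethertype = read_be16(pkt + off);
    off += 2;

    /* Skip up to two VLAN tags (802.1Q 0x8100, 802.1ad 0x88a8). Each tag
     * is 2 bytes of TCI followed by the next ethertype. */
    for (int i = 0; i < 2 && (ethertype == 0x8100 || ethertype == 0x88a8); i++) {
        if (len < off + 4)
            return -1;
        ethertype = read_be16(pkt + off + 2);
        off += 4;
    }

    if (ethertype == 0x0800) {              /* IPv4 */
        if (len < off + 20)
            return -1;
        size_t ihl = (size_t)(pkt[off] & 0x0f) * 4;
        if (ihl < 20 || len < off + ihl)
            return -1;
        /* Non-first fragments carry no L4 header. */
        if (read_be16(pkt + off + 6) & 0x1fff)
            return -1;
        l4_proto = pkt[off + 9];
        off += ihl;
    } else if (ethertype == 0x86dd) {       /* IPv6, fixed header only */
        if (len < off + 40)
            return -1;
        l4_proto = pkt[off + 6];
        off += 40;
    } else {
        return -1;
    }

    if (l4_proto != 6 && l4_proto != 17)    /* TCP or UDP only */
        return -1;
    if (len < off + 4)
        return -1;
    return read_be16(pkt + off + 2);        /* dest port follows source port */
}
```

Even this reduced version needs separate handling for every layer,
while the destination port alone says nothing about which end of the
connection is the local listener, which is the information the sk-based
tracepoint gives directly.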