Received: by 2002:a05:6602:2086:0:0:0:0 with SMTP id a6csp4224431ioa; Tue, 26 Apr 2022 21:30:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx7oSJeJrijsmfh8TMJ5+e4IDCkMeHsodj/zjMKAaPWA8qzaOPwfoKinJWsaAooHWeEw4OQ X-Received: by 2002:a17:907:7b8c:b0:6f3:a7d8:2609 with SMTP id ne12-20020a1709077b8c00b006f3a7d82609mr8516380ejc.53.1651033809321; Tue, 26 Apr 2022 21:30:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1651033809; cv=none; d=google.com; s=arc-20160816; b=hytPrqUrNkfD38Gv5WnFBf3E+1+d4T5YEjhFwgx899z790ktHAzh4dIKgdoj91z8lK djrL4ByocMrUB8tD/zUEcRXyjFJVuGDi5pMDbnN8+l9ch9XSYPBLqThUNIXUHvzK5u4T s2P+7lsJx4No4SPd4KPfzm7v7Ad3Yf+X8OdkaV+AVk/fhtrIpt+/Q2Z7Ay8icqSmWhmo 9y7aIK0QMaXUz2k3yxva8RdUWI1id9f8b0eckZ934DUg8b1F+8jBfp3hyz6v6WsxsUc4 gdG2CkxPufDZy8eb5Cf0NzHjRUG+HqcD2aIdgiopWKVWRayDqBcjt21bknuNy2AfA27q Ft2A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=7Y1qcKhmhihjquPkNmPpQlcIRxQVN9xLFKDb4pHIlpg=; b=I5PeaN5ifXjD1cRbPpVHepbJoAP7Ouy/7mHuwMSskWpbtQAWkwJ/C1FonRq5VNqo0A x2wDaKv8wx6g1HeER7CTqGn5GwW82qf7PXKS1Pr8fY2vxeT51zW7yp4N1lwWeWe6Zs3p ggE1T5AavxYjEgTR17XD8aAJDz8Wm7EbbzK6Bpj1wPfnc8qPhiohn6YgLz6aEjOVOj7P qkkcv8TZZzm8w+udDQ3DL21wUuBWRYAn9t/ueCcAsihnywYTARusZnlwZX+akrYZ1p3/ rksgU3/uMZKCIenTL0Xwb0dq9vMUu7skkNPLQZBu+Ooty9R1IgkUm7cma5m7Q6JquNli /Fkw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b="JnSnF/ra"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hr4-20020a1709073f8400b006df76385e8fsi221093ejc.815.2022.04.26.21.29.45; Tue, 26 Apr 2022 21:30:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b="JnSnF/ra"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236434AbiDZNfz (ORCPT + 99 others); Tue, 26 Apr 2022 09:35:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44522 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244037AbiDZNfx (ORCPT ); Tue, 26 Apr 2022 09:35:53 -0400 Received: from mail-yw1-x112a.google.com (mail-yw1-x112a.google.com [IPv6:2607:f8b0:4864:20::112a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D071431369 for ; Tue, 26 Apr 2022 06:32:43 -0700 (PDT) Received: by mail-yw1-x112a.google.com with SMTP id 00721157ae682-2ef5380669cso182122787b3.9 for ; Tue, 26 Apr 2022 06:32:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=7Y1qcKhmhihjquPkNmPpQlcIRxQVN9xLFKDb4pHIlpg=; b=JnSnF/ra24mcHx7zxUAEG+TCXf4om2hIoj6vPbRiIiyMS5oTcm3ivWDHPLVC/KiqaG RdrWuhTQjNr8miBR5XnGFIhNvoWHC3hbWJ1f67oiMO23J573HjzpJyWgnzsQt11QB93s fhtF1YtrOkqGzQpym/I5gPlFyvuQR7/eOKfHUalFdIlPcCyx8aCMdUwi58fSMjW3MwTF 2dPMvryNM0cXiEnbIwLDqzvTOE60VoEV6FXY6TfyPWNYahEYxgy23qRwV/3FoxGqlCJY //gJ20ZJauuCUv8OJCmOK4geHbteXfaTDbMV25OCPgYRN5XDAKyKbdPMcrl2TU3Lrb6L LDbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=7Y1qcKhmhihjquPkNmPpQlcIRxQVN9xLFKDb4pHIlpg=; b=t4x0Lzlm5SAYZmBGOSBHYhhfW4z//3+OWwrpsjwwwNjBrMSzW7HR7AgKokvPiwFFxp 8xsb6xEOAbcbtb4YP/t+mefcEa54ahLodIzxD3y8PCTVHQ18oXhPJv26stL5RZKXA4+y jVU+Ldw4Opg2R28tHmVgxg73xYqMVkAichcMlz7+VhwQxewqt1VS4Rf7IAh7HOes6TOe tmx2m/EkONbSq6m7gEBcszMyagQL+59Z8FPtegoqh/fSSY0DXcTZtTAUL5LotW91CxuV zg8+cepIz5WdfmoF9aJ0xciLPX1vhskOWK7cuYWv6yxqUK4BgPjbF/ETAjr6M7mJKSxe R4Jg== X-Gm-Message-State: AOAM530PxXHcKv06DhgOPTEhYq7iMkiphDsVArofwIWCtxoHKRLY9HzZ tywAiuXanPUXnVI+zCaDNKFnzNl7Fh3Qtcu5O1EKFA== X-Received: by 2002:a81:89c3:0:b0:2f3:227:d5af with SMTP id z186-20020a8189c3000000b002f30227d5afmr21879072ywf.47.1650979962669; Tue, 26 Apr 2022 06:32:42 -0700 (PDT) MIME-Version: 1.0 References: <20220426080709.6504-1-imagedong@tencent.com> <20220426080709.6504-2-imagedong@tencent.com> In-Reply-To: <20220426080709.6504-2-imagedong@tencent.com> From: Eric Dumazet Date: Tue, 26 Apr 2022 06:32:31 -0700 Message-ID: Subject: Re: [PATCH net-next 1/2] net: add skb drop reasons to inet connect request To: Menglong Dong Cc: Jakub Kicinski , Steven Rostedt , Ingo Molnar , David Miller , Hideaki YOSHIFUJI , David Ahern , Paolo Abeni , benbjiang@tencent.com, flyingpeng@tencent.com, Menglong Dong , Martin KaFai Lau , Talal Ahmad , Kees Cook , mengensun@tencent.com, Dongli Zhang , LKML , netdev Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 26, 2022 at 1:07 AM wrote: > > From: Menglong Dong > > The 'conn_request()' in struct inet_connection_sock_af_ops is used to > process connection requesting for TCP/DCCP. Take TCP for example, it > is just 'tcp_v4_conn_request()'. > > When non-zero value is returned by 'tcp_v4_conn_request()', the skb > will be freed by kfree_skb() and a 'reset' packet will be send. > Otherwise, it will be freed normally. > > In this code path, 'consume_skb()' is used in many abnormal cases, such > as the accept queue of the listen socket full, which should be > 'kfree_skb()'. > > Therefore, we make a little change to the 'conn_request()' interface. > When 0 is returned, we call 'consume_skb()' as usual; when negative is > returned, we call 'kfree_skb()' and send a 'reset' as usual; when > positive is returned, which has not happened yet, we do nothing, and > skb will be freed in 'conn_request()'. Then, we can use drop reasons > in 'conn_request()'. > > Following new drop reasons are added: > > SKB_DROP_REASON_LISTENOVERFLOWS > SKB_DROP_REASON_TCP_REQQFULLDROP > > Reviewed-by: Jiang Biao > Reviewed-by: Hao Peng > Signed-off-by: Menglong Dong > --- > include/linux/skbuff.h | 4 ++++ > include/trace/events/skb.h | 2 ++ > net/dccp/input.c | 12 +++++------- > net/ipv4/tcp_input.c | 21 +++++++++++++-------- > net/ipv4/tcp_ipv4.c | 3 ++- > 5 files changed, 26 insertions(+), 16 deletions(-) > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 84d78df60453..f33b3636bbce 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -469,6 +469,10 @@ enum skb_drop_reason { > SKB_DROP_REASON_PKT_TOO_BIG, /* packet size is too big (maybe exceed > * the MTU) > */ > + SKB_DROP_REASON_LISTENOVERFLOWS, /* accept queue of the listen socket is full */ > + SKB_DROP_REASON_TCP_REQQFULLDROP, /* request queue of the listen > + * socket is full > + */ > SKB_DROP_REASON_MAX, > }; > > diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h > index a477bf907498..de6c93670437 100644 > --- a/include/trace/events/skb.h > +++ b/include/trace/events/skb.h > @@ -80,6 +80,8 @@ > EM(SKB_DROP_REASON_IP_INADDRERRORS, IP_INADDRERRORS) \ > EM(SKB_DROP_REASON_IP_INNOROUTES, IP_INNOROUTES) \ > EM(SKB_DROP_REASON_PKT_TOO_BIG, PKT_TOO_BIG) \ > + EM(SKB_DROP_REASON_LISTENOVERFLOWS, LISTENOVERFLOWS) \ > + EM(SKB_DROP_REASON_TCP_REQQFULLDROP, TCP_REQQFULLDROP) \ > EMe(SKB_DROP_REASON_MAX, MAX) > > #undef EM > diff --git a/net/dccp/input.c b/net/dccp/input.c > index 2cbb757a894f..ed20dfe83f66 100644 > --- a/net/dccp/input.c > +++ b/net/dccp/input.c > @@ -574,8 +574,7 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb, > struct dccp_sock *dp = dccp_sk(sk); > struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb); > const int old_state = sk->sk_state; > - bool acceptable; > - int queued = 0; > + int err, queued = 0; > > /* > * Step 3: Process LISTEN state > @@ -606,13 +605,12 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb, > */ > rcu_read_lock(); > local_bh_disable(); > - acceptable = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb) >= 0; > + err = inet_csk(sk)->icsk_af_ops->conn_request(sk, skb); > local_bh_enable(); > rcu_read_unlock(); > - if (!acceptable) > - return 1; > - consume_skb(skb); > - return 0; > + if (!err) > + consume_skb(skb); > + return err < 0; > } > if (dh->dccph_type == DCCP_PKT_RESET) > goto discard; > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index daff631b9486..e0bbbd624246 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -6411,7 +6411,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) > struct inet_connection_sock *icsk = inet_csk(sk); > const struct tcphdr *th = tcp_hdr(skb); > struct request_sock *req; > - int queued = 0; > + int err, queued = 0; > bool acceptable; > SKB_DR(reason); > > @@ -6438,14 +6438,13 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) > */ > rcu_read_lock(); > local_bh_disable(); > - acceptable = icsk->icsk_af_ops->conn_request(sk, skb) >= 0; > + err = icsk->icsk_af_ops->conn_request(sk, skb); > local_bh_enable(); > rcu_read_unlock(); > > - if (!acceptable) > - return 1; > - consume_skb(skb); > - return 0; > + if (!err) > + consume_skb(skb); Please, do not add more mess like that, where skb is either freed by the callee or the caller. > + return err < 0; Where err is set to a negative value ? > } > SKB_DR_SET(reason, TCP_FLAGS); > goto discard; > @@ -6878,6 +6877,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, > bool want_cookie = false; > struct dst_entry *dst; > struct flowi fl; > + SKB_DR(reason); > > /* TW buckets are converted to open requests without > * limitations, they conserve resources and peer is > @@ -6886,12 +6886,15 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, > if ((net->ipv4.sysctl_tcp_syncookies == 2 || > inet_csk_reqsk_queue_is_full(sk)) && !isn) { > want_cookie = tcp_syn_flood_action(sk, rsk_ops->slab_name); > - if (!want_cookie) > + if (!want_cookie) { > + SKB_DR_SET(reason, TCP_REQQFULLDROP); > goto drop; > + } > } > > if (sk_acceptq_is_full(sk)) { > NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS); > + SKB_DR_SET(reason, LISTENOVERFLOWS); > goto drop; > } > > @@ -6947,6 +6950,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, > */ > pr_drop_req(req, ntohs(tcp_hdr(skb)->source), > rsk_ops->family); > + SKB_DR_SET(reason, TCP_REQQFULLDROP); > goto drop_and_release; > } > > @@ -7006,7 +7010,8 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, > drop_and_free: > __reqsk_free(req); > drop: > + kfree_skb_reason(skb, reason); Ugh no, prefer "return reason" and leave to the caller the freeing part. Your changes are too invasive and will hurt future backports. > tcp_listendrop(sk); > - return 0; > + return 1; > } > EXPORT_SYMBOL(tcp_conn_request); > diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c > index 157265aecbed..b8daf49f54a5 100644 > --- a/net/ipv4/tcp_ipv4.c > +++ b/net/ipv4/tcp_ipv4.c > @@ -1470,7 +1470,8 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb) > > drop: > tcp_listendrop(sk); > - return 0; This return 0 meant : do not send reset. > + kfree_skb_reason(skb, SKB_DROP_REASON_IP_INADDRERRORS); double kfree_skb() ? > + return 1; -> send RESET > } > EXPORT_SYMBOL(tcp_v4_conn_request); > > -- > 2.36.0 > I have a hard time understanding this patch. Where is the related IPv6 change ? I really wonder if you actually have tested this.