From: Pavel Begunkov
To: io-uring@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: "David S . Miller", Jakub Kicinski, Jonathan Lemon, Willem de Bruijn, Jens Axboe, kernel-team@fb.com, Pavel Begunkov
Subject: [RFC net-next v3 11/29] tcp: support zc with managed data
Date: Tue, 28 Jun 2022 19:56:33 +0100
Message-Id: <2d0c627c125cf1019096e1db04264e1cb6149dec.1653992701.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.36.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Also make tcp use managed data and propagate SKBFL_MANAGED_FRAG_REFS
to optimise frag page referencing.

Signed-off-by: Pavel Begunkov
---
 net/ipv4/tcp.c | 51 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 34 insertions(+), 17 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9984d23a7f3e..832c1afcdbe7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1202,17 +1202,23 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	flags = msg->msg_flags;
 
-	if (flags & MSG_ZEROCOPY && size && sock_flag(sk, SOCK_ZEROCOPY)) {
+	if ((flags & MSG_ZEROCOPY) && size) {
 		skb = tcp_write_queue_tail(sk);
-		uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
-		if (!uarg) {
-			err = -ENOBUFS;
-			goto out_err;
-		}
 
-		zc = sk->sk_route_caps & NETIF_F_SG;
-		if (!zc)
-			uarg->zerocopy = 0;
+		if (msg->msg_ubuf) {
+			uarg = msg->msg_ubuf;
+			net_zcopy_get(uarg);
+			zc = sk->sk_route_caps & NETIF_F_SG;
+		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
+			uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
+			if (!uarg) {
+				err = -ENOBUFS;
+				goto out_err;
+			}
+			zc = sk->sk_route_caps & NETIF_F_SG;
+			if (!zc)
+				uarg->zerocopy = 0;
+		}
 	}
 
 	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
@@ -1335,8 +1341,13 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 			copy = min_t(int, copy, pfrag->size - pfrag->offset);
 
-			if (tcp_downgrade_zcopy_pure(sk, skb) ||
-			    !sk_wmem_schedule(sk, copy))
+			if (unlikely(skb_zcopy_pure(skb) || skb_zcopy_managed(skb))) {
+				if (tcp_downgrade_zcopy_pure(sk, skb))
+					goto wait_for_space;
+				skb_zcopy_downgrade_managed(skb);
+			}
+
+			if (!sk_wmem_schedule(sk, copy))
 				goto wait_for_space;
 
 			err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
@@ -1357,14 +1368,20 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 			pfrag->offset += copy;
 		} else {
 			/* First append to a fragless skb builds initial
-			 * pure zerocopy skb
+			 * zerocopy skb
 			 */
-			if (!skb->len)
+			if (!skb->len) {
+				if (msg->msg_managed_data)
+					skb_shinfo(skb)->flags |= SKBFL_MANAGED_FRAG_REFS;
 				skb_shinfo(skb)->flags |= SKBFL_PURE_ZEROCOPY;
-
-			if (!skb_zcopy_pure(skb)) {
-				if (!sk_wmem_schedule(sk, copy))
-					goto wait_for_space;
+			} else {
+				/* appending, don't mix managed and unmanaged */
+				if (!msg->msg_managed_data)
+					skb_zcopy_downgrade_managed(skb);
+				if (!skb_zcopy_pure(skb)) {
+					if (!sk_wmem_schedule(sk, copy))
+						goto wait_for_space;
+				}
 			}
 
 			err = skb_zerocopy_iter_stream(sk, skb, msg, copy, uarg);
-- 
2.36.1
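
For readers following the flag handling in the last hunk, here is a minimal userspace sketch of the "don't mix managed and unmanaged" rule. It is a toy model, not kernel code: struct toy_skb, the TOY_* bits and both functions are hypothetical stand-ins, and it assumes that skb_zcopy_downgrade_managed() clears SKBFL_MANAGED_FRAG_REFS and takes a reference on every existing frag, which this patch itself does not show. Memory accounting (sk_wmem_schedule()) is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for SKBFL_PURE_ZEROCOPY and SKBFL_MANAGED_FRAG_REFS.
 * The bit values are arbitrary and are not the kernel's.
 */
#define TOY_PURE_ZEROCOPY	(1u << 0)
#define TOY_MANAGED_FRAG_REFS	(1u << 1)
#define TOY_MAX_FRAGS		4

struct toy_skb {
	unsigned int flags;
	size_t len;
	size_t nr_frags;
	int frag_refs[TOY_MAX_FRAGS];	/* page refs held by the skb itself */
};

/* Assumed effect of skb_zcopy_downgrade_managed(): drop the managed
 * flag and let the skb take its own reference on every frag.
 */
static void toy_downgrade_managed(struct toy_skb *skb)
{
	size_t i;

	if (!(skb->flags & TOY_MANAGED_FRAG_REFS))
		return;

	skb->flags &= ~TOY_MANAGED_FRAG_REFS;
	for (i = 0; i < skb->nr_frags; i++)
		skb->frag_refs[i]++;
}

/* Mirrors the zerocopy branch of the last hunk: the first append picks
 * the mode, later unmanaged appends force a downgrade.
 */
static bool toy_zc_append(struct toy_skb *skb, bool managed_data, size_t copy)
{
	if (skb->nr_frags == TOY_MAX_FRAGS)
		return false;

	if (!skb->len) {
		if (managed_data)
			skb->flags |= TOY_MANAGED_FRAG_REFS;
		skb->flags |= TOY_PURE_ZEROCOPY;
	} else if (!managed_data) {
		/* appending, don't mix managed and unmanaged */
		toy_downgrade_managed(skb);
	}

	/* A managed skb holds no per-frag reference; otherwise it holds one. */
	skb->frag_refs[skb->nr_frags++] =
		(skb->flags & TOY_MANAGED_FRAG_REFS) ? 0 : 1;
	skb->len += copy;
	return true;
}
```

In this model, calling toy_zc_append(&skb, true, 100) and then toy_zc_append(&skb, false, 50) on a zeroed toy_skb leaves the managed bit cleared and both frags holding one reference each, which is the state the downgrade is meant to restore before unmanaged data is mixed in.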