Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley
    Place, 107-111 Peascod Street, Windsor, Berkshire, SL4 1TE, United
    Kingdom. Registered in England and Wales under Company Registration
    No. 3798903
From: David Howells <dhowells@redhat.com>
To: Chuck Lever III
Cc: dhowells@redhat.com, Matthew Wilcox, "David S. Miller", Eric Dumazet,
    Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig, Jens Axboe,
    Jeff Layton, Christian Brauner, Linus Torvalds,
    "open list:NETWORKING [GENERAL]", linux-fsdevel,
    Linux Kernel Mailing List, Linux Memory Management List,
    Trond Myklebust, Anna Schumaker, Linux NFS Mailing List
Subject: Re: [RFC PATCH v2 40/48] sunrpc: Use sendmsg(MSG_SPLICE_PAGES)
    rather than sendpage
In-Reply-To: <3A132FA8-A764-416E-9753-08E368D6877A@oracle.com>
References: <3A132FA8-A764-416E-9753-08E368D6877A@oracle.com>
    <812034.1680181285@warthog.procyon.org.uk>
    <6F2985FF-2474-4F36-BD94-5F8E97E46AC2@oracle.com>
    <20230329141354.516864-1-dhowells@redhat.com>
    <20230329141354.516864-41-dhowells@redhat.com>
    <812755.1680182190@warthog.procyon.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 30 Mar 2023 15:26:59 +0100
Message-ID: <822317.1680186419@warthog.procyon.org.uk>
X-Mailing-List: linux-nfs@vger.kernel.org

Chuck Lever III wrote:

> Don't. Just change svc_tcp_send_kvec() to use sock_sendmsg, and
> leave the marker alone for now, please.

If you insist.  See attached.

David
---
sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

When transmitting data, call down into TCP using sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
performing sendpage calls to transmit header, data pages and trailer.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust
cc: Anna Schumaker
cc: Chuck Lever
cc: Jeff Layton
cc: "David S. Miller"
cc: Eric Dumazet
cc: Jakub Kicinski
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: linux-nfs@vger.kernel.org
cc: netdev@vger.kernel.org
---
 include/linux/sunrpc/svc.h | 11 +++++------
 net/sunrpc/svcsock.c       | 40 +++++++++++++---------------------------
 2 files changed, 18 insertions(+), 33 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 877891536c2f..456ae554aa11 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -161,16 +161,15 @@ static inline bool svc_put_not_last(struct svc_serv *serv)
 extern u32 svc_max_payload(const struct svc_rqst *rqstp);
 
 /*
- * RPC Requsts and replies are stored in one or more pages.
+ * RPC Requests and replies are stored in one or more pages.
  * We maintain an array of pages for each server thread.
  * Requests are copied into these pages as they arrive.  Remaining
  * pages are available to write the reply into.
  *
- * Pages are sent using ->sendpage so each server thread needs to
- * allocate more to replace those used in sending.  To help keep track
- * of these pages we have a receive list where all pages initialy live,
- * and a send list where pages are moved to when there are to be part
- * of a reply.
+ * Pages are sent using ->sendmsg with MSG_SPLICE_PAGES so each server thread
+ * needs to allocate more to replace those used in sending.  To help keep track
+ * of these pages we have a receive list where all pages initially live, and a
+ * send list where pages are moved to when they are to be part of a reply.
  *
  * We use xdr_buf for holding responses as it fits well with NFS
  * read responses (that have a header, and some data pages, and possibly
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 03a4f5615086..af146e053dfc 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1059,17 +1059,18 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
 	svc_xprt_received(rqstp->rq_xprt);
 	return 0;	/* record not complete */
 }
-
+
 static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec,
 			     int flags)
 {
-	return kernel_sendpage(sock, virt_to_page(vec->iov_base),
-			       offset_in_page(vec->iov_base),
-			       vec->iov_len, flags);
+	struct msghdr msg = { .msg_flags = MSG_SPLICE_PAGES | flags, };
+
+	iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 1, vec->iov_len);
+	return sock_sendmsg(sock, &msg);
 }
 
 /*
- * kernel_sendpage() is used exclusively to reduce the number of
+ * MSG_SPLICE_PAGES is used exclusively to reduce the number of
  * copy operations in this path. Therefore the caller must ensure
  * that the pages backing @xdr are unchanging.
  *
@@ -1109,28 +1110,13 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
 	if (ret != head->iov_len)
 		goto out;
 
-	if (xdr->page_len) {
-		unsigned int offset, len, remaining;
-		struct bio_vec *bvec;
-
-		bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT);
-		offset = offset_in_page(xdr->page_base);
-		remaining = xdr->page_len;
-		while (remaining > 0) {
-			len = min(remaining, bvec->bv_len - offset);
-			ret = kernel_sendpage(sock, bvec->bv_page,
-					      bvec->bv_offset + offset,
-					      len, 0);
-			if (ret < 0)
-				return ret;
-			*sentp += ret;
-			if (ret != len)
-				goto out;
-			remaining -= len;
-			offset = 0;
-			bvec++;
-		}
-	}
+	msg.msg_flags = MSG_SPLICE_PAGES;
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec,
+		      xdr_buf_pagecount(xdr), xdr->page_len);
+	ret = sock_sendmsg(sock, &msg);
+	if (ret < 0)
+		return ret;
+	*sentp += ret;
 
 	if (tail->iov_len) {
 		ret = svc_tcp_send_kvec(sock, tail, 0);