Message-ID: <93aba6cc363e94a6efe433b3c77ec1b6b54f2919.camel@redhat.com>
Subject: Re: [PATCH net-next v7 03/16] net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES
From: Paolo Abeni
To: David Howells, netdev@vger.kernel.org
Cc: "David S. Miller", Eric Dumazet, Jakub Kicinski, Willem de Bruijn,
    David Ahern, Matthew Wilcox, Al Viro, Christoph Hellwig, Jens Axboe,
    Jeff Layton, Christian Brauner, Chuck Lever III, Linus Torvalds,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Date: Thu, 18 May 2023 10:28:29 +0200
In-Reply-To: <20230515093345.396978-4-dhowells@redhat.com>
References: <20230515093345.396978-1-dhowells@redhat.com>
            <20230515093345.396978-4-dhowells@redhat.com>

On Mon, 2023-05-15 at 10:33 +0100, David Howells wrote:
> Add a function to handle MSG_SPLICE_PAGES being passed internally to
> sendmsg().  Pages are spliced into the given socket buffer if possible and
> copied in if not (e.g. they're slab pages or have a zero refcount).
> 
> Signed-off-by: David Howells
> cc: Eric Dumazet
> cc: "David S. Miller"
> cc: David Ahern
> cc: Jakub Kicinski
> cc: Paolo Abeni
> cc: Al Viro
> cc: Jens Axboe
> cc: Matthew Wilcox
> cc: netdev@vger.kernel.org
> ---
> 
> Notes:
>     ver #7)
>      - Export function.
>      - Never copy data, return -EIO if sendpage_ok() returns false.
> 
>  include/linux/skbuff.h |  3 ++
>  net/core/skbuff.c      | 95 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 98 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 4c0ad48e38ca..1c5f0ac6f8c3 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -5097,5 +5097,8 @@ static inline void skb_mark_for_recycle(struct sk_buff *skb)
>  #endif
>  }
>  
> +ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter,
> +                             ssize_t maxsize, gfp_t gfp);
> +
>  #endif /* __KERNEL__ */
>  #endif /* _LINUX_SKBUFF_H */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 7f53dcb26ad3..56d629ea2f3d 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -6892,3 +6892,98 @@ nodefer: __kfree_skb(skb);
>          if (unlikely(kick) && !cmpxchg(&sd->defer_ipi_scheduled, 0, 1))
>                  smp_call_function_single_async(cpu, &sd->defer_csd);
>  }
> +
> +static void skb_splice_csum_page(struct sk_buff *skb, struct page *page,
> +                                 size_t offset, size_t len)
> +{
> +        const char *kaddr;
> +        __wsum csum;
> +
> +        kaddr = kmap_local_page(page);
> +        csum = csum_partial(kaddr + offset, len, 0);
> +        kunmap_local(kaddr);
> +        skb->csum = csum_block_add(skb->csum, csum, skb->len);
> +}
> +
> +/**
> + * skb_splice_from_iter - Splice (or copy) pages to skbuff
> + * @skb: The buffer to add pages to
> + * @iter: Iterator representing the pages to be added
> + * @maxsize: Maximum amount of pages to be added
> + * @gfp: Allocation flags
> + *
> + * This is a common helper function for supporting MSG_SPLICE_PAGES.  It
> + * extracts pages from an iterator and adds them to the socket buffer if
> + * possible, copying them to fragments if not possible (such as if they're slab
> + * pages).
> + *
> + * Returns the amount of data spliced/copied or -EMSGSIZE if there's
> + * insufficient space in the buffer to transfer anything.
> + */
> +ssize_t skb_splice_from_iter(struct sk_buff *skb, struct iov_iter *iter,
> +                             ssize_t maxsize, gfp_t gfp)
> +{
> +        struct page *pages[8], **ppages = pages;
> +        unsigned int i;
> +        ssize_t spliced = 0, ret = 0;
> +        size_t frag_limit = READ_ONCE(sysctl_max_skb_frags);

Minor nit: please respect the reverse x-mas tree order (there are a few
other occurrences around).

> +
> +        while (iter->count > 0) {
> +                ssize_t space, nr;
> +                size_t off, len;
> +
> +                ret = -EMSGSIZE;
> +                space = frag_limit - skb_shinfo(skb)->nr_frags;
> +                if (space < 0)
> +                        break;
> +
> +                /* We might be able to coalesce without increasing nr_frags */
> +                nr = clamp_t(size_t, space, 1, ARRAY_SIZE(pages));
> +
> +                len = iov_iter_extract_pages(iter, &ppages, maxsize, nr, 0, &off);
> +                if (len <= 0) {
> +                        ret = len ?: -EIO;
> +                        break;
> +                }
> +
> +                if (space == 0 &&
> +                    !skb_can_coalesce(skb, skb_shinfo(skb)->nr_frags,
> +                                      pages[0], off)) {
> +                        iov_iter_revert(iter, len);
> +                        break;
> +                }

It looks like the above condition/checks duplicate what the later
skb_append_pagefrags() call will perform below. I guess the above chunk
could be removed?

> +
> +                i = 0;
> +                do {
> +                        struct page *page = pages[i++];
> +                        size_t part = min_t(size_t, PAGE_SIZE - off, len);
> +
> +                        ret = -EIO;
> +                        if (!sendpage_ok(page))
> +                                goto out;

My (limited) understanding is that the current sendpage code assumes
that the caller provides/uses pages suitable for such use. The existing
sendpage_ok() check is in place as a way to try to catch possible code
bugs, via the WARN_ONCE(). I think the same could be done here?
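Something along these lines, maybe (completely untested, just to sketch
the idea of treating a !sendpage_ok() page as a caller bug instead of a
plain runtime error; WARN_ON_ONCE() returns the evaluated condition, so
the error path is preserved):

        ret = -EIO;
        if (WARN_ON_ONCE(!sendpage_ok(page)))
                goto out;

While at it, for the reverse x-mas tree nit above, the declarations would
just need to be sorted by descending line length, e.g.:

        size_t frag_limit = READ_ONCE(sysctl_max_skb_frags);
        struct page *pages[8], **ppages = pages;
        ssize_t spliced = 0, ret = 0;
        unsigned int i;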
Thanks!

Paolo

> +
> +                        ret = skb_append_pagefrags(skb, page, off, part,
> +                                                   frag_limit);
> +                        if (ret < 0) {
> +                                iov_iter_revert(iter, len);
> +                                goto out;
> +                        }
> +
> +                        if (skb->ip_summed == CHECKSUM_NONE)
> +                                skb_splice_csum_page(skb, page, off, part);
> +
> +                        off = 0;
> +                        spliced += part;
> +                        maxsize -= part;
> +                        len -= part;
> +                } while (len > 0);
> +
> +                if (maxsize <= 0)
> +                        break;
> +        }
> +
> +out:
> +        skb_len_add(skb, spliced);
> +        return spliced ?: ret;
> +}
> +EXPORT_SYMBOL(skb_splice_from_iter);
> 