From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, Chuck Lever, Boris Pismenny, John Fastabend,
    Jakub Kicinski, "David S. Miller", Eric Dumazet, Paolo Abeni,
    Willem de Bruijn, David Ahern, Matthew Wilcox, Jens Axboe,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Christoph Hellwig,
    Linus Torvalds, Al Viro, Jan Kara, Jeff Layton, David Hildenbrand,
    Christian Brauner, linux-fsdevel@vger.kernel.org,
    linux-block@vger.kernel.org
Subject: [PATCH net-next v2 1/6] splice, net: Fix MSG_MORE signalling in splice_direct_to_actor()
Date: Wed, 31 May 2023 13:45:23 +0100
Message-ID: <20230531124528.699123-2-dhowells@redhat.com>
In-Reply-To: <20230531124528.699123-1-dhowells@redhat.com>
References: <20230531124528.699123-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

splice_direct_to_actor() doesn't manage SPLICE_F_MORE correctly - and, as a
result, incorrectly signals MSG_MORE when splicing to a socket.

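
A rough userspace model of the flag decision may help when reading the
explanation that follows.  It is only a sketch, not kernel code:
model_more_flag(), its parameters and MODEL_F_MORE are hypothetical names
invented for illustration.  It assumes that a short read on a seekable
source whose position has reached the file size means the data has ended,
so the "more" flag should not be forced on in that case.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for SPLICE_F_MORE; the name and value are illustrative only. */
#define MODEL_F_MORE 0x04u

/*
 * Hypothetical helper modelling one loop iteration: given the flags so far,
 * whether the caller originally asked for "more", and the result of the
 * read, return the flags to hand to the output side.
 */
static unsigned int model_more_flag(unsigned int flags, bool more,
				    size_t read_len, size_t len,
				    bool seekable, long long pos,
				    long long size)
{
	if (read_len < len) {
		if (seekable && pos >= size) {
			/* Short read at EOF: last data unless caller wants more. */
			if (!more)
				flags &= ~MODEL_F_MORE;
		} else {
			/* Short read elsewhere: more data is still expected. */
			flags |= MODEL_F_MORE;
		}
	} else if (!more) {
		/* Full read of everything requested: nothing more to come. */
		flags &= ~MODEL_F_MORE;
	}
	return flags;
}
```

For a 100-byte file read in 64-byte requests, the second read returns 36
bytes with the position at the file size, so the model clears the flag
instead of setting it; a non-seekable source with the same short read keeps
the flag set.
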
The problem happens when a short splice occurs because we got a short read
due to hitting the EOF on a file.  Because the length read (read_len) is
less than the remaining size to be spliced (len), SPLICE_F_MORE is set.
This causes MSG_MORE to be set by pipe_to_sendpage(), indicating to the
network protocol that more data is to be expected.

With the changes I want to make to switch from using sendpage to using
sendmsg(MSG_SPLICE_PAGES), MSG_MORE needs to work properly.

This was observed with the multi_chunk_sendfile tests in the tls kselftest
program.  Some of those tests would hang and time out when the last chunk
of file was less than the sendfile request size.

This has been observed before[1] and worked around in AF_TLS[2].

Fix this by checking to see if the source file is seekable if we get a
short read and, if it is, checking to see if we hit the file size.  This
should also work for block devices.

This won't help procfiles and suchlike as they're zero length files that
can be read from[3].  To handle that, should splice make a zero-length call
with SPLICE_F_MORE cleared (assuming it wasn't set by userspace via
splice()) if it gets a zero-length read?

Signed-off-by: David Howells
cc: Jakub Kicinski
cc: Jens Axboe
cc: Christoph Hellwig
cc: Linus Torvalds
cc: Al Viro
cc: Matthew Wilcox
cc: Jan Kara
cc: Jeff Layton
cc: David Hildenbrand
cc: Christian Brauner
cc: Chuck Lever
cc: Boris Pismenny
cc: John Fastabend
cc: Eric Dumazet
cc: "David S. Miller"
cc: Paolo Abeni
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/netdev/1591392508-14592-1-git-send-email-pooja.trivedi@stackpath.com/ [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d452d48b9f8b1a7f8152d33ef52cfd7fe1735b0a [2]
Link: https://lore.kernel.org/r/CAHk-=wjDq5_wLWrapzFiJ3ZNn6aGFWeMJpAj5q+4z-Ok8DD9dA@mail.gmail.com/ [3]
---
 fs/splice.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 3e06611d19ae..a7cf216c02a7 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -982,10 +982,21 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
 		 * If this is the last data and SPLICE_F_MORE was not set
 		 * initially, clears it.
 		 */
-		if (read_len < len)
-			sd->flags |= SPLICE_F_MORE;
-		else if (!more)
+		if (read_len < len) {
+			struct inode *ii = in->f_mapping->host;
+
+			if (ii->i_fop->llseek != noop_llseek &&
+			    pos >= i_size_read(ii)) {
+				if (!more)
+					sd->flags &= ~SPLICE_F_MORE;
+			} else {
+				sd->flags |= SPLICE_F_MORE;
+			}
+
+		} else if (!more) {
 			sd->flags &= ~SPLICE_F_MORE;
+		}
+
 		/*
 		 * NOTE: nonblocking mode only applies to the input. We
 		 * must not do the output in nonblocking mode as then we