From: David Howells
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903
To: Masami Hiramatsu (Google)
Cc: dhowells@redhat.com, Jens Axboe, Al Viro, Christoph Hellwig, Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Christian Brauner, Linus Torvalds, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig, Steven Rostedt, linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
Date: Sun, 21 May 2023 13:50:40 +0100
Message-ID: <2332233.1684673440@warthog.procyon.org.uk>
In-Reply-To: <20230521192826.825bfafa17645aacba9b1076@kernel.org>
References: <20230521192826.825bfafa17645aacba9b1076@kernel.org> <20230520000049.2226926-1-dhowells@redhat.com> <20230520000049.2226926-27-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Masami Hiramatsu (Google) wrote:

> David Howells wrote:
>
> > For the splice from the trace seq buffer, just use copy_splice_read().
>
> So this is because you will remove generic_file_splice_read() (since
> it's buggy), right?

An ITER_PIPE iterator has a problem if it gets reverted with other changes I want to make. The problem is that it may not be valid to control the lifetime of the data in the buffer with get_page(). The pages may need a pin taking (FOLL_PIN) or the lifetime might be controlled with kfree() or rmmod.

> > In the future, something better can probably be done by gifting pages from
> > seq->buf into the pipe, but that would require changing seq->buf into a
> > vmap over an array of pages.
>
> ... We introduced splice support for avoiding copy ringbuffer pages, but
> this drops it. Thus this will drop performance of splice on ring buffer
> (trace file). If it is correct, can you also add a note about that?

Actually, no. There is no special splice support for tracing_fops. You currently use generic_file_splice_read(), which wends its way down into seq_read_iter(). However, the seqfile stuff uses kvmalloc() to allocate the buffer, and you are not allowed to splice page refs from kmalloc'd or vmalloc'd memory into a pipe, so it doesn't. It calls copy_to_iter(), which will cause ITER_PIPE to allocate bufferage on an as-needed basis.

copy_splice_read() instead creates an ITER_BVEC and populates it up front using the bulk allocator, so if you're splicing a lot of data, this ought to be marginally faster.

> So what we need is to introduce a vmap?

We could implement seq_splice_read(). What we would need to do is to change how the buffer is allocated: bulk allocate a bunch of arbitrary pages which we then vmap().
When we need to splice, we read into the buffer, do a vunmap() and then splice the pages holding the data we used into the pipe. If we don't manage to splice all the data, we can continue splicing from the pages we have left next time. If a read() comes along to view partially spliced data, we would need to copy from the individual pages.

When we use up all the data, we discard all the pages we might have spliced from and shuffle down the other pages, call the bulk allocator to replenish the buffer and then vmap() it again.

Any pages we've spliced from must be discarded and replaced and not rewritten.

If a read() comes without the buffer having been spliced from, it can do as it does now.

David
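
The buffer lifecycle described above for a possible seq_splice_read() can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: calloc() stands in for the bulk page allocator, a plain pointer array stands in for the vmap()'d page array, and every name in it (seqbuf_model, pipe_buf_model, seqbuf_fill(), seqbuf_splice(), seqbuf_replenish(), PAGE_SZ, NR_PAGES) is hypothetical. The read() path for partially spliced data is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ  4096u  /* assumed page size for the model */
#define NR_PAGES 4u     /* assumed number of pages backing the buffer */

/* Hypothetical stand-in for a pipe buffer: one page plus its data length. */
struct pipe_buf_model {
	char  *page;
	size_t len;
};

/* Hypothetical page-backed seq buffer; the array models the vmap()'d pages. */
struct seqbuf_model {
	char  *pages[NR_PAGES]; /* NULL once a page has been spliced away */
	size_t filled;          /* bytes of data generated into the buffer */
	size_t consumed;        /* bytes already handed to the pipe */
};

static char *page_alloc_model(void)
{
	char *p = calloc(1, PAGE_SZ); /* stands in for the bulk allocator */
	assert(p != NULL);
	return p;
}

void seqbuf_init(struct seqbuf_model *sb)
{
	sb->filled = sb->consumed = 0;
	for (size_t i = 0; i < NR_PAGES; i++)
		sb->pages[i] = page_alloc_model();
}

/* Append data page by page. Refuses once anything has been spliced out,
 * because spliced pages must be replaced, never rewritten. */
size_t seqbuf_fill(struct seqbuf_model *sb, const char *data, size_t len)
{
	size_t done = 0;

	if (sb->consumed != 0)
		return 0;
	while (done < len && sb->filled < NR_PAGES * PAGE_SZ) {
		size_t off = sb->filled % PAGE_SZ;
		size_t n = PAGE_SZ - off;

		if (n > len - done)
			n = len - done;
		memcpy(sb->pages[sb->filled / PAGE_SZ] + off, data + done, n);
		sb->filled += n;
		done += n;
	}
	return done;
}

/* Hand up to max_bufs data-bearing pages to the "pipe"; ownership moves to
 * the caller. Leftover data stays queued for the next call. */
size_t seqbuf_splice(struct seqbuf_model *sb, struct pipe_buf_model *out,
		     size_t max_bufs)
{
	size_t n = 0;

	while (n < max_bufs && sb->consumed < sb->filled) {
		size_t idx = sb->consumed / PAGE_SZ;
		size_t len = sb->filled - sb->consumed;

		if (len > PAGE_SZ)
			len = PAGE_SZ;
		out[n].page = sb->pages[idx];
		out[n].len = len;
		sb->pages[idx] = NULL; /* discarded from the buffer */
		sb->consumed += len;
		n++;
	}
	return n;
}

/* Once all data is used up: shuffle surviving pages down, allocate
 * replacements for the spliced ones, and reset the buffer. */
void seqbuf_replenish(struct seqbuf_model *sb)
{
	size_t kept = 0;

	assert(sb->consumed == sb->filled);
	for (size_t i = 0; i < NR_PAGES; i++)
		if (sb->pages[i])
			sb->pages[kept++] = sb->pages[i];
	while (kept < NR_PAGES)
		sb->pages[kept++] = page_alloc_model();
	sb->filled = sb->consumed = 0;
}
```

In this model a caller fills the buffer, calls seqbuf_splice() as many times as the pipe has room for, and calls seqbuf_replenish() only when consumed has caught up with filled. The pipe side owns each spliced page and frees it when done. In a real implementation the vunmap() before splicing and the vmap() after replenishing would bracket these steps.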