Subject: Re: [POC RFC 0/3] splice(2) support for io_uring
To: Pavel Begunkov, io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Alexander Viro
References: <63119dd6-7668-a7bc-ea24-1db4909762bb@kernel.dk> <45f0b63b-e3e7-ba71-d037-9af1db7bbd98@gmail.com>
From: Jens Axboe
Message-ID: <8316dfa9-9210-3402-a6c3-4889b6bbdb49@kernel.dk>
Date: Tue, 21 Jan 2020 20:30:13 -0700
In-Reply-To: <45f0b63b-e3e7-ba71-d037-9af1db7bbd98@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 1/21/20 8:11 PM, Pavel Begunkov wrote:
> On 22/01/2020 04:55, Jens Axboe wrote:
>> On 1/21/20 5:05 PM, Pavel Begunkov wrote:
>>> It works well for basic cases, but there is still work to be done. E.g.
>>> it misses @hash_reg_file checks for the second (output) file. Anyway,
>>> there are some questions I want to discuss:
>>>
>>> - why sqe->len is __u32?
>>>   Splice uses size_t, and I think it's better
>>>   to have something wider (e.g. u64) for future use. That's the story
>>>   behind added sqe->splice_len.
>>
>> IO operations in Linux generally are INT_MAX, so the u32 is plenty big.
>> That's why I chose it. For this specifically, if you look at splice:
>>
>> 	if (unlikely(len > MAX_RW_COUNT))
>> 		len = MAX_RW_COUNT;
>>
>> so anything larger is truncated anyway.
>
> Yeah, I saw this one, but that was rather an argument for the future.
> It's pretty easy to transfer more than 4GB with sg list, but that
> would be the case for splice.

I don't see this changing, ever, basically. And probably not a big deal,
if you want to do more than 2GB worth of IO, you simply splice them over
multiple commands. At those sizes, the overhead there is negligible.

>>> - it requires 2 fds, and it's painful. Currently file managing is done
>>>   by common path (e.g. io_req_set_file(), __io_req_aux_free()). I'm
>>>   thinking to make each opcode function handle file grabbing/putting
>>>   themself with some helpers, as it's done in the patch for splice's
>>>   out-file.
>>>     1. Opcode handler knows, whether it have/needs a file, and thus
>>>        doesn't need extra checks done in common path.
>>>     2. It will be more consistent with splice.
>>>   Objections? Ideas?
>>
>> Sounds reasonable to me, but always easier to judge in patch form :-)
>>
>>> - do we need offset pointers with fallback to file->f_pos? Or is it
>>>   enough to have offset value. Jens, I remember you added the first
>>>   option somewhere, could you tell the reasoning?
>>
>> I recently added support for -1/cur position, which splice also uses. So
>> you should be fine with that.
>>
>
> I always have been thinking about it as a legacy from old days, and
> one of the problems of posix. It's not hard to count it in the
> userspace especially in C++ or high-level languages, and is just
> another obstacle for having a performant API. So, I'd rather get rid
> of it here. But is there any reasons against?

It's not always trivial to do in libraries, or programming languages
even. That's why it exists. I would not expect anyone to use it outside
of that.

-- 
Jens Axboe
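
A minimal sketch of the chunking suggested earlier in this message for
transfers larger than the per-call limit. It assumes 4 KiB pages and a
32-bit int, so the limit mirrors the kernel's MAX_RW_COUNT
(INT_MAX & PAGE_MASK). The names SKETCH_PAGE_SIZE, SKETCH_MAX_RW_COUNT,
next_chunk_len() and commands_needed() are hypothetical and do not come
from the patch set; each returned length would be the len of one splice
command.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel derives this from PAGE_MASK. */
#define SKETCH_PAGE_SIZE 4096u

/* Mirrors MAX_RW_COUNT (INT_MAX & PAGE_MASK) under the assumption above. */
#define SKETCH_MAX_RW_COUNT \
	((uint64_t)INT_MAX & ~(uint64_t)(SKETCH_PAGE_SIZE - 1))

/*
 * Length to request in the next command: the remaining byte count,
 * clamped to the per-call limit. The result always fits in a u32,
 * which is why a 32-bit length field is wide enough.
 */
static uint32_t next_chunk_len(uint64_t remaining)
{
	if (remaining > SKETCH_MAX_RW_COUNT)
		remaining = SKETCH_MAX_RW_COUNT;
	return (uint32_t)remaining;
}

/* Number of commands needed to move `total` bytes (ceiling division). */
static uint64_t commands_needed(uint64_t total)
{
	return total / SKETCH_MAX_RW_COUNT +
	       (total % SKETCH_MAX_RW_COUNT != 0);
}
```

Under these assumptions a 5 GiB transfer needs three commands, which is
consistent with the overhead of splitting being small at those sizes.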