Date: Thu, 18 Jun 2020 11:11:13 +0200
From: "javier.gonz@samsung.com"
To: Damien Le Moal
Cc: Kanchan Joshi, "axboe@kernel.dk", "viro@zeniv.linux.org.uk",
    "bcrl@kvack.org", "linux-fsdevel@vger.kernel.org",
    "linux-kernel@vger.kernel.org", "linux-aio@kvack.org",
    "io-uring@vger.kernel.org", "linux-block@vger.kernel.org",
    "selvakuma.s1@samsung.com", "nj.shetty@samsung.com"
Subject: Re: [PATCH 3/3] io_uring: add support for zone-append
Message-ID: <20200618091113.eu2xdp6zmdooy5d2@mpHalley.local>
References: <1592414619-5646-1-git-send-email-joshi.k@samsung.com>
    <1592414619-5646-4-git-send-email-joshi.k@samsung.com>
    <20200618083529.ciifu4chr4vrv2j5@mpHalley.local>
X-Mailing-List: linux-kernel@vger.kernel.org

On 18.06.2020 08:47, Damien Le Moal wrote:
>On 2020/06/18 17:35, javier.gonz@samsung.com wrote:
>> On 18.06.2020 07:39, Damien Le Moal wrote:
>>> On 2020/06/18 2:27, Kanchan Joshi wrote:
>>>> From: Selvakumar S
>>>>
>>>> Introduce three new opcodes for zone-append -
>>>>
>>>>   IORING_OP_ZONE_APPEND       : non-vectored, similar to IORING_OP_WRITE
>>>>   IORING_OP_ZONE_APPENDV      : vectored, similar to IORING_OP_WRITEV
>>>>   IORING_OP_ZONE_APPEND_FIXED : append using fixed-buffers
>>>>
>>>> Repurpose cqe->flags to return zone-relative offset.
>>>>
>>>> Signed-off-by: SelvaKumar S
>>>> Signed-off-by: Kanchan Joshi
>>>> Signed-off-by: Nitesh Shetty
>>>> Signed-off-by: Javier Gonzalez
>>>> ---
>>>>  fs/io_uring.c                 | 72 +++++++++++++++++++++++++++++++++++++++++--
>>>>  include/uapi/linux/io_uring.h |  8 ++++-
>>>>  2 files changed, 77 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>> index 155f3d8..c14c873 100644
>>>> --- a/fs/io_uring.c
>>>> +++ b/fs/io_uring.c
>>>> @@ -649,6 +649,10 @@ struct io_kiocb {
>>>>  	unsigned long		fsize;
>>>>  	u64			user_data;
>>>>  	u32			result;
>>>> +#ifdef CONFIG_BLK_DEV_ZONED
>>>> +	/* zone-relative offset for append, in bytes */
>>>> +	u32			append_offset;
>>>
>>> this can overflow. u64 is needed.
>>
>> We chose to do it this way to start with because struct io_uring_cqe
>> only has space for u32 when we reuse the flags.
>>
>> We can of course create a new cqe structure, but that will come with
>> larger changes to io_uring for supporting append.
>>
>> Do you believe this is a better approach?
>
>The problem is that zone sizes are 32 bits in the kernel, as a number of sectors.
>So any device that has a zone size smaller than or equal to 2^31 512B sectors can be
>accepted. Using a zone relative offset in bytes for returning zone append result
>is OK-ish, but to match the kernel supported range of possible zone sizes, you
>need 31+9 bits... 32 does not cut it.

Agree. Our initial assumption was that u32 would cover current zone size
requirements, but if this is a no-go, we will take the longer path.
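
To make the "31+9 bits" arithmetic concrete, here is a minimal userspace sketch. It is not part of the patch: the helper names (max_zone_bytes, bits_needed, truncate_to_u32, check_offset_width) are made up for illustration, and the only inputs taken from the thread are the 2^31-sector upper bound and the 512B sector size.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512B sectors: bytes = sectors << 9 */

/* Largest zone size mentioned in the thread: 2^31 sectors, in bytes. */
uint64_t max_zone_bytes(void)
{
	return (uint64_t)1 << (31 + SECTOR_SHIFT);	/* 2^40 bytes */
}

/* Number of bits needed to represent v (0 needs 0 bits). */
unsigned int bits_needed(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}

/* What a u32 field would hold for a given in-zone byte offset. */
uint32_t truncate_to_u32(uint64_t byte_offset)
{
	return (uint32_t)byte_offset;
}

/* Illustrative self-check of the numbers above. */
void check_offset_width(void)
{
	/* The last byte offset inside a 2^40-byte zone needs 40 bits. */
	assert(bits_needed(max_zone_bytes() - 1) == 40);
	/* An offset of exactly 4 GiB into the zone wraps to 0 in a u32. */
	assert(truncate_to_u32((uint64_t)1 << 32) == 0);
}
```

Under these assumptions, any append that lands 4 GiB or more into a zone would report a wrapped offset if the result were stored in 32 bits, which is the overflow pointed out above.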
>
>Since you need a 64-bit sized result, I would also prefer that you drop the zone
>relative offset as a result and return the absolute offset instead. That makes
>life easier for the applications since the zone append requests also must use
>absolute offsets for zone start. An absolute offset as a result becomes
>consistent with that and all other read/write system calls that all use absolute
>offsets (seek() is the only one that I know of that can use a relative offset,
>but that is not an IO system call).

Agree. Using relative offsets was a product of reusing the existing u32.
If we move to u64, there is no need to do an extra transformation.

Thanks Damien!
Javier
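
For readers following along, the "extra transformation" can be sketched as below. This is only an illustration under assumptions, not code from the patch or from io_uring: zone_abs_offset and zone_rel_offset are hypothetical helper names, and the zone start is assumed to be given in 512B sectors.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512B sectors */

/*
 * Hypothetical helper: with a zone-relative result, the application has
 * to add the zone start back to learn where the data actually landed.
 */
uint64_t zone_abs_offset(uint64_t zone_start_sector, uint64_t rel_bytes)
{
	return (zone_start_sector << SECTOR_SHIFT) + rel_bytes;
}

/*
 * Hypothetical inverse: recover the in-zone offset from an absolute
 * result, for applications that do want the relative form.
 */
uint64_t zone_rel_offset(uint64_t zone_start_sector, uint64_t abs_bytes)
{
	uint64_t zone_start_bytes = zone_start_sector << SECTOR_SHIFT;

	/* An append result can never lie before the start of its zone. */
	assert(abs_bytes >= zone_start_bytes);
	return abs_bytes - zone_start_bytes;
}
```

With an absolute 64-bit result, the first helper is unnecessary: the returned value is directly comparable with the absolute zone start used when submitting the append.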