From: Kanchan Joshi
Date: Thu, 28 Jan 2021 17:34:47 +0530
Subject: Re: [RFC PATCH 0/4] Asynchronous passthrough ioctl
To: Pavel Begunkov, Jens Axboe
Cc: Kanchan Joshi, Keith Busch, Christoph Hellwig, sagi@grimberg.me,
    linux-nvme@lists.infradead.org, io-uring@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Javier Gonzalez, Nitesh Shetty, Selvakumar S
References: <20210127150029.13766-1-joshi.k@samsung.com>
    <489691ce-3b1e-30ce-9f72-d32389e33901@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 27, 2021 at 9:32 PM Pavel Begunkov wrote:
>
> On 27/01/2021 15:42, Pavel Begunkov wrote:
> > On 27/01/2021 15:00, Kanchan Joshi wrote:
> >> This RFC patchset adds asynchronous ioctl capability for NVMe devices.
> >> Purpose of RFC is to get the feedback and optimize the path.
> >>
> >> At the uppermost io-uring layer, a new opcode IORING_OP_IOCTL_PT is
> >> presented to user-space applications. Like regular-ioctl, it takes
> >> ioctl opcode and an optional argument (ioctl-specific input/output
> >> parameter). Unlike regular-ioctl, it is made to skip the block-layer
> >> and reach directly to the underlying driver (nvme in the case of this
> >> patchset).
> >> This path between io-uring and nvme is via a newly
> >> introduced block-device operation "async_ioctl". This operation
> >> expects io-uring to supply a callback function which can be used to
> >> report completion at later stage.
> >>
> >> For a regular ioctl, NVMe driver submits the command to the device and
> >> the submitter (task) is made to wait until completion arrives. For
> >> async-ioctl, completion is decoupled from submission. Submitter goes
> >> back to its business without waiting for nvme-completion. When
> >> nvme-completion arrives, it informs io-uring via the registered
> >> completion-handler. But some ioctls may require updating certain
> >> ioctl-specific fields which can be accessed only in context of the
> >> submitter task. For that reason, NVMe driver uses task-work infra for
> >> that ioctl-specific update. Since task-work is not exported, it cannot
> >> be referenced when nvme is compiled as a module. Therefore, one of the
> >> patch exports task-work API.
> >>
> >> Here goes example of usage (pseudo-code).
> >> Actual nvme-cli source, modified to issue all ioctls via this opcode
> >> is present at-
> >> https://github.com/joshkan/nvme-cli/commit/a008a733f24ab5593e7874cfbc69ee04e88068c5
> >
> > see https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-fops
> >
> > Looks like good time to bring that branch/discussion back
>
> a bit more context:
> https://github.com/axboe/liburing/issues/270

Thanks, it looked good. The key differences (compared to the uring
patch that I posted) seem to be:
1. It uses a file operation instead of a block-device operation.
2. It repurposes the sqe memory for the ioctl cmd. If an application
   issues an ioctl with <= 40 bytes of cmd, it does not have to
   allocate the ioctl cmd separately. That's nifty.

We still need to support passing a larger cmd (e.g. the nvme-passthru
ioctl takes 72 bytes), but that shouldn't be too difficult, I suppose.

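The inline-versus-larger-cmd point can be illustrated with a small user-space sketch. Everything in it is an assumption made for illustration: `struct demo_sqe`, its field layout, and the `demo_sqe_*` helpers are hypothetical and are not the real `struct io_uring_sqe` or anything from the io_uring-fops branch. Only the two sizes (40 and 72 bytes) come from the text above. A cmd that fits in the 40-byte area is copied into the entry; a larger one is passed by pointer and must stay alive until completion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INLINE_CMD_BYTES  40  /* inline limit mentioned in the mail */
#define DEMO_NVME_PT_CMD_BYTES 72  /* nvme-passthru cmd size mentioned in the mail */

/* Hypothetical 64-byte submission entry; NOT the real struct io_uring_sqe. */
struct demo_sqe {
	uint8_t  opcode;
	uint8_t  cmd_is_indirect;   /* 0: payload holds the cmd, 1: payload holds a pointer */
	uint16_t cmd_len;
	int32_t  fd;
	uint64_t user_data;
	uint64_t reserved;
	union {
		uint8_t     inline_cmd[DEMO_INLINE_CMD_BYTES];
		const void *cmd_ptr;    /* caller keeps this buffer alive until completion */
	} payload;
};

_Static_assert(sizeof(struct demo_sqe) == 64, "demo entry should stay 64 bytes");

/* Attach a cmd to the entry. Returns 0 on success, -1 on invalid arguments. */
static int demo_sqe_set_cmd(struct demo_sqe *sqe, const void *cmd, size_t len)
{
	if (!sqe || !cmd || len == 0 || len > UINT16_MAX)
		return -1;

	sqe->cmd_len = (uint16_t)len;
	if (len <= DEMO_INLINE_CMD_BYTES) {
		/* Small cmd: copy it into the entry, no separate allocation needed. */
		memset(sqe->payload.inline_cmd, 0, sizeof(sqe->payload.inline_cmd));
		memcpy(sqe->payload.inline_cmd, cmd, len);
		sqe->cmd_is_indirect = 0;
	} else {
		/* Large cmd (e.g. 72 bytes): only the pointer fits in the entry. */
		sqe->payload.cmd_ptr = cmd;
		sqe->cmd_is_indirect = 1;
	}
	return 0;
}

/* Return a pointer to the cmd bytes, wherever they live. */
static const void *demo_sqe_get_cmd(const struct demo_sqe *sqe)
{
	assert(sqe != NULL);
	return sqe->cmd_is_indirect ? sqe->payload.cmd_ptr
				    : (const void *)sqe->payload.inline_cmd;
}
```

With this layout, a 40-byte cmd travels inside the entry, while a 72-byte cmd sets `cmd_is_indirect` and the consumer has to follow the pointer. A real design would also have to decide how the kernel copies the indirect buffer from user space.
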
For some ioctls, the driver may also still need to use task-work to
update the user-space pointers (embedded in the uring/ioctl cmd) during
completion.

@Jens - will it be fine if I start looking at plumbing the nvme part of
this series on top of your work?

Thanks,
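
For reference, the deferred-update idea discussed in this thread can be modeled in plain user-space C. This is only a single-threaded sketch under stated assumptions: `demo_request`, `demo_task`, and the `demo_*` functions are hypothetical names, and the list-based queue stands in for the kernel's task-work infrastructure rather than using it. The completion side only stashes the result and queues the request. The submitter later runs the queued work in its own context, where writing through the user pointer is allowed, and then invokes the registered completion callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Completion callback type supplied by the upper layer (hypothetical). */
typedef void (*demo_complete_fn)(void *ctx, int status);

struct demo_request {
	uint32_t *user_result;       /* only the submitter context may write here */
	uint32_t device_result;      /* stashed by the completion side */
	int status;
	demo_complete_fn complete;
	void *ctx;
	struct demo_request *next;   /* link in the deferred-work list */
};

/* Stand-in for a task with pending deferred work. */
struct demo_task {
	struct demo_request *work_head;
	struct demo_request *work_tail;
};

/* Submission: record the callback and return without waiting. */
static void demo_submit(struct demo_request *req, uint32_t *user_result,
			demo_complete_fn complete, void *ctx)
{
	assert(req != NULL && complete != NULL);
	req->user_result = user_result;
	req->device_result = 0;
	req->status = 0;
	req->complete = complete;
	req->ctx = ctx;
	req->next = NULL;
}

/* Completion side: must not touch user memory, so it only queues the request. */
static void demo_device_done(struct demo_task *task, struct demo_request *req,
			     uint32_t result, int status)
{
	req->device_result = result;
	req->status = status;
	req->next = NULL;
	if (task->work_tail)
		task->work_tail->next = req;
	else
		task->work_head = req;
	task->work_tail = req;
}

/* Submitter context: apply the deferred updates, then report completion. */
static int demo_run_task_work(struct demo_task *task)
{
	int ran = 0;
	struct demo_request *req = task->work_head;

	task->work_head = task->work_tail = NULL;
	while (req) {
		struct demo_request *next = req->next;

		if (req->user_result)
			*req->user_result = req->device_result;
		req->complete(req->ctx, req->status);
		req = next;
		ran++;
	}
	return ran;
}

/* Example callback: counts successful completions in an int pointed to by ctx. */
static void demo_count_completion(void *ctx, int status)
{
	if (status == 0)
		(*(int *)ctx)++;
}
```

In this model the user-visible result stays untouched between `demo_device_done()` and `demo_run_task_work()`, which is the window that the deferred update is meant to cover.
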