Date: Tue, 6 Dec 2022 15:15:41 -0400
From: Jason Gunthorpe
To: Christoph Hellwig
Cc: Lei Rao, kbusch@kernel.org, axboe@fb.com, kch@nvidia.com, sagi@grimberg.me, alex.williamson@redhat.com, cohuck@redhat.com, yishaih@nvidia.com, shameerali.kolothum.thodi@huawei.com, kevin.tian@intel.com, mjrosato@linux.ibm.com, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, kvm@vger.kernel.org, eddie.dong@intel.com, yadong.li@intel.com, yi.l.liu@intel.com, Konrad.wilk@oracle.com, stephen@eideticom.com, hang.yuan@intel.com
Subject: Re: [RFC PATCH 1/5] nvme-pci: add function nvme_submit_vf_cmd to issue admin commands for VF driver.
References: <20221206055816.292304-1-lei.rao@intel.com> <20221206055816.292304-2-lei.rao@intel.com> <20221206061940.GA6595@lst.de> <20221206135810.GA27689@lst.de> <20221206153811.GB2266@lst.de> <20221206165503.GA8677@lst.de>
In-Reply-To: <20221206165503.GA8677@lst.de>

On Tue, Dec 06, 2022 at 05:55:03PM +0100, Christoph Hellwig wrote:
> On Tue, Dec 06, 2022 at 11:51:23AM -0400, Jason Gunthorpe wrote:
> > That is a big deviation from where VFIO is right now, the controlled
> > function is the one with the VFIO driver, it should be the one that
> > drives the migration uAPI components.
>
> Well, that is one way to see it, but I think the more natural
> way to deal with it is to drive everything from the controlling
> function, because that is by definition much more in control.

Sure, the controlling function should (and does in mlx5) drive
everything here.

What the kernel is doing is providing the abstraction to link the
controlling function to the VFIO device in a general way.

We don't want to just punt this problem to user space and say 'good
luck finding the right cdev for migration control'. If the kernel
struggles to link them then userspace will not fare better on its own.

Especially, we do not want every VFIO device to have its own crazy way
for userspace to link the controlling/controlled functions
together. This is something the kernel has to abstract away.

So, IMHO, we must assume the kernel is aware of the relationship,
whatever algorithm it uses to become aware.
It just means the issue is doing the necessary cross-subsystem
locking.

That combined with the fact they really are two halves of the same
thing - operations on the controlling function have to be sequenced
with operations on the VFIO device - makes me prefer the single uAPI.

> More importantly any sane design will have easy ways to list and
> manipulate all the controlled functions from the controlling
> functions, while getting from the controlled function to the
> controlling one is extremely awkward, as anything that can be
> used for that is by definition an information leak.

We have spent some time looking at this for mlx5. It is a hard
problem. The VFIO driver cannot operate the device, e.g. it cannot do
MMIO and things, so it is limited to items in the PCI config space to
figure out the device identity.

> It seems mlx5 just gets away with that by saying controlled
> functions are always VFs, and the controlling function is a PF, but
> that will break down very easily,

Yes, that is part of the current mlx5 model. It is not inherent to the
device design, but the problems with arbitrary nesting were hard enough
they were not tackled at this point.

Jason
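
For readers who want to see what the simple VF/PF case looks like from the outside, here is a minimal userspace sketch. It is not how mlx5 or any VFIO driver does the lookup in the kernel; it only relies on the `physfn` symlink that Linux exposes in the sysfs directory of an SR-IOV VF. The function name `find_controlling_pf` and the `sysfs_root` parameter are made up for illustration, and the sketch says nothing about the arbitrary-nesting case discussed above.

```python
import os
from typing import Optional


def find_controlling_pf(vf_bdf: str,
                        sysfs_root: str = "/sys/bus/pci/devices") -> Optional[str]:
    """Return the PCI address of the PF behind an SR-IOV VF, or None.

    Hypothetical helper: it assumes the controlled function is a VF and the
    controlling function is its parent PF, which is only the simple model
    from the thread, not a general answer.
    """
    # An SR-IOV VF has a `physfn` symlink pointing at its parent PF directory.
    link = os.path.join(sysfs_root, vf_bdf, "physfn")
    if not os.path.islink(link):
        return None  # not a VF, or no such device
    # The last path component of the resolved target is the PF's address.
    return os.path.basename(os.path.realpath(link))
```

Called as `find_controlling_pf("0000:03:00.1")`, with the address standing in for whatever VF is of interest, it returns the PF's address or `None` if the device has no `physfn` link. The `sysfs_root` argument exists only so the lookup can be pointed at a fake directory tree for testing.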