Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp2121567rwd; Fri, 2 Jun 2023 05:17:48 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7EEVf+J8g3sLX3qe1J7L5Sw8/UPfAeBCQQBYxMiVTB7sIjhDzD3HmjzH1CqiG/PqV/SRg0 X-Received: by 2002:a05:6a20:9588:b0:10c:b9ed:6a38 with SMTP id iu8-20020a056a20958800b0010cb9ed6a38mr12515772pzb.28.1685708268392; Fri, 02 Jun 2023 05:17:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685708268; cv=none; d=google.com; s=arc-20160816; b=Topp8V9Q9SnG7aToyCOjp5hyhl6duJiLdO+QYBHm7xwXmZDKIZpWUWdnrUNgUnBy3q LIdusYlHO5xeOMsX/uU5NfNJ/reNtzITq2wmiPLSPlC9sLOovZ4xgkSPIzHfW3RxV5ce X7PZd6xug9Z5U3SwCikQwA2eD7V8xXHr6bndOSLhfPykq7M9j2GXBmMuAizVSjy9VmW7 QdLIC5AHAoA2hmpfqe/M5R9yqOnWVdScwq6UEAdCkaVBM8iYWg8dAEp1eB/s7SrI0b8e hxMLt1qQzk3n2Uwb2frnOe8hL3DRCNblzM7XsmFw660J/GofdddfYrJ8W0aZ6xZHd2nY y3ug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to :organization:from:references:cc:to:content-language:subject :user-agent:mime-version:date:message-id:dkim-signature; bh=AWjq3+tT8iZ2ZA7EYofD7K+xubUKWrfpRqvZyd98N9k=; b=EOF8xSFZDQhMqxNJnwr/i1yXbr5IymE+jCR/aWe191lXRft2eNJg1CSL36bfmFEZCH VP7sRarinbxohAd8Cf9fere10wKAu7P7QgeQvsFCa654REYJKJPw2IbM3sAN3ml08oAg z9EnDgzMtlmj9EXuz4D6fkQfcfm82NIq6+qa2WbiOn3ZxBJYQYYS2WIRVQES+gW68DZi Yam88q4iPShLd0eOFe8x8i6Ajs4VHBNnereWarzBW0DUIfyxCjXcnOLgdGbBCY2CoCp/ 75WjB/WzXjIc5dKs485C7bnfinW31VptZpho+OmJE/uP4Va7iOpA5xI8k6vQC0w0uXuX yWgA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=KZ7TPpTQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z127-20020a626585000000b0064354230c2asi615910pfb.367.2023.06.02.05.17.34; Fri, 02 Jun 2023 05:17:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=KZ7TPpTQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235654AbjFBMLB (ORCPT + 99 others); Fri, 2 Jun 2023 08:11:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51900 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235665AbjFBMK3 (ORCPT ); Fri, 2 Jun 2023 08:10:29 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13A05D3; Fri, 2 Jun 2023 05:10:28 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 95D3E64F8E; Fri, 2 Jun 2023 12:10:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AAAF4C433D2; Fri, 2 Jun 2023 12:10:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1685707827; bh=4j8IEfNND6Ip35V7OtZoAneDtYiN13jPWMiXFEGXX8Y=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=KZ7TPpTQeD+Vq5oJC3k6NUFUDf/mSj9/Xg//feYUZoO47Kygl0xx68NMlze9SXgTE vObudw8yQG/29PxYUzmM34qJUp/u0N4sFY/3KmkQ6mVHVfsCAnt1titgWsZwzhmGJU v/8BG/J5MGIpKzWJOB5SOd9g0N7T8UeWzwb6bDnYNuNTyOJh35RKHpVERVxQq+dRim F1pShYvk3Wz5Wb/IjC72U3GqValbIvX/cT1b+AXlbnGLXr6sPTDJZQjVsQVBUwxCPk 2SAhfOD7rQRVmmpDZfQPsXBFF0GWJkUf+I3xJ2NcywVy2QVkRxKgAfczGSkbUYDILJ r5ow31qPdQWOg== Message-ID: <38d41f97-14d5-e24b-3d19-6c4f96305c58@kernel.org> Date: Fri, 2 Jun 2023 21:10:22 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0 Subject: Re: [RFC PATCH 1/3] PCI: endpoint: support an alignment aware map/unmaping Content-Language: en-US To: Kishon Vijay Abraham I , Shunsuke Mie , Jingoo Han Cc: Gustavo Pimentel , Lorenzo Pieralisi , Rob Herring , =?UTF-8?Q?Krzysztof_Wilczy=c5=84ski?= , Bjorn Helgaas , Manivannan Sadhasivam , Kishon Vijay Abraham I , Kunihiko Hayashi , Hou Zhiqiang , Frank Li , Li Chen , linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org References: <20230113090350.1103494-1-mie@igel.co.jp> <20230113090350.1103494-2-mie@igel.co.jp> <978b63ac-90b5-b909-d259-0668b77f1cc8@kernel.org> <52b8f850-af8c-1971-9729-c5de37875bf9@amd.com> From: Damien Le Moal Organization: Western Digital Research In-Reply-To: <52b8f850-af8c-1971-9729-c5de37875bf9@amd.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A, RCVD_IN_DNSWL_HI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/2/23 20:39, Kishon Vijay Abraham I wrote: > > > On 6/2/2023 5:13 AM, Damien Le Moal wrote: >> On 6/2/23 00:06, Kishon Vijay Abraham I wrote: >>> Hi Shunsuke, >>> >>> On 1/13/2023 2:33 PM, Shunsuke Mie wrote: >>>> Add an align_mem operation to the EPC ops, which function is used to >>>> pci_epc_map/unmap_addr(). These change to enable mapping for any alignment >>>> restriction of EPC. The map function maps an aligned memory to include a >>>> requested memory region. >>> >>> I'd prefer all the PCIe address alignment restriction be handled in the >>> endpoint function drivers and not inside the core layer (esp in map and >>> unmap calls). >> >> That is a really *bad* idea ! Most function drivers should be able to work with >> any EP controller hardware. Asking these drivers to support all the alignment >> peculiarities of every possible EP controller is impossible. > > Function drivers already work with various restrictions of EP controller > hardware. pci_epc_features was added to provide such restrictions to > function drivers. Not sure why it has to be different here. >> >>> IMO, get the pci address alignment restriction using pci_epc_features. >>> And use a bigger size (based on alignment restriction) in >>> pci_epc_mem_alloc_addr() and access the allocated window using an offset >>> (based on alignment value). You can add separate helpers if required. >> >> That is too simplistic and not enough. Example: Rick and I working on an nvme >> function driver are facing a lot of issues with the EPC API for mem & mapping >> management because we have 0 control over the PCI address that the host will >> use. Alignment is all over the place, and the current EPC memory API >> restrictions (window size limitations) make it impossible to transparently >> handle all cases. We endup with NVMe command failures simply because of the API >> limitations. > > You mean restrictions w.r.t OB window address and not PCIe address? >> >> And sure, we can modify that driver to better support the EP controller we are >> using (rockchip). But we need to support other EP controllers as well. So API > > Every EP controller can provide it's restrictions in pci_epc_features. > Unless the alignment is going to change dynamically, don't see a need > for adding new epc ops. > > Not sure why the following cannot be handled from function driver? > > From > > A A + S > ┌────────────────────────┐ > │ │ > │ OB WIN │ > ├────────────────────────┤ > mapping │ │ > ▼ B + S ▼ > B ┌────────────────────────┐ > │ │ > │ PCI Address │ > └────────────────────────┘ > > To > > > A A'│ A + S A+S+alignment > ┌────┼───────────────────┬──────┐ > │ │ │ │ > │ │ OB WIN │ │ > ├────┴───────────────────┴──────┤ > │ | > │ | > B' ▼ B B + S ▼ > ┌────┬──────────────────────────┐ > │ │ │ > │ │ PCI Address │ > └────┴──────────────────────────┘ > > So the changes in function driver will be > 1) Get alignment value in epc_features > 2) pci_epc_mem_alloc_addr()/pci_epc_map_addr() will take into account > the alignment value (change in size parameter) > 3) Access host memory from an offset in the provided > pci_epc_mem_alloc_addr(). The problem with all this is that some EP controllers (at least the rockchip for sure, likely the Cadence one as well) have alignment constraints that depend on the *host* PCI address (yes, the rockchip driver is still buggy in that respect, fixes coming, see at the end for the details about the rockchip). The current API does not allow for that to be gracefully handled and using the epc_features for that would not work at all. With this dynamic constraint based on the host PCI address (which the EPF cannot control), we need EPC core functions that: 1) allocate memory from windows based on the PCI address they will be mapped to 2) Depending on the size of the transfer + the alignment need for a PCI address, a single memory window may not be enough, so we need the ability to allocate memory over multiple windows 3) Some nice helpers that avoid that pattern of mem alloc + map pci addr and simplify them with "map this PCI address for me and tell me the local CPU address for it, completely hiding any alignment concerns. >> changes are definitely needed. Working on that. That is not easy as the mapping >> API and its semantic impacts data transfers (memcpy_from|toio and DMA). >> >> I do have a patch that does something similar as this one, but at a much higher >> level with a helper function that gives the function driver the offset into the >> allocated memory region to use for mapping a particular PCI address. And then >> this helper is then in turn used into a new pci_epc_map() function which does >> mem alloc + mapping in one go based on the EPC constraints. That hides all > > pci_epc_map() was added only to perform mapping functionality. I'd > prefer it stays that way instead of adding bunch of other things into it. I am not proposing to add to it or to modify it. That function can remain the basic one for simple cases. But we need better functions for more complex EPF functions that need to map potentially large memory areas to random PCI addresses. What I am proposing is to have more intelligent helpers using the current simple functions: essentially wrapping pci_epc_mem_alloc_addr()+pci_epc_map_addr() with pci_epc_map(), and similar for unmap. That would greatly simplify the code of EPF drivers that constantly need to map/unmap PCI address to serve IOs/transfers as requested by the host/RP side. Developers would still be free to use the verbose path if they wish to do so, modulo the mandatory fixes for gracefully handling alignment and allocation size, for which we need either to modify pci_epc_mem_alloc_addr() or new functions. Note about the rk3399 EP controller: it has 1MB memory windows that can be used to map up to 1MB of PCI address space. This limits comes from the fact that the mapping controller uses at most the lower 22 bits from the local CPU address as the lower bits for the PCI address. But this also implies that the offset (the alignment) into the memory window must be equal to the mask of the PCI address to map over the number of bits of PCI address that will change over the range of addresses mapped (the number of bits of address changing over the address range [PCI_addr .. PCI_addr + mapping_size - 1]). Notifying this alignment need to an EPF driver can only be done using an API. Cannot do that with epc_features fields. -- Damien Le Moal Western Digital Research