From: Liran Liss
To: Michael Wang, Roland Dreier, Sean Hefty, Hal Rosenstock, "linux-rdma@vger.kernel.org", "linux-kernel@vger.kernel.org", "hal@dev.mellanox.co.il"
CC: Tom Tucker, Steve Wise, Hoang-Nam Nguyen, "raisch@de.ibm.com", Mike Marciniszyn, Eli Cohen, Faisal Latif, Jack Morgenstein, Or Gerlitz, Haggai Eran, Ira Weiny, Tom Talpey, Jason Gunthorpe, Doug Ledford
Subject: RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
Date: Tue, 21 Apr 2015 23:36:40 +0000
In-Reply-To: <5534B8C9.506@profitbricks.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Michael,

The spirit of this patch-set is great, but I think that we need to clarify some concepts.
Since this will affect the whole patch-set, I am laying out my concerns here instead.
A suggestion for the resulting management helpers is given below.
I believe the result would be much more coherent.
--Liran

In general
========
An ib_dev (or a port of one) should be distinguished by 3 qualifiers:
- The link layer:
-- Ethernet (shared by iWARP, USNIC, and ROCE)
-- Infiniband
- The transport (*)
-- IBTA transport (shared by IB and ROCE)
-- iWARP transport
-- USNIC transport
   (*) Transport means both:
   - The L4 wire protocols (e.g., BTH+ headers of IBTA, optionally encapsulated by UDP in ROCEv2, or the iWARP stack)
   - The transport semantics (for example, there are slight semantic differences between IBTA and iWARP)
- The node type (**)
-- CA
-- Switch
-- Router
   (**) This has been extended to also encode the transport in the current code.
   At least for user-space visible APIs, we might choose to leave this for backward compatibility, but we can consider cleaning up the kernel code.

So, I think that our "old-transport" below is just fine.
No need to change it (and you aren't, since it is currently implemented as a function).
The "new-transport" does not really exist, but is broken into several capability checks of the L4 transport, optionally with conditions on the link type.

I would remove the table below and state what we really want to achieve:
==> move technology-specific feature-check logic out of the (multiple!) IB code components and various ULPs into per-feature helpers.

Detailed remarks
==============
1) The cap_*_*() helpers should have been introduced directly in patch 02/27.
This back-and-forth between rdma_ib_or_iboe() and cap_* is confusing and increases the number of patches in the patch-set.
Do this and remove patches 16-24.

2) The name rdma_tech_* is lame.
rdma_transport_*(), adhering to the above (*) remark, is much better.
For example, both IB and ROCE *do* use the same transport.

3) The name cap_* as it is used above is not accurate.
You use it to describe technology characteristics rather than extendable capabilities.
I would suggest having a single convention for all helpers, such as rdma_has_*() and rdma_is_*().
For example: cap_ib_smi() ==> rdma_has_smi().

4) Remove all capabilities that do not introduce any distinction in the current code.
We can add them as needed later.
This means removing the following patches:
- [PATCH v5 22/27] IB/Verbs: Use management helper cap_ipoib() – all IB devices support ipoib
- [PATCH v5 24/27] IB/Verbs: Use management helper cap_af_ib() – all IB devices support AF_IB.
On the other hand:
- rdma_has_mcast() makes sense, since iWARP doesn’t support it.
- cap_ib_sa() might make sense to cut code even further in the CMA, since RoCE has a GSI but no SA.

5) Do not modify phys_state_show() in [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs.
It *is* the link layer!

6) Remove cap_read_multi_sge.
It is not a device/port feature, but a transport capability.
Use rdma_is_iwarp_transport() instead, or introduce a new transport flag in 'enum ib_device_cap_flags'.

7) Remove [PATCH v5 25/27] IB/Verbs: Use management helper cap_eth_ah().
Address handles that refer to Ethernet links always have Ethernet addressing.
In the CMA code, using rdma_tech_iboe() is just fine.
This is how you define cap_eth_ah() anyway.
Currently, this patch just adds clutter.

8) Remove patch [PATCH v5 26/27] IB/Verbs: Clean up rdma_ib_or_iboe().
We do need a transport qualifier, as exemplified in comment 5) above, and for a complete, clean model.
This is after renaming the function to rdma_is_ib_transport()...

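As a rough illustration of the model argued for in this mail, here is a user-space C sketch of how the three qualifiers could drive per-feature helpers. The sketch_* enums and struct are hypothetical and are not the kernel's definitions. The predicates encode only what is stated in this mail (IB and ROCE share the IBTA transport, IBoE is IBTA over Ethernet, RoCE has a GSI but no SA), plus one labeled assumption: that SMI exists only on an Infiniband link layer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; not the kernel's definitions. */
enum sketch_link_layer { SKETCH_LINK_IB, SKETCH_LINK_ETH };
enum sketch_transport {
	SKETCH_TRANSPORT_IBTA,	/* shared by IB and ROCE */
	SKETCH_TRANSPORT_IWARP,
	SKETCH_TRANSPORT_USNIC,
};
enum sketch_node_type { SKETCH_NODE_CA, SKETCH_NODE_SWITCH, SKETCH_NODE_ROUTER };

/* One port, described by the three independent qualifiers. */
struct sketch_port {
	enum sketch_link_layer link_layer;
	enum sketch_transport transport;
	enum sketch_node_type node_type;
};

/* rdma_is_*(): what the port is. */
static inline bool rdma_is_ib_transport(const struct sketch_port *p)
{
	return p->transport == SKETCH_TRANSPORT_IBTA;
}

static inline bool rdma_is_iwarp_transport(const struct sketch_port *p)
{
	return p->transport == SKETCH_TRANSPORT_IWARP;
}

static inline bool rdma_is_iboe(const struct sketch_port *p)
{
	/* IBTA transport carried over an Ethernet link. */
	return rdma_is_ib_transport(p) && p->link_layer == SKETCH_LINK_ETH;
}

/* rdma_has_*(): which management features the port offers. */
static inline bool rdma_has_gsi(const struct sketch_port *p)
{
	/* Both IB and RoCE have a GSI. */
	return rdma_is_ib_transport(p);
}

static inline bool rdma_has_sa(const struct sketch_port *p)
{
	/* RoCE has a GSI but no SA, so SA needs an Infiniband link. */
	return rdma_is_ib_transport(p) && p->link_layer == SKETCH_LINK_IB;
}

static inline bool rdma_has_smi(const struct sketch_port *p)
{
	/* Assumption: SMI is only present on an Infiniband link layer. */
	return rdma_is_ib_transport(p) && p->link_layer == SKETCH_LINK_IB;
}

/* Usage example: three ports that differ in link layer and transport. */
void rdma_helpers_selfcheck(void)
{
	const struct sketch_port ib = {
		.link_layer = SKETCH_LINK_IB,
		.transport = SKETCH_TRANSPORT_IBTA,
		.node_type = SKETCH_NODE_CA,
	};
	const struct sketch_port roce = {
		.link_layer = SKETCH_LINK_ETH,
		.transport = SKETCH_TRANSPORT_IBTA,
		.node_type = SKETCH_NODE_CA,
	};
	const struct sketch_port iwarp = {
		.link_layer = SKETCH_LINK_ETH,
		.transport = SKETCH_TRANSPORT_IWARP,
		.node_type = SKETCH_NODE_CA,
	};

	assert(rdma_has_smi(&ib) && rdma_has_gsi(&ib) && rdma_has_sa(&ib));
	assert(rdma_is_iboe(&roce) && rdma_has_gsi(&roce) && !rdma_has_sa(&roce));
	assert(rdma_is_iwarp_transport(&iwarp) && !rdma_has_gsi(&iwarp));
}
```

The point of the sketch is that callers ask about a feature (rdma_has_sa()) or an identity (rdma_is_iboe()) and never compare link layer and transport themselves, so a new technology only touches the helper bodies.
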
Putting it all together
==================
We are left with the following helpers:
- rdma_is_ib_transport()
- rdma_is_iwarp_transport()
- rdma_is_usnic_transport()
- rdma_is_iboe()
- rdma_has_mad()
- rdma_has_smi()
- rdma_has_gsi() - complements smi; can be used by the mad code for clarity
- rdma_has_sa()
- rdma_has_cm()
- rdma_has_mcast()

> Subject: [PATCH v5 00/27] IB/Verbs: IB Management Helpers
>
>
> Since v4:
>   * Thanks for the comments from Hal, Sean, Tom, Or Gerlitz, Jason,
>     Roland, Ira and Steve :-) Please remind me if anything missed :-P
>   * Fix logical issue inside 3#, 14#
>   * Refine 3#, 4#, 5# with label 'free'
>   * Rework 10# to stop using port 1 when port already assigned
>
> There are plenty of lengthy code to check the transport type of IB device, or
> the link layer type of it's port, but actually we are just speculating whether a
> particular management/feature is supported by the device/port.
>
> Thus instead of inferring, we should have our own mechanism for IB
> management capability/protocol/feature checking, several proposals below.
>
> This patch set will reform the method of getting transport type, we will now
> using query_transport() instead of inferring from transport and link layer
> respectively, also we defined the new transport type to make the concept
> more reasonable.
>
> Mapping List:
>             node-type   link-layer  old-transport  new-transport
> nes         RNIC        ETH         IWARP          IWARP
> amso1100    RNIC        ETH         IWARP          IWARP
> cxgb3       RNIC        ETH         IWARP          IWARP
> cxgb4       RNIC        ETH         IWARP          IWARP
> usnic       USNIC_UDP   ETH         USNIC_UDP      USNIC_UDP
> ocrdma      IB_CA       ETH         IB             IBOE
> mlx4        IB_CA       IB/ETH      IB             IB/IBOE
> mlx5        IB_CA       IB          IB             IB
> ehca        IB_CA       IB          IB             IB
> ipath       IB_CA       IB          IB             IB
> mthca       IB_CA       IB          IB             IB
> qib         IB_CA       IB          IB             IB
>
> For example:
>   if (transport == IB) && (link-layer == ETH) will now become:
>   if (query_transport() == IBOE)
>
> Thus we will be able to get rid of the respective transport and link-layer
> checking, and it will help us to add new protocol/Technology (like OPA) more
> easier, also with the introduced management helpers, IB management logical
> will be more clear and easier for extending.
>
> Highlights:
>   The patch set covered a wide range of IB stuff, thus for those who are
>   familiar with the particular part, your suggestion would be invaluable ;-)
>
>   Patch 1#~15# included all the logical reform, 16#~25# introduced the
>   management helpers, 26#~27# do clean up.
>
>   Patches haven't been tested yet, we appreciate if any one who have these
>   HW willing to provide his Tested-by :-)
>
>   Doug suggested the bitmask mechanism:
>     https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
>   which could be the plan for future reforming, we prefer that to be another
>   series which focus on semantic and performance.
>
>   This patch-set is somewhat 'bloated' now and it may be a good timing for
>   staging, I'd like to suggest we focus on improving existed helpers and push
>   all the further reforms into next series ;-)
>
> Proposals:
>   Sean:
>     https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23339.html
>   Doug:
>     https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23418.html
>     https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23765.html
>   Jason:
>     https://www.mail-archive.com/linux-rdma@vger.kernel.org/msg23425.html
>
> Michael Wang (27):
>   IB/Verbs: Implement new callback query_transport()
>   IB/Verbs: Implement raw management helpers
>   IB/Verbs: Reform IB-core mad/agent/user_mad
>   IB/Verbs: Reform IB-core cm
>   IB/Verbs: Reform IB-core sa_query
>   IB/Verbs: Reform IB-core multicast
>   IB/Verbs: Reform IB-ulp ipoib
>   IB/Verbs: Reform IB-ulp xprtrdma
>   IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs
>   IB/Verbs: Reform cm related part in IB-core cma/ucm
>   IB/Verbs: Reform route related part in IB-core cma
>   IB/Verbs: Reform mcast related part in IB-core cma
>   IB/Verbs: Reserve legacy transport type in 'dev_addr'
>   IB/Verbs: Reform cma_acquire_dev()
>   IB/Verbs: Reform rest part in IB-core cma
>   IB/Verbs: Use management helper cap_ib_mad()
>   IB/Verbs: Use management helper cap_ib_smi()
>   IB/Verbs: Use management helper cap_ib_cm()
>   IB/Verbs: Use management helper cap_iw_cm()
>   IB/Verbs: Use management helper cap_ib_sa()
>   IB/Verbs: Use management helper cap_ib_mcast()
>   IB/Verbs: Use management helper cap_ipoib()
>   IB/Verbs: Use management helper cap_read_multi_sge()
>   IB/Verbs: Use management helper cap_af_ib()
>   IB/Verbs: Use management helper cap_eth_ah()
>   IB/Verbs: Clean up rdma_ib_or_iboe()
>   IB/Verbs: Cleanup rdma_node_get_transport()
>
> ---
>  drivers/infiniband/core/agent.c              |   4
>  drivers/infiniband/core/cm.c                 |  26 +-
>  drivers/infiniband/core/cma.c                | 328 ++++++++++++---------------
>  drivers/infiniband/core/device.c             |   1
>  drivers/infiniband/core/mad.c                |  51 ++--
>  drivers/infiniband/core/multicast.c          |  18 -
>  drivers/infiniband/core/sa_query.c           |  41 +--
>  drivers/infiniband/core/sysfs.c              |   8
>  drivers/infiniband/core/ucm.c                |   5
>  drivers/infiniband/core/ucma.c               |  27 --
>  drivers/infiniband/core/user_mad.c           |  32 +-
>  drivers/infiniband/core/uverbs_cmd.c         |   6
>  drivers/infiniband/core/verbs.c              |  33 --
>  drivers/infiniband/hw/amso1100/c2_provider.c |   7
>  drivers/infiniband/hw/cxgb3/iwch_provider.c  |   7
>  drivers/infiniband/hw/cxgb4/provider.c       |   7
>  drivers/infiniband/hw/ehca/ehca_hca.c        |   6
>  drivers/infiniband/hw/ehca/ehca_iverbs.h     |   3
>  drivers/infiniband/hw/ehca/ehca_main.c       |   1
>  drivers/infiniband/hw/ipath/ipath_verbs.c    |   7
>  drivers/infiniband/hw/mlx4/main.c            |  10
>  drivers/infiniband/hw/mlx5/main.c            |   7
>  drivers/infiniband/hw/mthca/mthca_provider.c |   7
>  drivers/infiniband/hw/nes/nes_verbs.c        |   6
>  drivers/infiniband/hw/ocrdma/ocrdma_main.c   |   1
>  drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |   6
>  drivers/infiniband/hw/ocrdma/ocrdma_verbs.h  |   3
>  drivers/infiniband/hw/qib/qib_verbs.c        |   7
>  drivers/infiniband/hw/usnic/usnic_ib_main.c  |   1
>  drivers/infiniband/hw/usnic/usnic_ib_verbs.c |   6
>  drivers/infiniband/hw/usnic/usnic_ib_verbs.h |   2
>  drivers/infiniband/ulp/ipoib/ipoib_main.c    |  17 -
>  include/rdma/ib_verbs.h                      | 204 +++++++++++++++-
>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c      |   6
>  net/sunrpc/xprtrdma/svc_rdma_transport.c     |  51 +---
>  35 files changed, 584 insertions(+), 368 deletions(-)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the
> body of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html