Date: Fri, 29 Jul 2022 16:12:32 -0300
From: Jason Gunthorpe
To: Long Li
Cc: Dexuan Cui, KY Srinivasan, Haiyang Zhang, Stephen Hemminger, Wei Liu, "David S.
Miller", Jakub Kicinski, Paolo Abeni, Leon Romanovsky, "edumazet@google.com", "shiraz.saleem@intel.com", Ajay Sharma, "linux-hyperv@vger.kernel.org", "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org", "linux-rdma@vger.kernel.org"
Subject: Re: [Patch v4 03/12] net: mana: Handle vport sharing between devices
References: <1655345240-26411-1-git-send-email-longli@linuxonhyperv.com> <1655345240-26411-4-git-send-email-longli@linuxonhyperv.com> <20220720234209.GP5049@ziepe.ca> <20220721143858.GV5049@ziepe.ca> <20220721183219.GA6833@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 29, 2022 at 06:44:22PM +0000, Long Li wrote:
> > Subject: Re: [Patch v4 03/12] net: mana: Handle vport sharing between devices
> >
> > On Thu, Jul 21, 2022 at 05:58:39PM +0000, Long Li wrote:
> > > > > "vport" is a hardware resource that can either be used by an
> > > > > Ethernet device, or an RDMA device. But it can't be used by both
> > > > > at the same time. The "vport" is associated with a protection
> > > > > domain and doorbell, it's programmed in the hardware. Outgoing
> > > > > traffic is enforced on this vport based on how it is programmed.
> > > >
> > > > Sure, but how is the users problem to "get this configured right"
> > > > and what exactly is the user supposed to do?
> > > >
> > > > I would expect the allocation of HW resources to be completely
> > > > transparent to the user. Why is it not?
> > > >
> > >
> > > In the hardware, RDMA RAW_QP shares the same hardware resource (in
> > > this case, the vPort in hardware table) with the ethernet NIC. When an
> > > RDMA user creates a RAW_QP, we can't just shut down the ethernet. The
> > > user is required to make sure the ethernet is not in use when he
> > > creates this QP type.
> >
> > You haven't answered my question - how is the user supposed to achieve this?
>
> The user needs to configure the network interface so the kernel will not use it when the user creates a RAW QP on this port.
>
> This can be done via system configuration to not bring this
> interface online on system boot, or equivalently doing "ifconfig xxx
> down" to make the interface down when creating a RAW QP on this
> port.

That sounds horrible, why allow the user to even bind two drivers if
the two drivers can't be used together?

> > And now I also want to know why the ethernet device and rdma device can even
> > be loaded together if they cannot share the physical port?
> > Exclusivity is not a sharing model that any driver today implements.
>
> This physical port limitation only applies to the RAW QP. For RC QP,
> the hardware doesn't have this limitation. The user can create RC
> QPs on a physical port up to the hardware limits independent of the
> Ethernet usage on the same port.

.. and it is because you support sharing models in other cases :\

> Scenario 1: The Ethernet loses TCP connection.
> 1. User A runs a program listening on a TCP port, accepts an incoming
> TCP connection and is communicating with the remote peer over this
> TCP connection.
> 2. User B creates an RDMA RAW_QP on the same port on the device.
> 3. As soon as the RAW_QP is created, the program in 1 can't
> send/receive data over this TCP connection. After some period of
> inactivity, the TCP connection terminates.

It is a little more complicated than that, but yes, that could
possibly happen if the userspace captures the right traffic.
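
As a rough illustration of the exclusivity model being debated above (a vport that either the Ethernet device or an RDMA RAW_QP may hold, but never both), here is a minimal sketch. The `Vport` class, its method names, and the owner strings are hypothetical and are not taken from the mana driver or the patch; the sketch only shows the "second claimant gets EBUSY" behaviour.

```python
import errno
import threading


class Vport:
    """Hypothetical model of a vport that allows a single owner at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None  # e.g. "ethernet" or "rdma_raw_qp" (illustrative names)

    def acquire(self, owner):
        """Claim the vport; raise OSError(EBUSY) if it is already owned."""
        with self._lock:
            if self.owner is not None:
                raise OSError(errno.EBUSY, f"vport in use by {self.owner}")
            self.owner = owner

    def release(self, owner):
        """Release the vport, but only if `owner` is the current holder."""
        with self._lock:
            if self.owner == owner:
                self.owner = None


vport = Vport()
vport.acquire("ethernet")           # the netdev is brought up first
try:
    vport.acquire("rdma_raw_qp")    # RAW_QP creation is refused while Ethernet holds it
except OSError as exc:
    print("RAW_QP refused:", exc)
vport.release("ethernet")           # roughly the "ifconfig xxx down" step
vport.acquire("rdma_raw_qp")        # now the RAW_QP can claim the vport
```

In this model the user has to release one consumer before the other can start, which is exactly the manual step being questioned in the thread.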

> Please note that this may also pose a security risk. User B with
> RAW_QP can potentially hijack this TCP connection from the kernel by
> framing the correct Ethernet packets and send over this QP to trick
> the remote peer, making it believe it's User A.

Any root user can do this with the netstack using e.g. tcpdump, bpf,
XDP, raw sockets, etc. This is why the capability is guarded by
CAP_NET_RAW. It is nothing unusual.

> Scenario 2: The Ethernet port state changes after RDMA RAW_QP is used on the port.
> 1. User uses "ifconfig ethx down" on the NIC, intending to make it offline
> 2. User creates a RDMA RAW_QP on the same port on the device.
> 3. User destroys this RAW_QP.
> 4. The ethx device in 1 reports carrier state in step 2, in many
> Linux distributions this makes it online without user
> interaction. "ifconfig ethx" shows its state changes to "up".

This I'm not familiar with, it actually sounds like a bug that the
RAW_QPs interfere with the netdev carrier state.

> the Mellanox NICs implement the RAW_QP. IMHO, it's better to have
> the user explicitly decide whether to use Ethernet or RDMA RAW_QP on
> a specific port.

It should all be carefully documented someplace.

Jason
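
On the CAP_NET_RAW point above: a minimal sketch of how that guard shows up from userspace, assuming a Linux host. Creating an `AF_PACKET` raw socket fails with a permission error unless the process has CAP_NET_RAW. The helper name `can_open_raw_packet_socket` is made up for this example; it only probes the packet-socket path and says nothing about how RDMA RAW_QP creation is checked.

```python
import socket


def can_open_raw_packet_socket():
    """Return True if this process is allowed to create an AF_PACKET raw socket."""
    af_packet = getattr(socket, "AF_PACKET", None)
    if af_packet is None:
        return False  # AF_PACKET is Linux-specific; treat other platforms as "no"
    try:
        # Protocol 0 means the socket receives nothing, but creating it
        # still requires CAP_NET_RAW.
        sock = socket.socket(af_packet, socket.SOCK_RAW)
    except OSError:
        # Includes PermissionError (EPERM) when CAP_NET_RAW is missing.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("raw packet sockets allowed:", can_open_raw_packet_socket())
```

Run as an unprivileged user this would normally report that raw packet sockets are not allowed, while root (or a process granted CAP_NET_RAW) would normally get the opposite result.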