Received: by 10.192.165.148 with SMTP id m20csp1190098imm; Thu, 10 May 2018 07:02:35 -0700 (PDT) X-Google-Smtp-Source: AB8JxZpLC/ioNk1JaoOOe1s6z8Bpd457XaEhLDccm3s+99sfZAkBV+h3XbQyjQ9jbdFReBVKmZ8I X-Received: by 2002:a62:5d4a:: with SMTP id r71-v6mr1503170pfb.147.1525960955356; Thu, 10 May 2018 07:02:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1525960955; cv=none; d=google.com; s=arc-20160816; b=RVc6c3MCwABFTNjsMpg4jQXhHpp7TUB9oqKrdrBBlLFrChDXeNPjMn8psmGACeqkKm U0Rxwxzr52RV0HANyBi2VEHdWao5TeglzW4LU+yo3bzvNtZwt/z1eDHKNpDfVZCb/Q/L hH3jcXsM9S1IcrrGG8EaAEZ+fY1rh3ji5XXFTkHWVCqbPTe6p9Wlq7ygQtg5dx4ZY1Oz WBJ2ow3lZVeeHAaBfCStNw8A96BiA1nJZM7Th9Olzz4V2gkM3KtnJC2b5/hAj79oCLZP 7UqH4036F0wqSbgRW9w2FBUZUo5oOsHE6i6UNtLgRnPjDSzVGYbQ+x5AabgB2jnYV0kJ BgNQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=dhl4gTyoSdki8Ai7+oCjO6nEUDwcO7Jrt8vdE3WMgGE=; b=lIouOLT6wGKzEV2Dbmm9iAaF3MOsDZCkUY/IoH50s0DjxFI/r8mVDZ2IwxSzj/foDg e3vBZchdeno1Cg8yda+m9exFaXCG3c8g0Lh61LqaKj8KYhdzPFg60bE+s6zxFwvhXydk t9ZyRIRlfCsWlv7H1YzTVdmRsdWUGaW+bVpFh3QEUeiJdlozeLelmVt58YzffJllxdSC mzxYnl+sR/VnuoiHUtwkGWDB54fs/6FBFkArImmSWYlC9HnVZKvbC8Deto9oN4WZqfh/ ojzqlIF/lGWXGRQEJea+P6C3rO5nmXJ2p3w3zONrutcx4jyohknVMRwebEw0nR/QMRi7 j9Dw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@dev-mellanox-co-il.20150623.gappssmtp.com header.s=20150623 header.b=uunDxovU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mellanox.co.il Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q6-v6si746489pgp.258.2018.05.10.07.02.07; Thu, 10 May 2018 07:02:35 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@dev-mellanox-co-il.20150623.gappssmtp.com header.s=20150623 header.b=uunDxovU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mellanox.co.il Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757414AbeEJOB3 (ORCPT + 99 others); Thu, 10 May 2018 10:01:29 -0400 Received: from mail-qt0-f195.google.com ([209.85.216.195]:38776 "EHLO mail-qt0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757385AbeEJOBZ (ORCPT ); Thu, 10 May 2018 10:01:25 -0400 Received: by mail-qt0-f195.google.com with SMTP id m9-v6so2653347qtb.5 for ; Thu, 10 May 2018 07:01:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dev-mellanox-co-il.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=dhl4gTyoSdki8Ai7+oCjO6nEUDwcO7Jrt8vdE3WMgGE=; b=uunDxovUFZDnJncV0soo3WtiSPdAkXWY6qunfhlv0w81glbi9GFJjxHstPHVTJevrP JwQJ5ylr4Nd6LKSztYOA0tlKLj6izKbJ23fDF3+dpat5BCSVNignHy/3GFAE1dC807yd gJG7HTtKj8FCHhOMsoSnF+u/+TzruLAelAVqVxou/G6hs7mxSp0oOAc7dhIt01P5AVEH 7TL4srEeD8mymE8dAlvPZREPRJKLyI4JOQEu/bqM9gmVqgziosuSPw31LkMFPXWCkJkA FFf6YsU1TvB+35jmAWm25c6nXy9Jt4xt7G3rcvLTfNPEKXXdTZsSIxRsPojRE/kNgZdh h4ZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=dhl4gTyoSdki8Ai7+oCjO6nEUDwcO7Jrt8vdE3WMgGE=; b=o1KAsoMBUS7UFbhaxpxWVXb/OUBq7hxmaTTbnQFM/Zh514DhQM8tDhgG+gZd2VGZNM IQTKjwHS4y+m8jTsLIXuV7VOyQb7SPp3otioZ8t9CdezztK8j5fzqq9hoCqZwsGRWnfI KcB0MZH7SdjIrtPjvaiM2sHwA72DobII/owuuXm4DnWXMW9CQZGe7Gt5UicRjEJzesvO b4HBaohMnuQ5rUBwdoso3f26oe5MR3NUp8fJ6rCtEbLEo9gJghz9IR2yOUF7AJtB/GWJ phIo9pcX0rVDplv387JHBQLYO4uld1jsIUrvUzNu6Yg7Xt7mNlxXxkwl5IvKNZpb6/n0 8vxQ== X-Gm-Message-State: ALKqPwcvOH/lxrKRnfDbyy4aUIZ5Lu6mVKURb6bcgFh+kOIOAmRhPlsR rbjOg5edPdLvlZT3Lxc9ooCWmhqG X-Received: by 2002:a0c:946c:: with SMTP id i41-v6mr1357965qvi.139.1525960884524; Thu, 10 May 2018 07:01:24 -0700 (PDT) Received: from [192.168.1.183] (c-73-182-207-166.hsd1.ma.comcast.net. [73.182.207.166]) by smtp.googlemail.com with ESMTPSA id x28-v6sm638866qtx.95.2018.05.10.07.01.23 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 10 May 2018 07:01:23 -0700 (PDT) Subject: Re: [PATCH IB/core 2/2] IB/cm: Send authentic pkey in REQ msg and check eligibility of the pkeys To: =?UTF-8?Q?H=c3=a5kon_Bugge?= Cc: Doug Ledford , Don Hiatt , Ira Weiny , Sean Hefty , OFED mailing list , linux-kernel@vger.kernel.org References: <20180509093020.24503-1-Haakon.Bugge@oracle.com> <20180509093020.24503-3-Haakon.Bugge@oracle.com> From: Hal Rosenstock Message-ID: <3bee76df-49a6-cf3c-6df4-749a6309358e@dev.mellanox.co.il> Date: Thu, 10 May 2018 10:01:22 -0400 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/10/2018 5:16 AM, Håkon Bugge wrote: > > >> On 9 May 2018, at 13:28, Hal Rosenstock wrote: >> >> On 5/9/2018 5:30 AM, Håkon Bugge wrote: >>> There is no point in using RDMA CM to establish a connection between >>> two QPs that cannot possible communicate. Particularly, if both the >>> active and passive side use limited pkeys, they are not able to >>> communicate. >>> >>> In order to detect this situation, the authentic pkey is used in the >>> CM REQ message. The authentic pkey is the one that the HCA inserts >>> into the BTH in the IB packets. >>> >>> When the passive side receives the REQ, commit ("ib_core: A full pkey >>> is required to match a limited one") ensures that >>> ib_find_matched_cached_pkey() fails unless at least one of the pkeys >>> compared has the full-member bit. >>> >>> In the limited-to-limited case, this will prohibit the connection to >>> be formed, and thus, Pkey Violation Traps will not be sent to the SM. >>> >>> Signed-off-by: Håkon Bugge >>> --- >>> drivers/infiniband/core/cm.c | 39 ++++++++++++++++++++++++++++++++------- >>> include/rdma/ib_cm.h | 4 +++- >>> 2 files changed, 35 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c >>> index a92e1a5c202b..52ed51d5bd2a 100644 >>> --- a/drivers/infiniband/core/cm.c >>> +++ b/drivers/infiniband/core/cm.c >>> @@ -3,6 +3,7 @@ >>> * Copyright (c) 2004 Topspin Corporation. All rights reserved. >>> * Copyright (c) 2004, 2005 Voltaire Corporation. All rights reserved. >>> * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved. >>> + * Copyright (c) 2018 Oracle and/or its affiliates. All rights reserved. >>> * >>> * This software is available to you under a choice of one of two >>> * licenses. You may choose to be licensed under the terms of the GNU >>> @@ -91,6 +92,7 @@ static const char * const ibcm_rej_reason_strs[] = { >>> [IB_CM_REJ_INVALID_CLASS_VERSION] = "invalid class version", >>> [IB_CM_REJ_INVALID_FLOW_LABEL] = "invalid flow label", >>> [IB_CM_REJ_INVALID_ALT_FLOW_LABEL] = "invalid alt flow label", >>> + [IB_CM_REJ_INVALID_PKEY] = "invalid PKey", >> >> If this patch goes ahead, IBA spec for CM should be updated to include this. > > Sure, I see: > > 33 Invalid Alternate Flow Label > > as the latest in the spec. This appears to be an implementation rather than architectural issue. If there is limited <-> limited CM, then the MAD would never get up to the CM layer as it would be filtered by end port pkey enforcement. This is only needed to protect against current code which can send on the full rather than limited pkey (in BTH), right ? >> >>> }; >>> >>> const char *__attribute_const__ ibcm_reject_msg(int reason) >>> @@ -518,8 +520,8 @@ static int cm_init_av_by_path(struct sa_path_rec *path, struct cm_av *av, >>> return -EINVAL; >>> cm_dev = port->cm_dev; >>> >>> - ret = ib_find_cached_pkey(cm_dev->ib_device, port->port_num, >>> - be16_to_cpu(path->pkey), &av->pkey_index); >>> + ret = ib_find_matched_cached_pkey(cm_dev->ib_device, port->port_num, >>> + be16_to_cpu(path->pkey), &av->pkey_index); >>> if (ret) >>> return ret; >>> >>> @@ -1241,7 +1243,7 @@ static void cm_format_req(struct cm_req_msg *req_msg, >>> cm_req_set_starting_psn(req_msg, cpu_to_be32(param->starting_psn)); >>> cm_req_set_local_resp_timeout(req_msg, >>> param->local_cm_response_timeout); >>> - req_msg->pkey = param->primary_path->pkey; >>> + req_msg->pkey = cpu_to_be16(cm_id_priv->pkey); >>> cm_req_set_path_mtu(req_msg, param->primary_path->mtu); >>> cm_req_set_max_cm_retries(req_msg, param->max_cm_retries); >>> >>> @@ -1396,7 +1398,23 @@ int ib_send_cm_req(struct ib_cm_id *cm_id, >>> cm_id_priv->responder_resources = param->responder_resources; >>> cm_id_priv->retry_count = param->retry_count; >>> cm_id_priv->path_mtu = param->primary_path->mtu; >>> - cm_id_priv->pkey = param->primary_path->pkey; >>> + >>> + /* >>> + * We want to send the pkey used in the BTH in packets >>> + * sent. This, in order for the passive side to determine if >>> + * communication is permitted by the respective pkeys. >>> + * >>> + * The pkey in the paths are derived from the MGID, which has >>> + * the full membership bit set. Hence, we retrieve the pkey by >>> + * using the address vector's pkey_index. >> >> The paths usually come from the SM and I don't expect SM to provide path >> between ports of only limited members of partition. > > In my case, it does. Which OpenSM flavor are you talking about (upstream, MLNX, or other) ? What is OpenSM version string ? Is allow_both_pkeys TRUE or FALSE ? >> Default ACM provider >> forms path from multicast group parameters including pkey. Is that the >> scenario of concern ? > > Also RDMA CM does that. Do an ibdump of a CM REQ message sent from a limited port, and you will see the PKey is the full member in the CM REQ msg. Not exactly sure what you mean by RDMA CM does this as there are a number of different ways to use RDMA CM. Are you referring to using IPoIB for address resolution ? Another way is for a path record to be passed in from user space and there is no control over it's contents. At a minimum, I think the comment mentioning MGID should be clarified. >> If so, I still don't fully understand the scenario >> because limited members are not supposed to be part of a multicast >> group. There was some work started to extend this for client/server >> model but it was never completed. However, there may be hole(s) in >> various components of implementation which open(s) this door. > > I view OpenSM not returning a valid path between two limited members an orthogonal issue, as OpenSM is another component. Agreed. I think that there are other components too... > I think the CM REQ message should contain the correct PKey (fixed by this patch series). > > And in the event the passive side being a limited member and receives a CM REQ with a limited PKey, that connection should not be formed (fixed by this patch series). This is the backward compatibility for the current behavior on the active side so that REQ gets rejected rather than accepted, right ? In general, the approach used when there is pkey mismatch is to silently drop packet rather than respond which causes a timeout on the requester side. In this specific case, I'm not sure which approach is better. Also, is there a similar issue with SIDR_REQ ? > Read me correct, I am also in favour of fixing the OpenSM to not return a valid (but useless) path record in this case. Understood and agreed. -- Hal > > Thxs, Håkon > > >> >> -- Hal >> >>> + */ >>> + ret = ib_get_cached_pkey(cm_id_priv->id.device, >>> + cm_id_priv->av.port->port_num, >>> + cm_id_priv->av.pkey_index, >>> + &cm_id_priv->pkey); >>> + if (ret) >>> + goto error1; >>> + >>> cm_id_priv->qp_type = param->qp_type; >>> >>> ret = cm_alloc_msg(cm_id_priv, &cm_id_priv->msg); >>> @@ -1956,16 +1974,19 @@ static int cm_req_handler(struct cm_work *work) >>> cm_id_priv); >>> if (ret) { >>> int err; >>> + int rej_reason = (ret == -ENOENT ? >>> + IB_CM_REJ_INVALID_PKEY : >>> + IB_CM_REJ_INVALID_GID); >>> >>> err = ib_get_cached_gid(work->port->cm_dev->ib_device, >>> work->port->port_num, 0, >>> &work->path[0].sgid, >>> NULL); >>> if (err) >>> - ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID, >>> + ib_send_cm_rej(cm_id, rej_reason, >>> NULL, 0, NULL, 0); >>> else >>> - ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID, >>> + ib_send_cm_rej(cm_id, rej_reason, >>> &work->path[0].sgid, >>> sizeof(work->path[0].sgid), >>> NULL, 0); >>> @@ -1975,7 +1996,11 @@ static int cm_req_handler(struct cm_work *work) >>> ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av, >>> cm_id_priv); >>> if (ret) { >>> - ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_ALT_GID, >>> + int rej_reason = (ret == -ENOENT ? >>> + IB_CM_REJ_INVALID_PKEY : >>> + IB_CM_REJ_INVALID_ALT_GID); >>> + >>> + ib_send_cm_rej(cm_id, rej_reason, >>> &work->path[0].sgid, >>> sizeof(work->path[0].sgid), NULL, 0); >>> goto rejected; >>> diff --git a/include/rdma/ib_cm.h b/include/rdma/ib_cm.h >>> index 7979cb04f529..56b62303946a 100644 >>> --- a/include/rdma/ib_cm.h >>> +++ b/include/rdma/ib_cm.h >>> @@ -3,6 +3,7 @@ >>> * Copyright (c) 2004 Topspin Corporation. All rights reserved. >>> * Copyright (c) 2004 Voltaire Corporation. All rights reserved. >>> * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved. >>> + * Copyright (c) 2018 Oracle and/or its affiliates. All rights reserved. >>> * >>> * This software is available to you under a choice of one of two >>> * licenses. You may choose to be licensed under the terms of the GNU >>> @@ -183,7 +184,8 @@ enum ib_cm_rej_reason { >>> IB_CM_REJ_DUPLICATE_LOCAL_COMM_ID = 30, >>> IB_CM_REJ_INVALID_CLASS_VERSION = 31, >>> IB_CM_REJ_INVALID_FLOW_LABEL = 32, >>> - IB_CM_REJ_INVALID_ALT_FLOW_LABEL = 33 >>> + IB_CM_REJ_INVALID_ALT_FLOW_LABEL = 33, >>> + IB_CM_REJ_INVALID_PKEY = 34, >>> }; >>> >>> struct ib_cm_rej_event_param { >>> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > >