Received: by 2002:a05:6602:18e:0:0:0:0 with SMTP id m14csp2871654ioo; Tue, 24 May 2022 07:49:55 -0700 (PDT) X-Google-Smtp-Source: ABdhPJydEdlX5a3gh39OYvWk/z1DL2XMPRzQQW6nAPBILwtJ/rM05abE3XU8dU3+nhlbfjYXqryo X-Received: by 2002:a17:902:a9c4:b0:15f:90c:12f6 with SMTP id b4-20020a170902a9c400b0015f090c12f6mr27581425plr.115.1653403795134; Tue, 24 May 2022 07:49:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1653403795; cv=none; d=google.com; s=arc-20160816; b=Abe/liWu0KDt6GU2+glmmbNbqj26zE/PBmtZugVaYEjDYtP52VS8lJA97BpbGzOcV9 snr169zFjRr2WgrR4/oJDiCgYfu9TR7FKh/NFxzg/b9M6ySGiCKOPkQ2dNzFRzijMepQ hjyKggHngoARwZ3skZXfjBffGNC++1QM6eJ9lHG1AqPpCeg6EOJseEx5+G8GG3BFz0M2 aiBsBv8SGfD5+UrIhihZAsOjn7xgDy+wUw/Ut0TG/H5KfgHVl4BZYG+1IqQkL659kDpB Zv6f5ri3zMFOVqzdlx+KgTkAqhFdkLIv6BRvsxmpXbvnZ+iUhgRckuUZ/KKqP5j6jHGa i8YA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=QnYq5/1KDN6nj1Cx6w4BELQSxaZlmCHFI8UJEJylCow=; b=xKJinR0Y7mMzXtgdnIgGpxwjzj3LnzuiCzMvpq1gHmiW0Uz108le/yxMtv0WkYFQ/E Rm/+ZIWvX8WVPxfoO2gw2n8lVHeYvxKiA3KHfgBaKNFrs2TY9vHLlnzSeR2NA6OyRifk kOXBJh+NPSbbBYTGlDKqxDzVtkJLt7jrSTgdrsk+vCe3EdHwaY3XtmB0V0YNPnp15PRo 9aO4PJSSXn1DHulICSIOqoA6HBk78ybtXA8V+u2xuiUgu9Z0IUPKVj8WPzdxbz/Vy/ku DMlg0x+qcK9TLFd82V+ua17mhz1/4FKw4+lY4wC4BRfaGA7V7W+dMbn/DBumOyreY/R+ VLgw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ionos.com header.s=google header.b=UQ8wuLN7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ionos.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x11-20020a656aab000000b003c5a1baf6acsi15387310pgu.503.2022.05.24.07.49.42; Tue, 24 May 2022 07:49:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@ionos.com header.s=google header.b=UQ8wuLN7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=ionos.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235567AbiEXK5B (ORCPT + 99 others); Tue, 24 May 2022 06:57:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47388 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236076AbiEXK44 (ORCPT ); Tue, 24 May 2022 06:56:56 -0400 Received: from mail-lj1-x231.google.com (mail-lj1-x231.google.com [IPv6:2a00:1450:4864:20::231]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B35262CDD7 for ; Tue, 24 May 2022 03:56:53 -0700 (PDT) Received: by mail-lj1-x231.google.com with SMTP id m11so8973446ljc.1 for ; Tue, 24 May 2022 03:56:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ionos.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=QnYq5/1KDN6nj1Cx6w4BELQSxaZlmCHFI8UJEJylCow=; b=UQ8wuLN7bK6Vkz/LJwvBZdh47U3rso8jMnZ16DG/fEXfBUWQXqGQtUtwIz+oymUMcc i/wwSMdwbJ80cOHGa/lttWrZeZEvVyg1dynY/O6IlVfYHQZkV1D852EQOzT09QFlLRHk GjzVyC43DhFAPqLJnSORfc57khHbQSpR5Ta6yW55jVZBHjWmZMkaIHpxlgM/6PBO/F6h aIY+IsGnfNw6E52wmaa2t0f4bH5cpRErXOI9qxjtESCjfbvgDUPLLCFoMWJJ17wrf/0Q J01n68osZZJ12ocIJGI6BBU7HWM6hAJwu14PKSJ0LyTwDHWAmCZaxc4m+fz+SPXZL5EK damQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=QnYq5/1KDN6nj1Cx6w4BELQSxaZlmCHFI8UJEJylCow=; b=CSPyxKdkU4/ntKNL1gaeuqssOZRN2zacXmpMMSXQw+YtOvrexDYvhRmfwQtCk5CcsC Y+hL5DlSQn7JJjp1yGmS0RF7F+Mqm3bhaCQH0dN29T9w1upMeg9AWusUvNIjgyNlUwcs Lmj9VbyMcUi1UgsKTl8eaDcZZbEsprYAP3NVypeqDTUq/myAnyJmSDw2zRnfqGkBW7Nj BtkB1qMktUsnX+aluhJ7eZO+Xf3wZp5TpVNcbRV0M99YZbH79wkGOra+sQGBuDayPtRe xeshXwg7yEovplDrUzHlRE/sIEnQLRYjMVZxEB0F97eHq6JBudYFFL7/ButXfSkeyGL/ /d0Q== X-Gm-Message-State: AOAM531JlYYt+fmHECplBrGLEiXi+GG6SdaYbLnIPXbapf22bxWZtgVZ ktwjqvx3X4PQbNNV2RXoNoQzhe/sfKLArPaFA0Y/Eg== X-Received: by 2002:a2e:90c1:0:b0:24f:eca:3eb2 with SMTP id o1-20020a2e90c1000000b0024f0eca3eb2mr15627907ljg.158.1653389811988; Tue, 24 May 2022 03:56:51 -0700 (PDT) MIME-Version: 1.0 References: <20220518043725.771549-1-lizhijian@fujitsu.com> <20220520144511.GA2302907@nvidia.com> <3e3373f5-7b12-a8e8-2d73-c2976b272290@fujitsu.com> In-Reply-To: <3e3373f5-7b12-a8e8-2d73-c2976b272290@fujitsu.com> From: Haris Iqbal Date: Tue, 24 May 2022 12:56:41 +0200 Message-ID: Subject: Re: [PATCH] RDMA/rxe: Use kzalloc() to alloc map_set To: "lizhijian@fujitsu.com" Cc: Jason Gunthorpe , Bob Pearson , Zhu Yanjun , "linux-rdma@vger.kernel.org" , "linux-kernel@vger.kernel.org" , Guoqing Jiang , Aleksei Marov , Jinpu Wang Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 24, 2022 at 6:00 AM lizhijian@fujitsu.com wrote: > > Hi Jason & Bob > CC Guoqing > > @Guoqing, It may correlate with your previous bug report: https://lore.kernel.org/all/20220210073655.42281-1-guoqing.jiang@linux.dev/T/ > > > It's observed that a same MR in rnbd server will trigger below code > path: > -> rxe_mr_init_fast() > |-> alloc map_set() # map_set is uninitialized > |...-> rxe_map_mr_sg() # build the map_set > |-> rxe_mr_set_page() > |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means > # we can access host memory(such rxe_mr_copy) > |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID > |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE, > # but map_set was not built again > |...-> rxe_mr_copy() # kernel crash due to access wild addresses > # that lookup from the map_set > > I draft a patch like below for it, but i wonder if it's rxe's responsibility to do such checking. > Any comments are very welcome. > > > From e9d0bd821f07f5e049027f07b3ce9dc283624201 Mon Sep 17 00:00:00 2001 > From: Li Zhijian > Date: Tue, 24 May 2022 10:56:19 +0800 > Subject: [PATCH] RDMA/rxe: check map_set valid when handle IB_WR_REG_MR > > It's observed that a same MR in rnbd server will trigger below code > path: > -> rxe_mr_init_fast() > |-> alloc map_set() # map_set is uninitialized > |...-> rxe_map_mr_sg() # build the map_set > |-> rxe_mr_set_page() > |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE that means > # we can access host memory(such rxe_mr_copy) > |...-> rxe_invalidate_mr() # mr->state change to FREE from VALID > |...-> rxe_reg_fast_mr() # mr->state change to VALID from FREE, > # but map_set was not built again > |...-> rxe_mr_copy() # kernel crash due to access wild addresses > # that lookup from the map_set > > Signed-off-by: Li Zhijian > --- > drivers/infiniband/sw/rxe/rxe_mr.c | 9 +++++++++ > drivers/infiniband/sw/rxe/rxe_verbs.c | 1 + > drivers/infiniband/sw/rxe/rxe_verbs.h | 1 + > 3 files changed, 11 insertions(+) > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c > index 787c7dadc14f..09673d559c06 100644 > --- a/drivers/infiniband/sw/rxe/rxe_mr.c > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c > @@ -90,6 +90,7 @@ static int rxe_mr_alloc_map_set(int num_map, struct rxe_map_set **setp) > if (!set->map) > goto err_free_set; > > + set->valid = false; > for (i = 0; i < num_map; i++) { > set->map[i] = kmalloc(sizeof(struct rxe_map), GFP_KERNEL); > if (!set->map[i]) > @@ -216,6 +217,7 @@ int rxe_mr_init_user(struct rxe_pd *pd, u64 start, u64 length, u64 iova, > } > > set = mr->cur_map_set; > + set->valid = true; > set->page_shift = PAGE_SHIFT; > set->page_mask = PAGE_SIZE - 1; > > @@ -643,6 +645,7 @@ int rxe_invalidate_mr(struct rxe_qp *qp, u32 rkey) > } > > mr->state = RXE_MR_STATE_FREE; > + mr->cur_map_set->valid = mr->next_map_set->valid = false; > ret = 0; > > err_drop_ref: > @@ -679,12 +682,18 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe) > return -EINVAL; > } > > + if (!mr->next_map_set->valid) { > + pr_warn("%s: map set is not valid\n", __func__); > + return -EINVAL; > + } > + > mr->access = access; > mr->lkey = (mr->lkey & ~0xff) | key; > mr->rkey = (access & IB_ACCESS_REMOTE) ? mr->lkey : 0; > mr->state = RXE_MR_STATE_VALID; > > set = mr->cur_map_set; > + set->valid = false; > mr->cur_map_set = mr->next_map_set; > mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova; > mr->next_map_set = set; > diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c > index 58e4412b1d16..4b7ae2d1d921 100644 > --- a/drivers/infiniband/sw/rxe/rxe_verbs.c > +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c > @@ -992,6 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, > set->page_shift = ilog2(ibmr->page_size); > set->page_mask = ibmr->page_size - 1; > set->offset = set->iova & set->page_mask; > + set->valid = true; > > return n; > } > diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h > index 86068d70cd95..2edf31aab7e1 100644 > --- a/drivers/infiniband/sw/rxe/rxe_verbs.h > +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h > @@ -289,6 +289,7 @@ struct rxe_map { > > struct rxe_map_set { > struct rxe_map **map; > + bool valid; > u64 va; > u64 iova; > size_t length; > -- > 2.31.1 Thanks for posting the description and the patch. We have been facing the exact same issue (only with rxe), and we also realized that to get around this we will have to call ib_map_mr_sg() before every IB_WR_REG_MR wr; even if we are reusing the MR and simply invalidating and re-validating them. In reference to this, we have 2 questions. 1) This change was made in the following commit. commit 647bf13ce944f20f7402f281578423a952274e4a Author: Bob Pearson Date: Tue Sep 14 11:42:06 2021 -0500 RDMA/rxe: Create duplicate mapping tables for FMRs For fast memory regions create duplicate mapping tables so ib_map_mr_sg() can build a new mapping table which is then swapped into place synchronously with the execution of an IB_WR_REG_MR work request. Currently the rxe driver uses the same table for receiving RDMA operations and for building new tables in preparation for reusing the MR. This exposes users to potentially incorrect results. Link: https://lore.kernel.org/r/20210914164206.19768-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson Signed-off-by: Jason Gunthorpe We tried to understand what potential incorrect result that commit message talks about, but were not able to. If someone can through light into this scenario where single mapping table can result into issues, it would be great. 2) We wanted to confirm that, with the above patch, it clearly means that the use-case where we reuse the MR, by simply invalidating and re-validating (IB_WR_REG_MR wr) is a correct one. And there is no requirement saying that ib_map_mr_sg() needs to be done everytime regardless. (We were planning to send this in the coming days, but wanted other discussions to get over. Since the patch got posted and discussion started, we thought better to put this out.) Regards > > > On 23/05/2022 22:02, Li, Zhijian wrote: > > > > on 2022/5/20 22:45, Jason Gunthorpe wrote: > >> On Wed, May 18, 2022 at 12:37:25PM +0800, Li Zhijian wrote: > >>> Below call chains will alloc map_set without fully initializing map_set. > >>> rxe_mr_init_fast() > >>> -> rxe_mr_alloc() > >>> -> rxe_mr_alloc_map_set() > >>> > >>> Uninitialized values inside struct rxe_map_set are possible to cause > >>> kernel panic. > >> If the value is uninitialized then why is 0 an OK value? > >> > >> Would be happier to know the exact value that is not initialized > > > > Well, good question. After re-think of this issue, it seems this patch wasn't the root cause though it made the crash disappear in some extent. > > > > I'm still working on the root cause :) > > > > Thanks > > > > Zhijian > > > > > >> > >> Jason > > > >