Date: Wed, 25 Oct 2023 11:35:07 +0300
From: Leon Romanovsky
To: Qing Huang
Cc: George Kennedy, jgg@ziepe.ca, sd@queasysnail.net, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, Tom Hromatka, Harshit Mogalapalli
Subject: Re: [PATCH v2] mlx5: fix init stage error handling to avoid double free of same QP and UAF
Message-ID: <20231025083507.GB2950466@unreal>
References: <1698170518-4006-1-git-send-email-george.kennedy@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 25, 2023 at 12:15:36AM +0000, Qing Huang wrote:
>
> > -----Original Message-----
> > From: George Kennedy
> > Sent: Tuesday, October 24, 2023 11:02 AM
> > To: leon@kernel.org; jgg@ziepe.ca; sd@queasysnail.net; linux-rdma@vger.kernel.org;
> > linux-kernel@vger.kernel.org; netdev@vger.kernel.org
> > Cc: George Kennedy; Tom Hromatka; Harshit Mogalapalli
> > Subject: [PATCH v2] mlx5: fix init stage error handling to avoid double free of
> > same QP and UAF
> >
> > In the unlikely event that workqueue allocation fails and returns NULL in
> > mlx5_mkey_cache_init(), delete the call to
> > mlx5r_umr_resource_cleanup() (which frees the QP) in
> > mlx5_ib_stage_post_ib_reg_umr_init(). This will avoid attempted double free of
> > the same QP when __mlx5_ib_add() does its cleanup.
> >
> >
> Hi George,
>
> There seems no cleanup function defined for this stage:
>
> STAGE_CREATE(MLX5_IB_STAGE_POST_IB_REG_UMR,
>              mlx5_ib_stage_post_ib_reg_umr_init,
>              NULL),
>
> Do you know where __mlx5_ib_add() does the double free call after the allocation failure?

It is done in the cleanup of MLX5_IB_STAGE_PRE_IB_REG_UMR. Unfortunately, we
have an asymmetric init/release flow for UMRs.

Thanks

>
> Regards,
> Qing
>
> > Syzkaller reported a UAF in ib_destroy_qp_user
> >
> > workqueue: Failed to create a rescuer kthread for wq "mkey_cache": -EINTR
> > infiniband mlx5_0: mlx5_mkey_cache_init:981:(pid 1642): failed to create work queue
> > infiniband mlx5_0: mlx5_ib_stage_post_ib_reg_umr_init:4075:(pid 1642): mr cache init failed -12
> > ==================================================================
> > BUG: KASAN: slab-use-after-free in ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073)
> > Read of size 8 at addr ffff88810da310a8 by task repro_upstream/1642
> >
> > Call Trace:
> > kasan_report (mm/kasan/report.c:590)
> > ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2073)
> > mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
> > __mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4178)
> > mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
> > ...
> >
> > Allocated by task 1642:
> > __kmalloc (./include/linux/kasan.h:198 mm/slab_common.c:1026 mm/slab_common.c:1039)
> > create_qp (./include/linux/slab.h:603 ./include/linux/slab.h:720 ./include/rdma/ib_verbs.h:2795 drivers/infiniband/core/verbs.c:1209)
> > ib_create_qp_kernel (drivers/infiniband/core/verbs.c:1347)
> > mlx5r_umr_resource_init (drivers/infiniband/hw/mlx5/umr.c:164)
> > mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4070)
> > __mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168)
> > mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
> > ...
> >
> > Freed by task 1642:
> > __kmem_cache_free (mm/slub.c:1826 mm/slub.c:3809 mm/slub.c:3822)
> > ib_destroy_qp_user (drivers/infiniband/core/verbs.c:2112)
> > mlx5r_umr_resource_cleanup (drivers/infiniband/hw/mlx5/umr.c:198)
> > mlx5_ib_stage_post_ib_reg_umr_init (drivers/infiniband/hw/mlx5/main.c:4076 drivers/infiniband/hw/mlx5/main.c:4065)
> > __mlx5_ib_add (drivers/infiniband/hw/mlx5/main.c:4168)
> > mlx5r_probe (drivers/infiniband/hw/mlx5/main.c:4402)
> > ...
> >
> > The buggy address belongs to the object at ffff88810da31000 which belongs to
> > the cache kmalloc-2k of size 2048
> > The buggy address is located 168 bytes inside of freed 2048-byte region
> > [ffff88810da31000, ffff88810da31800)
> >
> > The buggy address belongs to the physical page:
> > page:000000003b5e469d refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10da30
> > head:000000003b5e469d order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> > flags: 0x17ffffc0000840(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
> > page_type: 0xffffffff()
> > raw: 0017ffffc0000840 ffff888100042f00 ffffea0004180800 dead000000000002
> > raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
> > page dumped because: kasan: bad access detected
> >
> > Memory state around the buggy address:
> >  ffff88810da30f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >  ffff88810da31000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > >ffff88810da31080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >                                   ^
> >  ffff88810da31100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> >  ffff88810da31180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> > ==================================================================
> > Disabling lock debugging due to kernel taint
> >
> > Fixes: 04876c12c19e ("RDMA/mlx5: Move init and cleanup of UMR to umr.c")
> > Reported-by: syzkaller
> > Suggested-by: Leon Romanovsky
> > Signed-off-by: George Kennedy
> > ---
> > v2: went with fix suggested by: Leon Romanovsky
> >
> >  drivers/infiniband/hw/mlx5/main.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
> > index 555629b7..5d963ab 100644
> > --- a/drivers/infiniband/hw/mlx5/main.c
> > +++ b/drivers/infiniband/hw/mlx5/main.c
> > @@ -4071,10 +4071,8 @@ static int mlx5_ib_stage_post_ib_reg_umr_init(struct mlx5_ib_dev *dev)
> >  		return ret;
> >
> >  	ret = mlx5_mkey_cache_init(dev);
> > -	if (ret) {
> > +	if (ret)
> >  		mlx5_ib_warn(dev, "mr cache init failed %d\n", ret);
> > -		mlx5r_umr_resource_cleanup(dev);
> > -	}
> >  	return ret;
> >  }
> >
> > --
> > 1.8.3.1
>
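
For readers following the asymmetric flow discussed in this thread, here is a
minimal userspace sketch of why the extra cleanup call in the init error path
leads to a second free. It is a toy model, not the mlx5 code: struct dev, the
stage table, probe() and the two flags are all hypothetical names, and the
assumption is that a failed init unwinds only the stages that ran before it.

```c
#include <stddef.h>

/* Hypothetical model, not the mlx5 driver: counts frees of one "QP". */
struct dev {
	int qp_allocated;
	int qp_free_calls;
};

struct stage {
	int (*init)(struct dev *d);
	void (*cleanup)(struct dev *d);
};

static int fail_cache_init;   /* simulate the workqueue allocation failure */
static int buggy_error_path;  /* 1: init error path frees the QP itself */

static void umr_resource_cleanup(struct dev *d)
{
	d->qp_free_calls++;
	d->qp_allocated = 0;
}

/* Earlier stage: no init, but its cleanup releases the UMR resources. */
static void pre_reg_umr_cleanup(struct dev *d)
{
	umr_resource_cleanup(d);
}

/* Later stage: allocates the UMR resources, has no cleanup of its own. */
static int post_reg_umr_init(struct dev *d)
{
	d->qp_allocated = 1;
	if (fail_cache_init) {
		if (buggy_error_path)
			umr_resource_cleanup(d); /* first free */
		return -12;
	}
	return 0;
}

static const struct stage stages[] = {
	{ NULL, pre_reg_umr_cleanup },
	{ post_reg_umr_init, NULL },
};

/*
 * Run every stage in order. When an init fails, run the cleanup of each
 * earlier stage in reverse order. Returns how many times the QP was freed.
 */
int probe(int fail, int buggy)
{
	struct dev d = { 0, 0 };
	size_t n = sizeof(stages) / sizeof(stages[0]);
	size_t i;

	fail_cache_init = fail;
	buggy_error_path = buggy;

	for (i = 0; i < n; i++) {
		if (stages[i].init && stages[i].init(&d)) {
			while (i-- > 0)
				if (stages[i].cleanup)
					stages[i].cleanup(&d);
			break;
		}
	}
	return d.qp_free_calls;
}
```

In this model probe(1, 1) frees the QP twice (once in the init error path,
once in the earlier stage's cleanup), while probe(1, 0) frees it once, which
is the behaviour the patch aims for.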