From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Eli Cohen, Maor Dickman, Mark Bloch, Saeed Mahameed, Sasha Levin
Subject: [PATCH 5.19 030/158] net/mlx5: LAG, fix logic over MLX5_LAG_FLAG_NDEVS_READY
Date: Mon, 29 Aug 2022 12:58:00 +0200
Message-Id: <20220829105810.050918798@linuxfoundation.org>
In-Reply-To: <20220829105808.828227973@linuxfoundation.org>
References: <20220829105808.828227973@linuxfoundation.org>

From: Eli Cohen

[ Upstream commit a6e675a66175869b7d87c0e1dd0ddf93e04f8098 ]

Only set MLX5_LAG_FLAG_NDEVS_READY if both netdevices are registered.
Doing so guarantees that both ldev->pf[MLX5_LAG_P0].dev and
ldev->pf[MLX5_LAG_P1].dev have valid pointers when
MLX5_LAG_FLAG_NDEVS_READY is set.

The core issue is an asymmetry between setting
MLX5_LAG_FLAG_NDEVS_READY and clearing it. The flag is set, incorrectly,
as soon as both ldev->pf[MLX5_LAG_P0].dev and ldev->pf[MLX5_LAG_P1].dev
are set, whereas it is cleared, correctly, when either
ldev->pf[i].netdev is cleared.

Consider the following scenario:
1. PF0 loads and sets ldev->pf[MLX5_LAG_P0].dev to a valid pointer.
2. PF1 loads and sets both ldev->pf[MLX5_LAG_P1].dev and
   ldev->pf[MLX5_LAG_P1].netdev to valid pointers. This results in
   MLX5_LAG_FLAG_NDEVS_READY being set.
3. PF0 is unloaded before setting ldev->pf[MLX5_LAG_P0].netdev.
   MLX5_LAG_FLAG_NDEVS_READY remains set.
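
The state reached after steps 1 and 2 can be modelled in standalone C.
This is only a sketch under stated assumptions: struct pf_model,
LAG_PORTS and all function names below are hypothetical stand-ins for
illustration, not the driver's definitions. It contrasts the pre-patch
readiness condition (checking .dev) with the patched one (checking
.netdev).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LAG_PORTS 2 /* hypothetical: the two-port case from the scenario */

/* Hypothetical stand-in for one ldev->pf[] entry; not the driver's struct. */
struct pf_model {
	const void *dev;    /* set when the PF loads */
	const void *netdev; /* set later, when its netdevice is registered */
};

/* Pre-patch condition: every port has a core device. */
static bool ndevs_ready_old(const struct pf_model *pf)
{
	assert(pf != NULL);
	for (int i = 0; i < LAG_PORTS; i++)
		if (!pf[i].dev)
			return false;
	return true;
}

/* Patched condition: every port has a registered netdevice. */
static bool ndevs_ready_new(const struct pf_model *pf)
{
	assert(pf != NULL);
	for (int i = 0; i < LAG_PORTS; i++)
		if (!pf[i].netdev)
			return false;
	return true;
}

/* Steps 1 and 2: PF0 has only dev; PF1 has both dev and netdev. */
static void model_steps_1_and_2(struct pf_model *pf)
{
	static const int core0, core1, net1; /* dummy objects to point at */

	assert(pf != NULL);
	pf[0].dev = &core0;
	pf[0].netdev = NULL;
	pf[1].dev = &core1;
	pf[1].netdev = &net1;
}
```

After model_steps_1_and_2(), ndevs_ready_old() returns true while
ndevs_ready_new() returns false, because PF0 has no netdevice yet. With
the patched condition the flag would not be set in step 2, so there is
nothing stale left behind when PF0 unloads in step 3.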

Further execution of mlx5_do_bond() will result in a null pointer
dereference when calling mlx5_lag_is_multipath().

This patch fixes the following call trace actually encountered:

[ 1293.475195] BUG: kernel NULL pointer dereference, address: 00000000000009a8
[ 1293.478756] #PF: supervisor read access in kernel mode
[ 1293.481320] #PF: error_code(0x0000) - not-present page
[ 1293.483686] PGD 0 P4D 0
[ 1293.484434] Oops: 0000 [#1] SMP PTI
[ 1293.485377] CPU: 1 PID: 23690 Comm: kworker/u16:2 Not tainted 5.18.0-rc5_for_upstream_min_debug_2022_05_05_10_13 #1
[ 1293.488039] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 1293.490836] Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
[ 1293.492448] RIP: 0010:mlx5_lag_is_multipath+0x5/0x50 [mlx5_core]
[ 1293.494044] Code: e8 70 40 ff e0 48 8b 14 24 48 83 05 5c 1a 1b 00 01 e9 19 ff ff ff 48 83 05 47 1a 1b 00 01 eb d7 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 87 a8 09 00 00 48 85 c0 74 26 48 83 05 a7 1b 1b 00 01 41 b8
[ 1293.498673] RSP: 0018:ffff88811b2fbe40 EFLAGS: 00010202
[ 1293.500152] RAX: ffff88818a94e1c0 RBX: ffff888165eca6c0 RCX: 0000000000000000
[ 1293.501841] RDX: 0000000000000001 RSI: ffff88818a94e1c0 RDI: 0000000000000000
[ 1293.503585] RBP: 0000000000000000 R08: ffff888119886740 R09: ffff888165eca73c
[ 1293.505286] R10: 0000000000000018 R11: 0000000000000018 R12: ffff88818a94e1c0
[ 1293.506979] R13: ffff888112729800 R14: 0000000000000000 R15: ffff888112729858
[ 1293.508753] FS: 0000000000000000(0000) GS:ffff88852cc40000(0000) knlGS:0000000000000000
[ 1293.510782] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1293.512265] CR2: 00000000000009a8 CR3: 00000001032d4002 CR4: 0000000000370ea0
[ 1293.514001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1293.515806] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: 8a66e4585979 ("net/mlx5: Change ownership model for lag")
Signed-off-by: Eli Cohen
Reviewed-by: Maor Dickman
Reviewed-by: Mark Bloch
Signed-off-by: Saeed Mahameed
Signed-off-by: Sasha Levin
---
 drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
index 5d41e19378e09..c520edb942ca5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c
@@ -1234,7 +1234,7 @@ void mlx5_lag_add_netdev(struct mlx5_core_dev *dev,
 	mlx5_ldev_add_netdev(ldev, dev, netdev);
 
 	for (i = 0; i < ldev->ports; i++)
-		if (!ldev->pf[i].dev)
+		if (!ldev->pf[i].netdev)
 			break;
 
 	if (i >= ldev->ports)
-- 
2.35.1