Date: Fri, 22 Apr 2022 20:43:17 +0300
From: Leon Romanovsky
To: Ivan Vecera
Cc: netdev@vger.kernel.org, poros@redhat.com, mschmidt@redhat.com,
    Jesse Brandeburg, Tony Nguyen, "David S. Miller", Jakub Kicinski,
    Paolo Abeni, Dave Ertman, Shiraz Saleem,
    "moderated list:INTEL ETHERNET DRIVERS", open list
Subject: Re: [PATCH net v3] ice: Fix race during aux device (un)plugging
References: <20220421060906.1902576-1-ivecera@redhat.com>
In-Reply-To: <20220421060906.1902576-1-ivecera@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 21, 2022 at 08:09:05AM +0200, Ivan Vecera wrote:
> Function ice_plug_aux_dev() assigns pf->adev field too early prior
> aux device initialization and on other side ice_unplug_aux_dev()
> starts aux device deinit and at the end assigns NULL to pf->adev.
> This is wrong because pf->adev should always be non-NULL only when
> aux device is fully initialized and ready. This wrong order causes
> a crash when ice_send_event_to_aux() call occurs because that function
> depends on non-NULL value of pf->adev and does not assume that
> aux device is half-initialized or half-destroyed.
> After order correction the race window is tiny but it is still there,
> as Leon mentioned and manipulation with pf->adev needs to be protected
> by mutex.
>
> Fix (un-)plugging functions so pf->adev field is set after aux device
> init and prior aux device destroy and protect pf->adev assignment by
> new mutex. This mutex is also held during ice_send_event_to_aux()
> call to ensure that aux device is valid during that call. Device
> lock used ice_send_event_to_aux() to avoid its concurrent run can
> be removed as this is secured by that mutex.
>
> Reproducer:
> cycle=1
> while :;do
>         echo "#### Cycle: $cycle"
>
>         ip link set ens7f0 mtu 9000
>         ip link add bond0 type bond mode 1 miimon 100
>         ip link set bond0 up
>         ifenslave bond0 ens7f0
>         ip link set bond0 mtu 9000
>         ethtool -L ens7f0 combined 1
>         ip link del bond0
>         ip link set ens7f0 mtu 1500
>         sleep 1
>
>         let cycle++
> done
>
> In short when the device is added/removed to/from bond the aux device
> is unplugged/plugged. When MTU of the device is changed an event is
> sent to aux device asynchronously. This can race with (un)plugging
> operation and because pf->adev is set too early (plug) or too late
> (unplug) the function ice_send_event_to_aux() can touch uninitialized
> or destroyed fields. In the case of crash below pf->adev->dev.mutex.
>
> Crash:
> [   53.372066] bond0: (slave ens7f0): making interface the new active one
> [   53.378622] bond0: (slave ens7f0): Enslaving as an active interface with an u
> p link
> [   53.386294] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> [   53.549104] bond0: (slave ens7f1): Enslaving as a backup interface with an up
> link
> [   54.118906] ice 0000:ca:00.0 ens7f0: Number of in use tx queues changed inval
> idating tc mappings. Priority traffic classification disabled!
> [   54.233374] ice 0000:ca:00.1 ens7f1: Number of in use tx queues changed inval
> idating tc mappings. Priority traffic classification disabled!
> [   54.248204] bond0: (slave ens7f0): Releasing backup interface
> [   54.253955] bond0: (slave ens7f1): making interface the new active one
> [   54.274875] bond0: (slave ens7f1): Releasing backup interface
> [   54.289153] bond0 (unregistering): Released all slaves
> [   55.383179] MII link monitoring set to 100 ms
> [   55.398696] bond0: (slave ens7f0): making interface the new active one
> [   55.405241] BUG: kernel NULL pointer dereference, address: 0000000000000080
> [   55.405289] bond0: (slave ens7f0): Enslaving as an active interface with an u
> p link
> [   55.412198] #PF: supervisor write access in kernel mode
> [   55.412200] #PF: error_code(0x0002) - not-present page
> [   55.412201] PGD 25d2ad067 P4D 0
> [   55.412204] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [   55.412207] CPU: 0 PID: 403 Comm: kworker/0:2 Kdump: loaded Tainted: G S
> 5.17.0-13579-g57f2d6540f03 #1
> [   55.429094] bond0: (slave ens7f1): Enslaving as a backup interface with an up
> link
> [   55.430224] Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.4.4 10/07/
> 2021
> [   55.430226] Workqueue: ice ice_service_task [ice]
> [   55.468169] RIP: 0010:mutex_unlock+0x10/0x20
> [   55.472439] Code: 0f b1 13 74 96 eb e0 4c 89 ee eb d8 e8 79 54 ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 04 25 40 ef 01 00 31 d2 48 0f b1 17 75 01 c3 e9 e3 fe ff ff 0f 1f 00 0f 1f 44 00 00 48
> [   55.491186] RSP: 0018:ff4454230d7d7e28 EFLAGS: 00010246
> [   55.496413] RAX: ff1a79b208b08000 RBX: ff1a79b2182e8880 RCX: 0000000000000001
> [   55.503545] RDX: 0000000000000000 RSI: ff4454230d7d7db0 RDI: 0000000000000080
> [   55.510678] RBP: ff1a79d1c7e48b68 R08: ff4454230d7d7db0 R09: 0000000000000041
> [   55.517812] R10: 00000000000000a5 R11: 00000000000006e6 R12: ff1a79d1c7e48bc0
> [   55.524945] R13: 0000000000000000 R14: ff1a79d0ffc305c0 R15: 0000000000000000
> [   55.532076] FS:  0000000000000000(0000) GS:ff1a79d0ffc00000(0000) knlGS:0000000000000000
> [   55.540163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   55.545908] CR2: 0000000000000080 CR3: 00000003487ae003 CR4: 0000000000771ef0
> [   55.553041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   55.560173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   55.567305] PKRU: 55555554
> [   55.570018] Call Trace:
> [   55.572474]
> [   55.574579]  ice_service_task+0xaab/0xef0 [ice]
> [   55.579130]  process_one_work+0x1c5/0x390
> [   55.583141]  ? process_one_work+0x390/0x390
> [   55.587326]  worker_thread+0x30/0x360
> [   55.590994]  ? process_one_work+0x390/0x390
> [   55.595180]  kthread+0xe6/0x110
> [   55.598325]  ? kthread_complete_and_exit+0x20/0x20
> [   55.603116]  ret_from_fork+0x1f/0x30
> [   55.606698]
>
> Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
> Cc: Leon Romanovsky
> Signed-off-by: Ivan Vecera
> ---
>  drivers/net/ethernet/intel/ice/ice.h      |  1 +
>  drivers/net/ethernet/intel/ice/ice_idc.c  | 33 ++++++++++++++---------
>  drivers/net/ethernet/intel/ice/ice_main.c |  2 ++
>  3 files changed, 23 insertions(+), 13 deletions(-)
>

Thanks,
Reviewed-by: Leon Romanovsky
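
For anyone following the thread without the diff at hand, the ordering
described in the commit message can be sketched roughly as below. This is
a userspace illustration with pthreads, not the patch itself: struct pf,
struct aux_dev, adev_mutex and the three functions are hypothetical
stand-ins for the driver's pf->adev handling, and the real code in
ice_idc.c may well look different.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the auxiliary device; not the ice driver's type. */
struct aux_dev {
	int ready;        /* set once initialization has finished */
	int events_seen;  /* number of events delivered so far */
};

/* Hypothetical stand-in for the PF structure. */
struct pf {
	pthread_mutex_t adev_mutex;  /* protects the adev pointer */
	struct aux_dev *adev;        /* non-NULL only while fully initialized */
};

/* Plug: finish initialization first, publish the pointer last. */
static int plug_aux_dev(struct pf *pf)
{
	struct aux_dev *adev = calloc(1, sizeof(*adev));

	if (!adev)
		return -1;
	adev->ready = 1;

	pthread_mutex_lock(&pf->adev_mutex);
	pf->adev = adev;
	pthread_mutex_unlock(&pf->adev_mutex);
	return 0;
}

/* Unplug: clear the pointer first, tear the device down afterwards. */
static void unplug_aux_dev(struct pf *pf)
{
	struct aux_dev *adev;

	pthread_mutex_lock(&pf->adev_mutex);
	adev = pf->adev;
	pf->adev = NULL;
	pthread_mutex_unlock(&pf->adev_mutex);

	free(adev);  /* free(NULL) is a no-op */
}

/* Event path: hold the mutex for the whole call so adev stays valid. */
static int send_event_to_aux(struct pf *pf)
{
	int delivered = 0;

	assert(pf != NULL);
	pthread_mutex_lock(&pf->adev_mutex);
	if (pf->adev && pf->adev->ready) {
		pf->adev->events_seen++;
		delivered = 1;
	}
	pthread_mutex_unlock(&pf->adev_mutex);
	return delivered;
}
```

With this ordering the event path sees either NULL or a fully
initialized device, never a half-built or half-destroyed one, which is
the property the commit message asks for.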