Received: by 2002:a05:6a10:d5a5:0:0:0:0 with SMTP id gn37csp4008974pxb; Mon, 4 Oct 2021 15:06:51 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyDYpyDMZdVKQbU6kEfGI5dpTw/TWqXNuTgdfi5ruRkKB6Lh0CZ+y0k6qTDtpaCWykhZt02 X-Received: by 2002:a17:907:3312:: with SMTP id ym18mr19877208ejb.370.1633385210943; Mon, 04 Oct 2021 15:06:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1633385210; cv=none; d=google.com; s=arc-20160816; b=ByJcd6EVR8SlzO6m+3Oj+8npYutOAAnljkMT3XpOYgKDWzlAdhb7CNRXDNHrmxDkzL FBTa674nBLTEr0hyUIDouKWvvbX9xTWgiVq4q+pPy4iidbUWAAQlRoIChE55JUKfuK46 /xtyvvDCWkX7cObit+s0MTkKpvn5Ve63Ozj/YA4gRdSNCPPPK9AOphJjzxZbaU3rUOTg iSKD0ofETOxPbYidNHE+hPzUMkg8Z4Avk26EV/rRauj/gzqFiEFHFHIKHr2jSCoJXlL6 bSTNq8ddYPyUZW0yfPcU5jvnAeN0sXiAjkRFSiI+lSLqh5jZNqK4vdx/hJ4tVa62YHde r9Sw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=mqFUr95SHOXz+MKSK9jWsZ2nU09uNcaKTEJXQHCfOW4=; b=YGzB5Oby5dr18gSp91fSGl+K1pYZz/9AJp5VrOXqelzVqdLxYvNBxUSk162EG/ZkBq ZnkGYHe6+SWWrx97mLcikBv8FbIZjLRPtMX9oEGN/om2J+203dU7JwCkOCM3DCF0Xpwr GjbEwk/EHmp+bxRjKm8lQfXA3efD3gZNGhOnYu+VMpeDRhxuoCeXWpMySKqZYqTU3h7F lqZIrBNPrJhwBj0Aj7c/9V09G+8Jm4tOqUvwb9JGNnWi8ey8RtneXIiEY6xw1fBpJr25 BXlFsYwkmx/SVOBPCNuvFWal4+jJka1TJgbhlUDyccb2pKxZ+OwvmivHycYJD4kCx2L8 paZg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=p6VQKvSb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b3si18533457ejb.219.2021.10.04.15.06.26; Mon, 04 Oct 2021 15:06:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=p6VQKvSb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235875AbhJDPrB (ORCPT + 99 others); Mon, 4 Oct 2021 11:47:01 -0400 Received: from mail.kernel.org ([198.145.29.99]:60104 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235455AbhJDPrB (ORCPT ); Mon, 4 Oct 2021 11:47:01 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 587BD61357; Mon, 4 Oct 2021 15:45:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1633362312; bh=mZSKmWm8SAdamHTmkeSu5hd1hOJE1TChJtoljR5IALY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=p6VQKvSbK5Fv4RhmL5ezqtWqHDcDmgjmDDkm0j5gmpAQhF8bM2ThQ8qbypujqJjxo hopoE722L3U7l5n7VvF7hLEKtWJZUI4K47uMaSwbydJk06h56CaHYH9UHgeevGcVM8 +JncQaZwZ3Jho82E8G+i2eEMnXEVDLkaciD+IT6oruWCo1P+2X/GrBeOfdGkBhxA+K 24e1K1GRipdR8bYO4eowt36dQ1+4VywHPY+RZ8srzvROPRHbBpkAOa3n2DvAKiRz7r fioGIJXXFo7Dqc6J9irHIlLoRxKEeyGNX12sLCOLOAId47nhaVPpm075ISBTwKKwe3 AGwf/2DhZt5OA== Date: Mon, 4 Oct 2021 18:45:07 +0300 From: Leon Romanovsky To: Ido Schimmel Cc: "David S . Miller" , Jakub Kicinski , Ido Schimmel , Ingo Molnar , Jiri Pirko , linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, mlxsw@nvidia.com, Moshe Shemesh , netdev@vger.kernel.org, Saeed Mahameed , Salil Mehta , Shay Drory , Steven Rostedt , Tariq Toukan , Yisen Zhuang Subject: Re: [PATCH net-next v2 5/5] devlink: Delete reload enable/disable interface Message-ID: References: <06ebba9e115d421118b16ac4efda61c2e08f4d50.1633284302.git.leonro@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Oct 04, 2021 at 05:19:40PM +0300, Ido Schimmel wrote: > On Sun, Oct 03, 2021 at 09:12:06PM +0300, Leon Romanovsky wrote: > > From: Leon Romanovsky > > > > After changes to allow dynamically set the reload_up/_down callbacks, > > we ensure that properly supported devlink ops are not accessible before > > devlink_register, which is last command in the initialization sequence. > > > > It makes devlink_reload_enable/_disable not relevant anymore and can be > > safely deleted. > > > > Signed-off-by: Leon Romanovsky > > [...] > > > diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c > > index cb6645012a30..09e48fb232a9 100644 > > --- a/drivers/net/netdevsim/dev.c > > +++ b/drivers/net/netdevsim/dev.c > > @@ -1512,7 +1512,6 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev) > > > > nsim_dev->esw_mode = DEVLINK_ESWITCH_MODE_LEGACY; > > devlink_register(devlink); > > - devlink_reload_enable(devlink); > > return 0; > > > > err_psample_exit: > > @@ -1566,9 +1565,7 @@ void nsim_dev_remove(struct nsim_bus_dev *nsim_bus_dev) > > struct nsim_dev *nsim_dev = dev_get_drvdata(&nsim_bus_dev->dev); > > struct devlink *devlink = priv_to_devlink(nsim_dev); > > > > - devlink_reload_disable(devlink); > > devlink_unregister(devlink); > > - > > nsim_dev_reload_destroy(nsim_dev); > > > > nsim_bpf_dev_exit(nsim_dev); > > I didn't remember why devlink_reload_{enable,disable}() were added in > the first place so it was not clear to me from the commit message why > they can be removed. It is described in commit a0c76345e3d3 ("devlink: > disallow reload operation during device cleanup") with a reproducer. It was added because devlink ops were accessible by the user space very early in the driver lifetime. All my latest devlink patches are the attempt to fix this arch/design/implementation issue. > > Tried the reproducer with this series and I cannot reproduce the issue. > Wasn't quite sure why, but it does not seem to be related to "changes to > allow dynamically set the reload_up/_down callbacks", as this seems to > be specific to mlx5. You didn't reproduce because of my series that moved devlink_register()/devlink_unregister() to be last/first commands in .probe()/.remove() flows. Patch to allow dynamically set ops was needed because mlx5 had logic like this: if(something) devlink_reload_enable() And I needed a way to keep this if ... condition. > > IIUC, the reason that the race described in above mentioned commit can > no longer happen is related to the fact that devlink_unregister() is > called first in the device dismantle path, after your previous patches. > Since both the reload operation and devlink_unregister() hold > 'devlink_mutex', it is not possible for the reload operation to race > with device dismantle. > > Agree? If so, I think it would be good to explain this in the commit > message unless it's clear to everyone else. I don't agree for very simple reason that devlink_mutex is going to be removed very soon and it is really not a reason why devlink reload is safer now when before. The reload can't race due to: 1. devlink_unregister(), which works as a barrier to stop accesses from the user space. 2. reference counting that ensures that all in-flight commands are counted. 3. wait_for_completion that blocks till all commands are done. Thanks > > Thanks