Date: Tue, 5 Oct 2021 11:32:13 -0700
From: Jakub Kicinski
To: Leon Romanovsky
Cc: "David S. Miller", Ido Schimmel, Ingo Molnar, Jiri Pirko, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, mlxsw@nvidia.com, Moshe Shemesh, netdev@vger.kernel.org, Saeed Mahameed, Salil Mehta, Shay Drory, Steven Rostedt, Tariq Toukan, Yisen Zhuang
Subject: Re: [PATCH net-next v2 3/5] devlink: Allow set specific ops callbacks dynamically
Message-ID: <20211005113213.0ee61358@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
References: <92971648bcad41d095d12f5296246fc44ab8f5c7.1633284302.git.leonro@nvidia.com> <20211004164413.60e9ce80@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 5 Oct 2021 10:32:45 +0300 Leon Romanovsky wrote:
> On Mon, Oct 04, 2021 at 04:44:13PM -0700, Jakub Kicinski wrote:
> > On Sun, 3 Oct 2021 21:12:04 +0300 Leon Romanovsky wrote:
> > > From: Leon Romanovsky
> > >
> > > Introduce new devlink call to set specific ops callback during
> > > device initialization phase after devlink_alloc() is already
> > > called.
> > >
> > > This allows us to set specific ops based on device property which
> > > is not known at the beginning of driver initialization.
> > >
> > > For the sake of simplicity, this API lacks any type of locking and
> > > needs to be called before devlink_register() to make sure that no
> > > parallel access to the ops is possible at this stage.
> >
> > The fact that it's not registered does not mean that the callbacks
> > won't be invoked. Look at uses of devlink_compat_flash_update().
>
> It is impossible, devlink_register() is part of .probe() flow and if it
> wasn't called -> probe didn't succeed -> net_device doesn't exist.

Are you talking about reality or the bright future brought by auxbus?

> We don't have a net_device without a "connected" device beneath it, do we?
>
> At least drivers that I checked are not prepared at all to handle call
> to devlink->ops.flash_update() if they didn't probe successfully.

Last time I checked you moved the devlink_register() to the end of
probe, which for all non-auxbus drivers means after register_netdev().

> > > diff --git a/net/core/devlink.c b/net/core/devlink.c
> > > index 4e484afeadea..25c2aa2b35cd 100644
> > > --- a/net/core/devlink.c
> > > +++ b/net/core/devlink.c
> > > @@ -53,7 +53,7 @@ struct devlink {
> > >         struct list_head trap_list;
> > >         struct list_head trap_group_list;
> > >         struct list_head trap_policer_list;
> > > -       const struct devlink_ops *ops;
> > > +       struct devlink_ops ops;
> >
> > Security people like ops to live in read-only memory. You're making
> > them r/w for every devlink instance now.
>
> Yes, but we explicitly copy every function pointer, which is safe.

The goal is for ops to live in pages which are mapped read-only, so
that heap overflows can't overwrite the pointers.

> > >         struct xarray snapshot_ids;
> > >         struct devlink_dev_stats stats;
> > >         struct device *dev;
> > > +EXPORT_SYMBOL_GPL(devlink_set_ops);
> >
> > I still don't like this. IMO using feature bits to dynamically mask-off
> > capabilities has much better properties. We already have static caps
> > in devlink_ops (first 3 members), we should build on top of that.
>
> These capabilities are for specific operation, like flash or reload.
> They control how these flows will work, they don't control if this flow
> is valid or not.
>
> You are too focused on reload caps, but multiport mlx5 device doesn't
> support eswitch either. I just didn't remove the eswitch callbacks to
> stay focused on more important work - making devlink better. :)
>
> Even if we decide to use new flag in devlink_ops, we will still need to
> add this devlink_set_ops() patch, because the value of that new flag
> will be known very late in initialization phase, after FW capabilities
> are known and I will need to overwrite RO memory.

Yes, you can change the caps at run time, that's perfectly reasonable.
You'll also be able to define more fine-grained caps going forward as
needed.

> Jakub,
>
> Can we please continue with the current approach? It doesn't expose any
> user visible API and everything here will be easy to rewrite differently
> if such needs arise.
>
> We have so much ahead, like removing devlink_lock, rewriting devlink->lock,
> fixing devlink reload of IB part, etc.

I don't like it. If you feel strongly, please gather support from
other developers. Right now it's my preference against yours.
I don't even see you making arguments that your approach is better,
just that mine is not perfect and requires some similar changes.
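
For illustration only, here is a minimal user-space sketch of the
feature-bit approach argued for above. The ops table stays const and
shared, so a toolchain can place it in read-only memory, while a
per-instance capability mask is cleared at run time once device
properties are known. All demo_* names and DEMO_CAP_* bits are
hypothetical and not part of the devlink API; the real devlink_ops
layout and dispatch paths differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits, one per maskable feature. */
#define DEMO_CAP_RELOAD   (UINT32_C(1) << 0)
#define DEMO_CAP_ESWITCH  (UINT32_C(1) << 1)

struct demo_dev;

/* Callback table; instances are declared const and never modified. */
struct demo_ops {
	int (*reload)(struct demo_dev *dev);
	int (*eswitch_mode_set)(struct demo_dev *dev, int mode);
};

struct demo_dev {
	const struct demo_ops *ops; /* shared, read-only table */
	uint32_t caps;              /* per-instance, writable mask */
	int reload_count;
	int eswitch_mode;
};

static int demo_reload(struct demo_dev *dev)
{
	dev->reload_count++;
	return 0;
}

static int demo_eswitch_mode_set(struct demo_dev *dev, int mode)
{
	dev->eswitch_mode = mode;
	return 0;
}

/* One const table for the whole (hypothetical) driver. */
static const struct demo_ops demo_driver_ops = {
	.reload = demo_reload,
	.eswitch_mode_set = demo_eswitch_mode_set,
};

static void demo_dev_init(struct demo_dev *dev, uint32_t caps)
{
	assert(dev != NULL);
	dev->ops = &demo_driver_ops;
	dev->caps = caps;
	dev->reload_count = 0;
	dev->eswitch_mode = 0;
}

/* Mask off features once the device is known not to support them. */
static void demo_clear_caps(struct demo_dev *dev, uint32_t caps)
{
	dev->caps &= ~caps;
}

/* Dispatch checks the capability bit before touching the callback. */
static int demo_do_reload(struct demo_dev *dev)
{
	if (!(dev->caps & DEMO_CAP_RELOAD) || !dev->ops->reload)
		return -EOPNOTSUPP;
	return dev->ops->reload(dev);
}

static int demo_do_eswitch_mode_set(struct demo_dev *dev, int mode)
{
	if (!(dev->caps & DEMO_CAP_ESWITCH) || !dev->ops->eswitch_mode_set)
		return -EOPNOTSUPP;
	return dev->ops->eswitch_mode_set(dev, mode);
}
```

In this sketch a driver would call demo_clear_caps(dev, DEMO_CAP_RELOAD)
after reading firmware capabilities; later reload requests then fail
with -EOPNOTSUPP while the shared ops table is never written.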