Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp189012pxb; Mon, 25 Oct 2021 06:26:41 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzi02hQ0sS1CXFKoAasKPIlo9rvC6xWkqNmSfe43KxHXjTUJRheKbNqEcGnv0IFhPOS7cLE X-Received: by 2002:a05:6a00:8d0:b0:44c:26e6:1c13 with SMTP id s16-20020a056a0008d000b0044c26e61c13mr18286270pfu.28.1635168401066; Mon, 25 Oct 2021 06:26:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635168401; cv=none; d=google.com; s=arc-20160816; b=tTOp/D0UP/T5J4PTwgkBah2848TdQpDTvc8NvQq1G4t9irU7kRoEEgNPaDPdCEUhKG FSUNfePKAggr8FgOFJoa2jeJ5o9azXiaasE1m1oEp8IAjd24wRHMD1+bDfpC8ltiVTQl laL9UQkkRNKKG8vt5L27XaISv0q23YbAhj57LmzTXDJ1tfREOZuyiWqWE9vxGHAI5T4P KaGOPIIkAuaqz8+ME35YoUSJfY73UjMz+J9zuW33sRiZfjtaMBMMKdYuYcpMVAccHY6d VHlZQAisFEaVfYnM87G23cdNN4OqVP+BRao3qEMPVHWIeC37FpB/j7OXBe4mLgIIKgx4 Cr5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=CBnuzj9Qu/Tw/7IsvFnusE0ltXSYG+eTAUM1lPZjvsU=; b=QYLymP1+iBb+5ePQLzYKnOBpMX8kGuufheyVNpCvzlW0q0l0Xl+CESYtNU+RARvCYW vG16442PRtux+xheXN6r7tthmMiVKKDFBdhlBXBsjgzTWPZB3T2rzQNmpQgn7RcWrZgh yciZQFPyF/5MuFgJ48eV8itI5di2Sf2NzzHgbkA9+cUG/HBH/daTZyi9qQ+LEu9yvhlE kz61rjSc1oRGmFRCRK62WUCkanx1w1GU3RyjEbTtQq25NdJ0WFX8s/chIDsDKh0xhQav 1+nzLl7p0aTRoXy0lrsAVBRAtn+kihhpwZe4vXjz0eA5t3d8M7zHXeioOA+PEifkBYEH 8tsg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=VdFWBz5U; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h126si25013759pgc.276.2021.10.25.06.26.26; Mon, 25 Oct 2021 06:26:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=VdFWBz5U; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232336AbhJYK6q (ORCPT + 99 others); Mon, 25 Oct 2021 06:58:46 -0400 Received: from mail.kernel.org ([198.145.29.99]:42694 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232090AbhJYK6n (ORCPT ); Mon, 25 Oct 2021 06:58:43 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id A68B660F46; Mon, 25 Oct 2021 10:56:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1635159381; bh=7j8ynqn/OK6718s9WxLto/Xdf9z2pakrrTtvt+j7xhc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=VdFWBz5UALHdb++MOrmOPw/Omjz7jBCyDwttDm5mHNLX92Z6X0cxLGJVwzoyQpyFd xtL0x2hX7Xow/1sZ3UW737AecMYlEdvWE1J0Umuf/4hOW3aH/0+W3RklDJvTVC3qIv 1cL2f2T47ub3Tc1i4d+2TmdyjfDEyqf3O2j8goeu8OC3NnZDSCQwOGEVSm4UmEBvQQ N53BwVW/jjz+aJVOlif2ZJHmnhP0HV4a9idKt9MRRSICsxKmQzyslgs0aGOl6q/JlO MHrrhGqALjThygprksogupPh3/XElCSZv6+tr6X97O/LdTkAdQeENJnboKU2suULxV q3xC/OaOrNTyA== Date: Mon, 25 Oct 2021 13:56:17 +0300 From: Leon Romanovsky To: Ido Schimmel Cc: "David S . Miller" , Jakub Kicinski , Ido Schimmel , Jiri Pirko , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com Subject: Re: [PATCH net-next] netdevsim: Register and unregister devlink traps on probe/remove device Message-ID: References: <725e121f05362da4328dda08d5814211a0725dac.1635064599.git.leonro@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Oct 25, 2021 at 11:08:00AM +0300, Ido Schimmel wrote: > On Mon, Oct 25, 2021 at 08:34:55AM +0300, Leon Romanovsky wrote: > > On Sun, Oct 24, 2021 at 01:48:25PM +0300, Ido Schimmel wrote: > > > On Sun, Oct 24, 2021 at 12:54:52PM +0300, Leon Romanovsky wrote: > > > > On Sun, Oct 24, 2021 at 12:05:12PM +0300, Ido Schimmel wrote: > > > > > On Sun, Oct 24, 2021 at 11:42:11AM +0300, Leon Romanovsky wrote: > > > > > > From: Leon Romanovsky > > > > > > > > > > > > Align netdevsim to be like all other physical devices that register and > > > > > > unregister devlink traps during their probe and removal respectively. > > > > > > > > > > No, this is incorrect. Out of the three drivers that support both reload > > > > > and traps, both netdevsim and mlxsw unregister the traps during reload. > > > > > Here is another report from syzkaller about mlxsw [1]. > > > > > > > > Sorry, I overlooked it. > > > > > > > > > > > > > > Please revert both 22849b5ea595 ("devlink: Remove not-executed trap > > > > > policer notifications") and 8bbeed485823 ("devlink: Remove not-executed > > > > > trap group notifications"). > > > > > > > > However, before we rush and revert commit, can you please explain why > > > > current behavior to reregister traps on reload is correct? > > > > > > > > I think that you are not changing traps during reload, so traps before > > > > reload will be the same as after reload, am I right? > > > > > > During reload we tear down the entire driver and load it again. As part > > > of the reload_down() operation we tear down the various objects from > > > both devlink and the device (e.g., shared buffer, ports, traps, etc.). > > > As part of the reload_up() operation we issue a device reset and > > > register everything back. > > > > This is an implementation which is arguably questionable and pinpoints > > problem with devlink reload. It mixes different SW layers into one big > > mess which I tried to untangle. > > > > The devlink "feature" that driver reregisters itself again during execution > > of other user-visible devlink command can't be right design. > > > > > > > > While the list of objects doesn't change, their properties (e.g., shared > > > buffer size, trap action, policer rate) do change back to the default > > > after reload and we cannot go back on that as it's a user-visible > > > change. > > > > I don't propose to go back, just prefer to see fixed mlxsw that > > shouldn't touch already created and registered objects from net/core/devlink.c. > > > > All reset-to-default should be performed internally to the driver > > without any need to devlink_*_register() again, so we will be able to > > clean rest devlink notifications. > > > > So at least for the netdevsim, this change looks like the correct one, > > while mlxsw should be fixed next. > > No, it's not correct. After your patch, trap properties like action are > not set back to the default. Regardless of what you think is the "right > design", you cannot introduce such regressions. Again, I'm not against fixing the regression, I'm trying to understand why it is impossible to fix mlxsw and netdevsim to honor SW layering properly. > > Calling devlink_*_unregister() in reload_down() and devlink_*_register() > in reload_up() is not new. It is done for multiple objects (e.g., ports, > regions, shared buffer, etc). After your patch, netdevsim is still doing > it. Yeah, it was introduced by same developers who did it in mlxsw, so no wonders that same patterns exist in both drivers. > > Again, please revert the two commits I mentioned. If you think they are > necessary, you can re-submit them in the future, after proper review and > testing of the affected code paths. It was posted for review in the ML, no one objected. Can you please explain why is it so important to touch devlink SW objects, reallocate them again and again on every reload in mlxsw? The flow is convoluted without any reason: devlink reload -> mlxsw reload_down -> devlink ... unregister -> mlxsw reload_down again -> devlink reload again -> mlxsw reload_up -> devlink ... register -> mlxsw to handle register -> mlxsw reload_up again -> devlink reload finish Instead of: devlink reload -> mlxsw reload_down -> devlink reload again -> mlxsw reload_up -> devlink reload finish Thanks > > Thanks