Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp293706pxb; Wed, 27 Oct 2021 03:15:12 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz5GZPBTSjkNclyLZm42yhbVVBdv4JJnrpDyzd1HxIhayPfiaKwfw2f7pmjpnbT5vjJNcn6 X-Received: by 2002:a17:907:1044:: with SMTP id oy4mr3594548ejb.311.1635329712400; Wed, 27 Oct 2021 03:15:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1635329712; cv=none; d=google.com; s=arc-20160816; b=gNEXyG92gv6krDSvgQ3RNZrqVS0gp3DJu+WeBZjw2XIH/r1XObR8rCGxW5PyEtUlkq jIKdSbWzapIaI58DSS36fPMpVIeargZ4bHVWWXSR2aQbOmlvfFtSHXY3lh8TfEIrxhiy apxXTTChMxgMGSCNG1wkapRLEdJYssqST+sKJDObAXOc17sVEb80fylUnafOdCy1e0yP F0DOGuxNeqzKCeppFcsmwMNS6+UMfI1R7d/huow7RhTCpgGB8X+gkW9LBubSWW2pQW9r mxzQBu7+c2DjSYBULVcl8k9ijZ1OLUqRWK+khm9gdMNA+q12p9F1FQyHPy+HO4AGY+A8 wSMQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=eatsSJ90+7wtVivj0BA+XY9T/Yqtf/IIShhkQpS1u2U=; b=JBU2P6JXSbYOnpse1kKDQ/mnyB7sQ5a/d6p0/ILVxTQuwF7wJ1sFi5PJGqw7XWlcA/ cCwjQEdQArvUkYHOH54pMJrkNLj92/A34h9d5H2xez0+0dof0EqgcaxpWbmOqmDIj3EQ oZA+UY1Uf/BIKFidIHInrsXbnYQHERHKtG9v5vMRTwFBFNp6DlB1lG5araXi8LqxK+jC SrkjgzqJFHbmIfEUaw5oFi2FnAcURsUIeHE4G9Ce/qddYMWYfVwEHE+GnMig+0JVb3AV WyN82rH1M6ROqvYI+h92xIO8qPHhH5S7a73dK4IylmiR4ucKlSxP2YBccVNsvtNJiEw+ 30qg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=Ol+Xd7xf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id hr4si40756397ejc.651.2021.10.27.03.14.48; Wed, 27 Oct 2021 03:15:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b=Ol+Xd7xf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237220AbhJZUGo (ORCPT + 99 others); Tue, 26 Oct 2021 16:06:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48766 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235867AbhJZUGn (ORCPT ); Tue, 26 Oct 2021 16:06:43 -0400 Received: from mail-yb1-xb31.google.com (mail-yb1-xb31.google.com [IPv6:2607:f8b0:4864:20::b31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2083C061570 for ; Tue, 26 Oct 2021 13:04:18 -0700 (PDT) Received: by mail-yb1-xb31.google.com with SMTP id r184so460318ybc.10 for ; Tue, 26 Oct 2021 13:04:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=eatsSJ90+7wtVivj0BA+XY9T/Yqtf/IIShhkQpS1u2U=; b=Ol+Xd7xfdmTF0nEr1WSsZg01qEplLaHwGMrNCTy7RvwubZbsuSVjHmoemXCSVTPrkm 1Sil+Lopr26zO8CY+VtAbf/VM4w4p3S94KkfZQNxDOMZMT1PkErrNlNiqMtNNzvuoPSB rM7YHnaePQTP10yXowUZ7R96oiU834ZB0XXuM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=eatsSJ90+7wtVivj0BA+XY9T/Yqtf/IIShhkQpS1u2U=; b=h9zkwaSCsLplaAWUil9TeAX1D3ns+TUV4LFiPHhQmrbjhOyzrhRfUo7GG7wYyRx5my aXk5CUa0zkKnIhu8TdiXjsJPZ6u4mY22iLixxIIMQXunO93K3Pd86fs7PEtSfdIPZSAy JekrHLyxzKw0KFT872yYYZQnqRiK2gTFDtdNs7MiU9FndPbOosh2KZ2jPJigOgU4O/fI R4epylzxzpEQUEE+qj1hfcDgaBbgSZ0mzsa9PsA9SbJodWQVXhMnOL3U2L3VgT6Mm4YE tdq1m2J3I4g7ejjgHKbMF5y+UGgt87CvhTVh+udYqJTcW8Whp2bGh7dfT/63e8bj/B86 Ntdg== X-Gm-Message-State: AOAM530/agt7hXUX8Si62G2rEMaZRkrUebMDQd8WH0gf2N3/9BAefZOl QoSzt++DRZRxn7RFUlZ0DOdZzNVK0hTvaCmNz94iZ7NWzTbGEw== X-Received: by 2002:a25:6705:: with SMTP id b5mr26165199ybc.116.1635278657910; Tue, 26 Oct 2021 13:04:17 -0700 (PDT) MIME-Version: 1.0 References: <725e121f05362da4328dda08d5814211a0725dac.1635064599.git.leonro@nvidia.com> In-Reply-To: From: Edwin Peer Date: Tue, 26 Oct 2021 13:03:41 -0700 Message-ID: Subject: Re: [PATCH net-next] netdevsim: Register and unregister devlink traps on probe/remove device To: Leon Romanovsky Cc: Ido Schimmel , "David S . Miller" , Jakub Kicinski , Ido Schimmel , Jiri Pirko , Linux Kernel Mailing List , netdev , syzbot+93d5accfaefceedf43c1@syzkaller.appspotmail.com, Michael Chan Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Oct 26, 2021 at 12:22 PM Leon Romanovsky wrote: > At least in mlx5 case, reload_enable() was before register_netdev(). > It stayed like this after swapping it with devlink_register(). What am I missing here? err = mlx5_init_one(dev); if (err) { mlx5_core_err(dev, "mlx5_init_one failed with error code %d\n", err); goto err_init_one; } err = mlx5_crdump_enable(dev); if (err) dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code %d\n", err); pci_save_state(pdev); devlink_register(devlink); Doesn't mlx5_init_one() ultimately result in the netdev being presented to user space, even if it is via aux bus? > No, it is not requirement, but my suggestion. You need to be aware that > after call to devlink_register(), the device will be fully open for devlink > netlink access. So it is strongly advised to put devlink_register to be the > last command in PCI initialization sequence. Right, that's the problem. Once we register the netdev, we're in a race with user space, which may expect to be able to call devlink before we get to devlink_register(). > You obviously need to fix your code. Upstream version of bnxt driver > doesn't have reload_* support, so all this regression blaming it not > relevant here. Right, our timing is unfortunate and that's on us. It's still not clear to me how to actually fix the devlink reload code without the benefit of something similar to the reload enable API. > In upstream code, devlink_register() doesn't accept ops like it was > before and position of that call does only one thing - opens devlink > netlink access. All kernel devlink APIs continue to be accessible even > before devlink_register. This isn't about kernel API. This is precisely about existing user space that expects devlink to work immediately after the netdev appears. > It looks like your failure is in backport code. Our out-of-tree driver isn't the issue here. I'm talking about the proposed upstream code. The issue is what to do in order to get something workable upstream for devlink reload. We can't move devlink_register() later, that will cause a regression. What do you suggest instead? Regards, Edwin Peer