Subject: Re: [PATCH net-next RFC 00/13] Add devlink reload level option
To: Vasundhara Volam
Cc: Jacob Keller, "David S. Miller", Jiri Pirko, Netdev, open list
References: <1595847753-2234-1-git-send-email-moshe@mellanox.com> <7a9c315f-fa29-7bd5-31be-3748b8841b29@mellanox.com> <7fd63d16-f9fa-9d55-0b30-fe190d0fb1cb@mellanox.com>
From: Moshe Shemesh
Date: Wed, 5 Aug 2020 11:20:59 +0300
X-Mailing-List: linux-kernel@vger.kernel.org

On 8/5/2020 9:55 AM,
Vasundhara Volam wrote:
> On Wed, Aug 5, 2020 at 12:02 PM Moshe Shemesh wrote:
>>
>> On 8/4/2020 1:13 PM, Vasundhara Volam wrote:
>>> On Mon, Aug 3, 2020 at 7:23 PM Moshe Shemesh wrote:
>>>> On 8/3/2020 3:47 PM, Vasundhara Volam wrote:
>>>>> On Mon, Aug 3, 2020 at 5:47 PM Moshe Shemesh wrote:
>>>>>> On 8/3/2020 1:24 PM, Vasundhara Volam wrote:
>>>>>>> On Tue, Jul 28, 2020 at 10:13 PM Jacob Keller wrote:
>>>>>>>> On 7/27/2020 10:25 PM, Vasundhara Volam wrote:
>>>>>>>>> On Mon, Jul 27, 2020 at 4:36 PM Moshe Shemesh wrote:
>>>>>>>>>> Introduce new option on devlink reload API to enable the user to select the
>>>>>>>>>> reload level required. Complete support for all levels in mlx5.
>>>>>>>>>> The following reload levels are supported:
>>>>>>>>>> driver: Driver entities re-instantiation only.
>>>>>>>>>> fw_reset: Firmware reset and driver entities re-instantiation.
>>>>>>>>> The Name is a little confusing. I think it should be renamed to
>>>>>>>>> fw_live_reset (in which both firmware and driver entities are
>>>>>>>>> re-instantiated). For only fw_reset, the driver should not undergo
>>>>>>>>> reset (it requires a driver reload for firmware to undergo reset).
>>>>>>>>>
>>>>>>>> So, I think the differentiation here is that "live_patch" doesn't reset
>>>>>>>> anything.
>>>>>>> This seems similar to flashing the firmware and does not reset anything.
>>>>>> The live patch is activating fw change without reset.
>>>>>>
>>>>>> It is not suitable for any fw change but fw gaps which don't require reset.
>>>>>>
>>>>>> I can query the fw to check if the pending image change is suitable or
>>>>>> require fw reset.
>>>>> Okay.
>>>>>>>>>> fw_live_patch: Firmware live patching only.
>>>>>>>>> This level is not clear. Is this similar to flashing??
>>>>>>>>>
>>>>>>>>> Also I have a basic query. The reload command is split into
>>>>>>>>> reload_up/reload_down handlers (Please correct me if this behaviour is
>>>>>>>>> changed with this patchset). What if the vendor specific driver does
>>>>>>>>> not support up/down and needs only a single handler to fire a firmware
>>>>>>>>> reset or firmware live reset command?
>>>>>>>> In the "reload_down" handler, they would trigger the appropriate reset,
>>>>>>>> and quiesce anything that needs to be done. Then on reload up, it would
>>>>>>>> restore and bring up anything quiesced in the first stage.
>>>>>>> Yes, I got the "reload_down" and "reload_up". Similar to the device
>>>>>>> "remove" and "re-probe" respectively.
>>>>>>>
>>>>>>> But our requirement is a similar "ethtool reset" command, where
>>>>>>> ethtool calls a single callback in driver and driver just sends a
>>>>>>> firmware command for doing the reset. Once firmware receives the
>>>>>>> command, it will initiate the reset of driver and firmware entities
>>>>>>> asynchronously.
>>>>>> It is similar to mlx5 case here for fw_reset. The driver triggers the fw
>>>>>> command to reset and all PFs drivers gets events to handle and do
>>>>>> re-initialization. To fit it to the devlink reload_down and reload_up,
>>>>>> I wait for the event handler to complete and it stops at driver unload
>>>>>> to have the driver up by devlink reload_up. See patch 8 in this patchset.
>>>>>>
>>>>> Yes, I see reload_down is triggering the reset. In our driver, after
>>>>> triggering the reset through a firmware command, reset is done in
>>>>> another context as the driver initiates the reset only after receiving
>>>>> an ASYNC event from the firmware.
>>>> Same here.
>>>>
>>>>> Probably, we have to use reload_down() to send firmware command to
>>>>> trigger reset and do nothing in reload_up.
>>>> I had that in previous version, but its wrong to use devlink reload this
>>>> way, so I added wait with timeout for the event handling to complete
>>>> before unload_down function ends. See mlx5_fw_wait_fw_reset_done(). Also
>>>> the event handler stops before load back to have that done by devlink
>>>> reload_up.
>>> But "devlink dev reload" will be invoked by the user only on a single
>>> dev handler and all function drivers will be re-instantiated upon the
>>> ASYNC event. reload_down and reload_up are invoked only the function
>>> which the user invoked.
>>>
>>> Take an example of a 2-port (PF0 and PF1) adapter on a single host and
>>> with some VFs loaded on the device. User invokes "devlink dev reload"
>>> on PF0, ASYNC event is received on 2 PFs and VFs for reset. All the
>>> function drivers will be re-instantiated including PF0.
>>>
>>> If we wait for some time in reload_down() of PF0 and then call load in
>>> reload_up(), this code will be different from other function drivers.
>>
>> I see your point here, but the user run devlink reload command on one
>> PF, in this case of fw-reset it will influence other PFs, but that's a
>> result of the fw-reset, the user if asked for params change or namespace
>> change that was for this PF.
> Right, if any driver is implementing only fw-reset have to leave
> reload_up as an empty function.

No, it's not only up to the driver. The netns option is implemented by
devlink, and it runs between reload_down and reload_up.

>>>>> And returning from reload
>>>>> does not mean that reset is complete as it is done in another context
>>>>> and the driver notifies the health reporter once the reset is
>>>>> complete. devlink framework may have to allow drivers to implement
>>>>> reload_down only to look more clean or call reload_up only if the
>>>>> driver notifies the devlink once reset is completed from another
>>>>> context. Please suggest.
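
To make the ordering discussed above concrete, here is a rough toy model in user-space Python. It is not kernel code and not the mlx5 implementation: ToyDevice, toy_devlink_reload and the log strings are all made up for illustration. It only assumes what the thread describes: reload_down triggers the reset and waits (with a timeout) for the event handler running in another context, the handler stops after unload, the netns change happens in between, and reload_up brings the driver back.

```python
import threading


class ToyDevice:
    """Hypothetical stand-in for one PF; not a real devlink or mlx5 API."""

    def __init__(self, name):
        self.name = name
        self.netns = "init_net"
        self.loaded = True
        self.reset_done = threading.Event()
        self.log = []

    def _fw_event_handler(self):
        # Runs in another context, like the ASYNC event handler in the thread.
        # It unloads the driver and deliberately stops before loading it back.
        self.loaded = False
        self.log.append("fw_reset_unload")
        self.reset_done.set()

    def reload_down(self, timeout=1.0):
        self.log.append("reload_down")
        self.reset_done.clear()
        # Stand-in for sending the firmware reset command.
        threading.Thread(target=self._fw_event_handler).start()
        # Wait with a timeout for the event handling to complete.
        if not self.reset_done.wait(timeout):
            raise TimeoutError("fw reset did not complete")

    def reload_up(self):
        # Load back is left to reload_up, after any netns change.
        self.loaded = True
        self.log.append("reload_up")


def toy_devlink_reload(dev, new_netns=None):
    """Toy version of the core flow: down, optional netns move, up."""
    dev.reload_down()
    if new_netns is not None:
        dev.netns = new_netns
        dev.log.append("netns_change")
    dev.reload_up()
    return dev.log


dev = ToyDevice("pf0")
print(toy_devlink_reload(dev, new_netns="ns1"))
```

In this sketch the netns step sits in the caller, between the two handlers, so an empty reload_up would leave the device unloaded after the namespace change. That is the reason for waiting in reload_down rather than returning right after the firmware command is sent.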