From: Jakub Kicinski
Date: Fri, 26 Apr 2019 09:19:39 -0700
Subject: Re: [PATCH net-next] devlink: Execute devlink health recover as a work
To: Moshe Shemesh
Cc: Saeed Mahameed, "davem@davemloft.net", Jiri Pirko, "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org"
In-Reply-To: <8bb8a789-c189-df07-5d90-fd8f2fa172ca@mellanox.com>
References: <1556189823-5368-1-git-send-email-moshe@mellanox.com>
 <20190425143847.417033ab@cakuba.netronome.com>
 <20190425193737.131be39f@cakuba.netronome.com>
 <8bb8a789-c189-df07-5d90-fd8f2fa172ca@mellanox.com>

On Fri, Apr 26, 2019 at 6:04 AM Moshe Shemesh wrote:
> On 4/26/2019 5:37 AM, Jakub Kicinski wrote:
> > On Fri, 26 Apr 2019 01:42:34 +0000, Saeed Mahameed wrote:
> >>>> @@ -4813,7 +4831,11 @@ static int
> >>>> devlink_nl_cmd_health_reporter_recover_doit(struct sk_buff *skb,
> >>>>          if (!reporter)
> >>>>                  return -EINVAL;
> >>>>
> >>>> -        return devlink_health_reporter_recover(reporter, NULL);
> >>>> +        if (!reporter->ops->recover)
> >>>> +                return -EOPNOTSUPP;
> >>>> +
> >>>> +        queue_work(devlink->reporters_wq, &reporter->recover_work);
> >>>> +        return 0;
> >>>>  }
> >>>
> >>> So the recover user space request will no longer return the status,
> >>> and it will not actually wait for the recover to happen.
> >>> Leaving user pondering - did the recover run and fail, or did it
> >>> not get run yet...
> >>>
> >>
> >> wait_for_completion_interruptible_timeout is missing from the design?
> >
> > Perhaps, but I think it's better to avoid the async execution of
> > the recover altogether. Perhaps it's better to refcount the
> > reporters on the call to recover_doit? Or some such.. :)
> >
>
> I tried using refcount instead of devlink lock here. But once I get to
> reporter destroy I wait for the refcount and not sure if I should
> release the reporter after some timeout or have endless wait for
> refcount. Both options seem not good.

Well, you should "endlessly" wait. Why would the refcount not drop? You
have to remove it from the list first, so no new operations can start,
right?

In principle there is no difference between waiting for refcount to
drop, flushing the work, or waiting for the devlink lock if reporter
holds it?
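
To make the ordering concrete, here is a minimal userspace sketch of
"unlist first, then wait for the refcount with no timeout". It is only
an illustration: it assumes pthreads in place of the kernel primitives,
and struct reporter, reporter_get(), reporter_put(), reporter_recover()
and reporter_destroy() are hypothetical names, not the devlink code.
The "listed" flag stands in for membership in the reporter list.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical reporter object; NOT the real devlink structure. */
struct reporter {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when refs drops to zero */
	int refs;		/* operations currently in flight */
	bool listed;		/* stand-in for "still on the reporter list" */
	int recovered;		/* number of completed recover calls */
};

static void reporter_init(struct reporter *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->idle, NULL);
	r->refs = 0;
	r->listed = true;
	r->recovered = 0;
}

/* Take a reference only while the reporter can still be looked up. */
static bool reporter_get(struct reporter *r)
{
	bool ok;

	pthread_mutex_lock(&r->lock);
	ok = r->listed;
	if (ok)
		r->refs++;
	pthread_mutex_unlock(&r->lock);
	return ok;
}

/* Drop a reference and wake a waiting destroyer on the last put. */
static void reporter_put(struct reporter *r)
{
	pthread_mutex_lock(&r->lock);
	assert(r->refs > 0);
	if (--r->refs == 0)
		pthread_cond_broadcast(&r->idle);
	pthread_mutex_unlock(&r->lock);
}

/* Synchronous recover: the caller gets the status back directly. */
static int reporter_recover(struct reporter *r)
{
	if (!reporter_get(r))
		return -ENODEV;	/* reporter is going away */

	/* Placeholder for the real recovery work. */
	pthread_mutex_lock(&r->lock);
	r->recovered++;
	pthread_mutex_unlock(&r->lock);

	reporter_put(r);
	return 0;
}

/*
 * Destroy: unlist first so no new operation can take a reference,
 * then wait without a timeout. Since refs can only go down after the
 * unlist, the wait ends once the in-flight operations finish.
 */
static void reporter_destroy(struct reporter *r)
{
	pthread_mutex_lock(&r->lock);
	r->listed = false;
	while (r->refs > 0)
		pthread_cond_wait(&r->idle, &r->lock);
	pthread_mutex_unlock(&r->lock);
	/* Real code would free the object here. */
}
```

If another thread were inside reporter_recover() when
reporter_destroy() ran, the destroy would block until the matching
reporter_put(), and any recover attempted after the unlist would get
-ENODEV instead of starting. That is why the wait is bounded by the
work already in flight even though it has no timeout.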