Received: by 2002:a05:6a10:f3d0:0:0:0:0 with SMTP id a16csp25338pxv; Tue, 29 Jun 2021 22:21:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxI+h3Z1mz8oZOO0dK2mcEZkivt6/3bnxaf2GDawyQZZkFcFJFck84zP0b0ldWqIuJlMq4A X-Received: by 2002:a17:907:3e1b:: with SMTP id hp27mr33043442ejc.470.1625030507994; Tue, 29 Jun 2021 22:21:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1625030507; cv=none; d=google.com; s=arc-20160816; b=txiG99eWkZu6fY+UMpz6V/JBySy4QJER4dH1BhJCmgA8HnN6PtYpx6ifEO87r7ndwP WnnljLpQ8VvjiYjP8TGZM2bFhtK6ZHnj6M0kUVjBbsnDibhNEu96ip4FH/vB/WNryGo7 yLHQyY2iaaJFlx6gJW5fq/WfUWVujU1rHx1ag6ryWP1GsM/crXPi0I48+HREKSb3eg38 Cp64THG1h64fIEnedS1QU1LLwofAheY3YNXDpmBYzjQ3CMP2OUasT5nkYgf7ifedyCWN j03f8LoGcU0fpswBXD26HZksqH2oYGqlLqaEmgQ6SWHeyzDle7iBXpUHMapypk8+vGrz R3Pg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version; bh=3fW7OvTi2NbEKQkpkAR20Gdi1SBFD/hBlroCkzQ4+Rk=; b=goQoT086N3/siEi2r/bhnZh6n2Ctq+48ojZLoCejSInVkEfZvBEkHfRa6CV0Z94GwQ 9JfZiry4HkHiTfSe6Lh+e+Lyv/HzWsWvduJsh4jox+uWUCdl1fUk6Y7lLnHeKXSm7uTA 46jEAr0UaFgJHwgkc0XG9AOeddJf5FT98b77TuZUH48g+yt6IGBgWgFxQfMILhOi14Xi frNasSbkphAJES1gzK/ewf5C0kEsVIYAW70XOeV3ani83qiS6QKj7Sbrxf7yCxHYrmg9 Awh/+amOO0AVHGonrZceZ9tLpLmX85b9zZltYFyL07qwW++1GXU6XQC9+6TeTZdtHpGt 4Afg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id p8si22263143edj.155.2021.06.29.22.21.23; Tue, 29 Jun 2021 22:21:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229924AbhF3FWH convert rfc822-to-8bit (ORCPT + 99 others); Wed, 30 Jun 2021 01:22:07 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:54813 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233364AbhF3FVq (ORCPT ); Wed, 30 Jun 2021 01:21:46 -0400 Received: from mail-ed1-f71.google.com ([209.85.208.71]) by youngberry.canonical.com with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.93) (envelope-from ) id 1lySd2-0001dV-MY for linux-kernel@vger.kernel.org; Wed, 30 Jun 2021 05:19:16 +0000 Received: by mail-ed1-f71.google.com with SMTP id h11-20020a50ed8b0000b02903947b9ca1f3so526116edr.7 for ; Tue, 29 Jun 2021 22:19:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=UyIG4rD3Oagh4vKkL1rfDRcvV3u/g4NiyUtXwM4RGjk=; b=Lk0WdI6mHiEJKlzflcoYZg5BHVHgIhxnpwdAF8jLdTi6V+yAwr+EDRf6AW0vddpfyo vofpehm1u5L/LH6xthLFJvcYzTyV/TEjczMChT1S7JkmRopo+xllplK/OFs7KbUYBEqi qdbRkFI0nVIZz7YAnbF5ONM5wix1wR0mkFwvi8YbyIR1gy0iiSTV5kCeOUPvUDDMiIAI i8lBpeUSmshpoevUN1TLc8qKJeeUK0JIlZiLPrZaxWA04acbuoo9ZVs2/gbyYEcnjuzV fX4BN4MVqYFTDl3JrCWDRoOXIemZuJohmhROSBL/iyh8fesfZfvOhrFreOuU8sDiORnD kV+A== X-Gm-Message-State: AOAM530NAEODCCSDT/R5FGu0W7Q+FKZvZ+l8pXWM6HQqfr6XquQwogRi pS/osFspsUSv/1aAcsGFNjqKR1ZUT9IxxNZvhYYdplt9lk7FkHUstX00+4Opp10OSrJ+hbxiJuZ 7UX+09RXDCl4Lbu2fwh+GaWb81Eot8Jm1YWnZ589JH/7gbK4//awxN27pRA== X-Received: by 2002:a05:6402:1057:: with SMTP id e23mr1364842edu.352.1625030356366; Tue, 29 Jun 2021 22:19:16 -0700 (PDT) X-Received: by 2002:a05:6402:1057:: with SMTP id e23mr1364823edu.352.1625030356131; Tue, 29 Jun 2021 22:19:16 -0700 (PDT) MIME-Version: 1.0 References: <20210420075406.64105-1-acelan.kao@canonical.com> In-Reply-To: From: AceLan Kao Date: Wed, 30 Jun 2021 13:19:05 +0800 Message-ID: Subject: Re: [PATCH] net: called rtnl_unlock() before runpm resumes devices To: Heiner Kallweit Cc: Krzysztof Kozlowski , "David S. Miller" , Jakub Kicinski , Alexei Starovoitov , Andrii Nakryiko , Eric Dumazet , Wei Wang , Cong Wang , Taehee Yoo , =?UTF-8?B?QmrDtnJuIFTDtnBlbA==?= , netdev , "linux-kernel@vger.kernel.org" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8BIT Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org It's been a while, do we have any conclusion about this? Do you need me re-send the patch with "Fixes:"? Heiner Kallweit 於 2021年4月30日 週五 上午3:36寫道: > > On 29.04.2021 13:58, Krzysztof Kozlowski wrote: > > On Tue, 20 Apr 2021 at 09:55, AceLan Kao wrote: > >> > >> From: "Chia-Lin Kao (AceLan)" > >> > >> The rtnl_lock() has been called in rtnetlink_rcv_msg(), and then in > >> __dev_open() it calls pm_runtime_resume() to resume devices, and in > >> some devices' resume function(igb_resum,igc_resume) they calls rtnl_lock() > >> again. That leads to a recursive lock. > >> > >> It should leave the devices' resume function to decide if they need to > >> call rtnl_lock()/rtnl_unlock(), so call rtnl_unlock() before calling > >> pm_runtime_resume() and then call rtnl_lock() after it in __dev_open(). > >> > >> [ 967.723577] INFO: task ip:6024 blocked for more than 120 seconds. > >> [ 967.723588] Not tainted 5.12.0-rc3+ #1 > >> [ 967.723592] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > >> [ 967.723594] task:ip state:D stack: 0 pid: 6024 ppid: 5957 flags:0x00004000 > >> [ 967.723603] Call Trace: > >> [ 967.723610] __schedule+0x2de/0x890 > >> [ 967.723623] schedule+0x4f/0xc0 > >> [ 967.723629] schedule_preempt_disabled+0xe/0x10 > >> [ 967.723636] __mutex_lock.isra.0+0x190/0x510 > >> [ 967.723644] __mutex_lock_slowpath+0x13/0x20 > >> [ 967.723651] mutex_lock+0x32/0x40 > >> [ 967.723657] rtnl_lock+0x15/0x20 > >> [ 967.723665] igb_resume+0xee/0x1d0 [igb] > >> [ 967.723687] ? pci_pm_default_resume+0x30/0x30 > >> [ 967.723696] igb_runtime_resume+0xe/0x10 [igb] > >> [ 967.723713] pci_pm_runtime_resume+0x74/0x90 > >> [ 967.723718] __rpm_callback+0x53/0x1c0 > >> [ 967.723725] rpm_callback+0x57/0x80 > >> [ 967.723730] ? pci_pm_default_resume+0x30/0x30 > >> [ 967.723735] rpm_resume+0x547/0x760 > >> [ 967.723740] __pm_runtime_resume+0x52/0x80 > >> [ 967.723745] __dev_open+0x56/0x160 > >> [ 967.723753] ? _raw_spin_unlock_bh+0x1e/0x20 > >> [ 967.723758] __dev_change_flags+0x188/0x1e0 > >> [ 967.723766] dev_change_flags+0x26/0x60 > >> [ 967.723773] do_setlink+0x723/0x10b0 > >> [ 967.723782] ? __nla_validate_parse+0x5b/0xb80 > >> [ 967.723792] __rtnl_newlink+0x594/0xa00 > >> [ 967.723800] ? nla_put_ifalias+0x38/0xa0 > >> [ 967.723807] ? __nla_reserve+0x41/0x50 > >> [ 967.723813] ? __nla_reserve+0x41/0x50 > >> [ 967.723818] ? __kmalloc_node_track_caller+0x49b/0x4d0 > >> [ 967.723824] ? pskb_expand_head+0x75/0x310 > >> [ 967.723830] ? nla_reserve+0x28/0x30 > >> [ 967.723835] ? skb_free_head+0x25/0x30 > >> [ 967.723843] ? security_sock_rcv_skb+0x2f/0x50 > >> [ 967.723850] ? netlink_deliver_tap+0x3d/0x210 > >> [ 967.723859] ? sk_filter_trim_cap+0xc1/0x230 > >> [ 967.723863] ? skb_queue_tail+0x43/0x50 > >> [ 967.723870] ? sock_def_readable+0x4b/0x80 > >> [ 967.723876] ? __netlink_sendskb+0x42/0x50 > >> [ 967.723888] ? security_capable+0x3d/0x60 > >> [ 967.723894] ? __cond_resched+0x19/0x30 > >> [ 967.723900] ? kmem_cache_alloc_trace+0x390/0x440 > >> [ 967.723906] rtnl_newlink+0x49/0x70 > >> [ 967.723913] rtnetlink_rcv_msg+0x13c/0x370 > >> [ 967.723920] ? _copy_to_iter+0xa0/0x460 > >> [ 967.723927] ? rtnl_calcit.isra.0+0x130/0x130 > >> [ 967.723934] netlink_rcv_skb+0x55/0x100 > >> [ 967.723939] rtnetlink_rcv+0x15/0x20 > >> [ 967.723944] netlink_unicast+0x1a8/0x250 > >> [ 967.723949] netlink_sendmsg+0x233/0x460 > >> [ 967.723954] sock_sendmsg+0x65/0x70 > >> [ 967.723958] ____sys_sendmsg+0x218/0x290 > >> [ 967.723961] ? copy_msghdr_from_user+0x5c/0x90 > >> [ 967.723966] ? lru_cache_add_inactive_or_unevictable+0x27/0xb0 > >> [ 967.723974] ___sys_sendmsg+0x81/0xc0 > >> [ 967.723980] ? __mod_memcg_lruvec_state+0x22/0xe0 > >> [ 967.723987] ? kmem_cache_free+0x244/0x420 > >> [ 967.723991] ? dentry_free+0x37/0x70 > >> [ 967.723996] ? mntput_no_expire+0x4c/0x260 > >> [ 967.724001] ? __cond_resched+0x19/0x30 > >> [ 967.724007] ? security_file_free+0x54/0x60 > >> [ 967.724013] ? call_rcu+0xa4/0x250 > >> [ 967.724021] __sys_sendmsg+0x62/0xb0 > >> [ 967.724026] ? exit_to_user_mode_prepare+0x3d/0x1a0 > >> [ 967.724032] __x64_sys_sendmsg+0x1f/0x30 > >> [ 967.724037] do_syscall_64+0x38/0x90 > >> [ 967.724044] entry_SYSCALL_64_after_hwframe+0x44/0xae > >> > >> Signed-off-by: Chia-Lin Kao (AceLan) > >> --- > >> net/core/dev.c | 5 ++++- > >> 1 file changed, 4 insertions(+), 1 deletion(-) > >> > >> diff --git a/net/core/dev.c b/net/core/dev.c > >> index 1f79b9aa9a3f..427cbc80d1e5 100644 > >> --- a/net/core/dev.c > >> +++ b/net/core/dev.c > >> @@ -1537,8 +1537,11 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack) > >> > >> if (!netif_device_present(dev)) { > >> /* may be detached because parent is runtime-suspended */ > >> - if (dev->dev.parent) > >> + if (dev->dev.parent) { > >> + rtnl_unlock(); > >> pm_runtime_resume(dev->dev.parent); > > > > A side topic, maybe a little bit silly question (I don't know that > > much about net core): > > Why not increase the parent or current PM runtime usage counter > > instead? The problem with calling pm_runtime_resume() is that if the > > parent device was already suspended (so it's usage counter ==0), it > > might get suspended right after this call. IOW, you do not have any > > guarantee that the device will be really resumed. Probably it should > > be pm_runtime_resume_and_get() (and later "put" on close or other > > events). This however might not solve the lock problem at all. > > > The point of runtime-resuming the parent here isn't to ensure it's > resumed for the complete time the device is open. It's called > because netif_device_present() may be (initially) false just because > the parent is runtime-suspended. > We want the device to be able to runtime-suspend later if e.g. > the link is down. > > > Best regards, > > Krzysztof > > >