Message-ID: <64e59afe33fff04861c800853a549f7979270f79.camel@redhat.com>
Subject: Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
From: Paolo Abeni
To: Qiao Ma, davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
    gustavoars@kernel.org, cai.huoqing@linux.dev, aviad.krawczyk@huawei.com,
    zhaochen6@huawei.com
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 30 Jun 2022 11:56:48 +0200
In-Reply-To: <07736c2b7019b6883076a06129e06e8f7c5f7154.1656487154.git.mqaio@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2022-06-29 at 15:28 +0800, Qiao Ma wrote:
> When using a hinic device as a bond slave device, and reading the device
> stats of the master bond device, the kernel may hang.
>
> The kernel panic call trace is as follows:
> Kernel panic - not syncing: softlockup: hung tasks
> Call trace:
>   native_queued_spin_lock_slowpath+0x1ec/0x31c
>   dev_get_stats+0x60/0xcc
>   dev_seq_printf_stats+0x40/0x120
>   dev_seq_show+0x1c/0x40
>   seq_read_iter+0x3c8/0x4dc
>   seq_read+0xe0/0x130
>   proc_reg_read+0xa8/0xe0
>   vfs_read+0xb0/0x1d4
>   ksys_read+0x70/0xfc
>   __arm64_sys_read+0x20/0x30
>   el0_svc_common+0x88/0x234
>   do_el0_svc+0x2c/0x90
>   el0_svc+0x1c/0x30
>   el0_sync_handler+0xa8/0xb0
>   el0_sync+0x148/0x180
>
> And the call trace of the task that actually caused the kernel hang is
> as follows:
>   __switch_to+124
>   __schedule+548
>   schedule+72
>   schedule_timeout+348
>   __down_common+188
>   __down+24
>   down+104
>   hinic_get_stats64+44 [hinic]
>   dev_get_stats+92
>   bond_get_stats+172 [bonding]
>   dev_get_stats+92
>   dev_seq_printf_stats+60
>   dev_seq_show+24
>   seq_read_iter+964
>   seq_read+220
>   proc_reg_read+164
>   vfs_read+172
>   ksys_read+108
>   __arm64_sys_read+28
>   el0_svc_common+132
>   do_el0_svc+40
>   el0_svc+24
>   el0_sync_handler+164
>   el0_sync+324
>
> When getting device stats from a bond, the kernel calls bond_get_stats().
> It first takes the spinlock bond->stats_lock, and then calls
> hinic_get_stats64() to collect the hinic device's stats.
> However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
> protect its critical section, which may schedule the current task out.
> If the system is under high pressure, the task cannot be woken up
> immediately, which eventually triggers the kernel hung-task panic.
>
> Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
> Signed-off-by: Qiao Ma

Side note: it looks like, after this patch, every section protected by
mgmt_lock is already under rtnl lock protection, so you could probably
remove the hinic-specific lock (in a separate net-next patch).

Please double-check the above, as I only skimmed it quickly.

Thanks,

Paolo
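
Illustration of the locking pattern described in the quoted commit message.
This is a minimal user-space sketch in C, not kernel code. Every model_* name
is hypothetical, the atomic_depth counter merely stands in for "a spinlock is
held", and model_get_stats_nonsleeping() is only an assumed shape for a
callback that avoids the sleeping lock. It is not the change made by the
patch, whose diff is not quoted in this message.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for atomic context: number of spinlocks currently held. */
static int atomic_depth;
/* How many times a sleeping primitive was entered in atomic context. */
static int sleep_in_atomic;

void model_spin_lock(void)
{
    atomic_depth++;
}

void model_spin_unlock(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Stand-in for down(): it may sleep, so flag it if a spinlock is held. */
void model_down(const char *caller)
{
    if (atomic_depth > 0) {
        sleep_in_atomic++;
        printf("BUG: %s may sleep with %d spinlock(s) held\n",
               caller, atomic_depth);
    }
}

void model_up(void)
{
}

/* Shape of the callback in the second trace: takes a sleeping lock. */
void model_get_stats_sleeping(void)
{
    model_down("model_get_stats_sleeping");
    /* ... read counters ... */
    model_up();
}

/* Hypothetical alternative: reads counters without a sleeping lock. */
void model_get_stats_nonsleeping(void)
{
    /* ... read counters ... */
}

/*
 * Shape of the bonding path: hold a spinlock, then call the slave's
 * stats callback. Returns how many violations the call produced.
 */
int model_bond_get_stats(void (*slave_get_stats)(void))
{
    int before = sleep_in_atomic;

    model_spin_lock();
    slave_get_stats();
    model_spin_unlock();

    return sleep_in_atomic - before;
}
```

In this model, model_bond_get_stats(model_get_stats_sleeping) reports one
violation, matching the down()-under-spinlock sequence in the second call
trace, while model_bond_get_stats(model_get_stats_nonsleeping) reports none.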