From: Laurent Dufour
To: mpe@ellerman.id.au, benh@kernel.crashing.org, paulus@samba.org
Cc: npiggin@gmail.com, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH 0/2] powerpc prevents deadlock in the watchdog path
Date: Tue, 26 Oct 2021 18:27:38 +0200
Message-Id: <20211026162740.16283-1-ldufour@linux.ibm.com>

While doing LPM (Live Partition Mobility) on a large system (for instance, a
Brazos system with 1024 CPUs and 12TB of memory) under a heavy load (I ran
'stress-ng --futex 500 -vm 5'), watchdog hard lockups are seen when the
hypervisor takes too much time handling the page tables to track page
changes. When this happens, the system may hang with a deadlock between the
watchdog lock and the console owner lock.

The first patch of this series prevents that deadlock by not calling printk
while holding the watchdog lock, and also by not sending an IPI (and waiting
for the CPU's answer for 1s) while holding the watchdog lock.

The second patch ensures that the watchdog's data is accessed under the
protection of the watchdog lock.

Laurent Dufour (2):
  powerpc/watchdog: prevent printk and send IPI while holding the wd lock
  powerpc/watchdog: ensure watchdog data accesses are protected

 arch/powerpc/kernel/watchdog.c | 45 +++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 20 deletions(-)

-- 
2.33.1
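
For illustration only, the locking pattern described for the two patches can be sketched in userspace C. This is not the code from the series: the pthread mutex stands in for the watchdog lock, fprintf stands in for printk, and every name below (wd_lock, cpu_stuck, mark_stuck, check_and_report) is hypothetical. The idea shown is that shared state is only read or written with the lock held, while anything slow that may take another lock, such as printing, happens after the lock is released.

```c
/* Userspace sketch of the pattern; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8

/* Stand-in for the watchdog lock. */
static pthread_mutex_t wd_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared watchdog state: only accessed with wd_lock held. */
static bool cpu_stuck[NR_CPUS];

/* Record that a CPU looks stuck, touching the state under the lock. */
static void mark_stuck(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    pthread_mutex_lock(&wd_lock);
    cpu_stuck[cpu] = true;
    pthread_mutex_unlock(&wd_lock);
}

/*
 * Take a snapshot of the shared state under the lock, drop the lock,
 * and only then report. Returns the number of CPUs reported.
 */
static int check_and_report(void)
{
    bool snapshot[NR_CPUS];
    int cpu, reported = 0;

    pthread_mutex_lock(&wd_lock);
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        snapshot[cpu] = cpu_stuck[cpu];
        cpu_stuck[cpu] = false;   /* consume the event */
    }
    pthread_mutex_unlock(&wd_lock);

    /*
     * Slow path, lock no longer held: the output routine may block on
     * its own lock without creating a lock-order dependency on wd_lock.
     */
    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (snapshot[cpu]) {
            fprintf(stderr, "watchdog: CPU %d appears stuck\n", cpu);
            reported++;
        }
    }
    return reported;
}
```

A caller would invoke mark_stuck() from the detecting context and check_and_report() from the reporting context. Copying the state into a local snapshot keeps the critical section short and lets the report run without the lock.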