To: rt-users
From: Chris Friesen
Subject: [RT] hit recently-fixed PREEMPT_RT CFS-bandwidth timer locking issue in the wild
Date: Fri, 26 Jul 2019 13:36:51 -0600
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

I thought people might be interested to hear that we recently hit the bug fixed by git commit c0ad4aa4d8 on multiple lab systems running the RHEL 7 "kernel-rt" kernel.
(But I think other versions are at risk as well.)

Interestingly, when the bug hit, the system just hung completely. Nothing was emitted on netconsole or serial console, neither the hung task timer nor the NMI watchdog triggered, CONFIG_DEBUG_SPINLOCK didn't output anything, and magic sysrq didn't work on the serial console. As you can imagine this was a bit frustrating. I was finally able to cause a panic by sending an NMI from the BMC and that allowed kdump to store the core file so I could get stack traces.

Given how annoying it was to debug, I'd recommend backporting this fix as far back as it applies. HRTIMER_MODE_SOFT was introduced in mainline in 4.16, but at least in the RHEL 7 kernel-rt package (and I think in the vanilla PREEMPT_RT patches as well) hrtimers are run by default in softirq context and so the fix might apply to all supported PREEMPT_RT versions.

Chris