Date: Mon, 9 May 2022 09:11:02 +0000
Message-Id: <20220509091103.2220604-1-sebastianene@google.com>
Subject: [PATCH v5 0/2] Detect stalls on guest vCPUS
From: Sebastian Ene
To: Rob Herring, Greg Kroah-Hartman, Arnd Bergmann, Dragan Cvetic
Cc: linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
    maz@kernel.org, will@kernel.org, qperret@google.com, Guenter Roeck,
    Sebastian Ene
X-Mailing-List: linux-kernel@vger.kernel.org

This adds a mechanism to detect stalls on the guest vCPUs by creating a
per-CPU hrtimer which periodically 'pets' the host backend driver. With a
conventional watchdog-core driver, userspace is responsible for
delivering the 'pet' events by writing to the corresponding
/dev/watchdogN node. In this case we require a strong thread affinity to
be able to account for lost time on a per-vCPU basis.

This device driver acts as a soft lockup detector by relying on the host
backend driver to measure the elapsed time between subsequent 'pet'
events. If the elapsed time doesn't match the expected value, the backend
driver decides that the guest vCPU is locked up and resets the guest. The
host backend driver takes into account the time during which the guest is
not running. Communication with the backend driver is done through MMIO,
and the register layout of the virtual watchdog is described as part of
the backend driver changes.

The host backend driver is implemented as part of:
https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/3548817

Changelog v5:
 - fix dt warnings
 - rename %s/watchdog/stall_detector/g
 - rename the Kconfig option VM_WATCHDOG -> VCPU_STALL_DETECTOR

Changelog v4:
 - rename the source from vm-wdt.c -> vm-watchdog.c
 - convert all the error logging calls from pr_* to dev_* calls
 - rename the DTS node "clock" to "clock-frequency"

Changelog v3:
 - cosmetic fixes, remove pr_info and version information
 - improve description in the commit message
 - improve description in the Kconfig help section

Sebastian Ene (2):
  dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible
  misc: Add a mechanism to detect stalls on guest vCPUs

 .../bindings/misc/vcpu_stall_detector.yaml    |  47 ++++
 drivers/misc/Kconfig                          |  12 +
 drivers/misc/Makefile                         |   1 +
 drivers/misc/vcpu_stall_detector.c            | 218 ++++++++++++++++++
 4 files changed, 278 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/misc/vcpu_stall_detector.yaml
 create mode 100644 drivers/misc/vcpu_stall_detector.c

-- 
2.36.0.512.ge40c2bad7a-goog
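
For readers who want a concrete picture of the stall check described in
the cover letter, here is a minimal user-space sketch of how a backend
might account for 'pet' events per vCPU. It is not taken from the crosvm
change or from this series: the struct, the function names, and the
nanosecond units are all hypothetical, and the rule "stalled when the
time the vCPU actually ran since the last pet exceeds a timeout" is an
assumed reading of "doesn't match an expected value". The MMIO register
layout is deliberately left out, since it is defined by the backend
changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU bookkeeping kept by the host backend. */
struct vcpu_stall_state {
	uint64_t last_pet_ns;    /* host time of the most recent 'pet' */
	uint64_t not_running_ns; /* time the vCPU was not running since that pet */
	uint64_t timeout_ns;     /* assumed allowed run time between two pets */
};

/* Called when the guest's per-CPU timer pets the backend. */
static void vcpu_pet(struct vcpu_stall_state *s, uint64_t now_ns)
{
	assert(s);
	s->last_pet_ns = now_ns;
	s->not_running_ns = 0;
}

/* Called by the host whenever the vCPU spent delta_ns not running. */
static void vcpu_account_not_running(struct vcpu_stall_state *s,
				     uint64_t delta_ns)
{
	assert(s);
	s->not_running_ns += delta_ns;
}

/*
 * Returns true when the vCPU has run for longer than timeout_ns since
 * its last pet. Time during which the guest was not running is
 * subtracted first, so a descheduled vCPU is not reported as stalled.
 * Assumes now_ns is not earlier than the last pet.
 */
static bool vcpu_is_stalled(const struct vcpu_stall_state *s, uint64_t now_ns)
{
	uint64_t elapsed, ran;

	assert(s);
	elapsed = now_ns - s->last_pet_ns;
	ran = elapsed > s->not_running_ns ? elapsed - s->not_running_ns : 0;
	return ran > s->timeout_ns;
}
```

Keeping one such state per vCPU is what makes the strong thread
affinity matter: a pet delivered from the wrong CPU would refresh the
wrong counter and could hide a stalled vCPU.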