Date: Tue, 12 Jan 2021 14:04:38 -0800
From: Sean Christopherson
To: Nitesh Narayan Lal
Cc: Vitaly Kuznetsov, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    w90p710@gmail.com, pbonzini@redhat.com, Thomas Gleixner
Subject: Re: [PATCH] Revert "KVM: x86: Unconditionally enable irqs in guest context"
References: <20210105192844.296277-1-nitesh@redhat.com>
    <874kjuidgp.fsf@vitty.brq.redhat.com> <87ble1gkgx.fsf@vitty.brq.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 12, 2021, Nitesh Narayan Lal wrote:
>
> On 1/7/21 4:33 AM, Vitaly Kuznetsov wrote:
> > Sean Christopherson writes:
> >
> >> On Wed, Jan 06, 2021, Vitaly Kuznetsov wrote:
> >>> Looking back, I don't quite understand why we wanted to account ticks
> >>> between vmexit and exiting
> >>> guest context as 'guest' in the first place;
> >>> to my understanding 'guest time' is time spent within VMX non-root
> >>> operation, the rest is KVM overhead (system).
> >> With tick-based accounting, if the tick IRQ is received after PF_VCPU is cleared
> >> then that tick will be accounted to the host/system. The motivation for opening
> >> an IRQ window after VM-Exit is to handle the case where the guest is constantly
> >> exiting for a different reason _just_ before the tick arrives, e.g. if the guest
> >> has its tick configured such that the guest and host ticks get synchronized
> >> in a bad way.
> >>
> >> This is a non-issue when using CONFIG_VIRT_CPU_ACCOUNTING_GEN=y, at least with a
> >> stable TSC, as the accounting happens during guest_exit_irqoff() itself.
> >> Accounting might be less-than-stellar if TSC is unstable, but I don't think it
> >> would be as binary of a failure as tick-based accounting.
> >>
> > Oh, yea, I vaguely remember we had to deal with a very similar problem
> > but for userspace/kernel accounting. It was possible to observe e.g. a
> > userspace task going 100% kernel while in reality it was just perfectly
> > synchronized with the tick and doing a syscall just before it arrives
> > (or something like that, I may be misremembering the details).
> >
> > So depending on the frequency, it is probably possible to e.g. observe
> > '100% host' with tick-based accounting, the guest just has to
> > synchronize exiting to KVM in a way that the tick will always arrive
> > past guest_exit_irqoff().
> >
> > It seems to me this is a fundamental problem in case the frequency of
> > guest exits can match the frequency of the time accounting tick.
> >
>
> Just to make sure that I am understanding things correctly.
> There are two issues:
> 1. The first issue is with the tick IRQs that arrive after PF_VCPU is
>    cleared as they are then accounted into the system context, at least on
>    the setup where CONFIG_VIRT_CPU_ACCOUNTING_GEN is not enabled. With the
>    patch "KVM: x86: Unconditionally enable irqs in guest context", we are
>    at least taking care of the scenario where the guest context is exiting
>    constantly just before the arrival of the tick.

Yep.

> 2. The second issue that Sean mentioned was introduced because of moving
>    guest_exit_irqoff() closer to VM-exit. Due to this change, any ticks that
>    happen after IRQs are disabled are incorrectly accounted into the system
>    context. This is because we exit the guest context early without
>    ensuring if the required guest states to handle IRQs are restored.

Yep.

> So, the increase in the system time (reported by cpuacct.stats) that I was
> observing is not entirely correct after all.

It's correct, but iff CONFIG_VIRT_CPU_ACCOUNTING_GEN=y, as that doesn't rely on
ticks, so calling guest_exit_irqoff() closer to VM-Exit is better. The problem
is that it completely breaks CONFIG_VIRT_CPU_ACCOUNTING_GEN=n (#2 above)
because KVM will never service an IRQ, ticks included, with PF_VCPU set.

> Am I missing anything here?
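
To make issue #1 concrete, here is a toy simulation of the synchronization
problem. It is only a sketch and not kernel code: the function names, the
1000us tick period, and the exit timings are all made up for illustration,
and it does not model PF_VCPU, the IRQ window, or anything KVM actually does.
It compares "charge the whole tick to whoever is running when the tick fires"
against "charge the time that actually elapsed", for a guest that exits a few
microseconds before every host tick.

```python
def in_guest(t_us, period_us, exit_lead_us, exit_cost_us):
    """Return True if the (hypothetical) vCPU is in guest context at time t_us.

    The guest exits exit_lead_us before each tick and the host needs
    exit_cost_us to handle the exit, so the host window straddles the tick.
    """
    phase = t_us % period_us
    host_start = period_us - exit_lead_us   # exit begins just before the tick
    host_end = exit_cost_us - exit_lead_us  # and finishes just after it
    return not (phase >= host_start or phase < host_end)


def simulate(period_us=1000, exit_lead_us=5, exit_cost_us=10, n_ticks=100):
    """Account time in two ways and return (tick_based, exact) totals in us."""
    tick_based = {"guest": 0, "system": 0}  # sampled once per tick
    exact = {"guest": 0, "system": 0}       # charged per elapsed microsecond
    for t_us in range(1, n_ticks * period_us + 1):
        bucket = "guest" if in_guest(t_us, period_us, exit_lead_us, exit_cost_us) else "system"
        exact[bucket] += 1
        if t_us % period_us == 0:
            # Tick fires: the whole period goes to whatever context is current.
            tick_based[bucket] += period_us
    return tick_based, exact


if __name__ == "__main__":
    tick_based, exact = simulate()
    print("tick-based:", tick_based)
    print("exact:     ", exact)
```

With these made-up numbers the guest runs for 99% of the time, yet every tick
lands inside the host window, so the sampled accounting reports 0% guest. The
per-microsecond bookkeeping here only stands in for timestamp-based accounting
in general; it is not a model of how CONFIG_VIRT_CPU_ACCOUNTING_GEN is
implemented.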