Date: Tue, 16 Jan 2024 16:37:19 +0100
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: Temporary KVM guest hangs connected to KSM and NUMA balancer
From: Friedrich Weber
To: Sean Christopherson
Cc: kvm@vger.kernel.org, Paolo Bonzini, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org
References: <832697b9-3652-422d-a019-8c0574a188ac@proxmox.com>

Hi Sean,

On 11/01/2024 17:00, Sean Christopherson wrote:
> This is a known issue. It's mostly a KVM bug[...] (fix posted[...]), but I suspect
> that a bug in the dynamic preemption model logic[...] is also contributing to the
> behavior by causing KVM to yield on preempt models where it really shouldn't.

I tried the following variants now, each applied on top of 6.7 (0dd3ee31):

* [1], the initial patch series mentioned in the bugreport ("[PATCH 0/2] KVM: Pre-check mmu_notifier retry on x86")
* [2], its v2 that you linked above ("[PATCH v2] KVM: x86/mmu: Retry fault before acquiring mmu_lock if mapping is changing")
* [3], the scheduler patch you linked above ("[PATCH] sched/core: Drop spinlocks on contention iff kernel is preemptible")
* both [2] & [3]

My kernel is PREEMPT_DYNAMIC and, according to /sys/kernel/debug/sched/preempt, defaults to preempt=voluntary. For case [3], I additionally tried manually switching to preempt=full.

Provided I did not mess up, I get the following results for the reproducer I posted:

* [1] (the initial patch series): no hangs
* [2] (its v2): hangs
* [3] (the scheduler patch) with preempt=voluntary: no hangs
* [3] (the scheduler patch) with preempt=full: hangs
* [2] & [3]: no hangs

So it seems like:

* [1] (the initial patch series) fixes the hangs, which is consistent with the feedback in the bugreport [4].
* But weirdly, its v2 [2] does not fix the hangs.
* As long as I stay with preempt=voluntary, [3] (the scheduler patch) alone is already enough to fix the hangs in my case -- this I did not expect :)

Does this make sense to you? Happy to double-check or run more tests if anything seems off.

Best wishes,

Friedrich

[1] https://lore.kernel.org/all/20230825020733.2849862-1-seanjc@google.com/
[2] https://lore.kernel.org/all/20240110012045.505046-1-seanjc@google.com/
[3] https://lore.kernel.org/all/20240110214723.695930-1-seanjc@google.com/
[4] https://bugzilla.kernel.org/show_bug.cgi?id=218259#c6