From: Shaoqin Huang
To: Paolo Bonzini, Sean Christopherson
Cc: Peter Xu, Shaoqin Huang, Shuah Khan, kvm@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3] KVM: selftests: Fix the dirty_log_test semaphore imbalance
Date: Fri, 2 Feb 2024 01:43:32 -0500
Message-Id: <20240202064332.9403-1-shahuang@redhat.com>

When running dirty_log_test on some aarch64 machines, it sometimes
triggers the following assertion:

==== Test Assertion Failure ====
  dirty_log_test.c:384: dirty_ring_vcpu_ring_full
  pid=14854 tid=14854 errno=22 - Invalid argument
     1	0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
     2	0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
     3	 (inlined by) run_test at dirty_log_test.c:802
     4	0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
     5	0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
     6	0x0000ffff9be173c7: ?? ??:0
     7	0x0000ffff9be1749f: ?? ??:0
     8	0x000000000040206f: _start at ??:?
  Didn't continue vcpu even without ring full

dirty_log_test fails while running the dirty-ring test because
sem_vcpu_cont and sem_vcpu_stop are non-zero when
dirty_ring_collect_dirty_pages() executes. When those two semaphores
are non-zero, the dirty_ring_wait_vcpu() call at the beginning of
dirty_ring_collect_dirty_pages() does not wait for the vcpu to stop,
but continues straight on to the following code. In that case, suppose
dirty_ring_vcpu_ring_full is true before the vcpu stops, and
dirty_ring_collect_dirty_pages() has already passed the check of
dirty_ring_vcpu_ring_full but has not yet executed the check of
continued_vcpu. If the vcpu then stops and sets
dirty_ring_vcpu_ring_full to false, dirty_ring_collect_dirty_pages()
triggers the assertion.

Why can sem_vcpu_cont and sem_vcpu_stop be non-zero? Because
dirty_ring_before_vcpu_join() calls sem_post(&sem_vcpu_cont) at the end
of each dirty-ring test. This can lead to two cases:

1. sem_vcpu_cont is non-zero. When host_quit is set to true, the
   vcpu_worker sees it immediately and quits. log_mode_before_vcpu_join()
   then raises sem_vcpu_cont to 1, and since the vcpu_worker has already
   quit, nothing consumes it.

2. sem_vcpu_stop is non-zero. When host_quit is set to true, the
   vcpu_worker has already entered the guest. The next time it exits
   from the guest, it raises sem_vcpu_stop to 1 and then sees host_quit,
   so nothing consumes sem_vcpu_stop.

As more and more dirty-ring tests run, sem_vcpu_cont and sem_vcpu_stop
grow larger and larger, so many code paths no longer wait on the
semaphores. This eventually causes the failure.

To fix this, wait a while before setting host_quit to true, which gives
the vcpu time to enter the guest, so that it will exit again. Then wait
for the vcpu to exit and let it continue once more, at which point the
vcpu sees host_quit.
Thus sem_vcpu_cont and sem_vcpu_stop are both zero when the test
finishes.

Signed-off-by: Shaoqin Huang
---
v2->v3:
  - Rebase to v6.8-rc2.
  - Use TEST_ASSERT().

v1->v2:
  - Fix the real logic bug, not just refresh the context.

v1: https://lore.kernel.org/all/20231116093536.22256-1-shahuang@redhat.com/
v2: https://lore.kernel.org/all/20231117052210.26396-1-shahuang@redhat.com/

 tools/testing/selftests/kvm/dirty_log_test.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c
index 6cbecf499767..dd2d8be390a5 100644
--- a/tools/testing/selftests/kvm/dirty_log_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_test.c
@@ -417,7 +417,8 @@ static void dirty_ring_after_vcpu_run(struct kvm_vcpu *vcpu, int ret, int err)
 
 static void dirty_ring_before_vcpu_join(void)
 {
-	/* Kick another round of vcpu just to make sure it will quit */
+	/* Wait vcpu exit, and let it continue to see the host_quit. */
+	dirty_ring_wait_vcpu();
 	sem_post(&sem_vcpu_cont);
 }
 
@@ -719,6 +720,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct kvm_vm *vm;
 	unsigned long *bmap;
 	uint32_t ring_buf_idx = 0;
+	int sem_val;
 
 	if (!log_mode_supported()) {
 		print_skip("Log mode '%s' not supported",
@@ -726,6 +728,11 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		return;
 	}
 
+	sem_getvalue(&sem_vcpu_stop, &sem_val);
+	assert(sem_val == 0);
+	sem_getvalue(&sem_vcpu_cont, &sem_val);
+	assert(sem_val == 0);
+
 	/*
 	 * We reserve page table for 2 times of extra dirty mem which
 	 * will definitely cover the original (1G+) test range.  Here
@@ -825,6 +832,13 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		sync_global_to_guest(vm, iteration);
 	}
 
+	/*
+	 *
+	 * Before we set the host_quit, let the vcpu has time to run, to make
+	 * sure we consume the sem_vcpu_stop and the vcpu consume the
+	 * sem_vcpu_cont, to keep the semaphore balance.
+	 */
+	usleep(p->interval * 1000);
 	/* Tell the vcpu thread to quit */
 	host_quit = true;
 	log_mode_before_vcpu_join();

base-commit: 41bccc98fb7931d63d03f326a746ac4d429c1dd3
-- 
2.40.1
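
For reference, the imbalance described in the commit message can be
reproduced outside KVM with plain POSIX semaphores. The sketch below is
illustrative only and is not part of the patch: sem_count(),
run_round() and leftover_posts() are hypothetical helpers, and a single
thread stands in for both the host side and the vcpu_worker side. It
shows that a "kick" posted after the worker has already quit is never
consumed, so the count carries over into the next round and a later
sem_wait() would return without blocking.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical helper: return the current count of a semaphore. */
static int sem_count(sem_t *sem)
{
	int val = 0;
	int ret = sem_getvalue(sem, &val);

	assert(ret == 0);
	return val;
}

/*
 * Model the end of one test round. The host always posts a final kick.
 * If the worker has already quit (worker_consumes_kick == false), the
 * post is never matched by a wait and the count leaks.
 */
static void run_round(sem_t *cont, bool worker_consumes_kick)
{
	sem_post(cont);			/* host: kick the worker */
	if (worker_consumes_kick)
		sem_wait(cont);		/* worker: consume the kick */
}

/* Return the semaphore count left over after the given number of rounds. */
static int leftover_posts(int rounds, bool worker_consumes_kick)
{
	sem_t cont;
	int i, leftover;

	sem_init(&cont, 0, 0);
	for (i = 0; i < rounds; i++)
		run_round(&cont, worker_consumes_kick);
	leftover = sem_count(&cont);
	sem_destroy(&cont);

	return leftover;
}
```

With these assumptions, leftover_posts(3, false) models three rounds in
which the worker quit before the kick, and leftover_posts(3, true)
models the balanced case that the sem_getvalue() checks added to
run_test() expect.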