Date: Thu, 16 Nov 2023 16:18:36 -0800
From: Oliver Upton
To: Shaoqin Huang
Cc: kvm@vger.kernel.org, kvmarm@lists.linux.dev, Paolo Bonzini,
 Shuah Khan, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] KVM: selftests: Initalize sem_vcpu_[cont|stop] before each test in dirty_log_test
In-Reply-To: <20231116093536.22256-1-shahuang@redhat.com>

Hi Shaoqin,

On Thu, Nov 16, 2023 at 04:35:36AM -0500, Shaoqin Huang wrote:
> When executing the dirty_log_test on some aarch64 machines, it
> sometimes triggers the ASSERT:
>
> ==== Test Assertion Failure ====
>   dirty_log_test.c:384: dirty_ring_vcpu_ring_full
>   pid=14854 tid=14854 errno=22 - Invalid argument
>      1  0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
>      2  0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
>      3   (inlined by) run_test at dirty_log_test.c:802
>      4  0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
>      5  0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
>      6  0x0000ffff9be173c7: ?? ??:0
>      7  0x0000ffff9be1749f: ?? ??:0
>      8  0x000000000040206f: _start at ??:?
>   Didn't continue vcpu even without ring full
>
> The dirty_log_test fails when executing the dirty-ring test because
> sem_vcpu_cont and sem_vcpu_stop are non-zero when
> dirty_ring_collect_dirty_pages() runs. When those two sem_t variables
> are non-zero, the dirty_ring_wait_vcpu() at the beginning of
> dirty_ring_collect_dirty_pages() does not wait for the vcpu to stop,
> but continues to execute the following code. In this case, before the
> vcpu stops, if dirty_ring_vcpu_ring_full is true and
> dirty_ring_collect_dirty_pages() has passed the check for
> dirty_ring_vcpu_ring_full but hasn't yet executed the check for
> continued_vcpu, the vcpu stops and sets dirty_ring_vcpu_ring_full to
> false. Then dirty_ring_collect_dirty_pages() triggers the ASSERT.
>
> Why can sem_vcpu_cont and sem_vcpu_stop be non-zero? It's because
> dirty_ring_before_vcpu_join() executes sem_post(&sem_vcpu_cont) at
> the end of each dirty-ring test. This can cause two cases:
>
> 1. sem_vcpu_cont is non-zero. When we set host_quit to true, the
>    vcpu_worker directly sees that host_quit is true and quits. So the
>    log_mode_before_vcpu_join() function sets sem_vcpu_cont to 1, and
>    since the vcpu_worker has quit, it won't consume it.
> 2. sem_vcpu_stop is non-zero. When we set host_quit to true, the
>    vcpu_worker has entered the guest state; the next time it exits
>    from guest state, it sets sem_vcpu_stop to 1 and then sees
>    host_quit, so no one consumes sem_vcpu_stop.
>
> When executing more and more dirty-ring tests, sem_vcpu_cont and
> sem_vcpu_stop can grow larger and larger, which makes many code paths
> not wait for the sem_t, and this finally causes the problem.
>
> Fixing this problem is easy: simply initialize the sem_t before every
> test. Thus whatever state the previous test left, it won't interfere
> with the next test.

In your changelog you describe what sounds like a semaphore imbalance at
the time of test completion, yet your proposed fix is to just clobber
the error and start fresh.

Why not nip it in the bud and fix the logic bug instead?

-- 
Thanks,
Oliver
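
The imbalance discussed above can be illustrated with plain POSIX semaphores. The following is a minimal sketch, not the selftest's code: setup_sems(), reset_sems(), sem_count(), sems_balanced() and finish_iteration() are hypothetical helpers, and the two cases are reduced to the net effect the quoted changelog describes (one unconsumed post per test iteration). Only the names sem_vcpu_cont and sem_vcpu_stop are taken from the thread.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdbool.h>

/* Stand-ins for the two semaphores named in the thread. */
static sem_t sem_vcpu_cont;
static sem_t sem_vcpu_stop;

/* Hypothetical helper: initialize both semaphores to zero once. */
void setup_sems(void)
{
	sem_init(&sem_vcpu_cont, 0, 0);
	sem_init(&sem_vcpu_stop, 0, 0);
}

/* Hypothetical helper: return the current count of a semaphore. */
int sem_count(sem_t *sem)
{
	int val = 0;
	int ret = sem_getvalue(sem, &val);

	assert(ret == 0);
	return val;
}

/* True when neither semaphore carries a leftover post. */
bool sems_balanced(void)
{
	return sem_count(&sem_vcpu_cont) == 0 &&
	       sem_count(&sem_vcpu_stop) == 0;
}

/*
 * Simplified model of the end of one dirty-ring test iteration,
 * following the two cases in the changelog:
 *   case 1: the worker has already quit, so the host's "continue"
 *           post is never consumed;
 *   case 2: the worker posts "stop" on its last guest exit, and
 *           nobody waits for it.
 */
void finish_iteration(bool vcpu_already_quit)
{
	if (vcpu_already_quit)
		sem_post(&sem_vcpu_cont);
	else
		sem_post(&sem_vcpu_stop);
}

/*
 * The approach taken by the patch, as a sketch: discard whatever the
 * previous iteration left behind and start from zero. The semaphores
 * are destroyed first because re-initializing a live semaphore is
 * undefined behavior under POSIX.
 */
void reset_sems(void)
{
	sem_destroy(&sem_vcpu_cont);
	sem_destroy(&sem_vcpu_stop);
	setup_sems();
}
```

In this model, calling setup_sems(), then finish_iteration(true) twice and finish_iteration(false) once, leaves sem_vcpu_cont at 2 and sem_vcpu_stop at 1, so a later sem_wait() on either would return immediately instead of blocking. reset_sems() brings both back to zero, which hides the leftover posts, whereas a check such as sems_balanced() at the end of each iteration would flag the iteration that produced the stray post.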