Message-ID: <7c8c6f7c-7476-d73d-4df1-9dea0aa4ecf2@igalia.com>
Date: Sun, 31 Mar 2024 14:07:28 -0300
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH] x86/split_lock: fix delayed detection enabling
To: Maksim Davydov
Cc: den-plotnikov@yandex-team.ru, x86@kernel.org, linux-kernel@vger.kernel.org, dave.hansen@linux.intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de
References: <20240321195522.24830-1-davydov-max@yandex-team.ru>
From: "Guilherme G. Piccoli"
In-Reply-To: <20240321195522.24830-1-davydov-max@yandex-team.ru>

On 21/03/2024 16:55, Maksim Davydov wrote:
> If the warn mode with disabled mitigation mode is used, then on each cpu
> where the split lock occurred detection will be disabled in order to make
> progress and delayed work will be scheduled, which then will enable
> detection back. Now it turns out that all CPUs use one global delayed
> work structure. This leads to the fact that if a split lock occurs on
> several CPUs at the same time (within 2 jiffies), only one cpu will
> schedule delayed work, but the rest will not. The return value of
> schedule_delayed_work_on() would have shown this, but it is not checked
> in the code
> In order to fix the warn mode with disabled mitigation mode, delayed work
> has to be a per-cpu.
>
> Fixes: 727209376f49 ("x86/split_lock: Add sysctl to control the misery mode")

Thanks Maksim! I confess I (think I) understand the theory behind the
possible problem, but I'm not seeing how it happens - probably just me
being silly, but can you help me to understand it clearly?

Let's say we have 2 CPUs, CPU0 and CPU1, and we're running with
sld_mitigate = 0, meaning we don't have "the misery".

If the code running on CPU0 reaches split_lock_warn(), my understanding
is that it warns the user, schedules the sld reenable [via
schedule_delayed_work_on()] and disables the feature with
sld_update_msr(false), correct? So, does this disabling happen only at
core level, or does it disable for the whole CPU including all cores?

But back to our example, if CPU1 detects the split lock, it'll run the
same procedure as CPU0 did - so are you saying we have a race there if
CPU1 faces a split lock before CPU0 has disabled the MSR?

Maybe a clearer example of the issue would even be helpful in the commit
message, showing the path both CPUs would take and how the problem
happens exactly.

Thanks in advance,


Guilherme
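
For reference, the scenario from the quoted commit message can be modeled
in userspace roughly as below. This is only a sketch under stated
assumptions, not kernel code: struct model_work, model_schedule() and
count_queued() are hypothetical names, and the only behavior modeled is
that schedule_delayed_work_on() returns false when the work item is
already pending. With one shared work item, only the first CPU queues the
re-enable; with one item per CPU, every CPU does.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_CPUS 8

/* Hypothetical stand-in for struct delayed_work: only "pending" is modeled. */
struct model_work {
	bool pending;
};

/*
 * Models the return convention of schedule_delayed_work_on():
 * false if the work was already pending, true if it was queued now.
 */
static bool model_schedule(struct model_work *w)
{
	if (w->pending)
		return false;
	w->pending = true;
	return true;
}

/*
 * Every CPU hits a split lock before any delayed work has run.
 * Returns how many CPUs actually got their re-enable work queued.
 */
int count_queued(int nr_cpus, bool per_cpu)
{
	struct model_work works[MODEL_MAX_CPUS] = { { false } };
	int queued = 0;

	assert(nr_cpus > 0 && nr_cpus <= MODEL_MAX_CPUS);

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		/* Shared case: all CPUs use works[0]; per-cpu case: own slot. */
		struct model_work *w = per_cpu ? &works[cpu] : &works[0];

		if (model_schedule(w))
			queued++;
	}
	return queued;
}
```

In this model, count_queued(2, false) gives 1 (CPU1 finds the shared work
already pending, so nothing is queued to re-enable detection there), while
count_queued(2, true) gives 2.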