From: Sven Schnelle
To: Tejun Heo
Cc: Lai Jiangshan, linux-kernel@vger.kernel.org, Heiko Carstens, Peter Zijlstra
Subject: Re: [PATCH] workqueue: fix selection of wake_cpu in kick_pool()
In-Reply-To: (Tejun Heo's message of "Wed, 17 Apr 2024 15:55:57 -1000")
References: <20240415053550.538722-1-svens@linux.ibm.com>
Date: Fri, 19 Apr 2024 10:27:05 +0200

Hi Tejun,

Tejun Heo writes:

> On Wed, Apr 17, 2024 at 05:36:38PM +0200, Sven Schnelle wrote:
>> > This generally seems like a good idea but isn't this still racy? The CPU may
>> > go down between setting p->wake_cpu and wake_up_process().
>>
>> Don't know without reading the source, but how does this code normally
>> protect against that?
>
> Probably by wrapping determining the wake_cpu and the wake_up inside
> cpu_read_lock() section.

Do you mean rcu_read_lock()? cpus_read_lock() takes a mutex, and the
crash happens in softirq context - so cpus_read_lock() can't be the
correct lock. If I read the code correctly, cpu hotplug uses
stop_machine_cpuslocked() - so rcu_read_lock() should be sufficient
for non-atomic context.

Looking at the backtrace, the crash is actually happening in
arch_vcpu_is_preempted(). I don't know the semantics of that function:
whether it is ok to call it for offline CPUs, or whether the calling
code should make sure that the cpu is online (which would be my guess).

Following the backtrace from my initial mail, I can't find a place
where a check is done whether p->wake_cpu is actually online.
Eventually available_idle_cpu() calls vcpu_is_preempted(). I wonder
whether available_idle_cpu() should do a cpu_online() check right at
the beginning?

Adding Peter to CC, he probably knows.

Thanks,
Sven
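
To make the question about available_idle_cpu() concrete, here is a rough
userspace model of the idea, not a patch and not kernel code. Everything
prefixed with model_ and the three state arrays are hypothetical stand-ins;
the only thing it tries to show is that an early online check keeps the
arch hook from ever being asked about an offline CPU. It says nothing about
the hotplug race itself, which still needs proper locking.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for kernel state (assumption, not real structures). */
static bool cpu_online_state[NR_CPUS] = { true,  true,  false, true };
static bool cpu_idle_state[NR_CPUS]   = { true,  false, true,  true };
static bool vcpu_preempted[NR_CPUS]   = { false, false, false, true };

/* Counts how often the arch hook was queried for an offline CPU. */
static int preempt_queries_on_offline;

/* Models cpu_online(): out-of-range CPUs are treated as offline. */
static bool model_cpu_online(int cpu)
{
	return cpu >= 0 && cpu < NR_CPUS && cpu_online_state[cpu];
}

/* Models the arch hook; records calls made for offline CPUs. */
static bool model_vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (!cpu_online_state[cpu])
		preempt_queries_on_offline++;
	return vcpu_preempted[cpu];
}

/* Models available_idle_cpu() with the proposed online check up front. */
static bool model_available_idle_cpu(int cpu)
{
	if (!model_cpu_online(cpu))
		return false;	/* proposed: bail out before any arch call */
	if (!cpu_idle_state[cpu])
		return false;
	if (model_vcpu_is_preempted(cpu))
		return false;
	return true;
}
```

In this model CPU 2 is offline, so model_available_idle_cpu(2) returns
false without reaching model_vcpu_is_preempted(). Whether the real check
belongs in available_idle_cpu() or in its callers is exactly the open
question above.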