Date: Mon, 11 Jan 2021 14:20:57 -0800
From: "Luck, Tony"
To: Andy Lutomirski
Cc: Borislav Petkov, x86@kernel.org, Andrew Morton, Peter Zijlstra,
 Darren Hart, Andy Lutomirski, linux-kernel@vger.kernel.org,
 linux-edac@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery
Message-ID: <20210111222057.GA2369@agluck-desk2.amr.corp.intel.com>
References: <20210111214452.1826-2-tony.luck@intel.com>

On Mon, Jan 11, 2021 at 02:11:56PM -0800, Andy Lutomirski wrote:
>
> > On Jan 11, 2021, at 1:45 PM, Tony Luck wrote:
> >
> > Recovery action when get_user() triggers a machine check uses the fixup
> > path to make get_user() return -EFAULT. Also queue_task_work() sets up
> > so that kill_me_maybe() will be called on return to user mode to send a
> > SIGBUS to the current process.
> >
> > But there are places in the kernel where the code assumes that this
> > EFAULT return was simply because of a page fault. The code takes some
> > action to fix that, and then retries the access. This results in a second
> > machine check.
> >
> > While processing this second machine check queue_task_work() is called
> > again. But since this uses the same callback_head structure that
> > was used in the first call, the net result is an entry on the
> > current->task_works list that points to itself.
>
> Is this happening in pagefault_disable context or normal sleepable fault
> context? If the latter, maybe we should reconsider finding a way for the
> machine check code to do its work inline instead of deferring it.

The first machine check is in pagefault_disable() context.

static int get_futex_value_locked(u32 *dest, u32 __user *from)
{
	int ret;

	pagefault_disable();
	ret = __get_user(*dest, from);
	pagefault_enable();

	return (ret == -ENXIO) ? ret : ret ? -EFAULT : 0;
}

-Tony
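
For illustration, the self-referencing list entry described in the quoted patch text can be reproduced with a small user-space model. This is only a sketch under stated assumptions: `toy_task_work_add()`, `points_to_itself()` and `double_queue_creates_self_loop()` are hypothetical names, the list head is a plain global instead of `current->task_works`, and the push is non-atomic, so it models only the pointer arithmetic and not the kernel's real `task_work_add()`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct callback_head. */
struct callback_head {
	struct callback_head *next;
	void (*func)(struct callback_head *);
};

/* Hypothetical stand-in for current->task_works: head of a LIFO list. */
static struct callback_head *task_works;

/* Push an entry onto the head of the list (non-atomic toy model). */
static void toy_task_work_add(struct callback_head *work)
{
	assert(work != NULL);
	work->next = task_works;
	task_works = work;
}

/* Return 1 if the entry's next pointer refers back to the entry itself. */
static int points_to_itself(const struct callback_head *work)
{
	return work->next == work;
}

/*
 * Queue the same structure twice, as happens when a second machine
 * check reuses the callback_head from the first one.
 * Returns 1 if the entry ends up pointing to itself.
 */
static int double_queue_creates_self_loop(void)
{
	struct callback_head kill_me = { NULL, NULL };

	task_works = NULL;
	toy_task_work_add(&kill_me);	/* first machine check: next == NULL */
	toy_task_work_add(&kill_me);	/* second one: next == &kill_me */

	int looped = points_to_itself(&kill_me);

	task_works = NULL;		/* do not leave a dangling stack pointer */
	return looped;
}
```

After the second push, the head of the list is the entry and the entry's `next` is also the entry, so any walker that follows `next` until it reaches NULL never terminates. That is the loop the patch subject refers to.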