References:
 <20210223204436.1df73153@alex-virtual-machine>
 <788DFBA0-903F-4548-9C2F-B1A1543EE770@amacapital.net>
 <20210223164259.GA166727@agluck-desk2.amr.corp.intel.com>
 <20210225124711.35b31965@alex-virtual-machine>
In-Reply-To: <20210225124711.35b31965@alex-virtual-machine>
From: Andy Lutomirski
Date: Fri, 26 Feb 2021 19:40:35 -0800
Subject: Re: [PATCH v3] x86/fault: Send a SIGBUS to user process always for hwpoison page access.
To: Aili Yao
Cc: "Luck, Tony", HORIGUCHI NAOYA( 堀口 直也), Dave Hansen,
 Andrew Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, "H. Peter Anvin", X86 ML, yangfeng1@kingsoft.com,
 Linux-MM, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 24, 2021 at 8:47 PM Aili Yao wrote:
>
> On Tue, 23 Feb 2021 08:42:59 -0800
> "Luck, Tony" wrote:
>
> > On Tue, Feb 23, 2021 at 07:33:46AM -0800, Andy Lutomirski wrote:
> > >
> > > > On Feb 23, 2021, at 4:44 AM, Aili Yao wrote:
> > > >
> > > > On Fri, 5 Feb 2021 17:01:35 +0800
> > > > Aili Yao wrote:
> > > >
> > > >> When one page is already hwpoisoned by MCE AO action, processes may not
> > > >> be killed, processes mapping this page may make a syscall include this
> > > >> page and result to trigger a VM_FAULT_HWPOISON fault, as it's in kernel
> > > >> mode it may be fixed by fixup_exception, current code will just return
> > > >> error code to user code.
> > > >>
> > > >> This is not sufficient, we should send a SIGBUS to the process and log
> > > >> the info to console, as we can't trust the process will handle the error
> > > >> correctly.
> > > >>
> > > >> Suggested-by: Feng Yang
> > > >> Signed-off-by: Aili Yao
> > > >> ---
> > > >> arch/x86/mm/fault.c | 62 +++++++++++++++++++++++++++++----------------
> > > >> 1 file changed, 40 insertions(+), 22 deletions(-)
> > > >>
> > > > Hi luto;
> > > > Is there any feedback?
> > >
> > > At the very least, this needs a clear explanation of why your proposed behavior is better than the existing behavior.
> >
> > The explanation is buried in that "can't trust the process" line.
> >
> > E.g. user space isn't good about checking for failed write(2) syscalls.
> > So if the poison was in a user buffer passed to write(fd, buffer, count)
> > sending a SIGBUS would be the action if they read the poison directly,
> > so it seems reasonable to send the same signal if the kernel read their
> > poison for them.
> >
> > It would avoid users that didn't check the return value merrily proceeding
> > as if everything was ok.
>
> Hi luto:
> I will add more infomation:
> Even if the process will check return value of syscall like write, I don't think
> process will take proper action for this.
> In test example, the return value will be errno is 14 (Bad Address), the process may not realize
> this is a hw issue, and may take wrong action not as expected.
> And totally, A hw error will rarely happen, and the hw error hitting this branch will be
> more unlikely, the impaction without this patch is quite minor, but this is still not good enough, we should
> make it better, right?

There are a few issues I can imagine:

Some programs may use read(2), write(2), etc as ways to check if memory
is valid without getting a signal.  They might not want signals, which
means that this feature might need to be configurable.

It's worth making sure that this doesn't end up sending duplicate
signals.  If nothing else, this would impact the vsyscall emulation
code.
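
To make the probing idiom above concrete, here is a minimal sketch of a
program using write(2) to test whether an address is readable without
taking a signal.  The helper name probe_readable() is hypothetical and
only for illustration; it assumes the kernel reports an unreadable
buffer as EFAULT (errno 14, the "Bad Address" mentioned in the quoted
text) rather than by sending a signal.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch under discussion.
 * Asks the kernel to read one byte at addr by writing it into a pipe.
 * Returns 1 if the byte was readable, 0 if write(2) failed with
 * EFAULT, and -1 on any other error.
 */
int probe_readable(const void *addr)
{
    int fds[2];
    ssize_t n;
    int saved_errno;

    if (pipe(fds) != 0)
        return -1;

    /* The kernel copies from addr on our behalf; a bad address
     * surfaces as a failed syscall, not as a fault in user code. */
    n = write(fds[1], addr, 1);
    saved_errno = errno;

    close(fds[0]);
    close(fds[1]);

    if (n == 1)
        return 1;
    if (n < 0 && saved_errno == EFAULT)
        return 0;
    return -1;
}
```

A caller written this way expects a return value and nothing else, so a
SIGBUS raised from inside the write(2) would be a behavior change for
it.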

Programs that get a signal might expect that the RIP that the signal
frame points to is the instruction that caused the signal and that the
instruction faulted without side effects.  For SIGSEGV, I would be
especially nervous about this.  Maybe SIGBUS is safer.  For SIGSEGV,
it's entirely valid to look at CR2 / si_addr, fix it up, and return.
This would be completely *invalid* with your patch.  I'm not sure what
to do about this.

--Andy