From: Linus Torvalds
Date: Wed, 13 Sep 2023 12:43:43 -0700
Subject: Re: [PATCH v4 07/13] iov_iter: Make copy_from_iter() always handle MCE
To: David Howells
Cc: Al Viro, Jens Axboe, Christoph Hellwig, Christian Brauner,
    David Laight, Matthew Wilcox, Jeff Layton,
    linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
    linux-mm@kvack.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org
In-Reply-To: <20230913165648.2570623-8-dhowells@redhat.com>
References: <20230913165648.2570623-1-dhowells@redhat.com>
    <20230913165648.2570623-8-dhowells@redhat.com>

On Wed, 13 Sept 2023 at 09:57, David Howells wrote:
>
> Make copy_from_iter() always catch an MCE and return a short copy and make
> the coredump code rely on that.  This requires arch support in the form of
> a memcpy_mc() function that returns the length copied.

What?

This patch seems to miss the point of the machine check copy entirely.

You create that completely random memcpy_mc() function, that has
nothing to do with our existing copy_mc_to_kernel(), and you claim
that the issue is that it should return the length copied.

Which is not the issue at all.

Several x86 chips will HANG due to internal CPU corruption if you use
the string instructions for copying data when a machine check
exception happens (possibly only due to memory poisoning with some
non-volatile RAM thing).

Are these chips buggy? Yes.

Is the Intel machine check architecture nasty and bad? Yes, Christ yes.

Can these machines hang if user space does repeat string instructions
to said memory? Afaik, very much yes again. They are buggy.

I _think_ this only happens with the non-volatile storage stuff (thus
the dax / pmem / etc angle), and I hope we can put it behind us some
day.

But that doesn't mean that you can take our existing
copy_mc_to_kernel() code that tries to work around this and replace it
with something completely different that definitely does *not* work
around it.

See the comment in arch/x86/lib/copy_mc_64.S:

 * copy_mc_fragile - copy memory with indication if an exception / fault happened
 *
 * The 'fragile' version is opted into by platform quirks and takes
 * pains to avoid unrecoverable corner cases like 'fast-string'
 * instruction sequences, and consuming poison across a cacheline
 * boundary. The non-fragile version is equivalent to memcpy()
 * regardless of CPU machine-check-recovery capability.

and yes, it's disgusting, and no, I've never seen a machine that does
this, since it's all "enterprise hardware", and I don't want to touch
that shite with a ten-foot pole.

Should I go on another rant about how "enterprise" means "over-priced
garbage, but with a paper trail of how bad it is, so that you can
point fingers at somebody else"?

That's true both when applied to software and to hardware, I'm afraid.

So if we get rid of that horrendous "copy_mc_fragile", then pretty
much THE WHOLE POINT of the stupid MC copy goes away, and we should
just get rid of it all entirely.

Which might be a good idea, but is absolutely *not* something that
should be done randomly as part of some iov_iter rewrite series.

I'll dance on the grave of that *horrible* machine check copy code,
but when I see this as part of iov_iter cleanup, I can only say "No.
Not this way".

> [?] Is it better to kill the thread in the event of an MCE occurring?

Oh, the thread will be dead already. In fact, if I understand the
problem correctly, the whole f$^!ng machine will be dead and need to
be power-cycled.

             Linus
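
The distinction argued above, that the return-value convention is a
side issue and the copy method is the real one, can be illustrated
with a small userspace sketch. Everything here is hypothetical and
is not kernel code: sim_copy_mc_fragile(), sim_bytes_copied() and
SIM_NO_POISON are invented names, and the "poison" is just an offset
passed in by the caller, whereas real hardware reports it through a
machine check exception. The plain byte loop stands in for "no
fast-string instructions", and the return value follows the "bytes
not copied, 0 on success" convention that, as far as I know, x86
copy_mc_to_kernel() uses.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL marker: no byte of the source is poisoned. */
#define SIM_NO_POISON ((size_t)-1)

/*
 * HYPOTHETICAL userspace stand-in for a "fragile" machine-check-safe copy.
 * Copies one byte at a time (no bulk/string copy) and stops at the first
 * simulated poisoned byte, identified by its offset into src.
 * Returns the number of bytes NOT copied; 0 means the whole copy succeeded.
 */
static size_t sim_copy_mc_fragile(void *dst, const void *src, size_t len,
				  size_t poison_off)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t i;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < len; i++) {
		if (i == poison_off)	/* simulated fault on this load */
			return len - i;
		d[i] = s[i];
	}
	return 0;
}

/*
 * Converting to the "length copied" convention from the patch description
 * is a one-line subtraction; it says nothing about how the copy was done.
 */
static size_t sim_bytes_copied(size_t len, size_t not_copied)
{
	return len - not_copied;
}
```

Swapping the loop for a plain memcpy() would keep both return
conventions expressible, but on the affected machines it would lose
the property the fragile variant exists for.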