Date: Fri, 27 Aug 2021 16:22:46 -0700
From: "Luck, Tony"
To: Al Viro
Cc: Linus Torvalds, Andreas Gruenbacher, Christoph Hellwig,
    "Darrick J. Wong", Jan Kara, Matthew Wilcox, cluster-devel,
    linux-fsdevel, Linux Kernel Mailing List, ocfs2-devel@oss.oracle.com
Subject: Re: [PATCH v7 05/19] iov_iter: Introduce fault_in_iov_iter_writeable
Message-ID: <20210827232246.GA1668365@agluck-desk2.amr.corp.intel.com>
References: <20210827164926.1726765-1-agruenba@redhat.com>
    <20210827164926.1726765-6-agruenba@redhat.com>

On Fri, Aug 27, 2021 at 09:57:10PM +0000, Al Viro wrote:
> On Fri, Aug 27, 2021 at 09:48:55PM +0000, Al Viro wrote:
>
> > [btrfs]search_ioctl()
> > Broken with memory poisoning, for either variant of semantics. Same for
> > arm64 sub-page permission differences, I think.
> >
> > So we have 3 callers where we want all-or-nothing semantics - two in
> > arch/x86/kernel/fpu/signal.c and one in btrfs. HWPOISON will be a problem
> > for all 3, AFAICS...
> >
> > IOW, it looks like we have two different things mixed here - one that wants
> > to try and fault stuff in, with callers caring only about having _something_
> > faulted in (most of the users) and one that wants to make sure we *can* do
> > stores or loads on each byte in the affected area.
> >
> > Just accessing a byte in each page really won't suffice for the second kind.
> > Neither will g-u-p use, unless we teach it about HWPOISON and other fun
> > beasts... Looks like we want that thing to be a separate primitive; for
> > btrfs I'd probably replace fault_in_pages_writeable() with clear_user()
> > as a quick fix for now...
> >
> > Comments?
>
> Wait a sec... Wasn't HWPOISON a per-page thing? arm64 definitely does have
> smaller-than-page areas with different permissions, so btrfs search_ioctl()
> has a problem there, but arch/x86/kernel/fpu/signal.c doesn't have to deal
> with that...
>
> Sigh... I really need more coffee...

On Intel, poison is tracked at cache-line granularity. Linux inflates
that to per-page (because it can only take a whole page away).

For faults triggered in ring3 this is pretty much the same thing, because
mm/memory-failure.c unmaps the page ... so while you see a #MC on first
access, you get a #PF when you retry. The x86 fault handler sees a magic
signature in the page table and sends a SIGBUS.

But it's all different if the #MC is triggered from ring0. The machine check
handler can't unmap the page. It just schedules task_work to do the unmap
when next returning to user. But if your kernel code loops and tries again
without a return to user, then you get another #MC.

-Tony
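
The ring3 vs. ring0 difference described above can be sketched as a small
userspace toy model. This is not kernel code: every name below (toy_page,
toy_access(), toy_kernel_retry(), ...) is hypothetical, and the 64-byte line
and 4 KiB page sizes are assumptions made only for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_LINE_SIZE 64u    /* assumed cache-line size for this model */
#define TOY_PAGE_SIZE 4096u  /* assumed page size for this model */

/* Toy state for one page; none of these names exist in the kernel. */
struct toy_page {
	uint32_t poisoned_line;  /* index of the bad cache line in the page */
	bool unmapped;           /* true once the whole page was taken away */
	bool unmap_pending;      /* ring0 case: unmap queued, not yet done */
};

enum toy_fault { TOY_OK, TOY_MC, TOY_PF };

/* Touch one byte of the page and report which fault the model raises. */
static enum toy_fault toy_access(const struct toy_page *p, uint32_t offset)
{
	assert(offset < TOY_PAGE_SIZE);
	if (p->unmapped)
		return TOY_PF;   /* page is gone: any offset now faults */
	if (offset / TOY_LINE_SIZE == p->poisoned_line)
		return TOY_MC;   /* only the poisoned line raises #MC */
	return TOY_OK;
}

/* #MC taken from ring3: the page is unmapped before the access is retried. */
static void toy_handle_mc_user(struct toy_page *p)
{
	p->unmapped = true;
}

/* #MC taken from ring0: the unmap is only queued for later. */
static void toy_handle_mc_kernel(struct toy_page *p)
{
	p->unmap_pending = true;
}

/* The queued work runs only on the way back to user mode. */
static void toy_return_to_user(struct toy_page *p)
{
	if (p->unmap_pending) {
		p->unmapped = true;
		p->unmap_pending = false;
	}
}

/*
 * Kernel code that retries without returning to user: each attempt hits
 * the same poisoned line again. Returns the number of #MCs taken.
 */
static unsigned int toy_kernel_retry(struct toy_page *p, uint32_t offset,
				     unsigned int max_tries)
{
	unsigned int mc_count = 0;

	for (unsigned int i = 0; i < max_tries; i++) {
		if (toy_access(p, offset) != TOY_MC)
			break;
		mc_count++;
		toy_handle_mc_kernel(p);
	}
	return mc_count;
}
```

In this model a ring3 access to the poisoned line sees TOY_MC once and TOY_PF
afterwards, while toy_kernel_retry() keeps seeing TOY_MC on every attempt
until toy_return_to_user() has run.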