Date: Tue, 12 Oct 2021 18:27:06 +0100
From: Catalin Marinas
To: Linus Torvalds
Cc: Al Viro, Andreas Gruenbacher, Christoph Hellwig, "Darrick J. Wong",
    Jan Kara, Matthew Wilcox, cluster-devel, linux-fsdevel,
    Linux Kernel Mailing List, "ocfs2-devel@oss.oracle.com",
    Josef Bacik, Will Deacon
Subject: Re: [RFC][arm64] possible infinite loop in btrfs search_ioctl()
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 11, 2021 at 04:59:28PM -0700, Linus Torvalds wrote:
> On Mon, Oct 11, 2021 at 2:08 PM Catalin Marinas wrote:
> > +#ifdef CONFIG_ARM64_MTE
> > +#define FAULT_GRANULE_SIZE (16)
> > +#define FAULT_GRANULE_MASK (~(FAULT_GRANULE_SIZE-1))
>
> [...]
>
> > If this looks in the right direction, I'll do some proper patches
> > tomorrow.
>
> Looks fine to me. It's going to be quite expensive and bad for caches,
> though.
>
> That said, fault_in_writable() is _supposed_ to all be for the slow
> path when things go south and the normal path didn't work out, so I
> think it's fine.
>
> I do wonder how the sub-page granularity works. Is it sufficient to
> just read from it?

For arm64 MTE and I think SPARC ADI, just reading should be sufficient.
There is CHERI in the long run, if it takes off, where the user can set
independent read/write permissions and uaccess would use the capability
rather than a match-all pointer (hence checked).

> Because then a _slightly_ better option might be to
> do one write per page (to catch page table writability) and then one
> read per "granule" (to catch pointer coloring or cache poisoning
> issues)?
>
> That said, since this is all preparatory to us wanting to write to it
> eventually anyway, maybe marking it all dirty in the caches is only
> good.

It depends on how much would be written in the actual copy. For
significant memcpy on arm CPUs, write streaming usually kicks in and the
cache dirtying is skipped. This probably matters more for
copy_page_to_iter_iovec() than the btrfs search ioctl.

Apart from fault_in_pages_*(), there's also fault_in_user_writeable()
called from the futex code which uses the GUP mechanism as the write
would be destructive. It looks like it could potentially trigger the
same infinite loop on -EFAULT. For arm64 MTE, we get away with this by
disabling the tag checking around the arch futex code (we did it for an
unrelated issue - we don't have LDXR/STXR that would run with user
permissions in kernel mode like we do with LDTR/STTR).

I wonder whether we should actually just disable tag checking around the
problematic accesses. What these callers seem to have in common is using
pagefault_disable/enable(). We could abuse this to disable tag checking
or maybe in_atomic() when handling the exception to lazily disable such
faults temporarily.
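
To put rough numbers on the "one write per page, one read per granule"
idea quoted above, here is a small user-space sketch that only counts
the probes a range would need. It is not kernel code and touches no
memory. The 16-byte granule comes from the quoted patch fragment; the
4 KiB page size, MODEL_PAGE_SIZE, struct probe_plan and plan_probes()
are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 4 KiB pages for this model; real kernels configure this. */
#define MODEL_PAGE_SIZE    ((uintptr_t)4096)
#define MODEL_PAGE_MASK    (~(MODEL_PAGE_SIZE - 1))

/* Granule size and mask as in the quoted patch (16-byte MTE granule). */
#define FAULT_GRANULE_SIZE ((uintptr_t)16)
#define FAULT_GRANULE_MASK (~(FAULT_GRANULE_SIZE - 1))

/* Hypothetical result type: how many probes of each kind are needed. */
struct probe_plan {
	size_t page_writes;	/* one write probe per page touched */
	size_t granule_reads;	/* one read probe per granule touched */
};

/* Count the probes needed to cover the byte range [addr, addr + len). */
struct probe_plan plan_probes(uintptr_t addr, size_t len)
{
	struct probe_plan plan = { 0, 0 };
	uintptr_t last;

	if (len == 0)
		return plan;

	last = addr + (len - 1);
	assert(last >= addr);	/* the range must not wrap around */

	/* Distance between first and last aligned unit, plus one. */
	plan.page_writes =
		((last & MODEL_PAGE_MASK) - (addr & MODEL_PAGE_MASK)) /
		MODEL_PAGE_SIZE + 1;
	plan.granule_reads =
		((last & FAULT_GRANULE_MASK) - (addr & FAULT_GRANULE_MASK)) /
		FAULT_GRANULE_SIZE + 1;
	return plan;
}
```

Under these assumptions a page-aligned 4096-byte buffer needs 1 page
write and 256 granule reads, where writing every granule would need 256
writes. An unaligned 16-byte range can straddle two granules, and two
pages if it crosses a page boundary.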

A more invasive change would be to return a different error for such
faults, like -EACCES, and treat them differently in the caller.

-- 
Catalin