Received: by 2002:a05:6a10:6744:0:0:0:0 with SMTP id w4csp624112pxu; Wed, 7 Oct 2020 11:24:04 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyYTLbMTXJl2HXgFRRw5RpNfRyNvdzkdvVltGkkFfmTtpFlGWv2S2k16vQGJr0N7AVhRpMh X-Received: by 2002:a17:907:212b:: with SMTP id qo11mr4638292ejb.107.1602095044158; Wed, 07 Oct 2020 11:24:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1602095044; cv=none; d=google.com; s=arc-20160816; b=tTe7eX9l6+DzM6Hb3jevr0/2ZAR8XNhiYy7yE4kxSfNjoKduZHb0aScGZGzFNodUts DWj1svKdM4JL0WcA+G1eL56RXhiw9My2OZ9xHeOif8RGkVnVcGc2CGej6eeNzYhBKeLZ WXoVrXvCsdqCZTpBJjlqKLCoz/hcIBkifEcCkv/VawsJMY4InNRwx/ax0mag+NqIyk47 SXk3kxdlbmIMoDE+sM38KCdwpTDdC9JtSXFhai847tFxX3t1Hv0q9b+dIK94TRpLf9db Er4++ygpij4sjIHZzmGepgGkEl30oayElOi8NCXnLX14VKwYV5Rs53B/RuyRqS5YmJ8K 8CBQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=KjKZykFuT09vi8gvWSxWfIoih5+HtjIRxuS5MgzFTGo=; b=SwAbCYk0MA0su0G6wQY096JBCyj6tp3Sp+ZAaMmW46nCu/EMFASS8y5P8wrMEiAwVG x/no2S0OG9gL/4rx+SoJAmqQ5QuAg45T1OCj18i0ArbafudA9G9z6cLndg1V6O0eHvbo tvxR1hjAmthOjKKxHhaJT0LWLrJrA7xGZ6J+agRp/G8X4Ttdf8bAldr5poIRLnd52E6I xaAme2BkKS8QoxioZjwbOKpPO0LoMw6cKx1gfK/xx+7HfAJ6PrFtqCu1+9DdnAwFHZAJ HhlQ3tQ3IKV5791oewWur/EBYGjNIsgcBYBdnHUEwH1Asa1gahgFGw/Svtg/Vli/4+QA /stQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@ffwll.ch header.s=google header.b=XlsYCfTV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id f9si2756225ejl.591.2020.10.07.11.23.40; Wed, 07 Oct 2020 11:24:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@ffwll.ch header.s=google header.b=XlsYCfTV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728381AbgJGSBy (ORCPT + 99 others); Wed, 7 Oct 2020 14:01:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728342AbgJGSBy (ORCPT ); Wed, 7 Oct 2020 14:01:54 -0400 Received: from mail-oo1-xc44.google.com (mail-oo1-xc44.google.com [IPv6:2607:f8b0:4864:20::c44]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B466C0613D4 for ; Wed, 7 Oct 2020 11:01:54 -0700 (PDT) Received: by mail-oo1-xc44.google.com with SMTP id c25so176896ooe.13 for ; Wed, 07 Oct 2020 11:01:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=KjKZykFuT09vi8gvWSxWfIoih5+HtjIRxuS5MgzFTGo=; b=XlsYCfTVgW+6nmGAiv0O62GLPsC5r7eB+oidAEHNavDRGpWicEbtNZg9to7A5saHMp zaYRD/gT5inYX9JmqOP5ZzySgAIOp0+59TPao6y2kIbw9G7KU0mdvCuhCQrafShecj0J tHZf4mGHu3aegW49dmX0ZJptYBRUrRuO4MkvA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=KjKZykFuT09vi8gvWSxWfIoih5+HtjIRxuS5MgzFTGo=; b=Yz8vX+2HU+EURljxPtsgmqwRyazT1icduNNyTnkOCuJkJcb6GPWpk0U4Ohs+HIaLZ1 i0KMDb/6lnMozCX7jYYcH8VqkxBlrSFV0KTWQrCZSkCMM1481FHchC2tipivp7mCG2jb eJsOhNCYQ9jNxyFyAYqIlvYIdrZJSLAh+V96Nf465B0LkPysV3YilgN6z6aSNo399tN3 s8fHcaqQeb9NrZNFYTt+KEtfQNvbfeSs3/h+B1Cj7pSTwsBFberWbwEhNDOR8kAtn9xB Wv7iGQ/jcUZwippxxrIAo13uq7oQwlz/mtNYZqeWSeiKk9yrfM9/Y9SoqXfutZFFrDAB OdUw== X-Gm-Message-State: AOAM533EHxS6NR3VY9jblPyT0+UA1tBGWGHqNr6lrRfrdV48kS40L7nS lQxy4kn0z0BNVmpGpbagFo8Iw1TZSr7e2HY/b45mpw== X-Received: by 2002:a4a:c011:: with SMTP id v17mr2806667oop.89.1602093713265; Wed, 07 Oct 2020 11:01:53 -0700 (PDT) MIME-Version: 1.0 References: <20201007164426.1812530-1-daniel.vetter@ffwll.ch> <20201007164426.1812530-8-daniel.vetter@ffwll.ch> <20201007172746.GU5177@ziepe.ca> In-Reply-To: <20201007172746.GU5177@ziepe.ca> From: Daniel Vetter Date: Wed, 7 Oct 2020 20:01:42 +0200 Message-ID: Subject: Re: [PATCH 07/13] mm: close race in generic_access_phys To: Jason Gunthorpe Cc: DRI Development , LKML , kvm@vger.kernel.org, Linux MM , Linux ARM , linux-samsung-soc , "open list:DMA BUFFER SHARING FRAMEWORK" , linux-s390@vger.kernel.org, Dan Williams , Kees Cook , Rik van Riel , Benjamin Herrensmidt , Dave Airlie , Hugh Dickins , Andrew Morton , John Hubbard , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , Jan Kara , Daniel Vetter Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 7, 2020 at 7:27 PM Jason Gunthorpe wrote: > > On Wed, Oct 07, 2020 at 06:44:20PM +0200, Daniel Vetter wrote: > > Way back it was a reasonable assumptions that iomem mappings never > > change the pfn range they point at. But this has changed: > > > > - gpu drivers dynamically manage their memory nowadays, invalidating > > ptes with unmap_mapping_range when buffers get moved > > > > - contiguous dma allocations have moved from dedicated carvetouts to > > cma regions. This means if we miss the unmap the pfn might contain > > pagecache or anon memory (well anything allocated with GFP_MOVEABLE) > > > > - even /dev/mem now invalidates mappings when the kernel requests that > > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87 > > ("/dev/mem: Revoke mappings when a driver claims the region") > > > > Accessing pfns obtained from ptes without holding all the locks is > > therefore no longer a good idea. Fix this. > > > > Since ioremap might need to manipulate pagetables too we need to drop > > the pt lock and have a retry loop if we raced. > > > > While at it, also add kerneldoc and improve the comment for the > > vma_ops->access function. It's for accessing, not for moving the > > memory from iomem to system memory, as the old comment seemed to > > suggest. > > > > References: 28b2ee20c7cb ("access_process_vm device memory infrastructu= re") > > Cc: Jason Gunthorpe > > Cc: Dan Williams > > Cc: Kees Cook > > Cc: Rik van Riel > > Cc: Benjamin Herrensmidt > > Cc: Dave Airlie > > Cc: Hugh Dickins > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: J=C3=A9r=C3=B4me Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: linux-mm@kvack.org > > Cc: linux-arm-kernel@lists.infradead.org > > Cc: linux-samsung-soc@vger.kernel.org > > Cc: linux-media@vger.kernel.org > > Signed-off-by: Daniel Vetter > > --- > > include/linux/mm.h | 3 ++- > > mm/memory.c | 44 ++++++++++++++++++++++++++++++++++++++++++-- > > 2 files changed, 44 insertions(+), 3 deletions(-) > > This does seem to solve the race with revoke_devmem(), but it is really u= gly. > > It would be much nicer to wrap a rwsem around this access and the unmap. > > Any place using it has a nice linear translation from vm_off to pfn, > so I don't think there is a such a good reason to use follow_pte in > the first place. > > ie why not the helper be this: > > int generic_access_phys(unsigned long pfn, unsigned long pgprot, > void *buf, size_t len, bool write) > > Then something like dev/mem would compute pfn and obtain the lock: > > dev_access(struct vm_area_struct *vma, unsigned long addr, void *buf, int= len, int write) > { > cpu_addr =3D vma->vm_pgoff*PAGE_SIZE + (addr - vma->vm_start)); > > /* FIXME: Has to be over each page of len */ > if (!devmem_is_allowed_access(PHYS_PFN(cpu_addr/4096))) > return -EPERM; > > down_read(&mem_sem); > generic_access_phys(cpu_addr/4096, pgprot_val(vma->vm_page_prot), > buf, len, write); > up_read(&mem_sem); > } > > The other cases looked simpler because they don't revoke, here the > mmap_sem alone should be enough protection, they would just need to > provide the linear translation to pfn. > > What do you think? I think it'd fix the bug, until someone wires ->access up for drivers/gpu, or the next subsystem. This is also just for ptrace, so we really don't care when we stall the vm badly and other silly things. So I figured the somewhat ugly, but full generic solution is the better one, so that people who want to be able to ptrace read/write their iomem mmaps can just sprinkle this wherever they feel like. But yeah if we go with most minimal fix, i.e. only trying to fix the current users, then your thing should work and is simpler. But it leaves the door open for future problems. -Daniel --=20 Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch