From: Dan Williams
Date: Thu, 8 Oct 2020 01:35:41 -0700
Subject: Re: [PATCH 10/13] PCI: revoke mappings like devmem
To: Daniel Vetter
Cc: Jason Gunthorpe, DRI Development, LKML, KVM list, Linux MM, Linux ARM, linux-samsung-soc, "Linux-media@vger.kernel.org", linux-s390, Daniel Vetter, Kees Cook, Andrew Morton, John Hubbard, Jérôme Glisse, Jan Kara, Bjorn Helgaas, Linux PCI
References: <20201007164426.1812530-1-daniel.vetter@ffwll.ch> <20201007164426.1812530-11-daniel.vetter@ffwll.ch> <20201007232448.GC5177@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 8, 2020 at 1:13 AM Daniel Vetter wrote:
>
> On Thu, Oct 8, 2020 at 9:50 AM Dan Williams wrote:
> >
> > On Wed, Oct 7, 2020 at 4:25 PM Jason Gunthorpe wrote:
> > >
> > > On Wed, Oct 07, 2020 at 12:33:06PM -0700, Dan Williams wrote:
> > > > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter wrote:
> > > > >
> > > > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > > > > the region") /dev/kmem zaps ptes when the kernel requests exclusive
> > > > > access to an iomem region.
> > > > > And with CONFIG_IO_STRICT_DEVMEM, this is the default for all
> > > > > driver uses.
> > > > >
> > > > > Except there's two more ways to access pci bars: sysfs and proc mmap
> > > > > support. Let's plug that hole.
> > > >
> > > > Ooh, yes, let's.
> > > >
> > > > >
> > > > > For revoke_devmem() to work we need to link our vma into the same
> > > > > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > > > > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > > > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > > > > at ->open time, but that's a bit tricky here with all the entry points
> > > > > and arch code. So instead create a fake file and adjust vma->vm_file.
> > > >
> > > > I don't think you want to share the devmem inode for this, this should
> > > > be based off the sysfs inode which I believe there is already only one
> > > > instance per resource. In contrast /dev/mem can have multiple inodes
> > > > because anyone can just mknod a new character device file, the same
> > > > problem does not exist for sysfs.
> > >
> > > The inode does not come from the filesystem; char/mem.c creates a
> > > singular anon inode in devmem_init_inode()
> >
> > That's not quite right. An inode does come from the filesystem; I just
> > arranged for that inode's i_mapping to be set to a common instance.
> >
> > > Seems OK to use this more widely, but it feels a bit weird to live in
> > > char/memory.c.
> >
> > Sure, now that more users have arrived it should move somewhere common.
> >
> > > This is what got me thinking maybe this needs to be a bit bigger
> > > generic infrastructure - eg enter this scheme from fops mmap and
> > > everything else is in mm/user_iomem.c
> >
> > It still requires every file that can map physical memory to have its
> > ->open fop do
> >
> >        inode->i_mapping = devmem_inode->i_mapping;
> >        filp->f_mapping = inode->i_mapping;
> >
> > I don't see how you can centralize that part.
>
> btw, why are you setting inode->i_mapping? The inode is already
> published, changing that looks risky. And I don't think it's needed,
> vma_link() only looks at filp->f_mapping, and in our drm_open() we
> only set that one.

I think you're right that it is unnecessary for devmem, but I don't
think it's dangerous to do it at the very first open, before anything
is using the address space. It's copy-pasted from what all the other
"shared address space" implementers do, for example block devices in
bd_acquire(). However, the rationale for block devices doing it is so
that page cache pages can be associated with the address space in the
absence of an f_mapping. Without filesystem page writeback to
coordinate, I don't see any devmem code paths that would operate on
inode->i_mapping.
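
To make the f_mapping point concrete, here is a rough userspace toy
model, not kernel code. It only mimics the few fields discussed above,
and every toy_* name, the MAX_VMAS limit, and the shared_iomem_mapping
variable are made up for illustration. It assumes, as stated earlier in
the thread, that linking a vma only consults filp->f_mapping, so
pointing each opened file at one shared address space is enough for a
single revoke to reach every mapping, while inode->i_mapping is left
untouched.

```c
/* Userspace toy model only: these structs are simplified stand-ins and
 * do not match the real kernel definitions. */
#include <assert.h>
#include <stddef.h>

#define MAX_VMAS 8 /* arbitrary limit for this sketch */

struct vma {
	int mapped; /* 1 while the toy "ptes" are present */
};

struct address_space {
	struct vma *vmas[MAX_VMAS]; /* vmas linked into this mapping */
	size_t nr_vmas;
};

struct inode {
	struct address_space i_data;     /* the inode's own mapping */
	struct address_space *i_mapping; /* normally points at i_data */
};

struct file {
	struct inode *f_inode;
	struct address_space *f_mapping;
};

/* Hypothetical shared mapping, playing the role that
 * devmem_inode->i_mapping plays in the thread. */
struct address_space shared_iomem_mapping;

void toy_inode_init(struct inode *inode)
{
	inode->i_data.nr_vmas = 0;
	inode->i_mapping = &inode->i_data;
}

/* Toy ->open: redirect only the file's mapping, leave the inode alone. */
void toy_open(struct file *filp, struct inode *inode)
{
	filp->f_inode = inode;
	filp->f_mapping = &shared_iomem_mapping;
}

/* Toy mmap: link the vma into whatever filp->f_mapping points at.
 * Returns 0 on success, -1 if the toy table is full. */
int toy_mmap(struct file *filp, struct vma *vma)
{
	struct address_space *mapping = filp->f_mapping;

	assert(mapping != NULL);
	if (mapping->nr_vmas == MAX_VMAS)
		return -1;
	vma->mapped = 1;
	mapping->vmas[mapping->nr_vmas++] = vma;
	return 0;
}

/* Toy revoke: zap every vma linked into the mapping and return how
 * many were still mapped. The vmas stay linked, only "ptes" go away. */
size_t toy_revoke(struct address_space *mapping)
{
	size_t i, zapped = 0;

	for (i = 0; i < mapping->nr_vmas; i++) {
		if (mapping->vmas[i]->mapped) {
			mapping->vmas[i]->mapped = 0;
			zapped++;
		}
	}
	return zapped;
}
```

In this model, two files opened through different inodes end up with
the same f_mapping, so one toy_revoke(&shared_iomem_mapping) call
clears both vmas even though neither inode's i_mapping was changed.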