Date: Wed, 29 Apr 2015 22:36:22 +0300
From: "Kirill A. Shutemov"
To: Mark Williamson
Cc: Mark Seaborn, kernel list, "Kirill A. Shutemov", Pavel Emelyanov, Konstantin Khlebnikov, Andrew Morton, Linus Torvalds, Andy Lutomirski, Linux API, Finn Grimwood, Daniel James
Subject: Re: Regression: Requiring CAP_SYS_ADMIN for /proc//pagemap causes application-level breakage
Message-ID: <20150429193622.GA11892@node.dhcp.inet.fi>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 29, 2015 at 07:44:57PM +0100, Mark Williamson wrote:
> Hi all,
>
> We've been investigating further and found a snag with the PFN-hiding
> approach discussed last week - looks like it won't be enough on all
> the architectures we support.  Our product runs on x86_32, x86_64 and
> ARM.  For now, it looks like soft-dirty is only available on x86_64.
> A patch that simply zeros out the physical addresses in
> /proc/PID/pagemap will therefore help us on x86_64 but we'll still
> have problems on other platforms[1].
>
> For context, we were previously using pagemap as a cross-platform way
> to get soft-dirty-like functionality.  Specifically, to ask "did a
> process write to any pages since fork()" by comparing addresses and
> deducing where CoW must have occurred.  In the absence of soft-dirty
> and the physical addresses, it looks like we can't figure that out
> with the remaining information in pagemap.
>
> If the pagemap file included the "writeable" bit from the PTE, we
> think we'd have all the information required to deduce what we need
> (although I realise that's a bit of a nasty workaround).  If I
> proposed including the PTE protection bits in pagemap, would that be
> controversial?  I'm guessing yes but thought it was worth a shot ;-)
> Would anybody be able to suggest a more tasteful approach?

Emm.. I have a hard time understanding how the writable bit is enough to
get soft-dirty-like functionality.

Let's say we have an anon mapping with COW set up after fork(). It is
the non-writable PTEs that trigger COW on wp faults. But you can easily
get to the same non-writable PTE after breaking COW: fork() again, or
mprotect(PROT_READ) and then mprotect(PROT_READ|PROT_WRITE) back.

?

>
> Thanks,
> Mark
>
> [1] I'd note that using soft-dirty is clearly the right approach for
> us on x64, where available and that ideally we'd use it on other
> architectures - cross-arch support for soft-dirty is a slightly
> different discussion, which I hope to post another thread for.

-- 
 Kirill A. Shutemov
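
The objection in the reply above can be illustrated with a toy model. This is only a sketch of the argument as stated in the message, not kernel code: ToyPage and every function name below are hypothetical, and the PTE state after the mprotect() round trip is taken from the message rather than checked against any particular kernel version. It shows three histories that all end with a non-writable PTE, although only the first one left the page unwritten since the baseline fork().

```python
from dataclasses import dataclass


@dataclass
class ToyPage:
    """Hypothetical stand-in for one anonymous page; not a kernel structure."""
    pte_writable: bool = True          # the bit the proposal would expose
    written_since_fork: bool = False   # ground truth the tool wants to recover


def baseline_fork(page: ToyPage) -> None:
    # COW setup: the PTE is write-protected and tracking starts here.
    page.pte_writable = False
    page.written_since_fork = False


def write(page: ToyPage) -> None:
    # A write to a write-protected PTE faults and breaks COW;
    # afterwards the PTE is writable.
    page.pte_writable = True
    page.written_since_fork = True


def fork_again(page: ToyPage) -> None:
    # A later fork() write-protects the PTE again.
    page.pte_writable = False


def mprotect_round_trip(page: ToyPage) -> None:
    # Assumption taken from the message: after mprotect(PROT_READ) and
    # mprotect(PROT_READ|PROT_WRITE) back, the PTE is non-writable again.
    page.pte_writable = False


def run(*steps) -> ToyPage:
    """Apply the baseline fork, then each step in order, to a fresh page."""
    page = ToyPage()
    baseline_fork(page)
    for step in steps:
        step(page)
    return page


untouched = run()                            # never written after fork()
refork = run(write, fork_again)              # written, then fork() again
reprotect = run(write, mprotect_round_trip)  # written, then mprotect() twice

for name, page in [("untouched", untouched),
                   ("write+fork", refork),
                   ("write+mprotect", reprotect)]:
    print(f"{name:15} pte_writable={page.pte_writable} "
          f"written={page.written_since_fork}")
```

In this model the writable bit alone cannot separate the untouched page from the two written ones, which is the ambiguity the reply asks about.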