Subject: Re: Regression: Requiring CAP_SYS_ADMIN for /proc/<pid>/pagemap causes application-level breakage
From: Mark Williamson
To: "Kirill A. Shutemov"
Cc: Mark Seaborn, kernel list, "Kirill A. Shutemov", Pavel Emelyanov, Konstantin Khlebnikov, Andrew Morton, Linus Torvalds, Andy Lutomirski, Linux API, Finn Grimwood, Daniel James
Date: Wed, 29 Apr 2015 21:24:40 +0100
In-Reply-To: <20150429193622.GA11892@node.dhcp.inet.fi>

Hi,

On Wed, Apr 29, 2015 at 8:36 PM, Kirill A. Shutemov wrote:
> On Wed, Apr 29, 2015 at 07:44:57PM +0100, Mark Williamson wrote:
>> Hi all,

... snip ...

>> For context, we were previously using pagemap as a cross-platform way
>> to get soft-dirty-like functionality.  Specifically, to ask "did a
>> process write to any pages since fork()" by comparing addresses and
>> deducing where CoW must have occurred.  In the absence of soft-dirty
>> and the physical addresses, it looks like we can't figure that out
>> with the remaining information in pagemap.
>>
>> If the pagemap file included the "writeable" bit from the PTE, we
>> think we'd have all the information required to deduce what we need
>> (although I realise that's a bit of a nasty workaround).  If I
>> proposed including the PTE protection bits in pagemap, would that be
>> controversial?
>> I'm guessing yes but thought it was worth a shot ;-)
>> Would anybody be able to suggest a more tasteful approach?
>
> Emm.. I have hard time to understand how writable bit is enough to get
> soft-dirty-alike functionality.

In the general case, you are of course correct - in our specific case
I *think* we'd be able to manage OK ... (see below).

> Let's say we have anon-mapping with COW setup after the fork(). It's not
> writable PTEs to trigger COW on wp faults. But you can easily get to the
> same non-writable PTE after breaking COW: fork() again or
> mprotect(PROT_READ) and mprotect(PROT_READ|PROT_WRITE) back.

I believe we'll be able to get away with this in our particular
use case.  The process is running in our debugger at the time, so we
can interpose on the system calls that are happening.  That should
give us the opportunity to check for CoW-breaking before the debuggee
is allowed to alter page protections itself.

It ends up not being full soft-dirty behaviour but it's similar enough
to tell us what we need to know.

Cheers,
Mark

> ?
>
>>
>> Thanks,
>> Mark
>>
>> [1] I'd note that using soft-dirty is clearly the right approach for
>> us on x64, where available and that ideally we'd use it on other
>> architectures - cross-arch support for soft-dirty is a slightly
>> different discussion, which I hope to post another thread for.
>
> --
> Kirill A. Shutemov
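
The "comparing addresses and deducing where CoW must have occurred" idea from the quoted message could be sketched roughly as below. This is an illustrative sketch only, not the code used by the debugger discussed in the thread. It assumes the documented pagemap entry layout (bit 63 = present, bit 62 = swapped, bit 55 = soft-dirty, bits 0-54 = PFN when present), a 4 KiB page size, and that the caller has already read the raw pagemap bytes for the same virtual range in both processes. The names `decode_entry`, `parse_pagemap` and `cow_broken_pages` are hypothetical. Without access to the PFN field the comparison has nothing to work with, which is the breakage the thread describes.

```python
import struct

PAGE_SIZE = 4096                  # assumption: 4 KiB pages
PM_PRESENT = 1 << 63              # page is present in RAM
PM_SWAPPED = 1 << 62              # page is in swap
PM_SOFT_DIRTY = 1 << 55           # soft-dirty bit (where supported)
PM_PFN_MASK = (1 << 55) - 1       # bits 0-54: page frame number if present


def decode_entry(raw):
    """Decode one 64-bit pagemap entry into (present, swapped, soft_dirty, pfn)."""
    present = bool(raw & PM_PRESENT)
    swapped = bool(raw & PM_SWAPPED)
    soft_dirty = bool(raw & PM_SOFT_DIRTY)
    # The PFN field is only meaningful for present pages.
    pfn = (raw & PM_PFN_MASK) if present else None
    return present, swapped, soft_dirty, pfn


def parse_pagemap(buf):
    """Split raw pagemap bytes into a list of native-endian 64-bit entries."""
    count = len(buf) // 8
    return list(struct.unpack("=%dQ" % count, buf[:count * 8]))


def cow_broken_pages(parent_entries, child_entries, start_vaddr=0):
    """Return virtual addresses whose backing frame differs between two processes.

    Both lists must cover the same virtual range, one entry per page.
    A differing PFN on a page present in both is taken as a sign that
    CoW was broken by a write after fork().
    """
    changed = []
    for i, (p_raw, c_raw) in enumerate(zip(parent_entries, child_entries)):
        p_present, _, _, p_pfn = decode_entry(p_raw)
        c_present, _, _, c_pfn = decode_entry(c_raw)
        if p_present and c_present and p_pfn != c_pfn:
            changed.append(start_vaddr + i * PAGE_SIZE)
    return changed
```

As Kirill's reply points out, a signal like this is weaker than real soft-dirty tracking: a later fork() or an mprotect() round trip changes the picture, so a real tool would need to take its snapshot before the traced process can do either.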