Date: Fri, 11 Jan 2019 12:47:28 +1100
From: Dave Chinner
To: Andy Lutomirski
Cc: Linus Torvalds, Jiri Kosina, Matthew Wilcox, Jann Horn, Andrew Morton,
    Greg KH, Peter Zijlstra, Michal Hocko, Linux-MM, kernel list, Linux API
Subject: Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged
Message-ID: <20190111014728.GL27534@dastard>
References: <20190108044336.GB27534@dastard> <20190109022430.GE27534@dastard>
    <20190109043906.GF27534@dastard> <20190110004424.GH27534@dastard>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 09, 2019 at 09:26:41PM -0800, Andy Lutomirski wrote:
> Since direct IO has been brought up, I have a question.  I've wondered
> for years why direct IO works the way it does.  If I were implementing
> it from scratch, my first inclination would be to use the page cache
> instead of fighting it.  To do a single-page direct read, I would look
> that page up in the page cache (i.e. i_pages these days).  If the page
> is there, I would do a normal buffered read.  If the page is not

Therein lies the problem. Copying data is prohibitively expensive,
and that's the primary reason for O_DIRECT existing. i.e. O_DIRECT
is a low-overhead, zero-copy data movement interface.

The moment we switch from using CPU to dispatch IO to copying data,
performance goes down because we will be unable to keep storage
pipelines full. IOWs, any rework of O_DIRECT that involves copying
data is a non-starter.

But let's bring this back to the issue at hand - observability of
page cache residency of file pages. If the page is cache resident,
then the read only has the latency of copying that data out of the
page (i.e. very low latency). If the page is not resident, then it
will do IO and take much, much longer to complete. i.e. we have
clear timing differences between cache hit and cache miss IO. This
is exactly the timing information needed for observing page cache
residency.

We need to work out how to make page cache residency less
observable, not add new, near perfect observation mechanisms that
third parties can easily exploit...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
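
As a rough illustration of the hit/miss timing difference described
above, the sketch below times single-page buffered reads from
userspace. The helper names, the 4096-byte page size and the 50
microsecond threshold are all assumptions made up for this example;
none of them come from the kernel or from this thread, and a usable
threshold would depend entirely on the storage and the machine.

```python
import os
import tempfile
import time

# Assumed page size for the example; os.sysconf("SC_PAGE_SIZE") would
# give the real value on a given system.
PAGE_SIZE = 4096


def timed_page_read(fd, page_index):
    """Do one buffered pread() of a page; return (bytes_read, elapsed_ns)."""
    start = time.perf_counter_ns()
    data = os.pread(fd, PAGE_SIZE, page_index * PAGE_SIZE)
    elapsed = time.perf_counter_ns() - start
    return len(data), elapsed


def looks_resident(elapsed_ns, threshold_ns=50_000):
    """Guess 'cache hit' when the read finished under the threshold.

    The threshold is a hypothetical value, not a measured one.
    """
    return elapsed_ns < threshold_ns


def demo():
    """Write a small temporary file, then time a read of each page."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * (4 * PAGE_SIZE))
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            return [timed_page_read(fd, i) for i in range(4)]
        finally:
            os.close(fd)


if __name__ == "__main__":
    for index, (nbytes, ns) in enumerate(demo()):
        print(index, nbytes, ns, looks_resident(ns))
```

The pages in the demo were just written, so they will most likely be
served from the page cache; no miss is forced here. The only point is
that an unprivileged read plus a clock is enough to take the
measurement.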