Date: Thu, 10 Jan 2019 06:47:11 -0800
From: Matthew Wilcox
To: Andy Lutomirski
Cc: Linus Torvalds, Dave Chinner, Jiri Kosina, Jann Horn, Andrew Morton, Greg KH, Peter Zijlstra, Michal Hocko, Linux-MM, kernel list, Linux API
Subject: Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged
Message-ID: <20190110144711.GV6310@bombadil.infradead.org>
References: <20190108044336.GB27534@dastard> <20190109022430.GE27534@dastard> <20190109043906.GF27534@dastard> <20190110004424.GH27534@dastard>

On Wed, Jan 09, 2019 at 09:26:41PM -0800, Andy Lutomirski wrote:
> Since direct IO has been brought up, I have a question. I've wondered
> for years why direct IO works the way it does. If I were implementing
> it from scratch, my first inclination would be to use the page cache
> instead of fighting it. To do a single-page direct read, I would look
> that page up in the page cache (i.e. i_pages these days). If the page
> is there, I would do a normal buffered read. If the page is not
> there, I would insert a record into i_pages indicating that direct IO
> is in progress and then I would do the IO into the destination page.
> If any other read, direct or otherwise, sees a record saying "under
> direct IO", it would wait.

OK, you're in the same ballpark I am ;-)  Kent Overstreet pointed out that
what you want to do here is great for the mixed case, but it's pretty
inefficient for IOs to files which are wholly uncached.  So what I'm
currently thinking about is an rwsem which works like this:

O_DIRECT task:
        if i_pages is empty, take rwsem for read, recheck i_pages is empty,
        do IO, drop rwsem.
        if i_pages is not empty, insert XA_LOCK_ENTRY, when IO complete,
        wake waitqueue for that (mapping, index).

buffered IO:
        if i_pages is empty, take rwsem for write, allocate page,
        insert page, drop rwsem.
        if i_pages is not empty, look up index, if entry is XA_LOCK_ENTRY
        sleep on waitqueue.  otherwise proceed as now.
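
The two code paths above can be sketched in user space roughly as follows. This is an illustrative Python model only, not kernel code and not anything from the thread: `Mapping`, `RWSem`, `direct_read`, `buffered_read` and `LOCK_ENTRY` (a stand-in for XA_LOCK_ENTRY) are hypothetical names, a single condition variable plays the role of the per-(mapping, index) waitqueue, and pages are never evicted. It also makes three assumptions the message does not state: the buffered path rechecks emptiness after taking the rwsem for write, a direct read that finds a cached page returns it (as in the quoted proposal), and a direct read waits if it finds an existing lock entry.

```python
import threading

LOCK_ENTRY = object()  # hypothetical stand-in for XA_LOCK_ENTRY


class RWSem:
    """Minimal reader/writer semaphore (no fairness, unlike a kernel rwsem)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def down_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def up_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def down_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def up_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class Mapping:
    def __init__(self, disk):
        self.disk = disk                   # index -> bytes, the backing store
        self.i_pages = {}                  # index -> cached bytes or LOCK_ENTRY
        self.rwsem = RWSem()
        self.tree = threading.Condition()  # guards i_pages, doubles as waitqueue

    def direct_read(self, index):
        """Return (data, path), where path names the branch that was taken."""
        with self.tree:
            empty = not self.i_pages
        if empty:
            # Wholly uncached file: take rwsem for read, recheck, do IO, drop.
            self.rwsem.down_read()
            try:
                with self.tree:
                    still_empty = not self.i_pages
                if still_empty:
                    return self.disk[index], "rwsem"
            finally:
                self.rwsem.up_read()
        with self.tree:
            while self.i_pages.get(index) is LOCK_ENTRY:
                self.tree.wait()           # assumption: wait out another direct IO
            if index in self.i_pages:
                return self.i_pages[index], "cached"  # assumption: use cached page
            self.i_pages[index] = LOCK_ENTRY
        try:
            return self.disk[index], "lock_entry"     # IO runs without the tree lock
        finally:
            with self.tree:
                del self.i_pages[index]
                self.tree.notify_all()     # wake waiters for this index

    def buffered_read(self, index):
        with self.tree:
            empty = not self.i_pages
        if empty:
            # Empty -> non-empty transition happens under rwsem held for write.
            self.rwsem.down_write()
            try:
                with self.tree:
                    if not self.i_pages:   # assumption: recheck under the rwsem
                        self.i_pages[index] = self.disk[index]
                        return self.i_pages[index]
            finally:
                self.rwsem.up_write()
        with self.tree:
            while self.i_pages.get(index) is LOCK_ENTRY:
                self.tree.wait()           # sleep until the direct IO completes
            if index not in self.i_pages:  # "proceed as now": fill the cache
                self.i_pages[index] = self.disk[index]
            return self.i_pages[index]
```

In this model, a `direct_read` on a fresh `Mapping` takes the rwsem path and leaves `i_pages` empty; once any `buffered_read` has populated the cache, later direct reads of uncached indices go through the lock-entry path instead. The point of the split is that the wholly uncached case never touches `i_pages` beyond an emptiness check.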