Date: Fri, 11 Jan 2019 05:57:50 +0100
From: Dominique Martinet
To: Linus Torvalds
Cc: Dave Chinner, Jiri Kosina, Matthew Wilcox, Jann Horn,
    Andrew Morton, Greg KH, Peter Zijlstra, Michal Hocko, Linux-MM,
    kernel list, Linux API
Subject: Re: [PATCH] mm/mincore: allow for making sys_mincore() privileged
Message-ID: <20190111045750.GA27333@nautica>
References: <20190109022430.GE27534@dastard>
    <20190109043906.GF27534@dastard>
    <20190110004424.GH27534@dastard>
    <20190110070355.GJ27534@dastard>
    <20190110122442.GA21216@nautica>
X-Mailing-List: linux-kernel@vger.kernel.org

Linus Torvalds wrote on Thu, Jan 10, 2019:
> On Thu, Jan 10, 2019 at 4:25 AM Dominique Martinet wrote:
> > Linus Torvalds wrote on Thu, Jan 10, 2019:
> > > (Except, of course, if somebody actually notices outside of tests.
> > > Which may well happen and just force us to revert that commit. But
> > > that's a separate issue entirely).
> >
> > Both Dave and I pointed at a couple of utilities that break with
> > this. nocache can arguably work with the new behaviour but will behave
> > differently; vmtouch on the other hand is no longer able to display
> > what's in cache or not - people use that for example to "warm up" a
> > container in page cache based on how it appears after it had been
> > running for a while is a pretty valid usecase to me.
>
> So honestly, the main reason I'm loath to revert is that yes, we know
> of theoretical differences, but they seem to all be
> performance-related.

I don't see what other use mincore could have, yes - even the
"debugging" use I gave is performance investigations and not hard
problems (and I probably would go straight to perf nowadays, you'd get
the info that the program doesn't use cache from the call graphs)

> It would be really good to hear numbers. Is the warm-up optimization
> something that changes things from 3ms to 3.5ms? Or does it change
> things from 3ms to half a second?

This is heavily workload and storage hardware dependent, so it is hard
to give an absolute value.

Trying with some big server, fast SSD, mysql and doing:

# echo 3 > /proc/sys/vm/drop_caches
# (optional) prefetch table and innodb files
# systemctl restart mariadb
# time mysql -q db "select * from mytable where id in $ENTRIES" > /dev/null
# time mysql -q db "select * from mytable where id in $ENTRIES2" > /dev/null
# time mysql -q db "select * from mytable where id in $ENTRIES3" > /dev/null

(where ENTRIES* are lists of 1000 ids, and id is indexed; the table is
8GB for 62590661 entries so 1000 entries is approx 128KB of data out of
that file)

Averaged over a few queries, I get a real time of approximately 350ms,
230ms and 220ms immediately after drop cache and service restart, and
150ms, 60ms and 60ms after a prefetch (hand-wavy average over 3 runs, I
didn't have the patience to do proper testing).
(In both cases, user/sys are less than 10ms; I don't see much difference
there)

If I restart the service without dropping caches and redo the query I
get 60ms from the first query onwards, so I must not be preloading
everything properly; some real script that would look all over a
container to properly restore the page cache would do better than me
blindly preloading a few files.

Either way, we're talking about a factor of 2-3 until the application
has been looking at most of the entries, and I didn't try to see what
that would look like on spinning disks or the kind of slow storage one
would get on a VPS somewhere in the cloud - I'm sure someone with time
to waste could get much more impressive figures, but this already looks
pretty worthwhile to me.

-- 
Dominique Martinet | Asmadeus
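
For readers who want to reproduce the "(optional) prefetch" step, here
is a minimal sketch of a blind preload, assuming Python 3 on Linux. It
is not the script used for the numbers above; the function names
prefetch_file and prefetch_tree are hypothetical. It simply reads every
regular file under a directory so the pages land in the page cache, and
it does not consult mincore() or restore a previously recorded cache
state the way a vmtouch-based warm-up would.

```python
import os


def prefetch_file(path, chunk_size=1 << 20):
    """Read one file sequentially so its pages are pulled into the page cache.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Optional hint that the whole file will be needed soon.
        # Not available on every platform, and some filesystems reject it,
        # so failures are ignored and the plain read loop does the work.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            except OSError:
                pass
        while True:
            buf = os.read(fd, chunk_size)
            if not buf:
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total


def prefetch_tree(root):
    """Blindly prefetch every regular file below root.

    Symlinks are skipped so that no file is read twice.
    Returns the total number of bytes read.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            total += prefetch_file(path)
    return total
```

A call such as prefetch_tree("/var/lib/mysql") (the path is an
assumption, adjust to wherever the table and innodb files live) would
go between the drop_caches and the service restart in the sequence
above.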