Message-ID: <520BB9EF.5020308@linux.intel.com>
Date: Wed, 14 Aug 2013 10:10:07 -0700
From: Dave Hansen
To: linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com,
    linux-ext4@vger.kernel.org, Jan Kara, LKML, david@fromorbit.com,
    Tim Chen, Andi Kleen, Andy Lutomirski
Subject: page fault scalability (ext3, ext4, xfs)
X-Mailing-List: linux-kernel@vger.kernel.org

We talked a little about this issue in this thread:

	http://marc.info/?l=linux-mm&m=137573185419275&w=2

but I figured I'd follow up with a full comparison. ext4 is about 20%
slower in handling write page faults than ext3. xfs is about 30% slower
than ext3. I'm running on an 8-socket / 80-core / 160-thread system.
Test case is this:

	https://github.com/antonblanchard/will-it-scale/blob/master/tests/page_fault3.c

It's a little easier to look at the trends as you grow the number of
processes:

	http://www.sr71.net/~dave/intel/page-fault-exts/cmp.html?1=ext3&2=ext4&3=xfs&hide=linear,threads,threads_idle,processes_idle&rollPeriod=16

I recorded and diff'd some perf data (I've still got the raw data if
anyone wants it), and the main culprit of the ext4/xfs delta looks to be
spinlock contention (or at least bouncing) in xfs_log_commit_cil().
This looks to be a known problem:

	http://oss.sgi.com/archives/xfs/2013-07/msg00110.html

Here's a brief snippet of the ext4->xfs 'perf diff'. Note that things
like page_fault() go down in the profile because we are doing _fewer_ of
them, not because it got faster:

> # Baseline    Delta  Shared Object          Symbol
> # ........  .......  .....................  ..............................................
> #
>     22.04%   -4.07%  [kernel.kallsyms]      [k] page_fault
>      2.93%  +12.49%  [kernel.kallsyms]      [k] _raw_spin_lock
>      8.21%   -0.58%  page_fault3_processes  [.] testcase
>      4.87%   -0.34%  [kernel.kallsyms]      [k] __set_page_dirty_buffers
>      4.07%   -0.58%  [kernel.kallsyms]      [k] mem_cgroup_update_page_stat
>      4.10%   -0.61%  [kernel.kallsyms]      [k] __block_write_begin
>      3.69%   -0.57%  [kernel.kallsyms]      [k] find_get_page

It's a bit of a bummer that things are so much less scalable on the
newer filesystems. I expected xfs to do a _lot_ better than it did.
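
For anyone who has not opened the test case linked above, the access
pattern being measured can be sketched roughly as below. This is only an
illustration under assumptions, not the C test itself: it assumes the
test repeatedly maps a file MAP_SHARED, stores to one byte in every
page, and unmaps again, so that each store to a fresh mapping takes a
write page fault through the filesystem. The names touch_pages_once(),
run_sketch() and the 'directory' parameter are made up for this sketch,
and interpreter overhead makes it useless for actual measurements.

```python
import mmap
import os
import tempfile


def touch_pages_once(fd, length, page_size):
    """Map the file shared, dirty one byte per page, then unmap.

    Returns the number of pages written. On a fresh shared mapping the
    first store to each page is expected to take a write page fault.
    """
    buf = mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE)
    touched = 0
    try:
        for offset in range(0, length, page_size):
            buf[offset] = 0  # one store per page
            touched += 1
    finally:
        buf.close()  # unmap, so the next pass faults again
    return touched


def run_sketch(iterations=3, pages=256, directory=None):
    """Repeat the map/touch/unmap cycle on an unlinked temporary file.

    'directory' (hypothetical parameter) selects where the file lives,
    and therefore which mounted filesystem handles the faults.
    """
    page_size = mmap.PAGESIZE
    length = pages * page_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.unlink(path)           # file stays alive through the open fd
        os.ftruncate(fd, length)  # size the file so every page is mappable
        total = 0
        for _ in range(iterations):
            total += touch_pages_once(fd, length, page_size)
        return total
    finally:
        os.close(fd)
```

To mimic the comparison, one would point 'directory' at a mount of each
filesystem in turn, for example run_sketch(directory="/mnt/xfs") where
the mount point is again just a placeholder.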