Date: Wed, 29 Apr 2009 19:25:46 -0400
From: Mathieu Desnoyers
To: Linus Torvalds, akpm@linux-foundation.org, Nick Piggin
Cc: Ingo Molnar, KOSAKI Motohiro, Peter Zijlstra, thomas.pi@arcor.dea,
    Yuriy Lalym, linux-kernel@vger.kernel.org, ltt-dev@lists.casi.polymtl.ca
Subject: [PATCH] Fix dirty page accounting in redirty_page_for_writepage()
Message-ID: <20090429232546.GB15782@Krystal>

Basically, the following execution :

dd if=/dev/zero of=/tmp/testfile

will slowly fill _all_ ram available without taking into account memory
pressure.

This is because the dirty page accounting is incorrect in
redirty_page_for_writepage.

This patch adds missing dirty page accounting in redirty_page_for_writepage().
This should fix a _lot_ of issues involving machines becoming slow under heavy
write I/O.

No surprise : eventually the system starts swapping.

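
To follow the relevant counters while the dd test runs, a small userspace
helper can pull individual fields out of /proc/meminfo. This is only a minimal
sketch: meminfo_kb() and read_meminfo() are hypothetical helper names chosen
for illustration, not kernel or libc interfaces, and it assumes the usual
"Key:   value kB" line layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up one field (e.g. "Dirty") in /proc/meminfo-style text and
 * return its value (in kB for most fields), or -1 if it is not present.
 * Hypothetical helper, for illustration only.
 */
long meminfo_kb(const char *text, const char *key)
{
	const char *line = text;
	size_t klen;

	assert(text != NULL && key != NULL);
	klen = strlen(key);

	while (*line) {
		/* Match "Key:" only at the start of a line. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			long val;

			if (sscanf(line + klen + 1, "%ld", &val) == 1)
				return val;
			return -1;
		}
		/* Advance to the start of the next line. */
		line = strchr(line, '\n');
		if (!line)
			break;
		line++;
	}
	return -1;
}

/* Read /proc/meminfo into buf; returns 0 on success, -1 on failure. */
int read_meminfo(char *buf, size_t len)
{
	FILE *f;
	size_t n;

	if (len == 0)
		return -1;
	f = fopen("/proc/meminfo", "r");
	if (!f)
		return -1;
	n = fread(buf, 1, len - 1, f);
	buf[n] = '\0';
	fclose(f);
	return 0;
}
```

Calling read_meminfo() once per second and printing meminfo_kb(buf, "Dirty"),
meminfo_kb(buf, "Writeback") and meminfo_kb(buf, "MemFree") is enough to watch
how those values evolve during the test.
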
Linux kernel 2.6.30-rc2

The /proc/meminfo picture I had before applying this patch after filling my
memory with the dd execution was :

MemTotal:       16433732 kB
MemFree:        10919700 kB
Buffers:           12492 kB
Cached:          5262508 kB
SwapCached:            0 kB
Active:            37096 kB
Inactive:        5254384 kB
Active(anon):      16716 kB
Inactive(anon):        0 kB
Active(file):      20380 kB
Inactive(file):  5254384 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      19535024 kB
SwapFree:       19535024 kB
Dirty:           2125956 kB
Writeback:         50476 kB
AnonPages:         16660 kB
Mapped:             9560 kB
Slab:             189692 kB
SReclaimable:     166688 kB
SUnreclaim:        23004 kB
PageTables:         3396 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    27751888 kB
Committed_AS:      53904 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10764 kB
VmallocChunk:   34359726963 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        3456 kB
DirectMap2M:    16773120 kB

After applying my patch, the same test case steadily leaves between 8 and
500MB ram free in the steady-state (when pressure is reached).

MemTotal:       16433732 kB
MemFree:           85144 kB
Buffers:           23148 kB
Cached:         15766280 kB
SwapCached:            0 kB
Active:            51500 kB
Inactive:       15755140 kB
Active(anon):      15540 kB
Inactive(anon):     1824 kB
Active(file):      35960 kB
Inactive(file): 15753316 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      19535024 kB
SwapFree:       19535024 kB
Dirty:           2501644 kB
Writeback:         33280 kB
AnonPages:         17280 kB
Mapped:             9272 kB
Slab:             505524 kB
SReclaimable:     485596 kB
SUnreclaim:        19928 kB
PageTables:         3396 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    27751888 kB
Committed_AS:      54508 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       10764 kB
VmallocChunk:   34359726715 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        3456 kB
DirectMap2M:    16773120 kB

The pressure pattern I see with the patch applied is :
(16GB ram total)

- Inactive(file) fills up to 15.7GB.
- Dirty fills up to 1.7GB.
- Writeback varies between 0 and 600MB.

sync() behavior :

- Dirty down to ~6MB.
- Writeback increases to 1.6GB, then shrinks down to ~0MB.

References :
This insanely huge
http://bugzilla.kernel.org/show_bug.cgi?id=12309
[Bug 12309] Large I/O operations result in slow performance and high iowait times
(yes, I've been in CC all along)

Special thanks to Linus Torvalds, Nick Piggin and Thomas Pi for their
suggestions on previous patch iterations.

Special thanks to the LTTng community, which helped me get LTTng up to its
current usability level. It's been tremendously useful in understanding those
problematic I/O workloads and generating fio test cases.

Signed-off-by: Mathieu Desnoyers
CC: Linus Torvalds
CC: akpm@linux-foundation.org
CC: Nick Piggin
CC: Ingo Molnar
CC: KOSAKI Motohiro
CC: Peter Zijlstra
CC: thomas.pi@arcor.dea
CC: Yuriy Lalym
---
 mm/page-writeback.c |    6 ++++++
 1 file changed, 6 insertions(+)

Index: linux-2.6-lttng/mm/page-writeback.c
===================================================================
--- linux-2.6-lttng.orig/mm/page-writeback.c	2009-04-29 18:14:48.000000000 -0400
+++ linux-2.6-lttng/mm/page-writeback.c	2009-04-29 18:23:59.000000000 -0400
@@ -1237,6 +1237,12 @@ int __set_page_dirty_nobuffers(struct pa
 		if (!mapping)
 			return 1;
 
+		/*
+		 * Take care of setting back page accounting correctly.
+		 */
+		inc_zone_page_state(page, NR_FILE_DIRTY);
+		inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
+
 		spin_lock_irq(&mapping->tree_lock);
 		mapping2 = page_mapping(page);
 		if (mapping2) { /* Race with truncate? */

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68