Date: Thu, 25 Apr 2013 08:38:25 -0500
From: Cliff Wickman
To: HATAYAMA Daisuke
Cc: ebiederm@xmission.com, vgoyal@redhat.com, kumagai-atsushi@mxc.nes.nec.co.jp, lisa.mitchell@hp.com, kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 0/8] kdump, vmcore: support mmap() on /proc/vmcore
Message-ID: <20130425133825.GA25089@sgi.com>
References: <20130413002000.18245.21513.stgit@localhost6.localdomain6>
In-Reply-To: <20130413002000.18245.21513.stgit@localhost6.localdomain6>

On Fri, Apr 05, 2013 at 12:04:02AM +0000, HATAYAMA Daisuke wrote:
> Currently, read to /proc/vmcore is done by read_oldmem() that uses
> ioremap/iounmap per a single page. For example, if memory is 1GB,
> ioremap/iounmap is called (1GB / 4KB)-times, that is, 262144
> times. This causes big performance degradation.
>
> In particular, the current main user of this mmap() is makedumpfile,
> which not only reads memory from /proc/vmcore but also does other
> processing like filtering, compression and IO work.
>
> To address the issue, this patch implements mmap() on /proc/vmcore to
> improve read performance.
>
> Benchmark
> =========
>
> You can see two benchmarks on terabyte memory system. Both show about
> 40 seconds on 2TB system. This is almost equal to performance by
> experimental kernel-side memory filtering.
>
> - makedumpfile mmap() benchmark, by Jingbai Ma
>   https://lkml.org/lkml/2013/3/27/19
>
> - makedumpfile: benchmark on mmap() with /proc/vmcore on 2TB memory system
>   https://lkml.org/lkml/2013/3/26/914
>
> ChangeLog
> =========
>
> v3 => v4)
>
> - Rebase 3.9-rc7.
> - Drop clean-up patches orthogonal to the main topic of this patch set.
> - Copy ELF note segments in the 1st kernel just as in v1. Allocate
>   vmcore objects per pages. => See [PATCH 5/8]
> - Map memory referenced by PT_LOAD entry directly even if the start or
>   end of the region doesn't fit inside page boundary, no longer copy
>   them as the previous v3. Then, holes, outside OS memory, are visible
>   from /proc/vmcore. => See [PATCH 7/8]
>
> v2 => v3)
>
> - Rebase 3.9-rc3.
> - Copy program headers separately from e_phoff in ELF note segment
>   buffer. Now there's no risk to allocate huge memory if program
>   header table positions after memory segment.
> - Add cleanup patch that removes unnecessary variable.
> - Fix wrongly using the variable that is buffer size configurable at
>   runtime. Instead, use the variable that has original buffer size.
>
> v1 => v2)
>
> - Clean up the existing codes: use e_phoff, and remove the assumption
>   on PT_NOTE entries.
> - Fix potential bug that ELF header size is not included in exported
>   vmcoreinfo size.
> - Divide patch modifying read_vmcore() into two: clean-up and primary
>   code change.
> - Put ELF note segments in page-size boundary on the 1st kernel
>   instead of copying them into the buffer on the 2nd kernel.
>
> Test
> ====
>
> This patch set is composed based on v3.9-rc7.
>
> Done on x86-64, x86-32 both with 1GB and over 4GB memory environments.
>
> ---
>
> HATAYAMA Daisuke (8):
>       vmcore: support mmap() on /proc/vmcore
>       vmcore: treat memory chunks referenced by PT_LOAD program header entries in page-size boundary in vmcore_list
>       vmcore: count holes generated by round-up operation for page boundary for size of /proc/vmcore
>       vmcore: copy ELF note segments in the 2nd kernel per page vmcore objects
>       vmcore: Add helper function vmcore_add()
>       vmcore, procfs: introduce MEM_TYPE_CURRENT_KERNEL flag to distinguish objects copied in 2nd kernel
>       vmcore: clean up read_vmcore()
>       vmcore: allocate buffer for ELF headers on page-size alignment
>
>
>  fs/proc/vmcore.c        |  349 ++++++++++++++++++++++++++++++++---------------
>  include/linux/proc_fs.h |    8 +
>  2 files changed, 245 insertions(+), 112 deletions(-)
>
> --
>
> Thanks.
> HATAYAMA, Daisuke

This is a very important patch set (patches 1 - 8) for speeding up the
kdump process.

We have found the mmap interface to /proc/vmcore about 80x faster than the
read interface. That is, doing mmap's and copying data (in pieces the size
of page structures) transfers all of /proc/vmcore about 80 times faster
than reading it.

This greatly speeds up the capture of a kdump, as the scan of page
structures takes the bulk of the time in dumping the OS on a machine with
terabytes of memory.

We would very much like to see this set make it into the 3.10 release.

Acked-by: Cliff Wickman

-Cliff
-- 
Cliff Wickman
SGI
cpw@sgi.com
(651) 683-3824
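
For readers following the thread, here is a rough user-space sketch of the access pattern discussed above: map a window of a file with mmap() and copy small pieces out of it, instead of issuing one read per piece. It is not taken from the patch set or from makedumpfile. The names pages_needed() and copy_via_mmap() and the window parameter are hypothetical, chosen for illustration only, and the sketch assumes the caller keeps offset + len within the size of the file.

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: number of page-sized units covering `bytes`,
 * rounded up. This is the per-page call count the cover letter refers to. */
static uint64_t pages_needed(uint64_t bytes, uint64_t page_size)
{
    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}

/*
 * Hypothetical helper: copy `len` bytes starting at `offset` of the open
 * file `fd` into `dst`, through read-only mmap() windows of at most
 * `window` bytes each. `window` must be a non-zero multiple of the system
 * page size, and offset + len must not exceed the file size.
 * Returns 0 on success, -1 on failure.
 */
static int copy_via_mmap(int fd, off_t offset, void *dst, size_t len,
                         size_t window)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *out = dst;

    if (page <= 0 || offset < 0 || window == 0 ||
        window % (size_t)page != 0)
        return -1;

    while (len > 0) {
        /* mmap() requires a page-aligned file offset, so round down
         * and skip the leading bytes inside the mapping. */
        off_t map_off = offset - (offset % page);
        size_t skip = (size_t)(offset - map_off);
        size_t avail = window - skip;      /* skip < page <= window */
        size_t n = len < avail ? len : avail;

        void *p = mmap(NULL, skip + n, PROT_READ, MAP_PRIVATE, fd, map_off);
        if (p == MAP_FAILED) {
            perror("mmap");
            return -1;
        }
        memcpy(out, (unsigned char *)p + skip, n);
        munmap(p, skip + n);

        out += n;
        offset += (off_t)n;
        len -= n;
    }
    return 0;
}
```

With these definitions, pages_needed(1ULL << 30, 4096) evaluates to 262144, the figure quoted in the cover letter for 1GB of memory with 4KB pages. A larger window means fewer mmap()/munmap() round trips per byte copied; the best window size for a real dump tool is a tuning question this sketch does not try to answer.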