Date: Wed, 18 Mar 2009 15:11:57 -0700
From: Andrew Morton <akpm@linux-foundation.org>
To: Ying Han
Cc: linux-kernel, linux-mm, guichaz@gmail.com, Alex Khesin,
 Mike Waychison, Rohit Seth, Nick Piggin, Peter Zijlstra,
 Linus Torvalds
Subject: Re: ftruncate-mmap: pages are lost after writing to mmaped file.
Message-Id: <20090318151157.85109100.akpm@linux-foundation.org>
In-Reply-To: <604427e00903181244w360c5519k9179d5c3e5cd6ab3@mail.gmail.com>
References: <604427e00903181244w360c5519k9179d5c3e5cd6ab3@mail.gmail.com>

On Wed, 18 Mar 2009 12:44:08 -0700 Ying Han wrote:

> We triggered the failure during some internal experiments with an
> ftruncate/mmap/write/read sequence, and found that some pages are
> "lost" after writing to the mmaped file; in the test case below this
> shows up as a non-zero bad-page count.
>
> First we deployed the test case to a group of machines and saw a >20%
> failure rate on average. Then I ran a couple of experiments to try to
> reproduce it on a single machine. What I found is that:
> 1. If I add an fsync after writing the file, I cannot reproduce the
>    issue.
> 2. If I add memory pressure (mmap/mlock) while running the test in an
>    infinite loop, the failure reproduces quickly. (background
>    flushing?)
>
> The "bad pages" count differs from run to run, from one digit up to
> 4-5 digits for the 128M ftruncated file. I also found that the bad
> page numbers are contiguous within each segment, with the total bad
> pages spanning several segments, e.g. "1-4, 9-20, 48-50". (batch
> flushing?)
>
> (The failure was reproduced on 2.6.29-rc8 and also happens on the
> 2.6.18 kernel. Here is a simple test case which reproduces it under
> memory pressure.)

Thanks. This will be a regression - the testing I did back in the days
when I actually wrote stuff would have picked this up.

Perhaps it is a 2.6.17 thing, which, IIRC, is when we made the changes
to redirty pages on each write fault. Or maybe it was something else.

Nick, Peter: I'm in .au at present and not able to build and run
kernels - is this something you'd have time to look into, please?

Given the amount of time for which this bug has existed, I guess it
isn't a 2.6.29 blocker, but once we've found the cause we should have
a little post-mortem to work out how a bug of this nature went
undetected for so long.
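For anyone trying to reproduce this: the pressure side of the
reproducer isn't included in the report. A minimal pressure generator
along the lines of point 2 above (anonymous mmap plus mlock, run
alongside the quoted test program below) might look like the sketch
here - the 512M size is a guess and would need tuning to the machine,
and mlock needs root or a large enough RLIMIT_MEMLOCK:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical pressure size - pick something large enough to
     * force reclaim of the test file's pagecache. */
    size_t locksize = (size_t)512 << 20;

    char *mem = mmap(0, locksize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* Pin the region so it cannot itself be reclaimed. */
    if (mlock(mem, locksize)) {
        perror("mlock");
        return 1;
    }
    memset(mem, 1, locksize);  /* touch every page */
    pause();                   /* hold the memory until killed */
    return 0;
}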
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <sys/mman.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <string.h>
> #include <unistd.h>
>
> long kMemSize = 128 << 20;
> int kPageSize = 4096;
>
> int main(int argc, char **argv) {
>   int status;
>   int count = 0;
>   int i;
>   char *fname = "/root/test.mmap";
>   char *mem;
>
>   unlink(fname);
>   int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
>   status = ftruncate(fd, kMemSize);
>
>   mem = mmap(0, kMemSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>   // Fill the memory with 1s.
>   memset(mem, 1, kMemSize);
>
>   // Count a page as bad if its first byte reads back as 0.
>   for (i = 0; i < kMemSize; i++) {
>     int byte_good = mem[i] != 0;
>
>     if (!byte_good && ((i % kPageSize) == 0)) {
>       //printf("%d ", i / kPageSize);
>       count++;
>     }
>   }
>
>   munmap(mem, kMemSize);
>   close(fd);
>   unlink(fname);
>
>   if (count > 0) {
>     printf("Running %d bad page\n", count);
>     return 1;
>   }
>   return 0;
> }
>
> --Ying
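One more note on point 1 of the report (an fsync after the write makes
the failures disappear): the flush goes between the write and the
read-back. Below is a cut-down variant of the quoted test with that
flush added - using msync() on the mapping is my assumption; the
report says fsync on the fd, which should write back the same dirty
pages. Error handling is omitted as in the original:

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    long size = 128 << 20;  /* same 128M file as the quoted test */
    long i, count = 0;
    char *fname = "/root/test.mmap";

    unlink(fname);
    int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
    ftruncate(fd, size);
    char *mem = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    memset(mem, 1, size);
    /* The flush which, per the report, makes the lost pages go away.
     * MS_SYNC waits for writeback to complete. */
    msync(mem, size, MS_SYNC);

    for (i = 0; i < size; i += 4096)  /* check one byte per page */
        if (mem[i] == 0)
            count++;

    printf("%ld bad pages\n", count);
    munmap(mem, size);
    close(fd);
    unlink(fname);
    return count > 0;
}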