Date: Mon, 18 May 2015 16:38:54 -0700
From: santosh shilimkar
Organization: Oracle Corporation
To: Ming Lei
Cc: Jens Axboe, Christoph Hellwig, Linux Kernel Mailing List, ocfs2-devel@oss.oracle.com
Subject: Re: [Regression] Guest fs corruption with 'block: loop: improve performance via blk-mq'
Message-ID: <555A780E.7070701@oracle.com>

On 5/18/2015 4:25 PM, Ming Lei wrote:
> On Tue, May 19, 2015 at 7:13 AM, santosh shilimkar wrote:
>> On 5/18/2015 11:07 AM, santosh shilimkar wrote:
>>>
>>> On 5/17/2015 6:26 PM, Ming Lei wrote:
>>>>
>>>> Hi Santosh,
>>>>
>>>> Thanks for your report!
>>>>
>>>> On Sun, May 17, 2015 at 4:13 AM, santosh shilimkar wrote:
>>>>>
>>>>> Hi Ming Lei, Jens,
>>>>>
>>>>> While doing a few tests with recent kernels with Xen Server,
>>>>> we saw a guest's (DOMU) disk image getting corrupted while booting it.
>>>>> Strangely, the issue is seen so far only with a disk image on an ocfs2
>>>>> volume. If the same image is kept on an EXT3/4 drive, no corruption
>>>>> is observed. The issue is easily reproducible. You see a flurry
>>>>> of errors while the guest is mounting the file systems.
>>>>>
>>>>> After some debugging and bisecting, we narrowed the issue down to
>>>>> commit "b5dd2f6 block: loop: improve performance via blk-mq". With
>>>>> that commit reverted, the corruption goes away.
>>>>>
>>>>> Some more details on the test setup:
>>>>> 1. OVM (Xen) Server kernel (DOM0) upgraded to a more recent kernel
>>>>> which includes commit b5dd2f6. Boot the server.
>>>>> 2. On the DOM0 file system, create an ocfs2 volume.
>>>>> 3. Keep the guest (VM) disk image on the ocfs2 volume.
>>>>> 4. Boot the guest image. (xm create vm.cfg)
>>>>
>>>>
>>>> I am not familiar with xen, so is the image accessed via
>>>> loop block inside of guest VM? Is the loop block created
>>>> in DOM0 or guest VM?
>>>>
>>> Guest. The guest disk image is represented as a file by the loop
>>> device.
>>>
>>>>> 5. Observe the VM boot console log. The VM itself uses the EXT3 fs.
>>>>> You will see errors like below and after this boot, that file
>>>>> system/disk-image gets corrupted and mostly won't boot next time.
>>>>
>>>>
>>>> OK, that means the image is corrupted by VM booting.
>>>>
>>> Right
>>>
>>> [...]
>>>
>>>>>
>>>>> From the debug of the actual data on the disk vs what is read by
>>>>> the guest VM, we suspect the *reads* are actually not going all
>>>>> the way to disk and possibly returning the wrong data. Because
>>>>> the actual data on the ocfs2 volume at those locations seems
>>>>> to be non-zero whereas the guest seems to read it as zero.
>>>>
>>>>
>>>> Two big changes in the patchset are: 1) use blk-mq request based IO;
>>>> 2) submit I/O concurrently (write vs. write is still serialized)
>>>>
>>>> Could you apply the patch in the link below to see if it can fix the issue?
>>>> BTW, this patch only removes concurrent submission.
>>>>
>>>> http://marc.info/?t=143093223200004&r=1&w=2
>>>>
>>> What kernel is this patch generated against? It doesn't apply against
>>> v4.0. Does this need the AIO/DIO conversion patches as well? Do you
>>> have the dependent patch-set? I can't apply it against v4.0.
>>>
>> Anyway, I created a patch (end of the email) against v4.0, based on your
>> patch, and tested it. The corruption is no longer seen, so backing out
>> the concurrent submission changes from commit b5dd2f6 does fix the
>> issue. Let me know what your plan is, since Linus's tip as well as
>> v4.0 needs this fix.
>
> If your issue is caused by concurrent IO submission, it might be an
> issue in ocfs2. As you see, there isn't such a problem for ext3/ext4.
>
As we speak, I got to know about another regression with XFS as well
and am quite confident, based on the symptoms, that it's a similar issue.
I will get confirmation by tomorrow on whether the patch fixes it
or not.

> And the single thread patch is introduced for aio/dio support, which
> shouldn't have been a fix patch.
>
Well, before the loop blk-mq conversion commit b5dd2f6, the loop driver
was single threaded, and as you see, the issue is seen with that
commit. Now this experiment also shows that those work-queue
split changes are problematic. So I am not sure why you say that it
shouldn't be a fix patch.

I am not denying that the issue could be with OCFS2 or XFS (not proved
yet), but they were happy before that commit ;-)

Regards,
Santosh
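
For anyone who wants to check the zero-read symptom described in this thread (data non-zero on the ocfs2 volume but read back as zero), here is a minimal sketch of a block-by-block comparison. It is only an illustration, not the tooling actually used in this report: the function name, the 4096-byte block size, and the idea of comparing the backing image file against a dump of what was read through the loop device are all assumptions.

```python
BLOCK_SIZE = 4096  # assumed comparison granularity, not taken from the report


def find_zero_read_mismatches(backing_path, observed_path, block_size=BLOCK_SIZE):
    """Return byte offsets of blocks that are all zero in `observed_path`
    but contain non-zero data in `backing_path`.

    `backing_path`  : hypothetical path to the image file on the ocfs2 volume
    `observed_path` : hypothetical path to a dump of what was read via loop
    """
    mismatches = []
    zero_block = bytes(block_size)
    with open(backing_path, "rb") as backing, open(observed_path, "rb") as observed:
        offset = 0
        while True:
            expected = backing.read(block_size)
            actual = observed.read(block_size)
            if not expected or not actual:
                break  # stop at the end of the shorter input
            n = min(len(expected), len(actual))
            # Flag only the pattern reported here: zeros where data should be.
            if actual[:n] == zero_block[:n] and expected[:n] != zero_block[:n]:
                mismatches.append(offset)
            offset += block_size
    return mismatches
```

Calling `find_zero_read_mismatches("backing.img", "observed.img")` (both file names hypothetical) returns a list of offsets. An empty list means no block matched the zero-read pattern; it does not rule out other kinds of corruption.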