From: "Wilcox, Matthew R"
To: Jeff Moyer
CC: "linda.knippers@hp.com", "linux-kernel@vger.kernel.org", "linux-fsdevel@vger.kernel.org"
Subject: RE: regression introduced by "block: Add support for DAX reads/writes to block devices"
Date: Thu, 6 Aug 2015 15:51:56 +0000
Message-ID: <100D68C7BA14664A8938383216E40DE040914111@FMSMSX114.amr.corp.intel.com>
References: <100D68C7BA14664A8938383216E40DE04091408C@FMSMSX114.amr.corp.intel.com>

Yes, that's the result I want.  Fundamentally, I think DAX should be able to support devices that are not multiples of PAGE_SIZE in size.

-----Original Message-----
From: Jeff Moyer [mailto:jmoyer@redhat.com]
Sent: Thursday, August 06, 2015 8:34 AM
To: Wilcox, Matthew R
Cc: linda.knippers@hp.com; linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org
Subject: Re: regression introduced by "block: Add support for DAX reads/writes to block devices"

"Wilcox, Matthew R" writes:

> I think I see the problem.
> I'm kind of wrapped up in other things right now; can you try
> replacing the line in dax_io():
>
> -                       bh->b_size = PAGE_ALIGN(end - pos);
> +                       bh->b_size = ALIGN(end - pos, 1 << blkbits);

That's not gonna work either.  :)  You'll end up with -EINVAL since
bdev_direct_access wants the sector to be aligned to a page:

        if (sector % (PAGE_SIZE / 512))
                return -EINVAL;

I think you really want to call direct_access with the full page, and
then tease out the part you want up in dax_io, right?  I'll take a crack
at it if you're busy.

Cheers,
Jeff

> -----Original Message-----
> From: Jeff Moyer [mailto:jmoyer@redhat.com]
> Sent: Wednesday, August 05, 2015 1:19 PM
> To: Wilcox, Matthew R; linda.knippers@hp.com
> Cc: linux-kernel@vger.kernel.org; linux-fsdevel@vger.kernel.org
> Subject: regression introduced by "block: Add support for DAX reads/writes to block devices"
>
> Hi, Matthew,
>
> Linda Knippers noticed that commit (bbab37ddc20b) breaks mkfs.xfs:
>
> # mkfs -t xfs -f /dev/pmem0
> meta-data=/dev/pmem0             isize=256    agcount=4, agsize=524288 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0        finobt=0
> data     =                       bsize=4096   blocks=2097152, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> log      =internal log           bsize=4096   blocks=2560, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> mkfs.xfs: read failed: Numerical result out of range
>
> I sat down with Linda to look into it, and the problem is that mkfs.xfs
> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
> from the last sector of the device.  This results in dax_io trying to do
> a page-sized I/O at 512 bytes from the end of the device.
> bdev_direct_access, receiving this bogus pos/size combo, returns
> -ERANGE:
>
>         if ((sector + DIV_ROUND_UP(size, 512)) >
>             part_nr_sects_read(bdev->bd_part))
>                 return -ERANGE;
>
> Given that file systems supporting dax refuse to mount with a blocksize
> != page size, I'm guessing this is sort of expected behavior.  However,
> we really shouldn't be breaking direct I/O on pmem devices.
>
> So, what do you want to do?  We could make the pmem device's logical
> block size fixed at the system page size.  Or, we could modify the dax
> code to work with blocksize < pagesize.  Or, we could continue using the
> direct I/O codepath for direct block device access.  What do you think?
>
> Thanks,
> Jeff and Linda
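
For anyone following the arithmetic, here is a rough user-space sketch of the two checks quoted in this thread. It is not the kernel code. The names model_direct_access, model_map_containing_page, SECTOR_SIZE and MODEL_PAGE_SIZE are made up for illustration, a 4 KiB page is assumed, and the range check is placed before the alignment check only because that ordering matches the -ERANGE that mkfs.xfs reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SECTOR_SIZE     512u
#define MODEL_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/*
 * Hypothetical user-space model of the two checks quoted in this thread.
 * On success it returns the requested size, which is a simplification.
 */
static long model_direct_access(uint64_t sector, uint64_t size,
                                uint64_t nr_sects)
{
    assert(MODEL_PAGE_SIZE % SECTOR_SIZE == 0);

    /* Range check: the request must end inside the device. */
    if (sector + (size + SECTOR_SIZE - 1) / SECTOR_SIZE > nr_sects)
        return -ERANGE;

    /* Alignment check: the starting sector must sit on a page boundary. */
    if (sector % (MODEL_PAGE_SIZE / SECTOR_SIZE))
        return -EINVAL;

    return (long)size;
}

/*
 * Sketch of the idea floated in the thread: map the whole page that
 * contains byte position pos, and report where pos falls inside it.
 */
static long model_map_containing_page(uint64_t pos, uint64_t nr_sects,
                                      uint64_t *offset_in_page)
{
    uint64_t page_start = pos - (pos % MODEL_PAGE_SIZE);

    *offset_in_page = pos - page_start;
    return model_direct_access(page_start / SECTOR_SIZE, MODEL_PAGE_SIZE,
                               nr_sects);
}
```

Taking a device of 16777216 sectors (an assumption derived from the 2097152 blocks of 4096 bytes in the mkfs output), a page-sized request at the last sector fails the range check, a 512-byte request at the same sector passes the range check but fails the alignment check, and mapping the containing page succeeds with the wanted data at offset 3584. If the device had one extra sector, the containing-page approach would run past the end and hit -ERANGE again, which is the case of a device that is not a multiple of PAGE_SIZE in size.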