MIME-Version: 1.0
In-Reply-To: <ae3c9d95-5af1-eb2d-c0c1-32ae622f6c54@nazar.ca>
References: <1ae53e17-e455-4f17-0280-b0dae183a449@nazar.ca>
 <CA+55aFzukw-6m0vtm6jDn4kXyfcUQbDFSwt1OOoa9nwG6toycQ@mail.gmail.com> <ae3c9d95-5af1-eb2d-c0c1-32ae622f6c54@nazar.ca>
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 23 Aug 2017 13:13:04 -0700
Message-ID: <CA+55aFzWCSU6A0xmG+TC+K-tN3tZTOo_0dxYy5k=5K+DeTEx8A@mail.gmail.com>
Subject: Re: Kernels v4.9+ cause short reads of block devices
To: Doug Nazar <nazard@nazar.ca>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        Wei Fang <fangwei1@huawei.com>,
        linux-fsdevel <linux-fsdevel@vger.kernel.org>
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1640
Lines: 44

On Wed, Aug 23, 2017 at 12:53 PM, Doug Nazar <nazard@nazar.ca> wrote:
>
> It's compiling now, but I think it's already set to MAX_LFS_FILESIZE.
>
> [  169.095127] ppos=80180006000, s_maxbytes=7ffffffffff, magic=0x62646576,
> type=bdev

Oh, right you are - I'm much too used to 64-bit, where
MAX_LFS_FILESIZE is basically infinite, and was just assuming that it
was something like the UFS bug we had not that long ago that was due
to the 32-bit limit.

But yes, on 32-bit, we are limited by the 32-bit index into the page
cache, and we limit the index to 31 bits too, so we have (PAGE_SIZE <<
31) -1, which is that 7ffffffffff.
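
As a rough sketch of that arithmetic (assuming 4 KiB pages and a 32-bit
long; the EX_ names and the helper are made up for illustration and are
not kernel symbols):

```c
#include <stdint.h>

/* Assumed values for a typical 32-bit build: 4 KiB pages, 32-bit long. */
#define EX_PAGE_SIZE      4096ULL
#define EX_BITS_PER_LONG  32

/*
 * Old-style limit: the page cache index is restricted to 31 bits,
 * and one byte is subtracted from the resulting size.
 */
static inline uint64_t ex_old_max_lfs_filesize(void)
{
	return (EX_PAGE_SIZE << (EX_BITS_PER_LONG - 1)) - 1;
}
```

With those assumptions the helper evaluates to 0x7ffffffffff, i.e. one
byte short of 8TB.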

And that also explains why people haven't seen it. You do need

 (a) 32-bit environment

 (b) a disk larger than that 8TB in size

The *hard* limit for the page cache on a 32-bit environment should
actually be (PAGE_SIZE << 32)-PAGE_SIZE (that final PAGE_SIZE
subtraction is to make sure we don't generate that page cache with
index -1), so having a disk that is 16TB or larger is not going to
work, but your disk is right in that 8TB-16TB hole that used to work
and was broken by that check.
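
To make that hole concrete, here is a small sketch, again assuming 4 KiB
pages on 32-bit. The hole_ helpers are hypothetical names for
illustration only. It checks whether an offset lies above the 31-bit
limit but still below the hard 32-bit limit:

```c
#include <stdbool.h>
#include <stdint.h>

#define HOLE_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Limit with a 31-bit page index: (PAGE_SIZE << 31) - 1. */
static inline uint64_t hole_limit_31bit(void)
{
	return (HOLE_PAGE_SIZE << 31) - 1;
}

/* Hard limit with a 32-bit page index: (PAGE_SIZE << 32) - PAGE_SIZE. */
static inline uint64_t hole_limit_32bit(void)
{
	return (HOLE_PAGE_SIZE << 32) - HOLE_PAGE_SIZE;
}

/* Page index of the last byte below the hard limit; must not be (u32)-1. */
static inline uint32_t hole_last_page_index(void)
{
	return (uint32_t)((hole_limit_32bit() - 1) / HOLE_PAGE_SIZE);
}

/* True if pos is past the 31-bit limit but still under the hard limit. */
static inline bool hole_contains(uint64_t pos)
{
	return pos > hole_limit_31bit() && pos < hole_limit_32bit();
}
```

The ppos=80180006000 from the log above falls inside that range, and
the last usable page index comes out as 0xfffffffe rather than
0xffffffff.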

Anyway, that makes me feel better. I should have looked at your disk
size more, now I at least understand why nobody noticed before.

So just throw away my patch. That's wrong, and garbage.

The *right* patch is likely to be just this instead:

  -#define MAX_LFS_FILESIZE       (((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
  +#define MAX_LFS_FILESIZE       (((loff_t)PAGE_SIZE << BITS_PER_LONG)-PAGE_SIZE)

which should make MAX_LFS_FILESIZE be 0xffffffff000 and your disk size
should be ok.

                      Linus