2011-06-10 08:38:06

by Akira Fujita

[permalink] [raw]
Subject: [BUG] ext4 timestamps corruption

Hi,

Officially, ext4 can handle its timestamps until 2514
with 32bit entries plus EPOCH_BIT (2bits).
But when timestamps values use 32+ bit
(e.g. 2038-01-19 9:14:08 0x0000000080000000),
we can get corrupted values.
Because sign bit is overwritten by transferring value
between kernel space and user space.

This can be happened with kernel 3.0.0-rc2 (Also older kernel)
on x86_64.

# This issue is already on Bugzilla,
does anybody know this current status?
https://bugzilla.kernel.org/show_bug.cgi?id=23732

Reproduce steps are as follows:
# System time is set to UTC.

# mount -t ext4 /dev/sda8 /mnt/mp1

# touch -t 203801190314.08 /mnt/mp1/FILE

# umount /mnt/mp1
# mount -t ext4 /dev/sda8 /mnt/mp1

# stat /mnt/mp1/FILE
File: `/mnt/mp1/FILE'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 808h/2056d Inode: 12 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 1901-12-13 20:45:52.000000000 +0000 <-----
Modify: 1901-12-13 20:45:52.000000000 +0000 <-----
Change: 2011-06-10 03:57:39.595385951 +0100
Birth: -

Regards,
Akira Fujita



2011-06-11 18:04:51

by Andreas Dilger

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

On 2011-06-10, at 2:27 AM, Akira Fujita <[email protected]> wrote:
>
> Officially, ext4 can handle its timestamps until 2514
> with 32bit entries plus EPOCH_BIT (2bits).
> But when timestamps values use 32+ bit
> (e.g. 2038-01-19 9:14:08 0x0000000080000000),
> we can get corrupted values.
> Because sign bit is overwritten by transferring value
> between kernel space and user space.
>
> This can be happened with kernel 3.0.0-rc2 (Also older kernel)
> on x86_64.
>
> # This issue is already on Bugzilla,
> does anybody know this current status?
> https://bugzilla.kernel.org/show_bug.cgi?id=23732

I can't find any discussion about this bug in the list archives, but it is definitely a real problem.

At first glance, it appears that the correct solution is to shift the high
bits in the extra time by only 31 bits.

As stated in the posting, it makes sense to keep the range -2^31 - +2^33
for maximum usability. I don't think there is any value to store more
negative times.

To be honest I also don't think there is any value to even keeping negative
timestamps (before 1970) since this is about storing file creation or
modification times and I don't think any files with real creation dates
before 1970 are used anywhere.

Either way I expect the time range to be sufficient, once the bug is fixed.

> Reproduce steps are as follows:
> # System time is set to UTC.
>
> # mount -t ext4 /dev/sda8 /mnt/mp1
>
> # touch -t 203801190314.08 /mnt/mp1/FILE
>
> # umount /mnt/mp1
> # mount -t ext4 /dev/sda8 /mnt/mp1
>
> # stat /mnt/mp1/FILE
> File: `/mnt/mp1/FILE'
> Size: 0 Blocks: 0 IO Block: 4096 regular empty file
> Device: 808h/2056d Inode: 12 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
> Access: 1901-12-13 20:45:52.000000000 +0000 <-----
> Modify: 1901-12-13 20:45:52.000000000 +0000 <-----
> Change: 2011-06-10 03:57:39.595385951 +0100
> Birth: -

Hmm, interesting, I wonder how stat is expecting to get the "Birth" time from
the kernel? We have the crtime in ext4 inodes, so we could be returning it.

Cheers, Andreas

2011-06-23 07:52:54

by Akira Fujita

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

Hi Andreas,

Thanks for comment. I wrote a patch, could you review this?

(2011/06/12 1:48), Andreas Dilger wrote:
>
> On 2011-06-10, at 2:27 AM, Akira Fujita <[email protected] <mailto:[email protected]>> wrote:
>>
>> Officially, ext4 can handle its timestamps until 2514
>> with 32bit entries plus EPOCH_BIT (2bits).
>> But when timestamps values use 32+ bit
>> (e.g. 2038-01-19 9:14:08 0x0000000080000000),
>> we can get corrupted values.
>> Because sign bit is overwritten by transferring value
>> between kernel space and user space.
>>
>> This can be happened with kernel 3.0.0-rc2 (Also older kernel)
>> on x86_64.
>>
>> # This issue is already on Bugzilla,
>> does anybody know this current status?
>> https://bugzilla.kernel.org/show_bug.cgi?id=23732
>
> I can't find any discussion about this bug in the list archives, but it is definitely a real problem.
>
> At first glance, it appears that the correct solution is to shift the high bits in the extra time by only 31 bits.
>
> As stated in the posting, it makes sense to keep the range -2^31 - +2^33 for maximum usability. I don't think there is any value to store more negative times.
>
> To be honest I also don't think there is any value to even keeping negative timestamps (before 1970) since this is about storing file creation/modification times and I don't think any files with real
> creation dates before 1970 are used anywhere.
>
> Either way I expect the time range to be sufficient, once the bug is fixed.

ext4: Fix ext4 timestamps corruption

Officially, ext4 can handle its timestamps until 2514
with 32bit entries plus EPOCH_BIT (2bits).
But when timestamps values use 32+ bit
(e.g. 2038-01-19 9:14:08 0x0000000080000000),
we can get corrupted values.
Because sign bit is overwritten by transferring value
between kernel space and user space.

To fix this issue, 32th bit of extra time fields in ext4_inode structure
(e.g. i_ctime_extra) are used as the sign for 64bit user space.
Because these are used only 20bits for nano-second and bottom of 2bits
are for EXT4_EPOCH_BITS shift.
With this patch, ext4 supports timestamps Y1901-2514.

The performance comparison is as follows:
------------------------------------------------
| | file create | ls -l |
|--------------|---------------|----------------
|with patch | 138.2056 sec | 14.9333 sec |
|without patch | 133.4566 sec | 14.9096 sec |
------------------------------------------------
file count:1 million (average of 3 trials)
kernel: 3.0.0-rc3

There is a slightly difference, but I think it is acceptable.

Thanks and Regards,
Akira Fujita

Signed-off-by: Akira Fujita <[email protected]>
---
fs/ext4/ext4.h | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)

diff -Nrup -X linux-3.0-rc3-a/Documentation/dontdiff linux-3.0-rc3-a/fs/ext4/ext4.h linux-3.0-rc3-b/fs/ext4/ext4.h
--- linux-3.0-rc3-a/fs/ext4/ext4.h 2011-06-14 07:29:59.000000000 +0900
+++ linux-3.0-rc3-b/fs/ext4/ext4.h 2011-06-23 14:18:47.000000000 +0900
@@ -645,6 +645,7 @@ struct move_extent {
#define EXT4_EPOCH_BITS 2
#define EXT4_EPOCH_MASK ((1 << EXT4_EPOCH_BITS) - 1)
#define EXT4_NSEC_MASK (~0UL << EXT4_EPOCH_BITS)
+#define EXT4_TIMESTAMP_SIGN_MASK 0x80000000

/*
* Extended fields will fit into an inode if the filesystem was formatted
@@ -665,16 +666,23 @@ struct move_extent {
static inline __le32 ext4_encode_extra_time(struct timespec *time)
{
return cpu_to_le32((sizeof(time->tv_sec) > 4 ?
- (time->tv_sec >> 32) & EXT4_EPOCH_MASK : 0) |
- ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
+ (time->tv_sec >> 32) &
+ (EXT4_EPOCH_MASK | EXT4_TIMESTAMP_SIGN_MASK) :
+ time->tv_sec & EXT4_TIMESTAMP_SIGN_MASK) |
+ ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
}

static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
{
- if (sizeof(time->tv_sec) > 4)
+ if (sizeof(time->tv_sec) > 4) {
time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
<< 32;
- time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
+ if (le32_to_cpu(extra) & EXT4_TIMESTAMP_SIGN_MASK)
+ time->tv_sec |= EXT4_NSEC_MASK << 32;
+ }
+
+ time->tv_nsec = ((le32_to_cpu(extra) & ~EXT4_TIMESTAMP_SIGN_MASK) >>
+ EXT4_EPOCH_BITS);
}

#define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \
@@ -696,19 +704,23 @@ do { \

#define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \
do { \
- (inode)->xtime.tv_sec = (signed)le32_to_cpu((raw_inode)->xtime); \
- if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \
+ if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) { \
+ (inode)->xtime.tv_sec = \
+ (__u32)(le32_to_cpu((raw_inode)->xtime)); \
ext4_decode_extra_time(&(inode)->xtime, \
raw_inode->xtime ## _extra); \
- else \
+ } else { \
+ (inode)->xtime.tv_sec = \
+ (signed)le32_to_cpu((raw_inode)->xtime); \
(inode)->xtime.tv_nsec = 0; \
+ } \
} while (0)

#define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode) \
do { \
if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime)) \
- (einode)->xtime.tv_sec = \
- (signed)le32_to_cpu((raw_inode)->xtime); \
+ (einode)->xtime.tv_sec = \
+ (__u32)(le32_to_cpu((raw_inode)->xtime)); \
if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime ## _extra)) \
ext4_decode_extra_time(&(einode)->xtime, \
raw_inode->xtime ## _extra); \

2011-06-23 22:37:12

by Mark Harris

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

2011/6/23 Akira Fujita <[email protected]>:
>> https://bugzilla.kernel.org/show_bug.cgi?id=23732
>
> ext4: Fix ext4 timestamps corruption
>
> Officially, ext4 can handle its timestamps until 2514
> with 32bit entries plus EPOCH_BIT (2bits).
> But when timestamps values use 32+ bit
> (e.g. 2038-01-19 9:14:08 0x0000000080000000),
> we can get corrupted values.
> Because sign bit is overwritten by transferring value
> between kernel space and user space.
>
> To fix this issue, 32th bit of extra time fields in ext4_inode structure
> (e.g. i_ctime_extra) are used as the sign for 64bit user space.
> Because these are used only 20bits for nano-second and bottom of 2bits
> are for EXT4_EPOCH_BITS shift.
> With this patch, ext4 supports timestamps Y1901-2514.

Thanks for looking into this bug. However tv_nsec is in the
range 0..999999999 and requires 30 bits. That is why tv_sec was
only extended by 2 bits. So there are no additional spare bits
in the "extra" field.

34-bit seconds can accommodate a maximum of 544.4 years, e.g.
1970..2514 or 1901..2446. Although an early version of the patch
for 34-bit tv_sec in ext4 worked with years 1970..2514, prior to
being committed the patch was changed to support pre-1970
timestamps (introducing the sign extension issue in the decoding):
http://marc.info/?l=linux-ext4&m=118208541320999

The existing encoding simply encodes bits 31..0 of tv_sec in the
regular time field and bits 33..32 in the extra field (along with
the 30-bit tv_nsec). The issue is in the decoding, which I think
can be addressed by changing only the body of the "if" in in the
ext4_decode_extra_time function, to something like this:

time->tv_sec += ((__u32)time->tv_sec +
((__u64)le32_to_cpu(extra) << 32) +
0x80000000LL) & 0x300000000LL;

This is untested, and might look nicer with some macros, but
it should decode the 34 bits into a timestamp in the range
-0x80000000 (1901-12-13) to 0x37fffffff (2446-05-10), while
retaining compatibility with the existing encoding.

2 msb of adjustment needed to convert
extra 32-bit sign-extended 32-bit tv_sec
bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec
1 1 1 -0x80000000..-1 0
0 0 0 0x000000000..0x07fffffff 0
0 0 1 0x080000000..0x0ffffffff 0x100000000
0 1 0 0x100000000..0x17fffffff 0x100000000
0 1 1 0x180000000..0x1ffffffff 0x200000000
1 0 0 0x200000000..0x27fffffff 0x200000000
1 0 1 0x280000000..0x2ffffffff 0x300000000
1 1 0 0x300000000..0x37fffffff 0x300000000

- Mark

2011-06-24 15:04:14

by Andreas Dilger

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

On 2011-06-23, at 1:52 AM, Akira Fujita wrote:
> ext4: Fix ext4 timestamps corruption

Hi Akira-san,
thank you for the patch. For submitting patches, it is easier for
Ted if you send the patch in a separate email with a proper subject
(e.g. "[PATCH] ext4: Fix ext4 timestamp > 2038 corruption" from
instead of containing the reply to an old email. Otherwise Ted has
to manually reformat the patch.

> Officially, ext4 can handle its timestamps until 2514 with 32bit
> entries plus EPOCH_BIT (2bits). But when timestamps values use
> 32+ bit (e.g. 2038-01-19 9:14:08 0x0000000080000000), we can get
> corrupted values. Because sign bit is overwritten by transferring
> value between kernel space and user space.
>
> To fix this issue, 32th bit of extra time fields in ext4_inode structure
> (e.g. i_ctime_extra) are used as the sign for 64bit user space.
> Because these are used only 20bits for nano-second and bottom of 2bits

This should be "30 bits for nanosecond".

> are for EXT4_EPOCH_BITS shift.
> With this patch, ext4 supports timestamps Y1901-2514.
>
> The performance comparison is as follows:
> ------------------------------------------------
> | | file create | ls -l |
> |--------------|---------------|----------------
> |with patch | 138.2056 sec | 14.9333 sec |
> |without patch | 133.4566 sec | 14.9096 sec |
> ------------------------------------------------
> file count:1 million (average of 3 trials)
> kernel: 3.0.0-rc3
>
> There is a slightly difference, but I think it is acceptable.

I'm surprised that the difference is even measurable for such a
change.

> Thanks and Regards,
> Akira Fujita
>
> Signed-off-by: Akira Fujita <[email protected]>
> ---
> fs/ext4/ext4.h | 30 +++++++++++++++++++++---------
> 1 file changed, 21 insertions(+), 9 deletions(-)
>
> diff -Nrup -X linux-3.0-rc3-a/Documentation/dontdiff linux-3.0-rc3-a/fs/ext4/ext4.h linux-3.0-rc3-b/fs/ext4/ext4.h
> --- linux-3.0-rc3-a/fs/ext4/ext4.h 2011-06-14 07:29:59.000000000 +0900
> +++ linux-3.0-rc3-b/fs/ext4/ext4.h 2011-06-23 14:18:47.000000000 +0900
> @@ -645,6 +645,7 @@ struct move_extent {
> #define EXT4_EPOCH_BITS 2
> #define EXT4_EPOCH_MASK ((1 << EXT4_EPOCH_BITS) - 1)
> #define EXT4_NSEC_MASK (~0UL << EXT4_EPOCH_BITS)
> +#define EXT4_TIMESTAMP_SIGN_MASK 0x80000000
>
> /*
> * Extended fields will fit into an inode if the filesystem was formatted
> @@ -665,16 +666,23 @@ struct move_extent {
> static inline __le32 ext4_encode_extra_time(struct timespec *time)
> {
> + return cpu_to_le32((sizeof(time->tv_sec) > 4 ?
> + (time->tv_sec >> 32) &
> + (EXT4_EPOCH_MASK | EXT4_TIMESTAMP_SIGN_MASK) :
> + time->tv_sec & EXT4_TIMESTAMP_SIGN_MASK) |
> + ((time->tv_nsec << EXT4_EPOCH_BITS) & EXT4_NSEC_MASK));
> }

As mentioned by Mark, this cannot work because the high bit is already used
for the most significant bit of the nanoseconds.

> static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
> {
> + if (sizeof(time->tv_sec) > 4) {
> time->tv_sec |= (__u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK)
> << 32;
> + if (le32_to_cpu(extra) & EXT4_TIMESTAMP_SIGN_MASK)
> + time->tv_sec |= EXT4_NSEC_MASK << 32;
> + }
> +
> + time->tv_nsec = ((le32_to_cpu(extra) & ~EXT4_TIMESTAMP_SIGN_MASK) >>
> + EXT4_EPOCH_BITS);
> }
>
> #define EXT4_INODE_SET_XTIME(xtime, inode, raw_inode) \
> @@ -696,19 +704,23 @@ do { \
>
> #define EXT4_INODE_GET_XTIME(xtime, inode, raw_inode) \
> do { \
> - (inode)->xtime.tv_sec = (signed)le32_to_cpu((raw_inode)->xtime); \
> - if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) \
> + if (EXT4_FITS_IN_INODE(raw_inode, EXT4_I(inode), xtime ## _extra)) { \
> + (inode)->xtime.tv_sec = \
> + (__u32)(le32_to_cpu((raw_inode)->xtime)); \
> ext4_decode_extra_time(&(inode)->xtime, \
> raw_inode->xtime ## _extra); \
> - else \
> + } else { \
> + (inode)->xtime.tv_sec = \
> + (signed)le32_to_cpu((raw_inode)->xtime); \
> (inode)->xtime.tv_nsec = 0; \
> + } \
> } while (0)
>
> #define EXT4_EINODE_GET_XTIME(xtime, einode, raw_inode) \
> do { \
> if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime)) \
> - (einode)->xtime.tv_sec = \
> - (signed)le32_to_cpu((raw_inode)->xtime); \
> + (einode)->xtime.tv_sec = \
> + (__u32)(le32_to_cpu((raw_inode)->xtime)); \
> if (EXT4_FITS_IN_INODE(raw_inode, einode, xtime ## _extra)) \
> ext4_decode_extra_time(&(einode)->xtime, \
> raw_inode->xtime ## _extra); \


Cheers, Andreas






2011-06-24 22:46:47

by Andreas Dilger

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

On 2011-06-23, at 4:37 PM, Mark Harris wrote:
> 2011/6/23 Akira Fujita <[email protected]>:
>>> https://bugzilla.kernel.org/show_bug.cgi?id=23732
>>
>> ext4: Fix ext4 timestamps corruption
>>
>> Officially, ext4 can handle its timestamps until 2514
>> with 32bit entries plus EPOCH_BIT (2bits).
>> But when timestamps values use 32+ bit
>> (e.g. 2038-01-19 9:14:08 0x0000000080000000),
>> we can get corrupted values.
>> Because sign bit is overwritten by transferring value
>> between kernel space and user space.
>>
>> To fix this issue, 32th bit of extra time fields in ext4_inode structure
>> (e.g. i_ctime_extra) are used as the sign for 64bit user space.
>> Because these are used only 20bits for nano-second and bottom of 2bits
>> are for EXT4_EPOCH_BITS shift.
>> With this patch, ext4 supports timestamps Y1901-2514.
>
> Thanks for looking into this bug. However tv_nsec is in the
> range 0..999999999 and requires 30 bits. That is why tv_sec was
> only extended by 2 bits. So there are no additional spare bits
> in the "extra" field.
>
> 34-bit seconds can accommodate a maximum of 544.4 years, e.g.
> 1970..2514 or 1901..2446. Although an early version of the patch
> for 34-bit tv_sec in ext4 worked with years 1970..2514, prior to
> being committed the patch was changed to support pre-1970
> timestamps (introducing the sign extension issue in the decoding):
> http://marc.info/?l=linux-ext4&m=118208541320999

Sigh, it seems so unlikely that anyone even has a valid timestamp
on a file with a date before 1970, it makes me wonder if this extra
effort is even worthwhile...

> The existing encoding simply encodes bits 31..0 of tv_sec in the
> regular time field and bits 33..32 in the extra field (along with
> the 30-bit tv_nsec). The issue is in the decoding, which I think
> can be addressed by changing only the body of the "if" in in the
> ext4_decode_extra_time function, to something like this:
>
> time->tv_sec += ((__u32)time->tv_sec +
> ((__u64)le32_to_cpu(extra) << 32) +
> 0x80000000LL) & 0x300000000LL;
>
> This is untested, and might look nicer with some macros, but
> it should decode the 34 bits into a timestamp in the range
> -0x80000000 (1901-12-13) to 0x37fffffff (2446-05-10), while
> retaining compatibility with the existing encoding.
>
> 2 msb of adjustment needed to convert
> extra 32-bit sign-extended 32-bit tv_sec
> bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec
> 1 1 1 -0x80000000..-1 0
> 0 0 0 0x000000000..0x07fffffff 0
> 0 0 1 0x080000000..0x0ffffffff 0x100000000
> 0 1 0 0x100000000..0x17fffffff 0x100000000
> 0 1 1 0x180000000..0x1ffffffff 0x200000000
> 1 0 0 0x200000000..0x27fffffff 0x200000000
> 1 0 1 0x280000000..0x2ffffffff 0x300000000
> 1 1 0 0x300000000..0x37fffffff 0x300000000

The problem with this encoding is that it requires existing 32-bit
timestamps before 1970 to have encoded "11" in the extra epoch bits,
which is not the case. Current pre-1970 timestamps would be encoded
with "00" there, which would (according to your table) bump them past
2038 incorrectly.

What we need is an encoding that preserves the times for extra epoch "00":

2 msb of adjustment needed to convert
extra 32-bit sign-extended 32-bit tv_sec
bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec
0 0 1 -0x80000000..-1 0
0 0 0 0x000000000..0x07fffffff 0
0 1 1 0x080000000..0x0ffffffff 0x100000000
0 1 0 0x100000000..0x17fffffff 0x100000000
1 0 1 0x180000000..0x1ffffffff 0x200000000
1 0 0 0x200000000..0x27fffffff 0x200000000
1 1 1 0x280000000..0x2ffffffff 0x300000000
1 1 0 0x300000000..0x37fffffff 0x300000000

So, looking at the above desired encoding, it looks like the error in
the existing code is that it is doing a boolean operation on decode
instead of a mathematical one, and it was incorrectly trying to extend
the time to (1ULL<<34). The below should fix this:

static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
{
if (unlikely(sizeof(time->tv_sec) > 4 &&
(extra & cpu_to_le32(EXT4_EPOCH_MASK)))
time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;

time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
}

Cheers, Andreas






2011-06-25 05:12:14

by Mark Harris

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

On Fri, Jun 24, 2011 at 15:46, Andreas Dilger wrote:
> On 2011-06-23, at 4:37 PM, Mark Harris wrote:
>> 2011/6/23 Akira Fujita <[email protected]>:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=23732
>>>
>>> ext4: Fix ext4 timestamps corruption
>>>
>>> Officially, ext4 can handle its timestamps until 2514
>>> with 32bit entries plus EPOCH_BIT (2bits).
>>> But when timestamps values use 32+ bit
>>> (e.g. 2038-01-19 9:14:08 0x0000000080000000),
>>> we can get corrupted values.
>>> Because sign bit is overwritten by transferring value
>>> between kernel space and user space.
>>>
>>> To fix this issue, 32th bit of extra time fields in ext4_inode structure
>>> (e.g. i_ctime_extra) are used as the sign for 64bit user space.
>>> Because these are used only 20bits for nano-second and bottom of 2bits
>>> are for EXT4_EPOCH_BITS shift.
>>> With this patch, ext4 supports timestamps Y1901-2514.
>>
>> Thanks for looking into this bug. ?However tv_nsec is in the
>> range 0..999999999 and requires 30 bits. ?That is why tv_sec was
>> only extended by 2 bits. ?So there are no additional spare bits
>> in the "extra" field.
>>
>> 34-bit seconds can accommodate a maximum of 544.4 years, e.g.
>> 1970..2514 or 1901..2446. ?Although an early version of the patch
>> for 34-bit tv_sec in ext4 worked with years 1970..2514, prior to
>> being committed the patch was changed to support pre-1970
>> timestamps (introducing the sign extension issue in the decoding):
>> ?http://marc.info/?l=linux-ext4&m=118208541320999
>
> Sigh, it seems so unlikely that anyone even has a valid timestamp
> on a file with a date before 1970, it makes me wonder if this extra
> effort is even worthwhile...
>
>> The existing encoding simply encodes bits 31..0 of tv_sec in the
>> regular time field and bits 33..32 in the extra field (along with
>> the 30-bit tv_nsec). ?The issue is in the decoding, which I think
>> can be addressed by changing only the body of the "if" in in the
>> ext4_decode_extra_time function, to something like this:
>>
>> ? ? ? ?time->tv_sec += ((__u32)time->tv_sec +
>> ? ? ? ? ? ? ? ?((__u64)le32_to_cpu(extra) << 32) +
>> ? ? ? ? ? ? ? ?0x80000000LL) & 0x300000000LL;
>>
>> This is untested, and might look nicer with some macros, but
>> it should decode the 34 bits into a timestamp in the range
>> -0x80000000 (1901-12-13) to 0x37fffffff (2446-05-10), while
>> retaining compatibility with the existing encoding.
>>
>> ? 2 ? ?msb of ? ? ? ? ? ? ? ? ? ? ? ? adjustment needed to convert
>> extra ?32-bit ? ? ? ? ? ? ? ? ? ? ? ? sign-extended 32-bit tv_sec
>> ?bits ? time ? decoded 64-bit tv_sec ? to decoded 64-bit tv_sec
>> ?1 1 ? ? 1 ? ?-0x80000000..-1 ? ? ? ? ? 0
>> ?0 0 ? ? 0 ? ?0x000000000..0x07fffffff ?0
>> ?0 0 ? ? 1 ? ?0x080000000..0x0ffffffff ?0x100000000
>> ?0 1 ? ? 0 ? ?0x100000000..0x17fffffff ?0x100000000
>> ?0 1 ? ? 1 ? ?0x180000000..0x1ffffffff ?0x200000000
>> ?1 0 ? ? 0 ? ?0x200000000..0x27fffffff ?0x200000000
>> ?1 0 ? ? 1 ? ?0x280000000..0x2ffffffff ?0x300000000
>> ?1 1 ? ? 0 ? ?0x300000000..0x37fffffff ?0x300000000
>
> The problem with this encoding is that it requires existing 32-bit
> timestamps before 1970 to have encoded "11" in the extra epoch bits,
> which is not the case. ?Current pre-1970 timestamps would be encoded
> with "00" there, which would (according to your table) bump them past
> 2038 incorrectly.

I was under the impression that the encoding code stored bits
33 & 32 of tv_sec there, which would be 1,1 for a negative value
like -1. Certainly the decoding would be simpler if the extra
value was only non-zero for large timestamps.

On closer inspection of ext4_encode_extra_time, it looks like for
tv_sec = -1, a 64-bit kernel will store 1,1 in the extra bits and
a 32-bit kernel will store 0,0 in the extra bits. That is a problem
if both of these need to be decoded as -1 on a 64-bit system.

>
> What we need is an encoding that preserves the times for extra epoch "00":
>
> ?2 ? ?msb of ? ? ? ? ? ? ? ? ? ? ? ? adjustment needed to convert
> extra ?32-bit ? ? ? ? ? ? ? ? ? ? ? ? sign-extended 32-bit tv_sec
> ?bits ? time ? decoded 64-bit tv_sec ? to decoded 64-bit tv_sec
> ?0 0 ? ? 1 ? ?-0x80000000..-1 ? ? ? ? ? 0
> ?0 0 ? ? 0 ? ?0x000000000..0x07fffffff ?0
> ?0 1 ? ? 1 ? ?0x080000000..0x0ffffffff ?0x100000000
> ?0 1 ? ? 0 ? ?0x100000000..0x17fffffff ?0x100000000
> ?1 0 ? ? 1 ? ?0x180000000..0x1ffffffff ?0x200000000
> ?1 0 ? ? 0 ? ?0x200000000..0x27fffffff ?0x200000000
> ?1 1 ? ? 1 ? ?0x280000000..0x2ffffffff ?0x300000000
> ?1 1 ? ? 0 ? ?0x300000000..0x37fffffff ?0x300000000
>
> So, looking at the above desired encoding, it looks like the error in
> the existing code is that it is doing a boolean operation on decode
> instead of a mathematical one, and it was incorrectly trying to extend
> the time to (1ULL<<34). ?The below should fix this:
>
> static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
> {
> ? ? ? ?if (unlikely(sizeof(time->tv_sec) > 4 &&
> ? ? ? ? ? ? ? ? ? ? (extra & cpu_to_le32(EXT4_EPOCH_MASK)))
> ? ? ? ? ? ? ? ?time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
>
> ? ? ? ?time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
> }

That is not compatible with the existing ext4_encode_extra_time.
For example, 2038-01-31 (0x80101500) is encoded with extra bits
equal to bits 33 & 32, i.e. 0,0. But this code would decode it
as 1901-12-25 (i.e. it would leave the sign-extended 32-bit value
unchanged).

Possible solutions:

(a) Define the current 64-bit encoding as the correct encoding since
the 2 extra bits are not even decoded on 32-bit kernels, so its
encoding doesn't matter much. However, if anyone with existing
pre-1970 timestamps written using a 32-bit kernel wants to use
their ext4 filesystem with a 64-bit kernel, the pre-1970
timestamps would be wrong unless they re-write them with a
fixed kernel.

Change ext4_decode_extra_time "if" body to something like:
time->tv_sec += ((__u32)time->tv_sec +
((__u64)le32_to_cpu(extra) << 32) +
0x80000000LL) & 0x300000000LL;

Change ext4_encode_extra_time ": 0" to something like:
time->tv_sec < 0 ? EXT4_EPOCH_MASK : 0

(b) Change the encoding of the extra bits to be those in your new
table. This is compatible with the 32-bit kernel encoding
(which does not decode these bits) but incompatible with the
64-bit kernel encoding. Existing pre-1970 timestamps written
with a 64-bit kernel would be decoded as dates far in the future.

Requires your change to ext4_decode_extra_time.
Also requires a change to ext4_encode_extra_time, changing
(time->tv_sec >> 32) to something like:
((time->tv_sec - (signed int)time->tv_sec) >> 32)

(c) If 100% compatibility with existing pre-1970 32-bit timestamps
is desired even when switching between 32-bit and 64-bit kernels,
both extra=1,1/msb=1 and extra=0,0/msb=1 could be treated as year
1901..1969 timestamps. However this would reduce the maximum
64-bit ext4 timestamp, and would necessarily be incompatible with
the existing 64-bit kernel encoding of timestamps > year 2038
(since a current 64-bit kernel encodes a year 2039 timestamp
exactly the same as a current 32-bit kernel encodes a year 1902
timestamp).

This requires additional complexity in both ext4_decode_extra_time
and ext4_encode_extra_time.

(d) Declare that ext4 supports only timestamps with year >= 1970.
i.e. 1970..2514 (64-bit), 1970..2038 (32-bit).
Any existing pre-1970 timestamps would now be interpreted as a
year >= 2038 timestamp on 64-bit kernels.

It may be possible for users of 32-bit kernels to continue to
successfully read and write 1901..1969 timestamps, but this
would have to be unsupported. If such a timestamp was read with
a 64-bit kernel, or a program like fsck.ext4, the time may be
different.

If some day, as 2038 approaches, 32-bit time_t is changed to
unsigned, ext4 would once again support all 32-bit time_t
values.

To implement, the decoding can simply drop all casts to (signed).
Optionally, the encoding could encode any negative tv_sec as 0
to make 32-bit and 64-bit behavior for pre-1970 timestamps
consistent (bugzilla 5079/8643). However this may break some
uses of pre-1970 timestamps that would otherwise work on
32-bit kernels.

- Mark

>
> Cheers, Andreas

2011-06-27 09:04:10

by Andreas Dilger

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

On 2011-06-24, at 11:12 PM, Mark Harris wrote:
> On Fri, Jun 24, 2011 at 15:46, Andreas Dilger wrote:
>> The problem with this encoding is that it requires existing 32-bit
>> timestamps before 1970 to have encoded "11" in the extra epoch bits,
>> which is not the case. Current pre-1970 timestamps would be encoded
>> with "00" there, which would (according to your table) bump them past
>> 2038 incorrectly.
>
> I was under the impression that the encoding code stored bits
> 33 & 32 of tv_sec there, which would be 1,1 for a negative value
> like -1. Certainly the decoding would be simpler if the extra
> value was only non-zero for large timestamps.

One problem with a symmetrical encoding is that it wastes half of the
dynamic range for values that nobody will ever use. Even values before
1970 seem so unlikely that I question whether we should support them
at all.

> On closer inspection of ext4_encode_extra_time, it looks like for
> tv_sec = -1, a 64-bit kernel will store 1,1 in the extra bits and
> a 32-bit kernel will store 0,0 in the extra bits. That is a problem
> if both of these need to be decoded as -1 on a 64-bit system.

That is definitely a problem.

>> What we need is an encoding that preserves the times for extra epoch "00":
>>
>> 2 msb of adjustment needed to convert
>> extra 32-bit sign-extended 32-bit tv_sec
>> bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec
>> 0 0 1 -0x80000000..-1 0
>> 0 0 0 0x000000000..0x07fffffff 0
>> 0 1 1 0x080000000..0x0ffffffff 0x100000000
>> 0 1 0 0x100000000..0x17fffffff 0x100000000
>> 1 0 1 0x180000000..0x1ffffffff 0x200000000
>> 1 0 0 0x200000000..0x27fffffff 0x200000000
>> 1 1 1 0x280000000..0x2ffffffff 0x300000000
>> 1 1 0 0x300000000..0x37fffffff 0x300000000
>>
>> So, looking at the above desired encoding, it looks like the error in
>> the existing code is that it is doing a boolean operation on decode
>> instead of a mathematical one, and it was incorrectly trying to extend
>> the time to (1ULL<<34). The below should fix this:
>>
>> static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
>> {
>> if (unlikely(sizeof(time->tv_sec) > 4 &&
>> (extra & cpu_to_le32(EXT4_EPOCH_MASK)))
>> time->tv_sec += (u64)(le32_to_cpu(extra) & EXT4_EPOCH_MASK) << 32;
>>
>> time->tv_nsec = (le32_to_cpu(extra) & EXT4_NSEC_MASK) >> EXT4_EPOCH_BITS;
>> }
>
> That is not compatible with the existing ext4_encode_extra_time.
> For example, 2038-01-31 (0x80101500) is encoded with extra bits
> equal to bits 33 & 32, i.e. 0,0. But this code would decode it
> as 1901-12-25 (i.e. it would leave the sign-extended 32-bit value
> unchanged).

Part of the problem is that the encoding/decoding of timestamps beyond
2038 is already broken today, so I don't think anyone has been using
them so far. This gives us some leeway for fixing this problem I think.

> Possible solutions:
>
> (a) Define the current 64-bit encoding as the correct encoding since
> the 2 extra bits are not even decoded on 32-bit kernels, so its
> encoding doesn't matter much. However, if anyone with existing
> pre-1970 timestamps written using a 32-bit kernel wants to use
> their ext4 filesystem with a 64-bit kernel, the pre-1970
> timestamps would be wrong unless they re-write them with a
> fixed kernel.
>
> Change ext4_decode_extra_time "if" body to something like:
> time->tv_sec += ((__u32)time->tv_sec +
> ((__u64)le32_to_cpu(extra) << 32) +
> 0x80000000LL) & 0x300000000LL;
>
> Change ext4_encode_extra_time ": 0" to something like:
> time->tv_sec < 0 ? EXT4_EPOCH_MASK : 0

The real-world problem isn't with 32-bit systems, where it doesn't
really matter at all how time is encoded, nor with files on 64-bit systems
with timestamps 26 years in the future, but rather 256-byte inodes that
were previously written with ext3 that will break if they are mounted
with ext4 on a 64-bit system.

> (b) Change the encoding of the extra bits to be those in your new
> table. This is compatible with the 32-bit kernel encoding
> (which does not decode these bits) but incompatible with the
> 64-bit kernel encoding. Existing pre-1970 timestamps written
> with a 64-bit kernel would be decoded as dates far in the future.
>
> Requires your change to ext4_decode_extra_time.
> Also requires a change to ext4_encode_extra_time, changing
> (time->tv_sec >> 32) to something like:
> ((time->tv_sec - (signed int)time->tv_sec) >> 32)

I think this is a reasonable solution, but I dislike that it breaks
pre-1970 timestamps on 64-bit systems.

> (c) If 100% compatibility with existing pre-1970 32-bit timestamps
> is desired even when switching between 32-bit and 64-bit kernels,
> both extra=1,1/msb=1 and extra=0,0/msb=1 could be treated as year
> 1901..1969 timestamps. However this would reduce the maximum
> 64-bit ext4 timestamp, and would necessarily be incompatible with
> the existing 64-bit kernel encoding of timestamps > year 2038
> (since a current 64-bit kernel encodes a year 2039 timestamp
> exactly the same as a current 32-bit kernel encodes a year 1902
> timestamp).
>
> This requires additional complexity in both ext4_decode_extra_time
> and ext4_encode_extra_time.

This would also be a good option for the short term, and then have e2fsck
fix up the "11" encoded pre-1970 times to use "00", or just allow both
to work. This cuts 136 years off the range (2310-04-04..2446-05-10) but
to be honest I don't think that will matter very much.

> (d) Declare that ext4 supports only timestamps with year >= 1970.
> i.e. 1970..2514 (64-bit), 1970..2038 (32-bit).
> Any existing pre-1970 timestamps would now be interpreted as a
> year >= 2038 timestamp on 64-bit kernels.
>
> It may be possible for users of 32-bit kernels to continue to
> successfully read and write 1901..1969 timestamps, but this
> would have to be unsupported. If such a timestamp was read with
> a 64-bit kernel, or a program like fsck.ext4, the time may be
> different.

I'm not so fond of this solution either.

> If some day, as 2038 approaches, 32-bit time_t is changed to
> unsigned, ext4 would once again support all 32-bit time_t
> values.
>
> To implement, the decoding can simply drop all casts to (signed).
> Optionally, the encoding could encode any negative tv_sec as 0
> to make 32-bit and 64-bit behavior for pre-1970 timestamps
> consistent (bugzilla 5079/8643). However this may break some
> uses of pre-1970 timestamps that would otherwise work on
> 32-bit kernels.

Hopefully Ted can chime in on this as well.

Cheers, Andreas






2011-07-05 10:52:10

by Akira Fujita

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

Hi, Andreas and Mark,
Thank you for looking at this issue.

(2011/06/27 18:04), Andreas Dilger wrote:
> On 2011-06-24, at 11:12 PM, Mark Harris wrote:
>> On Fri, Jun 24, 2011 at 15:46, Andreas Dilger wrote:
>>> The problem with this encoding is that it requires existing 32-bit
>>> timestamps before 1970 to have encoded "11" in the extra epoch bits,
>>> which is not the case. Current pre-1970 timestamps would be encoded
>>> with "00" there, which would (according to your table) bump them past
>>> 2038 incorrectly.
>>
>> I was under the impression that the encoding code stored bits
>> 33& 32 of tv_sec there, which would be 1,1 for a negative value
>> like -1. Certainly the decoding would be simpler if the extra
>> value was only non-zero for large timestamps.
>
> One problem with a symmetrical encoding is that it wastes half of the
> dynamic range for values that nobody will ever use. Even values before
> 1970 seem so unlikely that I question whether we should support them
> at all.
>
>> On closer inspection of ext4_encode_extra_time, it looks like for
>> tv_sec = -1, a 64-bit kernel will store 1,1 in the extra bits and
>> a 32-bit kernel will store 0,0 in the extra bits. That is a problem
>> if both of these need to be decoded as -1 on a 64-bit system.
>
> That is definitely a problem.
>
>>> What we need is an encoding that preserves the times for extra epoch "00":
>>>
>>> 2 msb of adjustment needed to convert
>>> extra 32-bit sign-extended 32-bit tv_sec
>>> bits time decoded 64-bit tv_sec to decoded 64-bit tv_sec
>>> 0 0 1 -0x80000000..-1 0
>>> 0 0 0 0x000000000..0x07fffffff 0
>>> 0 1 1 0x080000000..0x0ffffffff 0x100000000
>>> 0 1 0 0x100000000..0x17fffffff 0x100000000
>>> 1 0 1 0x180000000..0x1ffffffff 0x200000000
>>> 1 0 0 0x200000000..0x27fffffff 0x200000000
>>> 1 1 1 0x280000000..0x2ffffffff 0x300000000
>>> 1 1 0 0x300000000..0x37fffffff 0x300000000
>>>
>>> So, looking at the above desired encoding, it looks like the error in
>>> the existing code is that it is doing a boolean operation on decode
>>> instead of a mathematical one, and it was incorrectly trying to extend
>>> the time to (1ULL<<34). The below should fix this:
>>>
>>> static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
>>> {
>>> if (unlikely(sizeof(time->tv_sec)> 4&&
>>> (extra& cpu_to_le32(EXT4_EPOCH_MASK)))
>>> time->tv_sec += (u64)(le32_to_cpu(extra)& EXT4_EPOCH_MASK)<< 32;
>>>
>>> time->tv_nsec = (le32_to_cpu(extra)& EXT4_NSEC_MASK)>> EXT4_EPOCH_BITS;
>>> }
>>
>> That is not compatible with the existing ext4_encode_extra_time.
>> For example, 2038-01-31 (0x80101500) is encoded with extra bits
>> equal to bits 33& 32, i.e. 0,0. But this code would decode it
>> as 1901-12-25 (i.e. it would leave the sign-extended 32-bit value
>> unchanged).
>
> Part of the problem is that the encoding/decoding of timestamps beyond
> 2038 is already broken today, so I don't think anyone has been using
> them so far. This gives us some leeway for fixing this problem I think.
>
>> Possible solutions:
>>
>> (a) Define the current 64-bit encoding as the correct encoding since
>> the 2 extra bits are not even decoded on 32-bit kernels, so its
>> encoding doesn't matter much. However, if anyone with existing
>> pre-1970 timestamps written using a 32-bit kernel wants to use
>> their ext4 filesystem with a 64-bit kernel, the pre-1970
>> timestamps would be wrong unless they re-write them with a
>> fixed kernel.
>>
>> Change ext4_decode_extra_time "if" body to something like:
>> time->tv_sec += ((__u32)time->tv_sec +
>> ((__u64)le32_to_cpu(extra)<< 32) +
>> 0x80000000LL)& 0x300000000LL;
>>
>> Change ext4_encode_extra_time ": 0" to something like:
>> time->tv_sec< 0 ? EXT4_EPOCH_MASK : 0
>
> The real-world problem isn't with 32-bit systems, where it doesn't
> really matter at all how time is encoded, nor with files on 64-bit systems
> with timestamps 26 years in the future, but rather 256-byte inodes that
> were previously written with ext3 that will break if they are mounted
> with ext4 on a 64-bit system.
>
>> (b) Change the encoding of the extra bits to be those in your new
>> table. This is compatible with the 32-bit kernel encoding
>> (which does not decode these bits) but incompatible with the
>> 64-bit kernel encoding. Existing pre-1970 timestamps written
>> with a 64-bit kernel would be decoded as dates far in the future.
>>
>> Requires your change to ext4_decode_extra_time.
>> Also requires a change to ext4_encode_extra_time, changing
>> (time->tv_sec>> 32) to something like:
>> ((time->tv_sec - (signed int)time->tv_sec)>> 32)
>
> I think this is a reasonable solution, but I dislike that it breaks
> pre-1970 timestamps on 64-bit systems.

I agree with this solution.
I guess that no one has pre-1970 timestamps on ext4, actually.

Mark, are you working on this right now?
If you have a patch to fix this issue, please send it to the list.

Regards,
Akira Fujita

2011-07-08 17:06:29

by Mark Harris

[permalink] [raw]
Subject: Re: [BUG] ext4 timestamps corruption

Akira Fujita wrote:
> Hi, Andreas and Mark,
> Thank you for looking at this issue.
>
> (2011/06/27 18:04), Andreas Dilger wrote:
>> On 2011-06-24, at 11:12 PM, Mark Harris wrote:
>>> On Fri, Jun 24, 2011 at 15:46, Andreas Dilger wrote:
>>>> The problem with this encoding is that it requires existing 32-bit
>>>> timestamps before 1970 to have encoded "11" in the extra epoch bits,
>>>> which is not the case. ?Current pre-1970 timestamps would be encoded
>>>> with "00" there, which would (according to your table) bump them past
>>>> 2038 incorrectly.
>>>
>>> I was under the impression that the encoding code stored bits
>>> 33& ?32 of tv_sec there, which would be 1,1 for a negative value
>>> like -1. ?Certainly the decoding would be simpler if the extra
>>> value was only non-zero for large timestamps.
>>
>> One problem with a symmetrical encoding is that it wastes half of the
>> dynamic range for values that nobody will ever use. ?Even values before
>> 1970 seem so unlikely that I question whether we should support them
>> at all.
>>
>>> On closer inspection of ext4_encode_extra_time, it looks like for
>>> tv_sec = -1, a 64-bit kernel will store 1,1 in the extra bits and
>>> a 32-bit kernel will store 0,0 in the extra bits. ?That is a problem
>>> if both of these need to be decoded as -1 on a 64-bit system.
>>
>> That is definitely a problem.
>>
>>>> What we need is an encoding that preserves the times for extra epoch "00":
>>>>
>>>> ? 2 ? ?msb of ? ? ? ? ? ? ? ? ? ? ? ? adjustment needed to convert
>>>> extra ?32-bit ? ? ? ? ? ? ? ? ? ? ? ? sign-extended 32-bit tv_sec
>>>> ? bits ? time ? decoded 64-bit tv_sec ? to decoded 64-bit tv_sec
>>>> ? 0 0 ? ? 1 ? ?-0x80000000..-1 ? ? ? ? ? 0
>>>> ? 0 0 ? ? 0 ? ?0x000000000..0x07fffffff ?0
>>>> ? 0 1 ? ? 1 ? ?0x080000000..0x0ffffffff ?0x100000000
>>>> ? 0 1 ? ? 0 ? ?0x100000000..0x17fffffff ?0x100000000
>>>> ? 1 0 ? ? 1 ? ?0x180000000..0x1ffffffff ?0x200000000
>>>> ? 1 0 ? ? 0 ? ?0x200000000..0x27fffffff ?0x200000000
>>>> ? 1 1 ? ? 1 ? ?0x280000000..0x2ffffffff ?0x300000000
>>>> ? 1 1 ? ? 0 ? ?0x300000000..0x37fffffff ?0x300000000
>>>>
>>>> So, looking at the above desired encoding, it looks like the error in
>>>> the existing code is that it is doing a boolean operation on decode
>>>> instead of a mathematical one, and it was incorrectly trying to extend
>>>> the time to (1ULL<<34). ?The below should fix this:
>>>>
>>>> static inline void ext4_decode_extra_time(struct timespec *time, __le32 extra)
>>>> {
>>>> ? ? ? ? if (unlikely(sizeof(time->tv_sec)> ?4&&
>>>> ? ? ? ? ? ? ? ? ? ? ?(extra& ?cpu_to_le32(EXT4_EPOCH_MASK)))
>>>> ? ? ? ? ? ? ? ? time->tv_sec += (u64)(le32_to_cpu(extra)& ?EXT4_EPOCH_MASK)<< ?32;
>>>>
>>>> ? ? ? ? time->tv_nsec = (le32_to_cpu(extra)& ?EXT4_NSEC_MASK)>> ?EXT4_EPOCH_BITS;
>>>> }
>>>
>>> That is not compatible with the existing ext4_encode_extra_time.
>>> For example, 2038-01-31 (0x80101500) is encoded with extra bits
>>> equal to bits 33& ?32, i.e. 0,0. ?But this code would decode it
>>> as 1901-12-25 (i.e. it would leave the sign-extended 32-bit value
>>> unchanged).
>>
>> Part of the problem is that the encoding/decoding of timestamps beyond
>> 2038 is already broken today, so I don't think anyone has been using
>> them so far. ?This gives us some leeway for fixing this problem I think.
>>
>>> Possible solutions:
>>>
>>> (a) Define the current 64-bit encoding as the correct encoding since
>>> ? ? ?the 2 extra bits are not even decoded on 32-bit kernels, so its
>>> ? ? ?encoding doesn't matter much. ?However, if anyone with existing
>>> ? ? ?pre-1970 timestamps written using a 32-bit kernel wants to use
>>> ? ? ?their ext4 filesystem with a 64-bit kernel, the pre-1970
>>> ? ? ?timestamps would be wrong unless they re-write them with a
>>> ? ? ?fixed kernel.
>>>
>>> ? ? ?Change ext4_decode_extra_time "if" body to something like:
>>> ? ? ? ? time->tv_sec += ((__u32)time->tv_sec +
>>> ? ? ? ? ? ? ? ? ((__u64)le32_to_cpu(extra)<< ?32) +
>>> ? ? ? ? ? ? ? ? 0x80000000LL)& ?0x300000000LL;
>>>
>>> ? ? ?Change ext4_encode_extra_time ": 0" to something like:
>>> ? ? ? ? time->tv_sec< ?0 ? EXT4_EPOCH_MASK : 0
>>
>> The real-world problem isn't with 32-bit systems, where it doesn't
>> really matter at all how time is encoded, nor with files on 64-bit systems
>> with timestamps 26 years in the future, but rather 256-byte inodes that
>> were previously written with ext3 that will break if they are mounted
>> with ext4 on a 64-bit system.
>>
>>> (b) Change the encoding of the extra bits to be those in your new
>>> ? ? ?table. ?This is compatible with the 32-bit kernel encoding
>>> ? ? ?(which does not decode these bits) but incompatible with the
>>> ? ? ?64-bit kernel encoding. ?Existing pre-1970 timestamps written
>>> ? ? ?with a 64-bit kernel would be decoded as dates far in the future.
>>>
>>> ? ? ?Requires your change to ext4_decode_extra_time.
>>> ? ? ?Also requires a change to ext4_encode_extra_time, changing
>>> ? ? ?(time->tv_sec>> ?32) to something like:
>>> ? ? ? ? ((time->tv_sec - (signed int)time->tv_sec)>> ?32)
>>
>> I think this is a reasonable solution, but I dislike that it breaks
>> pre-1970 timestamps on 64-bit systems.
>
> I agree with this solution.
> I guess that no one has pre-1970 timestamps on ext4, actually.
>
> Mark, are you working on this right now?
> If you have a patch to fix this issue, please send it to the list.

I think that all of the code changes needed to implement this
solution (b) are in ext4_decode_extra_time and ext4_encode_extra_time
and are included above, if you want to submit them in patch format
as a new version of your patch. The issue with this solution is
that it breaks existing 1901..1969 timestamps written with a 64-bit
kernel to ext4 with 256-byte inodes. (It breaks existing year 2038+
timestamps as well, but those are already broken.) It sounds like
Andreas favors either (b) or (c) but would like to hear from Ted.

- Mark

>
> Regards,
> Akira Fujita
>