LinuxLists.cc - [sparc64] sigbus in e2fsck

2016-08-30 12:42:22

Subject: [sparc64] sigbus in e2fsck

Hello!

I'm getting SIGBUS (unaligned access?) in fsck.ext4 / e2fsck running
on debian sparc64 sid/unstable linux:

root@ttip:/home/mator/e2fsprogs# git describe
v1.43.1-14-g9a23fa8
root@ttip:/home/mator/e2fsprogs# git remote -v
origin git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git (fetch)
root@ttip:/home/mator/e2fsprogs/build/e2fsck# gdb -q
(gdb) file ./e2fsck
Reading symbols from ./e2fsck...done.
(gdb) set args /dev/vdiskc2
(gdb) run
Starting program: /home/mator/e2fsprogs/build/e2fsck/e2fsck /dev/vdiskc2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
e2fsck 1.43.1 (08-Jun-2016)
/dev/vdiskc2: recovering journal

Program received signal SIGBUS, Bus error.
0x0000000000142f54 in scan_revoke_records (journal=0x2decd0,
bh=0x2e9b70, sequence=36855, info=0x7feffffef68) at
../../e2fsck/recovery.c:866
866 blocknr = ext2fs_be64_to_cpu(* ((__u64
*) (bh->b_data+offset)));
(gdb) bt
#0 0x0000000000142f54 in scan_revoke_records (journal=0x2decd0,
bh=0x2e9b70, sequence=36855, info=0x7feffffef68) at
../../e2fsck/recovery.c:866
#1 0x0000000000142bf8 in do_one_pass (journal=0x2decd0,
info=0x7feffffef68, pass=PASS_REVOKE) at ../../e2fsck/recovery.c:767
#2 0x0000000000141c54 in journal_recover (journal=0x2decd0) at
../../e2fsck/recovery.c:273
#3 0x0000000000139570 in recover_ext3_journal (ctx=0x2d1000) at
../../e2fsck/journal.c:940
#4 0x0000000000139750 in e2fsck_run_ext3_journal (ctx=0x2d1000) at
../../e2fsck/journal.c:978
#5 0x00000000001152d0 in main (argc=2, argv=0x7fefffff6e8) at
../../e2fsck/unix.c:1678
(gdb)

It should be the same sigbus which I have on debian boot (log cut):

[952000.246726] sunvnet: eth0: PORT ( remote-mac 00:14:4f:f8:38:39 )
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done.
Begin: Running /scripts/local-premount ... done.
Begin: Will now check root file system ... fsck from util-linux 2.28.1
[/sbin/fsck.ext4 (1) -- /dev/vdiska2] fsck.ext4 -a -C0 /dev/vdiska2
/dev/vdiska2: recovering journal
Signal (10) SIGBUS si_code=BUS_ADRALN fault addr=0x10000164a8c
fsck.ext4(+0x313f0)[0x100000313f0]
fsck exited with status code 8
done.
Warning: File system check failed but did not detect errors
[952005.583237] EXT4-fs (vdiska2): INFO: recovery required on readonly
filesystem
[952005.583350] EXT4-fs (vdiska2): write access will be enabled during recovery
[952005.666515] EXT4-fs (vdiska2): recovery complete
[952005.680055] EXT4-fs (vdiska2): mounted filesystem with ordered
data mode. Opts: (null)
done.
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
[952005.743934] ip_tables: (C) 2000-2006 Netfilter Core Team

2016-08-30 12:56:50

by John Paul Adrian Glaubitz

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On 08/30/2016 02:42 PM, Anatoly Pugachev wrote:
> ../../e2fsck/recovery.c:866
> 866 blocknr = ext2fs_be64_to_cpu(* ((__u64
> *) (bh->b_data+offset)));

The reason is that this expression is casting "char * b_data" [1] into u64 [2]
which provokes unaligned access. Since such expression are often inevitable,
it's probably best to modify the conversion macros in bitops.h [3] to be
safe against unaligned accesses.

An example for that can be seen in the systemd code base [4].

Thanks,
Adrian

> [1] http://lxr.free-electrons.com/source/include/linux/buffer_head.h#L69
> [2] http://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/e2fsck/recovery.c#n866
> [3] http://git.kernel.org/cgit/fs/ext2/e2fsprogs.git/tree/lib/ext2fs/bitops.h
> [4] https://github.com/systemd/systemd/blob/master/src/basic/unaligned.h

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - [email protected]
`. `' Freie Universitaet Berlin - [email protected]
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

2016-08-30 14:58:17

by Theodore Ts'o

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On Tue, Aug 30, 2016 at 02:56:45PM +0200, John Paul Adrian Glaubitz wrote:
> On 08/30/2016 02:42 PM, Anatoly Pugachev wrote:
> > ../../e2fsck/recovery.c:866
> > 866 blocknr = ext2fs_be64_to_cpu(* ((__u64
> > *) (bh->b_data+offset)));
>
> The reason is that this expression is casting "char * b_data" [1] into u64 [2]
> which provokes unaligned access. Since such expression are often inevitable,
> it's probably best to modify the conversion macros in bitops.h [3] to be
> safe against unaligned accesses.

I don't think that's it. b_data is a 4k buffer should be 8 byte
aligned. For a file system with 64-bit blocks (which you presumably
have since we're on the be64 path as shown in your debugger output)
the offset is initially set to 16, and is incremented in chunks of 8
bytes. So there shouldn't be any unaligned access.

Since you are able to provke this in a debugger, can you have gdb
print out the value of bh->b_data and offset, so we can be sure what's
going on?

I do see a potential problem which is that we're defining
journal_revoke_header_t structure in a non-portable way:

typedef struct journal_header_s
{
__u32 h_magic;
__u32 h_blocktype;
__u32 h_sequence;
} journal_header_t;

typedef struct journal_revoke_header_s
{
journal_header_t r_header;
int r_count; /* Count of bytes used in the block */
} journal_revoke_header_t;

The "int" above should be a __u32. What is the size of int on
sparc64? Is it by any chance 8 bytes? If that's the problem then I'm
kind of surprised we haven't run into massive problems earlier, as
this looks like a long standing bug.

- Ted

2016-08-30 15:03:16

by John Paul Adrian Glaubitz

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

Hi Ted!

On 08/30/2016 04:58 PM, Theodore Ts'o wrote:
> Since you are able to provke this in a debugger, can you have gdb
> print out the value of bh->b_data and offset, so we can be sure what's
> going on?

Since you're a Debian Developer yourself, you have access to the sparc64
porterbox I have set up. If you're interested, please feel free to use
notker [1] for any tests and debugging sessions.

Although I'm sure that Anatoly will follow up with your questions anyway.

Thanks,
Adrian

> [1] https://db.debian.org/machines.cgi?host=notker

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - [email protected]
`. `' Freie Universitaet Berlin - [email protected]
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

2016-08-30 15:12:41

by Anatoly Pugachev

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On Tue, Aug 30, 2016 at 5:58 PM, Theodore Ts'o <[email protected]> wrote:
> On Tue, Aug 30, 2016 at 02:56:45PM +0200, John Paul Adrian Glaubitz wrote:
>> On 08/30/2016 02:42 PM, Anatoly Pugachev wrote:
>> > ../../e2fsck/recovery.c:866
>> > 866 blocknr = ext2fs_be64_to_cpu(* ((__u64
>> > *) (bh->b_data+offset)));
>>
>> The reason is that this expression is casting "char * b_data" [1] into u64 [2]
>> which provokes unaligned access. Since such expression are often inevitable,
>> it's probably best to modify the conversion macros in bitops.h [3] to be
>> safe against unaligned accesses.
>
> I don't think that's it. b_data is a 4k buffer should be 8 byte
> aligned. For a file system with 64-bit blocks (which you presumably
> have since we're on the be64 path as shown in your debugger output)
> the offset is initially set to 16, and is incremented in chunks of 8
> bytes. So there shouldn't be any unaligned access.
>
> Since you are able to provke this in a debugger, can you have gdb
> print out the value of bh->b_data and offset, so we can be sure what's
> going on?

(gdb) p bh->b_data
$1 = "\300;9\230\000\000\000\005\000\000\253\204\000\000\000\070\000\000\000\000\000\000$\022\000\000\000\000\000\000$<\000\000\000\000\000\000$\270\000\000\000\000\000\000$]\000\000\000\000\000\000$\024",
'\000' <repeats 967 times>
(gdb) p offset
$2 = 16
(gdb) p *bh->b_data
$3 = -64 '\300'
(gdb) p *(bh->b_data+offset)
$6 = 0 '\000'

2016-08-30 16:39:14

by Jose E. Marchesi

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

typedef struct journal_revoke_header_s
{
journal_header_t r_header;
int r_count; /* Count of bytes used in the block */
} journal_revoke_header_t;

The "int" above should be a __u32. What is the size of int on
sparc64? Is it by any chance 8 bytes? If that's the problem then I'm
kind of surprised we haven't run into massive problems earlier, as
this looks like a long standing bug.

sizeof(int) == 4 in both sparc64-linux-gnu and sparcv9-linux-gnu (32bit)
targets.

2016-08-30 19:16:49

by Theodore Ts'o

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On Tue, Aug 30, 2016 at 06:12:39PM +0300, Anatoly Pugachev wrote:
>
> (gdb) p bh->b_data
> $1 = "\300;9\230\000\000\000\005\000\000\253\204\000\000\000\070\000\000\000\000\000\000$\022\000\000\000\000\000\000$<\000\000\000\000\000\000$\270\000\000\000\000\000\000$]\000\000\000\000\000\000$\024",
> '\000' <repeats 967 times>
> (gdb) p offset
> $2 = 16
> (gdb) p *bh->b_data
> $3 = -64 '\300'
> (gdb) p *(bh->b_data+offset)
> $6 = 0 '\000'

Can you give us "p &bh->b_data" (so we can get the starting address of
b_data to make sure it's aligned) and "p offset" (so we can check and
make sure offset is sane)?

Thanks,

- Ted

2016-08-30 20:17:08

by Anatoly Pugachev

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On Tue, Aug 30, 2016 at 10:16 PM, Theodore Ts'o <[email protected]> wrote:
> On Tue, Aug 30, 2016 at 06:12:39PM +0300, Anatoly Pugachev wrote:
>>
>> (gdb) p bh->b_data
>> $1 = "\300;9\230\000\000\000\005\000\000\253\204\000\000\000\070\000\000\000\000\000\000$\022\000\000\000\000\000\000$<\000\000\000\000\000\000$\270\000\000\000\000\000\000$]\000\000\000\000\000\000$\024",
>> '\000' <repeats 967 times>
>> (gdb) p offset
>> $2 = 16
>> (gdb) p *bh->b_data
>> $3 = -64 '\300'
>> (gdb) p *(bh->b_data+offset)
>> $6 = 0 '\000'
>
> Can you give us "p &bh->b_data" (so we can get the starting address of
> b_data to make sure it's aligned) and "p offset" (so we can check and
> make sure offset is sane)?

(gdb) p &bh->b_data
$7 = (char (*)[1024]) 0x2e9b9c
(gdb) p offset
$8 = 16

2016-08-30 20:26:14

by Darrick J. Wong

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On Tue, Aug 30, 2016 at 11:17:07PM +0300, Anatoly Pugachev wrote:
> On Tue, Aug 30, 2016 at 10:16 PM, Theodore Ts'o <[email protected]> wrote:
> > On Tue, Aug 30, 2016 at 06:12:39PM +0300, Anatoly Pugachev wrote:
> >>
> >> (gdb) p bh->b_data
> >> $1 = "\300;9\230\000\000\000\005\000\000\253\204\000\000\000\070\000\000\000\000\000\000$\022\000\000\000\000\000\000$<\000\000\000\000\000\000$\270\000\000\000\000\000\000$]\000\000\000\000\000\000$\024",
> >> '\000' <repeats 967 times>
> >> (gdb) p offset
> >> $2 = 16
> >> (gdb) p *bh->b_data
> >> $3 = -64 '\300'
> >> (gdb) p *(bh->b_data+offset)
> >> $6 = 0 '\000'
> >
> > Can you give us "p &bh->b_data" (so we can get the starting address of
> > b_data to make sure it's aligned) and "p offset" (so we can check and
> > make sure offset is sane)?
>
> (gdb) p &bh->b_data
> $7 = (char (*)[1024]) 0x2e9b9c
> (gdb) p offset
> $8 = 16

AFAICT, each bh is malloc'd via e2fsck_allocate_memory and nothing seems
to guarantee that the char b_data[1024] will be aligned to a multiple of
8 (it certainly isn't on x64), so I guess this isn't much of a surprise.

We could change b_data to a pointer and then posix_memalign it.

--D

> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2016-09-02 06:02:57

by Theodore Ts'o

[permalink] [raw]

Subject: Re: [sparc64] sigbus in e2fsck

On Tue, Aug 30, 2016 at 01:25:10PM -0700, Darrick J. Wong wrote:
>
> AFAICT, each bh is malloc'd via e2fsck_allocate_memory and nothing seems
> to guarantee that the char b_data[1024] will be aligned to a multiple of
> 8 (it certainly isn't on x64), so I guess this isn't much of a surprise.
>
> We could change b_data to a pointer and then posix_memalign it.

Actually, all we need to do is to rearrange the structure elements so
it looks like this:

unsigned long long b_blocknr;
char b_data[1024];
};

Since b_blocknr will need to be eight byte aligned, this will also
ensure that b_data will also be aligned correctly.

- Ted