2013-10-09 23:26:54

by Josh Triplett

[permalink] [raw]
Subject: Regression parsing GPT (EFI) partition tables

When testing ChromeOS with a 3.12 kernel from git, I encountered a
regression introduced between 3.11 to 3.12: at boot time, the kernel
failed to find any partitions on the USB disk I booted from, which uses
a GPT partition table with 12 partitions. This prevented the system
from booting.

Reverting all the patches to block/partitions/efi.c between 3.11 and
3.12 allowed the system to detect partitions again. The patches I
reverted:

6b02fa5 partitions/efi: loosen check fot pmbr size in lba
b4bc4a1 block/partitions/efi.c: consistently use pr_foo()
70f637e partitions/efi: some style cleanups
aa054bc partitions/efi: compare first and last usable LBAs
27a7c64 partitions/efi: account for pmbr size in lba
b05ebbb partitions/efi: detect hybrid MBRs
3e69ac3 partitions/efi: do not require gpt partition to begin at sector 1
33afd7a partitions/efi: check pmbr record's starting lba
c2ebdc2 partitions/efi: use lba-aware partition records

I haven't yet attempted to find out if reverting a subset of those patches
fixes the problem, though I'd strongly suspect the commits that introduce
strictness checks for the partition layout and fail when those checks don't
pass. Most likely ChromeOS's partition layout tripped one of those checks.

Here's the partition layout:

$ gdisk -l /dev/sdc
GPT fdisk (gdisk) version 0.8.7

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdc: 31293440 sectors, 14.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 44F4ACFD-546C-E04C-A788-6782FCBDDB3E
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 4956062
Partitions will be aligned on 1-sector boundaries
Total free space is 167801 sectors (81.9 MiB)

Number Start (sector) End (sector) Size Code Name
1 2826240 4923391 1024.0 MiB 0700 STATE
2 20480 53247 16.0 MiB 7F00 KERN-A
3 286720 2826239 1.2 GiB 7F01 ROOT-A
4 53248 86015 16.0 MiB 7F00 KERN-B
5 282624 286719 2.0 MiB 7F01 ROOT-B
6 16448 16448 512 bytes 7F00 KERN-C
7 16449 16449 512 bytes 7F01 ROOT-C
8 86016 118783 16.0 MiB 0700 OEM
9 16450 16450 512 bytes 7F02 reserved
10 16451 16451 512 bytes 7F02 reserved
11 64 16447 8.0 MiB FFFF RWFW
12 249856 282623 16.0 MiB EF00 EFI-SYSTEM


Also note that ChromeOS USB disks typically only have one valid GPT, at the
beginning of the disk, since they're imaged from a file and nothing moves the
secondary GPT from the end of the image to the end of the disk. Partitioning
tools, and the kernel, typically gripe about this; for instance, dmesg from a
3.10 kernel when I plug in a ChromeOS USB disk:

GPT:Primary header thinks Alt. header is not at the end of the disk.
GPT:4956095 != 31293439
GPT:Alternate GPT header not at the end of the disk.
GPT:4956095 != 31293439

I can easily supply additional information about ChromeOS images and their
partition tables.

- Josh Triplett


2013-10-10 00:37:39

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

Hi Josh,

On Wed, 2013-10-09 at 16:26 -0700, Josh Triplett wrote:
> When testing ChromeOS with a 3.12 kernel from git, I encountered a
> regression introduced between 3.11 to 3.12: at boot time, the kernel
> failed to find any partitions on the USB disk I booted from, which uses
> a GPT partition table with 12 partitions. This prevented the system
> from booting.
>
> Reverting all the patches to block/partitions/efi.c between 3.11 and
> 3.12 allowed the system to detect partitions again. The patches I
> reverted:
>
> 6b02fa5 partitions/efi: loosen check fot pmbr size in lba
> b4bc4a1 block/partitions/efi.c: consistently use pr_foo()
> 70f637e partitions/efi: some style cleanups
> aa054bc partitions/efi: compare first and last usable LBAs
> 27a7c64 partitions/efi: account for pmbr size in lba
> b05ebbb partitions/efi: detect hybrid MBRs
> 3e69ac3 partitions/efi: do not require gpt partition to begin at sector 1
> 33afd7a partitions/efi: check pmbr record's starting lba
> c2ebdc2 partitions/efi: use lba-aware partition records

It would be good to bisect this :)
I suspect it might be caused by 33afd7a otherwise there's something
wrong with the mbr size in lba (6b02fa5+27a7c64).

>
> I haven't yet attempted to find out if reverting a subset of those patches
> fixes the problem, though I'd strongly suspect the commits that introduce
> strictness checks for the partition layout and fail when those checks don't
> pass. Most likely ChromeOS's partition layout tripped one of those checks.
>
> Here's the partition layout:
>
> $ gdisk -l /dev/sdc
> GPT fdisk (gdisk) version 0.8.7
>
> Partition table scan:
> MBR: protective
> BSD: not present
> APM: not present
> GPT: present
>
> Found valid GPT with protective MBR; using GPT.
> Disk /dev/sdc: 31293440 sectors, 14.9 GiB
> Logical sector size: 512 bytes
> Disk identifier (GUID): 44F4ACFD-546C-E04C-A788-6782FCBDDB3E
> Partition table holds up to 128 entries
> First usable sector is 34, last usable sector is 4956062
> Partitions will be aligned on 1-sector boundaries
> Total free space is 167801 sectors (81.9 MiB)
>
> Number Start (sector) End (sector) Size Code Name
> 1 2826240 4923391 1024.0 MiB 0700 STATE
> 2 20480 53247 16.0 MiB 7F00 KERN-A
> 3 286720 2826239 1.2 GiB 7F01 ROOT-A
> 4 53248 86015 16.0 MiB 7F00 KERN-B
> 5 282624 286719 2.0 MiB 7F01 ROOT-B
> 6 16448 16448 512 bytes 7F00 KERN-C
> 7 16449 16449 512 bytes 7F01 ROOT-C
> 8 86016 118783 16.0 MiB 0700 OEM
> 9 16450 16450 512 bytes 7F02 reserved
> 10 16451 16451 512 bytes 7F02 reserved
> 11 64 16447 8.0 MiB FFFF RWFW
> 12 249856 282623 16.0 MiB EF00 EFI-SYSTEM
>
>
> Also note that ChromeOS USB disks typically only have one valid GPT, at the
> beginning of the disk, since they're imaged from a file and nothing moves the
> secondary GPT from the end of the image to the end of the disk. Partitioning
> tools, and the kernel, typically gripe about this; for instance, dmesg from a
> 3.10 kernel when I plug in a ChromeOS USB disk:
>
> GPT:Primary header thinks Alt. header is not at the end of the disk.
> GPT:4956095 != 31293439
> GPT:Alternate GPT header not at the end of the disk.
> GPT:4956095 != 31293439

Hmmm, could you try the following?

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 1eb09ee..d589937 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -157,10 +157,6 @@ static inline int pmbr_part_valid(gpt_mbr_record *part)
if (part->os_type != EFI_PMBR_OSTYPE_EFI_GPT)
goto invalid;

- /* set to 0x00000001 (i.e., the LBA of the GPT Partition Header) */
- if (le32_to_cpu(part->starting_lba) != GPT_PRIMARY_PARTITION_TABLE_LBA)
- goto invalid;
-
return GPT_MBR_PROTECTIVE;
invalid:
return 0;


If this doesn't work, could you bypass mbr checks with force_gpt option
and add some printk to the partition_record.size_in_lba, like:

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 1eb09ee..77cfed2 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -225,6 +225,7 @@ check_hybrid:
*/
if (ret == GPT_MBR_PROTECTIVE) {
sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
+ printk("gpt sz check: %d, %ld\n", sz, total_sectors);
if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
ret = 0;
}

Thanks,
Davidlohr

2013-10-10 20:15:07

by Doug Anderson

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

Hi,

Just ran into this same problem and tracked it down to the same
commit. Luckily Sean found this thread. :)

On Wed, Oct 9, 2013 at 5:37 PM, Davidlohr Bueso <[email protected]> wrote:
> Hi Josh,
>
> On Wed, 2013-10-09 at 16:26 -0700, Josh Triplett wrote:
>> When testing ChromeOS with a 3.12 kernel from git, I encountered a
>> regression introduced between 3.11 to 3.12: at boot time, the kernel
>> failed to find any partitions on the USB disk I booted from, which uses
>> a GPT partition table with 12 partitions. This prevented the system
>> from booting.
>>
>> Reverting all the patches to block/partitions/efi.c between 3.11 and
>> 3.12 allowed the system to detect partitions again. The patches I
>> reverted:
>>
>> 6b02fa5 partitions/efi: loosen check fot pmbr size in lba
>> b4bc4a1 block/partitions/efi.c: consistently use pr_foo()
>> 70f637e partitions/efi: some style cleanups
>> aa054bc partitions/efi: compare first and last usable LBAs
>> 27a7c64 partitions/efi: account for pmbr size in lba
>> b05ebbb partitions/efi: detect hybrid MBRs
>> 3e69ac3 partitions/efi: do not require gpt partition to begin at sector 1
>> 33afd7a partitions/efi: check pmbr record's starting lba
>> c2ebdc2 partitions/efi: use lba-aware partition records
>
> It would be good to bisect this :)
> I suspect it might be caused by 33afd7a otherwise there's something
> wrong with the mbr size in lba (6b02fa5+27a7c64).

So I found that (c2ebdc2 partitions/efi: use lba-aware partition
records) broke things but that breakage is short-lived. Specifically
in c2ebdc2 we changed from looking at "part->start_sect" to
"part->start_sector", but "part->start_sect" is equivalent to
"part->starting_lba" in the new scheme.

...moving forward, I then found the next breakage was 27a7c64 and that
appears to be the culprit. Adjusting to further changes on ToT, you
get a fix that looks roughly like:

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 1eb09ee..9df329d 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -226,7 +226,8 @@ check_hybrid:
if (ret == GPT_MBR_PROTECTIVE) {
sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
- ret = 0;
+ pr_warn("%s sz=%u, total_sectors - 1=%u\n", __func__,
+ sz, (uint32_t)((uint32_t) total_sectors - 1));
}
done:
return ret;

Now, when I boot up I see messages like:

[ 4.038359] is_pmbr_valid sz=4956095, total_sectors - 1=15523839
[ 4.043963] GPT:Primary header thinks Alt. header is not at the end
of the disk.
[ 4.043967] GPT:4956095 != 15523839
[ 4.043969] GPT:Alternate GPT header not at the end of the disk.
[ 4.043972] GPT:4956095 != 15523839
[ 4.043975] GPT: Use GNU Parted to correct GPT errors.

...so basically it looks like we're now considering something an error
that used to be considered a warning.

I don't know __anything__ about how GPT is supposed to work and I'll
leave it to the experts to debate. It always is possible that the
Chrome OS GPT is somehow wrong, but Bill Richardson (CCed) says he
thinks that what we're doing is OK and that the alternate header is
supposed to be a backup copy (and thus it should only be a warning if
it's missing).


> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index 1eb09ee..d589937 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -157,10 +157,6 @@ static inline int pmbr_part_valid(gpt_mbr_record *part)
> if (part->os_type != EFI_PMBR_OSTYPE_EFI_GPT)
> goto invalid;
>
> - /* set to 0x00000001 (i.e., the LBA of the GPT Partition Header) */
> - if (le32_to_cpu(part->starting_lba) != GPT_PRIMARY_PARTITION_TABLE_LBA)
> - goto invalid;
> -
> return GPT_MBR_PROTECTIVE;
> invalid:
> return 0;
>
>
> If this doesn't work, could you bypass mbr checks with force_gpt option
> and add some printk to the partition_record.size_in_lba, like:
>
> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index 1eb09ee..77cfed2 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -225,6 +225,7 @@ check_hybrid:
> */
> if (ret == GPT_MBR_PROTECTIVE) {
> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
> + printk("gpt sz check: %d, %ld\n", sz, total_sectors);
> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
> ret = 0;
> }

Just for thoroughness, I did try this patch (without my patch) and it
didn't work for me.



-Doug

2013-10-10 21:26:47

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

On Thu, 2013-10-10 at 13:15 -0700, Doug Anderson wrote:
> Hi,
>
> Just ran into this same problem and tracked it down to the same
> commit. Luckily Sean found this thread. :)
>
> On Wed, Oct 9, 2013 at 5:37 PM, Davidlohr Bueso <[email protected]> wrote:
> > Hi Josh,
> >
> > On Wed, 2013-10-09 at 16:26 -0700, Josh Triplett wrote:
> >> When testing ChromeOS with a 3.12 kernel from git, I encountered a
> >> regression introduced between 3.11 to 3.12: at boot time, the kernel
> >> failed to find any partitions on the USB disk I booted from, which uses
> >> a GPT partition table with 12 partitions. This prevented the system
> >> from booting.
> >>
> >> Reverting all the patches to block/partitions/efi.c between 3.11 and
> >> 3.12 allowed the system to detect partitions again. The patches I
> >> reverted:
> >>
> >> 6b02fa5 partitions/efi: loosen check fot pmbr size in lba
> >> b4bc4a1 block/partitions/efi.c: consistently use pr_foo()
> >> 70f637e partitions/efi: some style cleanups
> >> aa054bc partitions/efi: compare first and last usable LBAs
> >> 27a7c64 partitions/efi: account for pmbr size in lba
> >> b05ebbb partitions/efi: detect hybrid MBRs
> >> 3e69ac3 partitions/efi: do not require gpt partition to begin at sector 1
> >> 33afd7a partitions/efi: check pmbr record's starting lba
> >> c2ebdc2 partitions/efi: use lba-aware partition records
> >
> > It would be good to bisect this :)
> > I suspect it might be caused by 33afd7a otherwise there's something
> > wrong with the mbr size in lba (6b02fa5+27a7c64).
>
> So I found that (c2ebdc2 partitions/efi: use lba-aware partition
> records) broke things but that breakage is short-lived. Specifically
> in c2ebdc2 we changed from looking at "part->start_sect" to
> "part->start_sector", but "part->start_sect" is equivalent to
> "part->starting_lba" in the new scheme.

Right, because we're not using CHS addressing anymore for GPT. What do
you mean by "breakage is short-lived"? I didn't quite get that.

>
> ...moving forward, I then found the next breakage was 27a7c64 and that
> appears to be the culprit. Adjusting to further changes on ToT, you
> get a fix that looks roughly like:
>
> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index 1eb09ee..9df329d 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -226,7 +226,8 @@ check_hybrid:
> if (ret == GPT_MBR_PROTECTIVE) {
> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
> - ret = 0;
> + pr_warn("%s sz=%u, total_sectors - 1=%u\n", __func__,
> + sz, (uint32_t)((uint32_t) total_sectors - 1));
> }
> done:
> return ret;
>
> Now, when I boot up I see messages like:
>
> [ 4.038359] is_pmbr_valid sz=4956095, total_sectors - 1=15523839

Hmmm, this confuses me: sz represents the size of the disk and should be
equal to total_sectors - 1 (aka lastlba) in this case, and since you're
not using the whole 2Tb disk limit, so it'll never be 0xFFFFFFFF....

> [ 4.043963] GPT:Primary header thinks Alt. header is not at the end
> of the disk.
> [ 4.043967] GPT:4956095 != 15523839

but in reality, sz is equal to where the primary GPT header thinks the
alternate header is (pgpt->alternate_lba):

if (le64_to_cpu(pgpt->alternate_lba) != lastlba) {
pr_warn("GPT:Primary header thinks Alt. header is not at the end of the disk.\n");
pr_warn("GPT:%lld != %lld\n",
(unsigned long long)le64_to_cpu(pgpt->alternate_lba),
(unsigned long long)lastlba);
error_found++;
}

> [ 4.043969] GPT:Alternate GPT header not at the end of the disk.
> [ 4.043972] GPT:4956095 != 15523839
> [ 4.043975] GPT: Use GNU Parted to correct GPT errors.

At least the backup/alternate GPT header is consistent with the above:
4956095 == pgpt->alternate_lba == agpt->my_lba

But sz shouldn't be 4956095... strange.

Do you happen to know what tool was used to create the GPT partition(s)?
I ask because it is up to the partitioning program to set sz correctly,
and most if not all set it to either the disk size - 1 or 0xFFFFFFFF.

> ...so basically it looks like we're now considering something an error
> that used to be considered a warning.

It simply wasn't checked before.

It's worthwhile finding out if this scenario also occurs without the
recent GPT changes. If so, then we could go ahead and just use a warning
and not consider it an error for the sake of booting. Otherwise there's
something wrong with c2ebdc2 and replacing 'struct partition' with
'struct _gpt_mbr_record' caused some sort of structure offset problem
and some fields are holding the wrong data.

To check this, you can revert all patches, starting from c2ebdc2, then
add the following (untested) to pmbr_part_valid():

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index c85fc89..7144d8a 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -152,6 +152,8 @@ static u64 last_lba(struct block_device *bdev)
static inline int
pmbr_part_valid(struct partition *part)
{
+ printk("part->nr_sects = %d\n", le32_to_cpu(part->nr_sects));
+
if (part->sys_ind == EFI_PMBR_OSTYPE_EFI_GPT &&
le32_to_cpu(part->start_sect) == 1UL)
return 1;

Thanks,
Davidlohr

2013-10-10 21:53:56

by Bill Richardson

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

I'm about to go offline for a week, but here are some thoughts.

The GPT spec is over-constrained. It says that the secondary GPT
header must be in the last LBA of the disk, yet it also says that both
primary and secondary headers must point to each other using their
AlternateLBA fields. We don't need both of those rules - and the spec
is unclear what it means if the headers *don't* point to each other,
or if they do but from the wrong place.

Chrome OS install images are created in a binary file, so the ending
LBAs are just the last LBA in the binary image, and that's where the
primary GPT header's AlternateLBA field points. When we dd that image
onto a USB or SD card and boot it, the secondary GPT header is not at
the end of the disk. The primary GPT header is otherwise valid, so we
boot it anyway and the chromeos-install process creates valid primary
and secondary headers (with cross-pointing AlternateLBA fields) on the
fixed drive.

That bootable binary image does have a protective MBR (partition type
0xEE), but again the extent of it is only that of the original binary,
not the USB drive it was copied onto.

Chrome OS BIOS pays no attention *at all* to the legacy MBR, or to the
AlternateLBA fields of either GPT header. It only looks for the GPT
headers in LBA 1 and LBA <last>.

The tool that we use to create the partition tables is called "cgpt". To build:

git clone https://chromium.googlesource.com/chromiumos/platform/vboot_reference
cd vboot_reference
make
./build/cgpt/cgpt help

We wrote cgpt ourselves because we wanted to use the same small
GPT-aware library in the BIOS that we use to create and test disk
images.


Bill
--
Art for Art's Sake
Engineering for Money

2013-10-10 22:29:29

by Doug Anderson

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

Hi,

On Thu, Oct 10, 2013 at 2:26 PM, Davidlohr Bueso <[email protected]> wrote:
> On Thu, 2013-10-10 at 13:15 -0700, Doug Anderson wrote:
>> Hi,
>>
>> Just ran into this same problem and tracked it down to the same
>> commit. Luckily Sean found this thread. :)
>>
>> On Wed, Oct 9, 2013 at 5:37 PM, Davidlohr Bueso <[email protected]> wrote:
>> > Hi Josh,
>> >
>> > On Wed, 2013-10-09 at 16:26 -0700, Josh Triplett wrote:
>> >> When testing ChromeOS with a 3.12 kernel from git, I encountered a
>> >> regression introduced between 3.11 to 3.12: at boot time, the kernel
>> >> failed to find any partitions on the USB disk I booted from, which uses
>> >> a GPT partition table with 12 partitions. This prevented the system
>> >> from booting.
>> >>
>> >> Reverting all the patches to block/partitions/efi.c between 3.11 and
>> >> 3.12 allowed the system to detect partitions again. The patches I
>> >> reverted:
>> >>
>> >> 6b02fa5 partitions/efi: loosen check fot pmbr size in lba
>> >> b4bc4a1 block/partitions/efi.c: consistently use pr_foo()
>> >> 70f637e partitions/efi: some style cleanups
>> >> aa054bc partitions/efi: compare first and last usable LBAs
>> >> 27a7c64 partitions/efi: account for pmbr size in lba
>> >> b05ebbb partitions/efi: detect hybrid MBRs
>> >> 3e69ac3 partitions/efi: do not require gpt partition to begin at sector 1
>> >> 33afd7a partitions/efi: check pmbr record's starting lba
>> >> c2ebdc2 partitions/efi: use lba-aware partition records
>> >
>> > It would be good to bisect this :)
>> > I suspect it might be caused by 33afd7a otherwise there's something
>> > wrong with the mbr size in lba (6b02fa5+27a7c64).
>>
>> So I found that (c2ebdc2 partitions/efi: use lba-aware partition
>> records) broke things but that breakage is short-lived. Specifically
>> in c2ebdc2 we changed from looking at "part->start_sect" to
>> "part->start_sector", but "part->start_sect" is equivalent to
>> "part->starting_lba" in the new scheme.
>
> Right, because we're not using CHS addressing anymore for GPT. What do
> you mean by "breakage is short-lived"? I didn't quite get that.

Sorry, it's "short-lived" because the CLs right after this effectively
change it back to looking at "starting_lba" again.


>> ...moving forward, I then found the next breakage was 27a7c64 and that
>> appears to be the culprit. Adjusting to further changes on ToT, you
>> get a fix that looks roughly like:
>>
>> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
>> index 1eb09ee..9df329d 100644
>> --- a/block/partitions/efi.c
>> +++ b/block/partitions/efi.c
>> @@ -226,7 +226,8 @@ check_hybrid:
>> if (ret == GPT_MBR_PROTECTIVE) {
>> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
>> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
>> - ret = 0;
>> + pr_warn("%s sz=%u, total_sectors - 1=%u\n", __func__,
>> + sz, (uint32_t)((uint32_t) total_sectors - 1));
>> }
>> done:
>> return ret;
>>
>> Now, when I boot up I see messages like:
>>
>> [ 4.038359] is_pmbr_valid sz=4956095, total_sectors - 1=15523839
>
> Hmmm, this confuses me: sz represents the size of the disk and should be
> equal to total_sectors - 1 (aka lastlba) in this case, and since you're
> not using the whole 2Tb disk limit, so it'll never be 0xFFFFFFFF....
>
>> [ 4.043963] GPT:Primary header thinks Alt. header is not at the end
>> of the disk.
>> [ 4.043967] GPT:4956095 != 15523839
>
> but in reality, sz is equal to where the primary GPT header thinks the
> alternate header is (pgpt->alternate_lba):
>
> if (le64_to_cpu(pgpt->alternate_lba) != lastlba) {
> pr_warn("GPT:Primary header thinks Alt. header is not at the end of the disk.\n");
> pr_warn("GPT:%lld != %lld\n",
> (unsigned long long)le64_to_cpu(pgpt->alternate_lba),
> (unsigned long long)lastlba);
> error_found++;
> }
>
>> [ 4.043969] GPT:Alternate GPT header not at the end of the disk.
>> [ 4.043972] GPT:4956095 != 15523839
>> [ 4.043975] GPT: Use GNU Parted to correct GPT errors.
>
> At least the backup/alternate GPT header is consistent with the above:
> 4956095 == pgpt->alternate_lba == agpt->my_lba
>
> But sz shouldn't be 4956095... strange.
>
> Do you happen to know what tool was used to create the GPT partition(s)?
> I ask because it is up to the partitioning program to set sz correctly,
> and most if not all set it to either the disk size - 1 or 0xFFFFFFFF.

See Bill's email. It was created using "cgpt" to create a binary
file, which was then "dd"ed onto removable media.


FWIW, I can "fix" my problems also by tweaking my image:

setword.py /.../chromiumos_test_image.bin 0x1ca 0xffffffff
0x000001ca: 0x004b9fbf => 0xffffffff

Now things boot up nicely. :) Similarly I can get things to boot by running:

cgpt boot -p /dev/mmcblk1

...which appears to put the right size into this field.


>> ...so basically it looks like we're now considering something an error
>> that used to be considered a warning.
>
> It simply wasn't checked before.
>
> It's worthwhile finding out if this scenario also occurs without the
> recent GPT changes. If so, then we could go ahead and just use a warning
> and not consider it an error for the sake of booting. Otherwise there's
> something wrong with c2ebdc2 and replacing 'struct partition' with
> 'struct _gpt_mbr_record' caused some sort of structure offset problem
> and some fields are holding the wrong data.
>
> To check this, you can revert all patches, starting from c2ebdc2, then
> add the following (untested) to pmbr_part_valid():
>
> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index c85fc89..7144d8a 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -152,6 +152,8 @@ static u64 last_lba(struct block_device *bdev)
> static inline int
> pmbr_part_valid(struct partition *part)
> {
> + printk("part->nr_sects = %d\n", le32_to_cpu(part->nr_sects));
> +
> if (part->sys_ind == EFI_PMBR_OSTYPE_EFI_GPT &&
> le32_to_cpu(part->start_sect) == 1UL)
> return 1;
>
> Thanks,
> Davidlohr

I can run this test if you need, but it sounds like we're tracked it
down pretty well.


My dumb summary of this without digging and understanding everything is:

* It used to be "OK" if the sz was wrong.

* If you're "dd"ing an image from a smaller device to a bigger device,
the "sz" will likely be wrong. It would be nice if this were a
warning not an error since this can be a useful thing to do.

* We could fix our tool to not specify "sz" (aka use "-1") when
creating our images and it would work.

* This is not exactly the same as the GPT/alternate GPT error, since
it's a different header field.


Does that sound reasonable?

-Doug

2013-10-10 22:31:20

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

On Thu, 2013-10-10 at 14:53 -0700, Bill Richardson wrote:
> I'm about to go offline for a week, but here are some thoughts.
>
> The GPT spec is over-constrained. It says that the secondary GPT
> header must be in the last LBA of the disk, yet it also says that both
> primary and secondary headers must point to each other using their
> AlternateLBA fields. We don't need both of those rules - and the spec
> is unclear what it means if the headers *don't* point to each other,
> or if they do but from the wrong place.
>

I completely agree. But I'm concerned about the size of lba, and not the
primary/backup header differences.

> Chrome OS install images are created in a binary file, so the ending
> LBAs are just the last LBA in the binary image, and that's where the
> primary GPT header's AlternateLBA field points. When we dd that image
> onto a USB or SD card and boot it, the secondary GPT header is not at
> the end of the disk. The primary GPT header is otherwise valid, so we
> boot it anyway and the chromeos-install process creates valid primary
> and secondary headers (with cross-pointing AlternateLBA fields) on the
> fixed drive.
>
> That bootable binary image does have a protective MBR (partition type
> 0xEE), but again the extent of it is only that of the original binary,
> not the USB drive it was copied onto.

Ahh, now that makes a whole lot of sense for this problem. That's why
your size in lba is the same as the backup gpt lba. All in all you're
using the binary file as your "whole disk", instead of the actual disk
itself, so while this is a very out of the ordinary configuration, it is
constant for GPT: this way size in lba == last lba for the _file_.

>
> Chrome OS BIOS pays no attention *at all* to the legacy MBR, or to the
> AlternateLBA fields of either GPT header. It only looks for the GPT
> headers in LBA 1 and LBA <last>.

Then you should *really* use the force_gpt option, which is there to
bypass any MBR checks, and you can avoid issues like this :)

Anyway, this is still a regression and I believe we can go ahead and
just warn the user about the case instead of not recognizing the disk.

Bill/Doug, care to send a formal patch (with corresponding comments)?

Thanks,
Davidlohr

2013-10-10 22:49:18

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

On Thu, 2013-10-10 at 15:29 -0700, Doug Anderson wrote:
> Hi,
[...]
> > Do you happen to know what tool was used to create the GPT partition(s)?
> > I ask because it is up to the partitioning program to set sz correctly,
> > and most if not all set it to either the disk size - 1 or 0xFFFFFFFF.
>
> See Bill's email. It was created using "cgpt" to create a binary
> file, which was then "dd"ed onto removable media.
>
>
> FWIW, I can "fix" my problems also by tweaking my image:
>
> setword.py /.../chromiumos_test_image.bin 0x1ca 0xffffffff
> 0x000001ca: 0x004b9fbf => 0xffffffff
>
> Now things boot up nicely. :) Similarly I can get things to boot by running:

Right, now sz == 0xffffffff so you're "using" the whole disk, but you
probably want 0xecdfff since the disk is smaller than that.

>
> cgpt boot -p /dev/mmcblk1
>
> ...which appears to put the right size into this field.
>
>
> >> ...so basically it looks like we're now considering something an error
> >> that used to be considered a warning.
> >
> > It simply wasn't checked before.
> >
> > It's worthwhile finding out if this scenario also occurs without the
> > recent GPT changes. If so, then we could go ahead and just use a warning
> > and not consider it an error for the sake of booting. Otherwise there's
> > something wrong with c2ebdc2 and replacing 'struct partition' with
> > 'struct _gpt_mbr_record' caused some sort of structure offset problem
> > and some fields are holding the wrong data.
> >
> > To check this, you can revert all patches, starting from c2ebdc2, then
> > add the following (untested) to pmbr_part_valid():
> >
> > diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> > index c85fc89..7144d8a 100644
> > --- a/block/partitions/efi.c
> > +++ b/block/partitions/efi.c
> > @@ -152,6 +152,8 @@ static u64 last_lba(struct block_device *bdev)
> > static inline int
> > pmbr_part_valid(struct partition *part)
> > {
> > + printk("part->nr_sects = %d\n", le32_to_cpu(part->nr_sects));
> > +
> > if (part->sys_ind == EFI_PMBR_OSTYPE_EFI_GPT &&
> > le32_to_cpu(part->start_sect) == 1UL)
> > return 1;
> >
> > Thanks,
> > Davidlohr
>
> I can run this test if you need, but it sounds like we're tracked it
> down pretty well.
>
>
> My dumb summary of this without digging and understanding everything is:
>
> * It used to be "OK" if the sz was wrong.

Yes, because it was never checked.

>
> * If you're "dd"ing an image from a smaller device to a bigger device,
> the "sz" will likely be wrong. It would be nice if this were a
> warning not an error since this can be a useful thing to do.
>
> * We could fix our tool to not specify "sz" (aka use "-1") when
> creating our images and it would work.

You should use force_gpt and forget about MBR all together.

>
> * This is not exactly the same as the GPT/alternate GPT error, since
> it's a different header field.
>
>
> Does that sound reasonable?
>

Since it used to work before, I blame my change and it's a regression,
I'll happily ack the "warn instead of not recognizing approach".

Thanks,
Davidlohr

2013-10-10 23:29:02

by Doug Anderson

[permalink] [raw]
Subject: [PATCH] partitions/efi: treat size mismatch as a warning, not an error

In (27a7c64 partitions/efi: account for pmbr size in lba) we started
treating bad sizes in lba field of the partition that has the 0xEE
(GPT protective) as errors. However, we may run into these "bad
sizes" in the real world if someone uses dd to copy an image from a
smaller disk to a bigger disk. Since this case used to work (even
without using force_gpt), keep it working and treat the size mismatch
as a warning instead of an error.

Reported-by: Josh Triplett <[email protected]>
Reported-by: Sean Paul <[email protected]>
Signed-off-by: Doug Anderson <[email protected]>
---
block/partitions/efi.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 1eb09ee..ac23dc1 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -222,11 +222,15 @@ check_hybrid:
* the disk size.
*
* Hybrid MBRs do not necessarily comply with this.
+ *
+ * Consider a bad value here to be a warning to support dd-ing
+ * an image from a smaller disk to a bigger disk.
*/
if (ret == GPT_MBR_PROTECTIVE) {
sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
- ret = 0;
+ pr_warn("%s: mbr size mismatch (%u != %u)\n", __func__,
+ sz, (uint32_t)((uint32_t) total_sectors - 1));
}
done:
return ret;
--
1.8.4

2013-10-10 23:40:34

by Doug Anderson

[permalink] [raw]
Subject: Re: Regression parsing GPT (EFI) partition tables

Davidlohr,

On Thu, Oct 10, 2013 at 3:31 PM, Davidlohr Bueso <[email protected]> wrote:
> Then you should *really* use the force_gpt option, which is there to
> bypass any MBR checks, and you can avoid issues like this :)
>
> Anyway, this is still a regression and I believe we can go ahead and
> just warn the user about the case instead of not recognizing the disk.
>
> Bill/Doug, care to send a formal patch (with corresponding comments)?

OK, I've posted up at <https://patchwork.kernel.org/patch/3019531/>.
As I've said, I haven't spent the time to really understand every last
detail (I was just doing dumb git bisects), so if my explanation /
comments don't actually make sense then please correct me.

Thanks for your help in tracking this down!

-Doug

2013-10-11 00:25:34

by Josh Triplett

[permalink] [raw]
Subject: Re: [PATCH] partitions/efi: treat size mismatch as a warning, not an error

On Thu, Oct 10, 2013 at 04:28:22PM -0700, Doug Anderson wrote:
> In (27a7c64 partitions/efi: account for pmbr size in lba) we started
> treating bad sizes in lba field of the partition that has the 0xEE
> (GPT protective) as errors. However, we may run into these "bad
> sizes" in the real world if someone uses dd to copy an image from a
> smaller disk to a bigger disk. Since this case used to work (even
> without using force_gpt), keep it working and treat the size mismatch
> as a warning instead of an error.
>
> Reported-by: Josh Triplett <[email protected]>
> Reported-by: Sean Paul <[email protected]>
> Signed-off-by: Doug Anderson <[email protected]>

Reviewed-by: Josh Triplett <[email protected]>

> ---
> block/partitions/efi.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index 1eb09ee..ac23dc1 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -222,11 +222,15 @@ check_hybrid:
> * the disk size.
> *
> * Hybrid MBRs do not necessarily comply with this.
> + *
> + * Consider a bad value here to be a warning to support dd-ing
> + * an image from a smaller disk to a bigger disk.
> */
> if (ret == GPT_MBR_PROTECTIVE) {
> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
> - ret = 0;
> + pr_warn("%s: mbr size mismatch (%u != %u)\n", __func__,
> + sz, (uint32_t)((uint32_t) total_sectors - 1));
> }
> done:
> return ret;
> --
> 1.8.4
>

2013-10-11 00:31:35

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH] partitions/efi: treat size mismatch as a warning, not an error

On Thu, 2013-10-10 at 16:28 -0700, Doug Anderson wrote:
> In (27a7c64 partitions/efi: account for pmbr size in lba) we started
> treating bad sizes in lba field of the partition that has the 0xEE
> (GPT protective) as errors. However, we may run into these "bad
> sizes" in the real world if someone uses dd to copy an image from a
> smaller disk to a bigger disk. Since this case used to work (even
> without using force_gpt), keep it working and treat the size mismatch
> as a warning instead of an error.
>
> Reported-by: Josh Triplett <[email protected]>
> Reported-by: Sean Paul <[email protected]>
> Signed-off-by: Doug Anderson <[email protected]>
> ---
> block/partitions/efi.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index 1eb09ee..ac23dc1 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -222,11 +222,15 @@ check_hybrid:
> * the disk size.
> *
> * Hybrid MBRs do not necessarily comply with this.
> + *
> + * Consider a bad value here to be a warning to support dd-ing

dd'ing instead?

> + * an image from a smaller disk to a bigger disk.

'larger' disk sounds better.

> */
> if (ret == GPT_MBR_PROTECTIVE) {
> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
> - ret = 0;
> + pr_warn("%s: mbr size mismatch (%u != %u)\n", __func__,
> + sz, (uint32_t)((uint32_t) total_sectors - 1));

How about this instead?
pr_debug("GPT: mbr size in lba (%d) different than whole disk (%d).\n", sz, min(total_sectors -1, 0xFFFFFFFF));

> }
> done:
> return ret;

Thanks,
Davidlohr

2013-10-11 09:52:10

by Karel Zak

[permalink] [raw]
Subject: Re: [PATCH] partitions/efi: treat size mismatch as a warning, not an error

On Thu, Oct 10, 2013 at 04:28:22PM -0700, Doug Anderson wrote:
> + *
> + * Consider a bad value here to be a warning to support dd-ing
> + * an image from a smaller disk to a bigger disk.
> */
> if (ret == GPT_MBR_PROTECTIVE) {
> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
> - ret = 0;
> + pr_warn("%s: mbr size mismatch (%u != %u)\n", __func__,
> + sz, (uint32_t)((uint32_t) total_sectors - 1));
> }

I did the same change to util-linux v2.24 fdisk, so you can use the fdisk
to update your protective MBR and boot without warnings.

Karel

--
Karel Zak <[email protected]>
http://karelzak.blogspot.com

2013-10-11 15:48:21

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] partitions/efi: treat size mismatch as a warning, not an error

Davidlohr,

On Thu, Oct 10, 2013 at 5:31 PM, Davidlohr Bueso <[email protected]> wrote:
> On Thu, 2013-10-10 at 16:28 -0700, Doug Anderson wrote:
>> In (27a7c64 partitions/efi: account for pmbr size in lba) we started
>> treating bad sizes in lba field of the partition that has the 0xEE
>> (GPT protective) as errors. However, we may run into these "bad
>> sizes" in the real world if someone uses dd to copy an image from a
>> smaller disk to a bigger disk. Since this case used to work (even
>> without using force_gpt), keep it working and treat the size mismatch
>> as a warning instead of an error.
>>
>> Reported-by: Josh Triplett <[email protected]>
>> Reported-by: Sean Paul <[email protected]>
>> Signed-off-by: Doug Anderson <[email protected]>
>> ---
>> block/partitions/efi.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
>> index 1eb09ee..ac23dc1 100644
>> --- a/block/partitions/efi.c
>> +++ b/block/partitions/efi.c
>> @@ -222,11 +222,15 @@ check_hybrid:
>> * the disk size.
>> *
>> * Hybrid MBRs do not necessarily comply with this.
>> + *
>> + * Consider a bad value here to be a warning to support dd-ing
>
> dd'ing instead?

Done.

>> + * an image from a smaller disk to a bigger disk.
>
> 'larger' disk sounds better.

Done.

>> */
>> if (ret == GPT_MBR_PROTECTIVE) {
>> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
>> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
>> - ret = 0;
>> + pr_warn("%s: mbr size mismatch (%u != %u)\n", __func__,
>> + sz, (uint32_t)((uint32_t) total_sectors - 1));
>
> How about this instead?
> pr_debug("GPT: mbr size in lba (%d) different than whole disk (%d).\n", sz, min(total_sectors -1, 0xFFFFFFFF));

Done with modifications to avoid kernel compiler warnings (%u vs %d,
min_t vs min).

-Doug

2013-10-11 15:48:30

by Doug Anderson

[permalink] [raw]
Subject: [PATCH v2] partitions/efi: treat size mismatch as a warning, not an error

In (27a7c64 partitions/efi: account for pmbr size in lba) we started
treating bad sizes in lba field of the partition that has the 0xEE
(GPT protective) as errors. However, we may run into these "bad
sizes" in the real world if someone uses dd to copy an image from a
smaller disk to a bigger disk. Since this case used to work (even
without using force_gpt), keep it working and treat the size mismatch
as a warning instead of an error.

Reported-by: Josh Triplett <[email protected]>
Reported-by: Sean Paul <[email protected]>
Signed-off-by: Doug Anderson <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
---
Changes in v2:
- Cleaned up comments/warning as per Davidlohr.

block/partitions/efi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/partitions/efi.c b/block/partitions/efi.c
index 1eb09ee..a8287b4 100644
--- a/block/partitions/efi.c
+++ b/block/partitions/efi.c
@@ -222,11 +222,16 @@ check_hybrid:
* the disk size.
*
* Hybrid MBRs do not necessarily comply with this.
+ *
+ * Consider a bad value here to be a warning to support dd'ing
+ * an image from a smaller disk to a larger disk.
*/
if (ret == GPT_MBR_PROTECTIVE) {
sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
- ret = 0;
+ pr_debug("GPT: mbr size in lba (%u) different than whole disk (%u).\n",
+ sz, min_t(uint32_t,
+ total_sectors - 1, 0xFFFFFFFF));
}
done:
return ret;
--
1.8.4

2013-10-11 15:53:59

by Doug Anderson

[permalink] [raw]
Subject: Re: [PATCH] partitions/efi: treat size mismatch as a warning, not an error

Karel,

On Fri, Oct 11, 2013 at 2:51 AM, Karel Zak <[email protected]> wrote:
> On Thu, Oct 10, 2013 at 04:28:22PM -0700, Doug Anderson wrote:
>> + *
>> + * Consider a bad value here to be a warning to support dd-ing
>> + * an image from a smaller disk to a bigger disk.
>> */
>> if (ret == GPT_MBR_PROTECTIVE) {
>> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
>> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
>> - ret = 0;
>> + pr_warn("%s: mbr size mismatch (%u != %u)\n", __func__,
>> + sz, (uint32_t)((uint32_t) total_sectors - 1));
>> }
>
> I did the same change to util-linux v2.24 fdisk, so you can use the fdisk
> to update your protective MBR and boot without warnings.

Yup, I think we've come up with a number of ways that we can update
the partition table to avoid the problems. :) ...but the "feature"
that used to work was being able to dd an image from a smaller disk
onto a larger disk and have it "just boot" (though perhaps with some
warning messages). This feature was used in the Chrome OS workflow to
create bootable USB sticks / SD cards. We certainly could say that
what the Chrome OS workflow is doing is illegal / bad / unethical, but
it has been working for years and it would be nice for it to keep
working if possible. :)

http://xkcd.com/1172/

-Doug

2013-10-11 16:11:16

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH v2] partitions/efi: treat size mismatch as a warning, not an error

On Fri, 2013-10-11 at 08:47 -0700, Doug Anderson wrote:
> In (27a7c64 partitions/efi: account for pmbr size in lba) we started
> treating bad sizes in lba field of the partition that has the 0xEE
> (GPT protective) as errors. However, we may run into these "bad
> sizes" in the real world if someone uses dd to copy an image from a
> smaller disk to a bigger disk. Since this case used to work (even
> without using force_gpt), keep it working and treat the size mismatch
> as a warning instead of an error.
>
> Reported-by: Josh Triplett <[email protected]>
> Reported-by: Sean Paul <[email protected]>
> Signed-off-by: Doug Anderson <[email protected]>
> Reviewed-by: Josh Triplett <[email protected]>

Acked-by: Davidlohr Bueso <[email protected]>

Andrew, could you queue this up?

Thanks,
Davidlohr

> ---
> Changes in v2:
> - Cleaned up comments/warning as per Davidlohr.
>
> block/partitions/efi.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/block/partitions/efi.c b/block/partitions/efi.c
> index 1eb09ee..a8287b4 100644
> --- a/block/partitions/efi.c
> +++ b/block/partitions/efi.c
> @@ -222,11 +222,16 @@ check_hybrid:
> * the disk size.
> *
> * Hybrid MBRs do not necessarily comply with this.
> + *
> + * Consider a bad value here to be a warning to support dd'ing
> + * an image from a smaller disk to a larger disk.
> */
> if (ret == GPT_MBR_PROTECTIVE) {
> sz = le32_to_cpu(mbr->partition_record[part].size_in_lba);
> if (sz != (uint32_t) total_sectors - 1 && sz != 0xFFFFFFFF)
> - ret = 0;
> + pr_debug("GPT: mbr size in lba (%u) different than whole disk (%u).\n",
> + sz, min_t(uint32_t,
> + total_sectors - 1, 0xFFFFFFFF));
> }
> done:
> return ret;