2010-08-02 19:17:36

by Will Drewry

[permalink] [raw]
Subject: [PATCH RFC] efi: add and expose efi_partition_by_guid

EFI's GPT partitioning scheme expects that all partitions have a unique
identifiers. After initial partition scanning, this information is
completely lost to the rest of the kernel.

efi_partition_by_guid exposes GPT parsing support in a limited fashion
to allow other portions of the kernel to map a partition from GUID to
map.

An alternate implementation (and more generic) would be to expose a function
(efi_partition_walk) that iterates over the partition table providing access to
each partitions information to a callback, much like device class iterators.

The motivation for this change is the ability to have device mapper targets
resolve backing devices by GUID instead of specifically by partition number.
This could also be used in the init boot path (root=GUID) quite simply by
modeling scanning code on printk_all_partitions(), or with another patchset
allowing dm="" in the boot path: http://lkml.org/lkml/2010/6/7/418

[ Additional context: http://codereview.chromium.org/3026039/show ]

Would a change like this or something that exposes the GPT information via a
walking function be something that would be of any interest, or is it explicitly
against the current design/access goals with respect to partition information?

Any and all feedback is truly appreciated.

Signed-off-by: Will Drewry <[email protected]>
---
fs/partitions/efi.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/efi.h | 5 ++++
2 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index 9efb2cf..4f642c5 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -633,3 +633,64 @@ int efi_partition(struct parsed_partitions *state)
printk("\n");
return 1;
}
+
+/**
+ * efi_partition_by_guid
+ * @bdev: Whole block device to scan for a GPT.
+ * @guid: Unique identifier for the partition to find.
+ *
+ * N.b., returns on the first match since it should be unique.
+ *
+ * Returns:
+ * -1 if an error occurred
+ * 0 if there was no match (or not GPT)
+ * >=1 is the index of the partition found.
+ *
+ */
+int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {
+ gpt_header *gpt = NULL;
+ gpt_entry *ptes = NULL;
+ u32 i;
+ struct parsed_partitions *state;
+ int part = 0;
+
+ if (!bdev || !guid)
+ return -1;
+
+ state = kzalloc(sizeof(struct parsed_partitions), GFP_KERNEL);
+ if (!state)
+ return -1;
+
+ state->limit = disk_max_parts(bdev->bd_disk);
+ pr_debug(KERN_WARNING "efi_find_partition looking for gpt\n");
+
+ state->bdev = bdev;
+ if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
+ pr_debug(KERN_WARNING "efi_find_partition no GPT\n");
+ kfree(gpt);
+ kfree(ptes);
+ kfree(state);
+ return 0;
+ }
+
+ pr_debug("GUID Partition Table is valid! Yea!\n");
+ pr_debug(KERN_WARNING "efi_find_partition: 0 -> %d (limit:%d)\n",
+ le32_to_cpu(gpt->num_partition_entries),
+ state->limit);
+ for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) &&
+ i < state->limit-1; i++) {
+ if (!is_pte_valid(&ptes[i], last_lba(bdev)))
+ continue;
+
+ /* Bails on first hit so duped "unique" GUIDs will be FCFS. */
+ if (!efi_guidcmp(ptes[i].unique_partition_guid,
+ *guid)) {
+ part = i + 1;
+ break;
+ }
+ }
+ kfree(ptes);
+ kfree(gpt);
+ kfree(state);
+ return part;
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index fb737bc..1a77259 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -301,6 +301,11 @@ extern unsigned long efi_get_time(void);
extern int efi_set_rtc_mmss(unsigned long nowtime);
extern struct efi_memory_map memmap;

+#ifdef CONFIG_EFI_PARTITION
+struct block_device;
+extern int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid);
+#endif
+
/**
* efi_range_is_wc - check the WC bit on an address range
* @start: starting kvirt address
--
1.7.0.4


2010-08-02 23:00:40

by David Miller

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

From: Will Drewry <[email protected]>
Date: Mon, 2 Aug 2010 14:17:03 -0500

> +int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {

Please put the openning brace on a new line.

2010-08-03 02:44:34

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

On Mon, Aug 2, 2010 at 6:00 PM, David Miller <[email protected]> wrote:
> From: Will Drewry <[email protected]>
> Date: Mon, ?2 Aug 2010 14:17:03 -0500
>
>> +int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {
>
> Please put the openning brace on a new line.

Thanks! I'll post a new version with that fixed as well as the
alternative approach I mentioned.

cheers!
will

2010-08-03 02:49:07

by Will Drewry

[permalink] [raw]
Subject: [PATCH RFC (alt)] efi: add efi_partition_walk and expose for kernel access

efi_partition_walk provides a generic mechanism for iterating over a partition
table using a callback over the GPT entry data. However, it avoids exposing
too many internals while still providing less specialized access than the
previous patch.

The same questions apply - any and all feedback is truly appreciated.

I'll post an updated version of the original with feedback from Joe Perches and
David Miller shortly.

Signed-off-by: Will Drewry <[email protected]>
---
fs/partitions/efi.c | 123 ++++++++++++++++++++++++++++++++++++++-------------
include/linux/efi.h | 9 ++++
2 files changed, 101 insertions(+), 31 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index 9efb2cf..cfb52ac 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -580,56 +580,117 @@ static int find_valid_gpt(struct parsed_partitions *state, gpt_header **gpt,
}

/**
- * efi_partition(struct parsed_partitions *state)
- * @state
+ * efi_partition_walk
+ * @bdev: Whole block device to scan for a GPT.
+ * @walk_arg: Opaque data to pass to the walker
+ * @walk: Function to walk valid GPT entries
*
- * Description: called from check.c, if the disk contains GPT
- * partitions, sets up partition entries in the kernel.
+ * If walk() returns non-zero, then the walk will terminate early.
*
- * If the first block on the disk is a legacy MBR,
- * it will get handled by msdos_partition().
- * If it's a Protective MBR, we'll handle it here.
+ * We ignore the GPT entry so the first number refers to the first
+ * data partition.
*
- * We do not create a Linux partition for GPT, but
- * only for the actual data partitions.
* Returns:
- * -1 if unable to read the partition table
- * 0 if this isn't our partition table
- * 1 if successful
+ * -1 if an error occurred or unable to read
+ * 0 if there is no gpt entry
+ * 1 if traversal occurred (even if terminated early)
*
*/
-int efi_partition(struct parsed_partitions *state)
+int efi_partition_walk(struct block_device *bdev, void *walk_arg,
+ efi_partition_callback_t walk)
{
gpt_header *gpt = NULL;
gpt_entry *ptes = NULL;
+ struct parsed_partitions *state = NULL;
+ unsigned ssz = 0;
u32 i;
- unsigned ssz = bdev_logical_block_size(state->bdev) / 512;

- if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
+ if (!bdev || !walk)
+ return -1;
+
+ ssz = bdev_logical_block_size(bdev) / 512;
+
+ state = kzalloc(sizeof(struct parsed_partitions), GFP_KERNEL);
+ if (!state)
+ return -1;
+ state->limit = disk_max_parts(bdev->bd_disk);
+ state->bdev = bdev;
+
+ pr_debug(KERN_WARNING "efi_find_partition looking for gpt\n");
+ if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
+ pr_debug(KERN_WARNING "efi_find_partition no GPT\n");
kfree(gpt);
kfree(ptes);
+ kfree(state);
return 0;
}

- pr_debug("GUID Partition Table is valid! Yea!\n");
-
- for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) && i < state->limit-1; i++) {
- u64 start = le64_to_cpu(ptes[i].starting_lba);
- u64 size = le64_to_cpu(ptes[i].ending_lba) -
- le64_to_cpu(ptes[i].starting_lba) + 1ULL;
-
- if (!is_pte_valid(&ptes[i], last_lba(state->bdev)))
+ pr_debug(KERN_WARNING "efi_find_partition: 0 -> %d (limit:%d)\n",
+ le32_to_cpu(gpt->num_partition_entries),
+ state->limit);
+ for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) &&
+ i < state->limit-1; i++) {
+ /* Convert start and size to sectors from lbas */
+ sector_t start = le64_to_cpu(ptes[i].starting_lba) * ssz;
+ sector_t size = ssz * (le64_to_cpu(ptes[i].ending_lba) -
+ le64_to_cpu(ptes[i].starting_lba) + 1ULL);
+ if (!is_pte_valid(&ptes[i], last_lba(bdev)))
continue;
-
- put_partition(state, i+1, start * ssz, size * ssz);
-
- /* If this is a RAID volume, tell md */
- if (!efi_guidcmp(ptes[i].partition_type_guid,
- PARTITION_LINUX_RAID_GUID))
- state->parts[i + 1].flags = ADDPART_FLAG_RAID;
+ if (walk(i+1,
+ &ptes[i].partition_type_guid,
+ &ptes[i].unique_partition_guid,
+ start, size, walk_arg) != 0)
+ break;
}
kfree(ptes);
kfree(gpt);
- printk("\n");
+ kfree(state);
return 1;
}
+
+/**
+ * efi_partition_add
+ * @num: current partition number
+ * @entry: gpt_entry
+ * @arg: void * to a struct parsed_partition *
+ *
+ * Callback for efi_partition().
+ */
+static int efi_partition_add(int num, efi_guid_t *partition_type_guid,
+ efi_guid_t *unique_partition_guid, sector_t start, sector_t size,
+ void *arg)
+{
+ struct parsed_partitions *state = arg;
+
+ put_partition(state, num, start, size);
+
+ /* If this is a RAID volume, tell md */
+ if (!efi_guidcmp(*partition_type_guid, PARTITION_LINUX_RAID_GUID))
+ state->parts[num].flags = 1;
+ return 0;
+}
+
+
+/**
+ * efi_partition(struct parsed_partitions *state)
+ * @state
+ *
+ * Description: called from check.c, if the disk contains GPT
+ * partitions, sets up partition entries in the kernel.
+ *
+ * If the first block on the disk is a legacy MBR,
+ * it will get handled by msdos_partition().
+ * If it's a Protective MBR, we'll handle it here.
+ *
+ * We do not create a Linux partition for GPT, but
+ * only for the actual data partitions.
+ * Returns:
+ * -1 if unable to read the partition table
+ * 0 if this isn't our partition table
+ * 1 if successful
+ *
+ */
+int efi_partition(struct parsed_partitions *state)
+{
+ return efi_partition_walk(state->bdev, state, efi_partition_add);
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index fb737bc..15165bc 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -301,6 +301,15 @@ extern unsigned long efi_get_time(void);
extern int efi_set_rtc_mmss(unsigned long nowtime);
extern struct efi_memory_map memmap;

+#ifdef CONFIG_EFI_PARTITION
+typedef int (*efi_partition_callback_t) (int num,
+ efi_guid_t *partition_type_guid, efi_guid_t *unique_partition_guid,
+ sector_t start, sector_t size, void *arg);
+struct block_device;
+int efi_partition_walk(struct block_device *bdev, void *walk_arg,
+ efi_partition_callback_t walk);
+#endif
+
/**
* efi_range_is_wc - check the WC bit on an address range
* @start: starting kvirt address
--
1.7.0.4

2010-08-03 02:53:10

by Will Drewry

[permalink] [raw]
Subject: [PATCH v2 RFC] efi: add and expose efi_partition_by_guid

EFI's GPT partitioning scheme expects that all partitions have a unique
identifiers. After initial partition scanning, this information is
completely lost to the rest of the kernel.

efi_partition_by_guid exposes GPT parsing support in a limited fashion
to allow other portions of the kernel to map a partition from GUID to
map.

This change is motivated the desire to have the ability to specify a
UUID argument to a device mapper target allowing it to select the device
by the UUID.

Change against Linus's tree: 9fe6206f400646a2322096b56c59891d530e8d51

Signed-off-by: Will Drewry <[email protected]>

v2: pr_debug(KERN_WARNING -> pr_debug(. [email protected]
moved down trailing {. [email protected]
---
fs/partitions/efi.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/efi.h | 5 ++++
2 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index 9efb2cf..8669c4f 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -633,3 +633,64 @@ int efi_partition(struct parsed_partitions *state)
printk("\n");
return 1;
}
+
+/**
+ * efi_partition_by_guid
+ * @bdev: Whole block device to scan for a GPT.
+ * @guid: Unique identifier for the partition to find.
+ *
+ * N.b., returns on the first match since it should be unique.
+ *
+ * Returns:
+ * -1 if an error occurred
+ * 0 if there was no match (or not GPT)
+ * >=1 is the index of the partition found.
+ *
+ */
+int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {
+ gpt_header *gpt = NULL;
+ gpt_entry *ptes = NULL;
+ u32 i;
+ struct parsed_partitions *state;
+ int part = 0;
+
+ if (!bdev || !guid)
+ return -1;
+
+ state = kzalloc(sizeof(struct parsed_partitions), GFP_KERNEL);
+ if (!state)
+ return -1;
+
+ state->limit = disk_max_parts(bdev->bd_disk);
+ pr_debug("efi_find_partition looking for gpt\n");
+
+ state->bdev = bdev;
+ if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
+ pr_debug("efi_find_partition no GPT\n");
+ kfree(gpt);
+ kfree(ptes);
+ kfree(state);
+ return 0;
+ }
+
+ pr_debug("GUID Partition Table is valid! Yea!\n");
+ pr_debug("efi_find_partition: 0 -> %d (limit:%d)\n",
+ le32_to_cpu(gpt->num_partition_entries),
+ state->limit);
+ for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) &&
+ i < state->limit-1; i++) {
+ if (!is_pte_valid(&ptes[i], last_lba(bdev)))
+ continue;
+
+ /* Bails on first hit so duped "unique" GUIDs will be FCFS. */
+ if (!efi_guidcmp(ptes[i].unique_partition_guid,
+ *guid)) {
+ part = i + 1;
+ break;
+ }
+ }
+ kfree(ptes);
+ kfree(gpt);
+ kfree(state);
+ return part;
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index fb737bc..1a77259 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -301,6 +301,11 @@ extern unsigned long efi_get_time(void);
extern int efi_set_rtc_mmss(unsigned long nowtime);
extern struct efi_memory_map memmap;

+#ifdef CONFIG_EFI_PARTITION
+struct block_device;
+extern int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid);
+#endif
+
/**
* efi_range_is_wc - check the WC bit on an address range
* @start: starting kvirt address
--
1.7.0.4

2010-08-03 16:08:33

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

(cc'ing Kay and quoting whole message for him)

Kay, you were talking about using GUID in GPT for finding out root
device and so on. Does this fit your use case too? If not it would
be nice to find out something which can be shared.

On 08/02/2010 09:17 PM, Will Drewry wrote:
> EFI's GPT partitioning scheme expects that all partitions have a unique
> identifiers. After initial partition scanning, this information is
> completely lost to the rest of the kernel.
>
> efi_partition_by_guid exposes GPT parsing support in a limited fashion
> to allow other portions of the kernel to map a partition from GUID to
> map.
>
> An alternate implementation (and more generic) would be to expose a function
> (efi_partition_walk) that iterates over the partition table providing access to
> each partitions information to a callback, much like device class iterators.
>
> The motivation for this change is the ability to have device mapper targets
> resolve backing devices by GUID instead of specifically by partition number.
> This could also be used in the init boot path (root=GUID) quite simply by
> modeling scanning code on printk_all_partitions(), or with another patchset
> allowing dm="" in the boot path: http://lkml.org/lkml/2010/6/7/418
>
> [ Additional context: http://codereview.chromium.org/3026039/show ]
>
> Would a change like this or something that exposes the GPT information via a
> walking function be something that would be of any interest, or is it explicitly
> against the current design/access goals with respect to partition information?
>
> Any and all feedback is truly appreciated.
>
> Signed-off-by: Will Drewry <[email protected]>
> ---
> fs/partitions/efi.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/efi.h | 5 ++++
> 2 files changed, 66 insertions(+), 0 deletions(-)
>
> diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
> index 9efb2cf..4f642c5 100644
> --- a/fs/partitions/efi.c
> +++ b/fs/partitions/efi.c
> @@ -633,3 +633,64 @@ int efi_partition(struct parsed_partitions *state)
> printk("\n");
> return 1;
> }
> +
> +/**
> + * efi_partition_by_guid
> + * @bdev: Whole block device to scan for a GPT.
> + * @guid: Unique identifier for the partition to find.
> + *
> + * N.b., returns on the first match since it should be unique.
> + *
> + * Returns:
> + * -1 if an error occurred
> + * 0 if there was no match (or not GPT)
> + * >=1 is the index of the partition found.
> + *
> + */
> +int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {
> + gpt_header *gpt = NULL;
> + gpt_entry *ptes = NULL;
> + u32 i;
> + struct parsed_partitions *state;
> + int part = 0;
> +
> + if (!bdev || !guid)
> + return -1;
> +
> + state = kzalloc(sizeof(struct parsed_partitions), GFP_KERNEL);
> + if (!state)
> + return -1;
> +
> + state->limit = disk_max_parts(bdev->bd_disk);
> + pr_debug(KERN_WARNING "efi_find_partition looking for gpt\n");
> +
> + state->bdev = bdev;
> + if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
> + pr_debug(KERN_WARNING "efi_find_partition no GPT\n");
> + kfree(gpt);
> + kfree(ptes);
> + kfree(state);
> + return 0;
> + }
> +
> + pr_debug("GUID Partition Table is valid! Yea!\n");
> + pr_debug(KERN_WARNING "efi_find_partition: 0 -> %d (limit:%d)\n",
> + le32_to_cpu(gpt->num_partition_entries),
> + state->limit);
> + for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) &&
> + i < state->limit-1; i++) {
> + if (!is_pte_valid(&ptes[i], last_lba(bdev)))
> + continue;
> +
> + /* Bails on first hit so duped "unique" GUIDs will be FCFS. */
> + if (!efi_guidcmp(ptes[i].unique_partition_guid,
> + *guid)) {
> + part = i + 1;
> + break;
> + }
> + }
> + kfree(ptes);
> + kfree(gpt);
> + kfree(state);
> + return part;
> +}
> diff --git a/include/linux/efi.h b/include/linux/efi.h
> index fb737bc..1a77259 100644
> --- a/include/linux/efi.h
> +++ b/include/linux/efi.h
> @@ -301,6 +301,11 @@ extern unsigned long efi_get_time(void);
> extern int efi_set_rtc_mmss(unsigned long nowtime);
> extern struct efi_memory_map memmap;
>
> +#ifdef CONFIG_EFI_PARTITION
> +struct block_device;
> +extern int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid);
> +#endif
> +
> /**
> * efi_range_is_wc - check the WC bit on an address range
> * @start: starting kvirt address

--
tejun

2010-08-03 17:18:05

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

On Tue, Aug 3, 2010 at 18:08, Tejun Heo <[email protected]> wrote:
> On 08/02/2010 09:17 PM, Will Drewry wrote:
>> EFI's GPT partitioning scheme expects that all partitions have a unique
>> identifiers.  After initial partition scanning, this information is
>> completely lost to the rest of the kernel.
>>
>> efi_partition_by_guid exposes GPT parsing support in a limited fashion
>> to allow other portions of the kernel to map a partition from GUID to
>> map.

> Kay, you were talking about using GUID in GPT for finding out root
> device and so on. Does this fit your use case too? If not it would
> be nice to find out something which can be shared.

Yeah, we have something similar in mind since a while, to be able to
safely boot a box without an initramfs, and to be able to to specify
something like:
root=PARTUUID=6547567-575-7567-567567-57
root=PARTLABEL=foo
on the kernel commandline.

The current 'blkid' already reports stuff like, to have the same
information in userspace:
$ blkid -p -oudev /dev/sde1
ID_FS_LABEL=10GB
ID_FS_LABEL_ENC=10GB
ID_FS_UUID=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
ID_FS_UUID_ENC=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
ID_FS_VERSION=1.0
ID_FS_TYPE=ext4
ID_FS_USAGE=filesystem
ID_PART_ENTRY_SCHEME=gpt
ID_PART_ENTRY_UUID=1f765dcb-5214-bd47-b1c5-f2f18848335e
ID_PART_ENTRY_TYPE=a2a0d0eb-e5b9-3344-87c0-68b6b72699c7
ID_PART_ENTRY_NUMBER=1

I guess we want to store these identifiers directly into the partition
structure, independent of the partition format, so any code can
register a callback for a new block device, and can just check if
that's the device in question. Walking the block devices would just be
something usual provided by the driver core, instead of having some
specific EFI walk functions.

Kay

2010-08-03 17:55:14

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

On Tue, Aug 3, 2010 at 12:17 PM, Kay Sievers <[email protected]> wrote:
> On Tue, Aug 3, 2010 at 18:08, Tejun Heo <[email protected]> wrote:
>> On 08/02/2010 09:17 PM, Will Drewry wrote:
>>> EFI's GPT partitioning scheme expects that all partitions have a unique
>>> identifiers. ?After initial partition scanning, this information is
>>> completely lost to the rest of the kernel.
>>>
>>> efi_partition_by_guid exposes GPT parsing support in a limited fashion
>>> to allow other portions of the kernel to map a partition from GUID to
>>> map.
>
>> Kay, you were talking about using GUID in GPT for finding out root
>> device and so on. ?Does this fit your use case too? ?If not it would
>> be nice to find out something which can be shared.
>
> Yeah, we have something similar in mind since a while, to be able to
> safely boot a box without an initramfs, and to be able to to specify
> something like:
> ?root=PARTUUID=6547567-575-7567-567567-57
> ?root=PARTLABEL=foo
> on the kernel commandline.

Cool. So I'd like this as well (at least the UUID part), and I'd like
this to be available for other consumers in the kernel, like
dm_get_device() or at least for mapped device targets to implement
support for themselves. (I have a separate patch for
mimicking md= for device mapper devices which I should probably post
to the lists again soon.)

> The current 'blkid' already reports stuff like, to have the same
> information in userspace:
> ?$ blkid -p -oudev /dev/sde1
> ?ID_FS_LABEL=10GB
> ?ID_FS_LABEL_ENC=10GB
> ?ID_FS_UUID=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
> ?ID_FS_UUID_ENC=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
> ?ID_FS_VERSION=1.0
> ?ID_FS_TYPE=ext4
> ?ID_FS_USAGE=filesystem
> ?ID_PART_ENTRY_SCHEME=gpt
> ?ID_PART_ENTRY_UUID=1f765dcb-5214-bd47-b1c5-f2f18848335e
> ?ID_PART_ENTRY_TYPE=a2a0d0eb-e5b9-3344-87c0-68b6b72699c7
> ?ID_PART_ENTRY_NUMBER=1
>
> I guess we want to store these identifiers directly into the partition
> structure, independent of the partition format, so any code can
> register a callback for a new block device, and can just check if
> that's the device in question. Walking the block devices would just be
> something usual provided by the driver core, instead of having some
> specific EFI walk functions.

Yeah - when I use this function, I end up doing a walk over all the
block devices, checking if they are whole disk entries, then calling
the efi_partition_by_guid() function. (Or the walker which I posted
separately.) It's not ideal but it has the smallest impact on the
existing code. (Not having disk_type available is irritating though.)

Would the type GUID and unique GUID be viable additions to a more
public struct? If they were CONFIG_EFI_PARTITION guarded, then they
wouldn't waste memory for systems without GPT support, but it seems a
bit specific. Also, I don't think it'd make sense to put it in the
partition struct as that represents the on-disk format for some tables
(from a quick scan over the code). However, hd_struct looks the
sanest to me.

I'd be happy to pull together a potential change that exposes this
data once after disk (re)scan, but I'd hate to do so in a way that'd
be fundamentally unacceptable (but I don't want to end up down the
deep hole of adding support across all the part tables either if I can
:).

So I could see something like:

struct hd_struct {
...
#ifdef CONFIG_EFI_PARTITION
efi_guid_t type_guid;
efi_guid_t uuid;
u16 label[72 / ...];
};

Alternatively, a slightly more generic option might be:

#ifdef CONFIG_PARTITION_INFO
/* ASCII hex-formatted uuids inclusive of hyphens */
u8 type_guid[MAX_HD_STRUCT_UUID_SIZE];
u8 uuid[MAX_HD_STRUCT_UUID_SIZE];
u16 label[MAX_HD_STRUCT_NAME + sizeof(u16)];
#endif


Any way, if any of this seems slightly palatable, let me know. I'd
love to make this data accessible to the rest of the kernel.

thanks!
will

2010-08-03 18:23:33

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

On Tue, Aug 3, 2010 at 19:55, Will Drewry <[email protected]> wrote:
> On Tue, Aug 3, 2010 at 12:17 PM, Kay Sievers <[email protected]> wrote:
>> On Tue, Aug 3, 2010 at 18:08, Tejun Heo <[email protected]> wrote:
>>> On 08/02/2010 09:17 PM, Will Drewry wrote:
>>>> EFI's GPT partitioning scheme expects that all partitions have a unique
>>>> identifiers.  After initial partition scanning, this information is
>>>> completely lost to the rest of the kernel.
>>>>
>>>> efi_partition_by_guid exposes GPT parsing support in a limited fashion
>>>> to allow other portions of the kernel to map a partition from GUID to
>>>> map.
>>
>>> Kay, you were talking about using GUID in GPT for finding out root
>>> device and so on.  Does this fit your use case too?  If not it would
>>> be nice to find out something which can be shared.
>>
>> Yeah, we have something similar in mind since a while, to be able to
>> safely boot a box without an initramfs, and to be able to to specify
>> something like:
>>  root=PARTUUID=6547567-575-7567-567567-57
>>  root=PARTLABEL=foo
>> on the kernel commandline.
>
> Cool. So I'd like this as well (at least the UUID part), and I'd like
> this to be available for other consumers in the kernel, like
> dm_get_device() or at least for mapped device targets to implement
> support for themselves.  (I have a separate patch for
> mimicking md= for device mapper devices which I should probably post
> to the lists again soon.)
>
>> The current 'blkid' already reports stuff like, to have the same
>> information in userspace:
>>  $ blkid -p -oudev /dev/sde1
>>  ID_FS_LABEL=10GB
>>  ID_FS_LABEL_ENC=10GB
>>  ID_FS_UUID=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
>>  ID_FS_UUID_ENC=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
>>  ID_FS_VERSION=1.0
>>  ID_FS_TYPE=ext4
>>  ID_FS_USAGE=filesystem
>>  ID_PART_ENTRY_SCHEME=gpt
>>  ID_PART_ENTRY_UUID=1f765dcb-5214-bd47-b1c5-f2f18848335e
>>  ID_PART_ENTRY_TYPE=a2a0d0eb-e5b9-3344-87c0-68b6b72699c7
>>  ID_PART_ENTRY_NUMBER=1
>>
>> I guess we want to store these identifiers directly into the partition
>> structure, independent of the partition format, so any code can
>> register a callback for a new block device, and can just check if
>> that's the device in question. Walking the block devices would just be
>> something usual provided by the driver core, instead of having some
>> specific EFI walk functions.
>
> Yeah - when I use this function, I end up doing a walk over all the
> block devices, checking if they are whole disk entries, then calling
> the efi_partition_by_guid() function. (Or the walker which I posted
> separately.)  It's not ideal but it has the smallest impact on the
> existing code. (Not having disk_type available is irritating though.)
>
> Would the type GUID and unique GUID be viable additions to a more
> public struct?  If they were CONFIG_EFI_PARTITION guarded, then they
> wouldn't waste memory for systems without GPT support, but it seems a
> bit specific.  Also, I don't think it'd make sense to put it in the
> partition struct as that represents the on-disk format for some tables
> (from a quick scan over the code).  However, hd_struct looks the
> sanest to me.
>
> I'd be happy to pull together a potential change that exposes this
> data once after disk (re)scan, but I'd hate to do so in a way that'd
> be fundamentally unacceptable (but I don't want to end up down the
> deep hole of adding support across all the part tables either if I can
> :).
>
> So I could see something like:
>
> struct hd_struct {
> ...
> #ifdef CONFIG_EFI_PARTITION
>  efi_guid_t type_guid;
>  efi_guid_t uuid;
>  u16 label[72 / ...];
> };
>
> Alternatively, a slightly more generic option might be:
>
> #ifdef CONFIG_PARTITION_INFO
>  /* ASCII hex-formatted uuids inclusive of hyphens */
>  u8 type_guid[MAX_HD_STRUCT_UUID_SIZE];
>  u8 uuid[MAX_HD_STRUCT_UUID_SIZE];
>  u16 label[MAX_HD_STRUCT_NAME + sizeof(u16)];
> #endif
>
>
> Any way, if any of this seems slightly palatable, let me know.  I'd
> love to make this data accessible to the rest of the kernel.

Maybe we go for a single pointer in the partition device, and allocate
a struct partition_meta_info, or something like this, if we have such
data to store. In that structure we can add all needed fields we need?
That would not really waste anything if it's not needed, or it can
possibly be free()d later, if nothing needs it anymore.

Kay

2010-08-03 18:51:47

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH v2 RFC] efi: add and expose efi_partition_by_guid

On Mon, 2 Aug 2010 21:52:46 -0500 Will Drewry wrote:


> v2: pr_debug(KERN_WARNING -> pr_debug(. [email protected]
> moved down trailing {. [email protected]

Eh? see below.

> ---
> fs/partitions/efi.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/efi.h | 5 ++++
> 2 files changed, 66 insertions(+), 0 deletions(-)
>
> diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
> index 9efb2cf..8669c4f 100644
> --- a/fs/partitions/efi.c
> +++ b/fs/partitions/efi.c
> @@ -633,3 +633,64 @@ int efi_partition(struct parsed_partitions *state)
> printk("\n");
> return 1;
> }
> +
> +/**
> + * efi_partition_by_guid
> + * @bdev: Whole block device to scan for a GPT.
> + * @guid: Unique identifier for the partition to find.
> + *
> + * N.b., returns on the first match since it should be unique.
> + *
> + * Returns:
> + * -1 if an error occurred
> + * 0 if there was no match (or not GPT)
> + * >=1 is the index of the partition found.
> + *
> + */
> +int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {

That opening brace should be on a line by itself:

int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid)
{

and the kernel-doc function description should be like so:

/**
* efi_partition_by_guid - find the first (only) matching guid on a block device

or some such short function description.



> + gpt_header *gpt = NULL;
> + gpt_entry *ptes = NULL;
> + u32 i;
> + struct parsed_partitions *state;
> + int part = 0;



---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2010-08-03 18:52:22

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v2 RFC] efi: add and expose efi_partition_by_guid

On Tue, Aug 3, 2010 at 1:50 PM, Randy Dunlap <[email protected]> wrote:
> On Mon, ?2 Aug 2010 21:52:46 -0500 Will Drewry wrote:
>
>
>> v2: pr_debug(KERN_WARNING -> pr_debug(. [email protected]
>> ? ? moved down trailing {. [email protected]
>
> Eh? ?see below.
>
>> ---
>> ?fs/partitions/efi.c | ? 61 +++++++++++++++++++++++++++++++++++++++++++++++++++
>> ?include/linux/efi.h | ? ?5 ++++
>> ?2 files changed, 66 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
>> index 9efb2cf..8669c4f 100644
>> --- a/fs/partitions/efi.c
>> +++ b/fs/partitions/efi.c
>> @@ -633,3 +633,64 @@ int efi_partition(struct parsed_partitions *state)
>> ? ? ? printk("\n");
>> ? ? ? return 1;
>> ?}
>> +
>> +/**
>> + * efi_partition_by_guid
>> + * @bdev: ? ?Whole block device to scan for a GPT.
>> + * @guid: ? ?Unique identifier for the partition to find.
>> + *
>> + * N.b., returns on the first match since it should be unique.
>> + *
>> + * Returns:
>> + * -1 if an error occurred
>> + * ?0 if there was no match (or not GPT)
>> + * ?>=1 is the index of the partition found.
>> + *
>> + */
>> +int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid) {
>
> That opening brace should be on a line by itself:
>
> int efi_partition_by_guid(struct block_device *bdev, efi_guid_t *guid)
> {

Thanks - yeah I failed to git commit --amend properly on the second posting :/

> and the kernel-doc function description should be like so:
>
> /**
> ?* efi_partition_by_guid - find the first (only) matching guid on a block device
>
> or some such short function description.

I'll be sure to use the proper format on the next patch round!

thanks!

2010-08-03 18:52:41

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH RFC] efi: add and expose efi_partition_by_guid

On Tue, Aug 3, 2010 at 1:23 PM, Kay Sievers <[email protected]> wrote:
> On Tue, Aug 3, 2010 at 19:55, Will Drewry <[email protected]> wrote:
>> On Tue, Aug 3, 2010 at 12:17 PM, Kay Sievers <[email protected]> wrote:
>>> On Tue, Aug 3, 2010 at 18:08, Tejun Heo <[email protected]> wrote:
>>>> On 08/02/2010 09:17 PM, Will Drewry wrote:
>>>>> EFI's GPT partitioning scheme expects that all partitions have a unique
>>>>> identifiers. ?After initial partition scanning, this information is
>>>>> completely lost to the rest of the kernel.
>>>>>
>>>>> efi_partition_by_guid exposes GPT parsing support in a limited fashion
>>>>> to allow other portions of the kernel to map a partition from GUID to
>>>>> map.
>>>
>>>> Kay, you were talking about using GUID in GPT for finding out root
>>>> device and so on. ?Does this fit your use case too? ?If not it would
>>>> be nice to find out something which can be shared.
>>>
>>> Yeah, we have something similar in mind since a while, to be able to
>>> safely boot a box without an initramfs, and to be able to to specify
>>> something like:
>>> ?root=PARTUUID=6547567-575-7567-567567-57
>>> ?root=PARTLABEL=foo
>>> on the kernel commandline.
>>
>> Cool. So I'd like this as well (at least the UUID part), and I'd like
>> this to be available for other consumers in the kernel, like
>> dm_get_device() or at least for mapped device targets to implement
>> support for themselves. ?(I have a separate patch for
>> mimicking md= for device mapper devices which I should probably post
>> to the lists again soon.)
>>
>>> The current 'blkid' already reports stuff like, to have the same
>>> information in userspace:
>>> ?$ blkid -p -oudev /dev/sde1
>>> ?ID_FS_LABEL=10GB
>>> ?ID_FS_LABEL_ENC=10GB
>>> ?ID_FS_UUID=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
>>> ?ID_FS_UUID_ENC=5aafa1bb-70a7-4fe6-b93f-30658ec99fac
>>> ?ID_FS_VERSION=1.0
>>> ?ID_FS_TYPE=ext4
>>> ?ID_FS_USAGE=filesystem
>>> ?ID_PART_ENTRY_SCHEME=gpt
>>> ?ID_PART_ENTRY_UUID=1f765dcb-5214-bd47-b1c5-f2f18848335e
>>> ?ID_PART_ENTRY_TYPE=a2a0d0eb-e5b9-3344-87c0-68b6b72699c7
>>> ?ID_PART_ENTRY_NUMBER=1
>>>
>>> I guess we want to store these identifiers directly into the partition
>>> structure, independent of the partition format, so any code can
>>> register a callback for a new block device, and can just check if
>>> that's the device in question. Walking the block devices would just be
>>> something usual provided by the driver core, instead of having some
>>> specific EFI walk functions.
>>
>> Yeah - when I use this function, I end up doing a walk over all the
>> block devices, checking if they are whole disk entries, then calling
>> the efi_partition_by_guid() function. (Or the walker which I posted
>> separately.) ?It's not ideal but it has the smallest impact on the
>> existing code. (Not having disk_type available is irritating though.)
>>
>> Would the type GUID and unique GUID be viable additions to a more
>> public struct? ?If they were CONFIG_EFI_PARTITION guarded, then they
>> wouldn't waste memory for systems without GPT support, but it seems a
>> bit specific. ?Also, I don't think it'd make sense to put it in the
>> partition struct as that represents the on-disk format for some tables
>> (from a quick scan over the code). ?However, hd_struct looks the
>> sanest to me.
>>
>> I'd be happy to pull together a potential change that exposes this
>> data once after disk (re)scan, but I'd hate to do so in a way that'd
>> be fundamentally unacceptable (but I don't want to end up down the
>> deep hole of adding support across all the part tables either if I can
>> :).
>>
>> So I could see something like:
>>
>> struct hd_struct {
>> ...
>> #ifdef CONFIG_EFI_PARTITION
>> ?efi_guid_t type_guid;
>> ?efi_guid_t uuid;
>> ?u16 label[72 / ...];
>> };
>>
>> Alternatively, a slightly more generic option might be:
>>
>> #ifdef CONFIG_PARTITION_INFO
>> ?/* ASCII hex-formatted uuids inclusive of hyphens */
>> ?u8 type_guid[MAX_HD_STRUCT_UUID_SIZE];
>> ?u8 uuid[MAX_HD_STRUCT_UUID_SIZE];
>> ?u16 label[MAX_HD_STRUCT_NAME + sizeof(u16)];
>> #endif
>>
>>
>> Any way, if any of this seems slightly palatable, let me know. ?I'd
>> love to make this data accessible to the rest of the kernel.
>
> Maybe we go for a single pointer in the partition device, and allocate
> a struct partition_meta_info, or something like this, if we have such
> data to store. In that structure we can add all needed fields we need?
> That would not really waste anything if it's not needed, or it can
> possibly be free()d later, if nothing needs it anymore.

Sounds reasonable to me. I'll see what I can cook up and post it back
to this thread.

cheers!
will

2010-08-03 21:36:31

by Will Drewry

[permalink] [raw]
Subject: [PATCH 1/2] block, partition: add partition_meta_info to hd_struct

This changes adds a partition_meta_info struct which itself contains a
union of structures that provide partition table specific metadata.

This change leaves the union empty. The subsequent patch includes an
implementation for CONFIG_EFI_PARTITION-based metadata.

Signed-off-by: Will Drewry <[email protected]>
---
block/genhd.c | 1 +
block/ioctl.c | 2 +-
fs/partitions/check.c | 21 ++++++++++++++++++---
fs/partitions/check.h | 2 ++
include/linux/genhd.h | 32 ++++++++++++++++++++++++++++++--
5 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 59a2db6..c8da120 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1004,6 +1004,7 @@ static void disk_release(struct device *dev)
kfree(disk->random);
disk_replace_part_tbl(disk, NULL);
free_part_stats(&disk->part0);
+ free_part_info(&disk->part0);
kfree(disk);
}
struct class block_class = {
diff --git a/block/ioctl.c b/block/ioctl.c
index e8eb679..f8e4bfe 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -62,7 +62,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user

/* all seems OK */
part = add_partition(disk, partno, start, length,
- ADDPART_FLAG_NONE);
+ ADDPART_FLAG_NONE, NULL);
mutex_unlock(&bdev->bd_mutex);
return IS_ERR(part) ? PTR_ERR(part) : 0;
case BLKPG_DEL_PARTITION:
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 5dcd4b0..635725a 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -196,6 +196,8 @@ check_partition(struct gendisk *hd, struct block_device *bdev)
printk(" unknown partition table\n");
else if (warn_no_part)
printk(" unable to read partition table\n");
+ for (i = 0; i < DISK_MAX_PARTS; ++i)
+ kfree(state->parts[i].info);
kfree(state);
return ERR_PTR(res);
}
@@ -338,6 +340,7 @@ static void part_release(struct device *dev)
{
struct hd_struct *p = dev_to_part(dev);
free_part_stats(p);
+ free_part_info(p);
kfree(p);
}

@@ -387,7 +390,8 @@ static DEVICE_ATTR(whole_disk, S_IRUSR | S_IRGRP | S_IROTH,
whole_disk_show, NULL);

struct hd_struct *add_partition(struct gendisk *disk, int partno,
- sector_t start, sector_t len, int flags)
+ sector_t start, sector_t len, int flags,
+ struct partition_meta_info *info)
{
struct hd_struct *p;
dev_t devt = MKDEV(0, 0);
@@ -424,6 +428,13 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
p->partno = partno;
p->policy = get_disk_ro(disk);

+ if (info) {
+ p->info = alloc_part_info(disk);
+ if (!p->info)
+ goto out_free_stats;
+ memcpy(p->info, info, sizeof(*info));
+ }
+
dname = dev_name(ddev);
if (isdigit(dname[strlen(dname) - 1]))
dev_set_name(pdev, "%sp%d", dname, partno);
@@ -437,7 +448,7 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

err = blk_alloc_devt(p, &devt);
if (err)
- goto out_free_stats;
+ goto out_free_info;
pdev->devt = devt;

/* delay uevent until 'holders' subdir is created */
@@ -468,6 +479,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

return p;

+out_free_info:
+ free_part_info(p);
out_free_stats:
free_part_stats(p);
out_free:
@@ -663,7 +676,9 @@ rescan:
}
}
part = add_partition(disk, p, from, size,
- state->parts[p].flags);
+ state->parts[p].flags,
+ state->parts[p].info);
+ kfree(state->parts[p].info);
if (IS_ERR(part)) {
printk(KERN_ERR " %s: p%d could not be added: %ld\n",
disk->disk_name, p, -PTR_ERR(part));
diff --git a/fs/partitions/check.h b/fs/partitions/check.h
index 52f8bd3..8c724a7 100644
--- a/fs/partitions/check.h
+++ b/fs/partitions/check.h
@@ -1,6 +1,7 @@
#include <linux/pagemap.h>
#include <linux/blkdev.h>

+struct partition_meta_info;
/*
* add_gd_partition adds a partitions details to the devices partition
* description.
@@ -12,6 +13,7 @@ struct parsed_partitions {
sector_t from;
sector_t size;
int flags;
+ struct partition_meta_info *info;
} parts[DISK_MAX_PARTS];
int next;
int limit;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5f2f4c4..7b6644a 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -12,6 +12,7 @@
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/rcupdate.h>
+#include <linux/slab.h>

#ifdef CONFIG_BLOCK

@@ -86,7 +87,18 @@ struct disk_stats {
unsigned long io_ticks;
unsigned long time_in_queue;
};
-
+
+enum partition_meta_info_format_t {
+ /* Partition info format */
+ PARTITION_META_INFO_FORMAT_NONE = 0,
+};
+
+struct partition_meta_info {
+ enum partition_meta_info_format_t format;
+ union {
+ };
+};
+
struct hd_struct {
sector_t start_sect;
sector_t nr_sects;
@@ -95,6 +107,7 @@ struct hd_struct {
struct device __dev;
struct kobject *holder_dir;
int policy, partno;
+ struct partition_meta_info *info;
#ifdef CONFIG_FAIL_MAKE_REQUEST
int make_it_fail;
#endif
@@ -342,6 +355,19 @@ static inline int part_in_flight(struct hd_struct *part)
return part->in_flight[0] + part->in_flight[1];
}

+static inline struct partition_meta_info *alloc_part_info(struct gendisk *disk)
+{
+ if (disk)
+ return kzalloc_node(sizeof(struct partition_meta_info),
+ GFP_KERNEL, disk->node_id);
+ return kzalloc(sizeof(struct partition_meta_info), GFP_KERNEL);
+}
+
+static inline void free_part_info(struct hd_struct *part)
+{
+ kfree(part->info);
+}
+
/* block/blk-core.c */
extern void part_round_stats(int cpu, struct hd_struct *part);

@@ -533,7 +559,9 @@ extern int disk_expand_part_tbl(struct gendisk *disk, int target);
extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev);
extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
int partno, sector_t start,
- sector_t len, int flags);
+ sector_t len, int flags,
+ struct partition_meta_info
+ *info);
extern void delete_partition(struct gendisk *, int);
extern void printk_all_partitions(void);

--
1.7.0.4

2010-08-03 21:36:33

by Will Drewry

[permalink] [raw]
Subject: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

This change extends the partition_meta_info structure to
support EFI GPT-specific metadata and ensures that data
is copied in on partition scanning.

Adding this information would make it possible to identify a
partition by GUID using something like disk_part_iter_*(),
calls that make hd_struct accessible, or even class_find_device.

Signed-off-by: Will Drewry <[email protected]>
---
fs/partitions/efi.c | 16 ++++++++++++++++
include/linux/genhd.h | 12 ++++++++++++
2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index 9efb2cf..2880b33 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -614,6 +614,7 @@ int efi_partition(struct parsed_partitions *state)
pr_debug("GUID Partition Table is valid! Yea!\n");

for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) && i < state->limit-1; i++) {
+ struct partition_meta_info *info;
u64 start = le64_to_cpu(ptes[i].starting_lba);
u64 size = le64_to_cpu(ptes[i].ending_lba) -
le64_to_cpu(ptes[i].starting_lba) + 1ULL;
@@ -627,6 +628,21 @@ int efi_partition(struct parsed_partitions *state)
if (!efi_guidcmp(ptes[i].partition_type_guid,
PARTITION_LINUX_RAID_GUID))
state->parts[i + 1].flags = ADDPART_FLAG_RAID;
+
+ info = alloc_part_info(NULL);
+ if (!info) {
+ printk(KERN_WARNING
+ "unable to allocate memory for part->info\n");
+ continue;
+ }
+ state->parts[i + 1].info = info;
+ info->format = PARTITION_META_INFO_FORMAT_EFI;
+ memcpy(info->efi.uuid.b, ptes[i].unique_partition_guid.b,
+ sizeof(info->efi.uuid.b));
+ memcpy(info->efi.type.b, ptes[i].partition_type_guid.b,
+ sizeof(info->efi.type.b));
+ memcpy(info->efi.label, ptes[i].partition_name,
+ sizeof(info->efi.label));
}
kfree(ptes);
kfree(gpt);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 7b6644a..e0a742f 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -91,11 +91,23 @@ struct disk_stats {
enum partition_meta_info_format_t {
/* Partition info format */
PARTITION_META_INFO_FORMAT_NONE = 0,
+ PARTITION_META_INFO_FORMAT_EFI = 1,
};

+#ifdef CONFIG_EFI_PARTITION
+#include <linux/efi.h>
+#endif
+
struct partition_meta_info {
enum partition_meta_info_format_t format;
union {
+#ifdef CONFIG_EFI_PARTITION
+ struct {
+ efi_guid_t uuid;
+ efi_guid_t type;
+ efi_char16_t label[72 / sizeof(efi_char16_t)];
+ } efi;
+#endif
};
};

--
1.7.0.4

2010-08-03 21:54:50

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

On Tue, Aug 3, 2010 at 23:35, Will Drewry <[email protected]> wrote:
> This change extends the partition_meta_info structure to
> support EFI GPT-specific metadata and ensures that data
> is copied in on partition scanning.
>
> Adding this information would make it possible to identify a
> partition by GUID using something like disk_part_iter_*(),
> calls that make hd_struct accessible, or even class_find_device.

Wow, you are fast. :) Sounds and looks good to me.

I guess we should assign the meta structure only after all values are
filled in? Otherwise we could get partial reads from a possible user?

Did you already test to put a lookup-call to the in-kernel mounter, if
we use some special partition table uuid identifier for root=? That
would be nice to see, if all that works as expected, and we can get to
the data we collect.

Kay

2010-08-03 22:27:07

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

On Tue, Aug 3, 2010 at 4:54 PM, Kay Sievers <[email protected]> wrote:
> On Tue, Aug 3, 2010 at 23:35, Will Drewry <[email protected]> wrote:
>> This change extends the partition_meta_info structure to
>> support EFI GPT-specific metadata and ensures that data
>> is copied in on partition scanning.
>>
>> Adding this information would make it possible to identify a
>> partition by GUID using something like disk_part_iter_*(),
>> calls that make hd_struct accessible, or even class_find_device.
>
> Wow, you are fast. :) Sounds and looks good to me.

Thanks!

> I guess we should assign the meta structure only after all values are
> filled in? Otherwise we could get partial reads from a possible user?

In add_partition(), I don't think the partition itself is shared until
rcu_assign_pointer is called, and I don't think that parsed_partitions
is safe to be shared inside check.c, but perhaps that's a dangerous
assumption

I can certainly move the pointer assignment to occur after the data is
copied over since it shouldn't hurt anything.

> Did you already test to put a lookup-call to the in-kernel mounter, if
> we use some special partition table uuid identifier for root=? That
> would be nice to see, if all that works as expected, and we can get to
> the data we collect.

Not yet, but I'd like to have it working there (and in my
device-mapper target) as soon as possible. Hopefully, I'll have that
done pretty soon and I'll repost the series inclusive of an init:
change.

Any preferences on the variable? I'll start with your example of
PARTUUID=, but that follows the initramfs model (UUID=) and not the
existing magic root devices (/dev/nfs, /dev/ram).
/dev/by-part-uuid/XXX... doesn't seem super-friendly though :)

cheers -
will

2010-08-03 23:13:41

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 00:27, Will Drewry <[email protected]> wrote:
> On Tue, Aug 3, 2010 at 4:54 PM, Kay Sievers <[email protected]> wrote:

>> Did you already test to put a lookup-call to the in-kernel mounter, if
>> we use some special partition table uuid identifier for root=? That
>> would be nice to see, if all that works as expected, and we can get to
>> the data we collect.
>
> Not yet, but I'd like to have it working there (and in my
> device-mapper target) as soon as possible.  Hopefully, I'll have that
> done pretty soon and I'll repost the series inclusive of an init:
> change.

Nice.

> Any preferences on the variable? I'll start with your example of
> PARTUUID=, but that follows the initramfs model (UUID=) and not the
> existing magic root devices (/dev/nfs, /dev/ram).
> /dev/by-part-uuid/XXX... doesn't seem super-friendly though :)

Yeah, this stuff does not really fit in to the path notation stuff,
unless we use the udev-style link names, which some people like to
avoid, and prefer some more abstract thing here.

We have UUID=, LABEL= for mount and fstab, I think PARTUUID=,
PARTLABEL= matches this.

We can always change that, if something better comes up before this
gets official. :)

Kay

2010-08-04 02:06:21

by Will Drewry

[permalink] [raw]
Subject: [PATCH v2 1/3] block, partition: add partition_meta_info to hd_struct

This changes adds a partition_meta_info struct which itself contains a
union of structures that provide partition table specific metadata.

This change leaves the union empty. The subsequent patch includes an
implementation for CONFIG_EFI_PARTITION-based metadata.

Signed-off-by: Will Drewry <[email protected]>

v2: move assignment of p->info after copy
---
block/genhd.c | 1 +
block/ioctl.c | 2 +-
fs/partitions/check.c | 22 +++++++++++++++++++---
fs/partitions/check.h | 2 ++
include/linux/genhd.h | 32 ++++++++++++++++++++++++++++++--
5 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 59a2db6..c8da120 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1004,6 +1004,7 @@ static void disk_release(struct device *dev)
kfree(disk->random);
disk_replace_part_tbl(disk, NULL);
free_part_stats(&disk->part0);
+ free_part_info(&disk->part0);
kfree(disk);
}
struct class block_class = {
diff --git a/block/ioctl.c b/block/ioctl.c
index e8eb679..f8e4bfe 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -62,7 +62,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user

/* all seems OK */
part = add_partition(disk, partno, start, length,
- ADDPART_FLAG_NONE);
+ ADDPART_FLAG_NONE, NULL);
mutex_unlock(&bdev->bd_mutex);
return IS_ERR(part) ? PTR_ERR(part) : 0;
case BLKPG_DEL_PARTITION:
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 5dcd4b0..b616da8 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -196,6 +196,8 @@ check_partition(struct gendisk *hd, struct block_device *bdev)
printk(" unknown partition table\n");
else if (warn_no_part)
printk(" unable to read partition table\n");
+ for (i = 0; i < DISK_MAX_PARTS; ++i)
+ kfree(state->parts[i].info);
kfree(state);
return ERR_PTR(res);
}
@@ -338,6 +340,7 @@ static void part_release(struct device *dev)
{
struct hd_struct *p = dev_to_part(dev);
free_part_stats(p);
+ free_part_info(p);
kfree(p);
}

@@ -387,7 +390,8 @@ static DEVICE_ATTR(whole_disk, S_IRUSR | S_IRGRP | S_IROTH,
whole_disk_show, NULL);

struct hd_struct *add_partition(struct gendisk *disk, int partno,
- sector_t start, sector_t len, int flags)
+ sector_t start, sector_t len, int flags,
+ struct partition_meta_info *info)
{
struct hd_struct *p;
dev_t devt = MKDEV(0, 0);
@@ -424,6 +428,14 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
p->partno = partno;
p->policy = get_disk_ro(disk);

+ if (info) {
+ struct partition_meta_info *pinfo = alloc_part_info(disk);
+ if (!pinfo)
+ goto out_free_stats;
+ memcpy(pinfo, info, sizeof(*info));
+ p->info = pinfo;
+ }
+
dname = dev_name(ddev);
if (isdigit(dname[strlen(dname) - 1]))
dev_set_name(pdev, "%sp%d", dname, partno);
@@ -437,7 +449,7 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

err = blk_alloc_devt(p, &devt);
if (err)
- goto out_free_stats;
+ goto out_free_info;
pdev->devt = devt;

/* delay uevent until 'holders' subdir is created */
@@ -468,6 +480,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

return p;

+out_free_info:
+ free_part_info(p);
out_free_stats:
free_part_stats(p);
out_free:
@@ -663,7 +677,9 @@ rescan:
}
}
part = add_partition(disk, p, from, size,
- state->parts[p].flags);
+ state->parts[p].flags,
+ state->parts[p].info);
+ kfree(state->parts[p].info);
if (IS_ERR(part)) {
printk(KERN_ERR " %s: p%d could not be added: %ld\n",
disk->disk_name, p, -PTR_ERR(part));
diff --git a/fs/partitions/check.h b/fs/partitions/check.h
index 52f8bd3..8c724a7 100644
--- a/fs/partitions/check.h
+++ b/fs/partitions/check.h
@@ -1,6 +1,7 @@
#include <linux/pagemap.h>
#include <linux/blkdev.h>

+struct partition_meta_info;
/*
* add_gd_partition adds a partitions details to the devices partition
* description.
@@ -12,6 +13,7 @@ struct parsed_partitions {
sector_t from;
sector_t size;
int flags;
+ struct partition_meta_info *info;
} parts[DISK_MAX_PARTS];
int next;
int limit;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5f2f4c4..7b6644a 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -12,6 +12,7 @@
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/rcupdate.h>
+#include <linux/slab.h>

#ifdef CONFIG_BLOCK

@@ -86,7 +87,18 @@ struct disk_stats {
unsigned long io_ticks;
unsigned long time_in_queue;
};
-
+
+enum partition_meta_info_format_t {
+ /* Partition info format */
+ PARTITION_META_INFO_FORMAT_NONE = 0,
+};
+
+struct partition_meta_info {
+ enum partition_meta_info_format_t format;
+ union {
+ };
+};
+
struct hd_struct {
sector_t start_sect;
sector_t nr_sects;
@@ -95,6 +107,7 @@ struct hd_struct {
struct device __dev;
struct kobject *holder_dir;
int policy, partno;
+ struct partition_meta_info *info;
#ifdef CONFIG_FAIL_MAKE_REQUEST
int make_it_fail;
#endif
@@ -342,6 +355,19 @@ static inline int part_in_flight(struct hd_struct *part)
return part->in_flight[0] + part->in_flight[1];
}

+static inline struct partition_meta_info *alloc_part_info(struct gendisk *disk)
+{
+ if (disk)
+ return kzalloc_node(sizeof(struct partition_meta_info),
+ GFP_KERNEL, disk->node_id);
+ return kzalloc(sizeof(struct partition_meta_info), GFP_KERNEL);
+}
+
+static inline void free_part_info(struct hd_struct *part)
+{
+ kfree(part->info);
+}
+
/* block/blk-core.c */
extern void part_round_stats(int cpu, struct hd_struct *part);

@@ -533,7 +559,9 @@ extern int disk_expand_part_tbl(struct gendisk *disk, int target);
extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev);
extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
int partno, sector_t start,
- sector_t len, int flags);
+ sector_t len, int flags,
+ struct partition_meta_info
+ *info);
extern void delete_partition(struct gendisk *, int);
extern void printk_all_partitions(void);

--
1.7.0.4

2010-08-04 02:06:24

by Will Drewry

[permalink] [raw]
Subject: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

This change extends the partition_meta_info structure to
support EFI GPT-specific metadata and ensures that data
is copied in on partition scanning.

Signed-off-by: Will Drewry <[email protected]>

v2: move info assignment after memcpy()s
adds _MAX to the enum after EFI
---
fs/partitions/efi.c | 16 ++++++++++++++++
include/linux/genhd.h | 14 ++++++++++++++
2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index 9efb2cf..fac527d 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -614,6 +614,7 @@ int efi_partition(struct parsed_partitions *state)
pr_debug("GUID Partition Table is valid! Yea!\n");

for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) && i < state->limit-1; i++) {
+ struct partition_meta_info *info;
u64 start = le64_to_cpu(ptes[i].starting_lba);
u64 size = le64_to_cpu(ptes[i].ending_lba) -
le64_to_cpu(ptes[i].starting_lba) + 1ULL;
@@ -627,6 +628,21 @@ int efi_partition(struct parsed_partitions *state)
if (!efi_guidcmp(ptes[i].partition_type_guid,
PARTITION_LINUX_RAID_GUID))
state->parts[i + 1].flags = ADDPART_FLAG_RAID;
+
+ info = alloc_part_info(NULL);
+ if (!info) {
+ printk(KERN_WARNING
+ "unable to allocate memory for part->info\n");
+ continue;
+ }
+ info->format = PARTITION_META_INFO_FORMAT_EFI;
+ memcpy(info->efi.uuid.b, ptes[i].unique_partition_guid.b,
+ sizeof(info->efi.uuid.b));
+ memcpy(info->efi.type.b, ptes[i].partition_type_guid.b,
+ sizeof(info->efi.type.b));
+ memcpy(info->efi.label, ptes[i].partition_name,
+ sizeof(info->efi.label));
+ state->parts[i + 1].info = info;
}
kfree(ptes);
kfree(gpt);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 7b6644a..beb98e3 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -91,11 +91,25 @@ struct disk_stats {
enum partition_meta_info_format_t {
/* Partition info format */
PARTITION_META_INFO_FORMAT_NONE = 0,
+ PARTITION_META_INFO_FORMAT_EFI,
+ /* Place additional formats here. */
+ PARTITION_META_INFO_FORMAT_MAX,
};

+#ifdef CONFIG_EFI_PARTITION
+#include <linux/efi.h>
+#endif
+
struct partition_meta_info {
enum partition_meta_info_format_t format;
union {
+#ifdef CONFIG_EFI_PARTITION
+ struct {
+ efi_guid_t uuid;
+ efi_guid_t type;
+ efi_char16_t label[72 / sizeof(efi_char16_t)];
+ } efi;
+#endif
};
};

--
1.7.0.4

2010-08-04 02:06:36

by Will Drewry

[permalink] [raw]
Subject: [PATCH 3/3] init: add support for root devices specified by partition UUID

This is the third patch in a series which adds support for
storing partition metadata, optionally, off of the hd_struct.

One major use for that data is being able to resolve partition
by other identities than just the index on a block device. Device
enumeration varies by platform and there's a benefit to being able
to use something like EFI GPT's GUIDs to determine the correct
block device and partition to mount as the root.

This change adds that support to root= by adding support for
the following syntax:

root=PARTUUID=hex-uuid

I can't say this is the best way to do it, but it should be reasonably
clean to extend. There are a number of alternate approaches, and I'd love to
hear what is preferred.

Signed-off-by: Will Drewry <[email protected]>
---
init/do_mounts.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 84 insertions(+), 0 deletions(-)

diff --git a/init/do_mounts.c b/init/do_mounts.c
index 02e3ca4..7b22abf 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -58,6 +58,78 @@ static int __init readwrite(char *str)
__setup("ro", readonly);
__setup("rw", readwrite);

+/**
+ * match_dev_by_uuid - callback for finding a partition using its uuid
+ * @dev: device passed in by the caller
+ * @data: opaque pointer to a 36 byte char array with a UUID
+ *
+ * Returns 1 if the device matches, and 0 otherwise.
+ */
+static int __init match_dev_by_uuid(struct device *dev, void *data)
+{
+ u8 *uuid = data;
+ char candidate[37];
+ struct hd_struct *part = dev_to_part(dev);
+
+ if (!part->info)
+ goto no_match;
+
+ /* Each format may parse UUIDs differently. To that end,
+ * each format will have to parse either the given uuid or
+ * the candidate partition. It'd be more efficient to parse
+ * prior to the walk, but we'd need to store all the possible
+ * parsed guids.
+ */
+ switch (part->info->format) {
+#ifdef CONFIG_EFI_PARTITION
+ case PARTITION_META_INFO_FORMAT_EFI:
+ snprintf(candidate, sizeof(candidate), "%pUl",
+ part->info->efi.uuid.b);
+ if (!strcmp(uuid, candidate))
+ goto match;
+#endif
+ default:
+ goto no_match;
+ }
+
+match:
+ return 1;
+no_match:
+ return 0;
+}
+
+
+/**
+ * devt_from_partuuid - looks up the dev_t of a partition by its UUID
+ * @uuid: 36 byte char array containing a hex ascii UUID
+ *
+ * The function will return the first partition which contains a matching
+ * UUID value in its partition_meta_info struct. This does not search
+ * by filesystem UUIDs.
+ *
+ * Returns the matching dev_t on success or 0 on failure.
+ */
+static dev_t __init devt_from_partuuid(char *uuid)
+{
+ dev_t res = 0;
+ struct device *dev = NULL;
+ unsigned char *uuid_cur = (unsigned char *) uuid;
+
+ /* Downcase the letters in the UUID for uniformity. */
+ for (; *uuid_cur; ++uuid_cur)
+ *uuid_cur = tolower(*uuid_cur);
+
+ dev = class_find_device(&block_class, NULL, uuid, &match_dev_by_uuid);
+ if (!dev)
+ goto done;
+
+ res = dev->devt;
+ put_device(dev);
+
+done:
+ return res;
+}
+
/*
* Convert a name into device number. We accept the following variants:
*
@@ -68,6 +140,8 @@ __setup("rw", readwrite);
* of partition - device number of disk plus the partition number
* 5) /dev/<disk_name>p<decimal> - same as the above, that form is
* used when disk name of partitioned disk ends on a digit.
+ * 6) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ * unique id of a partition if the partition table provides it.
*
* If name doesn't have fall into the categories above, we return (0,0).
* block_class is used to check if something is a disk name. If the disk
@@ -82,6 +156,16 @@ dev_t name_to_dev_t(char *name)
dev_t res = 0;
int part;

+ if (strncmp(name, "PARTUUID=", 9) == 0) {
+ name += 9;
+ if (strlen(name) != 36)
+ goto fail;
+ res = devt_from_partuuid(name);
+ if (!res)
+ goto fail;
+ goto done;
+ }
+
if (strncmp(name, "/dev/", 5) != 0) {
unsigned maj, min;

--
1.7.0.4

2010-08-04 07:57:25

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] block, partition: add partition_meta_info to hd_struct

Hello,

On 08/04/2010 04:04 AM, Will Drewry wrote:
> This changes adds a partition_meta_info struct which itself contains a
> union of structures that provide partition table specific metadata.
>
> This change leaves the union empty. The subsequent patch includes an
> implementation for CONFIG_EFI_PARTITION-based metadata.
>
> Signed-off-by: Will Drewry <[email protected]>

Generally looks good to me.

> /*
> * add_gd_partition adds a partitions details to the devices partition
> * description.
> @@ -12,6 +13,7 @@ struct parsed_partitions {
> sector_t from;
> sector_t size;
> int flags;
> + struct partition_meta_info *info;
> } parts[DISK_MAX_PARTS];

But you can just embed the structure here. It's a temp data structure
to make things easier for individual partition scan code. There's no
need to save bytes.

Thanks.

--
tejun

2010-08-04 07:59:07

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

Hello,

On 08/04/2010 04:04 AM, Will Drewry wrote:
> struct partition_meta_info {
> enum partition_meta_info_format_t format;
> union {
> +#ifdef CONFIG_EFI_PARTITION
> + struct {
> + efi_guid_t uuid;
> + efi_guid_t type;
> + efi_char16_t label[72 / sizeof(efi_char16_t)];
> + } efi;
> +#endif

It would be nice if uuid can be made a common field outside of the
union so that generic code which only cares about UUID can simply read
it off.

Thanks.

--
tejun

2010-08-04 09:01:36

by Karel Zak

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

On Tue, Aug 03, 2010 at 09:04:42PM -0500, Will Drewry wrote:
> This change extends the partition_meta_info structure to
> support EFI GPT-specific metadata and ensures that data
> is copied in on partition scanning.

Why do want to store GPT-specific data (efi_guid_t) to
partition_meta_info? I think it would be better to use label and uuid
in a generic format (e.g. string or u8 uuid[16]) -- then you don't
have to use things like union, disklabel specific code to compare
uuids, etc. IMHO your current code is too complicated.

> + info = alloc_part_info(NULL);
> + if (!info) {
> + printk(KERN_WARNING
> + "unable to allocate memory for part->info\n");
> + continue;
> + }
> + info->format = PARTITION_META_INFO_FORMAT_EFI;
> + memcpy(info->efi.uuid.b, ptes[i].unique_partition_guid.b,
> + sizeof(info->efi.uuid.b));
> + memcpy(info->efi.type.b, ptes[i].partition_type_guid.b,
> + sizeof(info->efi.type.b));

why do you need to partition type?

> + memcpy(info->efi.label, ptes[i].partition_name,
> + sizeof(info->efi.label));

the partition name is in UTF8LE, is it correct to use it in raw
format?

> + state->parts[i + 1].info = info;
> }
> kfree(ptes);
> kfree(gpt);
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 7b6644a..beb98e3 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -91,11 +91,25 @@ struct disk_stats {
> enum partition_meta_info_format_t {
> /* Partition info format */
> PARTITION_META_INFO_FORMAT_NONE = 0,
> + PARTITION_META_INFO_FORMAT_EFI,
> + /* Place additional formats here. */
> + PARTITION_META_INFO_FORMAT_MAX,
> };
>
> +#ifdef CONFIG_EFI_PARTITION
> +#include <linux/efi.h>
> +#endif
> +
> struct partition_meta_info {
> enum partition_meta_info_format_t format;
> union {
> +#ifdef CONFIG_EFI_PARTITION
> + struct {
> + efi_guid_t uuid;
> + efi_guid_t type;
> + efi_char16_t label[72 / sizeof(efi_char16_t)];
> + } efi;
> +#endif
> };
> };
>
> --
> 1.7.0.4
>

--
Karel Zak <[email protected]>
http://karelzak.blogspot.com

2010-08-04 10:15:04

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 11:00, Karel Zak <[email protected]> wrote:
> On Tue, Aug 03, 2010 at 09:04:42PM -0500, Will Drewry wrote:
>> This change extends the partition_meta_info structure to
>> support EFI GPT-specific metadata and ensures that data
>> is copied in on partition scanning.
>
> Why do want to store GPT-specific data (efi_guid_t) to
> partition_meta_info? I think it would be better to use label and uuid
> in a generic format (e.g. string or u8 uuid[16]) -- then you don't
> have to use things like union, disklabel specific code to compare
> uuids, etc.  IMHO your current code is too complicated.

I don't mind having the raw data and the type accessible. It might be
useful for something we don't know about and it basically comes for
free.

But the only thing we are really interested in is the UUID, which,
like Tejun already suggested, we should probably store
format-independent, and have it always accessible. That way, we would
not need any type-specific parser, we just handle the normal DCE
format.

I don't think we should support any of the labels anyway in root= and
similar, because they never really worked reliably with duplicates,
and just ask for trouble.

Kay

2010-08-04 14:39:15

by John Stoffel

[permalink] [raw]
Subject: Re: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

>>>>> "Kay" == Kay Sievers <[email protected]> writes:

Kay> On Wed, Aug 4, 2010 at 00:27, Will Drewry <[email protected]> wrote:
>> On Tue, Aug 3, 2010 at 4:54 PM, Kay Sievers <[email protected]> wrote:

>>> Did you already test to put a lookup-call to the in-kernel mounter, if
>>> we use some special partition table uuid identifier for root=? That
>>> would be nice to see, if all that works as expected, and we can get to
>>> the data we collect.
>>
>> Not yet, but I'd like to have it working there (and in my
>> device-mapper target) as soon as possible. ?Hopefully, I'll have that
>> done pretty soon and I'll repost the series inclusive of an init:
>> change.

Kay> Nice.

>> Any preferences on the variable? I'll start with your example of
>> PARTUUID=, but that follows the initramfs model (UUID=) and not the
>> existing magic root devices (/dev/nfs, /dev/ram).
>> /dev/by-part-uuid/XXX... doesn't seem super-friendly though :)

Kay> Yeah, this stuff does not really fit in to the path notation
Kay> stuff, unless we use the udev-style link names, which some people
Kay> like to avoid, and prefer some more abstract thing here.

Kay> We have UUID=, LABEL= for mount and fstab, I think PARTUUID=,
Kay> PARTLABEL= matches this.

Ugh, why do we care whether a UUID refers to a disk or a partition?
It should be unique no matter what, so just do /dev/uuid/XXXX and call
it done.

John

2010-08-04 14:44:50

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 4:00 AM, Karel Zak <[email protected]> wrote:
> On Tue, Aug 03, 2010 at 09:04:42PM -0500, Will Drewry wrote:
>> This change extends the partition_meta_info structure to
>> support EFI GPT-specific metadata and ensures that data
>> is copied in on partition scanning.
>
> Why do want to store GPT-specific data (efi_guid_t) to
> partition_meta_info? I think it would be better to use label and uuid
> in a generic format (e.g. string or u8 uuid[16]) -- then you don't
> have to use things like union, disklabel specific code to compare
> uuids, etc. ?IMHO your current code is too complicated.
>
>> + ? ? ? ? ? ? info = alloc_part_info(NULL);
>> + ? ? ? ? ? ? if (!info) {
>> + ? ? ? ? ? ? ? ? ? ? printk(KERN_WARNING
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ?"unable to allocate memory for part->info\n");
>> + ? ? ? ? ? ? ? ? ? ? continue;
>> + ? ? ? ? ? ? }
>> + ? ? ? ? ? ? info->format = PARTITION_META_INFO_FORMAT_EFI;
>> + ? ? ? ? ? ? memcpy(info->efi.uuid.b, ptes[i].unique_partition_guid.b,
>> + ? ? ? ? ? ? ? ? ? ? sizeof(info->efi.uuid.b));
>> + ? ? ? ? ? ? memcpy(info->efi.type.b, ptes[i].partition_type_guid.b,
>> + ? ? ? ? ? ? ? ? ? ? sizeof(info->efi.type.b));
>
> why do you need to partition type?

For GPT, at least, the partition type isn't directly encoded anywhere
else and it can be vendor specific. For instance, if a vendor wants
to make an auto-configuring, they could reserve a
VENDOR-ROOT-PRI0-PART-TYPE-GUID then easily create a mount path for
it. Of course, that could easily be kludged by reusing a UUID.


>
>> + ? ? ? ? ? ? memcpy(info->efi.label, ptes[i].partition_name,
>> + ? ? ? ? ? ? ? ? ? ? sizeof(info->efi.label));
>
> the partition name is in UTF8LE, is it correct to use it in raw
> format?

Since the type was partition specific, I left the data untouched. It
sounds like it'd be more useful to expose the data in a generic way
which would mean converting it, I guess?

>> + ? ? ? ? ? ? state->parts[i + 1].info = info;
>> ? ? ? }
>> ? ? ? kfree(ptes);
>> ? ? ? kfree(gpt);
>> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
>> index 7b6644a..beb98e3 100644
>> --- a/include/linux/genhd.h
>> +++ b/include/linux/genhd.h
>> @@ -91,11 +91,25 @@ struct disk_stats {
>> ?enum partition_meta_info_format_t {
>> ? ? ? /* Partition info format */
>> ? ? ? PARTITION_META_INFO_FORMAT_NONE = 0,
>> + ? ? PARTITION_META_INFO_FORMAT_EFI,
>> + ? ? /* Place additional formats here. */
>> + ? ? PARTITION_META_INFO_FORMAT_MAX,
>> ?};
>>
>> +#ifdef CONFIG_EFI_PARTITION
>> +#include <linux/efi.h>
>> +#endif
>> +
>> ?struct partition_meta_info {
>> ? ? ? enum partition_meta_info_format_t format;
>> ? ? ? union {
>> +#ifdef CONFIG_EFI_PARTITION
>> + ? ? ? ? ? ? struct {
>> + ? ? ? ? ? ? ? ? ? ? efi_guid_t uuid;
>> + ? ? ? ? ? ? ? ? ? ? efi_guid_t type;
>> + ? ? ? ? ? ? ? ? ? ? efi_char16_t label[72 / sizeof(efi_char16_t)];
>> + ? ? ? ? ? ? } efi;
>> +#endif
>> ? ? ? };
>> ?};
>>
>> --
>> 1.7.0.4
>>
>
> --
> ?Karel Zak ?<[email protected]>
> ?http://karelzak.blogspot.com
>

2010-08-04 14:44:58

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 5:14 AM, Kay Sievers <[email protected]> wrote:
> On Wed, Aug 4, 2010 at 11:00, Karel Zak <[email protected]> wrote:
>> On Tue, Aug 03, 2010 at 09:04:42PM -0500, Will Drewry wrote:
>>> This change extends the partition_meta_info structure to
>>> support EFI GPT-specific metadata and ensures that data
>>> is copied in on partition scanning.
>>
>> Why do want to store GPT-specific data (efi_guid_t) to
>> partition_meta_info? I think it would be better to use label and uuid
>> in a generic format (e.g. string or u8 uuid[16]) -- then you don't
>> have to use things like union, disklabel specific code to compare
>> uuids, etc. ?IMHO your current code is too complicated.
>
> I don't mind having the raw data and the type accessible. It might be
> useful for something we don't know about and it basically comes for
> free.

I'll bump out the uuid at least, but it might be worth keeping an
extended meta info option. But it's a lot less work to ditch it, so
I'm happy enough either way.


> But the only thing we are really interested in is the UUID, which,
> like Tejun already suggested, we should probably store
> format-independent, and have it always accessible. That way, we would
> not need any type-specific parser, we just handle the normal DCE
> format.

I'll bump it out and make it the efi code generic-ify its uuid. Out
of curiousity, were you and Tejun thinking of keeping it as a 36 byte
string or as the 16 byte packed value. While less space efficient,
I'd prefer the u8[36] as it avoids dealing with versioning when
parsing the user-supplied string. Instead, each partition type can
properly unparse its uuids according to what they expect.

Seem reasonable?

> I don't think we should support any of the labels anyway in root= and
> similar, because they never really worked reliably with duplicates,
> and just ask for trouble.

Agreed. I don't think labels make sense, but we may later want to
support partition types (as I mention in my other mail).

2010-08-04 14:45:25

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 9:27 AM, John Stoffel <[email protected]> wrote:
>>>>>> "Kay" == Kay Sievers <[email protected]> writes:
>
> Kay> On Wed, Aug 4, 2010 at 00:27, Will Drewry <[email protected]> wrote:
>>> On Tue, Aug 3, 2010 at 4:54 PM, Kay Sievers <[email protected]> wrote:
>
>>>> Did you already test to put a lookup-call to the in-kernel mounter, if
>>>> we use some special partition table uuid identifier for root=? That
>>>> would be nice to see, if all that works as expected, and we can get to
>>>> the data we collect.
>>>
>>> Not yet, but I'd like to have it working there (and in my
>>> device-mapper target) as soon as possible. ?Hopefully, I'll have that
>>> done pretty soon and I'll repost the series inclusive of an init:
>>> change.
>
> Kay> Nice.
>
>>> Any preferences on the variable? I'll start with your example of
>>> PARTUUID=, but that follows the initramfs model (UUID=) and not the
>>> existing magic root devices (/dev/nfs, /dev/ram).
>>> /dev/by-part-uuid/XXX... doesn't seem super-friendly though :)
>
> Kay> Yeah, this stuff does not really fit in to the path notation
> Kay> stuff, unless we use the udev-style link names, which some people
> Kay> like to avoid, and prefer some more abstract thing here.
>
> Kay> We have UUID=, LABEL= for mount and fstab, I think PARTUUID=,
> Kay> PARTLABEL= matches this.
>
> Ugh, why do we care whether a UUID refers to a disk or a partition?
> It should be unique no matter what, so just do /dev/uuid/XXXX and call
> it done.

Since these are supposed to be _unique_ it seems that any reference to
a uuid should be fair game. And if a root=/dev/uuid/XXX makes it to
an initramfs environment, then that'll be able to easily check if the
UUID refers to a filesystem uuid or not.

I can roll that into the next version unless there's specific complaints.

2010-08-04 14:46:06

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] block, partition: add partition_meta_info to hd_struct

On Wed, Aug 4, 2010 at 2:57 AM, Tejun Heo <[email protected]> wrote:
> Hello,
>
> On 08/04/2010 04:04 AM, Will Drewry wrote:
>> This changes adds a partition_meta_info struct which itself contains a
>> union of structures that provide partition table specific metadata.
>>
>> This change leaves the union empty. The subsequent patch includes an
>> implementation for CONFIG_EFI_PARTITION-based metadata.
>>
>> Signed-off-by: Will Drewry <[email protected]>
>
> Generally looks good to me.
>
>> ?/*
>> ? * add_gd_partition adds a partitions details to the devices partition
>> ? * description.
>> @@ -12,6 +13,7 @@ struct parsed_partitions {
>> ? ? ? ? ? ? ? sector_t from;
>> ? ? ? ? ? ? ? sector_t size;
>> ? ? ? ? ? ? ? int flags;
>> + ? ? ? ? ? ? struct partition_meta_info *info;
>> ? ? ? } parts[DISK_MAX_PARTS];
>
> But you can just embed the structure here. ?It's a temp data structure
> to make things easier for individual partition scan code. ?There's no
> need to save bytes.

Great - that'll simplify the code more.

Thanks!

2010-08-04 15:26:07

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH 2/2] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 16:45, Will Drewry <[email protected]> wrote:
> On Wed, Aug 4, 2010 at 9:27 AM, John Stoffel <[email protected]> wrote:

>> Ugh, why do we care whether a UUID refers to a disk or a partition?

Because it is not a filesystem UUID, and that matters to distinguish.
You can even reformat the partition and it stays the same.

>> It should be unique no matter what, so just do /dev/uuid/XXXX and call
>> it done.
>
> Since these are supposed to be _unique_ it seems that any reference to
> a uuid should be fair game.  And if a root=/dev/uuid/XXX makes it to
> an initramfs environment, then that'll be able to easily check if the
> UUID refers to a filesystem uuid or not.
>
> I can roll that into the next version unless there's specific complaints.

I don't think so. People should stop encoding non-existing stuff in
the /dev namespace. :)

Thanks,
Kay

2010-08-04 15:29:08

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 16:44, Will Drewry <[email protected]> wrote:
> On Wed, Aug 4, 2010 at 5:14 AM, Kay Sievers <[email protected]> wrote:

>> But the only thing we are really interested in is the UUID, which,
>> like Tejun already suggested, we should probably store
>> format-independent, and have it always accessible. That way, we would
>> not need any type-specific parser, we just handle the normal DCE
>> format.
>
> I'll bump it out and make it the efi code generic-ify its uuid.  Out
> of curiousity, were you and Tejun thinking of keeping it as a 36 byte
> string or as the 16 byte packed value.  While less space efficient,
> I'd prefer the u8[36] as it avoids dealing with versioning when
> parsing the user-supplied string.  Instead, each partition type can
> properly unparse its uuids according to what they expect.

I think we should use the packed version, which is case-insensitive,
or at least normalize it to a defined case.

Kay

2010-08-04 15:57:04

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] genhd, efi: add efi partition metadata to hd_structs

On Wed, Aug 4, 2010 at 10:28 AM, Kay Sievers <[email protected]> wrote:
> On Wed, Aug 4, 2010 at 16:44, Will Drewry <[email protected]> wrote:
>> On Wed, Aug 4, 2010 at 5:14 AM, Kay Sievers <[email protected]> wrote:
>
>>> But the only thing we are really interested in is the UUID, which,
>>> like Tejun already suggested, we should probably store
>>> format-independent, and have it always accessible. That way, we would
>>> not need any type-specific parser, we just handle the normal DCE
>>> format.
>>
>> I'll bump it out and make it the efi code generic-ify its uuid. ?Out
>> of curiousity, were you and Tejun thinking of keeping it as a 36 byte
>> string or as the 16 byte packed value. ?While less space efficient,
>> I'd prefer the u8[36] as it avoids dealing with versioning when
>> parsing the user-supplied string. ?Instead, each partition type can
>> properly unparse its uuids according to what they expect.
>
> I think we should use the packed version, which is case-insensitive,
> or at least normalize it to a defined case.

Yeah, I'm all for tolower()ing the entire value instead of packing :)
, but looking at the uuid code quickly, and it seems that the only
real difference in ordering practically is big-endian versus little
endian. Since sprintf supports converting both of those to strings,
it would make sense to just have a part_pack_uuid(char *uuid) which
will always do, let's say big endian, packing. Then it won't be too
much work for the partition format to export the value either directly
to the right format or just via sprintf then pack.

I'll add a common packer/unpacker, and see how it looks.

2010-08-04 18:23:23

by Will Drewry

[permalink] [raw]
Subject: [PATCH v3 3/3] init: add support for root devices specified by partition UUID

This is the third patch in a series which adds support for
storing partition metadata, optionally, off of the hd_struct.

One major use for that data is being able to resolve partition
by other identities than just the index on a block device. Device
enumeration varies by platform and there's a benefit to being able
to use something like EFI GPT's GUIDs to determine the correct
block device and partition to mount as the root.

This change adds that support to root= by adding support for
the following syntax:

root=PARTUUID=hex-uuid

Signed-off-by: Will Drewry <[email protected]>

v2: added to patch series
v3: adds uuids to printk_all_partitions
simplifies matching to a uniform uuid packed format (big endian)
---
block/genhd.c | 9 +++++-
init/do_mounts.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index c8da120..5c9c503 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -642,6 +642,7 @@ void __init printk_all_partitions(void)
struct hd_struct *part;
char name_buf[BDEVNAME_SIZE];
char devt_buf[BDEVT_SIZE];
+ u8 uuid[PARTITION_META_INFO_UUIDLTH * 2 + 1];

/*
* Don't show empty devices or things that have been
@@ -660,10 +661,14 @@ void __init printk_all_partitions(void)
while ((part = disk_part_iter_next(&piter))) {
bool is_part0 = part == &disk->part0;

- printk("%s%s %10llu %s", is_part0 ? "" : " ",
+ uuid[0] = 0;
+ if (part->info)
+ part_unpack_uuid(part->info->uuid, uuid);
+
+ printk("%s%s %10llu %s %s", is_part0 ? "" : " ",
bdevt_str(part_devt(part), devt_buf),
(unsigned long long)part->nr_sects >> 1,
- disk_name(disk, part->partno, name_buf));
+ disk_name(disk, part->partno, name_buf), uuid);
if (is_part0) {
if (disk->driverfs_dev != NULL &&
disk->driverfs_dev->driver != NULL)
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 02e3ca4..804f9c6 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -58,6 +58,60 @@ static int __init readwrite(char *str)
__setup("ro", readonly);
__setup("rw", readwrite);

+/**
+ * match_dev_by_uuid - callback for finding a partition using its uuid
+ * @dev: device passed in by the caller
+ * @data: opaque pointer to a 36 byte char array with a UUID
+ *
+ * Returns 1 if the device matches, and 0 otherwise.
+ */
+static int __init match_dev_by_uuid(struct device *dev, void *data)
+{
+ u8 *uuid = data;
+ struct hd_struct *part = dev_to_part(dev);
+
+ if (!part->info)
+ goto no_match;
+
+ if (memcmp(uuid, part->info->uuid, sizeof(part->info->uuid)))
+ goto no_match;
+
+ return 1;
+no_match:
+ return 0;
+}
+
+
+/**
+ * devt_from_partuuid - looks up the dev_t of a partition by its UUID
+ * @uuid: 36 byte char array containing a hex ascii UUID
+ *
+ * The function will return the first partition which contains a matching
+ * UUID value in its partition_meta_info struct. This does not search
+ * by filesystem UUIDs.
+ *
+ * Returns the matching dev_t on success or 0 on failure.
+ */
+static dev_t __init devt_from_partuuid(char *uuid_str)
+{
+ dev_t res = 0;
+ struct device *dev = NULL;
+ u8 uuid[16];
+
+ /* Pack the requested UUID in the expected format. */
+ part_pack_uuid(uuid_str, uuid);
+
+ dev = class_find_device(&block_class, NULL, uuid, &match_dev_by_uuid);
+ if (!dev)
+ goto done;
+
+ res = dev->devt;
+ put_device(dev);
+
+done:
+ return res;
+}
+
/*
* Convert a name into device number. We accept the following variants:
*
@@ -68,6 +122,8 @@ __setup("rw", readwrite);
* of partition - device number of disk plus the partition number
* 5) /dev/<disk_name>p<decimal> - same as the above, that form is
* used when disk name of partitioned disk ends on a digit.
+ * 6) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ * unique id of a partition if the partition table provides it.
*
* If name doesn't have fall into the categories above, we return (0,0).
* block_class is used to check if something is a disk name. If the disk
@@ -82,6 +138,16 @@ dev_t name_to_dev_t(char *name)
dev_t res = 0;
int part;

+ if (strncmp(name, "PARTUUID=", 9) == 0) {
+ name += 9;
+ if (strlen(name) != 36)
+ goto fail;
+ res = devt_from_partuuid(name);
+ if (!res)
+ goto fail;
+ goto done;
+ }
+
if (strncmp(name, "/dev/", 5) != 0) {
unsigned maj, min;

--
1.7.0.4

2010-08-04 18:23:08

by Will Drewry

[permalink] [raw]
Subject: [PATCH v3 2/3] genhd, efi: add efi partition metadata to hd_structs

This change extends the partition_meta_info structure to
support EFI GPT-specific metadata and ensures that data
is copied in on partition scanning.

Signed-off-by: Will Drewry <[email protected]>

v2: move info assignment after memcpy()s
adds _MAX to the enum after EFI
v3: exports a generic uuid/volname format
---
fs/partitions/efi.c | 25 +++++++++++++++++++++++++
1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index 9efb2cf..427f2fb 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -94,6 +94,7 @@
*
************************************************************/
#include <linux/crc32.h>
+#include <linux/ctype.h>
#include <linux/math64.h>
#include <linux/slab.h>
#include "check.h"
@@ -604,6 +605,7 @@ int efi_partition(struct parsed_partitions *state)
gpt_entry *ptes = NULL;
u32 i;
unsigned ssz = bdev_logical_block_size(state->bdev) / 512;
+ u8 unparsed_guid[37];

if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
kfree(gpt);
@@ -614,6 +616,9 @@ int efi_partition(struct parsed_partitions *state)
pr_debug("GUID Partition Table is valid! Yea!\n");

for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) && i < state->limit-1; i++) {
+ struct partition_meta_info *info;
+ unsigned label_count = 0;
+ unsigned label_max;
u64 start = le64_to_cpu(ptes[i].starting_lba);
u64 size = le64_to_cpu(ptes[i].ending_lba) -
le64_to_cpu(ptes[i].starting_lba) + 1ULL;
@@ -627,6 +632,26 @@ int efi_partition(struct parsed_partitions *state)
if (!efi_guidcmp(ptes[i].partition_type_guid,
PARTITION_LINUX_RAID_GUID))
state->parts[i + 1].flags = ADDPART_FLAG_RAID;
+
+ info = &state->parts[i + 1].info;
+ /* Instead of doing a manual swap to big endian, reuse the
+ * common ASCII hex format as the interim.
+ */
+ efi_guid_unparse(&ptes[i].unique_partition_guid, unparsed_guid);
+ part_pack_uuid(unparsed_guid, info->uuid);
+
+ /* Naively convert UTF16-LE to 7 bits. */
+ label_max = min(sizeof(info->volname) - 1,
+ sizeof(ptes[i].partition_name));
+ info->volname[label_max] = 0;
+ while (label_count < label_max) {
+ u8 c = ptes[i].partition_name[label_count] & 0xff;
+ if (c && !isprint(c))
+ c = '!';
+ info->volname[label_count] = c;
+ label_count++;
+ }
+ state->parts[i + 1].has_info = true;
}
kfree(ptes);
kfree(gpt);
--
1.7.0.4

2010-08-04 18:23:39

by Will Drewry

[permalink] [raw]
Subject: [PATCH v3 1/3] block, partition: add partition_meta_info to hd_struct

This changes adds a partition_meta_info struct which itself contains a
union of structures that provide partition table specific metadata.

This change leaves the union empty. The subsequent patch includes an
implementation for CONFIG_EFI_PARTITION-based metadata.

Signed-off-by: Will Drewry <[email protected]>

v2: move assignment of p->info after copy
v3: make partition_meta_info part of parsed_partitions (not a ptr)
get rid of extended partition-format specific information
(this could be easily be added to the struct at any point later)
add generic pack/unpack uuid helpers
---
block/genhd.c | 1 +
block/ioctl.c | 2 +-
fs/partitions/check.c | 23 ++++++++++++++++++--
fs/partitions/check.h | 3 ++
include/linux/genhd.h | 53 +++++++++++++++++++++++++++++++++++++++++++++++-
5 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 59a2db6..c8da120 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1004,6 +1004,7 @@ static void disk_release(struct device *dev)
kfree(disk->random);
disk_replace_part_tbl(disk, NULL);
free_part_stats(&disk->part0);
+ free_part_info(&disk->part0);
kfree(disk);
}
struct class block_class = {
diff --git a/block/ioctl.c b/block/ioctl.c
index e8eb679..f8e4bfe 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -62,7 +62,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user

/* all seems OK */
part = add_partition(disk, partno, start, length,
- ADDPART_FLAG_NONE);
+ ADDPART_FLAG_NONE, NULL);
mutex_unlock(&bdev->bd_mutex);
return IS_ERR(part) ? PTR_ERR(part) : 0;
case BLKPG_DEL_PARTITION:
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 5dcd4b0..ec9c8b6 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -338,6 +338,7 @@ static void part_release(struct device *dev)
{
struct hd_struct *p = dev_to_part(dev);
free_part_stats(p);
+ free_part_info(p);
kfree(p);
}

@@ -387,7 +388,8 @@ static DEVICE_ATTR(whole_disk, S_IRUSR | S_IRGRP | S_IROTH,
whole_disk_show, NULL);

struct hd_struct *add_partition(struct gendisk *disk, int partno,
- sector_t start, sector_t len, int flags)
+ sector_t start, sector_t len, int flags,
+ struct partition_meta_info *info)
{
struct hd_struct *p;
dev_t devt = MKDEV(0, 0);
@@ -424,6 +426,14 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
p->partno = partno;
p->policy = get_disk_ro(disk);

+ if (info) {
+ struct partition_meta_info *pinfo = alloc_part_info(disk);
+ if (!pinfo)
+ goto out_free_stats;
+ memcpy(pinfo, info, sizeof(*info));
+ p->info = pinfo;
+ }
+
dname = dev_name(ddev);
if (isdigit(dname[strlen(dname) - 1]))
dev_set_name(pdev, "%sp%d", dname, partno);
@@ -437,7 +447,7 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

err = blk_alloc_devt(p, &devt);
if (err)
- goto out_free_stats;
+ goto out_free_info;
pdev->devt = devt;

/* delay uevent until 'holders' subdir is created */
@@ -468,6 +478,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

return p;

+out_free_info:
+ free_part_info(p);
out_free_stats:
free_part_stats(p);
out_free:
@@ -629,6 +641,7 @@ rescan:
/* add partitions */
for (p = 1; p < state->limit; p++) {
sector_t size, from;
+ struct partition_meta_info *info = NULL;

size = state->parts[p].size;
if (!size)
@@ -662,8 +675,12 @@ rescan:
size = get_capacity(disk) - from;
}
}
+
+ if (state->parts[p].has_info)
+ info = &state->parts[p].info;
part = add_partition(disk, p, from, size,
- state->parts[p].flags);
+ state->parts[p].flags,
+ &state->parts[p].info);
if (IS_ERR(part)) {
printk(KERN_ERR " %s: p%d could not be added: %ld\n",
disk->disk_name, p, -PTR_ERR(part));
diff --git a/fs/partitions/check.h b/fs/partitions/check.h
index 52f8bd3..ccf3371 100644
--- a/fs/partitions/check.h
+++ b/fs/partitions/check.h
@@ -1,5 +1,6 @@
#include <linux/pagemap.h>
#include <linux/blkdev.h>
+#include <linux/genhd.h>

/*
* add_gd_partition adds a partitions details to the devices partition
@@ -12,6 +13,8 @@ struct parsed_partitions {
sector_t from;
sector_t size;
int flags;
+ bool has_info;
+ struct partition_meta_info info;
} parts[DISK_MAX_PARTS];
int next;
int limit;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5f2f4c4..66e26b5 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -12,6 +12,7 @@
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/rcupdate.h>
+#include <linux/slab.h>

#ifdef CONFIG_BLOCK

@@ -86,7 +87,15 @@ struct disk_stats {
unsigned long io_ticks;
unsigned long time_in_queue;
};
-
+
+#define PARTITION_META_INFO_VOLNAMELTH 64
+#define PARTITION_META_INFO_UUIDLTH 16
+
+struct partition_meta_info {
+ u8 uuid[PARTITION_META_INFO_UUIDLTH]; /* always big endian */
+ u8 volname[PARTITION_META_INFO_VOLNAMELTH];
+};
+
struct hd_struct {
sector_t start_sect;
sector_t nr_sects;
@@ -95,6 +104,7 @@ struct hd_struct {
struct device __dev;
struct kobject *holder_dir;
int policy, partno;
+ struct partition_meta_info *info;
#ifdef CONFIG_FAIL_MAKE_REQUEST
int make_it_fail;
#endif
@@ -181,6 +191,30 @@ static inline struct gendisk *part_to_disk(struct hd_struct *part)
return NULL;
}

+static inline void part_pack_uuid(const u8 *uuid_str, u8 *to)
+{
+ int i;
+ for (i = 0; i < 16; ++i) {
+ *to++ = (hex_to_bin(*uuid_str) << 4) |
+ (hex_to_bin(*(uuid_str + 1)));
+ uuid_str += 2;
+ switch (i) {
+ case 3:
+ case 5:
+ case 7:
+ case 9:
+ uuid_str++;
+ continue;
+ }
+ }
+}
+
+static inline char *part_unpack_uuid(const u8 *uuid, char *out)
+{
+ sprintf(out, "%pU", uuid);
+ return out;
+}
+
static inline int disk_max_parts(struct gendisk *disk)
{
if (disk->flags & GENHD_FL_EXT_DEVT)
@@ -342,6 +376,19 @@ static inline int part_in_flight(struct hd_struct *part)
return part->in_flight[0] + part->in_flight[1];
}

+static inline struct partition_meta_info *alloc_part_info(struct gendisk *disk)
+{
+ if (disk)
+ return kzalloc_node(sizeof(struct partition_meta_info),
+ GFP_KERNEL, disk->node_id);
+ return kzalloc(sizeof(struct partition_meta_info), GFP_KERNEL);
+}
+
+static inline void free_part_info(struct hd_struct *part)
+{
+ kfree(part->info);
+}
+
/* block/blk-core.c */
extern void part_round_stats(int cpu, struct hd_struct *part);

@@ -533,7 +580,9 @@ extern int disk_expand_part_tbl(struct gendisk *disk, int target);
extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev);
extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
int partno, sector_t start,
- sector_t len, int flags);
+ sector_t len, int flags,
+ struct partition_meta_info
+ *info);
extern void delete_partition(struct gendisk *, int);
extern void printk_all_partitions(void);

--
1.7.0.4

2010-08-05 10:57:33

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] init: add support for root devices specified by partition UUID

Hello,

On 08/04/2010 08:22 PM, Will Drewry wrote:
> @@ -642,6 +642,7 @@ void __init printk_all_partitions(void)
> struct hd_struct *part;
> char name_buf[BDEVNAME_SIZE];
> char devt_buf[BDEVT_SIZE];
> + u8 uuid[PARTITION_META_INFO_UUIDLTH * 2 + 1];
>
> /*
> * Don't show empty devices or things that have been
> @@ -660,10 +661,14 @@ void __init printk_all_partitions(void)
> while ((part = disk_part_iter_next(&piter))) {
> bool is_part0 = part == &disk->part0;
>
> - printk("%s%s %10llu %s", is_part0 ? "" : " ",
> + uuid[0] = 0;
> + if (part->info)
> + part_unpack_uuid(part->info->uuid, uuid);
> +
> + printk("%s%s %10llu %s %s", is_part0 ? "" : " ",
> bdevt_str(part_devt(part), devt_buf),
> (unsigned long long)part->nr_sects >> 1,
> - disk_name(disk, part->partno, name_buf));
> + disk_name(disk, part->partno, name_buf), uuid);

Hmmm... I'm a bit worried about this. This might break userspace
tools. I think it would be better to export it via sysfs.

Thanks.

--
tejun

2010-08-05 14:26:39

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] init: add support for root devices specified by partition UUID

On Thu, Aug 5, 2010 at 5:55 AM, Tejun Heo <[email protected]> wrote:
> Hello,
>
> On 08/04/2010 08:22 PM, Will Drewry wrote:
>> @@ -642,6 +642,7 @@ void __init printk_all_partitions(void)
>> ? ? ? ? ? ? ? struct hd_struct *part;
>> ? ? ? ? ? ? ? char name_buf[BDEVNAME_SIZE];
>> ? ? ? ? ? ? ? char devt_buf[BDEVT_SIZE];
>> + ? ? ? ? ? ? u8 uuid[PARTITION_META_INFO_UUIDLTH * 2 + 1];
>>
>> ? ? ? ? ? ? ? /*
>> ? ? ? ? ? ? ? ?* Don't show empty devices or things that have been
>> @@ -660,10 +661,14 @@ void __init printk_all_partitions(void)
>> ? ? ? ? ? ? ? while ((part = disk_part_iter_next(&piter))) {
>> ? ? ? ? ? ? ? ? ? ? ? bool is_part0 = part == &disk->part0;
>>
>> - ? ? ? ? ? ? ? ? ? ? printk("%s%s %10llu %s", is_part0 ? "" : " ?",
>> + ? ? ? ? ? ? ? ? ? ? uuid[0] = 0;
>> + ? ? ? ? ? ? ? ? ? ? if (part->info)
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? part_unpack_uuid(part->info->uuid, uuid);
>> +
>> + ? ? ? ? ? ? ? ? ? ? printk("%s%s %10llu %s %s", is_part0 ? "" : " ?",
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bdevt_str(part_devt(part), devt_buf),
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?(unsigned long long)part->nr_sects >> 1,
>> - ? ? ? ? ? ? ? ? ? ? ? ? ? ?disk_name(disk, part->partno, name_buf));
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ?disk_name(disk, part->partno, name_buf), uuid);
>
> Hmmm... I'm a bit worried about this. ?This might break userspace
> tools. ?I think it would be better to export it via sysfs.

Cool - I'm happy to drop that part from the patch. I was more
following the spirit of the comment around making it easy for a user
to figure out what went wrong (e.g., wrong uuid) if they don't have a
root, but if there are tools parsing that output, I'd rather not break
them!

Would sysfs make sense as part of this patch series or would it be
fair game for a follow-on? I'm inclined to wait since the UUIDs can't
be changed (blkpg support isn't wired up) except with rescans, and a
userspace tool can just walk the partition table to get the uuids.

Thanks!

2010-08-05 14:29:22

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] init: add support for root devices specified by partition UUID

Hello,

On 08/05/2010 04:26 PM, Will Drewry wrote:
> On Thu, Aug 5, 2010 at 5:55 AM, Tejun Heo <[email protected]> wrote:
>> Hello,
>>
>> On 08/04/2010 08:22 PM, Will Drewry wrote:
>>> @@ -642,6 +642,7 @@ void __init printk_all_partitions(void)
>>> struct hd_struct *part;
>>> char name_buf[BDEVNAME_SIZE];
>>> char devt_buf[BDEVT_SIZE];
>>> + u8 uuid[PARTITION_META_INFO_UUIDLTH * 2 + 1];
>>>
>>> /*
>>> * Don't show empty devices or things that have been
>>> @@ -660,10 +661,14 @@ void __init printk_all_partitions(void)
>>> while ((part = disk_part_iter_next(&piter))) {
>>> bool is_part0 = part == &disk->part0;
>>>
>>> - printk("%s%s %10llu %s", is_part0 ? "" : " ",
>>> + uuid[0] = 0;
>>> + if (part->info)
>>> + part_unpack_uuid(part->info->uuid, uuid);
>>> +
>>> + printk("%s%s %10llu %s %s", is_part0 ? "" : " ",
>>> bdevt_str(part_devt(part), devt_buf),
>>> (unsigned long long)part->nr_sects >> 1,
>>> - disk_name(disk, part->partno, name_buf));
>>> + disk_name(disk, part->partno, name_buf), uuid);
>>
>> Hmmm... I'm a bit worried about this. This might break userspace
>> tools. I think it would be better to export it via sysfs.
>
> Cool - I'm happy to drop that part from the patch. I was more
> following the spirit of the comment around making it easy for a user
> to figure out what went wrong (e.g., wrong uuid) if they don't have a
> root, but if there are tools parsing that output, I'd rather not break
> them!

Oh, forget about what I said. For some reason I thought the above
code was for /proc/partitions. :-)

Thanks.

--
tejun

2010-08-05 19:19:21

by Will Drewry

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] init: add support for root devices specified by partition UUID

On Thu, Aug 5, 2010 at 9:29 AM, Tejun Heo <[email protected]> wrote:
> Hello,
>
> On 08/05/2010 04:26 PM, Will Drewry wrote:
>> On Thu, Aug 5, 2010 at 5:55 AM, Tejun Heo <[email protected]> wrote:
>>> Hello,
>>>
>>> On 08/04/2010 08:22 PM, Will Drewry wrote:
>>>> @@ -642,6 +642,7 @@ void __init printk_all_partitions(void)
>>>> ? ? ? ? ? ? ? struct hd_struct *part;
>>>> ? ? ? ? ? ? ? char name_buf[BDEVNAME_SIZE];
>>>> ? ? ? ? ? ? ? char devt_buf[BDEVT_SIZE];
>>>> + ? ? ? ? ? ? u8 uuid[PARTITION_META_INFO_UUIDLTH * 2 + 1];
>>>>
>>>> ? ? ? ? ? ? ? /*
>>>> ? ? ? ? ? ? ? ?* Don't show empty devices or things that have been
>>>> @@ -660,10 +661,14 @@ void __init printk_all_partitions(void)
>>>> ? ? ? ? ? ? ? while ((part = disk_part_iter_next(&piter))) {
>>>> ? ? ? ? ? ? ? ? ? ? ? bool is_part0 = part == &disk->part0;
>>>>
>>>> - ? ? ? ? ? ? ? ? ? ? printk("%s%s %10llu %s", is_part0 ? "" : " ?",
>>>> + ? ? ? ? ? ? ? ? ? ? uuid[0] = 0;
>>>> + ? ? ? ? ? ? ? ? ? ? if (part->info)
>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? part_unpack_uuid(part->info->uuid, uuid);
>>>> +
>>>> + ? ? ? ? ? ? ? ? ? ? printk("%s%s %10llu %s %s", is_part0 ? "" : " ?",
>>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?bdevt_str(part_devt(part), devt_buf),
>>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?(unsigned long long)part->nr_sects >> 1,
>>>> - ? ? ? ? ? ? ? ? ? ? ? ? ? ?disk_name(disk, part->partno, name_buf));
>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ?disk_name(disk, part->partno, name_buf), uuid);
>>>
>>> Hmmm... I'm a bit worried about this. ?This might break userspace
>>> tools. ?I think it would be better to export it via sysfs.
>>
>> Cool - I'm happy to drop that part from the patch. ?I was more
>> following the spirit of the comment around making it easy for a user
>> to figure out what went wrong (e.g., wrong uuid) if they don't have a
>> root, but if there are tools parsing that output, I'd rather not break
>> them!
>
> Oh, forget about what I said. ?For some reason I thought the above
> code was for /proc/partitions. ?:-)

Cool - even better!

Thanks!
will

2010-08-05 19:29:48

by Kay Sievers

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] init: add support for root devices specified by partition UUID

On Wed, Aug 4, 2010 at 20:22, Will Drewry <[email protected]> wrote:
> This is the third patch in a series which adds support for
> storing partition metadata, optionally, off of the hd_struct.
>
> One major use for that data is being able to resolve partition
> by other identities than just the index on a block device.  Device
> enumeration varies by platform and there's a benefit to being able
> to use something like EFI GPT's GUIDs to determine the correct
> block device and partition to mount as the root.
>
> This change adds that support to root= by adding support for
> the following syntax:
>
>  root=PARTUUID=hex-uuid

Just an update:

There is still no better idea than using this notation. We should
distinguish between filesystem and partiton UUIDs, and overloading
/dev with magic strings that will never exist there doesn't sound like
a good idea.

I checked with Karel, and fstab supports UUID=, initramfs tools
support root=UUID=, so we are probably going to have PARTUUID support
in fstab and initramfs too when we get there.

Kay

2010-08-31 20:48:07

by Will Drewry

[permalink] [raw]
Subject: [PATCH v4 1/3] block, partition: add partition_meta_info to hd_struct

I'm reposting this patch series as v4 since there have been no additional
comments, and I cleaned up one extra bit of unneeded code (in 3/3). The patches
are against Linus's tree: 2bfc96a127bc1cc94d26bfaa40159966064f9c8c
(2.6.36-rc3).

Would this patchset be suitable for inclusion in an mm branch?


This changes adds a partition_meta_info struct which itself contains a
union of structures that provide partition table specific metadata.

This change leaves the union empty. The subsequent patch includes an
implementation for CONFIG_EFI_PARTITION-based metadata.

Signed-off-by: Will Drewry <[email protected]>

---
v2: move assignment of p->info after copy
v3: make partition_meta_info part of parsed_partitions (not a ptr)
get rid of extended partition-format specific information
add generic pack/unpack uuid helpers
v4: include genhd.h in check.h

block/genhd.c | 1 +
block/ioctl.c | 2 +-
fs/partitions/check.c | 23 ++++++++++++++++++--
fs/partitions/check.h | 3 ++
include/linux/genhd.h | 53 +++++++++++++++++++++++++++++++++++++++++++++++-
5 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 59a2db6..c8da120 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1004,6 +1004,7 @@ static void disk_release(struct device *dev)
kfree(disk->random);
disk_replace_part_tbl(disk, NULL);
free_part_stats(&disk->part0);
+ free_part_info(&disk->part0);
kfree(disk);
}
struct class block_class = {
diff --git a/block/ioctl.c b/block/ioctl.c
index d8052f0..2c15fe0 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -62,7 +62,7 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user

/* all seems OK */
part = add_partition(disk, partno, start, length,
- ADDPART_FLAG_NONE);
+ ADDPART_FLAG_NONE, NULL);
mutex_unlock(&bdev->bd_mutex);
return IS_ERR(part) ? PTR_ERR(part) : 0;
case BLKPG_DEL_PARTITION:
diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 79fbf3f..6dfbee0 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -352,6 +352,7 @@ static void part_release(struct device *dev)
{
struct hd_struct *p = dev_to_part(dev);
free_part_stats(p);
+ free_part_info(p);
kfree(p);
}

@@ -401,7 +402,8 @@ static DEVICE_ATTR(whole_disk, S_IRUSR | S_IRGRP | S_IROTH,
whole_disk_show, NULL);

struct hd_struct *add_partition(struct gendisk *disk, int partno,
- sector_t start, sector_t len, int flags)
+ sector_t start, sector_t len, int flags,
+ struct partition_meta_info *info)
{
struct hd_struct *p;
dev_t devt = MKDEV(0, 0);
@@ -438,6 +440,14 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
p->partno = partno;
p->policy = get_disk_ro(disk);

+ if (info) {
+ struct partition_meta_info *pinfo = alloc_part_info(disk);
+ if (!pinfo)
+ goto out_free_stats;
+ memcpy(pinfo, info, sizeof(*info));
+ p->info = pinfo;
+ }
+
dname = dev_name(ddev);
if (isdigit(dname[strlen(dname) - 1]))
dev_set_name(pdev, "%sp%d", dname, partno);
@@ -451,7 +461,7 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

err = blk_alloc_devt(p, &devt);
if (err)
- goto out_free_stats;
+ goto out_free_info;
pdev->devt = devt;

/* delay uevent until 'holders' subdir is created */
@@ -481,6 +491,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,

return p;

+out_free_info:
+ free_part_info(p);
out_free_stats:
free_part_stats(p);
out_free:
@@ -642,6 +654,7 @@ rescan:
/* add partitions */
for (p = 1; p < state->limit; p++) {
sector_t size, from;
+ struct partition_meta_info *info = NULL;

size = state->parts[p].size;
if (!size)
@@ -675,8 +688,12 @@ rescan:
size = get_capacity(disk) - from;
}
}
+
+ if (state->parts[p].has_info)
+ info = &state->parts[p].info;
part = add_partition(disk, p, from, size,
- state->parts[p].flags);
+ state->parts[p].flags,
+ &state->parts[p].info);
if (IS_ERR(part)) {
printk(KERN_ERR " %s: p%d could not be added: %ld\n",
disk->disk_name, p, -PTR_ERR(part));
diff --git a/fs/partitions/check.h b/fs/partitions/check.h
index 8e4e103..d68bf4d 100644
--- a/fs/partitions/check.h
+++ b/fs/partitions/check.h
@@ -1,5 +1,6 @@
#include <linux/pagemap.h>
#include <linux/blkdev.h>
+#include <linux/genhd.h>

/*
* add_gd_partition adds a partitions details to the devices partition
@@ -12,6 +13,8 @@ struct parsed_partitions {
sector_t from;
sector_t size;
int flags;
+ bool has_info;
+ struct partition_meta_info info;
} parts[DISK_MAX_PARTS];
int next;
int limit;
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5f2f4c4..66e26b5 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -12,6 +12,7 @@
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/rcupdate.h>
+#include <linux/slab.h>

#ifdef CONFIG_BLOCK

@@ -86,7 +87,15 @@ struct disk_stats {
unsigned long io_ticks;
unsigned long time_in_queue;
};
-
+
+#define PARTITION_META_INFO_VOLNAMELTH 64
+#define PARTITION_META_INFO_UUIDLTH 16
+
+struct partition_meta_info {
+ u8 uuid[PARTITION_META_INFO_UUIDLTH]; /* always big endian */
+ u8 volname[PARTITION_META_INFO_VOLNAMELTH];
+};
+
struct hd_struct {
sector_t start_sect;
sector_t nr_sects;
@@ -95,6 +104,7 @@ struct hd_struct {
struct device __dev;
struct kobject *holder_dir;
int policy, partno;
+ struct partition_meta_info *info;
#ifdef CONFIG_FAIL_MAKE_REQUEST
int make_it_fail;
#endif
@@ -181,6 +191,30 @@ static inline struct gendisk *part_to_disk(struct hd_struct *part)
return NULL;
}

+static inline void part_pack_uuid(const u8 *uuid_str, u8 *to)
+{
+ int i;
+ for (i = 0; i < 16; ++i) {
+ *to++ = (hex_to_bin(*uuid_str) << 4) |
+ (hex_to_bin(*(uuid_str + 1)));
+ uuid_str += 2;
+ switch (i) {
+ case 3:
+ case 5:
+ case 7:
+ case 9:
+ uuid_str++;
+ continue;
+ }
+ }
+}
+
+static inline char *part_unpack_uuid(const u8 *uuid, char *out)
+{
+ sprintf(out, "%pU", uuid);
+ return out;
+}
+
static inline int disk_max_parts(struct gendisk *disk)
{
if (disk->flags & GENHD_FL_EXT_DEVT)
@@ -342,6 +376,19 @@ static inline int part_in_flight(struct hd_struct *part)
return part->in_flight[0] + part->in_flight[1];
}

+static inline struct partition_meta_info *alloc_part_info(struct gendisk *disk)
+{
+ if (disk)
+ return kzalloc_node(sizeof(struct partition_meta_info),
+ GFP_KERNEL, disk->node_id);
+ return kzalloc(sizeof(struct partition_meta_info), GFP_KERNEL);
+}
+
+static inline void free_part_info(struct hd_struct *part)
+{
+ kfree(part->info);
+}
+
/* block/blk-core.c */
extern void part_round_stats(int cpu, struct hd_struct *part);

@@ -533,7 +580,9 @@ extern int disk_expand_part_tbl(struct gendisk *disk, int target);
extern int rescan_partitions(struct gendisk *disk, struct block_device *bdev);
extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
int partno, sector_t start,
- sector_t len, int flags);
+ sector_t len, int flags,
+ struct partition_meta_info
+ *info);
extern void delete_partition(struct gendisk *, int);
extern void printk_all_partitions(void);

--
1.7.0.4

2010-08-31 20:48:16

by Will Drewry

[permalink] [raw]
Subject: [PATCH v4 3/3] init: add support for root devices specified by partition UUID

This is the third patch in a series which adds support for
storing partition metadata, optionally, off of the hd_struct.

One major use for that data is being able to resolve partition
by other identities than just the index on a block device. Device
enumeration varies by platform and there's a benefit to being able
to use something like EFI GPT's GUIDs to determine the correct
block device and partition to mount as the root.

This change adds that support to root= by adding support for
the following syntax:

root=PARTUUID=hex-uuid

Signed-off-by: Will Drewry <[email protected]>

---
v2: added to patch series
v3: adds uuids to printk_all_partitions
simplifies matching to a uniform uuid packed format (big endian)
v4: no changes

block/genhd.c | 9 +++++-
init/do_mounts.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index c8da120..5c9c503 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -642,6 +642,7 @@ void __init printk_all_partitions(void)
struct hd_struct *part;
char name_buf[BDEVNAME_SIZE];
char devt_buf[BDEVT_SIZE];
+ u8 uuid[PARTITION_META_INFO_UUIDLTH * 2 + 1];

/*
* Don't show empty devices or things that have been
@@ -660,10 +661,14 @@ void __init printk_all_partitions(void)
while ((part = disk_part_iter_next(&piter))) {
bool is_part0 = part == &disk->part0;

- printk("%s%s %10llu %s", is_part0 ? "" : " ",
+ uuid[0] = 0;
+ if (part->info)
+ part_unpack_uuid(part->info->uuid, uuid);
+
+ printk("%s%s %10llu %s %s", is_part0 ? "" : " ",
bdevt_str(part_devt(part), devt_buf),
(unsigned long long)part->nr_sects >> 1,
- disk_name(disk, part->partno, name_buf));
+ disk_name(disk, part->partno, name_buf), uuid);
if (is_part0) {
if (disk->driverfs_dev != NULL &&
disk->driverfs_dev->driver != NULL)
diff --git a/init/do_mounts.c b/init/do_mounts.c
index 02e3ca4..804f9c6 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -58,6 +58,60 @@ static int __init readwrite(char *str)
__setup("ro", readonly);
__setup("rw", readwrite);

+/**
+ * match_dev_by_uuid - callback for finding a partition using its uuid
+ * @dev: device passed in by the caller
+ * @data: opaque pointer to a 36 byte char array with a UUID
+ *
+ * Returns 1 if the device matches, and 0 otherwise.
+ */
+static int __init match_dev_by_uuid(struct device *dev, void *data)
+{
+ u8 *uuid = data;
+ struct hd_struct *part = dev_to_part(dev);
+
+ if (!part->info)
+ goto no_match;
+
+ if (memcmp(uuid, part->info->uuid, sizeof(part->info->uuid)))
+ goto no_match;
+
+ return 1;
+no_match:
+ return 0;
+}
+
+
+/**
+ * devt_from_partuuid - looks up the dev_t of a partition by its UUID
+ * @uuid: 36 byte char array containing a hex ascii UUID
+ *
+ * The function will return the first partition which contains a matching
+ * UUID value in its partition_meta_info struct. This does not search
+ * by filesystem UUIDs.
+ *
+ * Returns the matching dev_t on success or 0 on failure.
+ */
+static dev_t __init devt_from_partuuid(char *uuid_str)
+{
+ dev_t res = 0;
+ struct device *dev = NULL;
+ u8 uuid[16];
+
+ /* Pack the requested UUID in the expected format. */
+ part_pack_uuid(uuid_str, uuid);
+
+ dev = class_find_device(&block_class, NULL, uuid, &match_dev_by_uuid);
+ if (!dev)
+ goto done;
+
+ res = dev->devt;
+ put_device(dev);
+
+done:
+ return res;
+}
+
/*
* Convert a name into device number. We accept the following variants:
*
@@ -68,6 +122,8 @@ __setup("rw", readwrite);
* of partition - device number of disk plus the partition number
* 5) /dev/<disk_name>p<decimal> - same as the above, that form is
* used when disk name of partitioned disk ends on a digit.
+ * 6) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ * unique id of a partition if the partition table provides it.
*
* If name doesn't have fall into the categories above, we return (0,0).
* block_class is used to check if something is a disk name. If the disk
@@ -82,6 +138,16 @@ dev_t name_to_dev_t(char *name)
dev_t res = 0;
int part;

+ if (strncmp(name, "PARTUUID=", 9) == 0) {
+ name += 9;
+ if (strlen(name) != 36)
+ goto fail;
+ res = devt_from_partuuid(name);
+ if (!res)
+ goto fail;
+ goto done;
+ }
+
if (strncmp(name, "/dev/", 5) != 0) {
unsigned maj, min;

--
1.7.0.4

2010-08-31 20:48:09

by Will Drewry

[permalink] [raw]
Subject: [PATCH v4 2/3] genhd, efi: add efi partition metadata to hd_structs

This change extends the partition_meta_info structure to
support EFI GPT-specific metadata and ensures that data
is copied in on partition scanning.

Signed-off-by: Will Drewry <[email protected]>

---
v2: move info assignment after memcpy()s
adds _MAX to the enum after EFI
v3: exports a generic uuid/volname format
v4: no changes

fs/partitions/efi.c | 25 +++++++++++++++++++++++++
1 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/fs/partitions/efi.c b/fs/partitions/efi.c
index dbb44d4..ac0ccb5 100644
--- a/fs/partitions/efi.c
+++ b/fs/partitions/efi.c
@@ -94,6 +94,7 @@
*
************************************************************/
#include <linux/crc32.h>
+#include <linux/ctype.h>
#include <linux/math64.h>
#include <linux/slab.h>
#include "check.h"
@@ -604,6 +605,7 @@ int efi_partition(struct parsed_partitions *state)
gpt_entry *ptes = NULL;
u32 i;
unsigned ssz = bdev_logical_block_size(state->bdev) / 512;
+ u8 unparsed_guid[37];

if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
kfree(gpt);
@@ -614,6 +616,9 @@ int efi_partition(struct parsed_partitions *state)
pr_debug("GUID Partition Table is valid! Yea!\n");

for (i = 0; i < le32_to_cpu(gpt->num_partition_entries) && i < state->limit-1; i++) {
+ struct partition_meta_info *info;
+ unsigned label_count = 0;
+ unsigned label_max;
u64 start = le64_to_cpu(ptes[i].starting_lba);
u64 size = le64_to_cpu(ptes[i].ending_lba) -
le64_to_cpu(ptes[i].starting_lba) + 1ULL;
@@ -627,6 +632,26 @@ int efi_partition(struct parsed_partitions *state)
if (!efi_guidcmp(ptes[i].partition_type_guid,
PARTITION_LINUX_RAID_GUID))
state->parts[i + 1].flags = ADDPART_FLAG_RAID;
+
+ info = &state->parts[i + 1].info;
+ /* Instead of doing a manual swap to big endian, reuse the
+ * common ASCII hex format as the interim.
+ */
+ efi_guid_unparse(&ptes[i].unique_partition_guid, unparsed_guid);
+ part_pack_uuid(unparsed_guid, info->uuid);
+
+ /* Naively convert UTF16-LE to 7 bits. */
+ label_max = min(sizeof(info->volname) - 1,
+ sizeof(ptes[i].partition_name));
+ info->volname[label_max] = 0;
+ while (label_count < label_max) {
+ u8 c = ptes[i].partition_name[label_count] & 0xff;
+ if (c && !isprint(c))
+ c = '!';
+ info->volname[label_count] = c;
+ label_count++;
+ }
+ state->parts[i + 1].has_info = true;
}
kfree(ptes);
kfree(gpt);
--
1.7.0.4