Jason Gunthorpe <[email protected]> pointed out that u64 VFIO ioctl struct fields
have architecture-dependent alignment. iommufd already uses __aligned_u64 to
avoid this problem.
Reasons for using __aligned_u64 include avoiding potential information leaks
due to architecture-specific holes in structs and 32-bit userspace on 64-bit
kernel ioctl compatibility issues. See the __aligned_u64 typedef in
<uapi/linux/types.h> for details.
This series modifies the VFIO ioctl structs to use __aligned_u64. Some of the
changes preserve the existing memory layout on all architectures, so I put them
together into the first patch. The remaining patches cover structs that
need an explanation of why changing the memory layout does not break the
uapi.
Stefan Hajnoczi (4):
vfio: trivially use __aligned_u64 for ioctl structs
vfio: use __aligned_u64 in struct vfio_device_gfx_plane_info
vfio: use __aligned_u64 in struct vfio_iommu_type1_info
vfio: use __aligned_u64 in struct vfio_device_ioeventfd
include/uapi/linux/vfio.h | 27 +++++++++++++++------------
drivers/gpu/drm/i915/gvt/kvmgt.c | 4 +++-
drivers/vfio/vfio_iommu_type1.c | 11 ++---------
samples/vfio-mdev/mbochs.c | 6 ++++--
samples/vfio-mdev/mdpy.c | 4 +++-
5 files changed, 27 insertions(+), 25 deletions(-)
--
2.41.0
u64 alignment behaves differently depending on the architecture and so
<uapi/linux/types.h> offers __aligned_u64 to achieve consistent behavior
in kernel<->userspace ABIs.
There are structs in <uapi/linux/vfio.h> that can trivially be updated
to __aligned_u64 because the struct sizes are multiples of 8 bytes.
There is no change in memory layout on any CPU architecture and
therefore this change is safe.
The commits that follow this one handle the trickier cases where
explanation about ABI breakage is necessary.
Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
---
include/uapi/linux/vfio.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 20c804bdc09c..b1dfcf3b7665 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -276,8 +276,8 @@ struct vfio_region_info {
#define VFIO_REGION_INFO_FLAG_CAPS (1 << 3) /* Info supports caps */
__u32 index; /* Region index */
__u32 cap_offset; /* Offset within info struct of first cap */
- __u64 size; /* Region size (bytes) */
- __u64 offset; /* Region offset from start of device fd */
+ __aligned_u64 size; /* Region size (bytes) */
+ __aligned_u64 offset; /* Region offset from start of device fd */
};
#define VFIO_DEVICE_GET_REGION_INFO _IO(VFIO_TYPE, VFIO_BASE + 8)
@@ -293,8 +293,8 @@ struct vfio_region_info {
#define VFIO_REGION_INFO_CAP_SPARSE_MMAP 1
struct vfio_region_sparse_mmap_area {
- __u64 offset; /* Offset of mmap'able area within region */
- __u64 size; /* Size of mmap'able area */
+ __aligned_u64 offset; /* Offset of mmap'able area within region */
+ __aligned_u64 size; /* Size of mmap'able area */
};
struct vfio_region_info_cap_sparse_mmap {
@@ -449,9 +449,9 @@ struct vfio_device_migration_info {
VFIO_DEVICE_STATE_V1_RESUMING)
__u32 reserved;
- __u64 pending_bytes;
- __u64 data_offset;
- __u64 data_size;
+ __aligned_u64 pending_bytes;
+ __aligned_u64 data_offset;
+ __aligned_u64 data_size;
};
/*
@@ -475,7 +475,7 @@ struct vfio_device_migration_info {
struct vfio_region_info_cap_nvlink2_ssatgt {
struct vfio_info_cap_header header;
- __u64 tgt;
+ __aligned_u64 tgt;
};
/*
--
2.41.0
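The claim in the patch above that the layout is unchanged can be checked with a small sketch. This is not part of the series. It assumes GCC or Clang, and the typedefs u64_align4 and u64_align8 and both struct names are hypothetical stand-ins: the first models a u64 with the 4-byte alignment of i386-style ABIs, the second models __aligned_u64. The field order follows struct vfio_region_info.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stand-ins, not the uapi types. */
typedef uint64_t __attribute__((aligned(4))) u64_align4; /* i386-style u64 */
typedef uint64_t __attribute__((aligned(8))) u64_align8; /* like __aligned_u64 */

/* Layout as seen by an ABI that aligns u64 to 4 bytes. */
struct region_info_align4 {
	uint32_t argsz;
	uint32_t flags;
	uint32_t index;
	uint32_t cap_offset;
	u64_align4 size;
	u64_align4 offset;
};

/* Same fields with the 64-bit members forced to 8-byte alignment. */
struct region_info_align8 {
	uint32_t argsz;
	uint32_t flags;
	uint32_t index;
	uint32_t cap_offset;
	u64_align8 size;
	u64_align8 offset;
};

/* Four u32 fields put the first u64 at offset 16, already a multiple of 8. */
static_assert(offsetof(struct region_info_align4, size) ==
	      offsetof(struct region_info_align8, size), "size moved");
static_assert(offsetof(struct region_info_align4, offset) ==
	      offsetof(struct region_info_align8, offset), "offset moved");

/* The total size is a multiple of 8, so no tail padding is added either. */
static_assert(sizeof(struct region_info_align4) ==
	      sizeof(struct region_info_align8), "size changed");
```

If any assertion failed, the build would stop, which is the property that makes this kind of change "trivial": only the alignment requirement of the struct changes, not the position of any field.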
The memory layout of struct vfio_device_gfx_plane_info is
architecture-dependent due to a u64 field and a struct size that is not
a multiple of 8 bytes:
- On x86_64 the struct size is padded to a multiple of 8 bytes.
- On x32 the struct size is only a multiple of 4 bytes, not 8.
- Other architectures may vary.
Use __aligned_u64 to make memory layout consistent. This reduces the
chance of holes that result in an information leak and the chance of
32-bit userspace breaking on a 64-bit kernel.
This patch increases the struct size on x32 but this is safe because of
the struct's argsz field. The kernel may grow the struct as long as it
still supports smaller argsz values from userspace (e.g. applications
compiled against older kernel headers).
Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
---
include/uapi/linux/vfio.h | 3 ++-
drivers/gpu/drm/i915/gvt/kvmgt.c | 4 +++-
samples/vfio-mdev/mbochs.c | 6 ++++--
samples/vfio-mdev/mdpy.c | 4 +++-
4 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index b1dfcf3b7665..45db62d74064 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -746,7 +746,7 @@ struct vfio_device_gfx_plane_info {
__u32 drm_plane_type; /* type of plane: DRM_PLANE_TYPE_* */
/* out */
__u32 drm_format; /* drm format of plane */
- __u64 drm_format_mod; /* tiled mode */
+ __aligned_u64 drm_format_mod; /* tiled mode */
__u32 width; /* width of plane */
__u32 height; /* height of plane */
__u32 stride; /* stride of plane */
@@ -759,6 +759,7 @@ struct vfio_device_gfx_plane_info {
__u32 region_index; /* region index */
__u32 dmabuf_id; /* dma-buf id */
};
+ __u32 reserved;
};
#define VFIO_DEVICE_QUERY_GFX_PLANE _IO(VFIO_TYPE, VFIO_BASE + 14)
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index de675d799c7d..ffab3536dc8a 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1382,7 +1382,7 @@ static long intel_vgpu_ioctl(struct vfio_device *vfio_dev, unsigned int cmd,
intel_gvt_reset_vgpu(vgpu);
return 0;
} else if (cmd == VFIO_DEVICE_QUERY_GFX_PLANE) {
- struct vfio_device_gfx_plane_info dmabuf;
+ struct vfio_device_gfx_plane_info dmabuf = {};
int ret = 0;
minsz = offsetofend(struct vfio_device_gfx_plane_info,
@@ -1392,6 +1392,8 @@ static long intel_vgpu_ioctl(struct vfio_device *vfio_dev, unsigned int cmd,
if (dmabuf.argsz < minsz)
return -EINVAL;
+ minsz = min(minsz, sizeof(dmabuf));
+
ret = intel_vgpu_query_plane(vgpu, &dmabuf);
if (ret != 0)
return ret;
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c6c6b5d26670..ee42a780041f 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1262,7 +1262,7 @@ static long mbochs_ioctl(struct vfio_device *vdev, unsigned int cmd,
case VFIO_DEVICE_QUERY_GFX_PLANE:
{
- struct vfio_device_gfx_plane_info plane;
+ struct vfio_device_gfx_plane_info plane = {};
minsz = offsetofend(struct vfio_device_gfx_plane_info,
region_index);
@@ -1273,11 +1273,13 @@ static long mbochs_ioctl(struct vfio_device *vdev, unsigned int cmd,
if (plane.argsz < minsz)
return -EINVAL;
+ outsz = min_t(unsigned long, plane.argsz, sizeof(plane));
+
ret = mbochs_query_gfx_plane(mdev_state, &plane);
if (ret)
return ret;
- if (copy_to_user((void __user *)arg, &plane, minsz))
+ if (copy_to_user((void __user *)arg, &plane, outsz))
return -EFAULT;
return 0;
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index a62ea11e20ec..1500b120de04 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -591,7 +591,7 @@ static long mdpy_ioctl(struct vfio_device *vdev, unsigned int cmd,
case VFIO_DEVICE_QUERY_GFX_PLANE:
{
- struct vfio_device_gfx_plane_info plane;
+ struct vfio_device_gfx_plane_info plane = {};
minsz = offsetofend(struct vfio_device_gfx_plane_info,
region_index);
@@ -602,6 +602,8 @@ static long mdpy_ioctl(struct vfio_device *vdev, unsigned int cmd,
if (plane.argsz < minsz)
return -EINVAL;
+ minsz = min_t(unsigned long, plane.argsz, sizeof(plane));
+
ret = mdpy_query_gfx_plane(mdev_state, &plane);
if (ret)
return ret;
--
2.41.0
The memory layout of struct vfio_device_ioeventfd is
architecture-dependent due to a u64 field and a struct size that is not
a multiple of 8 bytes:
- On x86_64 the struct size is padded to a multiple of 8 bytes.
- On x32 the struct size is only a multiple of 4 bytes, not 8.
- Other architectures may vary.
Use __aligned_u64 to make memory layout consistent. This reduces the
chance of holes that result in an information leak and the chance of
32-bit userspace breaking on a 64-bit kernel.
This patch increases the struct size on x32 but this is safe because of
the struct's argsz field. The kernel may grow the struct as long as it
still supports smaller argsz values from userspace (e.g. applications
compiled against older kernel headers).
The code that uses struct vfio_device_ioeventfd already works correctly
when the struct size grows, so only the struct definition needs to be
changed.
Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
---
include/uapi/linux/vfio.h | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 0b5786ec50d8..d61269765bf8 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -794,9 +794,10 @@ struct vfio_device_ioeventfd {
#define VFIO_DEVICE_IOEVENTFD_32 (1 << 2) /* 4-byte write */
#define VFIO_DEVICE_IOEVENTFD_64 (1 << 3) /* 8-byte write */
#define VFIO_DEVICE_IOEVENTFD_SIZE_MASK (0xf)
- __u64 offset; /* device fd offset of write */
- __u64 data; /* data to be written */
+ __aligned_u64 offset; /* device fd offset of write */
+ __aligned_u64 data; /* data to be written */
__s32 fd; /* -1 for de-assignment */
+ __u32 reserved;
};
#define VFIO_DEVICE_IOEVENTFD _IO(VFIO_TYPE, VFIO_BASE + 16)
--
2.41.0
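The size change described in the patch above can be sketched as follows. This is an illustration only, assuming GCC or Clang. The typedefs and struct names are hypothetical: ioeventfd_old_align4 models the old struct on an ABI that aligns u64 to 4 bytes, and ioeventfd_new models the struct after the patch.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stand-ins, not the uapi types. */
typedef uint64_t __attribute__((aligned(4))) u64_align4; /* i386-style u64 */
typedef uint64_t __attribute__((aligned(8))) u64_align8; /* like __aligned_u64 */

/* Old layout where u64 only needs 4-byte alignment: no tail padding. */
struct ioeventfd_old_align4 {
	uint32_t argsz;
	uint32_t flags;
	u64_align4 offset;
	u64_align4 data;
	int32_t fd;
};

/* New layout: 8-byte aligned u64 fields plus an explicit reserved field. */
struct ioeventfd_new {
	uint32_t argsz;
	uint32_t flags;
	u64_align8 offset;
	u64_align8 data;
	int32_t fd;
	uint32_t reserved;
};

/* The struct grows by 4 bytes on the 4-byte-aligned ABI ... */
static_assert(sizeof(struct ioeventfd_old_align4) == 28, "old size");
static_assert(sizeof(struct ioeventfd_new) == 32, "new size");

/* ... but every existing field keeps its offset. */
static_assert(offsetof(struct ioeventfd_old_align4, offset) ==
	      offsetof(struct ioeventfd_new, offset), "offset moved");
static_assert(offsetof(struct ioeventfd_old_align4, data) ==
	      offsetof(struct ioeventfd_new, data), "data moved");
static_assert(offsetof(struct ioeventfd_old_align4, fd) ==
	      offsetof(struct ioeventfd_new, fd), "fd moved");
```

Because only the tail grows, an application built against the old header still passes a valid prefix of the new struct, which is what the argsz field is meant to allow.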
The memory layout of struct vfio_iommu_type1_info is
architecture-dependent due to a u64 field and a struct size that is not
a multiple of 8 bytes:
- On x86_64 the struct size is padded to a multiple of 8 bytes.
- On x32 the struct size is only a multiple of 4 bytes, not 8.
- Other architectures may vary.
Use __aligned_u64 to make memory layout consistent. This reduces the
chance of holes that result in an information leak and the chance of
32-bit userspace breaking on a 64-bit kernel.
This patch increases the struct size on x32 but this is safe because of
the struct's argsz field. The kernel may grow the struct as long as it
still supports smaller argsz values from userspace (e.g. applications
compiled against older kernel headers).
Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Stefan Hajnoczi <[email protected]>
---
include/uapi/linux/vfio.h | 3 ++-
drivers/vfio/vfio_iommu_type1.c | 11 ++---------
2 files changed, 4 insertions(+), 10 deletions(-)
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 45db62d74064..0b5786ec50d8 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1303,8 +1303,9 @@ struct vfio_iommu_type1_info {
__u32 flags;
#define VFIO_IOMMU_INFO_PGSIZES (1 << 0) /* supported page sizes info */
#define VFIO_IOMMU_INFO_CAPS (1 << 1) /* Info supports caps */
- __u64 iova_pgsizes; /* Bitmap of supported page sizes */
+ __aligned_u64 iova_pgsizes; /* Bitmap of supported page sizes */
__u32 cap_offset; /* Offset within info struct of first cap */
+ __u32 reserved;
};
/*
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index ebe0ad31d0b0..f51159a7a4de 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -2762,27 +2762,20 @@ static int vfio_iommu_dma_avail_build_caps(struct vfio_iommu *iommu,
static int vfio_iommu_type1_get_info(struct vfio_iommu *iommu,
unsigned long arg)
{
- struct vfio_iommu_type1_info info;
+ struct vfio_iommu_type1_info info = {};
unsigned long minsz;
struct vfio_info_cap caps = { .buf = NULL, .size = 0 };
- unsigned long capsz;
int ret;
minsz = offsetofend(struct vfio_iommu_type1_info, iova_pgsizes);
- /* For backward compatibility, cannot require this */
- capsz = offsetofend(struct vfio_iommu_type1_info, cap_offset);
-
if (copy_from_user(&info, (void __user *)arg, minsz))
return -EFAULT;
if (info.argsz < minsz)
return -EINVAL;
- if (info.argsz >= capsz) {
- minsz = capsz;
- info.cap_offset = 0; /* output, no-recopy necessary */
- }
+ minsz = min_t(unsigned long, info.argsz, sizeof(info));
mutex_lock(&iommu->lock);
info.flags = VFIO_IOMMU_INFO_PGSIZES;
--
2.41.0
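The argsz handling in the patch above can be modelled in userspace. The following is a rough sketch, not the kernel code: struct demo_info and demo_get_info() are hypothetical names, memcpy() stands in for copy_from_user() and copy_to_user(), and the values written to the output fields are made up. The offsetofend() macro is re-defined locally to mirror the kernel helper of the same name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the kernel's offsetofend() idea: end offset of a member. */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical struct shaped like vfio_iommu_type1_info after the patch. */
struct demo_info {
	uint32_t argsz;
	uint32_t flags;
	uint64_t __attribute__((aligned(8))) iova_pgsizes;
	uint32_t cap_offset;
	uint32_t reserved;
};

/*
 * Userspace model of the handler. user_buf must hold at least minsz bytes;
 * in the kernel a short buffer would make copy_from_user() fail instead.
 */
static int demo_get_info(void *user_buf)
{
	struct demo_info info = {0};	/* zeroed so no stale bytes leak out */
	size_t minsz = offsetofend(struct demo_info, iova_pgsizes);
	size_t outsz;

	memcpy(&info, user_buf, minsz);		/* models copy_from_user() */
	if (info.argsz < minsz)
		return -EINVAL;

	/* Never write more than the caller asked for or than we have. */
	outsz = info.argsz < sizeof(info) ? info.argsz : sizeof(info);

	info.flags = 1;			/* made-up output values */
	info.iova_pgsizes = 0x1000;
	info.cap_offset = 0;

	memcpy(user_buf, &info, outsz);		/* models copy_to_user() */
	return 0;
}
```

A caller built against an older header passes a smaller argsz and gets back only that many bytes, while a caller that passes a larger argsz than the kernel knows about gets sizeof(info) bytes and nothing beyond.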
> From: Stefan Hajnoczi <[email protected]>
> Sent: Thursday, August 10, 2023 5:03 AM
>
> The memory layout of struct vfio_device_gfx_plane_info is
> architecture-dependent due to a u64 field and a struct size that is not
> a multiple of 8 bytes:
> - On x86_64 the struct size is padded to a multiple of 8 bytes.
> - On x32 the struct size is only a multiple of 4 bytes, not 8.
> - Other architectures may vary.
>
> Use __aligned_u64 to make memory layout consistent. This reduces the
> chance of holes that result in an information leak and the chance that
I didn't quite get this. The leak example [1] from your earlier fix is really
not caused by the use of __u64. Instead it's a counterexample: on x32 there
is no hole, because __u64 only has 4-byte alignment.
I'd remove the hole part and just keep the compat reason.
[1] https://lore.kernel.org/lkml/[email protected]/T/
> @@ -1392,6 +1392,8 @@ static long intel_vgpu_ioctl(struct vfio_device
> *vfio_dev, unsigned int cmd,
> if (dmabuf.argsz < minsz)
> return -EINVAL;
>
> + minsz = min(minsz, sizeof(dmabuf));
> +
Is there a case where minsz could be greater than sizeof(dmabuf)?
> From: Stefan Hajnoczi <[email protected]>
> Sent: Thursday, August 10, 2023 5:03 AM
>
> @@ -1303,8 +1303,9 @@ struct vfio_iommu_type1_info {
> __u32 flags;
> #define VFIO_IOMMU_INFO_PGSIZES (1 << 0) /* supported page sizes info
> */
> #define VFIO_IOMMU_INFO_CAPS (1 << 1) /* Info supports caps */
> - __u64 iova_pgsizes; /* Bitmap of supported page sizes */
> + __aligned_u64 iova_pgsizes; /* Bitmap of supported page
> sizes */
> __u32 cap_offset; /* Offset within info struct of first cap */
> + __u32 reserved;
Isn't this conflicting with the new 'pad' field introduced in your other
patch "[PATCH v3] vfio: align capability structures"?
@@ -1304,6 +1305,7 @@ struct vfio_iommu_type1_info {
#define VFIO_IOMMU_INFO_CAPS (1 << 1) /* Info supports caps */
__u64 iova_pgsizes; /* Bitmap of supported page sizes */
__u32 cap_offset; /* Offset within info struct of first cap */
+ __u32 pad;
};
On Thu, Aug 10, 2023 at 03:25:37AM +0000, Tian, Kevin wrote:
> > From: Stefan Hajnoczi <[email protected]>
> > Sent: Thursday, August 10, 2023 5:03 AM
> >
> > @@ -1303,8 +1303,9 @@ struct vfio_iommu_type1_info {
> > __u32 flags;
> > #define VFIO_IOMMU_INFO_PGSIZES (1 << 0) /* supported page sizes info
> > */
> > #define VFIO_IOMMU_INFO_CAPS (1 << 1) /* Info supports caps */
> > - __u64 iova_pgsizes; /* Bitmap of supported page sizes */
> > + __aligned_u64 iova_pgsizes; /* Bitmap of supported page
> > sizes */
> > __u32 cap_offset; /* Offset within info struct of first cap */
> > + __u32 reserved;
>
> isn't this conflicting with the new 'pad' field introduced in your another
> patch " [PATCH v3] vfio: align capability structures"?
>
> @@ -1304,6 +1305,7 @@ struct vfio_iommu_type1_info {
> #define VFIO_IOMMU_INFO_CAPS (1 << 1) /* Info supports caps */
> __u64 iova_pgsizes; /* Bitmap of supported page sizes */
> __u32 cap_offset; /* Offset within info struct of first cap */
> + __u32 pad;
> };
Yes, I will rebase this series when "[PATCH v3] vfio: align capability
structures" is merged. I see the __aligned_u64 as a separate issue and
don't want to combine the patch series.
Stefan
On Thu, Aug 10, 2023 at 03:22:56AM +0000, Tian, Kevin wrote:
> > From: Stefan Hajnoczi <[email protected]>
> > Sent: Thursday, August 10, 2023 5:03 AM
> >
> > The memory layout of struct vfio_device_gfx_plane_info is
> > architecture-dependent due to a u64 field and a struct size that is not
> > a multiple of 8 bytes:
> > - On x86_64 the struct size is padded to a multiple of 8 bytes.
> > - On x32 the struct size is only a multiple of 4 bytes, not 8.
> > - Other architectures may vary.
> >
> > Use __aligned_u64 to make memory layout consistent. This reduces the
> > chance of holes that result in an information leak and the chance that
>
> I didn't quite get this. The leak example [1] from your earlier fix is really
> not caused by the use of __u64. Instead it's a counter example that on
> x32 there is no hole with 4byte alignment for __u64.
>
> I'd remove the hole part and just keep the compat reason.
>
> [1] https://lore.kernel.org/lkml/[email protected]/T/
Okay.
>
> > @@ -1392,6 +1392,8 @@ static long intel_vgpu_ioctl(struct vfio_device
> > *vfio_dev, unsigned int cmd,
> > if (dmabuf.argsz < minsz)
> > return -EINVAL;
> >
> > + minsz = min(minsz, sizeof(dmabuf));
> > +
>
> Is there a case where minsz could be greater than sizeof(dmabuf)?
Thanks for spotting this, it's a bug in the patch. It should be
min(dmabuf.argsz, sizeof(dmabuf)).
Stefan
From: Stefan Hajnoczi
> Sent: 09 August 2023 22:03
>
> The memory layout of struct vfio_device_gfx_plane_info is
> architecture-dependent due to a u64 field and a struct size that is not
> a multiple of 8 bytes:
> - On x86_64 the struct size is padded to a multiple of 8 bytes.
> - On x32 the struct size is only a multiple of 4 bytes, not 8.
> - Other architectures may vary.
>
> Use __aligned_u64 to make memory layout consistent. This reduces the
> chance of holes that result in an information leak and the chance that
> 32-bit userspace on a 64-bit kernel breakage.
Isn't the hole likely to cause an information leak?
Forcing it to be there doesn't make any difference.
I'd add an explicit pad as well.
It is a shame there isn't an __attribute__(()) to error padded structures.
>
> This patch increases the struct size on x32 but this is safe because of
> the struct's argsz field. The kernel may grow the struct as long as it
> still supports smaller argsz values from userspace (e.g. applications
> compiled against older kernel headers).
Doesn't changing the offset of later fields break compatibility?
The size field (probably) only lets you extend the structure.
Oh, for sanity do min(variable, constant).
David
On Tue, Aug 15, 2023 at 03:23:50PM +0000, David Laight wrote:
> From: Stefan Hajnoczi
> > Sent: 09 August 2023 22:03
> >
> > The memory layout of struct vfio_device_gfx_plane_info is
> > architecture-dependent due to a u64 field and a struct size that is not
> > a multiple of 8 bytes:
> > - On x86_64 the struct size is padded to a multiple of 8 bytes.
> > - On x32 the struct size is only a multiple of 4 bytes, not 8.
> > - Other architectures may vary.
> >
> > Use __aligned_u64 to make memory layout consistent. This reduces the
> > chance of holes that result in an information leak and the chance that
> > 32-bit userspace on a 64-bit kernel breakage.
>
> Isn't the hole likely to cause an information leak?
> Forcing it to be there doesn't make any difference.
> I'd add an explicit pad as well.
Yes, Kevin had a similar comment about this text. What I meant was that
it's safest to have a single memory layout across all architectures
(with explicit padding) so that there are no surprises. I'm going to
remove the statement about information leaks because it's confusing.
>
> It is a shame there isn't an __attribute__(()) to error padded structures.
>
> >
> > This patch increases the struct size on x32 but this is safe because of
> > the struct's argsz field. The kernel may grow the struct as long as it
> > still supports smaller argsz values from userspace (e.g. applications
> > compiled against older kernel headers).
>
> Doesn't changing the offset of later fields break compatibility?
> The size field (probably) only lets you extend the structure.
Yes, that would break compatibility but I don't see any changes in this
patch series that modifies the offsets of later fields. Have I missed
something?
> Oh, for sanity do min(variable, constant).
Can you elaborate?
Thanks,
Stefan
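On the wish in this thread for a way to reject implicitly padded structs: one possible approach is a compile-time check that the struct ends exactly where its last member ends. The sketch below is only an illustration, assuming GCC or Clang. TAIL_PADDING() and both struct names are hypothetical, and the check covers tail padding only, not holes between members.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical stand-in for __aligned_u64. */
typedef uint64_t __attribute__((aligned(8))) u64_align8;

/* End offset of a member, mirroring the kernel's offsetofend() helper. */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical helper: bytes the compiler added after the last member. */
#define TAIL_PADDING(type, last) (sizeof(type) - offsetofend(type, last))

/* The 8-byte alignment forces 4 bytes of implicit padding at the end. */
struct info_implicit_pad {
	uint32_t argsz;
	uint32_t flags;
	u64_align8 iova_pgsizes;
	uint32_t cap_offset;
};

/* The same struct with the padding spelled out as a named field. */
struct info_explicit_pad {
	uint32_t argsz;
	uint32_t flags;
	u64_align8 iova_pgsizes;
	uint32_t cap_offset;
	uint32_t reserved;
};

/* Breaks the build if someone drops the explicit pad field later on. */
static_assert(TAIL_PADDING(struct info_explicit_pad, reserved) == 0,
	      "struct has implicit tail padding");
```

GCC and Clang also offer the -Wpadded warning, which reports padding anywhere in a struct, although it tends to be noisy when applied to a whole code base.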