2023-05-09 05:17:36

by Lucas De Marchi

Subject: [PATCH 0/3] Fixed-width mask/bit helpers

Generalize the REG_GENMASK*() and REG_BIT*() macros so they can be used
by other drivers. The intention is to migrate i915 to the generic
helpers and also make use of them on the upcoming xe driver. There are
possibly other users in the kernel that need u32/u16/u8 bit handling.
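As a rough illustration of the idea, the following userspace sketch shows
what fixed-width bit and mask helpers with a compile-time range check could
look like. It is not the implementation from this series: the SKETCH_*
names are hypothetical, and the negative-array-size trick stands in for
the kernel's BUILD_BUG_ON_ZERO().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for BUILD_BUG_ON_ZERO(): evaluates to 0 when
 * cond is false, and fails to compile (negative array size) when cond
 * is true. Only meant for constant arguments.
 */
#define SKETCH_BUILD_BUG_ON_ZERO(cond) (sizeof(char[1 - 2 * !!(cond)]) - 1)

/* Single bit, result truncated to the named width. */
#define SKETCH_BIT_U32(n) \
	((uint32_t)((UINT32_C(1) << (n)) + \
		    SKETCH_BUILD_BUG_ON_ZERO((n) < 0 || (n) > 31)))

#define SKETCH_BIT_U8(n) \
	((uint8_t)((1u << (n)) + \
		   SKETCH_BUILD_BUG_ON_ZERO((n) < 0 || (n) > 7)))

/* Contiguous mask from bit h down to bit l, inclusive. */
#define SKETCH_GENMASK_U32(h, l) \
	((uint32_t)(((UINT32_MAX >> (31 - (h))) & (UINT32_MAX << (l))) + \
		    SKETCH_BUILD_BUG_ON_ZERO((l) < 0 || (h) > 31 || (l) > (h))))

#define SKETCH_GENMASK_U8(h, l) \
	((uint8_t)(((0xFFu >> (7 - (h))) & (0xFFu << (l))) + \
		   SKETCH_BUILD_BUG_ON_ZERO((l) < 0 || (h) > 7 || (l) > (h))))

/* The results are integer constant expressions of the expected width. */
static_assert(SKETCH_GENMASK_U8(7, 4) == 0xF0, "u8 mask [7:4]");
static_assert(sizeof(SKETCH_BIT_U8(0)) == sizeof(uint8_t), "u8 result");

/* Example use: bit 31 set plus the value 0xA placed in field [7:4]. */
static inline uint32_t sketch_example_reg(void)
{
	return SKETCH_BIT_U32(31) | (SKETCH_GENMASK_U32(7, 4) & (0xAu << 4));
}
```

With these definitions, something like SKETCH_BIT_U8(8) or
SKETCH_GENMASK_U32(3, 5) is rejected at build time instead of silently
producing a truncated or empty value.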

The first patch is one possible way to free up the U32() name in the
radeon/amdgpu drivers so that it can be used as planned here. Other
alternatives would be to use an amd/radeon prefix or to use _Generic().

The last patch is a temporary one to demonstrate what would change on
the i915 side. However, rather than only replacing the implementation of
the REG_* macros, the goal is to replace the callers as well.

The patches here are currently based on the drm-tip branch.

Lucas De Marchi (3):
drm/amd: Remove wrapper macros over get_u{32,16,8}
linux/bits.h: Add fixed-width GENMASK and BIT macros
drm/i915: Temporary conversion to new GENMASK/BIT macros

drivers/gpu/drm/amd/amdgpu/atom.c | 212 ++++++++++++------------
drivers/gpu/drm/amd/include/atom-bits.h | 9 +-
drivers/gpu/drm/i915/i915_reg_defs.h | 28 +---
drivers/gpu/drm/radeon/atom-bits.h | 9 +-
drivers/gpu/drm/radeon/atom.c | 209 +++++++++++------------
include/linux/bits.h | 22 +++
include/uapi/linux/const.h | 2 +
include/vdso/const.h | 1 +
8 files changed, 249 insertions(+), 243 deletions(-)

--
2.40.1


2023-05-09 05:29:10

by Lucas De Marchi

Subject: [PATCH 3/3] drm/i915: Temporary conversion to new GENMASK/BIT macros

Convert the REG_* macros from i915_reg_defs.h to use the new macros
defined in linux/bits.h. This is just to help with the implementation
of the new macros and is not intended to be applied.

Signed-off-by: Lucas De Marchi <[email protected]>
---
drivers/gpu/drm/i915/i915_reg_defs.h | 28 +++++-----------------------
1 file changed, 5 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
index 622d603080f9..61fbb8d62b25 100644
--- a/drivers/gpu/drm/i915/i915_reg_defs.h
+++ b/drivers/gpu/drm/i915/i915_reg_defs.h
@@ -17,10 +17,7 @@
*
* @return: Value with bit @__n set.
*/
-#define REG_BIT(__n) \
- ((u32)(BIT(__n) + \
- BUILD_BUG_ON_ZERO(__is_constexpr(__n) && \
- ((__n) < 0 || (__n) > 31))))
+#define REG_BIT(__n) BIT_U32(__n)

/**
* REG_BIT8() - Prepare a u8 bit value
@@ -30,10 +27,7 @@
*
* @return: Value with bit @__n set.
*/
-#define REG_BIT8(__n) \
- ((u8)(BIT(__n) + \
- BUILD_BUG_ON_ZERO(__is_constexpr(__n) && \
- ((__n) < 0 || (__n) > 7))))
+#define REG_BIT8(__n) BIT_U8(__n)

/**
* REG_GENMASK() - Prepare a continuous u32 bitmask
@@ -44,11 +38,7 @@
*
* @return: Continuous bitmask from @__high to @__low, inclusive.
*/
-#define REG_GENMASK(__high, __low) \
- ((u32)(GENMASK(__high, __low) + \
- BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
- __is_constexpr(__low) && \
- ((__low) < 0 || (__high) > 31 || (__low) > (__high)))))
+#define REG_GENMASK(__high, __low) GENMASK_U32(__high, __low)

/**
* REG_GENMASK64() - Prepare a continuous u64 bitmask
@@ -59,11 +49,7 @@
*
* @return: Continuous bitmask from @__high to @__low, inclusive.
*/
-#define REG_GENMASK64(__high, __low) \
- ((u64)(GENMASK_ULL(__high, __low) + \
- BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
- __is_constexpr(__low) && \
- ((__low) < 0 || (__high) > 63 || (__low) > (__high)))))
+#define REG_GENMASK64(__high, __low) GENMASK_ULL(__high, __low)

/**
* REG_GENMASK8() - Prepare a continuous u8 bitmask
@@ -74,11 +60,7 @@
*
* @return: Continuous bitmask from @__high to @__low, inclusive.
*/
-#define REG_GENMASK8(__high, __low) \
- ((u8)(GENMASK(__high, __low) + \
- BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
- __is_constexpr(__low) && \
- ((__low) < 0 || (__high) > 7 || (__low) > (__high)))))
+#define REG_GENMASK8(__high, __low) GENMASK_U8(__high, __low)

/*
* Local integer constant expression version of is_power_of_2().
--
2.40.1

2023-05-09 05:29:16

by Lucas De Marchi

Subject: [PATCH 1/3] drm/amd: Remove wrapper macros over get_u{32,16,8}

Both amdgpu and radeon use wrapper macros over the get_u{32,16,8}()
functions, which end up adding an implicit argument. Instead of using
the macros, just call the functions directly without hiding the context
that is being passed. This frees the macro names to be used in a more
global context, as ULL() and UL() currently are.
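To make the implicit argument concrete, here is a small userspace sketch.
The get_u*() helpers follow the shape of the ones in atom-bits.h; the
sketch_* structs and sketch_read_idx() are hypothetical stand-ins for the
driver's context types and are only there so the example compiles on its
own.

```c
#include <stdint.h>

/* Read little-endian values from a BIOS image at a byte offset. */
static inline uint8_t get_u8(void *bios, int ptr)
{
	return ((unsigned char *)bios)[ptr];
}

static inline uint16_t get_u16(void *bios, int ptr)
{
	return get_u8(bios, ptr) | (((uint16_t)get_u8(bios, ptr + 1)) << 8);
}

static inline uint32_t get_u32(void *bios, int ptr)
{
	return get_u16(bios, ptr) | (((uint32_t)get_u16(bios, ptr + 2)) << 16);
}

/* Hypothetical, reduced versions of the two context levels. */
struct sketch_atom_context {
	void *bios;
};

struct sketch_exec_context {
	struct sketch_atom_context *ctx;
};

/*
 * With the old wrapper this would read "idx = U16(*ptr);", which only
 * compiles if a variable named ctx of the right type is in scope.
 * Calling the function directly spells out where the image comes from.
 */
static inline uint16_t sketch_read_idx(struct sketch_exec_context *ctx,
				       int *ptr)
{
	uint16_t idx = get_u16(ctx->ctx->bios, *ptr);

	*ptr += 2;
	return idx;
}
```

The direct call is longer, but it no longer depends on the name and type
of a local variable at the call site.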

Callers are automatically converted with the following coccinelle
script:

$ cat utype.cocci
virtual patch

@@
expression e;
@@
(
- U32(e)
+ get_u32(ctx->ctx->bios, e)
|
- U16(e)
+ get_u16(ctx->ctx->bios, e)
|
- U8(e)
+ get_u8(ctx->ctx->bios, e)
|
- CU32(e)
+ get_u32(ctx->bios, e)
|
- CU16(e)
+ get_u16(ctx->bios, e)
|
- CU8(e)
+ get_u8(ctx->bios, e)
)

$ coccicheck SPFLAGS=--in-place MODE=patch \
COCCI=utype.cocci \
M=./drivers/gpu/drm/

Signed-off-by: Lucas De Marchi <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/atom.c | 212 ++++++++++++------------
drivers/gpu/drm/amd/include/atom-bits.h | 9 +-
drivers/gpu/drm/radeon/atom-bits.h | 9 +-
drivers/gpu/drm/radeon/atom.c | 209 +++++++++++------------
4 files changed, 219 insertions(+), 220 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/atom.c b/drivers/gpu/drm/amd/amdgpu/atom.c
index 1c5d9388ad0b..eea49bfb403f 100644
--- a/drivers/gpu/drm/amd/amdgpu/atom.c
+++ b/drivers/gpu/drm/amd/amdgpu/atom.c
@@ -112,62 +112,62 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
uint32_t temp = 0xCDCDCDCD;

while (1)
- switch (CU8(base)) {
+ switch (get_u8(ctx->bios, base)) {
case ATOM_IIO_NOP:
base++;
break;
case ATOM_IIO_READ:
- temp = ctx->card->reg_read(ctx->card, CU16(base + 1));
+ temp = ctx->card->reg_read(ctx->card,
+ get_u16(ctx->bios, base + 1));
base += 3;
break;
case ATOM_IIO_WRITE:
- ctx->card->reg_write(ctx->card, CU16(base + 1), temp);
+ ctx->card->reg_write(ctx->card,
+ get_u16(ctx->bios, base + 1),
+ temp);
base += 3;
break;
case ATOM_IIO_CLEAR:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 2));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 2));
base += 3;
break;
case ATOM_IIO_SET:
temp |=
- (0xFFFFFFFF >> (32 - CU8(base + 1))) << CU8(base +
- 2);
+ (0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) << get_u8(ctx->bios,
+ base + 2);
base += 3;
break;
case ATOM_IIO_MOVE_INDEX:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 3));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 3));
temp |=
- ((index >> CU8(base + 2)) &
- (0xFFFFFFFF >> (32 - CU8(base + 1)))) << CU8(base +
- 3);
+ ((index >> get_u8(ctx->bios, base + 2)) &
+ (0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1)))) << get_u8(ctx->bios,
+ base + 3);
base += 4;
break;
case ATOM_IIO_MOVE_DATA:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 3));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 3));
temp |=
- ((data >> CU8(base + 2)) &
- (0xFFFFFFFF >> (32 - CU8(base + 1)))) << CU8(base +
- 3);
+ ((data >> get_u8(ctx->bios, base + 2)) &
+ (0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1)))) << get_u8(ctx->bios,
+ base + 3);
base += 4;
break;
case ATOM_IIO_MOVE_ATTR:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 3));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 3));
temp |=
((ctx->
- io_attr >> CU8(base + 2)) & (0xFFFFFFFF >> (32 -
- CU8
- (base
- +
- 1))))
- << CU8(base + 3);
+ io_attr >> get_u8(ctx->bios, base + 2)) & (0xFFFFFFFF >> (32 -
+ get_u8(ctx->bios, base + 1))))
+ << get_u8(ctx->bios, base + 3);
base += 4;
break;
case ATOM_IIO_END:
@@ -187,7 +187,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
align = (attr >> 3) & 7;
switch (arg) {
case ATOM_ARG_REG:
- idx = U16(*ptr);
+ idx = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
if (print)
DEBUG("REG[0x%04X]", idx);
@@ -219,7 +219,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
}
break;
case ATOM_ARG_PS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
/* get_unaligned_le32 avoids unaligned accesses from atombios
* tables, noticed on a DEC Alpha. */
@@ -228,7 +228,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
DEBUG("PS[0x%02X,0x%04X]", idx, val);
break;
case ATOM_ARG_WS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("WS[0x%02X]", idx);
@@ -265,7 +265,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
}
break;
case ATOM_ARG_ID:
- idx = U16(*ptr);
+ idx = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
if (print) {
if (gctx->data_block)
@@ -273,10 +273,10 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
else
DEBUG("ID[0x%04X]", idx);
}
- val = U32(idx + gctx->data_block);
+ val = get_u32(ctx->ctx->bios, idx + gctx->data_block);
break;
case ATOM_ARG_FB:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if ((gctx->fb_base + (idx * 4)) > gctx->scratch_size_bytes) {
DRM_ERROR("ATOM: fb read beyond scratch region: %d vs. %d\n",
@@ -290,7 +290,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
case ATOM_ARG_IMM:
switch (align) {
case ATOM_SRC_DWORD:
- val = U32(*ptr);
+ val = get_u32(ctx->ctx->bios, *ptr);
(*ptr) += 4;
if (print)
DEBUG("IMM 0x%08X\n", val);
@@ -298,7 +298,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
case ATOM_SRC_WORD0:
case ATOM_SRC_WORD8:
case ATOM_SRC_WORD16:
- val = U16(*ptr);
+ val = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
if (print)
DEBUG("IMM 0x%04X\n", val);
@@ -307,7 +307,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
case ATOM_SRC_BYTE8:
case ATOM_SRC_BYTE16:
case ATOM_SRC_BYTE24:
- val = U8(*ptr);
+ val = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("IMM 0x%02X\n", val);
@@ -315,14 +315,14 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
}
return 0;
case ATOM_ARG_PLL:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("PLL[0x%02X]", idx);
val = gctx->card->pll_read(gctx->card, idx);
break;
case ATOM_ARG_MC:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("MC[0x%02X]", idx);
@@ -410,20 +410,20 @@ static uint32_t atom_get_src_direct(atom_exec_context *ctx, uint8_t align, int *

switch (align) {
case ATOM_SRC_DWORD:
- val = U32(*ptr);
+ val = get_u32(ctx->ctx->bios, *ptr);
(*ptr) += 4;
break;
case ATOM_SRC_WORD0:
case ATOM_SRC_WORD8:
case ATOM_SRC_WORD16:
- val = U16(*ptr);
+ val = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
break;
case ATOM_SRC_BYTE0:
case ATOM_SRC_BYTE8:
case ATOM_SRC_BYTE16:
case ATOM_SRC_BYTE24:
- val = U8(*ptr);
+ val = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
break;
}
@@ -460,7 +460,7 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
val |= saved;
switch (arg) {
case ATOM_ARG_REG:
- idx = U16(*ptr);
+ idx = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
DEBUG("REG[0x%04X]", idx);
idx += gctx->reg_block;
@@ -493,13 +493,13 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
}
break;
case ATOM_ARG_PS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("PS[0x%02X]", idx);
ctx->ps[idx] = cpu_to_le32(val);
break;
case ATOM_ARG_WS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("WS[0x%02X]", idx);
switch (idx) {
@@ -532,7 +532,7 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
}
break;
case ATOM_ARG_FB:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if ((gctx->fb_base + (idx * 4)) > gctx->scratch_size_bytes) {
DRM_ERROR("ATOM: fb write beyond scratch region: %d vs. %d\n",
@@ -542,13 +542,13 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
DEBUG("FB[0x%02X]", idx);
break;
case ATOM_ARG_PLL:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("PLL[0x%02X]", idx);
gctx->card->pll_write(gctx->card, idx, val);
break;
case ATOM_ARG_MC:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("MC[0x%02X]", idx);
gctx->card->mc_write(gctx->card, idx, val);
@@ -584,7 +584,7 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,

static void atom_op_add(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -598,7 +598,7 @@ static void atom_op_add(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_and(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -617,14 +617,14 @@ static void atom_op_beep(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_calltable(atom_exec_context *ctx, int *ptr, int arg)
{
- int idx = U8((*ptr)++);
+ int idx = get_u8(ctx->ctx->bios, (*ptr)++);
int r = 0;

if (idx < ATOM_TABLE_NAMES_CNT)
SDEBUG(" table: %d (%s)\n", idx, atom_table_names[idx]);
else
SDEBUG(" table: %d\n", idx);
- if (U16(ctx->ctx->cmd_table + 4 + 2 * idx))
+ if (get_u16(ctx->ctx->bios, ctx->ctx->cmd_table + 4 + 2 * idx))
r = amdgpu_atom_execute_table_locked(ctx->ctx, idx, ctx->ps + ctx->ps_shift);
if (r) {
ctx->abort = true;
@@ -633,7 +633,7 @@ static void atom_op_calltable(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_clear(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t saved;
int dptr = *ptr;
attr &= 0x38;
@@ -645,7 +645,7 @@ static void atom_op_clear(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_compare(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -659,7 +659,7 @@ static void atom_op_compare(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_delay(atom_exec_context *ctx, int *ptr, int arg)
{
- unsigned count = U8((*ptr)++);
+ unsigned count = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG(" count: %d\n", count);
if (arg == ATOM_UNIT_MICROSEC)
udelay(count);
@@ -671,7 +671,7 @@ static void atom_op_delay(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_div(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -689,7 +689,7 @@ static void atom_op_div(atom_exec_context *ctx, int *ptr, int arg)
static void atom_op_div32(atom_exec_context *ctx, int *ptr, int arg)
{
uint64_t val64;
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -714,7 +714,7 @@ static void atom_op_eot(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_jump(atom_exec_context *ctx, int *ptr, int arg)
{
- int execute = 0, target = U16(*ptr);
+ int execute = 0, target = get_u16(ctx->ctx->bios, *ptr);
unsigned long cjiffies;

(*ptr) += 2;
@@ -768,7 +768,7 @@ static void atom_op_jump(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_mask(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, mask, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -785,7 +785,7 @@ static void atom_op_mask(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_move(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t src, saved;
int dptr = *ptr;
if (((attr >> 3) & 7) != ATOM_SRC_DWORD)
@@ -802,7 +802,7 @@ static void atom_op_move(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_mul(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -814,7 +814,7 @@ static void atom_op_mul(atom_exec_context *ctx, int *ptr, int arg)
static void atom_op_mul32(atom_exec_context *ctx, int *ptr, int arg)
{
uint64_t val64;
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -832,7 +832,7 @@ static void atom_op_nop(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_or(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -846,7 +846,7 @@ static void atom_op_or(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_postcard(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t val = U8((*ptr)++);
+ uint8_t val = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG("POST card output: 0x%02X\n", val);
}

@@ -867,7 +867,7 @@ static void atom_op_savereg(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_setdatablock(atom_exec_context *ctx, int *ptr, int arg)
{
- int idx = U8(*ptr);
+ int idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
SDEBUG(" block: %d\n", idx);
if (!idx)
@@ -875,13 +875,14 @@ static void atom_op_setdatablock(atom_exec_context *ctx, int *ptr, int arg)
else if (idx == 255)
ctx->ctx->data_block = ctx->start;
else
- ctx->ctx->data_block = U16(ctx->ctx->data_table + 4 + 2 * idx);
+ ctx->ctx->data_block = get_u16(ctx->ctx->bios,
+ ctx->ctx->data_table + 4 + 2 * idx);
SDEBUG(" base: 0x%04X\n", ctx->ctx->data_block);
}

static void atom_op_setfbbase(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG(" fb_base: ");
ctx->ctx->fb_base = atom_get_src(ctx, attr, ptr);
}
@@ -891,7 +892,7 @@ static void atom_op_setport(atom_exec_context *ctx, int *ptr, int arg)
int port;
switch (arg) {
case ATOM_PORT_ATI:
- port = U16(*ptr);
+ port = get_u16(ctx->ctx->bios, *ptr);
if (port < ATOM_IO_NAMES_CNT)
SDEBUG(" port: %d (%s)\n", port, atom_io_names[port]);
else
@@ -915,14 +916,14 @@ static void atom_op_setport(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_setregblock(atom_exec_context *ctx, int *ptr, int arg)
{
- ctx->ctx->reg_block = U16(*ptr);
+ ctx->ctx->reg_block = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
SDEBUG(" base: 0x%04X\n", ctx->ctx->reg_block);
}

static void atom_op_shift_left(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
attr &= 0x38;
@@ -938,7 +939,7 @@ static void atom_op_shift_left(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_shift_right(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
attr &= 0x38;
@@ -954,7 +955,7 @@ static void atom_op_shift_right(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_shl(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
uint32_t dst_align = atom_dst_to_src[(attr >> 3) & 7][(attr >> 6) & 3];
@@ -973,7 +974,7 @@ static void atom_op_shl(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_shr(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
uint32_t dst_align = atom_dst_to_src[(attr >> 3) & 7][(attr >> 6) & 3];
@@ -992,7 +993,7 @@ static void atom_op_shr(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_sub(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -1006,18 +1007,18 @@ static void atom_op_sub(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_switch(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t src, val, target;
SDEBUG(" switch: ");
src = atom_get_src(ctx, attr, ptr);
- while (U16(*ptr) != ATOM_CASE_END)
- if (U8(*ptr) == ATOM_CASE_MAGIC) {
+ while (get_u16(ctx->ctx->bios, *ptr) != ATOM_CASE_END)
+ if (get_u8(ctx->ctx->bios, *ptr) == ATOM_CASE_MAGIC) {
(*ptr)++;
SDEBUG(" case: ");
val =
atom_get_src(ctx, (attr & 0x38) | ATOM_ARG_IMM,
ptr);
- target = U16(*ptr);
+ target = get_u16(ctx->ctx->bios, *ptr);
if (val == src) {
SDEBUG(" target: %04X\n", target);
*ptr = ctx->start + target;
@@ -1033,7 +1034,7 @@ static void atom_op_switch(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_test(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -1045,7 +1046,7 @@ static void atom_op_test(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_xor(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -1059,13 +1060,13 @@ static void atom_op_xor(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_debug(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t val = U8((*ptr)++);
+ uint8_t val = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG("DEBUG output: 0x%02X\n", val);
}

static void atom_op_processds(atom_exec_context *ctx, int *ptr, int arg)
{
- uint16_t val = U16(*ptr);
+ uint16_t val = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += val + 2;
SDEBUG("PROCESSDS output: 0x%02X\n", val);
}
@@ -1206,7 +1207,7 @@ static struct {

static int amdgpu_atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t *params)
{
- int base = CU16(ctx->cmd_table + 4 + 2 * index);
+ int base = get_u16(ctx->bios, ctx->cmd_table + 4 + 2 * index);
int len, ws, ps, ptr;
unsigned char op;
atom_exec_context ectx;
@@ -1215,9 +1216,9 @@ static int amdgpu_atom_execute_table_locked(struct atom_context *ctx, int index,
if (!base)
return -EINVAL;

- len = CU16(base + ATOM_CT_SIZE_PTR);
- ws = CU8(base + ATOM_CT_WS_PTR);
- ps = CU8(base + ATOM_CT_PS_PTR) & ATOM_CT_PS_MASK;
+ len = get_u16(ctx->bios, base + ATOM_CT_SIZE_PTR);
+ ws = get_u8(ctx->bios, base + ATOM_CT_WS_PTR);
+ ps = get_u8(ctx->bios, base + ATOM_CT_PS_PTR) & ATOM_CT_PS_MASK;
ptr = base + ATOM_CT_CODE_PTR;

SDEBUG(">> execute %04X (len %d, WS %d, PS %d)\n", base, len, ws, ps);
@@ -1235,7 +1236,7 @@ static int amdgpu_atom_execute_table_locked(struct atom_context *ctx, int index,

debug_depth++;
while (1) {
- op = CU8(ptr++);
+ op = get_u8(ctx->bios, ptr++);
if (op < ATOM_OP_NAMES_CNT)
SDEBUG("%s @ 0x%04X\n", atom_op_names[op], ptr - 1);
else
@@ -1293,11 +1294,11 @@ static void atom_index_iio(struct atom_context *ctx, int base)
ctx->iio = kzalloc(2 * 256, GFP_KERNEL);
if (!ctx->iio)
return;
- while (CU8(base) == ATOM_IIO_START) {
- ctx->iio[CU8(base + 1)] = base + 2;
+ while (get_u8(ctx->bios, base) == ATOM_IIO_START) {
+ ctx->iio[get_u8(ctx->bios, base + 1)] = base + 2;
base += 2;
- while (CU8(base) != ATOM_IIO_END)
- base += atom_iio_len[CU8(base)];
+ while (get_u8(ctx->bios, base) != ATOM_IIO_END)
+ base += atom_iio_len[get_u8(ctx->bios, base)];
base += 3;
}
}
@@ -1472,7 +1473,7 @@ struct atom_context *amdgpu_atom_parse(struct card_info *card, void *bios)
ctx->card = card;
ctx->bios = bios;

- if (CU16(0) != ATOM_BIOS_MAGIC) {
+ if (get_u16(ctx->bios, 0) != ATOM_BIOS_MAGIC) {
pr_info("Invalid BIOS magic\n");
kfree(ctx);
return NULL;
@@ -1485,7 +1486,7 @@ struct atom_context *amdgpu_atom_parse(struct card_info *card, void *bios)
return NULL;
}

- base = CU16(ATOM_ROM_TABLE_PTR);
+ base = get_u16(ctx->bios, ATOM_ROM_TABLE_PTR);
if (strncmp
(CSTR(base + ATOM_ROM_MAGIC_PTR), ATOM_ROM_MAGIC,
strlen(ATOM_ROM_MAGIC))) {
@@ -1494,15 +1495,16 @@ struct atom_context *amdgpu_atom_parse(struct card_info *card, void *bios)
return NULL;
}

- ctx->cmd_table = CU16(base + ATOM_ROM_CMD_PTR);
- ctx->data_table = CU16(base + ATOM_ROM_DATA_PTR);
- atom_index_iio(ctx, CU16(ctx->data_table + ATOM_DATA_IIO_PTR) + 4);
+ ctx->cmd_table = get_u16(ctx->bios, base + ATOM_ROM_CMD_PTR);
+ ctx->data_table = get_u16(ctx->bios, base + ATOM_ROM_DATA_PTR);
+ atom_index_iio(ctx,
+ get_u16(ctx->bios, ctx->data_table + ATOM_DATA_IIO_PTR) + 4);
if (!ctx->iio) {
amdgpu_atom_destroy(ctx);
return NULL;
}

- idx = CU16(ATOM_ROM_PART_NUMBER_PTR);
+ idx = get_u16(ctx->bios, ATOM_ROM_PART_NUMBER_PTR);
if (idx == 0)
idx = 0x80;

@@ -1533,18 +1535,18 @@ struct atom_context *amdgpu_atom_parse(struct card_info *card, void *bios)

int amdgpu_atom_asic_init(struct atom_context *ctx)
{
- int hwi = CU16(ctx->data_table + ATOM_DATA_FWI_PTR);
+ int hwi = get_u16(ctx->bios, ctx->data_table + ATOM_DATA_FWI_PTR);
uint32_t ps[16];
int ret;

memset(ps, 0, 64);

- ps[0] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFSCLK_PTR));
- ps[1] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFMCLK_PTR));
+ ps[0] = cpu_to_le32(get_u32(ctx->bios, hwi + ATOM_FWI_DEFSCLK_PTR));
+ ps[1] = cpu_to_le32(get_u32(ctx->bios, hwi + ATOM_FWI_DEFMCLK_PTR));
if (!ps[0] || !ps[1])
return 1;

- if (!CU16(ctx->cmd_table + 4 + 2 * ATOM_CMD_INIT))
+ if (!get_u16(ctx->bios, ctx->cmd_table + 4 + 2 * ATOM_CMD_INIT))
return 1;
ret = amdgpu_atom_execute_table(ctx, ATOM_CMD_INIT, ps);
if (ret)
@@ -1566,18 +1568,18 @@ bool amdgpu_atom_parse_data_header(struct atom_context *ctx, int index,
uint16_t *data_start)
{
int offset = index * 2 + 4;
- int idx = CU16(ctx->data_table + offset);
+ int idx = get_u16(ctx->bios, ctx->data_table + offset);
u16 *mdt = (u16 *)(ctx->bios + ctx->data_table + 4);

if (!mdt[index])
return false;

if (size)
- *size = CU16(idx);
+ *size = get_u16(ctx->bios, idx);
if (frev)
- *frev = CU8(idx + 2);
+ *frev = get_u8(ctx->bios, idx + 2);
if (crev)
- *crev = CU8(idx + 3);
+ *crev = get_u8(ctx->bios, idx + 3);
*data_start = idx;
return true;
}
@@ -1586,16 +1588,16 @@ bool amdgpu_atom_parse_cmd_header(struct atom_context *ctx, int index, uint8_t *
uint8_t *crev)
{
int offset = index * 2 + 4;
- int idx = CU16(ctx->cmd_table + offset);
+ int idx = get_u16(ctx->bios, ctx->cmd_table + offset);
u16 *mct = (u16 *)(ctx->bios + ctx->cmd_table + 4);

if (!mct[index])
return false;

if (frev)
- *frev = CU8(idx + 2);
+ *frev = get_u8(ctx->bios, idx + 2);
if (crev)
- *crev = CU8(idx + 3);
+ *crev = get_u8(ctx->bios, idx + 3);
return true;
}

diff --git a/drivers/gpu/drm/amd/include/atom-bits.h b/drivers/gpu/drm/amd/include/atom-bits.h
index e8fae5c77514..28c196a91221 100644
--- a/drivers/gpu/drm/amd/include/atom-bits.h
+++ b/drivers/gpu/drm/amd/include/atom-bits.h
@@ -29,20 +29,17 @@ static inline uint8_t get_u8(void *bios, int ptr)
{
return ((unsigned char *)bios)[ptr];
}
-#define U8(ptr) get_u8(ctx->ctx->bios, (ptr))
-#define CU8(ptr) get_u8(ctx->bios, (ptr))
+
static inline uint16_t get_u16(void *bios, int ptr)
{
return get_u8(bios ,ptr)|(((uint16_t)get_u8(bios, ptr+1))<<8);
}
-#define U16(ptr) get_u16(ctx->ctx->bios, (ptr))
-#define CU16(ptr) get_u16(ctx->bios, (ptr))
+
static inline uint32_t get_u32(void *bios, int ptr)
{
return get_u16(bios, ptr)|(((uint32_t)get_u16(bios, ptr+2))<<16);
}
-#define U32(ptr) get_u32(ctx->ctx->bios, (ptr))
-#define CU32(ptr) get_u32(ctx->bios, (ptr))
+
#define CSTR(ptr) (((char *)(ctx->bios))+(ptr))

#endif
diff --git a/drivers/gpu/drm/radeon/atom-bits.h b/drivers/gpu/drm/radeon/atom-bits.h
index e8fae5c77514..28c196a91221 100644
--- a/drivers/gpu/drm/radeon/atom-bits.h
+++ b/drivers/gpu/drm/radeon/atom-bits.h
@@ -29,20 +29,17 @@ static inline uint8_t get_u8(void *bios, int ptr)
{
return ((unsigned char *)bios)[ptr];
}
-#define U8(ptr) get_u8(ctx->ctx->bios, (ptr))
-#define CU8(ptr) get_u8(ctx->bios, (ptr))
+
static inline uint16_t get_u16(void *bios, int ptr)
{
return get_u8(bios ,ptr)|(((uint16_t)get_u8(bios, ptr+1))<<8);
}
-#define U16(ptr) get_u16(ctx->ctx->bios, (ptr))
-#define CU16(ptr) get_u16(ctx->bios, (ptr))
+
static inline uint32_t get_u32(void *bios, int ptr)
{
return get_u16(bios, ptr)|(((uint32_t)get_u16(bios, ptr+2))<<16);
}
-#define U32(ptr) get_u32(ctx->ctx->bios, (ptr))
-#define CU32(ptr) get_u32(ctx->bios, (ptr))
+
#define CSTR(ptr) (((char *)(ctx->bios))+(ptr))

#endif
diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
index c1bbfbe28bda..1c54d52c4cb0 100644
--- a/drivers/gpu/drm/radeon/atom.c
+++ b/drivers/gpu/drm/radeon/atom.c
@@ -112,64 +112,65 @@ static uint32_t atom_iio_execute(struct atom_context *ctx, int base,
uint32_t temp = 0xCDCDCDCD;

while (1)
- switch (CU8(base)) {
+ switch (get_u8(ctx->bios, base)) {
case ATOM_IIO_NOP:
base++;
break;
case ATOM_IIO_READ:
- temp = ctx->card->ioreg_read(ctx->card, CU16(base + 1));
+ temp = ctx->card->ioreg_read(ctx->card,
+ get_u16(ctx->bios, base + 1));
base += 3;
break;
case ATOM_IIO_WRITE:
if (rdev->family == CHIP_RV515)
- (void)ctx->card->ioreg_read(ctx->card, CU16(base + 1));
- ctx->card->ioreg_write(ctx->card, CU16(base + 1), temp);
+ (void)ctx->card->ioreg_read(ctx->card,
+ get_u16(ctx->bios, base + 1));
+ ctx->card->ioreg_write(ctx->card,
+ get_u16(ctx->bios, base + 1),
+ temp);
base += 3;
break;
case ATOM_IIO_CLEAR:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 2));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 2));
base += 3;
break;
case ATOM_IIO_SET:
temp |=
- (0xFFFFFFFF >> (32 - CU8(base + 1))) << CU8(base +
- 2);
+ (0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) << get_u8(ctx->bios,
+ base + 2);
base += 3;
break;
case ATOM_IIO_MOVE_INDEX:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 3));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 3));
temp |=
- ((index >> CU8(base + 2)) &
- (0xFFFFFFFF >> (32 - CU8(base + 1)))) << CU8(base +
- 3);
+ ((index >> get_u8(ctx->bios, base + 2)) &
+ (0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1)))) << get_u8(ctx->bios,
+ base + 3);
base += 4;
break;
case ATOM_IIO_MOVE_DATA:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 3));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 3));
temp |=
- ((data >> CU8(base + 2)) &
- (0xFFFFFFFF >> (32 - CU8(base + 1)))) << CU8(base +
- 3);
+ ((data >> get_u8(ctx->bios, base + 2)) &
+ (0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1)))) << get_u8(ctx->bios,
+ base + 3);
base += 4;
break;
case ATOM_IIO_MOVE_ATTR:
temp &=
- ~((0xFFFFFFFF >> (32 - CU8(base + 1))) <<
- CU8(base + 3));
+ ~((0xFFFFFFFF >> (32 - get_u8(ctx->bios, base + 1))) <<
+ get_u8(ctx->bios, base + 3));
temp |=
((ctx->
- io_attr >> CU8(base + 2)) & (0xFFFFFFFF >> (32 -
- CU8
- (base
- +
- 1))))
- << CU8(base + 3);
+ io_attr >> get_u8(ctx->bios, base + 2)) & (0xFFFFFFFF >> (32 -
+ get_u8(ctx->bios, base + 1))))
+ << get_u8(ctx->bios, base + 3);
base += 4;
break;
case ATOM_IIO_END:
@@ -189,7 +190,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
align = (attr >> 3) & 7;
switch (arg) {
case ATOM_ARG_REG:
- idx = U16(*ptr);
+ idx = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
if (print)
DEBUG("REG[0x%04X]", idx);
@@ -221,7 +222,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
}
break;
case ATOM_ARG_PS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
/* get_unaligned_le32 avoids unaligned accesses from atombios
* tables, noticed on a DEC Alpha. */
@@ -230,7 +231,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
DEBUG("PS[0x%02X,0x%04X]", idx, val);
break;
case ATOM_ARG_WS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("WS[0x%02X]", idx);
@@ -267,7 +268,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
}
break;
case ATOM_ARG_ID:
- idx = U16(*ptr);
+ idx = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
if (print) {
if (gctx->data_block)
@@ -275,10 +276,10 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
else
DEBUG("ID[0x%04X]", idx);
}
- val = U32(idx + gctx->data_block);
+ val = get_u32(ctx->ctx->bios, idx + gctx->data_block);
break;
case ATOM_ARG_FB:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if ((gctx->fb_base + (idx * 4)) > gctx->scratch_size_bytes) {
DRM_ERROR("ATOM: fb read beyond scratch region: %d vs. %d\n",
@@ -292,7 +293,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
case ATOM_ARG_IMM:
switch (align) {
case ATOM_SRC_DWORD:
- val = U32(*ptr);
+ val = get_u32(ctx->ctx->bios, *ptr);
(*ptr) += 4;
if (print)
DEBUG("IMM 0x%08X\n", val);
@@ -300,7 +301,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
case ATOM_SRC_WORD0:
case ATOM_SRC_WORD8:
case ATOM_SRC_WORD16:
- val = U16(*ptr);
+ val = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
if (print)
DEBUG("IMM 0x%04X\n", val);
@@ -309,7 +310,7 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
case ATOM_SRC_BYTE8:
case ATOM_SRC_BYTE16:
case ATOM_SRC_BYTE24:
- val = U8(*ptr);
+ val = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("IMM 0x%02X\n", val);
@@ -317,14 +318,14 @@ static uint32_t atom_get_src_int(atom_exec_context *ctx, uint8_t attr,
}
return 0;
case ATOM_ARG_PLL:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("PLL[0x%02X]", idx);
val = gctx->card->pll_read(gctx->card, idx);
break;
case ATOM_ARG_MC:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if (print)
DEBUG("MC[0x%02X]", idx);
@@ -412,20 +413,20 @@ static uint32_t atom_get_src_direct(atom_exec_context *ctx, uint8_t align, int *

switch (align) {
case ATOM_SRC_DWORD:
- val = U32(*ptr);
+ val = get_u32(ctx->ctx->bios, *ptr);
(*ptr) += 4;
break;
case ATOM_SRC_WORD0:
case ATOM_SRC_WORD8:
case ATOM_SRC_WORD16:
- val = U16(*ptr);
+ val = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
break;
case ATOM_SRC_BYTE0:
case ATOM_SRC_BYTE8:
case ATOM_SRC_BYTE16:
case ATOM_SRC_BYTE24:
- val = U8(*ptr);
+ val = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
break;
}
@@ -462,7 +463,7 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
val |= saved;
switch (arg) {
case ATOM_ARG_REG:
- idx = U16(*ptr);
+ idx = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
DEBUG("REG[0x%04X]", idx);
idx += gctx->reg_block;
@@ -495,13 +496,13 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
}
break;
case ATOM_ARG_PS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("PS[0x%02X]", idx);
ctx->ps[idx] = cpu_to_le32(val);
break;
case ATOM_ARG_WS:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("WS[0x%02X]", idx);
switch (idx) {
@@ -534,7 +535,7 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
}
break;
case ATOM_ARG_FB:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
if ((gctx->fb_base + (idx * 4)) > gctx->scratch_size_bytes) {
DRM_ERROR("ATOM: fb write beyond scratch region: %d vs. %d\n",
@@ -544,13 +545,13 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,
DEBUG("FB[0x%02X]", idx);
break;
case ATOM_ARG_PLL:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("PLL[0x%02X]", idx);
gctx->card->pll_write(gctx->card, idx, val);
break;
case ATOM_ARG_MC:
- idx = U8(*ptr);
+ idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
DEBUG("MC[0x%02X]", idx);
gctx->card->mc_write(gctx->card, idx, val);
@@ -586,7 +587,7 @@ static void atom_put_dst(atom_exec_context *ctx, int arg, uint8_t attr,

static void atom_op_add(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -600,7 +601,7 @@ static void atom_op_add(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_and(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -619,14 +620,14 @@ static void atom_op_beep(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_calltable(atom_exec_context *ctx, int *ptr, int arg)
{
- int idx = U8((*ptr)++);
+ int idx = get_u8(ctx->ctx->bios, (*ptr)++);
int r = 0;

if (idx < ATOM_TABLE_NAMES_CNT)
SDEBUG(" table: %d (%s)\n", idx, atom_table_names[idx]);
else
SDEBUG(" table: %d\n", idx);
- if (U16(ctx->ctx->cmd_table + 4 + 2 * idx))
+ if (get_u16(ctx->ctx->bios, ctx->ctx->cmd_table + 4 + 2 * idx))
r = atom_execute_table_locked(ctx->ctx, idx, ctx->ps + ctx->ps_shift);
if (r) {
ctx->abort = true;
@@ -635,7 +636,7 @@ static void atom_op_calltable(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_clear(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t saved;
int dptr = *ptr;
attr &= 0x38;
@@ -647,7 +648,7 @@ static void atom_op_clear(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_compare(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -661,7 +662,7 @@ static void atom_op_compare(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_delay(atom_exec_context *ctx, int *ptr, int arg)
{
- unsigned count = U8((*ptr)++);
+ unsigned count = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG(" count: %d\n", count);
if (arg == ATOM_UNIT_MICROSEC)
udelay(count);
@@ -673,7 +674,7 @@ static void atom_op_delay(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_div(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -695,7 +696,7 @@ static void atom_op_eot(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_jump(atom_exec_context *ctx, int *ptr, int arg)
{
- int execute = 0, target = U16(*ptr);
+ int execute = 0, target = get_u16(ctx->ctx->bios, *ptr);
unsigned long cjiffies;

(*ptr) += 2;
@@ -748,7 +749,7 @@ static void atom_op_jump(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_mask(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, mask, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -765,7 +766,7 @@ static void atom_op_mask(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_move(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t src, saved;
int dptr = *ptr;
if (((attr >> 3) & 7) != ATOM_SRC_DWORD)
@@ -782,7 +783,7 @@ static void atom_op_move(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_mul(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -798,7 +799,7 @@ static void atom_op_nop(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_or(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -812,7 +813,7 @@ static void atom_op_or(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_postcard(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t val = U8((*ptr)++);
+ uint8_t val = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG("POST card output: 0x%02X\n", val);
}

@@ -833,7 +834,7 @@ static void atom_op_savereg(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_setdatablock(atom_exec_context *ctx, int *ptr, int arg)
{
- int idx = U8(*ptr);
+ int idx = get_u8(ctx->ctx->bios, *ptr);
(*ptr)++;
SDEBUG(" block: %d\n", idx);
if (!idx)
@@ -841,13 +842,14 @@ static void atom_op_setdatablock(atom_exec_context *ctx, int *ptr, int arg)
else if (idx == 255)
ctx->ctx->data_block = ctx->start;
else
- ctx->ctx->data_block = U16(ctx->ctx->data_table + 4 + 2 * idx);
+ ctx->ctx->data_block = get_u16(ctx->ctx->bios,
+ ctx->ctx->data_table + 4 + 2 * idx);
SDEBUG(" base: 0x%04X\n", ctx->ctx->data_block);
}

static void atom_op_setfbbase(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
SDEBUG(" fb_base: ");
ctx->ctx->fb_base = atom_get_src(ctx, attr, ptr);
}
@@ -857,7 +859,7 @@ static void atom_op_setport(atom_exec_context *ctx, int *ptr, int arg)
int port;
switch (arg) {
case ATOM_PORT_ATI:
- port = U16(*ptr);
+ port = get_u16(ctx->ctx->bios, *ptr);
if (port < ATOM_IO_NAMES_CNT)
SDEBUG(" port: %d (%s)\n", port, atom_io_names[port]);
else
@@ -881,14 +883,14 @@ static void atom_op_setport(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_setregblock(atom_exec_context *ctx, int *ptr, int arg)
{
- ctx->ctx->reg_block = U16(*ptr);
+ ctx->ctx->reg_block = get_u16(ctx->ctx->bios, *ptr);
(*ptr) += 2;
SDEBUG(" base: 0x%04X\n", ctx->ctx->reg_block);
}

static void atom_op_shift_left(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
attr &= 0x38;
@@ -904,7 +906,7 @@ static void atom_op_shift_left(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_shift_right(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
attr &= 0x38;
@@ -920,7 +922,7 @@ static void atom_op_shift_right(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_shl(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
uint32_t dst_align = atom_dst_to_src[(attr >> 3) & 7][(attr >> 6) & 3];
@@ -939,7 +941,7 @@ static void atom_op_shl(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_shr(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++), shift;
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++), shift;
uint32_t saved, dst;
int dptr = *ptr;
uint32_t dst_align = atom_dst_to_src[(attr >> 3) & 7][(attr >> 6) & 3];
@@ -958,7 +960,7 @@ static void atom_op_shr(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_sub(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -972,18 +974,18 @@ static void atom_op_sub(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_switch(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t src, val, target;
SDEBUG(" switch: ");
src = atom_get_src(ctx, attr, ptr);
- while (U16(*ptr) != ATOM_CASE_END)
- if (U8(*ptr) == ATOM_CASE_MAGIC) {
+ while (get_u16(ctx->ctx->bios, *ptr) != ATOM_CASE_END)
+ if (get_u8(ctx->ctx->bios, *ptr) == ATOM_CASE_MAGIC) {
(*ptr)++;
SDEBUG(" case: ");
val =
atom_get_src(ctx, (attr & 0x38) | ATOM_ARG_IMM,
ptr);
- target = U16(*ptr);
+ target = get_u16(ctx->ctx->bios, *ptr);
if (val == src) {
SDEBUG(" target: %04X\n", target);
*ptr = ctx->start + target;
@@ -999,7 +1001,7 @@ static void atom_op_switch(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_test(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src;
SDEBUG(" src1: ");
dst = atom_get_dst(ctx, arg, attr, ptr, NULL, 1);
@@ -1011,7 +1013,7 @@ static void atom_op_test(atom_exec_context *ctx, int *ptr, int arg)

static void atom_op_xor(atom_exec_context *ctx, int *ptr, int arg)
{
- uint8_t attr = U8((*ptr)++);
+ uint8_t attr = get_u8(ctx->ctx->bios, (*ptr)++);
uint32_t dst, src, saved;
int dptr = *ptr;
SDEBUG(" dst: ");
@@ -1158,7 +1160,7 @@ atom_op_debug, 0},};

static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t * params)
{
- int base = CU16(ctx->cmd_table + 4 + 2 * index);
+ int base = get_u16(ctx->bios, ctx->cmd_table + 4 + 2 * index);
int len, ws, ps, ptr;
unsigned char op;
atom_exec_context ectx;
@@ -1167,9 +1169,9 @@ static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32
if (!base)
return -EINVAL;

- len = CU16(base + ATOM_CT_SIZE_PTR);
- ws = CU8(base + ATOM_CT_WS_PTR);
- ps = CU8(base + ATOM_CT_PS_PTR) & ATOM_CT_PS_MASK;
+ len = get_u16(ctx->bios, base + ATOM_CT_SIZE_PTR);
+ ws = get_u8(ctx->bios, base + ATOM_CT_WS_PTR);
+ ps = get_u8(ctx->bios, base + ATOM_CT_PS_PTR) & ATOM_CT_PS_MASK;
ptr = base + ATOM_CT_CODE_PTR;

SDEBUG(">> execute %04X (len %d, WS %d, PS %d)\n", base, len, ws, ps);
@@ -1187,7 +1189,7 @@ static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32

debug_depth++;
while (1) {
- op = CU8(ptr++);
+ op = get_u8(ctx->bios, ptr++);
if (op < ATOM_OP_NAMES_CNT)
SDEBUG("%s @ 0x%04X\n", atom_op_names[op], ptr - 1);
else
@@ -1253,11 +1255,11 @@ static void atom_index_iio(struct atom_context *ctx, int base)
ctx->iio = kzalloc(2 * 256, GFP_KERNEL);
if (!ctx->iio)
return;
- while (CU8(base) == ATOM_IIO_START) {
- ctx->iio[CU8(base + 1)] = base + 2;
+ while (get_u8(ctx->bios, base) == ATOM_IIO_START) {
+ ctx->iio[get_u8(ctx->bios, base + 1)] = base + 2;
base += 2;
- while (CU8(base) != ATOM_IIO_END)
- base += atom_iio_len[CU8(base)];
+ while (get_u8(ctx->bios, base) != ATOM_IIO_END)
+ base += atom_iio_len[get_u8(ctx->bios, base)];
base += 3;
}
}
@@ -1277,7 +1279,7 @@ struct atom_context *atom_parse(struct card_info *card, void *bios)
ctx->card = card;
ctx->bios = bios;

- if (CU16(0) != ATOM_BIOS_MAGIC) {
+ if (get_u16(ctx->bios, 0) != ATOM_BIOS_MAGIC) {
pr_info("Invalid BIOS magic\n");
kfree(ctx);
return NULL;
@@ -1290,7 +1292,7 @@ struct atom_context *atom_parse(struct card_info *card, void *bios)
return NULL;
}

- base = CU16(ATOM_ROM_TABLE_PTR);
+ base = get_u16(ctx->bios, ATOM_ROM_TABLE_PTR);
if (strncmp
(CSTR(base + ATOM_ROM_MAGIC_PTR), ATOM_ROM_MAGIC,
strlen(ATOM_ROM_MAGIC))) {
@@ -1299,15 +1301,16 @@ struct atom_context *atom_parse(struct card_info *card, void *bios)
return NULL;
}

- ctx->cmd_table = CU16(base + ATOM_ROM_CMD_PTR);
- ctx->data_table = CU16(base + ATOM_ROM_DATA_PTR);
- atom_index_iio(ctx, CU16(ctx->data_table + ATOM_DATA_IIO_PTR) + 4);
+ ctx->cmd_table = get_u16(ctx->bios, base + ATOM_ROM_CMD_PTR);
+ ctx->data_table = get_u16(ctx->bios, base + ATOM_ROM_DATA_PTR);
+ atom_index_iio(ctx,
+ get_u16(ctx->bios, ctx->data_table + ATOM_DATA_IIO_PTR) + 4);
if (!ctx->iio) {
atom_destroy(ctx);
return NULL;
}

- str = CSTR(CU16(base + ATOM_ROM_MSG_PTR));
+ str = CSTR(get_u16(ctx->bios, base + ATOM_ROM_MSG_PTR));
while (*str && ((*str == '\n') || (*str == '\r')))
str++;
/* name string isn't always 0 terminated */
@@ -1326,18 +1329,18 @@ struct atom_context *atom_parse(struct card_info *card, void *bios)
int atom_asic_init(struct atom_context *ctx)
{
struct radeon_device *rdev = ctx->card->dev->dev_private;
- int hwi = CU16(ctx->data_table + ATOM_DATA_FWI_PTR);
+ int hwi = get_u16(ctx->bios, ctx->data_table + ATOM_DATA_FWI_PTR);
uint32_t ps[16];
int ret;

memset(ps, 0, 64);

- ps[0] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFSCLK_PTR));
- ps[1] = cpu_to_le32(CU32(hwi + ATOM_FWI_DEFMCLK_PTR));
+ ps[0] = cpu_to_le32(get_u32(ctx->bios, hwi + ATOM_FWI_DEFSCLK_PTR));
+ ps[1] = cpu_to_le32(get_u32(ctx->bios, hwi + ATOM_FWI_DEFMCLK_PTR));
if (!ps[0] || !ps[1])
return 1;

- if (!CU16(ctx->cmd_table + 4 + 2 * ATOM_CMD_INIT))
+ if (!get_u16(ctx->bios, ctx->cmd_table + 4 + 2 * ATOM_CMD_INIT))
return 1;
ret = atom_execute_table(ctx, ATOM_CMD_INIT, ps);
if (ret)
@@ -1346,7 +1349,7 @@ int atom_asic_init(struct atom_context *ctx)
memset(ps, 0, 64);

if (rdev->family < CHIP_R600) {
- if (CU16(ctx->cmd_table + 4 + 2 * ATOM_CMD_SPDFANCNTL))
+ if (get_u16(ctx->bios, ctx->cmd_table + 4 + 2 * ATOM_CMD_SPDFANCNTL))
atom_execute_table(ctx, ATOM_CMD_SPDFANCNTL, ps);
}
return ret;
@@ -1363,18 +1366,18 @@ bool atom_parse_data_header(struct atom_context *ctx, int index,
uint16_t * data_start)
{
int offset = index * 2 + 4;
- int idx = CU16(ctx->data_table + offset);
+ int idx = get_u16(ctx->bios, ctx->data_table + offset);
u16 *mdt = (u16 *)(ctx->bios + ctx->data_table + 4);

if (!mdt[index])
return false;

if (size)
- *size = CU16(idx);
+ *size = get_u16(ctx->bios, idx);
if (frev)
- *frev = CU8(idx + 2);
+ *frev = get_u8(ctx->bios, idx + 2);
if (crev)
- *crev = CU8(idx + 3);
+ *crev = get_u8(ctx->bios, idx + 3);
*data_start = idx;
return true;
}
@@ -1383,16 +1386,16 @@ bool atom_parse_cmd_header(struct atom_context *ctx, int index, uint8_t * frev,
uint8_t * crev)
{
int offset = index * 2 + 4;
- int idx = CU16(ctx->cmd_table + offset);
+ int idx = get_u16(ctx->bios, ctx->cmd_table + offset);
u16 *mct = (u16 *)(ctx->bios + ctx->cmd_table + 4);

if (!mct[index])
return false;

if (frev)
- *frev = CU8(idx + 2);
+ *frev = get_u8(ctx->bios, idx + 2);
if (crev)
- *crev = CU8(idx + 3);
+ *crev = get_u8(ctx->bios, idx + 3);
return true;
}

--
2.40.1

2023-05-09 05:30:23

by Lucas De Marchi

Subject: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
masks for fixed-width types and also the corresponding BIT_U32(),
BIT_U16() and BIT_U8().

All of those depend on a new "U" suffix added to the integer constant.
Due to naming clashes it's better to call the macro U32. Since C doesn't
have a proper suffix for short and char types, the U16 and U8 variants
just use U32 with one additional check in the BIT_* macros to make
sure the compiler gives an error when those types overflow.
The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
as otherwise they would allow an invalid bit to be passed. Hence
implement them in include/linux/bits.h rather than together with
the other BIT* variants.
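
As a rough illustration of the arithmetic only, the sketch below re-creates
the three mask formulas in plain userspace C. It assumes a uint32_t cast as
a stand-in for the U32() constant helper (the patch uses a "U" suffix
instead), and all SK_-prefixed names are hypothetical. The
GENMASK_INPUT_CHECK() part is left out, so this shows how the masks are
computed, not how invalid input is rejected.

```c
#include <stdint.h>

/* Userspace stand-in (assumption) for the proposed U32() helper. */
#define SK_U32(x) ((uint32_t)(x))

/* Same arithmetic as __GENMASK_U32/U16/U8 in this patch, minus the check. */
#define SK_GENMASK_U32(h, l) \
	(((~SK_U32(0)) - (SK_U32(1) << (l)) + 1) & \
	 (~SK_U32(0) >> (32 - 1 - (h))))
#define SK_GENMASK_U16(h, l) \
	((SK_U32(0xffff) - (SK_U32(1) << (l)) + 1) & \
	 (SK_U32(0xffff) >> (16 - 1 - (h))))
#define SK_GENMASK_U8(h, l) \
	((SK_U32(0xff) - (SK_U32(1) << (l)) + 1) & \
	 (SK_U32(0xff) >> (8 - 1 - (h))))

/* The first operand clears bits below l, the second clears bits above h. */
_Static_assert(SK_GENMASK_U32(31, 0) == 0xffffffffu, "full u32 mask");
_Static_assert(SK_GENMASK_U32(7, 4) == 0xf0u, "bits 7..4");
_Static_assert(SK_GENMASK_U16(15, 8) == 0xff00u, "high byte of a u16");
_Static_assert(SK_GENMASK_U8(3, 1) == 0x0eu, "bits 3..1");
```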

The following test file is used to test this:

$ cat mask.c
#include <linux/types.h>
#include <linux/bits.h>

static const u32 a = GENMASK_U32(31, 0);
static const u16 b = GENMASK_U16(15, 0);
static const u8 c = GENMASK_U8(7, 0);
static const u32 x = BIT_U32(31);
static const u16 y = BIT_U16(15);
static const u8 z = BIT_U8(7);

#if FAIL
static const u32 a2 = GENMASK_U32(32, 0);
static const u16 b2 = GENMASK_U16(16, 0);
static const u8 c2 = GENMASK_U8(8, 0);
static const u32 x2 = BIT_U32(32);
static const u16 y2 = BIT_U16(16);
static const u8 z2 = BIT_U8(8);
#endif

Signed-off-by: Lucas De Marchi <[email protected]>
---
include/linux/bits.h | 22 ++++++++++++++++++++++
include/uapi/linux/const.h | 2 ++
include/vdso/const.h | 1 +
3 files changed, 25 insertions(+)

diff --git a/include/linux/bits.h b/include/linux/bits.h
index 7c0cf5031abe..ff4786c99b8c 100644
--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -42,4 +42,26 @@
#define GENMASK_ULL(h, l) \
(GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))

+#define __GENMASK_U32(h, l) \
+ (((~U32(0)) - (U32(1) << (l)) + 1) & \
+ (~U32(0) >> (32 - 1 - (h))))
+#define GENMASK_U32(h, l) \
+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(h, l))
+
+#define __GENMASK_U16(h, l) \
+ ((U32(0xffff) - (U32(1) << (l)) + 1) & \
+ (U32(0xffff) >> (16 - 1 - (h))))
+#define GENMASK_U16(h, l) \
+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U16(h, l))
+
+#define __GENMASK_U8(h, l) \
+ (((U32(0xff)) - (U32(1) << (l)) + 1) & \
+ (U32(0xff) >> (8 - 1 - (h))))
+#define GENMASK_U8(h, l) \
+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U8(h, l))
+
+#define BIT_U32(nr) _BITU32(nr)
+#define BIT_U16(nr) (GENMASK_INPUT_CHECK(16 - 1, nr) + (U32(1) << (nr)))
+#define BIT_U8(nr) (GENMASK_INPUT_CHECK(32 - 1, nr) + (U32(1) << (nr)))
+
#endif /* __LINUX_BITS_H */
diff --git a/include/uapi/linux/const.h b/include/uapi/linux/const.h
index a429381e7ca5..3a4e152520f4 100644
--- a/include/uapi/linux/const.h
+++ b/include/uapi/linux/const.h
@@ -22,9 +22,11 @@
#define _AT(T,X) ((T)(X))
#endif

+#define _U32(x) (_AC(x, U))
#define _UL(x) (_AC(x, UL))
#define _ULL(x) (_AC(x, ULL))

+#define _BITU32(x) (_U32(1) << (x))
#define _BITUL(x) (_UL(1) << (x))
#define _BITULL(x) (_ULL(1) << (x))

diff --git a/include/vdso/const.h b/include/vdso/const.h
index 94b385ad438d..417384a9795b 100644
--- a/include/vdso/const.h
+++ b/include/vdso/const.h
@@ -4,6 +4,7 @@

#include <uapi/linux/const.h>

+#define U32(x) (_U32(x))
#define UL(x) (_UL(x))
#define ULL(x) (_ULL(x))

--
2.40.1

2023-05-09 08:11:27

by Jani Nikula

Subject: Re: [PATCH 3/3] drm/i915: Temporary conversion to new GENMASK/BIT macros

On Mon, 08 May 2023, Lucas De Marchi <[email protected]> wrote:
> Convert the REG_* macros from i915_reg_defs.h to use the new macros
> defined in linux/bits.h. This is just to help on the implementation
> of the new macros and not intended to be applied.

This drops a number of build time input checks as well as casts to the
specified types.

BR,
Jani.

>
> Signed-off-by: Lucas De Marchi <[email protected]>
> ---
> drivers/gpu/drm/i915/i915_reg_defs.h | 28 +++++-----------------------
> 1 file changed, 5 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
> index 622d603080f9..61fbb8d62b25 100644
> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
> @@ -17,10 +17,7 @@
> *
> * @return: Value with bit @__n set.
> */
> -#define REG_BIT(__n) \
> - ((u32)(BIT(__n) + \
> - BUILD_BUG_ON_ZERO(__is_constexpr(__n) && \
> - ((__n) < 0 || (__n) > 31))))
> +#define REG_BIT(__n) BIT_U32(__n)
>
> /**
> * REG_BIT8() - Prepare a u8 bit value
> @@ -30,10 +27,7 @@
> *
> * @return: Value with bit @__n set.
> */
> -#define REG_BIT8(__n) \
> - ((u8)(BIT(__n) + \
> - BUILD_BUG_ON_ZERO(__is_constexpr(__n) && \
> - ((__n) < 0 || (__n) > 7))))
> +#define REG_BIT8(__n) BIT_U8(__n)
>
> /**
> * REG_GENMASK() - Prepare a continuous u32 bitmask
> @@ -44,11 +38,7 @@
> *
> * @return: Continuous bitmask from @__high to @__low, inclusive.
> */
> -#define REG_GENMASK(__high, __low) \
> - ((u32)(GENMASK(__high, __low) + \
> - BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
> - __is_constexpr(__low) && \
> - ((__low) < 0 || (__high) > 31 || (__low) > (__high)))))
> +#define REG_GENMASK(__high, __low) GENMASK_U32(__high, __low)
>
> /**
> * REG_GENMASK64() - Prepare a continuous u64 bitmask
> @@ -59,11 +49,7 @@
> *
> * @return: Continuous bitmask from @__high to @__low, inclusive.
> */
> -#define REG_GENMASK64(__high, __low) \
> - ((u64)(GENMASK_ULL(__high, __low) + \
> - BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
> - __is_constexpr(__low) && \
> - ((__low) < 0 || (__high) > 63 || (__low) > (__high)))))
> +#define REG_GENMASK64(__high, __low) GENMASK_ULL(__high, __low)
>
> /**
> * REG_GENMASK8() - Prepare a continuous u8 bitmask
> @@ -74,11 +60,7 @@
> *
> * @return: Continuous bitmask from @__high to @__low, inclusive.
> */
> -#define REG_GENMASK8(__high, __low) \
> - ((u8)(GENMASK(__high, __low) + \
> - BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
> - __is_constexpr(__low) && \
> - ((__low) < 0 || (__high) > 7 || (__low) > (__high)))))
> +#define REG_GENMASK8(__high, __low) GENMASK_U8(__high, __low)
>
> /*
> * Local integer constant expression version of is_power_of_2().

--
Jani Nikula, Intel Open Source Graphics Center

2023-05-09 08:24:30

by Lucas De Marchi

Subject: Re: [PATCH 3/3] drm/i915: Temporary conversion to new GENMASK/BIT macros

On Tue, May 09, 2023 at 10:57:19AM +0300, Jani Nikula wrote:
>On Mon, 08 May 2023, Lucas De Marchi <[email protected]> wrote:
>> Convert the REG_* macros from i915_reg_defs.h to use the new macros
>> defined in linux/bits.h. This is just to help on the implementation
>> of the new macros and not intended to be applied.
>
>This drops a number of build time input checks as well as casts to the
>specified types.

It drops the explicit checks, yes, but the checks are still there and the
compiler still gives me a warning or error for invalid values. See the test
program in the second patch. Example:

static const u32 a2 = GENMASK_U32(32, 0);

In file included from /tmp/genmask.c:2:
include/linux/bits.h:47:19: warning: right shift count is negative [-Wshift-count-negative]
47 | (~U32(0) >> (32 - 1 - (h))))
| ^~

It's a warning, not an error though.

Same warning for the other fixed-widths with numbers above the supported width.
For negative values:

In file included from include/linux/bits.h:21,
from /tmp/genmask.c:2:
include/linux/build_bug.h:16:51: error: negative width in bit-field ‘<anonymous>’
16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
| ^


The cast to the specified type we lose indeed. Could you give an example where
those are useful in the context they are used? I debated adding them, but couldn't find
a justified use of them.
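
For reference, a minimal userspace sketch of where the cast is observable at
all: it changes the type of the expression, which sizeof, _Generic and
typeof-style macros can see. The SK_-prefixed names are hypothetical
stand-ins, not the i915 or bits.h macros, and whether this matters for
register definitions is a separate question.

```c
#include <stdint.h>

/* Hypothetical stand-ins: the same bit with and without the narrowing cast. */
#define SK_BIT8_NOCAST(n) (1u << (n))
#define SK_BIT8_CAST(n)   ((uint8_t)(1u << (n)))

/* 1 if the expression has type uint8_t, 0 otherwise. */
#define SK_IS_U8(expr) _Generic((expr), uint8_t: 1, default: 0)

/* Without the cast the expression is a plain unsigned int. */
_Static_assert(sizeof(SK_BIT8_NOCAST(7)) == sizeof(unsigned int),
	       "uncast bit has the size of unsigned int");
_Static_assert(SK_IS_U8(SK_BIT8_NOCAST(7)) == 0, "uncast bit is not a u8");

/* With the cast the expression itself is a u8. */
_Static_assert(sizeof(SK_BIT8_CAST(7)) == 1, "cast bit has the size of a u8");
_Static_assert(SK_IS_U8(SK_BIT8_CAST(7)) == 1, "cast bit is a u8");
```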

Lucas De Marchi

>
>BR,
>Jani.
>
>>
>> Signed-off-by: Lucas De Marchi <[email protected]>
>> ---
>> drivers/gpu/drm/i915/i915_reg_defs.h | 28 +++++-----------------------
>> 1 file changed, 5 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_reg_defs.h b/drivers/gpu/drm/i915/i915_reg_defs.h
>> index 622d603080f9..61fbb8d62b25 100644
>> --- a/drivers/gpu/drm/i915/i915_reg_defs.h
>> +++ b/drivers/gpu/drm/i915/i915_reg_defs.h
>> @@ -17,10 +17,7 @@
>> *
>> * @return: Value with bit @__n set.
>> */
>> -#define REG_BIT(__n) \
>> - ((u32)(BIT(__n) + \
>> - BUILD_BUG_ON_ZERO(__is_constexpr(__n) && \
>> - ((__n) < 0 || (__n) > 31))))
>> +#define REG_BIT(__n) BIT_U32(__n)
>>
>> /**
>> * REG_BIT8() - Prepare a u8 bit value
>> @@ -30,10 +27,7 @@
>> *
>> * @return: Value with bit @__n set.
>> */
>> -#define REG_BIT8(__n) \
>> - ((u8)(BIT(__n) + \
>> - BUILD_BUG_ON_ZERO(__is_constexpr(__n) && \
>> - ((__n) < 0 || (__n) > 7))))
>> +#define REG_BIT8(__n) BIT_U8(__n)
>>
>> /**
>> * REG_GENMASK() - Prepare a continuous u32 bitmask
>> @@ -44,11 +38,7 @@
>> *
>> * @return: Continuous bitmask from @__high to @__low, inclusive.
>> */
>> -#define REG_GENMASK(__high, __low) \
>> - ((u32)(GENMASK(__high, __low) + \
>> - BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
>> - __is_constexpr(__low) && \
>> - ((__low) < 0 || (__high) > 31 || (__low) > (__high)))))
>> +#define REG_GENMASK(__high, __low) GENMASK_U32(__high, __low)
>>
>> /**
>> * REG_GENMASK64() - Prepare a continuous u64 bitmask
>> @@ -59,11 +49,7 @@
>> *
>> * @return: Continuous bitmask from @__high to @__low, inclusive.
>> */
>> -#define REG_GENMASK64(__high, __low) \
>> - ((u64)(GENMASK_ULL(__high, __low) + \
>> - BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
>> - __is_constexpr(__low) && \
>> - ((__low) < 0 || (__high) > 63 || (__low) > (__high)))))
>> +#define REG_GENMASK64(__high, __low) GENMASK_ULL(__high, __low)
>>
>> /**
>> * REG_GENMASK8() - Prepare a continuous u8 bitmask
>> @@ -74,11 +60,7 @@
>> *
>> * @return: Continuous bitmask from @__high to @__low, inclusive.
>> */
>> -#define REG_GENMASK8(__high, __low) \
>> - ((u8)(GENMASK(__high, __low) + \
>> - BUILD_BUG_ON_ZERO(__is_constexpr(__high) && \
>> - __is_constexpr(__low) && \
>> - ((__low) < 0 || (__high) > 7 || (__low) > (__high)))))
>> +#define REG_GENMASK8(__high, __low) GENMASK_U8(__high, __low)
>>
>> /*
>> * Local integer constant expression version of is_power_of_2().
>
>--
>Jani Nikula, Intel Open Source Graphics Center

2023-05-09 14:20:28

by Gustavo Sousa

Subject: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

Quoting Lucas De Marchi (2023-05-09 02:14:02)
>Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>masks for fixed-width types and also the corresponding BIT_U32(),
>BIT_U16() and BIT_U8().
>
>All of those depend on a new "U" suffix added to the integer constant.
>Due to naming clashes it's better to call the macro U32. Since C doesn't
>have a proper suffix for short and char types, the U16 and U18 variants
>just use U32 with one additional check in the BIT_* macros to make
>sure the compiler gives an error when the those types overflow.
>The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>as otherwise they would allow an invalid bit to be passed. Hence
>implement them in include/linux/bits.h rather than together with
>the other BIT* variants.
>
>The following test file is is used to test this:
>
> $ cat mask.c
> #include <linux/types.h>
> #include <linux/bits.h>
>
> static const u32 a = GENMASK_U32(31, 0);
> static const u16 b = GENMASK_U16(15, 0);
> static const u8 c = GENMASK_U8(7, 0);
> static const u32 x = BIT_U32(31);
> static const u16 y = BIT_U16(15);
> static const u8 z = BIT_U8(7);
>
> #if FAIL
> static const u32 a2 = GENMASK_U32(32, 0);
> static const u16 b2 = GENMASK_U16(16, 0);
> static const u8 c2 = GENMASK_U8(8, 0);
> static const u32 x2 = BIT_U32(32);
> static const u16 y2 = BIT_U16(16);
> static const u8 z2 = BIT_U8(8);
> #endif
>
>Signed-off-by: Lucas De Marchi <[email protected]>
>---
> include/linux/bits.h | 22 ++++++++++++++++++++++
> include/uapi/linux/const.h | 2 ++
> include/vdso/const.h | 1 +
> 3 files changed, 25 insertions(+)
>
>diff --git a/include/linux/bits.h b/include/linux/bits.h
>index 7c0cf5031abe..ff4786c99b8c 100644
>--- a/include/linux/bits.h
>+++ b/include/linux/bits.h
>@@ -42,4 +42,26 @@
> #define GENMASK_ULL(h, l) \
> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>
>+#define __GENMASK_U32(h, l) \
>+ (((~U32(0)) - (U32(1) << (l)) + 1) & \
>+ (~U32(0) >> (32 - 1 - (h))))
>+#define GENMASK_U32(h, l) \
>+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(h, l))
>+
>+#define __GENMASK_U16(h, l) \
>+ ((U32(0xffff) - (U32(1) << (l)) + 1) & \
>+ (U32(0xffff) >> (16 - 1 - (h))))
>+#define GENMASK_U16(h, l) \
>+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U16(h, l))
>+
>+#define __GENMASK_U8(h, l) \
>+ (((U32(0xff)) - (U32(1) << (l)) + 1) & \
>+ (U32(0xff) >> (8 - 1 - (h))))
>+#define GENMASK_U8(h, l) \
>+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U8(h, l))

I wonder if we should use BIT_U* variants in the above to ensure the values are
valid. If so, we get a nice boundary check and we also can use a single
definition for the mask generation:

#define __GENMASK_U32(h, l) \
(((~U32(0)) - (U32(1) << (l)) + 1) & \
(~U32(0) >> (32 - 1 - (h))))
#define GENMASK_U32(h, l) \
(GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(BIT_U32(h), BIT_U32(l)))
#define GENMASK_U16(h, l) \
(GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(BIT_U16(h), BIT_U16(l)))
#define GENMASK_U8(h, l) \
(GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(BIT_U8(h), BIT_U8(l)))
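
One possible reading of the single-definition idea, as a hypothetical
userspace sketch: keep one 32-bit mask formula and let each width contribute
only its own bounds test. Here the bit positions are passed straight to the
mask formula, the bounds test is a plain predicate rather than a build-time
failure, and all SK_-prefixed names are made up for illustration.

```c
#include <stdint.h>

/* Single mask formula (assumption: uint32_t stands in for U32()). */
#define SK_GENMASK_RAW(h, l) \
	(((~(uint32_t)0) - ((uint32_t)1 << (l)) + 1) & \
	 (~(uint32_t)0 >> (32 - 1 - (h))))

/* Per-width bounds test: 0 <= l <= h < width. */
#define SK_FITS(width, h, l) ((l) >= 0 && (l) <= (h) && (h) < (width))

/* The same formula serves every width once the bounds are known to hold. */
_Static_assert(SK_FITS(8, 7, 0) && SK_GENMASK_RAW(7, 0) == 0xffu, "u8 mask");
_Static_assert(SK_FITS(16, 15, 8) && SK_GENMASK_RAW(15, 8) == 0xff00u,
	       "u16 mask");
_Static_assert(SK_FITS(32, 31, 0) && SK_GENMASK_RAW(31, 0) == 0xffffffffu,
	       "u32 mask");

/* Out-of-range positions are caught by the width-specific test. */
_Static_assert(!SK_FITS(8, 8, 0), "bit 8 does not fit a u8");
_Static_assert(!SK_FITS(16, 16, 0), "bit 16 does not fit a u16");
```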

>+
>+#define BIT_U32(nr) _BITU32(nr)
>+#define BIT_U16(nr) (GENMASK_INPUT_CHECK(16 - 1, nr) + (U32(1) << (nr)))
>+#define BIT_U8(nr) (GENMASK_INPUT_CHECK(32 - 1, nr) + (U32(1) << (nr)))

Shouldn't this be GENMASK_INPUT_CHECK(8 - 1, nr)?
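
To make the difference concrete, here is a small userspace sketch. It
assumes the check rejects input when the second argument is greater than the
first, and models that as a plain predicate instead of a build failure; the
SK_-prefixed names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical predicate mirroring the ordering test: accept when l <= h. */
#define SK_CHECK_PASSES(h, l) ((l) <= (h))

/* Bound as posted for BIT_U8() versus the bound suggested above. */
#define SK_BIT_U8_POSTED_OK(nr)    SK_CHECK_PASSES(32 - 1, nr)
#define SK_BIT_U8_SUGGESTED_OK(nr) SK_CHECK_PASSES(8 - 1, nr)

_Static_assert(SK_BIT_U8_POSTED_OK(8), "bit 8 passes the 32 - 1 bound");
_Static_assert(!SK_BIT_U8_SUGGESTED_OK(8), "bit 8 fails the 8 - 1 bound");
_Static_assert(SK_BIT_U8_SUGGESTED_OK(7), "bit 7 is still accepted");

/* Bit 8 does not fit in a u8: storing it truncates to zero. */
_Static_assert((uint8_t)(1u << 8) == 0, "1 << 8 truncated to u8 is 0");
```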

--
Gustavo Sousa

>+
> #endif /* __LINUX_BITS_H */
>diff --git a/include/uapi/linux/const.h b/include/uapi/linux/const.h
>index a429381e7ca5..3a4e152520f4 100644
>--- a/include/uapi/linux/const.h
>+++ b/include/uapi/linux/const.h
>@@ -22,9 +22,11 @@
> #define _AT(T,X) ((T)(X))
> #endif
>
>+#define _U32(x) (_AC(x, U))
> #define _UL(x) (_AC(x, UL))
> #define _ULL(x) (_AC(x, ULL))
>
>+#define _BITU32(x) (_U32(1) << (x))
> #define _BITUL(x) (_UL(1) << (x))
> #define _BITULL(x) (_ULL(1) << (x))
>
>diff --git a/include/vdso/const.h b/include/vdso/const.h
>index 94b385ad438d..417384a9795b 100644
>--- a/include/vdso/const.h
>+++ b/include/vdso/const.h
>@@ -4,6 +4,7 @@
>
> #include <uapi/linux/const.h>
>
>+#define U32(x) (_U32(x))
> #define UL(x) (_UL(x))
> #define ULL(x) (_ULL(x))
>
>--
>2.40.1
>

2023-05-09 22:34:22

by Lucas De Marchi

Subject: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Tue, May 09, 2023 at 11:00:36AM -0300, Gustavo Sousa wrote:
>Quoting Lucas De Marchi (2023-05-09 02:14:02)
>>Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>>masks for fixed-width types and also the corresponding BIT_U32(),
>>BIT_U16() and BIT_U8().
>>
>>All of those depend on a new "U" suffix added to the integer constant.
>>Due to naming clashes it's better to call the macro U32. Since C doesn't
>>have a proper suffix for short and char types, the U16 and U8 variants
>>just use U32 with one additional check in the BIT_* macros to make
>>sure the compiler gives an error when those types overflow.
>>The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>>as otherwise they would allow an invalid bit to be passed. Hence
>>implement them in include/linux/bits.h rather than together with
>>the other BIT* variants.
>>
>>The following test file is used to test this:
>>
>> $ cat mask.c
>> #include <linux/types.h>
>> #include <linux/bits.h>
>>
>> static const u32 a = GENMASK_U32(31, 0);
>> static const u16 b = GENMASK_U16(15, 0);
>> static const u8 c = GENMASK_U8(7, 0);
>> static const u32 x = BIT_U32(31);
>> static const u16 y = BIT_U16(15);
>> static const u8 z = BIT_U8(7);
>>
>> #if FAIL
>> static const u32 a2 = GENMASK_U32(32, 0);
>> static const u16 b2 = GENMASK_U16(16, 0);
>> static const u8 c2 = GENMASK_U8(8, 0);
>> static const u32 x2 = BIT_U32(32);
>> static const u16 y2 = BIT_U16(16);
>> static const u8 z2 = BIT_U8(8);
>> #endif
>>
>>Signed-off-by: Lucas De Marchi <[email protected]>
>>---
>> include/linux/bits.h | 22 ++++++++++++++++++++++
>> include/uapi/linux/const.h | 2 ++
>> include/vdso/const.h | 1 +
>> 3 files changed, 25 insertions(+)
>>
>>diff --git a/include/linux/bits.h b/include/linux/bits.h
>>index 7c0cf5031abe..ff4786c99b8c 100644
>>--- a/include/linux/bits.h
>>+++ b/include/linux/bits.h
>>@@ -42,4 +42,26 @@
>> #define GENMASK_ULL(h, l) \
>> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>>
>>+#define __GENMASK_U32(h, l) \
>>+ (((~U32(0)) - (U32(1) << (l)) + 1) & \
>>+ (~U32(0) >> (32 - 1 - (h))))
>>+#define GENMASK_U32(h, l) \
>>+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(h, l))
>>+
>>+#define __GENMASK_U16(h, l) \
>>+ ((U32(0xffff) - (U32(1) << (l)) + 1) & \
>>+ (U32(0xffff) >> (16 - 1 - (h))))
>>+#define GENMASK_U16(h, l) \
>>+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U16(h, l))
>>+
>>+#define __GENMASK_U8(h, l) \
>>+ (((U32(0xff)) - (U32(1) << (l)) + 1) & \
>>+ (U32(0xff) >> (8 - 1 - (h))))
>>+#define GENMASK_U8(h, l) \
>>+ (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U8(h, l))
>
>I wonder if we should use BIT_U* variants in the above to ensure the values are
>valid. If so, we get a nice boundary check and we also can use a single
>definition for the mask generation:
>
> #define __GENMASK_U32(h, l) \
> (((~U32(0)) - (U32(1) << (l)) + 1) & \
> (~U32(0) >> (32 - 1 - (h))))

The boundaries for h and l are already covered here, because
(32 - 1 - (h)) would lead to a negative value if h >= 32. Similar
reasoning applies to l.

Doing ~U32(0) didn't work for me, as it wouldn't catch the invalid values
due to expanding to U32_MAX.
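
As a worked example of the arithmetic discussed here, the sketch below spells the same expression out in userspace. MASK_U32() is a made-up name, U32(x) is written as a plain unsigned int constant, and a 32-bit unsigned int is assumed.

```c
#include <assert.h>

/*
 * Same arithmetic as __GENMASK_U32() in the patch, with U32(x) spelled
 * out as an unsigned int constant for this userspace sketch.
 */
#define MASK_U32(h, l) \
	(((~0U) - (1U << (l)) + 1) & (~0U >> (32 - 1 - (h))))

static_assert(MASK_U32(31, 0) == 0xffffffffU, "full width");
static_assert(MASK_U32(11, 8) == 0x00000f00U, "bits 11..8");
static_assert(MASK_U32(0, 0) == 0x00000001U, "single bit");
/*
 * For h = 32 the shift count (32 - 1 - 32) is -1; that negative count is
 * what lets the compiler flag the out-of-range input.
 */
```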


> #define GENMASK_U32(h, l) \
> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(BIT_U32(h), BIT_U32(l)))

^^^^
that doesn't really work as BIT_U32(h) would expand here,
creating the equivalent of

~U32(0) >> (32 - 1 - (BIT_U32(h))),

which is not what we want
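
A small userspace sketch of that expansion, with made-up stand-ins (BIT_U32() and MASK_U32() are redefined locally without the input checks, and a 32-bit unsigned int is assumed): passing bit values where bit positions are expected produces a different mask.

```c
#include <assert.h>

#define BIT_U32(nr) (1U << (nr))
#define MASK_U32(h, l) \
	(((~0U) - (1U << (l)) + 1) & (~0U >> (32 - 1 - (h))))

/* Intended use: bit positions in, mask out. */
static_assert(MASK_U32(3, 0) == 0xfU, "bits 3..0");
/*
 * Bit values in: the shift counts become BIT_U32(0) == 1 and
 * 32 - 1 - BIT_U32(3) == 23, so the result is bits 8..1 instead.
 */
static_assert(MASK_U32(BIT_U32(3), BIT_U32(0)) == 0x1feU, "not the 3..0 mask");
```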

> #define GENMASK_U16(h, l) \
> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(BIT_U16(h), BIT_U16(l)))
> #define GENMASK_U8(h, l) \
> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(BIT_U8(h), BIT_U8(l)))
>
>>+
>>+#define BIT_U32(nr) _BITU32(nr)
>>+#define BIT_U16(nr) (GENMASK_INPUT_CHECK(16 - 1, nr) + (U32(1) << (nr)))
>>+#define BIT_U8(nr) (GENMASK_INPUT_CHECK(32 - 1, nr) + (U32(1) << (nr)))
>
>Shouldn't this be GENMASK_INPUT_CHECK(8 - 1, nr)?

ugh, good catch. Thanks

I will see if I can come up with something that reuses a single
__GENMASK_U(). Meanwhile I improved my negative tests to cover more
cases.

Lucas De Marchi



>
>--
>Gustavo Sousa
>
>>+
>> #endif /* __LINUX_BITS_H */
>>diff --git a/include/uapi/linux/const.h b/include/uapi/linux/const.h
>>index a429381e7ca5..3a4e152520f4 100644
>>--- a/include/uapi/linux/const.h
>>+++ b/include/uapi/linux/const.h
>>@@ -22,9 +22,11 @@
>> #define _AT(T,X) ((T)(X))
>> #endif
>>
>>+#define _U32(x) (_AC(x, U))
>> #define _UL(x) (_AC(x, UL))
>> #define _ULL(x) (_AC(x, ULL))
>>
>>+#define _BITU32(x) (_U32(1) << (x))
>> #define _BITUL(x) (_UL(1) << (x))
>> #define _BITULL(x) (_ULL(1) << (x))
>>
>>diff --git a/include/vdso/const.h b/include/vdso/const.h
>>index 94b385ad438d..417384a9795b 100644
>>--- a/include/vdso/const.h
>>+++ b/include/vdso/const.h
>>@@ -4,6 +4,7 @@
>>
>> #include <uapi/linux/const.h>
>>
>>+#define U32(x) (_U32(x))
>> #define UL(x) (_UL(x))
>> #define ULL(x) (_ULL(x))
>>
>>--
>>2.40.1
>>

2023-05-10 12:31:31

by kernel test robot

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

Hi Lucas,

kernel test robot noticed the following build errors:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-intel/for-linux-next-fixes drm-tip/drm-tip linus/master v6.4-rc1 next-20230510]
[cannot apply to drm-misc/drm-misc-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Lucas-De-Marchi/drm-amd-Remove-wrapper-macros-over-get_u-32-16-8/20230509-131544
base: git://anongit.freedesktop.org/drm-intel for-linux-next
patch link: https://lore.kernel.org/r/20230509051403.2748545-3-lucas.demarchi%40intel.com
patch subject: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros
config: arm64-randconfig-r021-20230509 (https://download.01.org/0day-ci/archive/20230510/[email protected]/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project b0fb98227c90adf2536c9ad644a74d5e92961111)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/dc308f14f76fa2d6c1698a701dfbe0f1b247e6bd
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Lucas-De-Marchi/drm-amd-Remove-wrapper-macros-over-get_u-32-16-8/20230509-131544
git checkout dc308f14f76fa2d6c1698a701dfbe0f1b247e6bd
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash lib/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

>> lib/zstd/compress/zstd_opt.c:785:9: error: type specifier missing, defaults to 'int'; ISO C99 and later do not support implicit int [-Wimplicit-int]
typedef U32 (*ZSTD_getAllMatchesFn)(
~~~~~~~ ^
int
include/vdso/const.h:7:18: note: expanded from macro 'U32'
#define U32(x) (_U32(x))
^
include/uapi/linux/const.h:25:19: note: expanded from macro '_U32'
#define _U32(x) (_AC(x, U))
^
include/uapi/linux/const.h:21:18: note: expanded from macro '_AC'
#define _AC(X,Y) __AC(X,Y)
^
include/uapi/linux/const.h:20:20: note: expanded from macro '__AC'
#define __AC(X,Y) (X##Y)
^
<scratch space>:178:1: note: expanded from here
ZSTD_getAllMatchesFnU
^
>> lib/zstd/compress/zstd_opt.c:851:8: error: unknown type name 'ZSTD_getAllMatchesFn'; did you mean 'ZSTD_getAllMatchesFnU'?
static ZSTD_getAllMatchesFn
^~~~~~~~~~~~~~~~~~~~
ZSTD_getAllMatchesFnU
lib/zstd/compress/zstd_opt.c:785:9: note: 'ZSTD_getAllMatchesFnU' declared here
typedef U32 (*ZSTD_getAllMatchesFn)(
^
include/vdso/const.h:7:18: note: expanded from macro 'U32'
#define U32(x) (_U32(x))
^
include/uapi/linux/const.h:25:19: note: expanded from macro '_U32'
#define _U32(x) (_AC(x, U))
^
include/uapi/linux/const.h:21:18: note: expanded from macro '_AC'
#define _AC(X,Y) __AC(X,Y)
^
include/uapi/linux/const.h:20:20: note: expanded from macro '__AC'
#define __AC(X,Y) (X##Y)
^
<scratch space>:178:1: note: expanded from here
ZSTD_getAllMatchesFnU
^
>> lib/zstd/compress/zstd_opt.c:854:5: error: use of undeclared identifier 'ZSTD_getAllMatchesFn'
ZSTD_getAllMatchesFn const getAllMatchesFns[3][4] = {
^
>> lib/zstd/compress/zstd_opt.c:862:12: error: use of undeclared identifier 'getAllMatchesFns'
return getAllMatchesFns[(int)dictMode][mls - 3];
^
lib/zstd/compress/zstd_opt.c:1054:5: error: unknown type name 'ZSTD_getAllMatchesFn'; did you mean 'ZSTD_getAllMatchesFnU'?
ZSTD_getAllMatchesFn getAllMatches = ZSTD_selectBtGetAllMatches(ms, dictMode);
^~~~~~~~~~~~~~~~~~~~
ZSTD_getAllMatchesFnU
lib/zstd/compress/zstd_opt.c:785:9: note: 'ZSTD_getAllMatchesFnU' declared here
typedef U32 (*ZSTD_getAllMatchesFn)(
^
include/vdso/const.h:7:18: note: expanded from macro 'U32'
#define U32(x) (_U32(x))
^
include/uapi/linux/const.h:25:19: note: expanded from macro '_U32'
#define _U32(x) (_AC(x, U))
^
include/uapi/linux/const.h:21:18: note: expanded from macro '_AC'
#define _AC(X,Y) __AC(X,Y)
^
include/uapi/linux/const.h:20:20: note: expanded from macro '__AC'
#define __AC(X,Y) (X##Y)
^
<scratch space>:178:1: note: expanded from here
ZSTD_getAllMatchesFnU
^
5 errors generated.
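
A reading of the compiler notes above, offered as an illustration rather than as part of the robot's report: zstd uses U32 as a type name, and a function-like macro named U32 is invoked whenever the token U32 is directly followed by an opening parenthesis, as in "typedef U32 (*ZSTD_getAllMatchesFn)(". The userspace sketch below reproduces the effect; AC_INNER(), AC(), STR() and fn_name are made-up names that mirror the shape of the macros listed in the notes.

```c
#include <assert.h>
#include <string.h>

/* Same shape as __AC(), _AC() and U32() in the compiler notes. */
#define AC_INNER(X, Y) (X##Y)
#define AC(X, Y) AC_INNER(X, Y)
#define U32(x) (AC(x, U))

/* Two-level stringification so the argument is expanded first. */
#define STR_INNER(x) #x
#define STR(x) STR_INNER(x)

/* "U32 (" is taken as a macro call, so the declarator becomes its argument. */
static const char expanded[] = STR(U32 (*fn_name));

/* Checks that the declarator name was pasted together with the "U" suffix. */
void check_expansion(void)
{
	assert(strstr(expanded, "fn_nameU") != NULL);
}
```

This matches the "ZSTD_getAllMatchesFnU" identifier that appears in the diagnostics.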


vim +/int +785 lib/zstd/compress/zstd_opt.c

e0c1b49f5b674c Nick Terrell 2020-09-11 784
2aa14b1ab2c41a Nick Terrell 2022-10-17 @785 typedef U32 (*ZSTD_getAllMatchesFn)(
2aa14b1ab2c41a Nick Terrell 2022-10-17 786 ZSTD_match_t*,
2aa14b1ab2c41a Nick Terrell 2022-10-17 787 ZSTD_matchState_t*,
2aa14b1ab2c41a Nick Terrell 2022-10-17 788 U32*,
2aa14b1ab2c41a Nick Terrell 2022-10-17 789 const BYTE*,
2aa14b1ab2c41a Nick Terrell 2022-10-17 790 const BYTE*,
2aa14b1ab2c41a Nick Terrell 2022-10-17 791 const U32 rep[ZSTD_REP_NUM],
2aa14b1ab2c41a Nick Terrell 2022-10-17 792 U32 const ll0,
2aa14b1ab2c41a Nick Terrell 2022-10-17 793 U32 const lengthToBeat);
e0c1b49f5b674c Nick Terrell 2020-09-11 794
2aa14b1ab2c41a Nick Terrell 2022-10-17 795 FORCE_INLINE_TEMPLATE U32 ZSTD_btGetAllMatches_internal(
2aa14b1ab2c41a Nick Terrell 2022-10-17 796 ZSTD_match_t* matches,
e0c1b49f5b674c Nick Terrell 2020-09-11 797 ZSTD_matchState_t* ms,
e0c1b49f5b674c Nick Terrell 2020-09-11 798 U32* nextToUpdate3,
2aa14b1ab2c41a Nick Terrell 2022-10-17 799 const BYTE* ip,
2aa14b1ab2c41a Nick Terrell 2022-10-17 800 const BYTE* const iHighLimit,
e0c1b49f5b674c Nick Terrell 2020-09-11 801 const U32 rep[ZSTD_REP_NUM],
e0c1b49f5b674c Nick Terrell 2020-09-11 802 U32 const ll0,
2aa14b1ab2c41a Nick Terrell 2022-10-17 803 U32 const lengthToBeat,
2aa14b1ab2c41a Nick Terrell 2022-10-17 804 const ZSTD_dictMode_e dictMode,
2aa14b1ab2c41a Nick Terrell 2022-10-17 805 const U32 mls)
e0c1b49f5b674c Nick Terrell 2020-09-11 806 {
2aa14b1ab2c41a Nick Terrell 2022-10-17 807 assert(BOUNDED(3, ms->cParams.minMatch, 6) == mls);
2aa14b1ab2c41a Nick Terrell 2022-10-17 808 DEBUGLOG(8, "ZSTD_BtGetAllMatches(dictMode=%d, mls=%u)", (int)dictMode, mls);
2aa14b1ab2c41a Nick Terrell 2022-10-17 809 if (ip < ms->window.base + ms->nextToUpdate)
2aa14b1ab2c41a Nick Terrell 2022-10-17 810 return 0; /* skipped area */
2aa14b1ab2c41a Nick Terrell 2022-10-17 811 ZSTD_updateTree_internal(ms, ip, iHighLimit, mls, dictMode);
2aa14b1ab2c41a Nick Terrell 2022-10-17 812 return ZSTD_insertBtAndGetAllMatches(matches, ms, nextToUpdate3, ip, iHighLimit, dictMode, rep, ll0, lengthToBeat, mls);
2aa14b1ab2c41a Nick Terrell 2022-10-17 813 }
2aa14b1ab2c41a Nick Terrell 2022-10-17 814
2aa14b1ab2c41a Nick Terrell 2022-10-17 815 #define ZSTD_BT_GET_ALL_MATCHES_FN(dictMode, mls) ZSTD_btGetAllMatches_##dictMode##_##mls
2aa14b1ab2c41a Nick Terrell 2022-10-17 816
2aa14b1ab2c41a Nick Terrell 2022-10-17 817 #define GEN_ZSTD_BT_GET_ALL_MATCHES_(dictMode, mls) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 818 static U32 ZSTD_BT_GET_ALL_MATCHES_FN(dictMode, mls)( \
2aa14b1ab2c41a Nick Terrell 2022-10-17 819 ZSTD_match_t* matches, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 820 ZSTD_matchState_t* ms, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 821 U32* nextToUpdate3, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 822 const BYTE* ip, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 823 const BYTE* const iHighLimit, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 824 const U32 rep[ZSTD_REP_NUM], \
2aa14b1ab2c41a Nick Terrell 2022-10-17 825 U32 const ll0, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 826 U32 const lengthToBeat) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 827 { \
2aa14b1ab2c41a Nick Terrell 2022-10-17 828 return ZSTD_btGetAllMatches_internal( \
2aa14b1ab2c41a Nick Terrell 2022-10-17 829 matches, ms, nextToUpdate3, ip, iHighLimit, \
2aa14b1ab2c41a Nick Terrell 2022-10-17 830 rep, ll0, lengthToBeat, ZSTD_##dictMode, mls); \
2aa14b1ab2c41a Nick Terrell 2022-10-17 831 }
2aa14b1ab2c41a Nick Terrell 2022-10-17 832
2aa14b1ab2c41a Nick Terrell 2022-10-17 833 #define GEN_ZSTD_BT_GET_ALL_MATCHES(dictMode) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 834 GEN_ZSTD_BT_GET_ALL_MATCHES_(dictMode, 3) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 835 GEN_ZSTD_BT_GET_ALL_MATCHES_(dictMode, 4) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 836 GEN_ZSTD_BT_GET_ALL_MATCHES_(dictMode, 5) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 837 GEN_ZSTD_BT_GET_ALL_MATCHES_(dictMode, 6)
2aa14b1ab2c41a Nick Terrell 2022-10-17 838
2aa14b1ab2c41a Nick Terrell 2022-10-17 839 GEN_ZSTD_BT_GET_ALL_MATCHES(noDict)
2aa14b1ab2c41a Nick Terrell 2022-10-17 840 GEN_ZSTD_BT_GET_ALL_MATCHES(extDict)
2aa14b1ab2c41a Nick Terrell 2022-10-17 841 GEN_ZSTD_BT_GET_ALL_MATCHES(dictMatchState)
2aa14b1ab2c41a Nick Terrell 2022-10-17 842
2aa14b1ab2c41a Nick Terrell 2022-10-17 843 #define ZSTD_BT_GET_ALL_MATCHES_ARRAY(dictMode) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 844 { \
2aa14b1ab2c41a Nick Terrell 2022-10-17 845 ZSTD_BT_GET_ALL_MATCHES_FN(dictMode, 3), \
2aa14b1ab2c41a Nick Terrell 2022-10-17 846 ZSTD_BT_GET_ALL_MATCHES_FN(dictMode, 4), \
2aa14b1ab2c41a Nick Terrell 2022-10-17 847 ZSTD_BT_GET_ALL_MATCHES_FN(dictMode, 5), \
2aa14b1ab2c41a Nick Terrell 2022-10-17 848 ZSTD_BT_GET_ALL_MATCHES_FN(dictMode, 6) \
2aa14b1ab2c41a Nick Terrell 2022-10-17 849 }
2aa14b1ab2c41a Nick Terrell 2022-10-17 850
2aa14b1ab2c41a Nick Terrell 2022-10-17 @851 static ZSTD_getAllMatchesFn
2aa14b1ab2c41a Nick Terrell 2022-10-17 852 ZSTD_selectBtGetAllMatches(ZSTD_matchState_t const* ms, ZSTD_dictMode_e const dictMode)
e0c1b49f5b674c Nick Terrell 2020-09-11 853 {
2aa14b1ab2c41a Nick Terrell 2022-10-17 @854 ZSTD_getAllMatchesFn const getAllMatchesFns[3][4] = {
2aa14b1ab2c41a Nick Terrell 2022-10-17 855 ZSTD_BT_GET_ALL_MATCHES_ARRAY(noDict),
2aa14b1ab2c41a Nick Terrell 2022-10-17 856 ZSTD_BT_GET_ALL_MATCHES_ARRAY(extDict),
2aa14b1ab2c41a Nick Terrell 2022-10-17 857 ZSTD_BT_GET_ALL_MATCHES_ARRAY(dictMatchState)
2aa14b1ab2c41a Nick Terrell 2022-10-17 858 };
2aa14b1ab2c41a Nick Terrell 2022-10-17 859 U32 const mls = BOUNDED(3, ms->cParams.minMatch, 6);
2aa14b1ab2c41a Nick Terrell 2022-10-17 860 assert((U32)dictMode < 3);
2aa14b1ab2c41a Nick Terrell 2022-10-17 861 assert(mls - 3 < 4);
2aa14b1ab2c41a Nick Terrell 2022-10-17 @862 return getAllMatchesFns[(int)dictMode][mls - 3];
e0c1b49f5b674c Nick Terrell 2020-09-11 863 }
e0c1b49f5b674c Nick Terrell 2020-09-11 864

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-05-12 11:46:20

by Jani Nikula

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> masks for fixed-width types and also the corresponding BIT_U32(),
>> BIT_U16() and BIT_U8().
>
> Why?

The main reason is that GENMASK() and BIT() size varies for 32/64 bit
builds.
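
A minimal userspace sketch of that point, using simplified local definitions (the real BIT() lives in the kernel headers, and BIT_U32() here is only the unchecked core of the proposed macro):

```c
#include <assert.h>

#define BIT(nr)     (1UL << (nr))
#define BIT_U32(nr) (1U << (nr))

/* The type, and therefore the size, of BIT() follows the ABI's unsigned long. */
static_assert(sizeof(BIT(0)) == sizeof(unsigned long), "4 or 8 bytes");
/* The U-suffixed variant stays unsigned int regardless of long's width. */
static_assert(sizeof(BIT_U32(0)) == sizeof(unsigned int), "independent of long");
static_assert(_Generic(BIT_U32(31), unsigned int: 1, default: 0), "unsigned int");
```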


BR,
Jani.

>
>> All of those depend on a new "U" suffix added to the integer constant.
>> Due to naming clashes it's better to call the macro U32. Since C doesn't
>> have a proper suffix for short and char types, the U16 and U8 variants
>> just use U32 with one additional check in the BIT_* macros to make
>> sure the compiler gives an error when those types overflow.
>> The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>> as otherwise they would allow an invalid bit to be passed. Hence
>> implement them in include/linux/bits.h rather than together with
>> the other BIT* variants.
>
> So, we have _Generic() in case you still wish to implement this.

--
Jani Nikula, Intel Open Source Graphics Center

2023-05-12 11:50:19

by Andy Shevchenko

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> >> masks for fixed-width types and also the corresponding BIT_U32(),
> >> BIT_U16() and BIT_U8().
> >
> > Why?
>
> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
> builds.

When needed GENMASK_ULL() can be used (with respective castings perhaps)
and BIT_ULL(), no?

--
With Best Regards,
Andy Shevchenko



2023-05-12 11:50:38

by Andy Shevchenko

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> masks for fixed-width types and also the corresponding BIT_U32(),
> BIT_U16() and BIT_U8().

Why?

> All of those depend on a new "U" suffix added to the integer constant.
> Due to naming clashes it's better to call the macro U32. Since C doesn't
> have a proper suffix for short and char types, the U16 and U8 variants
> just use U32 with one additional check in the BIT_* macros to make
> sure the compiler gives an error when those types overflow.
> The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
> as otherwise they would allow an invalid bit to be passed. Hence
> implement them in include/linux/bits.h rather than together with
> the other BIT* variants.

So, we have _Generic() in case you still wish to implement this.

--
With Best Regards,
Andy Shevchenko



2023-05-12 12:31:20

by Jani Nikula

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
>> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
>> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> >> masks for fixed-width types and also the corresponding BIT_U32(),
>> >> BIT_U16() and BIT_U8().
>> >
>> > Why?
>>
>> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
>> builds.
>
> When needed GENMASK_ULL() can be used (with respective castings perhaps)
> and BIT_ULL(), no?

How does that help with making them the same 32-bit size on both 32 and
64 bit builds?

BR,
Jani.


--
Jani Nikula, Intel Open Source Graphics Center

2023-05-12 16:38:57

by Lucas De Marchi

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Fri, May 12, 2023 at 02:14:19PM +0300, Andy Shevchenko wrote:
>On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> masks for fixed-width types and also the corresponding BIT_U32(),
>> BIT_U16() and BIT_U8().
>
>Why?

To create the masks/values for device registers that have a
certain width, preventing mistakes like:

#define REG1 0x10
#define REG1_ENABLE BIT(17)
#define REG1_FOO GENMASK(16, 15)

register_write(REG1_ENABLE, REG1);


... if REG1 is a 16-bit register, for example. There were mistakes in the
past in the i915 source, leading to the creation of the REG_* variants on
top of the normal GENMASK/BIT (see the last patch and commit 09b434d4f6d2
("drm/i915: introduce REG_BIT() and REG_GENMASK() to define register
contents")).
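
To illustrate the REG1 case, here is a userspace sketch under stated assumptions: reg16_latch() is a hypothetical model of a 16-bit register, and INPUT_CHECK() is a made-up stand-in for GENMASK_INPUT_CHECK() built on a negative-array-size trick.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(nr) (1UL << (nr))

/* Hypothetical stand-in for GENMASK_INPUT_CHECK(): 0 if l <= h, else build error. */
#define INPUT_CHECK(h, l) ((int)(sizeof(char[1 - 2 * !!((l) > (h))]) - 1))
#define BIT_U16(nr) (INPUT_CHECK(16 - 1, nr) + (1U << (nr)))

/* Hypothetical model of a 16-bit register: only the low 16 bits are kept. */
static inline uint16_t reg16_latch(unsigned long val)
{
	return (uint16_t)val;
}

static_assert(BIT(17) == 0x20000UL, "BIT(17) itself is a perfectly valid value");
static_assert(BIT_U16(15) == 0x8000U, "highest bit a 16-bit register can hold");
/* BIT_U16(17) would not compile, so the REG1_ENABLE mistake is caught early. */
```

With plain BIT(17), the value written to the 16-bit register ends up as 0, and nothing at build time points at the definition.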

We are preparing another driver (xe), still to be merged but already
open (https://gitlab.freedesktop.org/drm/xe/kernel), that has
similar requirements.


>
>> All of those depend on a new "U" suffix added to the integer constant.
>> Due to naming clashes it's better to call the macro U32. Since C doesn't
>> have a proper suffix for short and char types, the U16 and U8 variants
>> just use U32 with one additional check in the BIT_* macros to make
>> sure the compiler gives an error when those types overflow.
>> The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>> as otherwise they would allow an invalid bit to be passed. Hence
>> implement them in include/linux/bits.h rather than together with
>> the other BIT* variants.
>
>So, we have _Generic() in case you still wish to implement this.

Hmm... how would _Generic() help here? The input is one or two integer
literals (h and l), so the compiler can check that it is correct at build
time. See the example above.

Lucas De Marchi

2023-06-15 16:21:24

by Andy Shevchenko

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Fri, May 12, 2023 at 02:45:19PM +0300, Jani Nikula wrote:
> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> > On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
> >> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> >> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> >> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> >> >> masks for fixed-width types and also the corresponding BIT_U32(),
> >> >> BIT_U16() and BIT_U8().
> >> >
> >> > Why?
> >>
> >> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
> >> builds.
> >
> > When needed GENMASK_ULL() can be used (with respective castings perhaps)
> > and BIT_ULL(), no?
>
> How does that help with making them the same 32-bit size on both 32 and
> 64 bit builds?

u32 x = GENMASK();
u64 y = GENMASK_ULL();

No? Then use either x or y in your code. Note that I assume that the parameters
to GENMASK*() are build-time constants. Is that the case for you?

--
With Best Regards,
Andy Shevchenko



2023-06-15 16:23:19

by Andy Shevchenko

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Fri, May 12, 2023 at 09:29:23AM -0700, Lucas De Marchi wrote:
> On Fri, May 12, 2023 at 02:14:19PM +0300, Andy Shevchenko wrote:
> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> > > Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> > > masks for fixed-width types and also the corresponding BIT_U32(),
> > > BIT_U16() and BIT_U8().
> >
> > Why?
>
> to create the masks/values for device registers that are
> of a certain width, preventing mistakes like:
>
> #define REG1 0x10
> #define REG1_ENABLE BIT(17)
> #define REG1_FOO GENMASK(16, 15);
>
> register_write(REG1_ENABLE, REG1);
>
>
> ... if REG1 is a 16bit register for example. There were mistakes in the
> past in the i915 source leading to the creation of the REG_* variants on
> top of normal GENMASK/BIT (see last patch and commit 09b434d4f6d2
> ("drm/i915: introduce REG_BIT() and REG_GENMASK() to define register
> contents")

Doesn't this look like a candidate for bitfield.h?
If your definition doesn't fit the given mask, bail out.

--
With Best Regards,
Andy Shevchenko



2023-06-20 15:11:45

by Andy Shevchenko

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Tue, Jun 20, 2023 at 05:47:34PM +0300, Jani Nikula wrote:
> On Thu, 15 Jun 2023, Andy Shevchenko <[email protected]> wrote:
> > On Fri, May 12, 2023 at 02:45:19PM +0300, Jani Nikula wrote:
> >> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> >> > On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
> >> >> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
> >> >> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> >> >> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> >> >> >> masks for fixed-width types and also the corresponding BIT_U32(),
> >> >> >> BIT_U16() and BIT_U8().
> >> >> >
> >> >> > Why?
> >> >>
> >> >> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
> >> >> builds.
> >> >
> >> > When needed GENMASK_ULL() can be used (with respective castings perhaps)
> >> > and BIT_ULL(), no?
> >>
> >> How does that help with making them the same 32-bit size on both 32 and
> >> 64 bit builds?
> >
> > u32 x = GENMASK();
> > u64 y = GENMASK_ULL();
> >
> > No? Then use in your code either x or y. Note that I assume that the parameters
> > to GENMASK*() are build-time constants. Is it the case for you?
>
> What's wrong with wanting to define macros with specific size, depending
> on e.g. hardware registers instead of build size?

Nothing, but I think the problem is smaller than presented.
And there is already a header for bitfields with a lot of helpers
for similar cases, if not yours.

> What would you use for printk format if you wanted to print
> GENMASK()?

%lu, no?

--
With Best Regards,
Andy Shevchenko



2023-06-20 15:14:23

by Jani Nikula

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Thu, 15 Jun 2023, Andy Shevchenko <[email protected]> wrote:
> On Fri, May 12, 2023 at 02:45:19PM +0300, Jani Nikula wrote:
>> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
>> > On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
>> >> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
>> >> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> >> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> >> >> masks for fixed-width types and also the corresponding BIT_U32(),
>> >> >> BIT_U16() and BIT_U8().
>> >> >
>> >> > Why?
>> >>
>> >> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
>> >> builds.
>> >
>> > When needed GENMASK_ULL() can be used (with respective castings perhaps)
>> > and BIT_ULL(), no?
>>
>> How does that help with making them the same 32-bit size on both 32 and
>> 64 bit builds?
>
> u32 x = GENMASK();
> u64 y = GENMASK_ULL();
>
> No? Then use in your code either x or y. Note that I assume that the parameters
> to GENMASK*() are build-time constants. Is it the case for you?

What's wrong with wanting to define macros with specific size, depending
on e.g. hardware registers instead of build size?

What would you use for printk format if you wanted to print
GENMASK()?
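
For illustration only, a userspace sketch of the format question using snprintf() in place of printk. GENMASK() is redefined locally in simplified form, and MASK_U32() and format_masks() are made-up names for this example.

```c
#include <assert.h>
#include <stdio.h>

#define BITS_PER_LONG ((int)(sizeof(long) * 8))
/* Simplified local GENMASK(): the result has type unsigned long. */
#define GENMASK(h, l) \
	(((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
/* Fixed-width variant: the result has type unsigned int. */
#define MASK_U32(h, l) \
	(((~0U) - (1U << (l)) + 1) & (~0U >> (32 - 1 - (h))))

static_assert(sizeof(GENMASK(7, 4)) == sizeof(unsigned long), "needs the l modifier");

/* unsigned long needs "%lx"; the unsigned int variant is printed with "%x". */
int format_masks(char *buf, size_t len)
{
	return snprintf(buf, len, "%#lx %#x", GENMASK(7, 4), MASK_U32(7, 4));
}
```

The values are the same; only the type, and therefore the required format specifier, differs.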


BR,
Jani.


--
Jani Nikula, Intel Open Source Graphics Center

2023-06-20 18:32:06

by Lucas De Marchi

Subject: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Tue, Jun 20, 2023 at 05:55:19PM +0300, Andy Shevchenko wrote:
>On Tue, Jun 20, 2023 at 05:47:34PM +0300, Jani Nikula wrote:
>> On Thu, 15 Jun 2023, Andy Shevchenko <[email protected]> wrote:
>> > On Fri, May 12, 2023 at 02:45:19PM +0300, Jani Nikula wrote:
>> >> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
>> >> > On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
>> >> >> On Fri, 12 May 2023, Andy Shevchenko <[email protected]> wrote:
>> >> >> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> >> >> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> >> >> >> masks for fixed-width types and also the corresponding BIT_U32(),
>> >> >> >> BIT_U16() and BIT_U8().
>> >> >> >
>> >> >> > Why?
>> >> >>
>> >> >> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
>> >> >> builds.
>> >> >
>> >> > When needed GENMASK_ULL() can be used (with respective castings perhaps)
>> >> > and BIT_ULL(), no?
>> >>
>> >> How does that help with making them the same 32-bit size on both 32 and
>> >> 64 bit builds?
>> >
>> > u32 x = GENMASK();
>> > u64 y = GENMASK_ULL();
>> >
>> > No? Then use in your code either x or y. Note that I assume that the parameters
>> > to GENMASK*() are build-time constants. Is it the case for you?
>>
>> What's wrong with wanting to define macros with specific size, depending
>> on e.g. hardware registers instead of build size?
>
>Nothing, but I think the problem is smaller than it's presented.

I'm not sure which big/small problem you are talking about. The problem
arises when the *device* register has a fixed 32-bit width, which is
independent of the CPU you are running on. We also have registers that
are u16 and u64. Having fixed-width GENMASK and BIT helps avoid
mistakes like the one below. Just to use one example, the diff below builds
fine on my 64-bit machine, yet it's obviously wrong:

$ git diff
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
index 0b414eae1683..692a0ad9a768 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_mcr.c
@@ -261,8 +261,8 @@ static u32 rw_with_mcr_steering_fw(struct intel_gt *gt,
* No need to save old steering reg value.
*/
intel_uncore_write_fw(uncore, MTL_MCR_SELECTOR,
- REG_FIELD_PREP(MTL_MCR_GROUPID, group) |
- REG_FIELD_PREP(MTL_MCR_INSTANCEID, instance) |
+ FIELD_PREP(MTL_MCR_GROUPID, group) |
+ FIELD_PREP(MTL_MCR_INSTANCEID, instance) |
(rw_flag == FW_REG_READ ? GEN11_MCR_MULTICAST : 0));
} else if (GRAPHICS_VER(uncore->i915) >= 11) {
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 718cb2c80f79..c42bc2900c6a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -80,8 +80,8 @@
#define GEN11_MCR_SLICE_MASK GEN11_MCR_SLICE(0xf)
#define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24)
#define GEN11_MCR_SUBSLICE_MASK GEN11_MCR_SUBSLICE(0x7)
-#define MTL_MCR_GROUPID REG_GENMASK(11, 8)
-#define MTL_MCR_INSTANCEID REG_GENMASK(3, 0)
+#define MTL_MCR_GROUPID GENMASK(32, 8)
+#define MTL_MCR_INSTANCEID GENMASK(3, 0)

#define IPEIR_I965 _MMIO(0x2064)
#define IPEHR_I965 _MMIO(0x2068)

If the driver didn't support 32b CPUs, this would even go unnoticed.
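
The numbers behind the GENMASK(32, 8) change can be sketched in userspace as follows. MASK_64() is a made-up 64-bit stand-in for what GENMASK() evaluates to on a 64-bit build, not the kernel macro itself.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit stand-in for GENMASK() as evaluated on a 64-bit build. */
#define MASK_64(h, l) \
	(((~0ULL) - (1ULL << (l)) + 1) & (~0ULL >> (64 - 1 - (h))))

/* Bit 32 is a valid position in a 64-bit type, so nothing complains. */
static_assert(MASK_64(32, 8) == 0x1ffffff00ULL, "33-bit wide value");
/* Narrowed to a 32-bit register value, bit 32 is silently dropped. */
static_assert((uint32_t)MASK_64(32, 8) == 0xffffff00U, "bit 32 is lost");
```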

Lucas De Marchi

>And there are already header for bitfields with a lot of helpers
>for (similar) cases if not yours.
>
>> What would you use for printk format if you wanted to print
>> GENMASK()?
>
>%lu, no?
>
>--
>With Best Regards,
>Andy Shevchenko
>
>

2023-06-22 03:28:14

by Yury Norov

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

Hi Lucas, all!

(Thanks, Andy, for pointing to this thread.)

On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> masks for fixed-width types and also the corresponding BIT_U32(),
> BIT_U16() and BIT_U8().

Can you split BIT() and GENMASK() material to separate patches?

> All of those depend on a new "U" suffix added to the integer constant.
> Due to naming clashes it's better to call the macro U32. Since C doesn't
> have a proper suffix for short and char types, the U16 and U8 variants
> just use U32 with one additional check in the BIT_* macros to make
> sure the compiler gives an error when those types overflow.

I feel like I don't understand the sentence...

> The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
> as otherwise they would allow an invalid bit to be passed. Hence
> implement them in include/linux/bits.h rather than together with
> the other BIT* variants.

I don't think it's a good way to go because BIT() belongs to a more basic
level than GENMASK(). Not mentioning possible header dependency issues.
If you need to test against tighter numeric region, I'd suggest to
do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
directly. Something like:
#define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))

> The following test file is used to test this:
>
> $ cat mask.c
> #include <linux/types.h>
> #include <linux/bits.h>
>
> static const u32 a = GENMASK_U32(31, 0);
> static const u16 b = GENMASK_U16(15, 0);
> static const u8 c = GENMASK_U8(7, 0);
> static const u32 x = BIT_U32(31);
> static const u16 y = BIT_U16(15);
> static const u8 z = BIT_U8(7);
>
> #if FAIL
> static const u32 a2 = GENMASK_U32(32, 0);
> static const u16 b2 = GENMASK_U16(16, 0);
> static const u8 c2 = GENMASK_U8(8, 0);
> static const u32 x2 = BIT_U32(32);
> static const u16 y2 = BIT_U16(16);
> static const u8 z2 = BIT_U8(8);
> #endif
>
> Signed-off-by: Lucas De Marchi <[email protected]>
> ---
> include/linux/bits.h | 22 ++++++++++++++++++++++
> include/uapi/linux/const.h | 2 ++
> include/vdso/const.h | 1 +
> 3 files changed, 25 insertions(+)
>
> diff --git a/include/linux/bits.h b/include/linux/bits.h
> index 7c0cf5031abe..ff4786c99b8c 100644
> --- a/include/linux/bits.h
> +++ b/include/linux/bits.h
> @@ -42,4 +42,26 @@
> #define GENMASK_ULL(h, l) \
> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>
> +#define __GENMASK_U32(h, l) \
> + (((~U32(0)) - (U32(1) << (l)) + 1) & \
> + (~U32(0) >> (32 - 1 - (h))))
> +#define GENMASK_U32(h, l) \
> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(h, l))
> +
> +#define __GENMASK_U16(h, l) \
> + ((U32(0xffff) - (U32(1) << (l)) + 1) & \
> + (U32(0xffff) >> (16 - 1 - (h))))
> +#define GENMASK_U16(h, l) \
> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U16(h, l))
> +
> +#define __GENMASK_U8(h, l) \
> + (((U32(0xff)) - (U32(1) << (l)) + 1) & \
> + (U32(0xff) >> (8 - 1 - (h))))
> +#define GENMASK_U8(h, l) \
> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U8(h, l))

[...]

I see nothing wrong with fixed-width versions of GENMASK if it helps
people to write safer code. Can you please in commit message mention
the exact patch(es) that added a bug related to GENMASK() misuse? It
would be easier to advocate the purpose of new API with that in mind.

Regarding implementation - we should avoid copy-pasting in cases
like this. Below is the patch that I boot-tested for x86_64 and
compile-tested for arm64.

It looks less opencoded, and maybe Andy will be less skeptical about
this approach because of less maintenance burden. Please take it if
you like for v2.

Thanks,
Yury

From 39c5b35075df67e7d88644470ca78a3486367c02 Mon Sep 17 00:00:00 2001
From: Yury Norov <[email protected]>
Date: Wed, 21 Jun 2023 15:27:29 -0700
Subject: [PATCH] bits: introduce fixed-type genmasks

Generalize __GENMASK() to support different types, and implement
fixed-types versions of GENMASK() based on it.

Signed-off-by: Yury Norov <[email protected]>
---
include/linux/bitops.h | 1 -
include/linux/bits.h | 22 ++++++++++++----------
2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 2ba557e067fe..1db50c69cfdb 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -15,7 +15,6 @@
# define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
#endif

-#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
#define BITS_TO_LONGS(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
#define BITS_TO_U64(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
#define BITS_TO_U32(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
diff --git a/include/linux/bits.h b/include/linux/bits.h
index 7c0cf5031abe..cb94128171b2 100644
--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -6,6 +6,8 @@
#include <vdso/bits.h>
#include <asm/bitsperlong.h>

+#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
+
#define BIT_MASK(nr) (UL(1) << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define BIT_ULL_MASK(nr) (ULL(1) << ((nr) % BITS_PER_LONG_LONG))
@@ -30,16 +32,16 @@
#define GENMASK_INPUT_CHECK(h, l) 0
#endif

-#define __GENMASK(h, l) \
- (((~UL(0)) - (UL(1) << (l)) + 1) & \
- (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
-#define GENMASK(h, l) \
- (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
+#define __GENMASK(t, h, l) \
+ (GENMASK_INPUT_CHECK(h, l) + \
+ (((t)~0ULL - ((t)(1) << (l)) + 1) & \
+ ((t)~0ULL >> (BITS_PER_TYPE(t) - 1 - (h)))))

-#define __GENMASK_ULL(h, l) \
- (((~ULL(0)) - (ULL(1) << (l)) + 1) & \
- (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
-#define GENMASK_ULL(h, l) \
- (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
+#define GENMASK(h, l) __GENMASK(unsigned long, h, l)
+#define GENMASK_ULL(h, l) __GENMASK(unsigned long long, h, l)
+#define GENMASK_U8(h, l) __GENMASK(u8, h, l)
+#define GENMASK_U16(h, l) __GENMASK(u16, h, l)
+#define GENMASK_U32(h, l) __GENMASK(u32, h, l)
+#define GENMASK_U64(h, l) __GENMASK(u64, h, l)

#endif /* __LINUX_BITS_H */
--
2.39.2

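The generalized macro can be tried outside the kernel with a sketch like
the one below. It is only an approximation under stated assumptions:
TYPED_GENMASK() and BITS_OF() are hypothetical names standing in for
__GENMASK() and BITS_PER_TYPE(), GENMASK_INPUT_CHECK() is left out, and an
outer cast to the target type is added so that the 8- and 16-bit results
are not left as a promoted int (the patch above does not have that cast).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define BITS_OF(t) (sizeof(t) * CHAR_BIT)

/* Build an all-ones value of type t from ~0ULL, clear the bits below l,
 * then clear the bits above h by shifting the all-ones value right. */
#define TYPED_GENMASK(t, h, l) \
	((t)(((t)~0ULL - ((t)1 << (l)) + 1) & \
	     ((t)~0ULL >> (BITS_OF(t) - 1 - (h)))))

void demo_typed_genmask(void)
{
	/* Full-width masks for each type. */
	assert(TYPED_GENMASK(uint8_t, 7, 0) == 0xff);
	assert(TYPED_GENMASK(uint16_t, 15, 0) == 0xffff);
	assert(TYPED_GENMASK(uint32_t, 31, 0) == UINT32_C(0xffffffff));
	/* Partial fields. */
	assert(TYPED_GENMASK(uint8_t, 5, 2) == 0x3c);
	assert(TYPED_GENMASK(uint32_t, 11, 8) == UINT32_C(0xf00));
	assert(TYPED_GENMASK(uint64_t, 63, 32) == UINT64_C(0xffffffff00000000));
}
```

Passing an h at or beyond the width of the type makes the right-shift
count wrap or go out of range; the sketch does nothing to catch that on
its own, so it relies on compiler diagnostics just as the patch does.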

2023-06-22 06:31:36

by Lucas De Marchi

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Wed, Jun 21, 2023 at 07:20:59PM -0700, Yury Norov wrote:
>Hi Lucas, all!
>
>(Thanks, Andy, for pointing to this thread.)
>
>On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> masks for fixed-width types and also the corresponding BIT_U32(),
>> BIT_U16() and BIT_U8().
>
>Can you split BIT() and GENMASK() material to separate patches?
>
>> All of those depend on a new "U" suffix added to the integer constant.
>> Due to naming clashes it's better to call the macro U32. Since C doesn't
>> have a proper suffix for short and char types, the U16 and U8 variants
>> just use U32 with one additional check in the BIT_* macros to make
>> sure the compiler gives an error when those types overflow.
>
>I feel like I don't understand the sentence...

maybe it was a digression about the integer constants

>
>> The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>> as otherwise they would allow an invalid bit to be passed. Hence
>> implement them in include/linux/bits.h rather than together with
>> the other BIT* variants.
>
>I don't think it's a good way to go because BIT() belongs to a more basic
>level than GENMASK(). Not mentioning possible header dependency issues.
>If you need to test against tighter numeric region, I'd suggest to
>do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
>directly. Something like:
> #define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))
>
>> The following test file is used to test this:
>>
>> $ cat mask.c
>> #include <linux/types.h>
>> #include <linux/bits.h>
>>
>> static const u32 a = GENMASK_U32(31, 0);
>> static const u16 b = GENMASK_U16(15, 0);
>> static const u8 c = GENMASK_U8(7, 0);
>> static const u32 x = BIT_U32(31);
>> static const u16 y = BIT_U16(15);
>> static const u8 z = BIT_U8(7);
>>
>> #if FAIL
>> static const u32 a2 = GENMASK_U32(32, 0);
>> static const u16 b2 = GENMASK_U16(16, 0);
>> static const u8 c2 = GENMASK_U8(8, 0);
>> static const u32 x2 = BIT_U32(32);
>> static const u16 y2 = BIT_U16(16);
>> static const u8 z2 = BIT_U8(8);
>> #endif
>>
>> Signed-off-by: Lucas De Marchi <[email protected]>
>> ---
>> include/linux/bits.h | 22 ++++++++++++++++++++++
>> include/uapi/linux/const.h | 2 ++
>> include/vdso/const.h | 1 +
>> 3 files changed, 25 insertions(+)
>>
>> diff --git a/include/linux/bits.h b/include/linux/bits.h
>> index 7c0cf5031abe..ff4786c99b8c 100644
>> --- a/include/linux/bits.h
>> +++ b/include/linux/bits.h
>> @@ -42,4 +42,26 @@
>> #define GENMASK_ULL(h, l) \
>> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>>
>> +#define __GENMASK_U32(h, l) \
>> + (((~U32(0)) - (U32(1) << (l)) + 1) & \
>> + (~U32(0) >> (32 - 1 - (h))))
>> +#define GENMASK_U32(h, l) \
>> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(h, l))
>> +
>> +#define __GENMASK_U16(h, l) \
>> + ((U32(0xffff) - (U32(1) << (l)) + 1) & \
>> + (U32(0xffff) >> (16 - 1 - (h))))
>> +#define GENMASK_U16(h, l) \
>> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U16(h, l))
>> +
>> +#define __GENMASK_U8(h, l) \
>> + (((U32(0xff)) - (U32(1) << (l)) + 1) & \
>> + (U32(0xff) >> (8 - 1 - (h))))
>> +#define GENMASK_U8(h, l) \
>> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U8(h, l))
>
>[...]
>
>I see nothing wrong with fixed-width versions of GENMASK if it helps
>people to write safer code. Can you please in commit message mention
>the exact patch(es) that added a bug related to GENMASK() misuse? It
>would be easier to advocate the purpose of new API with that in mind.
>
>Regarding implementation - we should avoid copy-pasting in cases
>like this. Below is the patch that I boot-tested for x86_64 and
>compile-tested for arm64.
>
>It looks less opencoded, and maybe Andy will be less skeptical about
>this approach because of less maintenance burden. Please take it if
>you like for v2.
>
>Thanks,
>Yury
>
>From 39c5b35075df67e7d88644470ca78a3486367c02 Mon Sep 17 00:00:00 2001
>From: Yury Norov <[email protected]>
>Date: Wed, 21 Jun 2023 15:27:29 -0700
>Subject: [PATCH] bits: introduce fixed-type genmasks
>
>Generalize __GENMASK() to support different types, and implement
>fixed-types versions of GENMASK() based on it.
>
>Signed-off-by: Yury Norov <[email protected]>
>---
> include/linux/bitops.h | 1 -
> include/linux/bits.h | 22 ++++++++++++----------
> 2 files changed, 12 insertions(+), 11 deletions(-)
>
>diff --git a/include/linux/bitops.h b/include/linux/bitops.h
>index 2ba557e067fe..1db50c69cfdb 100644
>--- a/include/linux/bitops.h
>+++ b/include/linux/bitops.h
>@@ -15,7 +15,6 @@
> # define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> #endif
>
>-#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> #define BITS_TO_LONGS(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
> #define BITS_TO_U64(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
> #define BITS_TO_U32(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
>diff --git a/include/linux/bits.h b/include/linux/bits.h
>index 7c0cf5031abe..cb94128171b2 100644
>--- a/include/linux/bits.h
>+++ b/include/linux/bits.h
>@@ -6,6 +6,8 @@
> #include <vdso/bits.h>
> #include <asm/bitsperlong.h>
>
>+#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
>+
> #define BIT_MASK(nr) (UL(1) << ((nr) % BITS_PER_LONG))
> #define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
> #define BIT_ULL_MASK(nr) (ULL(1) << ((nr) % BITS_PER_LONG_LONG))
>@@ -30,16 +32,16 @@
> #define GENMASK_INPUT_CHECK(h, l) 0
> #endif
>
>-#define __GENMASK(h, l) \
>- (((~UL(0)) - (UL(1) << (l)) + 1) & \
>- (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
>-#define GENMASK(h, l) \
>- (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
>+#define __GENMASK(t, h, l) \
>+ (GENMASK_INPUT_CHECK(h, l) + \
>+ (((t)~0ULL - ((t)(1) << (l)) + 1) & \
>+ ((t)~0ULL >> (BITS_PER_TYPE(t) - 1 - (h)))))

yeah... forcing the use of ull and then casting to the type is simpler
and does the job. Checked that it breaks the build if h is greater than
the width of the type, and it works:

../include/linux/bits.h:40:20: error: right shift count >= width of type [-Werror=shift-count-overflow]
40 | ((t)~0ULL >> (BITS_PER_TYPE(t) - 1 - (h)))))
| ^~

However this new version does increase the size. Using i915 module
to test:

$ size build64/drivers/gpu/drm/i915/i915.ko*
   text    data     bss     dec     hex filename
4355676  213473    7048 4576197  45d3c5 build64/drivers/gpu/drm/i915/i915.ko
4361052  213505    7048 4581605  45e8e5 build64/drivers/gpu/drm/i915/i915.ko.new

Lucas De Marchi

>
>-#define __GENMASK_ULL(h, l) \
>- (((~ULL(0)) - (ULL(1) << (l)) + 1) & \
>- (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
>-#define GENMASK_ULL(h, l) \
>- (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>+#define GENMASK(h, l) __GENMASK(unsigned long, h, l)
>+#define GENMASK_ULL(h, l) __GENMASK(unsigned long long, h, l)
>+#define GENMASK_U8(h, l) __GENMASK(u8, h, l)
>+#define GENMASK_U16(h, l) __GENMASK(u16, h, l)
>+#define GENMASK_U32(h, l) __GENMASK(u32, h, l)
>+#define GENMASK_U64(h, l) __GENMASK(u64, h, l)
>
> #endif /* __LINUX_BITS_H */
>--
>2.39.2
>

2023-06-22 15:24:26

by Yury Norov

Subject: Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

+ Rasmus Villemoes <[email protected]>

> > -#define __GENMASK(h, l) \
> > - (((~UL(0)) - (UL(1) << (l)) + 1) & \
> > - (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
> > -#define GENMASK(h, l) \
> > - (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> > +#define __GENMASK(t, h, l) \
> > + (GENMASK_INPUT_CHECK(h, l) + \
> > + (((t)~0ULL - ((t)(1) << (l)) + 1) & \
> > + ((t)~0ULL >> (BITS_PER_TYPE(t) - 1 - (h)))))
>
> yeah... forcing the use of ull and then casting to the type is simpler
> and does the job. Checked that it breaks the build if h is greater than
> the width of the type, and it works:
>
> ../include/linux/bits.h:40:20: error: right shift count >= width of type [-Werror=shift-count-overflow]
> 40 | ((t)~0ULL >> (BITS_PER_TYPE(t) - 1 - (h)))))
> | ^~
>
> However this new version does increase the size. Using i915 module
> to test:
>
> $ size build64/drivers/gpu/drm/i915/i915.ko*
>    text    data     bss     dec     hex filename
> 4355676  213473    7048 4576197  45d3c5 build64/drivers/gpu/drm/i915/i915.ko
> 4361052  213505    7048 4581605  45e8e5 build64/drivers/gpu/drm/i915/i915.ko.new

It sounds weird because all that should anyways boil down at compile
time...

I enabled DRM_I915 in config and ran bloat-o-meter against today's
master, and I don't see that much difference.

$ size vmlinux vmlinux.new
    text     data     bss      dec     hex filename
44978613 23962202 3026948 71967763 44a2413 vmlinux
44978653 23966298 3026948 71971899 44a343b vmlinux.new
$ scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 0/0 grow/shrink: 3/2 up/down: 28/-5 (23)
Function                                         old     new   delta
kvm_mmu_reset_all_pte_masks                      623     639     +16
intel_psr_invalidate                            1112    1119      +7
intel_drrs_activate                              624     629      +5
intel_psr_flush                                 1410    1409      -1
clk_fractional_divider_general_approximation     207     203      -4
Total: Before=35398799, After=35398822, chg +0.00%

Can you please check your numbers?

Interestingly, the kvm_mmu_reset_all_pte_masks() uses GENMASK_ULL(),
which should generate the same code across versions. Maybe it's just
noise? Rasmus, can you please take a look?

Thanks,
Yury


2024-01-18 20:42:36

by Lucas De Marchi

Subject: Re: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

Hi,

Reviving this thread as now with xe driver merged we have 2 users for
a fixed-width BIT/GENMASK.

On Wed, Jun 21, 2023 at 07:20:59PM -0700, Yury Norov wrote:
>Hi Lucas, all!
>
>(Thanks, Andy, for pointing to this thread.)
>
>On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> masks for fixed-width types and also the corresponding BIT_U32(),
>> BIT_U16() and BIT_U8().
>
>Can you split BIT() and GENMASK() material to separate patches?
>
>> All of those depend on a new "U" suffix added to the integer constant.
>> Due to naming clashes it's better to call the macro U32. Since C doesn't
>> have a proper suffix for short and char types, the U16 and U8 variants
>> just use U32 with one additional check in the BIT_* macros to make
>> sure the compiler gives an error when those types overflow.
>
>I feel like I don't understand the sentence...
>
>> The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>> as otherwise they would allow an invalid bit to be passed. Hence
>> implement them in include/linux/bits.h rather than together with
>> the other BIT* variants.
>
>I don't think it's a good way to go because BIT() belongs to a more basic
>level than GENMASK(). Not mentioning possible header dependency issues.
>If you need to test against tighter numeric region, I'd suggest to
>do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
>directly. Something like:
> #define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))

but then make uapi/linux/const.h include linux/build_bug.h?
I was thinking about leaving BIT() define where it is, and add the
fixed-width versions in this header. I was thinking uapi/linux/const.h
was more about allowing the U/ULL suffixes for things shared with asm.

Lucas De Marchi

>
>> The following test file is used to test this:
>>
>> $ cat mask.c
>> #include <linux/types.h>
>> #include <linux/bits.h>
>>
>> static const u32 a = GENMASK_U32(31, 0);
>> static const u16 b = GENMASK_U16(15, 0);
>> static const u8 c = GENMASK_U8(7, 0);
>> static const u32 x = BIT_U32(31);
>> static const u16 y = BIT_U16(15);
>> static const u8 z = BIT_U8(7);
>>
>> #if FAIL
>> static const u32 a2 = GENMASK_U32(32, 0);
>> static const u16 b2 = GENMASK_U16(16, 0);
>> static const u8 c2 = GENMASK_U8(8, 0);
>> static const u32 x2 = BIT_U32(32);
>> static const u16 y2 = BIT_U16(16);
>> static const u8 z2 = BIT_U8(8);
>> #endif
>>
>> Signed-off-by: Lucas De Marchi <[email protected]>
>> ---
>> include/linux/bits.h | 22 ++++++++++++++++++++++
>> include/uapi/linux/const.h | 2 ++
>> include/vdso/const.h | 1 +
>> 3 files changed, 25 insertions(+)
>>
>> diff --git a/include/linux/bits.h b/include/linux/bits.h
>> index 7c0cf5031abe..ff4786c99b8c 100644
>> --- a/include/linux/bits.h
>> +++ b/include/linux/bits.h
>> @@ -42,4 +42,26 @@
>> #define GENMASK_ULL(h, l) \
>> (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>>
>> +#define __GENMASK_U32(h, l) \
>> + (((~U32(0)) - (U32(1) << (l)) + 1) & \
>> + (~U32(0) >> (32 - 1 - (h))))
>> +#define GENMASK_U32(h, l) \
>> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U32(h, l))
>> +
>> +#define __GENMASK_U16(h, l) \
>> + ((U32(0xffff) - (U32(1) << (l)) + 1) & \
>> + (U32(0xffff) >> (16 - 1 - (h))))
>> +#define GENMASK_U16(h, l) \
>> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U16(h, l))
>> +
>> +#define __GENMASK_U8(h, l) \
>> + (((U32(0xff)) - (U32(1) << (l)) + 1) & \
>> + (U32(0xff) >> (8 - 1 - (h))))
>> +#define GENMASK_U8(h, l) \
>> + (GENMASK_INPUT_CHECK(h, l) + __GENMASK_U8(h, l))
>
>[...]
>
>I see nothing wrong with fixed-width versions of GENMASK if it helps
>people to write safer code. Can you please in commit message mention
>the exact patch(es) that added a bug related to GENMASK() misuse? It
>would be easier to advocate the purpose of new API with that in mind.
>
>Regarding implementation - we should avoid copy-pasting in cases
>like this. Below is the patch that I boot-tested for x86_64 and
>compile-tested for arm64.
>
>It looks less opencoded, and maybe Andy will be less skeptical about
>this approach because of less maintenance burden. Please take it if
>you like for v2.
>
>Thanks,
>Yury
>
>From 39c5b35075df67e7d88644470ca78a3486367c02 Mon Sep 17 00:00:00 2001
>From: Yury Norov <[email protected]>
>Date: Wed, 21 Jun 2023 15:27:29 -0700
>Subject: [PATCH] bits: introduce fixed-type genmasks
>
>Generalize __GENMASK() to support different types, and implement
>fixed-types versions of GENMASK() based on it.
>
>Signed-off-by: Yury Norov <[email protected]>
>---
> include/linux/bitops.h | 1 -
> include/linux/bits.h | 22 ++++++++++++----------
> 2 files changed, 12 insertions(+), 11 deletions(-)
>
>diff --git a/include/linux/bitops.h b/include/linux/bitops.h
>index 2ba557e067fe..1db50c69cfdb 100644
>--- a/include/linux/bitops.h
>+++ b/include/linux/bitops.h
>@@ -15,7 +15,6 @@
> # define aligned_byte_mask(n) (~0xffUL << (BITS_PER_LONG - 8 - 8*(n)))
> #endif
>
>-#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
> #define BITS_TO_LONGS(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
> #define BITS_TO_U64(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u64))
> #define BITS_TO_U32(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
>diff --git a/include/linux/bits.h b/include/linux/bits.h
>index 7c0cf5031abe..cb94128171b2 100644
>--- a/include/linux/bits.h
>+++ b/include/linux/bits.h
>@@ -6,6 +6,8 @@
> #include <vdso/bits.h>
> #include <asm/bitsperlong.h>
>
>+#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
>+
> #define BIT_MASK(nr) (UL(1) << ((nr) % BITS_PER_LONG))
> #define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
> #define BIT_ULL_MASK(nr) (ULL(1) << ((nr) % BITS_PER_LONG_LONG))
>@@ -30,16 +32,16 @@
> #define GENMASK_INPUT_CHECK(h, l) 0
> #endif
>
>-#define __GENMASK(h, l) \
>- (((~UL(0)) - (UL(1) << (l)) + 1) & \
>- (~UL(0) >> (BITS_PER_LONG - 1 - (h))))
>-#define GENMASK(h, l) \
>- (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
>+#define __GENMASK(t, h, l) \
>+ (GENMASK_INPUT_CHECK(h, l) + \
>+ (((t)~0ULL - ((t)(1) << (l)) + 1) & \
>+ ((t)~0ULL >> (BITS_PER_TYPE(t) - 1 - (h)))))
>
>-#define __GENMASK_ULL(h, l) \
>- (((~ULL(0)) - (ULL(1) << (l)) + 1) & \
>- (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
>-#define GENMASK_ULL(h, l) \
>- (GENMASK_INPUT_CHECK(h, l) + __GENMASK_ULL(h, l))
>+#define GENMASK(h, l) __GENMASK(unsigned long, h, l)
>+#define GENMASK_ULL(h, l) __GENMASK(unsigned long long, h, l)
>+#define GENMASK_U8(h, l) __GENMASK(u8, h, l)
>+#define GENMASK_U16(h, l) __GENMASK(u16, h, l)
>+#define GENMASK_U32(h, l) __GENMASK(u32, h, l)
>+#define GENMASK_U64(h, l) __GENMASK(u64, h, l)
>
> #endif /* __LINUX_BITS_H */
>--
>2.39.2
>

2024-01-18 21:48:55

by Yury Norov

Subject: Re: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Thu, Jan 18, 2024 at 02:42:12PM -0600, Lucas De Marchi wrote:
> Hi,
>
> Reviving this thread as now with xe driver merged we have 2 users for
> a fixed-width BIT/GENMASK.

Can you point where and why?

> On Wed, Jun 21, 2023 at 07:20:59PM -0700, Yury Norov wrote:
> > Hi Lucas, all!
> >
> > (Thanks, Andy, for pointing to this thread.)
> >
> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> > > Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> > > masks for fixed-width types and also the corresponding BIT_U32(),
> > > BIT_U16() and BIT_U8().
> >
> > Can you split BIT() and GENMASK() material to separate patches?
> >
> > > All of those depend on a new "U" suffix added to the integer constant.
> > > Due to naming clashes it's better to call the macro U32. Since C doesn't
> > > have a proper suffix for short and char types, the U16 and U8 variants
> > > just use U32 with one additional check in the BIT_* macros to make
> > > sure the compiler gives an error when those types overflow.
> >
> > I feel like I don't understand the sentence...
> >
> > > The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
> > > as otherwise they would allow an invalid bit to be passed. Hence
> > > implement them in include/linux/bits.h rather than together with
> > > the other BIT* variants.
> >
> > I don't think it's a good way to go because BIT() belongs to a more basic
> > level than GENMASK(). Not mentioning possible header dependency issues.
> > If you need to test against tighter numeric region, I'd suggest to
> > do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
> > directly. Something like:
> > #define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))
>
> but then make uapi/linux/const.h include linux/build_bug.h?
> I was thinking about leaving BIT() define where it is, and add the
> fixed-width versions in this header. I was thinking uapi/linux/const.h
> was more about allowing the U/ULL suffixes for things shared with asm.

You can't include kernel headers in uapi code. But you can try doing
vice-versa: implement or move the pieces you need to share to the
uapi/linux/const.h, and use them in the kernel code.

In the worst case, you can just implement the macro you need in the
uapi header, and make it working that way.

Can you confirm that my proposal increases the kernel size? If so, is
there any way to fix it? If it doesn't, I'd prefer to use the
__GENMASK() approach.

Thanks,
Yury

2024-01-18 23:25:27

by Lucas De Marchi

Subject: Re: Re: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Thu, Jan 18, 2024 at 01:48:43PM -0800, Yury Norov wrote:
>On Thu, Jan 18, 2024 at 02:42:12PM -0600, Lucas De Marchi wrote:
>> Hi,
>>
>> Reviving this thread as now with xe driver merged we have 2 users for
>> a fixed-width BIT/GENMASK.
>
>Can you point where and why?

See users of REG_GENMASK and REG_BIT in drivers/gpu/drm/i915 and
drivers/gpu/drm/xe. I think the register definition in the xe shows it
in a good way:

drivers/gpu/drm/xe/regs/xe_gt_regs.h

The GPU registers are mostly 32-bit wide. We don't want to accidentally do
something like below (s/30/33/ added for illustration purposes):

#define LSC_CHICKEN_BIT_0 XE_REG_MCR(0xe7c8)
#define DISABLE_D8_D16_COASLESCE REG_BIT(33)

Same thing for GENMASK family of macros and for registers that are 16 or
8 bits. See e.g. drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h


>
>> On Wed, Jun 21, 2023 at 07:20:59PM -0700, Yury Norov wrote:
>> > Hi Lucas, all!
>> >
>> > (Thanks, Andy, for pointing to this thread.)
>> >
>> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> > > Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> > > masks for fixed-width types and also the corresponding BIT_U32(),
>> > > BIT_U16() and BIT_U8().
>> >
>> > Can you split BIT() and GENMASK() material to separate patches?
>> >
>> > > All of those depend on a new "U" suffix added to the integer constant.
>> > > Due to naming clashes it's better to call the macro U32. Since C doesn't
>> > > have a proper suffix for short and char types, the U16 and U8 variants
>> > > just use U32 with one additional check in the BIT_* macros to make
>> > > sure the compiler gives an error when those types overflow.
>> >
>> > I feel like I don't understand the sentence...
>> >
>> > > The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>> > > as otherwise they would allow an invalid bit to be passed. Hence
>> > > implement them in include/linux/bits.h rather than together with
>> > > the other BIT* variants.
>> >
>> > I don't think it's a good way to go because BIT() belongs to a more basic
>> > level than GENMASK(). Not mentioning possible header dependency issues.
>> > If you need to test against tighter numeric region, I'd suggest to
>> > do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
>> > directly. Something like:
>> > #define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))
>>
>> but then make uapi/linux/const.h include linux/build_bug.h?
>> I was thinking about leaving BIT() define where it is, and add the
>> fixed-width versions in this header. I was thinking uapi/linux/const.h
>> was more about allowing the U/ULL suffixes for things shared with asm.
>
>You can't include kernel headers in uapi code. But you can try doing
>vice-versa: implement or move the pieces you need to share to the
>uapi/linux/const.h, and use them in the kernel code.

but in this case CONST_GE() should trigger a BUG/static_assert
on U8_MAX < x. AFAICS that check can't be on the uapi/ side,
so there's nothing much left to change in uapi/linux/const.h.

I'd expect drivers to be the primary user of these fixed-width BIT
variants, hence the proposal to do it in include/linux/bits.h.
Something like this WIP/untested diff (on top of your previous patch):


diff --git a/include/linux/bits.h b/include/linux/bits.h
index cb94128171b2..409cd10f7597 100644
--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -24,12 +24,16 @@
#define GENMASK_INPUT_CHECK(h, l) \
(BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
__is_constexpr((l) > (h)), (l) > (h), 0)))
+#define BIT_INPUT_CHECK(type, b) \
+ ((BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
+ __is_constexpr(b), (b) >= BITS_PER_TYPE(type), 0))))
#else
/*
* BUILD_BUG_ON_ZERO is not available in h files included from asm files,
* disable the input check if that is the case.
*/
#define GENMASK_INPUT_CHECK(h, l) 0
+#define BIT_INPUT_CHECK(type, b) 0
#endif

#define __GENMASK(t, h, l) \
@@ -44,4 +48,9 @@
#define GENMASK_U32(h, l) __GENMASK(u32, h, l)
#define GENMASK_U64(h, l) __GENMASK(u64, h, l)

+#define BIT_U8(b) (u8)(BIT_INPUT_CHECK(u8, b) + BIT(b))
+#define BIT_U16(b) (u16)(BIT_INPUT_CHECK(u16, b) + BIT(b))
+#define BIT_U32(b) (u32)(BIT_INPUT_CHECK(u32, b) + BIT(b))
+#define BIT_U64(b) (u64)(BIT_INPUT_CHECK(u64, b) + BIT(b))
+
#endif /* __LINUX_BITS_H */

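As a rough illustration of what the diff above is aiming for, the
following userspace sketch rejects out-of-range constant bit numbers at
compile time. Several things here are assumptions rather than the kernel
implementation: FAIL_BUILD_IF() is a hypothetical, portable stand-in for
BUILD_BUG_ON_ZERO(), BIT_CHECK() and BITS_OF() stand in for
BIT_INPUT_CHECK() and BITS_PER_TYPE(), the __builtin_choose_expr() /
__is_constexpr() handling of non-constant arguments is left out, and the
shift uses 1ULL rather than BIT() so that the 64-bit variant does not
depend on the width of unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Evaluates to 0, or fails to compile with a negative array size when
 * cond is a true constant expression. */
#define FAIL_BUILD_IF(cond) ((int)(sizeof(char[1 - 2 * !!(cond)]) - 1))

#define BITS_OF(t) (sizeof(t) * CHAR_BIT)

/* Constant b only: reject bit numbers that do not fit in type t. */
#define BIT_CHECK(t, b) FAIL_BUILD_IF((b) >= BITS_OF(t))

#define BIT_U8(b)  ((uint8_t)(BIT_CHECK(uint8_t, b) + (1ULL << (b))))
#define BIT_U16(b) ((uint16_t)(BIT_CHECK(uint16_t, b) + (1ULL << (b))))
#define BIT_U32(b) ((uint32_t)(BIT_CHECK(uint32_t, b) + (1ULL << (b))))
#define BIT_U64(b) ((uint64_t)(BIT_CHECK(uint64_t, b) + (1ULL << (b))))

void demo_fixed_width_bit(void)
{
	/* The highest valid bit of each type is accepted. */
	assert(BIT_U8(7) == 0x80);
	assert(BIT_U16(15) == 0x8000);
	assert(BIT_U32(31) == UINT32_C(0x80000000));
	assert(BIT_U64(63) == UINT64_C(0x8000000000000000));
	/* BIT_U8(8), BIT_U16(16) and BIT_U32(32) would each fail to build. */
}
```

With this shape, a definition like the REG_BIT(33) example above would
stop the build instead of quietly producing a value that cannot be
written to a 32-bit register.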
>
>In the worst case, you can just implement the macro you need in the
>uapi header, and make it working that way.
>
>Can you confirm that my proposal increases the kernel size? If so, is
>there any way to fix it? If it doesn't, I'd prefer to use the
>__GENMASK() approach.

I agree on continuing with your approach. The bloat-o-meter indeed
showed almost no difference. `size ....i915.o` on the other hand
increased, but then decreased when I replaced our current REG_GENMASK()
implementation to reuse the new GENMASK_U*()

$ # test-genmask.00: before any change
$ # test-genmask.01: after your patch to GENMASK
$ # test-genmask.02: after converting drivers/gpu/drm/i915/i915_reg_defs.h
to use the new macros
$ size build64/drivers/gpu/drm/i915/i915.o-test-genmask.*
   text    data     bss     dec     hex filename
4506628  215083    7168 4728879  48282f build64/drivers/gpu/drm/i915/i915.o-test-genmask.00
4511084  215083    7168 4733335  483997 build64/drivers/gpu/drm/i915/i915.o-test-genmask.01
4493292  215083    7168 4715543  47f417 build64/drivers/gpu/drm/i915/i915.o-test-genmask.02

$ ./scripts/bloat-o-meter build64/drivers/gpu/drm/i915/i915.o-test-genmask.0[01]
add/remove: 0/0 grow/shrink: 2/1 up/down: 4/-5 (-1)
Function                     old     new   delta
intel_drrs_activate          399     402      +3
intel_psr_invalidate         546     547      +1
intel_psr_flush              880     875      -5
Total: Before=2980530, After=2980529, chg -0.00%

$ ./scripts/bloat-o-meter build64/drivers/gpu/drm/i915/i915.o-test-genmask.0[12]
add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
Function old new delta
Total: Before=2980529, After=2980529, chg +0.00%

thanks
Lucas De Marchi

>
>Thanks,
>Yury

2024-01-19 02:02:14

by Yury Norov

Subject: Re: Re: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Thu, Jan 18, 2024 at 05:25:00PM -0600, Lucas De Marchi wrote:
> On Thu, Jan 18, 2024 at 01:48:43PM -0800, Yury Norov wrote:
> > On Thu, Jan 18, 2024 at 02:42:12PM -0600, Lucas De Marchi wrote:
> > > Hi,
> > >
> > > Reviving this thread as now with xe driver merged we have 2 users for
> > > a fixed-width BIT/GENMASK.
> >
> > Can you point where and why?
>
> See users of REG_GENMASK and REG_BIT in drivers/gpu/drm/i915 and
> drivers/gpu/drm/xe. I think the register definition in the xe shows it
> in a good way:
>
> drivers/gpu/drm/xe/regs/xe_gt_regs.h
>
> The GPU registers are mostly 32-bit wide. We don't want to accidentally do
> something like below (s/30/33/ added for illustration purposes):
>
> #define LSC_CHICKEN_BIT_0 XE_REG_MCR(0xe7c8)
> #define DISABLE_D8_D16_COASLESCE REG_BIT(33)
>
> Same thing for GENMASK family of macros and for registers that are 16 or
> 8 bits. See e.g. drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>
>
> >
> > > On Wed, Jun 21, 2023 at 07:20:59PM -0700, Yury Norov wrote:
> > > > Hi Lucas, all!
> > > >
> > > > (Thanks, Andy, for pointing to this thread.)
> > > >
> > > > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> > > > > Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
> > > > > masks for fixed-width types and also the corresponding BIT_U32(),
> > > > > BIT_U16() and BIT_U8().
> > > >
> > > > Can you split BIT() and GENMASK() material to separate patches?
> > > >
> > > > > All of those depend on a new "U" suffix added to the integer constant.
> > > > > Due to naming clashes it's better to call the macro U32. Since C doesn't
> > > > > > have a proper suffix for short and char types, the U16 and U8 variants
> > > > > > just use U32 with one additional check in the BIT_* macros to make
> > > > > > sure the compiler gives an error when those types overflow.
> > > >
> > > > I feel like I don't understand the sentence...
> > > >
> > > > > The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
> > > > > as otherwise they would allow an invalid bit to be passed. Hence
> > > > > implement them in include/linux/bits.h rather than together with
> > > > > the other BIT* variants.
> > > >
> > > > I don't think it's a good way to go because BIT() belongs to a more basic
> > > > level than GENMASK(). Not mentioning possible header dependency issues.
> > > > If you need to test against tighter numeric region, I'd suggest to
> > > > do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
> > > > directly. Something like:
> > > > #define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))
> > >
> > > but then make uapi/linux/const.h include linux/build_bug.h?
> > > I was thinking about leaving BIT() define where it is, and add the
> > > fixed-width versions in this header. I was thinking uapi/linux/const.h
> > > was more about allowing the U/ULL suffixes for things shared with asm.
> >
> > You can't include kernel headers in uapi code. But you can try doing
> > vice-versa: implement or move the pieces you need to share to the
> > uapi/linux/const.h, and use them in the kernel code.
>
> but in this case CONST_GE() should trigger a BUG/static_assert
> on U8_MAX < x. AFAICS that check can't be on the uapi/ side,
> so there's nothing much left to change in uapi/linux/const.h.
>
> I'd expect drivers to be the primary user of these fixed-width BIT
> variants, hence the proposal to do it in include/linux/bits.h.
> Something like this WIP/untested diff (on top of your previous patch):
>
>
> diff --git a/include/linux/bits.h b/include/linux/bits.h
> index cb94128171b2..409cd10f7597 100644
> --- a/include/linux/bits.h
> +++ b/include/linux/bits.h
> @@ -24,12 +24,16 @@
> #define GENMASK_INPUT_CHECK(h, l) \
> (BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
> __is_constexpr((l) > (h)), (l) > (h), 0)))
> +#define BIT_INPUT_CHECK(type, b) \
> + ((BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
> + __is_constexpr(b), (b) >= BITS_PER_TYPE(type), 0))))
> #else
> /*
> * BUILD_BUG_ON_ZERO is not available in h files included from asm files,
> * disable the input check if that is the case.
> */
> #define GENMASK_INPUT_CHECK(h, l) 0
> +#define BIT_INPUT_CHECK(type, b) 0
> #endif
> #define __GENMASK(t, h, l) \
> @@ -44,4 +48,9 @@
> #define GENMASK_U32(h, l) __GENMASK(u32, h, l)
> #define GENMASK_U64(h, l) __GENMASK(u64, h, l)
> +#define BIT_U8(b) (u8)(BIT_INPUT_CHECK(u8, b) + BIT(b))
> +#define BIT_U16(b) (u16)(BIT_INPUT_CHECK(u16, b) + BIT(b))
> +#define BIT_U32(b) (u32)(BIT_INPUT_CHECK(u32, b) + BIT(b))
> +#define BIT_U64(b) (u64)(BIT_INPUT_CHECK(u64, b) + BIT(b))

Can you add some vertical spacing here, like between GENMASK and BIT
blocks?

> +
> #endif /* __LINUX_BITS_H */
>
> >
> > In the worst case, you can just implement the macro you need in the
> > uapi header, and make it working that way.
> >
> > Can you confirm that my proposal increases the kernel size? If so, is
> > there any way to fix it? If it doesn't, I'd prefer to use the
> > __GENMASK() approach.
>
> I agree on continuing with your approach. The bloat-o-meter indeed
> showed almost no difference. `size ....i915.o` on the other hand
> increased, but then decreased when I replaced our current REG_GENMASK()
> implementation to reuse the new GENMASK_U*()
>
> $ # test-genmask.00: before any change
> $ # test-genmask.01: after your patch to GENMASK
> $ # test-genmask.02: after converting drivers/gpu/drm/i915/i915_reg_defs.h
> to use the new macros
> $ size build64/drivers/gpu/drm/i915/i915.o-test-genmask.*
> text data bss dec hex filename
> 4506628 215083 7168 4728879 48282f build64/drivers/gpu/drm/i915/i915.o-test-genmask.00
> 4511084 215083 7168 4733335 483997 build64/drivers/gpu/drm/i915/i915.o-test-genmask.01
> 4493292 215083 7168 4715543 47f417 build64/drivers/gpu/drm/i915/i915.o-test-genmask.02
>
> $ ./scripts/bloat-o-meter build64/drivers/gpu/drm/i915/i915.o-test-genmask.0[01]
> add/remove: 0/0 grow/shrink: 2/1 up/down: 4/-5 (-1)
> Function old new delta
> intel_drrs_activate 399 402 +3
> intel_psr_invalidate 546 547 +1
> intel_psr_flush 880 875 -5
> Total: Before=2980530, After=2980529, chg -0.00%
>
> $ ./scripts/bloat-o-meter build64/drivers/gpu/drm/i915/i915.o-test-genmask.0[12]
> add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
> Function old new delta
> Total: Before=2980529, After=2980529, chg +0.00%

OK then. With the above approach, the fixed-type BIT() macros look like
wrappers around the plain BIT(), and I think we can live with that.
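
For readers following along outside the kernel tree, the wrapper idea
can be approximated in plain userspace C. This is only a rough sketch
under stated assumptions: every SKETCH_* name is hypothetical, a
negative-array-size trick stands in for BUILD_BUG_ON_ZERO(), and the
__is_constexpr() handling from the diff is left out, so it only works
with constant bit positions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; the SKETCH_ names are not kernel API. */
#define SKETCH_BITS_PER_TYPE(type)	(sizeof(type) * 8)

/*
 * Evaluates to 0 when the constant expression e is false and fails to
 * compile (negative array size) when it is true.
 */
#define SKETCH_BUILD_BUG_ON_ZERO(e)	((int)(sizeof(char[1 - 2 * !!(e)]) - 1))

/* Reject bit positions that do not fit in the target type. */
#define SKETCH_BIT_INPUT_CHECK(type, b) \
	SKETCH_BUILD_BUG_ON_ZERO((b) >= SKETCH_BITS_PER_TYPE(type))

/* Plain, unchecked bit macro that the fixed-width variants wrap. */
#define SKETCH_BIT(b)		(1ULL << (b))

#define SKETCH_BIT_U8(b)	((uint8_t)(SKETCH_BIT_INPUT_CHECK(uint8_t, b) + SKETCH_BIT(b)))
#define SKETCH_BIT_U16(b)	((uint16_t)(SKETCH_BIT_INPUT_CHECK(uint16_t, b) + SKETCH_BIT(b)))
#define SKETCH_BIT_U32(b)	((uint32_t)(SKETCH_BIT_INPUT_CHECK(uint32_t, b) + SKETCH_BIT(b)))
#define SKETCH_BIT_U64(b)	((uint64_t)(SKETCH_BIT_INPUT_CHECK(uint64_t, b) + SKETCH_BIT(b)))

/* In-range positions compile and yield the expected constants. */
static_assert(SKETCH_BIT_U8(7) == 0x80, "top bit of a u8");
static_assert(SKETCH_BIT_U16(15) == 0x8000, "top bit of a u16");

/* Uncommenting the next line should break the build: bit 8 is not in a u8. */
/* static_assert(SKETCH_BIT_U8(8) == 0, "never reached"); */

/* Small helper so the macros can be exercised from a test or a main(). */
static inline uint32_t sketch_top_bits(void)
{
	return SKETCH_BIT_U8(7) | SKETCH_BIT_U16(15) | SKETCH_BIT_U32(31);
}
```

The only thing each fixed-width variant adds over the plain macro is
the range check and the cast to the target type, which is why they read
as thin wrappers.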

Can you send all the material as a proper series, including my
GENMASK patch, your patch above and a patch that switches your driver
to using the new API? I'll then take it into bitmap-for-next once the
merge window closes.

Thanks,
Yury

2024-01-19 15:07:35

by Lucas De Marchi

[permalink] [raw]
Subject: Re: Re: Re: Re: [Intel-xe] [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

On Thu, Jan 18, 2024 at 06:01:58PM -0800, Yury Norov wrote:
>On Thu, Jan 18, 2024 at 05:25:00PM -0600, Lucas De Marchi wrote:
>>
>> On Thu, Jan 18, 2024 at 01:48:43PM -0800, Yury Norov wrote:
>> > On Thu, Jan 18, 2024 at 02:42:12PM -0600, Lucas De Marchi wrote:
>> > > Hi,
>> > >
>> > > Reviving this thread as now with xe driver merged we have 2 users for
>> > > a fixed-width BIT/GENMASK.
>> >
>> > Can you point where and why?
>>
>> See users of REG_GENMASK and REG_BIT in drivers/gpu/drm/i915 and
>> drivers/gpu/drm/xe. I think the register definition in the xe shows it
>> in a good way:
>>
>> drivers/gpu/drm/xe/regs/xe_gt_regs.h
>>
>> The GPU registers are mostly 32-bit wide. We don't want to accidentally do
>> something like below (s/30/33/ added for illustration purposes):
>>
>> #define LSC_CHICKEN_BIT_0 XE_REG_MCR(0xe7c8)
>> #define DISABLE_D8_D16_COASLESCE REG_BIT(33)
>>
>> Same thing for GENMASK family of macros and for registers that are 16 or
>> 8 bits. See e.g. drivers/gpu/drm/i915/display/intel_cx0_phy_regs.h
>>
>>
>> >
>> > > On Wed, Jun 21, 2023 at 07:20:59PM -0700, Yury Norov wrote:
>> > > > Hi Lucas, all!
>> > > >
>> > > > (Thanks, Andy, for pointing to this thread.)
>> > > >
>> > > > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
>> > > > > Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8() macros to create
>> > > > > masks for fixed-width types and also the corresponding BIT_U32(),
>> > > > > BIT_U16() and BIT_U8().
>> > > >
>> > > > Can you split BIT() and GENMASK() material to separate patches?
>> > > >
>> > > > > All of those depend on a new "U" suffix added to the integer constant.
>> > > > > Due to naming clashes it's better to call the macro U32. Since C doesn't
>> > > > > have a proper suffix for short and char types, the U16 and U8 variants
>> > > > > just use U32 with one additional check in the BIT_* macros to make
>> > > > > sure the compiler gives an error when those types overflow.
>> > > >
>> > > > I feel like I don't understand the sentence...
>> > > >
>> > > > > The BIT_U16() and BIT_U8() need the help of GENMASK_INPUT_CHECK(),
>> > > > > as otherwise they would allow an invalid bit to be passed. Hence
>> > > > > implement them in include/linux/bits.h rather than together with
>> > > > > the other BIT* variants.
>> > > >
>> > > > I don't think it's a good way to go because BIT() belongs to a more basic
>> > > > level than GENMASK(). Not mentioning possible header dependency issues.
>> > > > If you need to test against tighter numeric region, I'd suggest to
>> > > > do the same trick as GENMASK_INPUT_CHECK() does, but in uapi/linux/const.h
>> > > > directly. Something like:
>> > > > #define _U8(x) (CONST_GT(U8_MAX, x) + _AC(x, U))
>> > >
>> > > but then make uapi/linux/const.h include linux/build_bug.h?
>> > > I was thinking about leaving BIT() define where it is, and add the
>> > > fixed-width versions in this header. I was thinking uapi/linux/const.h
>> > > was more about allowing the U/ULL suffixes for things shared with asm.
>> >
>> > You can't include kernel headers in uapi code. But you can try doing
>> > vice-versa: implement or move the pieces you need to share to the
>> > uapi/linux/const.h, and use them in the kernel code.
>>
>> but in this case CONST_GE() should trigger a BUG/static_assert
>> on U8_MAX < x. AFAICS that check can't be on the uapi/ side,
>> so there's nothing much left to change in uapi/linux/const.h.
>>
>> I'd expect drivers to be the primary user of these fixed-width BIT
>> variants, hence the proposal to do it in include/linux/bits.h.
>> Something like this WIP/untested diff (on top of your previous patch):
>>
>>
>> diff --git a/include/linux/bits.h b/include/linux/bits.h
>> index cb94128171b2..409cd10f7597 100644
>> --- a/include/linux/bits.h
>> +++ b/include/linux/bits.h
>> @@ -24,12 +24,16 @@
>> #define GENMASK_INPUT_CHECK(h, l) \
>> (BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
>> __is_constexpr((l) > (h)), (l) > (h), 0)))
>> +#define BIT_INPUT_CHECK(type, b) \
>> + ((BUILD_BUG_ON_ZERO(__builtin_choose_expr( \
>> + __is_constexpr(b), (b) >= BITS_PER_TYPE(type), 0))))
>> #else
>> /*
>> * BUILD_BUG_ON_ZERO is not available in h files included from asm files,
>> * disable the input check if that is the case.
>> */
>> #define GENMASK_INPUT_CHECK(h, l) 0
>> +#define BIT_INPUT_CHECK(type, b) 0
>> #endif
>> #define __GENMASK(t, h, l) \
>> @@ -44,4 +48,9 @@
>> #define GENMASK_U32(h, l) __GENMASK(u32, h, l)
>> #define GENMASK_U64(h, l) __GENMASK(u64, h, l)
>> +#define BIT_U8(b) (u8)(BIT_INPUT_CHECK(u8, b) + BIT(b))
>> +#define BIT_U16(b) (u16)(BIT_INPUT_CHECK(u16, b) + BIT(b))
>> +#define BIT_U32(b) (u32)(BIT_INPUT_CHECK(u32, b) + BIT(b))
>> +#define BIT_U64(b) (u64)(BIT_INPUT_CHECK(u64, b) + BIT(b))
>
>Can you add some vertical spacing here, like between GENMASK and BIT
>blocks?

I think gmail mangled this, because it does show up with more vertical
space on the email I sent:
https://lore.kernel.org/all/clamvpymzwiehjqd6jhuigymyg5ikxewxyeee2eae4tgzmaz7u@6rposizee3t6/

Anyway, I will clean this up and probably add some docs about its usage.
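
As a rough idea of what such usage notes could illustrate, here is a
userspace sketch of a fixed-width mask helper next to the silent
truncation it is meant to prevent. It is not the kernel implementation:
all SKETCH_* names and the register field names are hypothetical, the
build-time check is a negative-array-size trick, and the extra
"h fits in the type" check is an assumption added for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; the SKETCH_ names are not kernel API. */
#define SKETCH_BITS_PER_TYPE(t)		(sizeof(t) * 8)
#define SKETCH_BUILD_BUG_ON_ZERO(e)	((int)(sizeof(char[1 - 2 * !!(e)]) - 1))

/*
 * Mask with bits h..l set, computed in type t. Fails to compile when
 * l > h or when h does not fit in t (constant arguments only).
 */
#define SKETCH_GENMASK(t, h, l) \
	((t)(SKETCH_BUILD_BUG_ON_ZERO((l) > (h) || \
				      (h) >= SKETCH_BITS_PER_TYPE(t)) + \
	     (((t)~0ULL - ((t)1 << (l)) + 1) & \
	      ((t)~0ULL >> (SKETCH_BITS_PER_TYPE(t) - 1 - (h))))))

/* Made-up fields of a 32-bit and an 8-bit register. */
#define SKETCH_REG32_FIELD_MODE		SKETCH_GENMASK(uint32_t, 15, 8)
#define SKETCH_REG8_FIELD_HIGH_NIBBLE	SKETCH_GENMASK(uint8_t, 7, 4)

static_assert(SKETCH_REG32_FIELD_MODE == 0xFF00, "bits 15..8");
static_assert(SKETCH_REG8_FIELD_HIGH_NIBBLE == 0xF0, "bits 7..4");

/* Uncommenting the next line should break the build: bit 33 is not in a u32. */
/* static_assert(SKETCH_GENMASK(uint32_t, 33, 30) == 0, "never reached"); */

/*
 * The hazard: an unchecked 64-bit constant for bit 33 quietly becomes 0
 * once it is stored in a 32-bit register value.
 */
static inline uint32_t sketch_unchecked_bit33(void)
{
	return (uint32_t)(1ULL << 33);
}
```

With the unchecked form the mistake only shows up at runtime as a write
that sets nothing; with the checked form it is a build failure at the
register definition.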

>
>> +
>> #endif /* __LINUX_BITS_H */
>>
>> >
>> > In the worst case, you can just implement the macro you need in the
>> > uapi header, and make it working that way.
>> >
>> > Can you confirm that my proposal increases the kernel size? If so, is
>> > there any way to fix it? If it doesn't, I'd prefer to use the
>> > __GENMASK() approach.
>>
>> I agree on continuing with your approach. The bloat-o-meter indeed
>> showed almost no difference. `size ....i915.o` on the other hand
>> increased, but then decreased when I replaced our current REG_GENMASK()
>> implementation to reuse the new GENMASK_U*()
>>
>> $ # test-genmask.00: before any change
>> $ # test-genmask.01: after your patch to GENMASK
>> $ # test-genmask.02: after converting drivers/gpu/drm/i915/i915_reg_defs.h
>> to use the new macros
>> $ size build64/drivers/gpu/drm/i915/i915.o-test-genmask.*
>> text data bss dec hex filename
>> 4506628 215083 7168 4728879 48282f build64/drivers/gpu/drm/i915/i915.o-test-genmask.00
>> 4511084 215083 7168 4733335 483997 build64/drivers/gpu/drm/i915/i915.o-test-genmask.01
>> 4493292 215083 7168 4715543 47f417 build64/drivers/gpu/drm/i915/i915.o-test-genmask.02
>>
>> $ ./scripts/bloat-o-meter build64/drivers/gpu/drm/i915/i915.o-test-genmask.0[01]
>> add/remove: 0/0 grow/shrink: 2/1 up/down: 4/-5 (-1)
>> Function old new delta
>> intel_drrs_activate 399 402 +3
>> intel_psr_invalidate 546 547 +1
>> intel_psr_flush 880 875 -5
>> Total: Before=2980530, After=2980529, chg -0.00%
>>
>> $ ./scripts/bloat-o-meter build64/drivers/gpu/drm/i915/i915.o-test-genmask.0[12]
>> add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
>> Function old new delta
>> Total: Before=2980529, After=2980529, chg +0.00%
>
>OK then. With the above approach, the fixed-type BIT() macros look like
>wrappers around the plain BIT(), and I think we can live with that.
>
>Can you send all the material as a proper series, including my
>GENMASK patch, your patch above and a patch that switches your driver
>to using the new API? I'll then take it into bitmap-for-next once the
>merge window closes.

sure, thanks


Lucas De Marchi

>
>Thanks,
>Yury