2021-03-19 12:21:31

by Greg KH

Subject: [PATCH 5.4 00/18] 5.4.107-rc1 review

This is the start of the stable review cycle for the 5.4.107 release.
There are 18 patches in this series, all of which will be posted as
responses to this one. If anyone has any issues with these being
applied, please let me know.

Responses should be made by Sun, 21 Mar 2021 12:17:37 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.107-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 5.4.107-rc1

Florian Fainelli <[email protected]>
net: dsa: b53: Support setting learning on port

DENG Qingfang <[email protected]>
net: dsa: tag_mtk: fix 802.1ad VLAN egress

Ard Biesheuvel <[email protected]>
crypto: x86/aes-ni-xts - use direct calls to and 4-way stride

Uros Bizjak <[email protected]>
crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg

Kees Cook <[email protected]>
crypto: x86 - Regularize glue function prototypes

Amir Goldstein <[email protected]>
fuse: fix live lock in fuse_iget()

Colin Xu <[email protected]>
drm/i915/gvt: Fix vfio_edid issue for BXT/APL

Colin Xu <[email protected]>
drm/i915/gvt: Fix port number for BDW on EDID region setup

Colin Xu <[email protected]>
drm/i915/gvt: Fix virtual display setup for BXT/APL

Colin Xu <[email protected]>
drm/i915/gvt: Fix mmio handler break on BXT/APL.

Colin Xu <[email protected]>
drm/i915/gvt: Set SNOOP for PAT3 on BXT/APL to workaround GPU BB hang

Qu Wenruo <[email protected]>
btrfs: scrub: Don't check free space before marking a block group RO

Piotr Krysiuk <[email protected]>
bpf, selftests: Fix up some test_verifier cases for unprivileged

Piotr Krysiuk <[email protected]>
bpf: Add sanity check for upper ptr_limit

Piotr Krysiuk <[email protected]>
bpf: Simplify alu_limit masking for pointer arithmetic

Piotr Krysiuk <[email protected]>
bpf: Fix off-by-one for area size in creating mask to left

Piotr Krysiuk <[email protected]>
bpf: Prohibit alu ops for pointer types not defining ptr_limit

Suzuki K Poulose <[email protected]>
KVM: arm64: nvhe: Save the SPE context early


-------------

Diffstat:

Makefile | 4 +-
arch/arm64/include/asm/kvm_hyp.h | 3 +
arch/arm64/kvm/hyp/debug-sr.c | 24 ++-
arch/arm64/kvm/hyp/switch.c | 13 +-
arch/x86/crypto/aesni-intel_asm.S | 137 +++++++------
arch/x86/crypto/aesni-intel_avx-x86_64.S | 20 +-
arch/x86/crypto/aesni-intel_glue.c | 54 +++---
arch/x86/crypto/camellia_aesni_avx2_glue.c | 74 ++++---
arch/x86/crypto/camellia_aesni_avx_glue.c | 72 ++++---
arch/x86/crypto/camellia_glue.c | 45 +++--
arch/x86/crypto/cast6_avx_glue.c | 68 +++----
arch/x86/crypto/glue_helper.c | 23 ++-
arch/x86/crypto/serpent_avx2_glue.c | 65 +++----
arch/x86/crypto/serpent_avx_glue.c | 63 +++---
arch/x86/crypto/serpent_sse2_glue.c | 30 +--
arch/x86/crypto/twofish_avx_glue.c | 75 ++++----
arch/x86/crypto/twofish_glue_3way.c | 37 ++--
arch/x86/include/asm/crypto/camellia.h | 63 +++---
arch/x86/include/asm/crypto/glue_helper.h | 18 +-
arch/x86/include/asm/crypto/serpent-avx.h | 20 +-
arch/x86/include/asm/crypto/serpent-sse2.h | 28 ++-
arch/x86/include/asm/crypto/twofish.h | 19 +-
crypto/cast6_generic.c | 18 +-
crypto/serpent_generic.c | 6 +-
drivers/gpu/drm/i915/gvt/display.c | 212 +++++++++++++++++++++
drivers/gpu/drm/i915/gvt/handlers.c | 40 +++-
drivers/gpu/drm/i915/gvt/mmio.c | 5 +
drivers/gpu/drm/i915/gvt/vgpu.c | 5 +-
drivers/net/dsa/b53/b53_common.c | 18 ++
drivers/net/dsa/b53/b53_regs.h | 1 +
drivers/net/dsa/bcm_sf2.c | 5 -
fs/btrfs/block-group.c | 48 +++--
fs/btrfs/block-group.h | 3 +-
fs/btrfs/relocation.c | 2 +-
fs/btrfs/scrub.c | 21 +-
fs/fuse/fuse_i.h | 1 +
include/crypto/cast6.h | 4 +-
include/crypto/serpent.h | 4 +-
include/crypto/xts.h | 2 -
kernel/bpf/verifier.c | 33 ++--
net/dsa/tag_mtk.c | 19 +-
.../selftests/bpf/verifier/bounds_deduction.c | 27 ++-
tools/testing/selftests/bpf/verifier/unpriv.c | 15 +-
.../selftests/bpf/verifier/value_ptr_arith.c | 23 ++-
44 files changed, 920 insertions(+), 547 deletions(-)



2021-03-19 12:22:01

by Greg KH

Subject: [PATCH 5.4 02/18] bpf: Prohibit alu ops for pointer types not defining ptr_limit

From: Piotr Krysiuk <[email protected]>

commit f232326f6966cf2a1d1db7bc917a4ce5f9f55f76 upstream.

The purpose of this patch is to streamline error propagation, and in particular
to propagate retrieve_ptr_limit() errors for pointer types that do not define
a ptr_limit, so that register-based alu ops against these types can be rejected.

The main rationale is that a gap has been identified by Piotr in the existing
protection against speculatively out-of-bounds loads, for example, in case of
ctx pointers, unprivileged programs can still perform pointer arithmetic. This
can be abused to execute speculatively out-of-bounds loads without restrictions
and thus extract contents of kernel memory.

Fix this by rejecting unprivileged programs that attempt any pointer arithmetic
on unprotected pointer types. The two affected ones are pointer to ctx as well
as pointer to map. Field access through a modified ctx pointer is rejected at
a later point in the verifier, and 7c6967326267 ("bpf: Permit map_ptr
arithmetic with opcode add and offset 0") is only relevant for root-only use
cases.
Risk of unprivileged program breakage is considered very low.
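
As an illustration only (my sketch, not part of the patch or of the
selftests), the kind of unprivileged sequence that is now rejected looks
roughly like this in test_verifier notation, assuming r1 holds the ctx
pointer on entry:

	BPF_MOV64_IMM(BPF_REG_2, 8),
	BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_2),	/* ctx += r2 */
	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_EXIT_INSN(),

The ctx type defines no ptr_limit, so sanitize_ptr_alu() now propagates
the retrieve_ptr_limit() error and the program is rejected for
unprivileged users.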

Fixes: 7c6967326267 ("bpf: Permit map_ptr arithmetic with opcode add and offset 0")
Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/bpf/verifier.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4341,6 +4341,7 @@ static int sanitize_ptr_alu(struct bpf_v
u32 alu_state, alu_limit;
struct bpf_reg_state tmp;
bool ret;
+ int err;

if (can_skip_alu_sanitation(env, insn))
return 0;
@@ -4356,10 +4357,13 @@ static int sanitize_ptr_alu(struct bpf_v
alu_state |= ptr_is_dst_reg ?
BPF_ALU_SANITIZE_SRC : BPF_ALU_SANITIZE_DST;

- if (retrieve_ptr_limit(ptr_reg, &alu_limit, opcode, off_is_neg))
- return 0;
- if (update_alu_sanitation_state(aux, alu_state, alu_limit))
- return -EACCES;
+ err = retrieve_ptr_limit(ptr_reg, &alu_limit, opcode, off_is_neg);
+ if (err < 0)
+ return err;
+
+ err = update_alu_sanitation_state(aux, alu_state, alu_limit);
+ if (err < 0)
+ return err;
do_sim:
/* Simulate and find potential out-of-bounds access under
* speculative execution from truncation as a result of
@@ -4467,7 +4471,7 @@ static int adjust_ptr_min_max_vals(struc
case BPF_ADD:
ret = sanitize_ptr_alu(env, insn, ptr_reg, dst_reg, smin_val < 0);
if (ret < 0) {
- verbose(env, "R%d tried to add from different maps or paths\n", dst);
+ verbose(env, "R%d tried to add from different maps, paths, or prohibited types\n", dst);
return ret;
}
/* We can take a fixed offset as long as it doesn't overflow
@@ -4522,7 +4526,7 @@ static int adjust_ptr_min_max_vals(struc
case BPF_SUB:
ret = sanitize_ptr_alu(env, insn, ptr_reg, dst_reg, smin_val < 0);
if (ret < 0) {
- verbose(env, "R%d tried to sub from different maps or paths\n", dst);
+ verbose(env, "R%d tried to sub from different maps, paths, or prohibited types\n", dst);
return ret;
}
if (dst_reg == off_reg) {


2021-03-19 12:22:01

by Greg KH

Subject: [PATCH 5.4 04/18] bpf: Simplify alu_limit masking for pointer arithmetic

From: Piotr Krysiuk <[email protected]>

commit b5871dca250cd391885218b99cc015aca1a51aea upstream.

Instead of emitting the mov32 with an aux->alu_limit - 1 immediate, move
this subtraction into retrieve_ptr_limit() to simplify the logic and to
allow for subsequent sanity boundary checks inside retrieve_ptr_limit().
This avoids a future situation where, at the time of the verifier masking
rewrite, we would run into an underflow that would not sign extend due to
the nature of the mov32 instruction.
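
For context, this is roughly how the masking sequence in
fixup_bpf_calls() ends up after the change; the first four lines appear
in the hunk below, the trailing shift/and steps are reproduced from
memory of the surrounding code, and the comments are mine:

	*patch++ = BPF_MOV32_IMM(BPF_REG_AX, aux->alu_limit);	  /* -1 already folded in  */
	*patch++ = BPF_ALU64_REG(BPF_SUB, BPF_REG_AX, off_reg);  /* AX = limit - off      */
	*patch++ = BPF_ALU64_REG(BPF_OR, BPF_REG_AX, off_reg);   /* sign set if off < 0   */
	*patch++ = BPF_ALU64_IMM(BPF_NEG, BPF_REG_AX, 0);	  /*   or off > limit      */
	*patch++ = BPF_ALU64_IMM(BPF_ARSH, BPF_REG_AX, 63);	  /* all-ones or all-zeros */
	*patch++ = BPF_ALU64_REG(BPF_AND, BPF_REG_AX, off_reg);  /* off in range, else 0  */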

Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/bpf/verifier.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4277,16 +4277,16 @@ static int retrieve_ptr_limit(const stru
*/
off = ptr_reg->off + ptr_reg->var_off.value;
if (mask_to_left)
- *ptr_limit = MAX_BPF_STACK + off + 1;
+ *ptr_limit = MAX_BPF_STACK + off;
else
- *ptr_limit = -off;
+ *ptr_limit = -off - 1;
return 0;
case PTR_TO_MAP_VALUE:
if (mask_to_left) {
- *ptr_limit = ptr_reg->umax_value + ptr_reg->off + 1;
+ *ptr_limit = ptr_reg->umax_value + ptr_reg->off;
} else {
off = ptr_reg->smin_value + ptr_reg->off;
- *ptr_limit = ptr_reg->map_ptr->value_size - off;
+ *ptr_limit = ptr_reg->map_ptr->value_size - off - 1;
}
return 0;
default:
@@ -9081,7 +9081,7 @@ static int fixup_bpf_calls(struct bpf_ve
off_reg = issrc ? insn->src_reg : insn->dst_reg;
if (isneg)
*patch++ = BPF_ALU64_IMM(BPF_MUL, off_reg, -1);
- *patch++ = BPF_MOV32_IMM(BPF_REG_AX, aux->alu_limit - 1);
+ *patch++ = BPF_MOV32_IMM(BPF_REG_AX, aux->alu_limit);
*patch++ = BPF_ALU64_REG(BPF_SUB, BPF_REG_AX, off_reg);
*patch++ = BPF_ALU64_REG(BPF_OR, BPF_REG_AX, off_reg);
*patch++ = BPF_ALU64_IMM(BPF_NEG, BPF_REG_AX, 0);


2021-03-19 12:22:01

by Greg KH

Subject: [PATCH 5.4 05/18] bpf: Add sanity check for upper ptr_limit

From: Piotr Krysiuk <[email protected]>

commit 1b1597e64e1a610c7a96710fc4717158e98a08b3 upstream.

Given that we know the max possible value of ptr_limit at the time of
retrieving the latter, add basic assertions so that the verifier can bail
out and reject the program if anything looks odd. Nothing has triggered
this so far, but it also does not hurt to have these checks.
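
A worked illustration with my own numbers (MAX_BPF_STACK is 512): for a
PTR_TO_STACK register at fp-16 with masking to the left,

	max        = MAX_BPF_STACK + mask_to_left;	/* 512 + 1 = 513      */
	*ptr_limit = MAX_BPF_STACK + off;		/* 512 - 16 = 496, OK */

so the check passes; any combination that drove *ptr_limit to 513 or
beyond would now be rejected with -ERANGE instead of silently producing
an oversized mask.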

Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/bpf/verifier.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4268,10 +4268,14 @@ static int retrieve_ptr_limit(const stru
{
bool mask_to_left = (opcode == BPF_ADD && off_is_neg) ||
(opcode == BPF_SUB && !off_is_neg);
- u32 off;
+ u32 off, max;

switch (ptr_reg->type) {
case PTR_TO_STACK:
+ /* Offset 0 is out-of-bounds, but acceptable start for the
+ * left direction, see BPF_REG_FP.
+ */
+ max = MAX_BPF_STACK + mask_to_left;
/* Indirect variable offset stack access is prohibited in
* unprivileged mode so it's not handled here.
*/
@@ -4280,15 +4284,16 @@ static int retrieve_ptr_limit(const stru
*ptr_limit = MAX_BPF_STACK + off;
else
*ptr_limit = -off - 1;
- return 0;
+ return *ptr_limit >= max ? -ERANGE : 0;
case PTR_TO_MAP_VALUE:
+ max = ptr_reg->map_ptr->value_size;
if (mask_to_left) {
*ptr_limit = ptr_reg->umax_value + ptr_reg->off;
} else {
off = ptr_reg->smin_value + ptr_reg->off;
*ptr_limit = ptr_reg->map_ptr->value_size - off - 1;
}
- return 0;
+ return *ptr_limit >= max ? -ERANGE : 0;
default:
return -EINVAL;
}


2021-03-19 12:22:03

by Greg KH

Subject: [PATCH 5.4 06/18] bpf, selftests: Fix up some test_verifier cases for unprivileged

From: Piotr Krysiuk <[email protected]>

commit 0a13e3537ea67452d549a6a80da3776d6b7dedb3 upstream.

Fix up test_verifier error messages for the case where the original error
message changed, or for the case where pointer alu errors differ between
privileged and unprivileged tests. Also, add alternative tests for keeping
coverage of the original verifier rejection error message (fp alu), and
newly reject map_ptr += rX where rX == 0 given we now forbid alu on these
types for unprivileged. All test_verifier cases pass after the change. The
test case fixups were kept separate to ease backporting of core changes.
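
For reference only (my sketch, not one of the added tests verbatim), the
newly rejected unprivileged pattern is a bare map pointer plus a register
that happens to hold 0:

	BPF_MOV64_IMM(BPF_REG_0, 0),
	BPF_LD_MAP_FD(BPF_REG_1, 0),			/* r1 = map pointer */
	BPF_MOV64_IMM(BPF_REG_2, 0),
	BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_2),	/* map_ptr += 0     */
	BPF_EXIT_INSN(),

together with a .fixup_map_* entry pointing at the BPF_LD_MAP_FD
instruction; privileged runs still accept it, while unprivileged ones now
see the "prohibited types" error.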

Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/testing/selftests/bpf/verifier/bounds_deduction.c | 27 +++++++++++-----
tools/testing/selftests/bpf/verifier/unpriv.c | 15 ++++++++
tools/testing/selftests/bpf/verifier/value_ptr_arith.c | 23 +++++++++++++
3 files changed, 55 insertions(+), 10 deletions(-)

--- a/tools/testing/selftests/bpf/verifier/bounds_deduction.c
+++ b/tools/testing/selftests/bpf/verifier/bounds_deduction.c
@@ -6,8 +6,9 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R0 tried to sub from different maps, paths, or prohibited types",
.errstr = "R0 tried to subtract pointer from scalar",
+ .result = REJECT,
},
{
"check deducing bounds from const, 2",
@@ -20,6 +21,8 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
BPF_EXIT_INSN(),
},
+ .errstr_unpriv = "R1 tried to sub from different maps, paths, or prohibited types",
+ .result_unpriv = REJECT,
.result = ACCEPT,
.retval = 1,
},
@@ -31,8 +34,9 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R0 tried to sub from different maps, paths, or prohibited types",
.errstr = "R0 tried to subtract pointer from scalar",
+ .result = REJECT,
},
{
"check deducing bounds from const, 4",
@@ -45,6 +49,8 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
BPF_EXIT_INSN(),
},
+ .errstr_unpriv = "R1 tried to sub from different maps, paths, or prohibited types",
+ .result_unpriv = REJECT,
.result = ACCEPT,
},
{
@@ -55,8 +61,9 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R0 tried to sub from different maps, paths, or prohibited types",
.errstr = "R0 tried to subtract pointer from scalar",
+ .result = REJECT,
},
{
"check deducing bounds from const, 6",
@@ -67,8 +74,9 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R0 tried to sub from different maps, paths, or prohibited types",
.errstr = "R0 tried to subtract pointer from scalar",
+ .result = REJECT,
},
{
"check deducing bounds from const, 7",
@@ -80,8 +88,9 @@
offsetof(struct __sk_buff, mark)),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R1 tried to sub from different maps, paths, or prohibited types",
.errstr = "dereference of modified ctx ptr",
+ .result = REJECT,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
},
{
@@ -94,8 +103,9 @@
offsetof(struct __sk_buff, mark)),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R1 tried to add from different maps, paths, or prohibited types",
.errstr = "dereference of modified ctx ptr",
+ .result = REJECT,
.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
},
{
@@ -106,8 +116,9 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
- .result = REJECT,
+ .errstr_unpriv = "R0 tried to sub from different maps, paths, or prohibited types",
.errstr = "R0 tried to subtract pointer from scalar",
+ .result = REJECT,
},
{
"check deducing bounds from const, 10",
@@ -119,6 +130,6 @@
BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
- .result = REJECT,
.errstr = "math between ctx pointer and register with unbounded min value is not allowed",
+ .result = REJECT,
},
--- a/tools/testing/selftests/bpf/verifier/unpriv.c
+++ b/tools/testing/selftests/bpf/verifier/unpriv.c
@@ -495,7 +495,7 @@
.result = ACCEPT,
},
{
- "unpriv: adding of fp",
+ "unpriv: adding of fp, reg",
.insns = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_MOV64_IMM(BPF_REG_1, 0),
@@ -503,6 +503,19 @@
BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, -8),
BPF_EXIT_INSN(),
},
+ .errstr_unpriv = "R1 tried to add from different maps, paths, or prohibited types",
+ .result_unpriv = REJECT,
+ .result = ACCEPT,
+},
+{
+ "unpriv: adding of fp, imm",
+ .insns = {
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0),
+ BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, -8),
+ BPF_EXIT_INSN(),
+ },
.errstr_unpriv = "R1 stack pointer arithmetic goes out of range",
.result_unpriv = REJECT,
.result = ACCEPT,
--- a/tools/testing/selftests/bpf/verifier/value_ptr_arith.c
+++ b/tools/testing/selftests/bpf/verifier/value_ptr_arith.c
@@ -169,7 +169,7 @@
.fixup_map_array_48b = { 1 },
.result = ACCEPT,
.result_unpriv = REJECT,
- .errstr_unpriv = "R2 tried to add from different maps or paths",
+ .errstr_unpriv = "R2 tried to add from different maps, paths, or prohibited types",
.retval = 0,
},
{
@@ -517,6 +517,27 @@
.retval = 0xabcdef12,
},
{
+ "map access: value_ptr += N, value_ptr -= N known scalar",
+ .insns = {
+ BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+ BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+ BPF_LD_MAP_FD(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 6),
+ BPF_MOV32_IMM(BPF_REG_1, 0x12345678),
+ BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, 0),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 2),
+ BPF_MOV64_IMM(BPF_REG_1, 2),
+ BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+ BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .fixup_map_array_48b = { 3 },
+ .result = ACCEPT,
+ .retval = 0x12345678,
+},
+{
"map access: unknown scalar += value_ptr, 1",
.insns = {
BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),


2021-03-19 12:22:10

by Greg KH

Subject: [PATCH 5.4 10/18] drm/i915/gvt: Fix virtual display setup for BXT/APL

From: Colin Xu <[email protected]>

commit a5a8ef937cfa79167f4b2a5602092b8d14fd6b9a upstream

Program display-related vregs to proper values at initialization, and set
up the virtual monitor and hotplug.

vGPU virtual display vregs inherit their values from the pregs. The
virtual DP monitor is always set up on PORT_B for BXT/APL. However, the
host may have a monitor connected on a different PORT, or no monitor
connected at all. Without properly set up PIPE/DDI/PLL related vregs, the
guest driver may not set up the virtual display as expected, and the
guest desktop may not be created.
Since only one virtual display is supported, enable PIPE_A only, and
enable the transcoder/DDI/PLL based on which port is set up for BXT/APL.
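
Condensed from the hunk below (not verbatim), the resulting flow is
roughly:

	/* only one virtual display, so PIPE_A is always the active pipe */
	vgpu_vreg_t(vgpu, PIPECONF(PIPE_A)) |= PIPECONF_ENABLE | I965_PIPECONF_ACTIVE;

	if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
		/* power up PHY0, lock the PLL, route TRANSCODER_A to PORT_B */
		vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_B)) |= DDI_BUF_CTL_ENABLE;
		vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) |=
			TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
			(PORT_B << TRANS_DDI_PORT_SHIFT) | TRANS_DDI_FUNC_ENABLE;
	}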

V2:
Revise commit message.

V3:
set_edid should be on PORT_B for BXT.
Inject hpd event for BXT.

V4:
Temporarily disable vfio edid on BXT/APL until issue fixed.

V5:
Rebase to use new HPD define GEN8_DE_PORT_HOTPLUG for BXT.
Put vfio edid disabling on BXT/APL to a separate patch.

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit a5a8ef937cfa79167f4b2a5602092b8d14fd6b9a)
Signed-off-by: Colin Xu <[email protected]>
Cc: <[email protected]> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/gvt/display.c | 173 +++++++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/gvt/mmio.c | 5 +
2 files changed, 178 insertions(+)

--- a/drivers/gpu/drm/i915/gvt/display.c
+++ b/drivers/gpu/drm/i915/gvt/display.c
@@ -172,21 +172,161 @@ static void emulate_monitor_status_chang
int pipe;

if (IS_BROXTON(dev_priv)) {
+ enum transcoder trans;
+ enum port port;
+
+ /* Clear PIPE, DDI, PHY, HPD before setting new */
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~(BXT_DE_PORT_HP_DDIA |
BXT_DE_PORT_HP_DDIB |
BXT_DE_PORT_HP_DDIC);

+ for_each_pipe(dev_priv, pipe) {
+ vgpu_vreg_t(vgpu, PIPECONF(pipe)) &=
+ ~(PIPECONF_ENABLE | I965_PIPECONF_ACTIVE);
+ vgpu_vreg_t(vgpu, DSPCNTR(pipe)) &= ~DISPLAY_PLANE_ENABLE;
+ vgpu_vreg_t(vgpu, SPRCTL(pipe)) &= ~SPRITE_ENABLE;
+ vgpu_vreg_t(vgpu, CURCNTR(pipe)) &= ~MCURSOR_MODE;
+ vgpu_vreg_t(vgpu, CURCNTR(pipe)) |= MCURSOR_MODE_DISABLE;
+ }
+
+ for (trans = TRANSCODER_A; trans <= TRANSCODER_EDP; trans++) {
+ vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(trans)) &=
+ ~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
+ TRANS_DDI_PORT_MASK | TRANS_DDI_FUNC_ENABLE);
+ }
+ vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
+ ~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
+ TRANS_DDI_PORT_MASK);
+
+ for (port = PORT_A; port <= PORT_C; port++) {
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(port)) &=
+ ~BXT_PHY_LANE_ENABLED;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(port)) |=
+ (BXT_PHY_CMNLANE_POWERDOWN_ACK |
+ BXT_PHY_LANE_POWERDOWN_ACK);
+
+ vgpu_vreg_t(vgpu, BXT_PORT_PLL_ENABLE(port)) &=
+ ~(PORT_PLL_POWER_STATE | PORT_PLL_POWER_ENABLE |
+ PORT_PLL_REF_SEL | PORT_PLL_LOCK |
+ PORT_PLL_ENABLE);
+
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(port)) &=
+ ~(DDI_INIT_DISPLAY_DETECTED |
+ DDI_BUF_CTL_ENABLE);
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(port)) |= DDI_BUF_IS_IDLE;
+ }
+
+ vgpu_vreg_t(vgpu, BXT_P_CR_GT_DISP_PWRON) &= ~(BIT(0) | BIT(1));
+ vgpu_vreg_t(vgpu, BXT_PORT_CL1CM_DW0(DPIO_PHY0)) &=
+ ~PHY_POWER_GOOD;
+ vgpu_vreg_t(vgpu, BXT_PORT_CL1CM_DW0(DPIO_PHY1)) &=
+ ~PHY_POWER_GOOD;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL_FAMILY(DPIO_PHY0)) &= ~BIT(30);
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL_FAMILY(DPIO_PHY1)) &= ~BIT(30);
+
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) &= ~SFUSE_STRAP_DDIB_DETECTED;
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) &= ~SFUSE_STRAP_DDIC_DETECTED;
+
+ /*
+ * Only 1 PIPE enabled in current vGPU display and PIPE_A is
+ * tied to TRANSCODER_A in HW, so it's safe to assume PIPE_A,
+ * TRANSCODER_A can be enabled. PORT_x depends on the input of
+ * setup_virtual_dp_monitor.
+ */
+ vgpu_vreg_t(vgpu, PIPECONF(PIPE_A)) |= PIPECONF_ENABLE;
+ vgpu_vreg_t(vgpu, PIPECONF(PIPE_A)) |= I965_PIPECONF_ACTIVE;
+
+ /*
+ * Golden M/N are calculated based on:
+ * 24 bpp, 4 lanes, 154000 pixel clk (from virtual EDID),
+ * DP link clk 1620 MHz and non-constant_n.
+ * TODO: calculate DP link symbol clk and stream clk m/n.
+ */
+ vgpu_vreg_t(vgpu, PIPE_DATA_M1(TRANSCODER_A)) = 63 << TU_SIZE_SHIFT;
+ vgpu_vreg_t(vgpu, PIPE_DATA_M1(TRANSCODER_A)) |= 0x5b425e;
+ vgpu_vreg_t(vgpu, PIPE_DATA_N1(TRANSCODER_A)) = 0x800000;
+ vgpu_vreg_t(vgpu, PIPE_LINK_M1(TRANSCODER_A)) = 0x3cd6e;
+ vgpu_vreg_t(vgpu, PIPE_LINK_N1(TRANSCODER_A)) = 0x80000;
+
+ /* Enable per-DDI/PORT vreg */
if (intel_vgpu_has_monitor_on_port(vgpu, PORT_A)) {
+ vgpu_vreg_t(vgpu, BXT_P_CR_GT_DISP_PWRON) |= BIT(1);
+ vgpu_vreg_t(vgpu, BXT_PORT_CL1CM_DW0(DPIO_PHY1)) |=
+ PHY_POWER_GOOD;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL_FAMILY(DPIO_PHY1)) |=
+ BIT(30);
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_A)) |=
+ BXT_PHY_LANE_ENABLED;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_A)) &=
+ ~(BXT_PHY_CMNLANE_POWERDOWN_ACK |
+ BXT_PHY_LANE_POWERDOWN_ACK);
+ vgpu_vreg_t(vgpu, BXT_PORT_PLL_ENABLE(PORT_A)) |=
+ (PORT_PLL_POWER_STATE | PORT_PLL_POWER_ENABLE |
+ PORT_PLL_REF_SEL | PORT_PLL_LOCK |
+ PORT_PLL_ENABLE);
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_A)) |=
+ (DDI_BUF_CTL_ENABLE | DDI_INIT_DISPLAY_DETECTED);
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_A)) &=
+ ~DDI_BUF_IS_IDLE;
+ vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_EDP)) |=
+ (TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
+ TRANS_DDI_FUNC_ENABLE);
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
BXT_DE_PORT_HP_DDIA;
}

if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) |= SFUSE_STRAP_DDIB_DETECTED;
+ vgpu_vreg_t(vgpu, BXT_P_CR_GT_DISP_PWRON) |= BIT(0);
+ vgpu_vreg_t(vgpu, BXT_PORT_CL1CM_DW0(DPIO_PHY0)) |=
+ PHY_POWER_GOOD;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL_FAMILY(DPIO_PHY0)) |=
+ BIT(30);
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_B)) |=
+ BXT_PHY_LANE_ENABLED;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_B)) &=
+ ~(BXT_PHY_CMNLANE_POWERDOWN_ACK |
+ BXT_PHY_LANE_POWERDOWN_ACK);
+ vgpu_vreg_t(vgpu, BXT_PORT_PLL_ENABLE(PORT_B)) |=
+ (PORT_PLL_POWER_STATE | PORT_PLL_POWER_ENABLE |
+ PORT_PLL_REF_SEL | PORT_PLL_LOCK |
+ PORT_PLL_ENABLE);
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_B)) |=
+ DDI_BUF_CTL_ENABLE;
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_B)) &=
+ ~DDI_BUF_IS_IDLE;
+ vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) |=
+ (TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
+ (PORT_B << TRANS_DDI_PORT_SHIFT) |
+ TRANS_DDI_FUNC_ENABLE);
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
BXT_DE_PORT_HP_DDIB;
}

if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) |= SFUSE_STRAP_DDIC_DETECTED;
+ vgpu_vreg_t(vgpu, BXT_P_CR_GT_DISP_PWRON) |= BIT(0);
+ vgpu_vreg_t(vgpu, BXT_PORT_CL1CM_DW0(DPIO_PHY0)) |=
+ PHY_POWER_GOOD;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL_FAMILY(DPIO_PHY0)) |=
+ BIT(30);
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_C)) |=
+ BXT_PHY_LANE_ENABLED;
+ vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_C)) &=
+ ~(BXT_PHY_CMNLANE_POWERDOWN_ACK |
+ BXT_PHY_LANE_POWERDOWN_ACK);
+ vgpu_vreg_t(vgpu, BXT_PORT_PLL_ENABLE(PORT_C)) |=
+ (PORT_PLL_POWER_STATE | PORT_PLL_POWER_ENABLE |
+ PORT_PLL_REF_SEL | PORT_PLL_LOCK |
+ PORT_PLL_ENABLE);
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_C)) |=
+ DDI_BUF_CTL_ENABLE;
+ vgpu_vreg_t(vgpu, DDI_BUF_CTL(PORT_C)) &=
+ ~DDI_BUF_IS_IDLE;
+ vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) |=
+ (TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
+ (PORT_B << TRANS_DDI_PORT_SHIFT) |
+ TRANS_DDI_FUNC_ENABLE);
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
BXT_DE_PORT_HP_DDIC;
}
@@ -511,6 +651,39 @@ void intel_vgpu_emulate_hotplug(struct i
vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
PORTD_HOTPLUG_STATUS_MASK;
intel_vgpu_trigger_virtual_event(vgpu, DP_D_HOTPLUG);
+ } else if (IS_BROXTON(dev_priv)) {
+ if (connected) {
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_A)) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= BXT_DE_PORT_HP_DDIA;
+ }
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) |=
+ SFUSE_STRAP_DDIB_DETECTED;
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= BXT_DE_PORT_HP_DDIB;
+ }
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) |=
+ SFUSE_STRAP_DDIC_DETECTED;
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= BXT_DE_PORT_HP_DDIC;
+ }
+ } else {
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_A)) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HP_DDIA;
+ }
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) &=
+ ~SFUSE_STRAP_DDIB_DETECTED;
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HP_DDIB;
+ }
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) &=
+ ~SFUSE_STRAP_DDIC_DETECTED;
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HP_DDIC;
+ }
+ }
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTB_HOTPLUG_STATUS_MASK;
+ intel_vgpu_trigger_virtual_event(vgpu, DP_B_HOTPLUG);
}
}

--- a/drivers/gpu/drm/i915/gvt/mmio.c
+++ b/drivers/gpu/drm/i915/gvt/mmio.c
@@ -271,6 +271,11 @@ void intel_vgpu_reset_mmio(struct intel_
vgpu_vreg_t(vgpu, BXT_PHY_CTL(PORT_C)) |=
BXT_PHY_CMNLANE_POWERDOWN_ACK |
BXT_PHY_LANE_POWERDOWN_ACK;
+ vgpu_vreg_t(vgpu, SKL_FUSE_STATUS) |=
+ SKL_FUSE_DOWNLOAD_STATUS |
+ SKL_FUSE_PG_DIST_STATUS(SKL_PG0) |
+ SKL_FUSE_PG_DIST_STATUS(SKL_PG1) |
+ SKL_FUSE_PG_DIST_STATUS(SKL_PG2);
}
} else {
#define GVT_GEN8_MMIO_RESET_OFFSET (0x44200)


2021-03-19 12:22:11

by Greg KH

Subject: [PATCH 5.4 07/18] btrfs: scrub: Don't check free space before marking a block group RO

From: Qu Wenruo <[email protected]>

commit b12de52896c0e8213f70e3a168fde9e6eee95909 upstream.

[BUG]
When running btrfs/072 with only one online CPU, it has a pretty high
chance to fail:

# btrfs/072 12s ... _check_dmesg: something found in dmesg (see xfstests-dev/results//btrfs/072.dmesg)
# - output mismatch (see xfstests-dev/results//btrfs/072.out.bad)
# --- tests/btrfs/072.out 2019-10-22 15:18:14.008965340 +0800
# +++ /xfstests-dev/results//btrfs/072.out.bad 2019-11-14 15:56:45.877152240 +0800
# @@ -1,2 +1,3 @@
# QA output created by 072
# Silence is golden
# +Scrub find errors in "-m dup -d single" test
# ...

And with the following call trace:

BTRFS info (device dm-5): scrub: started on devid 1
------------[ cut here ]------------
BTRFS: Transaction aborted (error -27)
WARNING: CPU: 0 PID: 55087 at fs/btrfs/block-group.c:1890 btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
CPU: 0 PID: 55087 Comm: btrfs Tainted: G W O 5.4.0-rc1-custom+ #13
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:btrfs_create_pending_block_groups+0x3e6/0x470 [btrfs]
Call Trace:
__btrfs_end_transaction+0xdb/0x310 [btrfs]
btrfs_end_transaction+0x10/0x20 [btrfs]
btrfs_inc_block_group_ro+0x1c9/0x210 [btrfs]
scrub_enumerate_chunks+0x264/0x940 [btrfs]
btrfs_scrub_dev+0x45c/0x8f0 [btrfs]
btrfs_ioctl+0x31a1/0x3fb0 [btrfs]
do_vfs_ioctl+0x636/0xaa0
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x43/0x50
do_syscall_64+0x79/0xe0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
---[ end trace 166c865cec7688e7 ]---

[CAUSE]
The error number -27 is -EFBIG, returned from the following call chain:
btrfs_end_transaction()
|- __btrfs_end_transaction()
   |- btrfs_create_pending_block_groups()
      |- btrfs_finish_chunk_alloc()
         |- btrfs_add_system_chunk()

This happens because we have used up all space of
btrfs_super_block::sys_chunk_array.

The root cause is the following bad loop that creates tons of system
chunks:

1. The only SYSTEM chunk is being scrubbed
   It's very common to have only one SYSTEM chunk.
2. A new SYSTEM bg will be allocated
   btrfs_inc_block_group_ro() checks if we have enough space after
   marking the current bg RO. If not, it allocates a new chunk.
3. The new SYSTEM bg is still empty, so it will be reclaimed
   During the reclaim, we will mark it RO again.
4. That newly allocated empty SYSTEM bg gets scrubbed
   We go back to step 2, as the bg is already marked RO but not yet
   cleaned up.

If the cleaner kthread doesn't get executed fast enough (e.g. only one
CPU), then we will get more and more empty SYSTEM chunks, using up all
the space of btrfs_super_block::sys_chunk_array.

[FIX]
Since scrub/dev-replace doesn't always need to allocate new extents,
especially chunk tree extents, we don't really need to do chunk
pre-allocation.

To break the above spiral, introduce a new parameter to
btrfs_inc_block_group_ro(), @do_chunk_alloc, which indicates whether we
need extra chunk pre-allocation.

For relocation, we pass @do_chunk_alloc=true, while for scrub, we pass
@do_chunk_alloc=false.
This should keep unnecessary empty chunks from popping up for scrub.

Also, since btrfs_inc_block_group_ro() now takes two parameters, add
more comments for it.
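
Condensed from the hunks below, the two call sites end up as:

	/* relocation: keep chunk pre-allocation, it may need new chunks */
	ret = btrfs_inc_block_group_ro(rc->block_group, true);

	/* scrub: skip it, breaking the SYSTEM chunk spiral described above */
	ret = btrfs_inc_block_group_ro(cache, false);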

Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/block-group.c | 48 +++++++++++++++++++++++++++++++-----------------
fs/btrfs/block-group.h | 3 ++-
fs/btrfs/relocation.c | 2 +-
fs/btrfs/scrub.c | 21 ++++++++++++++++++++-
4 files changed, 54 insertions(+), 20 deletions(-)

--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2048,8 +2048,17 @@ static u64 update_block_group_flags(stru
return flags;
}

-int btrfs_inc_block_group_ro(struct btrfs_block_group_cache *cache)
-
+/*
+ * Mark one block group RO, can be called several times for the same block
+ * group.
+ *
+ * @cache: the destination block group
+ * @do_chunk_alloc: whether need to do chunk pre-allocation, this is to
+ * ensure we still have some free space after marking this
+ * block group RO.
+ */
+int btrfs_inc_block_group_ro(struct btrfs_block_group_cache *cache,
+ bool do_chunk_alloc)
{
struct btrfs_fs_info *fs_info = cache->fs_info;
struct btrfs_trans_handle *trans;
@@ -2079,25 +2088,29 @@ again:
goto again;
}

- /*
- * if we are changing raid levels, try to allocate a corresponding
- * block group with the new raid level.
- */
- alloc_flags = update_block_group_flags(fs_info, cache->flags);
- if (alloc_flags != cache->flags) {
- ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
+ if (do_chunk_alloc) {
/*
- * ENOSPC is allowed here, we may have enough space
- * already allocated at the new raid level to
- * carry on
+ * If we are changing raid levels, try to allocate a
+ * corresponding block group with the new raid level.
*/
- if (ret == -ENOSPC)
- ret = 0;
- if (ret < 0)
- goto out;
+ alloc_flags = update_block_group_flags(fs_info, cache->flags);
+ if (alloc_flags != cache->flags) {
+ ret = btrfs_chunk_alloc(trans, alloc_flags,
+ CHUNK_ALLOC_FORCE);
+ /*
+ * ENOSPC is allowed here, we may have enough space
+ * already allocated at the new raid level to carry on
+ */
+ if (ret == -ENOSPC)
+ ret = 0;
+ if (ret < 0)
+ goto out;
+ }
}

- ret = inc_block_group_ro(cache, 0);
+ ret = inc_block_group_ro(cache, !do_chunk_alloc);
+ if (!do_chunk_alloc)
+ goto unlock_out;
if (!ret)
goto out;
alloc_flags = btrfs_get_alloc_profile(fs_info, cache->space_info->flags);
@@ -2112,6 +2125,7 @@ out:
check_system_chunk(trans, alloc_flags);
mutex_unlock(&fs_info->chunk_mutex);
}
+unlock_out:
mutex_unlock(&fs_info->ro_block_group_mutex);

btrfs_end_transaction(trans);
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -205,7 +205,8 @@ int btrfs_read_block_groups(struct btrfs
int btrfs_make_block_group(struct btrfs_trans_handle *trans, u64 bytes_used,
u64 type, u64 chunk_offset, u64 size);
void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans);
-int btrfs_inc_block_group_ro(struct btrfs_block_group_cache *cache);
+int btrfs_inc_block_group_ro(struct btrfs_block_group_cache *cache,
+ bool do_chunk_alloc);
void btrfs_dec_block_group_ro(struct btrfs_block_group_cache *cache);
int btrfs_start_dirty_block_groups(struct btrfs_trans_handle *trans);
int btrfs_write_dirty_block_groups(struct btrfs_trans_handle *trans);
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4428,7 +4428,7 @@ int btrfs_relocate_block_group(struct bt
rc->extent_root = extent_root;
rc->block_group = bg;

- ret = btrfs_inc_block_group_ro(rc->block_group);
+ ret = btrfs_inc_block_group_ro(rc->block_group, true);
if (ret) {
err = ret;
goto out;
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3560,7 +3560,26 @@ int scrub_enumerate_chunks(struct scrub_
* -> btrfs_scrub_pause()
*/
scrub_pause_on(fs_info);
- ret = btrfs_inc_block_group_ro(cache);
+
+ /*
+ * Don't do chunk preallocation for scrub.
+ *
+ * This is especially important for SYSTEM bgs, or we can hit
+ * -EFBIG from btrfs_finish_chunk_alloc() like:
+ * 1. The only SYSTEM bg is marked RO.
+ * Since SYSTEM bg is small, that's pretty common.
+ * 2. New SYSTEM bg will be allocated
+ * Due to regular version will allocate new chunk.
+ * 3. New SYSTEM bg is empty and will get cleaned up
+ * Before cleanup really happens, it's marked RO again.
+ * 4. Empty SYSTEM bg get scrubbed
+ * We go back to 2.
+ *
+ * This can easily boost the amount of SYSTEM chunks if cleaner
+ * thread can't be triggered fast enough, and use up all space
+ * of btrfs_super_block::sys_chunk_array
+ */
+ ret = btrfs_inc_block_group_ro(cache, false);
if (!ret && sctx->is_dev_replace) {
/*
* If we are doing a device replace wait for any tasks


2021-03-19 12:22:33

by Greg KH

Subject: [PATCH 5.4 13/18] fuse: fix live lock in fuse_iget()

From: Amir Goldstein <[email protected]>

commit 775c5033a0d164622d9d10dd0f0a5531639ed3ed upstream.

Commit 5d069dbe8aaf ("fuse: fix bad inode") replaced make_bad_inode()
in fuse_iget() with a private implementation fuse_make_bad().

The private implementation fails to remove the bad inode from inode
cache, so the retry loop with iget5_locked() finds the same bad inode
and marks it bad forever.
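
For context, the retry loop in fuse_iget() looks roughly like this
(simplified sketch from memory, not a quote of the 5.4 source):

	retry:
		inode = iget5_locked(sb, nodeid, fuse_inode_eq, fuse_inode_set, &nodeid);
		...
		} else if ((inode->i_mode ^ attr->mode) & S_IFMT) {
			/* inode changed type: mark the old one bad and retry */
			fuse_make_bad(inode);
			iput(inode);
			goto retry;
		}

Without remove_inode_hash(), iget5_locked() keeps returning the same
hashed (and now bad) inode, so the loop spins forever.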

kmsg snip:

[ ] rcu: INFO: rcu_sched self-detected stall on CPU
...
[ ] ? bit_wait_io+0x50/0x50
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ? find_inode.isra.32+0x60/0xb0
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ilookup5_nowait+0x65/0x90
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ilookup5.part.36+0x2e/0x80
[ ] ? fuse_init_file_inode+0x70/0x70
[ ] ? fuse_inode_eq+0x20/0x20
[ ] iget5_locked+0x21/0x80
[ ] ? fuse_inode_eq+0x20/0x20
[ ] fuse_iget+0x96/0x1b0

Fixes: 5d069dbe8aaf ("fuse: fix bad inode")
Cc: [email protected] # 5.10+
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/fuse/fuse_i.h | 1 +
1 file changed, 1 insertion(+)

--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -791,6 +791,7 @@ static inline u64 fuse_get_attr_version(

static inline void fuse_make_bad(struct inode *inode)
{
+ remove_inode_hash(inode);
set_bit(FUSE_I_BAD, &get_fuse_inode(inode)->state);
}



2021-03-19 12:22:33

by Greg KH

Subject: [PATCH 5.4 09/18] drm/i915/gvt: Fix mmio handler break on BXT/APL.

From: Colin Xu <[email protected]>

commit 92010a97098c4c9fd777408cc98064d26b32695b upstream

- Remove the duplicate mmio handler for BXT/APL. Otherwise the mmio
  handler will fail to init.
- Add the engine GPRs with F_CMD_ACCESS since BXT/APL will load them via
  LRI. Otherwise, the guest will enter failsafe mode.
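
Condensed from the hunks below, the shape of the fix is:

	/* SKL+ list: carve BXT out so the register is not handled twice */
	MMIO_D(GEN9_CTX_PREEMPT_REG, D_SKL_PLUS & ~D_BXT);

	/* BXT list: keep it with F_CMD_ACCESS, and add the per-engine CS
	 * GPR ranges so LRI loads of them do not trip the command parser */
	MMIO_DFH(GEN9_CTX_PREEMPT_REG, D_BXT, F_CMD_ACCESS, NULL, NULL);
	MMIO_F(HSW_CS_GPR(0), 0x40, F_CMD_ACCESS, 0, 0, D_BXT, NULL, NULL);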

V2:
Use RCS/BCS GPR macros instead of offset.
Revise commit message.

V3:
Use GEN8_RING_CS_GPR macros on ring base.

Reviewed-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 92010a97098c4c9fd777408cc98064d26b32695b)
Signed-off-by: Colin Xu <[email protected]>
Cc: <[email protected]> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/gvt/handlers.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -3132,7 +3132,7 @@ static int init_skl_mmio_info(struct int
NULL, NULL);

MMIO_D(GAMT_CHKN_BIT_REG, D_KBL | D_CFL);
- MMIO_D(GEN9_CTX_PREEMPT_REG, D_SKL_PLUS);
+ MMIO_D(GEN9_CTX_PREEMPT_REG, D_SKL_PLUS & ~D_BXT);

return 0;
}
@@ -3306,6 +3306,12 @@ static int init_bxt_mmio_info(struct int
MMIO_D(GEN8_PUSHBUS_SHIFT, D_BXT);
MMIO_D(GEN6_GFXPAUSE, D_BXT);
MMIO_DFH(GEN8_L3SQCREG1, D_BXT, F_CMD_ACCESS, NULL, NULL);
+ MMIO_DFH(GEN8_L3CNTLREG, D_BXT, F_CMD_ACCESS, NULL, NULL);
+ MMIO_DFH(_MMIO(0x20D8), D_BXT, F_CMD_ACCESS, NULL, NULL);
+ MMIO_F(HSW_CS_GPR(0), 0x40, F_CMD_ACCESS, 0, 0, D_BXT, NULL, NULL);
+ MMIO_F(_MMIO(0x12600), 0x40, F_CMD_ACCESS, 0, 0, D_BXT, NULL, NULL);
+ MMIO_F(BCS_GPR(0), 0x40, F_CMD_ACCESS, 0, 0, D_BXT, NULL, NULL);
+ MMIO_F(_MMIO(0x1a600), 0x40, F_CMD_ACCESS, 0, 0, D_BXT, NULL, NULL);

MMIO_DFH(GEN9_CTX_PREEMPT_REG, D_BXT, F_CMD_ACCESS, NULL, NULL);



2021-03-19 12:22:34

by Greg KH

Subject: [PATCH 5.4 15/18] crypto: aesni - Use TEST %reg,%reg instead of CMP $0,%reg

From: Uros Bizjak <[email protected]>

commit 032d049ea0f45b45c21f3f02b542aa18bc6b6428 upstream.

CMP $0,%reg can't set the overflow flag, so we can use the shorter TEST
%reg,%reg instruction when only the zero and sign flags are checked
(E, L, LE, G, GE conditions).

Signed-off-by: Uros Bizjak <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/crypto/aesni-intel_asm.S | 20 ++++++++++----------
arch/x86/crypto/aesni-intel_avx-x86_64.S | 20 ++++++++++----------
2 files changed, 20 insertions(+), 20 deletions(-)

--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -319,7 +319,7 @@ _initial_blocks_\@:

# Main loop - Encrypt/Decrypt remaining blocks

- cmp $0, %r13
+ test %r13, %r13
je _zero_cipher_left_\@
sub $64, %r13
je _four_cipher_left_\@
@@ -438,7 +438,7 @@ _multiple_of_16_bytes_\@:

mov PBlockLen(%arg2), %r12

- cmp $0, %r12
+ test %r12, %r12
je _partial_done\@

GHASH_MUL %xmm8, %xmm13, %xmm9, %xmm10, %xmm11, %xmm5, %xmm6
@@ -475,7 +475,7 @@ _T_8_\@:
add $8, %r10
sub $8, %r11
psrldq $8, %xmm0
- cmp $0, %r11
+ test %r11, %r11
je _return_T_done_\@
_T_4_\@:
movd %xmm0, %eax
@@ -483,7 +483,7 @@ _T_4_\@:
add $4, %r10
sub $4, %r11
psrldq $4, %xmm0
- cmp $0, %r11
+ test %r11, %r11
je _return_T_done_\@
_T_123_\@:
movd %xmm0, %eax
@@ -620,7 +620,7 @@ _get_AAD_blocks\@:

/* read the last <16B of AAD */
_get_AAD_rest\@:
- cmp $0, %r11
+ test %r11, %r11
je _get_AAD_done\@

READ_PARTIAL_BLOCK %r10, %r11, \TMP1, \TMP7
@@ -641,7 +641,7 @@ _get_AAD_done\@:
.macro PARTIAL_BLOCK CYPH_PLAIN_OUT PLAIN_CYPH_IN PLAIN_CYPH_LEN DATA_OFFSET \
AAD_HASH operation
mov PBlockLen(%arg2), %r13
- cmp $0, %r13
+ test %r13, %r13
je _partial_block_done_\@ # Leave Macro if no partial blocks
# Read in input data without over reading
cmp $16, \PLAIN_CYPH_LEN
@@ -693,7 +693,7 @@ _no_extra_mask_1_\@:
PSHUFB_XMM %xmm2, %xmm3
pxor %xmm3, \AAD_HASH

- cmp $0, %r10
+ test %r10, %r10
jl _partial_incomplete_1_\@

# GHASH computation for the last <16 Byte block
@@ -728,7 +728,7 @@ _no_extra_mask_2_\@:
PSHUFB_XMM %xmm2, %xmm9
pxor %xmm9, \AAD_HASH

- cmp $0, %r10
+ test %r10, %r10
jl _partial_incomplete_2_\@

# GHASH computation for the last <16 Byte block
@@ -748,7 +748,7 @@ _encode_done_\@:
PSHUFB_XMM %xmm2, %xmm9
.endif
# output encrypted Bytes
- cmp $0, %r10
+ test %r10, %r10
jl _partial_fill_\@
mov %r13, %r12
mov $16, %r13
@@ -2731,7 +2731,7 @@ ENDPROC(aesni_ctr_enc)
*/
ENTRY(aesni_xts_crypt8)
FRAME_BEGIN
- cmpb $0, %cl
+ testb %cl, %cl
movl $0, %ecx
movl $240, %r10d
leaq _aesni_enc4, %r11
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -370,7 +370,7 @@ _initial_num_blocks_is_0\@:


_initial_blocks_encrypted\@:
- cmp $0, %r13
+ test %r13, %r13
je _zero_cipher_left\@

sub $128, %r13
@@ -529,7 +529,7 @@ _multiple_of_16_bytes\@:
vmovdqu HashKey(arg2), %xmm13

mov PBlockLen(arg2), %r12
- cmp $0, %r12
+ test %r12, %r12
je _partial_done\@

#GHASH computation for the last <16 Byte block
@@ -574,7 +574,7 @@ _T_8\@:
add $8, %r10
sub $8, %r11
vpsrldq $8, %xmm9, %xmm9
- cmp $0, %r11
+ test %r11, %r11
je _return_T_done\@
_T_4\@:
vmovd %xmm9, %eax
@@ -582,7 +582,7 @@ _T_4\@:
add $4, %r10
sub $4, %r11
vpsrldq $4, %xmm9, %xmm9
- cmp $0, %r11
+ test %r11, %r11
je _return_T_done\@
_T_123\@:
vmovd %xmm9, %eax
@@ -626,7 +626,7 @@ _get_AAD_blocks\@:
cmp $16, %r11
jge _get_AAD_blocks\@
vmovdqu \T8, \T7
- cmp $0, %r11
+ test %r11, %r11
je _get_AAD_done\@

vpxor \T7, \T7, \T7
@@ -645,7 +645,7 @@ _get_AAD_rest8\@:
vpxor \T1, \T7, \T7
jmp _get_AAD_rest8\@
_get_AAD_rest4\@:
- cmp $0, %r11
+ test %r11, %r11
jle _get_AAD_rest0\@
mov (%r10), %eax
movq %rax, \T1
@@ -750,7 +750,7 @@ _done_read_partial_block_\@:
.macro PARTIAL_BLOCK GHASH_MUL CYPH_PLAIN_OUT PLAIN_CYPH_IN PLAIN_CYPH_LEN DATA_OFFSET \
AAD_HASH ENC_DEC
mov PBlockLen(arg2), %r13
- cmp $0, %r13
+ test %r13, %r13
je _partial_block_done_\@ # Leave Macro if no partial blocks
# Read in input data without over reading
cmp $16, \PLAIN_CYPH_LEN
@@ -802,7 +802,7 @@ _no_extra_mask_1_\@:
vpshufb %xmm2, %xmm3, %xmm3
vpxor %xmm3, \AAD_HASH, \AAD_HASH

- cmp $0, %r10
+ test %r10, %r10
jl _partial_incomplete_1_\@

# GHASH computation for the last <16 Byte block
@@ -837,7 +837,7 @@ _no_extra_mask_2_\@:
vpshufb %xmm2, %xmm9, %xmm9
vpxor %xmm9, \AAD_HASH, \AAD_HASH

- cmp $0, %r10
+ test %r10, %r10
jl _partial_incomplete_2_\@

# GHASH computation for the last <16 Byte block
@@ -857,7 +857,7 @@ _encode_done_\@:
vpshufb %xmm2, %xmm9, %xmm9
.endif
# output encrypted Bytes
- cmp $0, %r10
+ test %r10, %r10
jl _partial_fill_\@
mov %r13, %r12
mov $16, %r13


2021-03-19 12:22:35

by Greg KH

Subject: [PATCH 5.4 16/18] crypto: x86/aes-ni-xts - use direct calls to and 4-way stride

From: Ard Biesheuvel <[email protected]>

commit 86ad60a65f29dd862a11c22bb4b5be28d6c5cef1 upstream.

The XTS asm helper arrangement is a bit odd: the 8-way stride helper
consists of back-to-back calls to the 4-way core transforms, which
are called indirectly, based on a boolean that indicates whether we
are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct
calls, and given how the 8-way stride doesn't really add anything
substantial, use a 4-way stride instead, and make the asm core
routine deal with any multiple of 4 blocks. Since 512 byte sectors
or 4 KB blocks are the typical quantities XTS operates on, increase
the stride exported to the glue helper to 512 bytes as well.

As a result, the number of indirect calls is reduced from 3 per 64 bytes
of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup
when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U
CPU).
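
On the glue side, the change boils down to the following (condensed from
the hunks below):

	/* new asm entry points take a byte count instead of an enc/dec flag */
	asmlinkage void aesni_xts_encrypt(const struct crypto_aes_ctx *ctx, u8 *out,
					  const u8 *in, unsigned int len, u8 *iv);

	static void aesni_xts_enc32(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
	{
		/* 32 AES blocks == 512 bytes handled per glue-helper call */
		aesni_xts_encrypt(ctx, dst, src, 32 * AES_BLOCK_SIZE, (u8 *)iv);
	}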

Fixes: 9697fa39efd3f ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <[email protected]> # x86_64
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
[ardb: rebase onto stable/linux-5.4.y]
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/crypto/aesni-intel_asm.S | 115 ++++++++++++++++++++++---------------
arch/x86/crypto/aesni-intel_glue.c | 25 ++++----
2 files changed, 84 insertions(+), 56 deletions(-)

--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -2726,25 +2726,18 @@ ENDPROC(aesni_ctr_enc)
pxor CTR, IV;

/*
- * void aesni_xts_crypt8(const struct crypto_aes_ctx *ctx, u8 *dst,
- * const u8 *src, bool enc, le128 *iv)
+ * void aesni_xts_encrypt(const struct crypto_aes_ctx *ctx, u8 *dst,
+ * const u8 *src, unsigned int len, le128 *iv)
*/
-ENTRY(aesni_xts_crypt8)
+ENTRY(aesni_xts_encrypt)
FRAME_BEGIN
- testb %cl, %cl
- movl $0, %ecx
- movl $240, %r10d
- leaq _aesni_enc4, %r11
- leaq _aesni_dec4, %rax
- cmovel %r10d, %ecx
- cmoveq %rax, %r11

movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
movups (IVP), IV

mov 480(KEYP), KLEN
- addq %rcx, KEYP

+.Lxts_enc_loop4:
movdqa IV, STATE1
movdqu 0x00(INP), INC
pxor INC, STATE1
@@ -2768,71 +2761,103 @@ ENTRY(aesni_xts_crypt8)
pxor INC, STATE4
movdqu IV, 0x30(OUTP)

- CALL_NOSPEC %r11
+ call _aesni_enc4

movdqu 0x00(OUTP), INC
pxor INC, STATE1
movdqu STATE1, 0x00(OUTP)

- _aesni_gf128mul_x_ble()
- movdqa IV, STATE1
- movdqu 0x40(INP), INC
- pxor INC, STATE1
- movdqu IV, 0x40(OUTP)
-
movdqu 0x10(OUTP), INC
pxor INC, STATE2
movdqu STATE2, 0x10(OUTP)

- _aesni_gf128mul_x_ble()
- movdqa IV, STATE2
- movdqu 0x50(INP), INC
- pxor INC, STATE2
- movdqu IV, 0x50(OUTP)
-
movdqu 0x20(OUTP), INC
pxor INC, STATE3
movdqu STATE3, 0x20(OUTP)

- _aesni_gf128mul_x_ble()
- movdqa IV, STATE3
- movdqu 0x60(INP), INC
- pxor INC, STATE3
- movdqu IV, 0x60(OUTP)
-
movdqu 0x30(OUTP), INC
pxor INC, STATE4
movdqu STATE4, 0x30(OUTP)

_aesni_gf128mul_x_ble()
- movdqa IV, STATE4
- movdqu 0x70(INP), INC
- pxor INC, STATE4
- movdqu IV, 0x70(OUTP)

- _aesni_gf128mul_x_ble()
+ add $64, INP
+ add $64, OUTP
+ sub $64, LEN
+ ja .Lxts_enc_loop4
+
movups IV, (IVP)

- CALL_NOSPEC %r11
+ FRAME_END
+ ret
+ENDPROC(aesni_xts_encrypt)
+
+/*
+ * void aesni_xts_decrypt(const struct crypto_aes_ctx *ctx, u8 *dst,
+ * const u8 *src, unsigned int len, le128 *iv)
+ */
+ENTRY(aesni_xts_decrypt)
+ FRAME_BEGIN
+
+ movdqa .Lgf128mul_x_ble_mask, GF128MUL_MASK
+ movups (IVP), IV
+
+ mov 480(KEYP), KLEN
+ add $240, KEYP
+
+.Lxts_dec_loop4:
+ movdqa IV, STATE1
+ movdqu 0x00(INP), INC
+ pxor INC, STATE1
+ movdqu IV, 0x00(OUTP)
+
+ _aesni_gf128mul_x_ble()
+ movdqa IV, STATE2
+ movdqu 0x10(INP), INC
+ pxor INC, STATE2
+ movdqu IV, 0x10(OUTP)
+
+ _aesni_gf128mul_x_ble()
+ movdqa IV, STATE3
+ movdqu 0x20(INP), INC
+ pxor INC, STATE3
+ movdqu IV, 0x20(OUTP)
+
+ _aesni_gf128mul_x_ble()
+ movdqa IV, STATE4
+ movdqu 0x30(INP), INC
+ pxor INC, STATE4
+ movdqu IV, 0x30(OUTP)
+
+ call _aesni_dec4

- movdqu 0x40(OUTP), INC
+ movdqu 0x00(OUTP), INC
pxor INC, STATE1
- movdqu STATE1, 0x40(OUTP)
+ movdqu STATE1, 0x00(OUTP)

- movdqu 0x50(OUTP), INC
+ movdqu 0x10(OUTP), INC
pxor INC, STATE2
- movdqu STATE2, 0x50(OUTP)
+ movdqu STATE2, 0x10(OUTP)

- movdqu 0x60(OUTP), INC
+ movdqu 0x20(OUTP), INC
pxor INC, STATE3
- movdqu STATE3, 0x60(OUTP)
+ movdqu STATE3, 0x20(OUTP)

- movdqu 0x70(OUTP), INC
+ movdqu 0x30(OUTP), INC
pxor INC, STATE4
- movdqu STATE4, 0x70(OUTP)
+ movdqu STATE4, 0x30(OUTP)
+
+ _aesni_gf128mul_x_ble()
+
+ add $64, INP
+ add $64, OUTP
+ sub $64, LEN
+ ja .Lxts_dec_loop4
+
+ movups IV, (IVP)

FRAME_END
ret
-ENDPROC(aesni_xts_crypt8)
+ENDPROC(aesni_xts_decrypt)

#endif
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -97,6 +97,12 @@ asmlinkage void aesni_cbc_dec(struct cry
#define AVX_GEN2_OPTSIZE 640
#define AVX_GEN4_OPTSIZE 4096

+asmlinkage void aesni_xts_encrypt(const struct crypto_aes_ctx *ctx, u8 *out,
+ const u8 *in, unsigned int len, u8 *iv);
+
+asmlinkage void aesni_xts_decrypt(const struct crypto_aes_ctx *ctx, u8 *out,
+ const u8 *in, unsigned int len, u8 *iv);
+
#ifdef CONFIG_X86_64

static void (*aesni_ctr_enc_tfm)(struct crypto_aes_ctx *ctx, u8 *out,
@@ -104,9 +110,6 @@ static void (*aesni_ctr_enc_tfm)(struct
asmlinkage void aesni_ctr_enc(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);

-asmlinkage void aesni_xts_crypt8(const struct crypto_aes_ctx *ctx, u8 *out,
- const u8 *in, bool enc, le128 *iv);
-
/* asmlinkage void aesni_gcm_enc()
* void *ctx, AES Key schedule. Starts on a 16 byte boundary.
* struct gcm_context_data. May be uninitialized.
@@ -558,14 +561,14 @@ static void aesni_xts_dec(const void *ct
glue_xts_crypt_128bit_one(ctx, dst, src, iv, aesni_dec);
}

-static void aesni_xts_enc8(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
+static void aesni_xts_enc32(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- aesni_xts_crypt8(ctx, dst, src, true, iv);
+ aesni_xts_encrypt(ctx, dst, src, 32 * AES_BLOCK_SIZE, (u8 *)iv);
}

-static void aesni_xts_dec8(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
+static void aesni_xts_dec32(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- aesni_xts_crypt8(ctx, dst, src, false, iv);
+ aesni_xts_decrypt(ctx, dst, src, 32 * AES_BLOCK_SIZE, (u8 *)iv);
}

static const struct common_glue_ctx aesni_enc_xts = {
@@ -573,8 +576,8 @@ static const struct common_glue_ctx aesn
.fpu_blocks_limit = 1,

.funcs = { {
- .num_blocks = 8,
- .fn_u = { .xts = aesni_xts_enc8 }
+ .num_blocks = 32,
+ .fn_u = { .xts = aesni_xts_enc32 }
}, {
.num_blocks = 1,
.fn_u = { .xts = aesni_xts_enc }
@@ -586,8 +589,8 @@ static const struct common_glue_ctx aesn
.fpu_blocks_limit = 1,

.funcs = { {
- .num_blocks = 8,
- .fn_u = { .xts = aesni_xts_dec8 }
+ .num_blocks = 32,
+ .fn_u = { .xts = aesni_xts_dec32 }
}, {
.num_blocks = 1,
.fn_u = { .xts = aesni_xts_dec }


2021-03-19 12:22:35

by Greg KH

Subject: [PATCH 5.4 17/18] net: dsa: tag_mtk: fix 802.1ad VLAN egress

From: DENG Qingfang <[email protected]>

commit 9200f515c41f4cbaeffd8fdd1d8b6373a18b1b67 upstream.

A different TPID bit is used for 802.1ad VLAN frames.
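
Condensed from the hunk below, the xmit path now derives the tag
attribute directly from the skb protocol instead of a single
is_vlan_skb flag:

	switch (skb->protocol) {
	case htons(ETH_P_8021Q):
		xmit_tpid = MTK_HDR_XMIT_TAGGED_TPID_8100;	/* TPID 0x8100 */
		break;
	case htons(ETH_P_8021AD):
		xmit_tpid = MTK_HDR_XMIT_TAGGED_TPID_88A8;	/* TPID 0x88a8 */
		break;
	default:
		xmit_tpid = MTK_HDR_XMIT_UNTAGGED;
	}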

Reported-by: Ilario Gelmetti <[email protected]>
Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag")
Signed-off-by: DENG Qingfang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/dsa/tag_mtk.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)

--- a/net/dsa/tag_mtk.c
+++ b/net/dsa/tag_mtk.c
@@ -13,6 +13,7 @@
#define MTK_HDR_LEN 4
#define MTK_HDR_XMIT_UNTAGGED 0
#define MTK_HDR_XMIT_TAGGED_TPID_8100 1
+#define MTK_HDR_XMIT_TAGGED_TPID_88A8 2
#define MTK_HDR_RECV_SOURCE_PORT_MASK GENMASK(2, 0)
#define MTK_HDR_XMIT_DP_BIT_MASK GENMASK(5, 0)
#define MTK_HDR_XMIT_SA_DIS BIT(6)
@@ -21,8 +22,8 @@ static struct sk_buff *mtk_tag_xmit(stru
struct net_device *dev)
{
struct dsa_port *dp = dsa_slave_to_port(dev);
+ u8 xmit_tpid;
u8 *mtk_tag;
- bool is_vlan_skb = true;
unsigned char *dest = eth_hdr(skb)->h_dest;
bool is_multicast_skb = is_multicast_ether_addr(dest) &&
!is_broadcast_ether_addr(dest);
@@ -33,13 +34,20 @@ static struct sk_buff *mtk_tag_xmit(stru
* the both special and VLAN tag at the same time and then look up VLAN
* table with VID.
*/
- if (!skb_vlan_tagged(skb)) {
+ switch (skb->protocol) {
+ case htons(ETH_P_8021Q):
+ xmit_tpid = MTK_HDR_XMIT_TAGGED_TPID_8100;
+ break;
+ case htons(ETH_P_8021AD):
+ xmit_tpid = MTK_HDR_XMIT_TAGGED_TPID_88A8;
+ break;
+ default:
if (skb_cow_head(skb, MTK_HDR_LEN) < 0)
return NULL;

+ xmit_tpid = MTK_HDR_XMIT_UNTAGGED;
skb_push(skb, MTK_HDR_LEN);
memmove(skb->data, skb->data + MTK_HDR_LEN, 2 * ETH_ALEN);
- is_vlan_skb = false;
}

mtk_tag = skb->data + 2 * ETH_ALEN;
@@ -47,8 +55,7 @@ static struct sk_buff *mtk_tag_xmit(stru
/* Mark tag attribute on special tag insertion to notify hardware
* whether that's a combined special tag with 802.1Q header.
*/
- mtk_tag[0] = is_vlan_skb ? MTK_HDR_XMIT_TAGGED_TPID_8100 :
- MTK_HDR_XMIT_UNTAGGED;
+ mtk_tag[0] = xmit_tpid;
mtk_tag[1] = (1 << dp->index) & MTK_HDR_XMIT_DP_BIT_MASK;

/* Disable SA learning for multicast frames */
@@ -56,7 +63,7 @@ static struct sk_buff *mtk_tag_xmit(stru
mtk_tag[1] |= MTK_HDR_XMIT_SA_DIS;

/* Tag control information is kept for 802.1Q */
- if (!is_vlan_skb) {
+ if (xmit_tpid == MTK_HDR_XMIT_UNTAGGED) {
mtk_tag[2] = 0;
mtk_tag[3] = 0;
}


2021-03-19 12:22:45

by Greg KH

Subject: [PATCH 5.4 18/18] net: dsa: b53: Support setting learning on port

From: Florian Fainelli <[email protected]>

commit f9b3827ee66cfcf297d0acd6ecf33653a5f297ef upstream.

Add support for setting the learning attribute on a port, and make sure
that standalone ports start up with learning disabled.

We can remove the code in bcm_sf2 that configured the port's learning
attribute, because we want standalone ports to have learning disabled by
default, and port 7 cannot be bridged, so its learning attribute will not
change past its initial configuration.
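
Condensed from the hunks below, the resulting policy is (a set bit in
B53_DIS_LEARNING disables learning for that port):

	/* standalone and CPU ports come up with learning disabled */
	b53_port_set_learning(dev, port, false);

	/* joining a bridge turns it on, leaving turns it off again */
	b53_port_set_learning(dev, port, true);		/* b53_br_join()  */
	b53_port_set_learning(dev, port, false);	/* b53_br_leave() */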

Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/dsa/b53/b53_common.c | 18 ++++++++++++++++++
drivers/net/dsa/b53/b53_regs.h | 1 +
drivers/net/dsa/bcm_sf2.c | 5 -----
3 files changed, 19 insertions(+), 5 deletions(-)

--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -514,6 +514,19 @@ void b53_imp_vlan_setup(struct dsa_switc
}
EXPORT_SYMBOL(b53_imp_vlan_setup);

+static void b53_port_set_learning(struct b53_device *dev, int port,
+ bool learning)
+{
+ u16 reg;
+
+ b53_read16(dev, B53_CTRL_PAGE, B53_DIS_LEARNING, &reg);
+ if (learning)
+ reg &= ~BIT(port);
+ else
+ reg |= BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_DIS_LEARNING, reg);
+}
+
int b53_enable_port(struct dsa_switch *ds, int port, struct phy_device *phy)
{
struct b53_device *dev = ds->priv;
@@ -527,6 +540,7 @@ int b53_enable_port(struct dsa_switch *d
cpu_port = ds->ports[port].cpu_dp->index;

b53_br_egress_floods(ds, port, true, true);
+ b53_port_set_learning(dev, port, false);

if (dev->ops->irq_enable)
ret = dev->ops->irq_enable(dev, port);
@@ -645,6 +659,7 @@ static void b53_enable_cpu_port(struct b
b53_brcm_hdr_setup(dev->ds, port);

b53_br_egress_floods(dev->ds, port, true, true);
+ b53_port_set_learning(dev, port, false);
}

static void b53_enable_mib(struct b53_device *dev)
@@ -1704,6 +1719,8 @@ int b53_br_join(struct dsa_switch *ds, i
b53_write16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), pvlan);
dev->ports[port].vlan_ctl_mask = pvlan;

+ b53_port_set_learning(dev, port, true);
+
return 0;
}
EXPORT_SYMBOL(b53_br_join);
@@ -1751,6 +1768,7 @@ void b53_br_leave(struct dsa_switch *ds,
vl->untag |= BIT(port) | BIT(cpu_port);
b53_set_vlan_entry(dev, pvid, vl);
}
+ b53_port_set_learning(dev, port, false);
}
EXPORT_SYMBOL(b53_br_leave);

--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -115,6 +115,7 @@
#define B53_UC_FLOOD_MASK 0x32
#define B53_MC_FLOOD_MASK 0x34
#define B53_IPMC_FLOOD_MASK 0x36
+#define B53_DIS_LEARNING 0x3c

/*
* Override Ports 0-7 State on devices with xMII interfaces (8 bit)
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -172,11 +172,6 @@ static int bcm_sf2_port_setup(struct dsa
reg &= ~P_TXQ_PSM_VDD(port);
core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);

- /* Enable learning */
- reg = core_readl(priv, CORE_DIS_LEARN);
- reg &= ~BIT(port);
- core_writel(priv, reg, CORE_DIS_LEARN);
-
/* Enable Broadcom tags for that port if requested */
if (priv->brcm_tag_mask & BIT(port))
b53_brcm_hdr_setup(ds, port);


2021-03-19 12:23:39

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.4 03/18] bpf: Fix off-by-one for area size in creating mask to left

From: Piotr Krysiuk <[email protected]>

commit 10d2bb2e6b1d8c4576c56a748f697dbeb8388899 upstream.

retrieve_ptr_limit() computes the ptr_limit for registers with stack and
map_value type. ptr_limit is the size of the memory area that is still
valid / in-bounds from the current position, given the direction of the
operation (add / sub). This size is later used for masking the operation
such that attempting out-of-bounds access in the speculative domain is
redirected to remain within the bounds of the current map value.

When masking to the right the size is correct; when masking to the left,
however, the size is off by one, which leads to an incorrect mask and
thus an incorrect arithmetic operation in the non-speculative domain.
Piotr found that if the resulting alu_limit value is zero, the
BPF_MOV32_IMM() from the fixup_bpf_calls() rewrite ends up loading
0xffffffff into AX instead of sign-extending to the full 64-bit range,
and as a result this allows speculatively executing out-of-bounds loads
against a 4GB window of address space and thus extracting the contents
of kernel memory via a side channel.
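
As a minimal illustration of the off-by-one (plain C, not the verifier
code): for a pointer at byte index idx inside an area of size bytes, the
number of bytes still in bounds when moving towards lower addresses is
idx + 1, because the byte at idx itself counts:

/* Conceptual sketch only -- not the verifier implementation. */
static unsigned long left_limit(unsigned long idx)
{
	/* bytes idx, idx - 1, ..., 0 are still in bounds */
	return idx + 1;
}

static unsigned long right_limit(unsigned long idx, unsigned long size)
{
	/* bytes idx, idx + 1, ..., size - 1 are still in bounds */
	return size - idx;
}

With the "+ 1" missing on the left side, a pointer at the lowest
in-bounds byte of the area yields an alu_limit of zero, which is exactly
the case described above.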

Fixes: 979d63d50c0c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Piotr Krysiuk <[email protected]>
Co-developed-by: Daniel Borkmann <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/bpf/verifier.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4277,13 +4277,13 @@ static int retrieve_ptr_limit(const stru
*/
off = ptr_reg->off + ptr_reg->var_off.value;
if (mask_to_left)
- *ptr_limit = MAX_BPF_STACK + off;
+ *ptr_limit = MAX_BPF_STACK + off + 1;
else
*ptr_limit = -off;
return 0;
case PTR_TO_MAP_VALUE:
if (mask_to_left) {
- *ptr_limit = ptr_reg->umax_value + ptr_reg->off;
+ *ptr_limit = ptr_reg->umax_value + ptr_reg->off + 1;
} else {
off = ptr_reg->smin_value + ptr_reg->off;
*ptr_limit = ptr_reg->map_ptr->value_size - off;


2021-03-19 12:23:49

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.4 14/18] crypto: x86 - Regularize glue function prototypes

From: Kees Cook <[email protected]>

commit 9c1e8836edbbaf3656bc07437b59c04be034ac4e upstream.

The crypto glue performed function prototype casting via macros to make
indirect calls to assembly routines. Instead of performing casts at the
call sites (which trips Control Flow Integrity prototype checking), switch
each prototype to a common standard set of arguments, which allows the
existing macros to be removed. To keep pointer math unchanged, internal
casting between u128 pointers and u8 pointers is added.

Co-developed-by: João Moreira <[email protected]>
Signed-off-by: João Moreira <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/crypto/aesni-intel_asm.S | 8 +-
arch/x86/crypto/aesni-intel_glue.c | 45 ++++++----------
arch/x86/crypto/camellia_aesni_avx2_glue.c | 78 +++++++++++++---------------
arch/x86/crypto/camellia_aesni_avx_glue.c | 72 +++++++++++---------------
arch/x86/crypto/camellia_glue.c | 45 ++++++++--------
arch/x86/crypto/cast6_avx_glue.c | 68 +++++++++++-------------
arch/x86/crypto/glue_helper.c | 23 +++++---
arch/x86/crypto/serpent_avx2_glue.c | 65 +++++++++++------------
arch/x86/crypto/serpent_avx_glue.c | 63 +++++++++++------------
arch/x86/crypto/serpent_sse2_glue.c | 30 ++++++-----
arch/x86/crypto/twofish_avx_glue.c | 79 ++++++++++++-----------------
arch/x86/crypto/twofish_glue_3way.c | 37 +++++++------
arch/x86/include/asm/crypto/camellia.h | 61 ++++++++++------------
arch/x86/include/asm/crypto/glue_helper.h | 18 ++----
arch/x86/include/asm/crypto/serpent-avx.h | 20 +++----
arch/x86/include/asm/crypto/serpent-sse2.h | 28 ++++------
arch/x86/include/asm/crypto/twofish.h | 19 ++----
crypto/cast6_generic.c | 18 +++---
crypto/serpent_generic.c | 6 +-
include/crypto/cast6.h | 4 -
include/crypto/serpent.h | 4 -
include/crypto/xts.h | 2
22 files changed, 377 insertions(+), 416 deletions(-)

--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -1946,7 +1946,7 @@ ENTRY(aesni_set_key)
ENDPROC(aesni_set_key)

/*
- * void aesni_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
+ * void aesni_enc(const void *ctx, u8 *dst, const u8 *src)
*/
ENTRY(aesni_enc)
FRAME_BEGIN
@@ -2137,7 +2137,7 @@ _aesni_enc4:
ENDPROC(_aesni_enc4)

/*
- * void aesni_dec (struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
+ * void aesni_dec (const void *ctx, u8 *dst, const u8 *src)
*/
ENTRY(aesni_dec)
FRAME_BEGIN
@@ -2726,8 +2726,8 @@ ENDPROC(aesni_ctr_enc)
pxor CTR, IV;

/*
- * void aesni_xts_crypt8(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
- * bool enc, u8 *iv)
+ * void aesni_xts_crypt8(const struct crypto_aes_ctx *ctx, u8 *dst,
+ * const u8 *src, bool enc, le128 *iv)
*/
ENTRY(aesni_xts_crypt8)
FRAME_BEGIN
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -83,10 +83,8 @@ struct gcm_context_data {

asmlinkage int aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
unsigned int key_len);
-asmlinkage void aesni_enc(struct crypto_aes_ctx *ctx, u8 *out,
- const u8 *in);
-asmlinkage void aesni_dec(struct crypto_aes_ctx *ctx, u8 *out,
- const u8 *in);
+asmlinkage void aesni_enc(const void *ctx, u8 *out, const u8 *in);
+asmlinkage void aesni_dec(const void *ctx, u8 *out, const u8 *in);
asmlinkage void aesni_ecb_enc(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len);
asmlinkage void aesni_ecb_dec(struct crypto_aes_ctx *ctx, u8 *out,
@@ -106,8 +104,8 @@ static void (*aesni_ctr_enc_tfm)(struct
asmlinkage void aesni_ctr_enc(struct crypto_aes_ctx *ctx, u8 *out,
const u8 *in, unsigned int len, u8 *iv);

-asmlinkage void aesni_xts_crypt8(struct crypto_aes_ctx *ctx, u8 *out,
- const u8 *in, bool enc, u8 *iv);
+asmlinkage void aesni_xts_crypt8(const struct crypto_aes_ctx *ctx, u8 *out,
+ const u8 *in, bool enc, le128 *iv);

/* asmlinkage void aesni_gcm_enc()
* void *ctx, AES Key schedule. Starts on a 16 byte boundary.
@@ -550,29 +548,24 @@ static int xts_aesni_setkey(struct crypt
}


-static void aesni_xts_tweak(void *ctx, u8 *out, const u8 *in)
+static void aesni_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- aesni_enc(ctx, out, in);
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, aesni_enc);
}

-static void aesni_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void aesni_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv, GLUE_FUNC_CAST(aesni_enc));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, aesni_dec);
}

-static void aesni_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void aesni_xts_enc8(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv, GLUE_FUNC_CAST(aesni_dec));
+ aesni_xts_crypt8(ctx, dst, src, true, iv);
}

-static void aesni_xts_enc8(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void aesni_xts_dec8(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- aesni_xts_crypt8(ctx, (u8 *)dst, (const u8 *)src, true, (u8 *)iv);
-}
-
-static void aesni_xts_dec8(void *ctx, u128 *dst, const u128 *src, le128 *iv)
-{
- aesni_xts_crypt8(ctx, (u8 *)dst, (const u8 *)src, false, (u8 *)iv);
+ aesni_xts_crypt8(ctx, dst, src, false, iv);
}

static const struct common_glue_ctx aesni_enc_xts = {
@@ -581,10 +574,10 @@ static const struct common_glue_ctx aesn

.funcs = { {
.num_blocks = 8,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(aesni_xts_enc8) }
+ .fn_u = { .xts = aesni_xts_enc8 }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(aesni_xts_enc) }
+ .fn_u = { .xts = aesni_xts_enc }
} }
};

@@ -594,10 +587,10 @@ static const struct common_glue_ctx aesn

.funcs = { {
.num_blocks = 8,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(aesni_xts_dec8) }
+ .fn_u = { .xts = aesni_xts_dec8 }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(aesni_xts_dec) }
+ .fn_u = { .xts = aesni_xts_dec }
} }
};

@@ -606,8 +599,7 @@ static int xts_encrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&aesni_enc_xts, req,
- XTS_TWEAK_CAST(aesni_xts_tweak),
+ return glue_xts_req_128bit(&aesni_enc_xts, req, aesni_enc,
aes_ctx(ctx->raw_tweak_ctx),
aes_ctx(ctx->raw_crypt_ctx),
false);
@@ -618,8 +610,7 @@ static int xts_decrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesni_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&aesni_dec_xts, req,
- XTS_TWEAK_CAST(aesni_xts_tweak),
+ return glue_xts_req_128bit(&aesni_dec_xts, req, aesni_enc,
aes_ctx(ctx->raw_tweak_ctx),
aes_ctx(ctx->raw_crypt_ctx),
true);
--- a/arch/x86/crypto/camellia_aesni_avx2_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx2_glue.c
@@ -19,20 +19,17 @@
#define CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS 32

/* 32-way AVX2/AES-NI parallel cipher functions */
-asmlinkage void camellia_ecb_enc_32way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void camellia_ecb_dec_32way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
-
-asmlinkage void camellia_cbc_dec_32way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void camellia_ctr_32way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-
-asmlinkage void camellia_xts_enc_32way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-asmlinkage void camellia_xts_dec_32way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void camellia_ecb_enc_32way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void camellia_ecb_dec_32way(const void *ctx, u8 *dst, const u8 *src);
+
+asmlinkage void camellia_cbc_dec_32way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void camellia_ctr_32way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+
+asmlinkage void camellia_xts_enc_32way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+asmlinkage void camellia_xts_dec_32way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);

static const struct common_glue_ctx camellia_enc = {
.num_funcs = 4,
@@ -40,16 +37,16 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_ecb_enc_32way) }
+ .fn_u = { .ecb = camellia_ecb_enc_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_ecb_enc_16way) }
+ .fn_u = { .ecb = camellia_ecb_enc_16way }
}, {
.num_blocks = 2,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_enc_blk_2way) }
+ .fn_u = { .ecb = camellia_enc_blk_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_enc_blk) }
+ .fn_u = { .ecb = camellia_enc_blk }
} }
};

@@ -59,16 +56,16 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_ctr_32way) }
+ .fn_u = { .ctr = camellia_ctr_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_ctr_16way) }
+ .fn_u = { .ctr = camellia_ctr_16way }
}, {
.num_blocks = 2,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_crypt_ctr_2way) }
+ .fn_u = { .ctr = camellia_crypt_ctr_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_crypt_ctr) }
+ .fn_u = { .ctr = camellia_crypt_ctr }
} }
};

@@ -78,13 +75,13 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_enc_32way) }
+ .fn_u = { .xts = camellia_xts_enc_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_enc_16way) }
+ .fn_u = { .xts = camellia_xts_enc_16way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_enc) }
+ .fn_u = { .xts = camellia_xts_enc }
} }
};

@@ -94,16 +91,16 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_ecb_dec_32way) }
+ .fn_u = { .ecb = camellia_ecb_dec_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_ecb_dec_16way) }
+ .fn_u = { .ecb = camellia_ecb_dec_16way }
}, {
.num_blocks = 2,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_dec_blk_2way) }
+ .fn_u = { .ecb = camellia_dec_blk_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_dec_blk) }
+ .fn_u = { .ecb = camellia_dec_blk }
} }
};

@@ -113,16 +110,16 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_cbc_dec_32way) }
+ .fn_u = { .cbc = camellia_cbc_dec_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_cbc_dec_16way) }
+ .fn_u = { .cbc = camellia_cbc_dec_16way }
}, {
.num_blocks = 2,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_decrypt_cbc_2way) }
+ .fn_u = { .cbc = camellia_decrypt_cbc_2way }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_dec_blk) }
+ .fn_u = { .cbc = camellia_dec_blk }
} }
};

@@ -132,13 +129,13 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_AVX2_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_dec_32way) }
+ .fn_u = { .xts = camellia_xts_dec_32way }
}, {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_dec_16way) }
+ .fn_u = { .xts = camellia_xts_dec_16way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_dec) }
+ .fn_u = { .xts = camellia_xts_dec }
} }
};

@@ -161,8 +158,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(camellia_enc_blk),
- req);
+ return glue_cbc_encrypt_req_128bit(camellia_enc_blk, req);
}

static int cbc_decrypt(struct skcipher_request *req)
@@ -180,8 +176,7 @@ static int xts_encrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&camellia_enc_xts, req,
- XTS_TWEAK_CAST(camellia_enc_blk),
+ return glue_xts_req_128bit(&camellia_enc_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}

@@ -190,8 +185,7 @@ static int xts_decrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&camellia_dec_xts, req,
- XTS_TWEAK_CAST(camellia_enc_blk),
+ return glue_xts_req_128bit(&camellia_dec_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}

--- a/arch/x86/crypto/camellia_aesni_avx_glue.c
+++ b/arch/x86/crypto/camellia_aesni_avx_glue.c
@@ -18,41 +18,36 @@
#define CAMELLIA_AESNI_PARALLEL_BLOCKS 16

/* 16-way parallel cipher functions (avx/aes-ni) */
-asmlinkage void camellia_ecb_enc_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void camellia_ecb_enc_16way(const void *ctx, u8 *dst, const u8 *src);
EXPORT_SYMBOL_GPL(camellia_ecb_enc_16way);

-asmlinkage void camellia_ecb_dec_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void camellia_ecb_dec_16way(const void *ctx, u8 *dst, const u8 *src);
EXPORT_SYMBOL_GPL(camellia_ecb_dec_16way);

-asmlinkage void camellia_cbc_dec_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void camellia_cbc_dec_16way(const void *ctx, u8 *dst, const u8 *src);
EXPORT_SYMBOL_GPL(camellia_cbc_dec_16way);

-asmlinkage void camellia_ctr_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void camellia_ctr_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
EXPORT_SYMBOL_GPL(camellia_ctr_16way);

-asmlinkage void camellia_xts_enc_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void camellia_xts_enc_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
EXPORT_SYMBOL_GPL(camellia_xts_enc_16way);

-asmlinkage void camellia_xts_dec_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void camellia_xts_dec_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
EXPORT_SYMBOL_GPL(camellia_xts_dec_16way);

-void camellia_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void camellia_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(camellia_enc_blk));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, camellia_enc_blk);
}
EXPORT_SYMBOL_GPL(camellia_xts_enc);

-void camellia_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void camellia_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(camellia_dec_blk));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, camellia_dec_blk);
}
EXPORT_SYMBOL_GPL(camellia_xts_dec);

@@ -62,13 +57,13 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_ecb_enc_16way) }
+ .fn_u = { .ecb = camellia_ecb_enc_16way }
}, {
.num_blocks = 2,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_enc_blk_2way) }
+ .fn_u = { .ecb = camellia_enc_blk_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_enc_blk) }
+ .fn_u = { .ecb = camellia_enc_blk }
} }
};

@@ -78,13 +73,13 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_ctr_16way) }
+ .fn_u = { .ctr = camellia_ctr_16way }
}, {
.num_blocks = 2,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_crypt_ctr_2way) }
+ .fn_u = { .ctr = camellia_crypt_ctr_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_crypt_ctr) }
+ .fn_u = { .ctr = camellia_crypt_ctr }
} }
};

@@ -94,10 +89,10 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_enc_16way) }
+ .fn_u = { .xts = camellia_xts_enc_16way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_enc) }
+ .fn_u = { .xts = camellia_xts_enc }
} }
};

@@ -107,13 +102,13 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_ecb_dec_16way) }
+ .fn_u = { .ecb = camellia_ecb_dec_16way }
}, {
.num_blocks = 2,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_dec_blk_2way) }
+ .fn_u = { .ecb = camellia_dec_blk_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_dec_blk) }
+ .fn_u = { .ecb = camellia_dec_blk }
} }
};

@@ -123,13 +118,13 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_cbc_dec_16way) }
+ .fn_u = { .cbc = camellia_cbc_dec_16way }
}, {
.num_blocks = 2,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_decrypt_cbc_2way) }
+ .fn_u = { .cbc = camellia_decrypt_cbc_2way }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_dec_blk) }
+ .fn_u = { .cbc = camellia_dec_blk }
} }
};

@@ -139,10 +134,10 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = CAMELLIA_AESNI_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_dec_16way) }
+ .fn_u = { .xts = camellia_xts_dec_16way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(camellia_xts_dec) }
+ .fn_u = { .xts = camellia_xts_dec }
} }
};

@@ -165,8 +160,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(camellia_enc_blk),
- req);
+ return glue_cbc_encrypt_req_128bit(camellia_enc_blk, req);
}

static int cbc_decrypt(struct skcipher_request *req)
@@ -206,8 +200,7 @@ static int xts_encrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&camellia_enc_xts, req,
- XTS_TWEAK_CAST(camellia_enc_blk),
+ return glue_xts_req_128bit(&camellia_enc_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}

@@ -216,8 +209,7 @@ static int xts_decrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct camellia_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&camellia_dec_xts, req,
- XTS_TWEAK_CAST(camellia_enc_blk),
+ return glue_xts_req_128bit(&camellia_dec_xts, req, camellia_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}

--- a/arch/x86/crypto/camellia_glue.c
+++ b/arch/x86/crypto/camellia_glue.c
@@ -18,19 +18,17 @@
#include <asm/crypto/glue_helper.h>

/* regular block cipher functions */
-asmlinkage void __camellia_enc_blk(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, bool xor);
+asmlinkage void __camellia_enc_blk(const void *ctx, u8 *dst, const u8 *src,
+ bool xor);
EXPORT_SYMBOL_GPL(__camellia_enc_blk);
-asmlinkage void camellia_dec_blk(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void camellia_dec_blk(const void *ctx, u8 *dst, const u8 *src);
EXPORT_SYMBOL_GPL(camellia_dec_blk);

/* 2-way parallel cipher functions */
-asmlinkage void __camellia_enc_blk_2way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, bool xor);
+asmlinkage void __camellia_enc_blk_2way(const void *ctx, u8 *dst, const u8 *src,
+ bool xor);
EXPORT_SYMBOL_GPL(__camellia_enc_blk_2way);
-asmlinkage void camellia_dec_blk_2way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void camellia_dec_blk_2way(const void *ctx, u8 *dst, const u8 *src);
EXPORT_SYMBOL_GPL(camellia_dec_blk_2way);

static void camellia_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
@@ -1267,8 +1265,10 @@ static int camellia_setkey_skcipher(stru
return camellia_setkey(&tfm->base, key, key_len);
}

-void camellia_decrypt_cbc_2way(void *ctx, u128 *dst, const u128 *src)
+void camellia_decrypt_cbc_2way(const void *ctx, u8 *d, const u8 *s)
{
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;
u128 iv = *src;

camellia_dec_blk_2way(ctx, (u8 *)dst, (u8 *)src);
@@ -1277,9 +1277,11 @@ void camellia_decrypt_cbc_2way(void *ctx
}
EXPORT_SYMBOL_GPL(camellia_decrypt_cbc_2way);

-void camellia_crypt_ctr(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void camellia_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

if (dst != src)
*dst = *src;
@@ -1291,9 +1293,11 @@ void camellia_crypt_ctr(void *ctx, u128
}
EXPORT_SYMBOL_GPL(camellia_crypt_ctr);

-void camellia_crypt_ctr_2way(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void camellia_crypt_ctr_2way(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblks[2];
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

if (dst != src) {
dst[0] = src[0];
@@ -1315,10 +1319,10 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = 2,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_enc_blk_2way) }
+ .fn_u = { .ecb = camellia_enc_blk_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_enc_blk) }
+ .fn_u = { .ecb = camellia_enc_blk }
} }
};

@@ -1328,10 +1332,10 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = 2,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_crypt_ctr_2way) }
+ .fn_u = { .ctr = camellia_crypt_ctr_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(camellia_crypt_ctr) }
+ .fn_u = { .ctr = camellia_crypt_ctr }
} }
};

@@ -1341,10 +1345,10 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = 2,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_dec_blk_2way) }
+ .fn_u = { .ecb = camellia_dec_blk_2way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(camellia_dec_blk) }
+ .fn_u = { .ecb = camellia_dec_blk }
} }
};

@@ -1354,10 +1358,10 @@ static const struct common_glue_ctx came

.funcs = { {
.num_blocks = 2,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_decrypt_cbc_2way) }
+ .fn_u = { .cbc = camellia_decrypt_cbc_2way }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(camellia_dec_blk) }
+ .fn_u = { .cbc = camellia_dec_blk }
} }
};

@@ -1373,8 +1377,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(camellia_enc_blk),
- req);
+ return glue_cbc_encrypt_req_128bit(camellia_enc_blk, req);
}

static int cbc_decrypt(struct skcipher_request *req)
--- a/arch/x86/crypto/cast6_avx_glue.c
+++ b/arch/x86/crypto/cast6_avx_glue.c
@@ -20,20 +20,17 @@

#define CAST6_PARALLEL_BLOCKS 8

-asmlinkage void cast6_ecb_enc_8way(struct cast6_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void cast6_ecb_dec_8way(struct cast6_ctx *ctx, u8 *dst,
- const u8 *src);
-
-asmlinkage void cast6_cbc_dec_8way(struct cast6_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void cast6_ctr_8way(struct cast6_ctx *ctx, u8 *dst, const u8 *src,
+asmlinkage void cast6_ecb_enc_8way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void cast6_ecb_dec_8way(const void *ctx, u8 *dst, const u8 *src);
+
+asmlinkage void cast6_cbc_dec_8way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void cast6_ctr_8way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);

-asmlinkage void cast6_xts_enc_8way(struct cast6_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-asmlinkage void cast6_xts_dec_8way(struct cast6_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void cast6_xts_enc_8way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+asmlinkage void cast6_xts_dec_8way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);

static int cast6_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
@@ -41,21 +38,21 @@ static int cast6_setkey_skcipher(struct
return cast6_setkey(&tfm->base, key, keylen);
}

-static void cast6_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void cast6_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(__cast6_encrypt));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, __cast6_encrypt);
}

-static void cast6_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void cast6_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(__cast6_decrypt));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, __cast6_decrypt);
}

-static void cast6_crypt_ctr(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void cast6_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

le128_to_be128(&ctrblk, iv);
le128_inc(iv);
@@ -70,10 +67,10 @@ static const struct common_glue_ctx cast

.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(cast6_ecb_enc_8way) }
+ .fn_u = { .ecb = cast6_ecb_enc_8way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__cast6_encrypt) }
+ .fn_u = { .ecb = __cast6_encrypt }
} }
};

@@ -83,10 +80,10 @@ static const struct common_glue_ctx cast

.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(cast6_ctr_8way) }
+ .fn_u = { .ctr = cast6_ctr_8way }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(cast6_crypt_ctr) }
+ .fn_u = { .ctr = cast6_crypt_ctr }
} }
};

@@ -96,10 +93,10 @@ static const struct common_glue_ctx cast

.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(cast6_xts_enc_8way) }
+ .fn_u = { .xts = cast6_xts_enc_8way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(cast6_xts_enc) }
+ .fn_u = { .xts = cast6_xts_enc }
} }
};

@@ -109,10 +106,10 @@ static const struct common_glue_ctx cast

.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(cast6_ecb_dec_8way) }
+ .fn_u = { .ecb = cast6_ecb_dec_8way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__cast6_decrypt) }
+ .fn_u = { .ecb = __cast6_decrypt }
} }
};

@@ -122,10 +119,10 @@ static const struct common_glue_ctx cast

.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(cast6_cbc_dec_8way) }
+ .fn_u = { .cbc = cast6_cbc_dec_8way }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(__cast6_decrypt) }
+ .fn_u = { .cbc = __cast6_decrypt }
} }
};

@@ -135,10 +132,10 @@ static const struct common_glue_ctx cast

.funcs = { {
.num_blocks = CAST6_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(cast6_xts_dec_8way) }
+ .fn_u = { .xts = cast6_xts_dec_8way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(cast6_xts_dec) }
+ .fn_u = { .xts = cast6_xts_dec }
} }
};

@@ -154,8 +151,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(__cast6_encrypt),
- req);
+ return glue_cbc_encrypt_req_128bit(__cast6_encrypt, req);
}

static int cbc_decrypt(struct skcipher_request *req)
@@ -199,8 +195,7 @@ static int xts_encrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&cast6_enc_xts, req,
- XTS_TWEAK_CAST(__cast6_encrypt),
+ return glue_xts_req_128bit(&cast6_enc_xts, req, __cast6_encrypt,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}

@@ -209,8 +204,7 @@ static int xts_decrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct cast6_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&cast6_dec_xts, req,
- XTS_TWEAK_CAST(__cast6_encrypt),
+ return glue_xts_req_128bit(&cast6_dec_xts, req, __cast6_encrypt,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}

--- a/arch/x86/crypto/glue_helper.c
+++ b/arch/x86/crypto/glue_helper.c
@@ -134,7 +134,8 @@ int glue_cbc_decrypt_req_128bit(const st
src -= num_blocks - 1;
dst -= num_blocks - 1;

- gctx->funcs[i].fn_u.cbc(ctx, dst, src);
+ gctx->funcs[i].fn_u.cbc(ctx, (u8 *)dst,
+ (const u8 *)src);

nbytes -= func_bytes;
if (nbytes < bsize)
@@ -188,7 +189,9 @@ int glue_ctr_req_128bit(const struct com

/* Process multi-block batch */
do {
- gctx->funcs[i].fn_u.ctr(ctx, dst, src, &ctrblk);
+ gctx->funcs[i].fn_u.ctr(ctx, (u8 *)dst,
+ (const u8 *)src,
+ &ctrblk);
src += num_blocks;
dst += num_blocks;
nbytes -= func_bytes;
@@ -210,7 +213,8 @@ int glue_ctr_req_128bit(const struct com

be128_to_le128(&ctrblk, (be128 *)walk.iv);
memcpy(&tmp, walk.src.virt.addr, nbytes);
- gctx->funcs[gctx->num_funcs - 1].fn_u.ctr(ctx, &tmp, &tmp,
+ gctx->funcs[gctx->num_funcs - 1].fn_u.ctr(ctx, (u8 *)&tmp,
+ (const u8 *)&tmp,
&ctrblk);
memcpy(walk.dst.virt.addr, &tmp, nbytes);
le128_to_be128((be128 *)walk.iv, &ctrblk);
@@ -240,7 +244,8 @@ static unsigned int __glue_xts_req_128bi

if (nbytes >= func_bytes) {
do {
- gctx->funcs[i].fn_u.xts(ctx, dst, src,
+ gctx->funcs[i].fn_u.xts(ctx, (u8 *)dst,
+ (const u8 *)src,
walk->iv);

src += num_blocks;
@@ -354,8 +359,8 @@ out:
}
EXPORT_SYMBOL_GPL(glue_xts_req_128bit);

-void glue_xts_crypt_128bit_one(void *ctx, u128 *dst, const u128 *src, le128 *iv,
- common_glue_func_t fn)
+void glue_xts_crypt_128bit_one(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv, common_glue_func_t fn)
{
le128 ivblk = *iv;

@@ -363,13 +368,13 @@ void glue_xts_crypt_128bit_one(void *ctx
gf128mul_x_ble(iv, &ivblk);

/* CC <- T xor C */
- u128_xor(dst, src, (u128 *)&ivblk);
+ u128_xor((u128 *)dst, (const u128 *)src, (u128 *)&ivblk);

/* PP <- D(Key2,CC) */
- fn(ctx, (u8 *)dst, (u8 *)dst);
+ fn(ctx, dst, dst);

/* P <- T xor PP */
- u128_xor(dst, dst, (u128 *)&ivblk);
+ u128_xor((u128 *)dst, (u128 *)dst, (u128 *)&ivblk);
}
EXPORT_SYMBOL_GPL(glue_xts_crypt_128bit_one);

--- a/arch/x86/crypto/serpent_avx2_glue.c
+++ b/arch/x86/crypto/serpent_avx2_glue.c
@@ -19,18 +19,16 @@
#define SERPENT_AVX2_PARALLEL_BLOCKS 16

/* 16-way AVX2 parallel cipher functions */
-asmlinkage void serpent_ecb_enc_16way(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void serpent_ecb_dec_16way(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void serpent_cbc_dec_16way(void *ctx, u128 *dst, const u128 *src);
+asmlinkage void serpent_ecb_enc_16way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void serpent_ecb_dec_16way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void serpent_cbc_dec_16way(const void *ctx, u8 *dst, const u8 *src);

-asmlinkage void serpent_ctr_16way(void *ctx, u128 *dst, const u128 *src,
+asmlinkage void serpent_ctr_16way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
-asmlinkage void serpent_xts_enc_16way(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-asmlinkage void serpent_xts_dec_16way(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void serpent_xts_enc_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+asmlinkage void serpent_xts_dec_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);

static int serpent_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
@@ -44,13 +42,13 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = 16,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_ecb_enc_16way) }
+ .fn_u = { .ecb = serpent_ecb_enc_16way }
}, {
.num_blocks = 8,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_ecb_enc_8way_avx) }
+ .fn_u = { .ecb = serpent_ecb_enc_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__serpent_encrypt) }
+ .fn_u = { .ecb = __serpent_encrypt }
} }
};

@@ -60,13 +58,13 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = 16,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(serpent_ctr_16way) }
+ .fn_u = { .ctr = serpent_ctr_16way }
}, {
.num_blocks = 8,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(serpent_ctr_8way_avx) }
+ .fn_u = { .ctr = serpent_ctr_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(__serpent_crypt_ctr) }
+ .fn_u = { .ctr = __serpent_crypt_ctr }
} }
};

@@ -76,13 +74,13 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = 16,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_enc_16way) }
+ .fn_u = { .xts = serpent_xts_enc_16way }
}, {
.num_blocks = 8,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_enc_8way_avx) }
+ .fn_u = { .xts = serpent_xts_enc_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_enc) }
+ .fn_u = { .xts = serpent_xts_enc }
} }
};

@@ -92,13 +90,13 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = 16,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_ecb_dec_16way) }
+ .fn_u = { .ecb = serpent_ecb_dec_16way }
}, {
.num_blocks = 8,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_ecb_dec_8way_avx) }
+ .fn_u = { .ecb = serpent_ecb_dec_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__serpent_decrypt) }
+ .fn_u = { .ecb = __serpent_decrypt }
} }
};

@@ -108,13 +106,13 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = 16,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(serpent_cbc_dec_16way) }
+ .fn_u = { .cbc = serpent_cbc_dec_16way }
}, {
.num_blocks = 8,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(serpent_cbc_dec_8way_avx) }
+ .fn_u = { .cbc = serpent_cbc_dec_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(__serpent_decrypt) }
+ .fn_u = { .cbc = __serpent_decrypt }
} }
};

@@ -124,13 +122,13 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = 16,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_dec_16way) }
+ .fn_u = { .xts = serpent_xts_dec_16way }
}, {
.num_blocks = 8,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_dec_8way_avx) }
+ .fn_u = { .xts = serpent_xts_dec_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_dec) }
+ .fn_u = { .xts = serpent_xts_dec }
} }
};

@@ -146,8 +144,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(__serpent_encrypt),
- req);
+ return glue_cbc_encrypt_req_128bit(__serpent_encrypt, req);
}

static int cbc_decrypt(struct skcipher_request *req)
@@ -166,8 +163,8 @@ static int xts_encrypt(struct skcipher_r
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

return glue_xts_req_128bit(&serpent_enc_xts, req,
- XTS_TWEAK_CAST(__serpent_encrypt),
- &ctx->tweak_ctx, &ctx->crypt_ctx, false);
+ __serpent_encrypt, &ctx->tweak_ctx,
+ &ctx->crypt_ctx, false);
}

static int xts_decrypt(struct skcipher_request *req)
@@ -176,8 +173,8 @@ static int xts_decrypt(struct skcipher_r
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

return glue_xts_req_128bit(&serpent_dec_xts, req,
- XTS_TWEAK_CAST(__serpent_encrypt),
- &ctx->tweak_ctx, &ctx->crypt_ctx, true);
+ __serpent_encrypt, &ctx->tweak_ctx,
+ &ctx->crypt_ctx, true);
}

static struct skcipher_alg serpent_algs[] = {
--- a/arch/x86/crypto/serpent_avx_glue.c
+++ b/arch/x86/crypto/serpent_avx_glue.c
@@ -20,33 +20,35 @@
#include <asm/crypto/serpent-avx.h>

/* 8-way parallel cipher functions */
-asmlinkage void serpent_ecb_enc_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_ecb_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
EXPORT_SYMBOL_GPL(serpent_ecb_enc_8way_avx);

-asmlinkage void serpent_ecb_dec_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_ecb_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
EXPORT_SYMBOL_GPL(serpent_ecb_dec_8way_avx);

-asmlinkage void serpent_cbc_dec_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_cbc_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
EXPORT_SYMBOL_GPL(serpent_cbc_dec_8way_avx);

-asmlinkage void serpent_ctr_8way_avx(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void serpent_ctr_8way_avx(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
EXPORT_SYMBOL_GPL(serpent_ctr_8way_avx);

-asmlinkage void serpent_xts_enc_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_xts_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
EXPORT_SYMBOL_GPL(serpent_xts_enc_8way_avx);

-asmlinkage void serpent_xts_dec_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_xts_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
EXPORT_SYMBOL_GPL(serpent_xts_dec_8way_avx);

-void __serpent_crypt_ctr(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void __serpent_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

le128_to_be128(&ctrblk, iv);
le128_inc(iv);
@@ -56,17 +58,15 @@ void __serpent_crypt_ctr(void *ctx, u128
}
EXPORT_SYMBOL_GPL(__serpent_crypt_ctr);

-void serpent_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void serpent_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(__serpent_encrypt));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, __serpent_encrypt);
}
EXPORT_SYMBOL_GPL(serpent_xts_enc);

-void serpent_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void serpent_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(__serpent_decrypt));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, __serpent_decrypt);
}
EXPORT_SYMBOL_GPL(serpent_xts_dec);

@@ -102,10 +102,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_ecb_enc_8way_avx) }
+ .fn_u = { .ecb = serpent_ecb_enc_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__serpent_encrypt) }
+ .fn_u = { .ecb = __serpent_encrypt }
} }
};

@@ -115,10 +115,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(serpent_ctr_8way_avx) }
+ .fn_u = { .ctr = serpent_ctr_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(__serpent_crypt_ctr) }
+ .fn_u = { .ctr = __serpent_crypt_ctr }
} }
};

@@ -128,10 +128,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_enc_8way_avx) }
+ .fn_u = { .xts = serpent_xts_enc_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_enc) }
+ .fn_u = { .xts = serpent_xts_enc }
} }
};

@@ -141,10 +141,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_ecb_dec_8way_avx) }
+ .fn_u = { .ecb = serpent_ecb_dec_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__serpent_decrypt) }
+ .fn_u = { .ecb = __serpent_decrypt }
} }
};

@@ -154,10 +154,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(serpent_cbc_dec_8way_avx) }
+ .fn_u = { .cbc = serpent_cbc_dec_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(__serpent_decrypt) }
+ .fn_u = { .cbc = __serpent_decrypt }
} }
};

@@ -167,10 +167,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_dec_8way_avx) }
+ .fn_u = { .xts = serpent_xts_dec_8way_avx }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(serpent_xts_dec) }
+ .fn_u = { .xts = serpent_xts_dec }
} }
};

@@ -186,8 +186,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(__serpent_encrypt),
- req);
+ return glue_cbc_encrypt_req_128bit(__serpent_encrypt, req);
}

static int cbc_decrypt(struct skcipher_request *req)
@@ -206,8 +205,8 @@ static int xts_encrypt(struct skcipher_r
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

return glue_xts_req_128bit(&serpent_enc_xts, req,
- XTS_TWEAK_CAST(__serpent_encrypt),
- &ctx->tweak_ctx, &ctx->crypt_ctx, false);
+ __serpent_encrypt, &ctx->tweak_ctx,
+ &ctx->crypt_ctx, false);
}

static int xts_decrypt(struct skcipher_request *req)
@@ -216,8 +215,8 @@ static int xts_decrypt(struct skcipher_r
struct serpent_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

return glue_xts_req_128bit(&serpent_dec_xts, req,
- XTS_TWEAK_CAST(__serpent_encrypt),
- &ctx->tweak_ctx, &ctx->crypt_ctx, true);
+ __serpent_encrypt, &ctx->tweak_ctx,
+ &ctx->crypt_ctx, true);
}

static struct skcipher_alg serpent_algs[] = {
--- a/arch/x86/crypto/serpent_sse2_glue.c
+++ b/arch/x86/crypto/serpent_sse2_glue.c
@@ -31,9 +31,11 @@ static int serpent_setkey_skcipher(struc
return __serpent_setkey(crypto_skcipher_ctx(tfm), key, keylen);
}

-static void serpent_decrypt_cbc_xway(void *ctx, u128 *dst, const u128 *src)
+static void serpent_decrypt_cbc_xway(const void *ctx, u8 *d, const u8 *s)
{
u128 ivs[SERPENT_PARALLEL_BLOCKS - 1];
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;
unsigned int j;

for (j = 0; j < SERPENT_PARALLEL_BLOCKS - 1; j++)
@@ -45,9 +47,11 @@ static void serpent_decrypt_cbc_xway(voi
u128_xor(dst + (j + 1), dst + (j + 1), ivs + j);
}

-static void serpent_crypt_ctr(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void serpent_crypt_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

le128_to_be128(&ctrblk, iv);
le128_inc(iv);
@@ -56,10 +60,12 @@ static void serpent_crypt_ctr(void *ctx,
u128_xor(dst, src, (u128 *)&ctrblk);
}

-static void serpent_crypt_ctr_xway(void *ctx, u128 *dst, const u128 *src,
+static void serpent_crypt_ctr_xway(const void *ctx, u8 *d, const u8 *s,
le128 *iv)
{
be128 ctrblks[SERPENT_PARALLEL_BLOCKS];
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;
unsigned int i;

for (i = 0; i < SERPENT_PARALLEL_BLOCKS; i++) {
@@ -79,10 +85,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_enc_blk_xway) }
+ .fn_u = { .ecb = serpent_enc_blk_xway }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__serpent_encrypt) }
+ .fn_u = { .ecb = __serpent_encrypt }
} }
};

@@ -92,10 +98,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(serpent_crypt_ctr_xway) }
+ .fn_u = { .ctr = serpent_crypt_ctr_xway }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(serpent_crypt_ctr) }
+ .fn_u = { .ctr = serpent_crypt_ctr }
} }
};

@@ -105,10 +111,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(serpent_dec_blk_xway) }
+ .fn_u = { .ecb = serpent_dec_blk_xway }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(__serpent_decrypt) }
+ .fn_u = { .ecb = __serpent_decrypt }
} }
};

@@ -118,10 +124,10 @@ static const struct common_glue_ctx serp

.funcs = { {
.num_blocks = SERPENT_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(serpent_decrypt_cbc_xway) }
+ .fn_u = { .cbc = serpent_decrypt_cbc_xway }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(__serpent_decrypt) }
+ .fn_u = { .cbc = __serpent_decrypt }
} }
};

@@ -137,7 +143,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(__serpent_encrypt),
+ return glue_cbc_encrypt_req_128bit(__serpent_encrypt,
req);
}

--- a/arch/x86/crypto/twofish_avx_glue.c
+++ b/arch/x86/crypto/twofish_avx_glue.c
@@ -22,20 +22,17 @@
#define TWOFISH_PARALLEL_BLOCKS 8

/* 8-way parallel cipher functions */
-asmlinkage void twofish_ecb_enc_8way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void twofish_ecb_dec_8way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src);
-
-asmlinkage void twofish_cbc_dec_8way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void twofish_ctr_8way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-
-asmlinkage void twofish_xts_enc_8way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-asmlinkage void twofish_xts_dec_8way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void twofish_ecb_enc_8way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void twofish_ecb_dec_8way(const void *ctx, u8 *dst, const u8 *src);
+
+asmlinkage void twofish_cbc_dec_8way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void twofish_ctr_8way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+
+asmlinkage void twofish_xts_enc_8way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+asmlinkage void twofish_xts_dec_8way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);

static int twofish_setkey_skcipher(struct crypto_skcipher *tfm,
const u8 *key, unsigned int keylen)
@@ -43,22 +40,19 @@ static int twofish_setkey_skcipher(struc
return twofish_setkey(&tfm->base, key, keylen);
}

-static inline void twofish_enc_blk_3way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void twofish_enc_blk_3way(const void *ctx, u8 *dst, const u8 *src)
{
__twofish_enc_blk_3way(ctx, dst, src, false);
}

-static void twofish_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void twofish_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(twofish_enc_blk));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, twofish_enc_blk);
}

-static void twofish_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+static void twofish_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv)
{
- glue_xts_crypt_128bit_one(ctx, dst, src, iv,
- GLUE_FUNC_CAST(twofish_dec_blk));
+ glue_xts_crypt_128bit_one(ctx, dst, src, iv, twofish_dec_blk);
}

struct twofish_xts_ctx {
@@ -93,13 +87,13 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_ecb_enc_8way) }
+ .fn_u = { .ecb = twofish_ecb_enc_8way }
}, {
.num_blocks = 3,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_enc_blk_3way) }
+ .fn_u = { .ecb = twofish_enc_blk_3way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_enc_blk) }
+ .fn_u = { .ecb = twofish_enc_blk }
} }
};

@@ -109,13 +103,13 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(twofish_ctr_8way) }
+ .fn_u = { .ctr = twofish_ctr_8way }
}, {
.num_blocks = 3,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(twofish_enc_blk_ctr_3way) }
+ .fn_u = { .ctr = twofish_enc_blk_ctr_3way }
}, {
.num_blocks = 1,
- .fn_u = { .ctr = GLUE_CTR_FUNC_CAST(twofish_enc_blk_ctr) }
+ .fn_u = { .ctr = twofish_enc_blk_ctr }
} }
};

@@ -125,10 +119,10 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(twofish_xts_enc_8way) }
+ .fn_u = { .xts = twofish_xts_enc_8way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(twofish_xts_enc) }
+ .fn_u = { .xts = twofish_xts_enc }
} }
};

@@ -138,13 +132,13 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_ecb_dec_8way) }
+ .fn_u = { .ecb = twofish_ecb_dec_8way }
}, {
.num_blocks = 3,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_dec_blk_3way) }
+ .fn_u = { .ecb = twofish_dec_blk_3way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_dec_blk) }
+ .fn_u = { .ecb = twofish_dec_blk }
} }
};

@@ -154,13 +148,13 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(twofish_cbc_dec_8way) }
+ .fn_u = { .cbc = twofish_cbc_dec_8way }
}, {
.num_blocks = 3,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(twofish_dec_blk_cbc_3way) }
+ .fn_u = { .cbc = twofish_dec_blk_cbc_3way }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(twofish_dec_blk) }
+ .fn_u = { .cbc = twofish_dec_blk }
} }
};

@@ -170,10 +164,10 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = TWOFISH_PARALLEL_BLOCKS,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(twofish_xts_dec_8way) }
+ .fn_u = { .xts = twofish_xts_dec_8way }
}, {
.num_blocks = 1,
- .fn_u = { .xts = GLUE_XTS_FUNC_CAST(twofish_xts_dec) }
+ .fn_u = { .xts = twofish_xts_dec }
} }
};

@@ -189,8 +183,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(twofish_enc_blk),
- req);
+ return glue_cbc_encrypt_req_128bit(twofish_enc_blk, req);
}

static int cbc_decrypt(struct skcipher_request *req)
@@ -208,8 +201,7 @@ static int xts_encrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct twofish_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&twofish_enc_xts, req,
- XTS_TWEAK_CAST(twofish_enc_blk),
+ return glue_xts_req_128bit(&twofish_enc_xts, req, twofish_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, false);
}

@@ -218,8 +210,7 @@ static int xts_decrypt(struct skcipher_r
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct twofish_xts_ctx *ctx = crypto_skcipher_ctx(tfm);

- return glue_xts_req_128bit(&twofish_dec_xts, req,
- XTS_TWEAK_CAST(twofish_enc_blk),
+ return glue_xts_req_128bit(&twofish_dec_xts, req, twofish_enc_blk,
&ctx->tweak_ctx, &ctx->crypt_ctx, true);
}

--- a/arch/x86/crypto/twofish_glue_3way.c
+++ b/arch/x86/crypto/twofish_glue_3way.c
@@ -25,21 +25,22 @@ static int twofish_setkey_skcipher(struc
return twofish_setkey(&tfm->base, key, keylen);
}

-static inline void twofish_enc_blk_3way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void twofish_enc_blk_3way(const void *ctx, u8 *dst, const u8 *src)
{
__twofish_enc_blk_3way(ctx, dst, src, false);
}

-static inline void twofish_enc_blk_xor_3way(struct twofish_ctx *ctx, u8 *dst,
+static inline void twofish_enc_blk_xor_3way(const void *ctx, u8 *dst,
const u8 *src)
{
__twofish_enc_blk_3way(ctx, dst, src, true);
}

-void twofish_dec_blk_cbc_3way(void *ctx, u128 *dst, const u128 *src)
+void twofish_dec_blk_cbc_3way(const void *ctx, u8 *d, const u8 *s)
{
u128 ivs[2];
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

ivs[0] = src[0];
ivs[1] = src[1];
@@ -51,9 +52,11 @@ void twofish_dec_blk_cbc_3way(void *ctx,
}
EXPORT_SYMBOL_GPL(twofish_dec_blk_cbc_3way);

-void twofish_enc_blk_ctr(void *ctx, u128 *dst, const u128 *src, le128 *iv)
+void twofish_enc_blk_ctr(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblk;
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

if (dst != src)
*dst = *src;
@@ -66,10 +69,11 @@ void twofish_enc_blk_ctr(void *ctx, u128
}
EXPORT_SYMBOL_GPL(twofish_enc_blk_ctr);

-void twofish_enc_blk_ctr_3way(void *ctx, u128 *dst, const u128 *src,
- le128 *iv)
+void twofish_enc_blk_ctr_3way(const void *ctx, u8 *d, const u8 *s, le128 *iv)
{
be128 ctrblks[3];
+ u128 *dst = (u128 *)d;
+ const u128 *src = (const u128 *)s;

if (dst != src) {
dst[0] = src[0];
@@ -94,10 +98,10 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = 3,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_enc_blk_3way) }
+ .fn_u = { .ecb = twofish_enc_blk_3way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_enc_blk) }
+ .fn_u = { .ecb = twofish_enc_blk }
} }
};

@@ -107,10 +111,10 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = 3,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_enc_blk_ctr_3way) }
+ .fn_u = { .ctr = twofish_enc_blk_ctr_3way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_enc_blk_ctr) }
+ .fn_u = { .ctr = twofish_enc_blk_ctr }
} }
};

@@ -120,10 +124,10 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = 3,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_dec_blk_3way) }
+ .fn_u = { .ecb = twofish_dec_blk_3way }
}, {
.num_blocks = 1,
- .fn_u = { .ecb = GLUE_FUNC_CAST(twofish_dec_blk) }
+ .fn_u = { .ecb = twofish_dec_blk }
} }
};

@@ -133,10 +137,10 @@ static const struct common_glue_ctx twof

.funcs = { {
.num_blocks = 3,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(twofish_dec_blk_cbc_3way) }
+ .fn_u = { .cbc = twofish_dec_blk_cbc_3way }
}, {
.num_blocks = 1,
- .fn_u = { .cbc = GLUE_CBC_FUNC_CAST(twofish_dec_blk) }
+ .fn_u = { .cbc = twofish_dec_blk }
} }
};

@@ -152,8 +156,7 @@ static int ecb_decrypt(struct skcipher_r

static int cbc_encrypt(struct skcipher_request *req)
{
- return glue_cbc_encrypt_req_128bit(GLUE_FUNC_CAST(twofish_enc_blk),
- req);
+ return glue_cbc_encrypt_req_128bit(twofish_enc_blk, req);
}

static int cbc_decrypt(struct skcipher_request *req)
--- a/arch/x86/include/asm/crypto/camellia.h
+++ b/arch/x86/include/asm/crypto/camellia.h
@@ -32,65 +32,60 @@ extern int xts_camellia_setkey(struct cr
unsigned int keylen);

/* regular block cipher functions */
-asmlinkage void __camellia_enc_blk(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, bool xor);
-asmlinkage void camellia_dec_blk(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void __camellia_enc_blk(const void *ctx, u8 *dst, const u8 *src,
+ bool xor);
+asmlinkage void camellia_dec_blk(const void *ctx, u8 *dst, const u8 *src);

/* 2-way parallel cipher functions */
-asmlinkage void __camellia_enc_blk_2way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, bool xor);
-asmlinkage void camellia_dec_blk_2way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void __camellia_enc_blk_2way(const void *ctx, u8 *dst, const u8 *src,
+ bool xor);
+asmlinkage void camellia_dec_blk_2way(const void *ctx, u8 *dst, const u8 *src);

/* 16-way parallel cipher functions (avx/aes-ni) */
-asmlinkage void camellia_ecb_enc_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void camellia_ecb_dec_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
-
-asmlinkage void camellia_cbc_dec_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void camellia_ctr_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-
-asmlinkage void camellia_xts_enc_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
-asmlinkage void camellia_xts_dec_16way(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void camellia_ecb_enc_16way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void camellia_ecb_dec_16way(const void *ctx, u8 *dst, const u8 *src);

-static inline void camellia_enc_blk(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src)
+asmlinkage void camellia_cbc_dec_16way(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void camellia_ctr_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+
+asmlinkage void camellia_xts_enc_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+asmlinkage void camellia_xts_dec_16way(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+
+static inline void camellia_enc_blk(const void *ctx, u8 *dst, const u8 *src)
{
__camellia_enc_blk(ctx, dst, src, false);
}

-static inline void camellia_enc_blk_xor(struct camellia_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void camellia_enc_blk_xor(const void *ctx, u8 *dst, const u8 *src)
{
__camellia_enc_blk(ctx, dst, src, true);
}

-static inline void camellia_enc_blk_2way(struct camellia_ctx *ctx, u8 *dst,
+static inline void camellia_enc_blk_2way(const void *ctx, u8 *dst,
const u8 *src)
{
__camellia_enc_blk_2way(ctx, dst, src, false);
}

-static inline void camellia_enc_blk_xor_2way(struct camellia_ctx *ctx, u8 *dst,
+static inline void camellia_enc_blk_xor_2way(const void *ctx, u8 *dst,
const u8 *src)
{
__camellia_enc_blk_2way(ctx, dst, src, true);
}

/* glue helpers */
-extern void camellia_decrypt_cbc_2way(void *ctx, u128 *dst, const u128 *src);
-extern void camellia_crypt_ctr(void *ctx, u128 *dst, const u128 *src,
+extern void camellia_decrypt_cbc_2way(const void *ctx, u8 *dst, const u8 *src);
+extern void camellia_crypt_ctr(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
-extern void camellia_crypt_ctr_2way(void *ctx, u128 *dst, const u128 *src,
+extern void camellia_crypt_ctr_2way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);

-extern void camellia_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv);
-extern void camellia_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv);
+extern void camellia_xts_enc(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);
+extern void camellia_xts_dec(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);

#endif /* ASM_X86_CAMELLIA_H */
--- a/arch/x86/include/asm/crypto/glue_helper.h
+++ b/arch/x86/include/asm/crypto/glue_helper.h
@@ -11,18 +11,13 @@
#include <asm/fpu/api.h>
#include <crypto/b128ops.h>

-typedef void (*common_glue_func_t)(void *ctx, u8 *dst, const u8 *src);
-typedef void (*common_glue_cbc_func_t)(void *ctx, u128 *dst, const u128 *src);
-typedef void (*common_glue_ctr_func_t)(void *ctx, u128 *dst, const u128 *src,
+typedef void (*common_glue_func_t)(const void *ctx, u8 *dst, const u8 *src);
+typedef void (*common_glue_cbc_func_t)(const void *ctx, u8 *dst, const u8 *src);
+typedef void (*common_glue_ctr_func_t)(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
-typedef void (*common_glue_xts_func_t)(void *ctx, u128 *dst, const u128 *src,
+typedef void (*common_glue_xts_func_t)(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);

-#define GLUE_FUNC_CAST(fn) ((common_glue_func_t)(fn))
-#define GLUE_CBC_FUNC_CAST(fn) ((common_glue_cbc_func_t)(fn))
-#define GLUE_CTR_FUNC_CAST(fn) ((common_glue_ctr_func_t)(fn))
-#define GLUE_XTS_FUNC_CAST(fn) ((common_glue_xts_func_t)(fn))
-
struct common_glue_func_entry {
unsigned int num_blocks; /* number of blocks that @fn will process */
union {
@@ -116,7 +111,8 @@ extern int glue_xts_req_128bit(const str
common_glue_func_t tweak_fn, void *tweak_ctx,
void *crypt_ctx, bool decrypt);

-extern void glue_xts_crypt_128bit_one(void *ctx, u128 *dst, const u128 *src,
- le128 *iv, common_glue_func_t fn);
+extern void glue_xts_crypt_128bit_one(const void *ctx, u8 *dst,
+ const u8 *src, le128 *iv,
+ common_glue_func_t fn);

#endif /* _CRYPTO_GLUE_HELPER_H */
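For illustration, a minimal sketch (not part of the patch) of how a cipher plugs
into the regularized glue prototypes: the per-block wrapper now takes a
const void *ctx plus u8 pointers and recovers the concrete type internally, so
the common_glue_ctx table can hold the function pointer directly, with no
GLUE_FUNC_CAST. The mycipher_* names are hypothetical, and the .num_funcs and
.fpu_blocks_limit fields are assumed from the existing (elided) struct layout.

/* Hedged sketch; mycipher_ctx and __mycipher_encrypt are stand-ins. */
static void mycipher_enc_blk(const void *ctx, u8 *dst, const u8 *src)
{
        const struct mycipher_ctx *c = ctx;     /* concrete type recovered here */

        __mycipher_encrypt(c, dst, src);
}

static const struct common_glue_ctx mycipher_enc = {
        .num_funcs = 1,
        .fpu_blocks_limit = 1,
        .funcs = { {
                .num_blocks = 1,
                .fn_u = { .ecb = mycipher_enc_blk }     /* direct pointer, no cast */
        } }
};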
--- a/arch/x86/include/asm/crypto/serpent-avx.h
+++ b/arch/x86/include/asm/crypto/serpent-avx.h
@@ -15,26 +15,26 @@ struct serpent_xts_ctx {
struct serpent_ctx crypt_ctx;
};

-asmlinkage void serpent_ecb_enc_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_ecb_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
-asmlinkage void serpent_ecb_dec_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_ecb_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);

-asmlinkage void serpent_cbc_dec_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_cbc_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src);
-asmlinkage void serpent_ctr_8way_avx(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src, le128 *iv);
+asmlinkage void serpent_ctr_8way_avx(const void *ctx, u8 *dst, const u8 *src,
+ le128 *iv);

-asmlinkage void serpent_xts_enc_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_xts_enc_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);
-asmlinkage void serpent_xts_dec_8way_avx(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_xts_dec_8way_avx(const void *ctx, u8 *dst,
const u8 *src, le128 *iv);

-extern void __serpent_crypt_ctr(void *ctx, u128 *dst, const u128 *src,
+extern void __serpent_crypt_ctr(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);

-extern void serpent_xts_enc(void *ctx, u128 *dst, const u128 *src, le128 *iv);
-extern void serpent_xts_dec(void *ctx, u128 *dst, const u128 *src, le128 *iv);
+extern void serpent_xts_enc(const void *ctx, u8 *dst, const u8 *src, le128 *iv);
+extern void serpent_xts_dec(const void *ctx, u8 *dst, const u8 *src, le128 *iv);

extern int xts_serpent_setkey(struct crypto_skcipher *tfm, const u8 *key,
unsigned int keylen);
--- a/arch/x86/include/asm/crypto/serpent-sse2.h
+++ b/arch/x86/include/asm/crypto/serpent-sse2.h
@@ -9,25 +9,23 @@

#define SERPENT_PARALLEL_BLOCKS 4

-asmlinkage void __serpent_enc_blk_4way(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void __serpent_enc_blk_4way(const struct serpent_ctx *ctx, u8 *dst,
const u8 *src, bool xor);
-asmlinkage void serpent_dec_blk_4way(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_dec_blk_4way(const struct serpent_ctx *ctx, u8 *dst,
const u8 *src);

-static inline void serpent_enc_blk_xway(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void serpent_enc_blk_xway(const void *ctx, u8 *dst, const u8 *src)
{
__serpent_enc_blk_4way(ctx, dst, src, false);
}

-static inline void serpent_enc_blk_xway_xor(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void serpent_enc_blk_xway_xor(const struct serpent_ctx *ctx,
+ u8 *dst, const u8 *src)
{
__serpent_enc_blk_4way(ctx, dst, src, true);
}

-static inline void serpent_dec_blk_xway(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void serpent_dec_blk_xway(const void *ctx, u8 *dst, const u8 *src)
{
serpent_dec_blk_4way(ctx, dst, src);
}
@@ -36,25 +34,23 @@ static inline void serpent_dec_blk_xway(

#define SERPENT_PARALLEL_BLOCKS 8

-asmlinkage void __serpent_enc_blk_8way(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void __serpent_enc_blk_8way(const struct serpent_ctx *ctx, u8 *dst,
const u8 *src, bool xor);
-asmlinkage void serpent_dec_blk_8way(struct serpent_ctx *ctx, u8 *dst,
+asmlinkage void serpent_dec_blk_8way(const struct serpent_ctx *ctx, u8 *dst,
const u8 *src);

-static inline void serpent_enc_blk_xway(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void serpent_enc_blk_xway(const void *ctx, u8 *dst, const u8 *src)
{
__serpent_enc_blk_8way(ctx, dst, src, false);
}

-static inline void serpent_enc_blk_xway_xor(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void serpent_enc_blk_xway_xor(const struct serpent_ctx *ctx,
+ u8 *dst, const u8 *src)
{
__serpent_enc_blk_8way(ctx, dst, src, true);
}

-static inline void serpent_dec_blk_xway(struct serpent_ctx *ctx, u8 *dst,
- const u8 *src)
+static inline void serpent_dec_blk_xway(const void *ctx, u8 *dst, const u8 *src)
{
serpent_dec_blk_8way(ctx, dst, src);
}
--- a/arch/x86/include/asm/crypto/twofish.h
+++ b/arch/x86/include/asm/crypto/twofish.h
@@ -7,22 +7,19 @@
#include <crypto/b128ops.h>

/* regular block cipher functions from twofish_x86_64 module */
-asmlinkage void twofish_enc_blk(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src);
-asmlinkage void twofish_dec_blk(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void twofish_enc_blk(const void *ctx, u8 *dst, const u8 *src);
+asmlinkage void twofish_dec_blk(const void *ctx, u8 *dst, const u8 *src);

/* 3-way parallel cipher functions */
-asmlinkage void __twofish_enc_blk_3way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src, bool xor);
-asmlinkage void twofish_dec_blk_3way(struct twofish_ctx *ctx, u8 *dst,
- const u8 *src);
+asmlinkage void __twofish_enc_blk_3way(const void *ctx, u8 *dst, const u8 *src,
+ bool xor);
+asmlinkage void twofish_dec_blk_3way(const void *ctx, u8 *dst, const u8 *src);

/* helpers from twofish_x86_64-3way module */
-extern void twofish_dec_blk_cbc_3way(void *ctx, u128 *dst, const u128 *src);
-extern void twofish_enc_blk_ctr(void *ctx, u128 *dst, const u128 *src,
+extern void twofish_dec_blk_cbc_3way(const void *ctx, u8 *dst, const u8 *src);
+extern void twofish_enc_blk_ctr(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);
-extern void twofish_enc_blk_ctr_3way(void *ctx, u128 *dst, const u128 *src,
+extern void twofish_enc_blk_ctr_3way(const void *ctx, u8 *dst, const u8 *src,
le128 *iv);

#endif /* ASM_X86_TWOFISH_H */
--- a/crypto/cast6_generic.c
+++ b/crypto/cast6_generic.c
@@ -154,7 +154,7 @@ int cast6_setkey(struct crypto_tfm *tfm,
EXPORT_SYMBOL_GPL(cast6_setkey);

/*forward quad round*/
-static inline void Q(u32 *block, u8 *Kr, u32 *Km)
+static inline void Q(u32 *block, const u8 *Kr, const u32 *Km)
{
u32 I;
block[2] ^= F1(block[3], Kr[0], Km[0]);
@@ -164,7 +164,7 @@ static inline void Q(u32 *block, u8 *Kr,
}

/*reverse quad round*/
-static inline void QBAR(u32 *block, u8 *Kr, u32 *Km)
+static inline void QBAR(u32 *block, const u8 *Kr, const u32 *Km)
{
u32 I;
block[3] ^= F1(block[0], Kr[3], Km[3]);
@@ -173,13 +173,14 @@ static inline void QBAR(u32 *block, u8 *
block[2] ^= F1(block[3], Kr[0], Km[0]);
}

-void __cast6_encrypt(struct cast6_ctx *c, u8 *outbuf, const u8 *inbuf)
+void __cast6_encrypt(const void *ctx, u8 *outbuf, const u8 *inbuf)
{
+ const struct cast6_ctx *c = ctx;
const __be32 *src = (const __be32 *)inbuf;
__be32 *dst = (__be32 *)outbuf;
u32 block[4];
- u32 *Km;
- u8 *Kr;
+ const u32 *Km;
+ const u8 *Kr;

block[0] = be32_to_cpu(src[0]);
block[1] = be32_to_cpu(src[1]);
@@ -211,13 +212,14 @@ static void cast6_encrypt(struct crypto_
__cast6_encrypt(crypto_tfm_ctx(tfm), outbuf, inbuf);
}

-void __cast6_decrypt(struct cast6_ctx *c, u8 *outbuf, const u8 *inbuf)
+void __cast6_decrypt(const void *ctx, u8 *outbuf, const u8 *inbuf)
{
+ const struct cast6_ctx *c = ctx;
const __be32 *src = (const __be32 *)inbuf;
__be32 *dst = (__be32 *)outbuf;
u32 block[4];
- u32 *Km;
- u8 *Kr;
+ const u32 *Km;
+ const u8 *Kr;

block[0] = be32_to_cpu(src[0]);
block[1] = be32_to_cpu(src[1]);
--- a/crypto/serpent_generic.c
+++ b/crypto/serpent_generic.c
@@ -449,8 +449,9 @@ int serpent_setkey(struct crypto_tfm *tf
}
EXPORT_SYMBOL_GPL(serpent_setkey);

-void __serpent_encrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src)
+void __serpent_encrypt(const void *c, u8 *dst, const u8 *src)
{
+ const struct serpent_ctx *ctx = c;
const u32 *k = ctx->expkey;
const __le32 *s = (const __le32 *)src;
__le32 *d = (__le32 *)dst;
@@ -514,8 +515,9 @@ static void serpent_encrypt(struct crypt
__serpent_encrypt(ctx, dst, src);
}

-void __serpent_decrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src)
+void __serpent_decrypt(const void *c, u8 *dst, const u8 *src)
{
+ const struct serpent_ctx *ctx = c;
const u32 *k = ctx->expkey;
const __le32 *s = (const __le32 *)src;
__le32 *d = (__le32 *)dst;
--- a/include/crypto/cast6.h
+++ b/include/crypto/cast6.h
@@ -19,7 +19,7 @@ int __cast6_setkey(struct cast6_ctx *ctx
unsigned int keylen, u32 *flags);
int cast6_setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen);

-void __cast6_encrypt(struct cast6_ctx *ctx, u8 *dst, const u8 *src);
-void __cast6_decrypt(struct cast6_ctx *ctx, u8 *dst, const u8 *src);
+void __cast6_encrypt(const void *ctx, u8 *dst, const u8 *src);
+void __cast6_decrypt(const void *ctx, u8 *dst, const u8 *src);

#endif
--- a/include/crypto/serpent.h
+++ b/include/crypto/serpent.h
@@ -22,7 +22,7 @@ int __serpent_setkey(struct serpent_ctx
unsigned int keylen);
int serpent_setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen);

-void __serpent_encrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src);
-void __serpent_decrypt(struct serpent_ctx *ctx, u8 *dst, const u8 *src);
+void __serpent_encrypt(const void *ctx, u8 *dst, const u8 *src);
+void __serpent_decrypt(const void *ctx, u8 *dst, const u8 *src);

#endif
--- a/include/crypto/xts.h
+++ b/include/crypto/xts.h
@@ -8,8 +8,6 @@

#define XTS_BLOCK_SIZE 16

-#define XTS_TWEAK_CAST(x) ((void (*)(void *, u8*, const u8*))(x))
-
static inline int xts_check_key(struct crypto_tfm *tfm,
const u8 *key, unsigned int keylen)
{


2021-03-19 12:24:24

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.4 08/18] drm/i915/gvt: Set SNOOP for PAT3 on BXT/APL to workaround GPU BB hang

From: Colin Xu <[email protected]>

commit 8fe105679765700378eb328495fcfe1566cdbbd0 upstream

If the guest fills a non-privileged batch buffer on ApolloLake/Broxton as
Mesa i965 does in:
717e7539124d (i965: Use a WC map and memcpy for the batch instead of pwrite.)
then, due to the missing flush of the batch buffer filled by the VM vCPU,
the host GPU hangs on executing these MI_BATCH_BUFFER commands.

Temporarily work around this by setting the SNOOP bit for PAT3, which is
used by the PPGTT PML4 PTE: PAT(0) PCD(1) PWT(1).

Performance is still expected to be low and will need further improvement.

Acked-by: Zhenyu Wang <[email protected]>
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8fe105679765700378eb328495fcfe1566cdbbd0)
Signed-off-by: Colin Xu <[email protected]>
Cc: <[email protected]> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/gvt/handlers.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -1632,6 +1632,34 @@ static int edp_psr_imr_iir_write(struct
return 0;
}

+/**
+ * FixMe:
+ * If guest fills non-priv batch buffer on ApolloLake/Broxton as Mesa i965 did:
+ * 717e7539124d (i965: Use a WC map and memcpy for the batch instead of pwrite.)
+ * Due to the missing flush of bb filled by VM vCPU, host GPU hangs on executing
+ * these MI_BATCH_BUFFER.
+ * Temporarily workaround this by setting SNOOP bit for PAT3 used by PPGTT
+ * PML4 PTE: PAT(0) PCD(1) PWT(1).
+ * The performance is still expected to be low, will need further improvement.
+ */
+static int bxt_ppat_low_write(struct intel_vgpu *vgpu, unsigned int offset,
+ void *p_data, unsigned int bytes)
+{
+ u64 pat =
+ GEN8_PPAT(0, CHV_PPAT_SNOOP) |
+ GEN8_PPAT(1, 0) |
+ GEN8_PPAT(2, 0) |
+ GEN8_PPAT(3, CHV_PPAT_SNOOP) |
+ GEN8_PPAT(4, CHV_PPAT_SNOOP) |
+ GEN8_PPAT(5, CHV_PPAT_SNOOP) |
+ GEN8_PPAT(6, CHV_PPAT_SNOOP) |
+ GEN8_PPAT(7, CHV_PPAT_SNOOP);
+
+ vgpu_vreg(vgpu, offset) = lower_32_bits(pat);
+
+ return 0;
+}
+
static int mmio_read_from_hw(struct intel_vgpu *vgpu,
unsigned int offset, void *p_data, unsigned int bytes)
{
@@ -2778,7 +2806,7 @@ static int init_broadwell_mmio_info(stru

MMIO_DH(GEN6_PCODE_MAILBOX, D_BDW_PLUS, NULL, mailbox_write);

- MMIO_D(GEN8_PRIVATE_PAT_LO, D_BDW_PLUS);
+ MMIO_D(GEN8_PRIVATE_PAT_LO, D_BDW_PLUS & ~D_BXT);
MMIO_D(GEN8_PRIVATE_PAT_HI, D_BDW_PLUS);

MMIO_D(GAMTARBMODE, D_BDW_PLUS);
@@ -3281,6 +3309,8 @@ static int init_bxt_mmio_info(struct int

MMIO_DFH(GEN9_CTX_PREEMPT_REG, D_BXT, F_CMD_ACCESS, NULL, NULL);

+ MMIO_DH(GEN8_PRIVATE_PAT_LO, D_BXT, NULL, bxt_ppat_low_write);
+
return 0;
}
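As a side note on why only the _LO register needs a handler here: the PTE's
PAT/PCD/PWT bits form a 3-bit index into the PPAT table, and PAT(0) PCD(1)
PWT(1) selects entry 3, which sits in the lower 32 bits written above. A rough
sketch of that mapping (the index encoding is assumed from the generic PAT
scheme, not taken from this patch):

/* Rough sketch, not part of the patch: how a PTE's attribute bits pick a
 * PPAT entry.  Entries 0-3 occupy GEN8_PRIVATE_PAT_LO, 8 bits each, which
 * is why bxt_ppat_low_write() only needs to touch the low register.
 */
static inline unsigned int ppat_index(bool pat, bool pcd, bool pwt)
{
        return ((unsigned int)pat << 2) | ((unsigned int)pcd << 1) | pwt;
}

/* PAT(0) PCD(1) PWT(1) -> ppat_index(false, true, true) == 3 */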



2021-03-19 12:24:25

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.4 12/18] drm/i915/gvt: Fix vfio_edid issue for BXT/APL

From: Colin Xu <[email protected]>

commit 4ceb06e7c336f4a8d3f3b6ac9a4fea2e9c97dc07 upstream

BXT/APL has different ISR/IIR/HPD registers compared with other GEN9
platforms. If these register bits are not set correctly according to the
emulated monitor (currently a DP on PORT_B), gvt still triggers a virtual
HPD event, but the guest driver won't detect a valid HPD pulse, so no full
display detection is executed to read the updated EDID.

With this patch, vfio_edid is enabled again on BXT/APL, where it was
previously disabled.

Fixes: 642403e3599e ("drm/i915/gvt: Temporarily disable vfio_edid for BXT/APL")
Signed-off-by: Colin Xu <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Zhenyu Wang <[email protected]>
(cherry picked from commit 4ceb06e7c336f4a8d3f3b6ac9a4fea2e9c97dc07)
Signed-off-by: Colin Xu <[email protected]>
Cc: <[email protected]> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/gvt/display.c | 83 +++++++++++++++++++++++++++----------
drivers/gpu/drm/i915/gvt/vgpu.c | 2
2 files changed, 62 insertions(+), 23 deletions(-)

--- a/drivers/gpu/drm/i915/gvt/display.c
+++ b/drivers/gpu/drm/i915/gvt/display.c
@@ -215,6 +215,15 @@ static void emulate_monitor_status_chang
DDI_BUF_CTL_ENABLE);
vgpu_vreg_t(vgpu, DDI_BUF_CTL(port)) |= DDI_BUF_IS_IDLE;
}
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &=
+ ~(PORTA_HOTPLUG_ENABLE | PORTA_HOTPLUG_STATUS_MASK);
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &=
+ ~(PORTB_HOTPLUG_ENABLE | PORTB_HOTPLUG_STATUS_MASK);
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &=
+ ~(PORTC_HOTPLUG_ENABLE | PORTC_HOTPLUG_STATUS_MASK);
+ /* No hpd_invert set in vgpu vbt, need to clear invert mask */
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &= ~BXT_DDI_HPD_INVERT_MASK;
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HOTPLUG_MASK;

vgpu_vreg_t(vgpu, BXT_P_CR_GT_DISP_PWRON) &= ~(BIT(0) | BIT(1));
vgpu_vreg_t(vgpu, BXT_PORT_CL1CM_DW0(DPIO_PHY0)) &=
@@ -271,6 +280,8 @@ static void emulate_monitor_status_chang
vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_EDP)) |=
(TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
TRANS_DDI_FUNC_ENABLE);
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTA_HOTPLUG_ENABLE;
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
BXT_DE_PORT_HP_DDIA;
}
@@ -299,6 +310,8 @@ static void emulate_monitor_status_chang
(TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
(PORT_B << TRANS_DDI_PORT_SHIFT) |
TRANS_DDI_FUNC_ENABLE);
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTB_HOTPLUG_ENABLE;
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
BXT_DE_PORT_HP_DDIB;
}
@@ -327,6 +340,8 @@ static void emulate_monitor_status_chang
(TRANS_DDI_BPC_8 | TRANS_DDI_MODE_SELECT_DP_SST |
(PORT_B << TRANS_DDI_PORT_SHIFT) |
TRANS_DDI_FUNC_ENABLE);
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTC_HOTPLUG_ENABLE;
vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
BXT_DE_PORT_HP_DDIC;
}
@@ -652,38 +667,62 @@ void intel_vgpu_emulate_hotplug(struct i
PORTD_HOTPLUG_STATUS_MASK;
intel_vgpu_trigger_virtual_event(vgpu, DP_D_HOTPLUG);
} else if (IS_BROXTON(dev_priv)) {
- if (connected) {
- if (intel_vgpu_has_monitor_on_port(vgpu, PORT_A)) {
- vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= BXT_DE_PORT_HP_DDIA;
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_A)) {
+ if (connected) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
+ BXT_DE_PORT_HP_DDIA;
+ } else {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &=
+ ~BXT_DE_PORT_HP_DDIA;
}
- if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_IIR) |=
+ BXT_DE_PORT_HP_DDIA;
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &=
+ ~PORTA_HOTPLUG_STATUS_MASK;
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTA_HOTPLUG_LONG_DETECT;
+ intel_vgpu_trigger_virtual_event(vgpu, DP_A_HOTPLUG);
+ }
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ if (connected) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
+ BXT_DE_PORT_HP_DDIB;
vgpu_vreg_t(vgpu, SFUSE_STRAP) |=
SFUSE_STRAP_DDIB_DETECTED;
- vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= BXT_DE_PORT_HP_DDIB;
- }
- if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
- vgpu_vreg_t(vgpu, SFUSE_STRAP) |=
- SFUSE_STRAP_DDIC_DETECTED;
- vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= BXT_DE_PORT_HP_DDIC;
- }
- } else {
- if (intel_vgpu_has_monitor_on_port(vgpu, PORT_A)) {
- vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HP_DDIA;
- }
- if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ } else {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &=
+ ~BXT_DE_PORT_HP_DDIB;
vgpu_vreg_t(vgpu, SFUSE_STRAP) &=
~SFUSE_STRAP_DDIB_DETECTED;
- vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HP_DDIB;
}
- if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_IIR) |=
+ BXT_DE_PORT_HP_DDIB;
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &=
+ ~PORTB_HOTPLUG_STATUS_MASK;
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTB_HOTPLUG_LONG_DETECT;
+ intel_vgpu_trigger_virtual_event(vgpu, DP_B_HOTPLUG);
+ }
+ if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
+ if (connected) {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |=
+ BXT_DE_PORT_HP_DDIC;
+ vgpu_vreg_t(vgpu, SFUSE_STRAP) |=
+ SFUSE_STRAP_DDIC_DETECTED;
+ } else {
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &=
+ ~BXT_DE_PORT_HP_DDIC;
vgpu_vreg_t(vgpu, SFUSE_STRAP) &=
~SFUSE_STRAP_DDIC_DETECTED;
- vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~BXT_DE_PORT_HP_DDIC;
}
+ vgpu_vreg_t(vgpu, GEN8_DE_PORT_IIR) |=
+ BXT_DE_PORT_HP_DDIC;
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &=
+ ~PORTC_HOTPLUG_STATUS_MASK;
+ vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
+ PORTC_HOTPLUG_LONG_DETECT;
+ intel_vgpu_trigger_virtual_event(vgpu, DP_C_HOTPLUG);
}
- vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |=
- PORTB_HOTPLUG_STATUS_MASK;
- intel_vgpu_trigger_virtual_event(vgpu, DP_B_HOTPLUG);
}
}

--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -432,7 +432,7 @@ static struct intel_vgpu *__intel_gvt_cr
if (ret)
goto out_clean_sched_policy;

- if (IS_BROADWELL(gvt->dev_priv))
+ if (IS_BROADWELL(gvt->dev_priv) || IS_BROXTON(gvt->dev_priv))
ret = intel_gvt_hypervisor_set_edid(vgpu, PORT_B);
else
ret = intel_gvt_hypervisor_set_edid(vgpu, PORT_D);
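To summarize the BXT hotplug sequence introduced above, here is a condensed
sketch of the per-port pattern that the new code open-codes for ports A, B
and C. The helper itself is hypothetical, and the SFUSE_STRAP handling for
ports B/C is omitted; the register accesses mirror the hunk above.

/* Hypothetical helper, distilled from the BXT branch above. */
static void bxt_hpd_emulate_one_port(struct intel_vgpu *vgpu, bool connected,
                                     u32 hp_bit, u32 status_mask,
                                     u32 long_detect, int event)
{
        if (connected)
                vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) |= hp_bit;  /* live state */
        else
                vgpu_vreg_t(vgpu, GEN8_DE_PORT_ISR) &= ~hp_bit;

        vgpu_vreg_t(vgpu, GEN8_DE_PORT_IIR) |= hp_bit;          /* latch the pulse */
        vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) &= ~status_mask;
        vgpu_vreg_t(vgpu, PCH_PORT_HOTPLUG) |= long_detect;     /* long HPD pulse */
        intel_vgpu_trigger_virtual_event(vgpu, event);
}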


2021-03-19 12:24:29

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.4 11/18] drm/i915/gvt: Fix port number for BDW on EDID region setup

From: Colin Xu <[email protected]>

From: Zhenyu Wang <[email protected]>

commit 28284943ac94014767ecc2f7b3c5747c4a5617a0 upstream

The current BDW virtual display port is initialized as PORT_B, so the same
port needs to be used for the VFIO EDID region; otherwise an invalid EDID
blob pointer is assigned, which causes a kernel NULL pointer dereference.
We might evaluate actual display hotplug for BDW to make this function
work as expected, but this fix is required first in any case.

Reported-by: Alejandro Sior <[email protected]>
Cc: Alejandro Sior <[email protected]>
Fixes: 0178f4ce3c3b ("drm/i915/gvt: Enable vfio edid for all GVT supported platform")
Reviewed-by: Hang Yuan <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 28284943ac94014767ecc2f7b3c5747c4a5617a0)
Signed-off-by: Colin Xu <[email protected]>
Cc: <[email protected]> # 5.4.y
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/gvt/vgpu.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -432,8 +432,9 @@ static struct intel_vgpu *__intel_gvt_cr
if (ret)
goto out_clean_sched_policy;

- /*TODO: add more platforms support */
- if (IS_SKYLAKE(gvt->dev_priv) || IS_KABYLAKE(gvt->dev_priv))
+ if (IS_BROADWELL(gvt->dev_priv))
+ ret = intel_gvt_hypervisor_set_edid(vgpu, PORT_B);
+ else
ret = intel_gvt_hypervisor_set_edid(vgpu, PORT_D);
if (ret)
goto out_clean_sched_policy;


2021-03-19 19:39:41

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH 5.4 00/18] 5.4.107-rc1 review



On 3/19/2021 5:18 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.107 release.
> There are 18 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 21 Mar 2021 12:17:37 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.107-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli <[email protected]>
--
Florian

2021-03-19 21:24:08

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5.4 00/18] 5.4.107-rc1 review

On Fri, Mar 19, 2021 at 01:18:38PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.107 release.
> There are 18 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 21 Mar 2021 12:17:37 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 157 pass: 157 fail: 0
Qemu test results:
total: 431 pass: 431 fail: 0

Tested-by: Guenter Roeck <[email protected]>

Guenter

2021-03-20 11:20:04

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 5.4 00/18] 5.4.107-rc1 review

On Fri, 19 Mar 2021 at 17:49, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.4.107 release.
> There are 18 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 21 Mar 2021 12:17:37 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.107-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <[email protected]>

Summary
------------------------------------------------------------------------

kernel: 5.4.107-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.4.y
git commit: 8a800acdf26f05d289c05e416591b6b18917b044
git describe: v5.4.106-19-g8a800acdf26f
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.4.y/build/v5.4.106-19-g8a800acdf26f

No regressions (compared to build v5.4.106)

No fixes (compared to build v5.4.106)


Ran 62773 total tests in the following environments and test suites.

Environments
--------------
- arc
- arm
- arm64
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- juno-r2-compat
- juno-r2-kasan
- mips
- nxp-ls2088
- nxp-ls2088-64k_page_size
- parisc
- powerpc
- qemu-arm-clang
- qemu-arm-debug
- qemu-arm64-clang
- qemu-arm64-debug
- qemu-arm64-kasan
- qemu-i386-debug
- qemu-x86_64-clang
- qemu-x86_64-debug
- qemu-x86_64-kasan
- qemu-x86_64-kcsan
- qemu_arm
- qemu_arm64
- qemu_arm64-compat
- qemu_i386
- qemu_x86_64
- qemu_x86_64-compat
- riscv
- s390
- sh
- sparc
- x15
- x86
- x86-kasan
- x86_64

Test Suites
-----------
* build
* linux-log-parser
* install-android-platform-tools-r2600
* kselftest-
* kselftest-android
* kselftest-bpf
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-efivarfs
* kselftest-filesystems
* kselftest-firmware
* kselftest-fpu
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kvm
* kselftest-lib
* kselftest-livepatch
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-zram
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-controllers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fs-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* perf
* kselftest-lkdtm
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* network-basic-tests
* kselftest-kexec
* kselftest-vm
* kselftest-x86
* ltp-open-posix-tests
* v4l2-compliance
* kvm-unit-tests
* fwts
* rcutorture
* ssuite

--
Linaro LKFT
https://lkft.linaro.org

2021-03-21 07:27:32

by Zou Wei

[permalink] [raw]
Subject: Re: [PATCH 5.4 00/18] 5.4.107-rc1 review



On 2021/3/19 20:18, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.107 release.
> There are 18 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 21 Mar 2021 12:17:37 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.107-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Tested on arm64 and x86 for 5.4.107-rc1,

Kernel repo:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Branch: linux-5.4.y
Version: 5.4.107-rc1
Commit: 8a800acdf26f05d289c05e416591b6b18917b044
Compiler: gcc version 7.3.0 (GCC)

arm64:
--------------------------------------------------------------------
Testcase Result Summary:
total: 4729
passed: 4729
failed: 0
timeout: 0
--------------------------------------------------------------------

x86:
--------------------------------------------------------------------
Testcase Result Summary:
total: 4729
passed: 4729
failed: 0
timeout: 0
--------------------------------------------------------------------

Tested-by: Hulk Robot <[email protected]>