2021-04-05 15:56:52

by Greg Kroah-Hartman

Subject: [PATCH 5.10 000/126] 5.10.28-rc1 review

This is the start of the stable review cycle for the 5.10.28 release.
There are 126 patches in this series, all of which will be posted as
responses to this one. If anyone has any issues with these being applied,
please let me know.

Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.28-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 5.10.28-rc1

Stanislav Fomichev <[email protected]>
bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG

Jens Axboe <[email protected]>
Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing"

Ben Dooks <[email protected]>
riscv: evaluate put_user() arg before enabling user access

Du Cheng <[email protected]>
drivers: video: fbcon: fix NULL dereference in fbcon_cursor()

Ahmad Fatoum <[email protected]>
driver core: clear deferred probe reason on probe retry

Atul Gopinathan <[email protected]>
staging: rtl8192e: Change state information from u16 to u8

Atul Gopinathan <[email protected]>
staging: rtl8192e: Fix incorrect source in memcpy()

Roja Rani Yarubandi <[email protected]>
soc: qcom-geni-se: Cleanup the code to remove proxy votes

Wesley Cheng <[email protected]>
usb: dwc3: gadget: Clear DEP flags after stop transfers in ep disable

Shawn Guo <[email protected]>
usb: dwc3: qcom: skip interconnect init for ACPI probe

Artur Petrosyan <[email protected]>
usb: dwc2: Prevent core suspend when port connection flag is 0

Artur Petrosyan <[email protected]>
usb: dwc2: Fix HPRT0.PrtSusp bit setting for HiKey 960 board.

Tong Zhang <[email protected]>
usb: gadget: udc: amd5536udc_pci fix null-ptr-dereference

Johan Hovold <[email protected]>
USB: cdc-acm: fix use-after-free after probe failure

Johan Hovold <[email protected]>
USB: cdc-acm: fix double free on probe failure

Oliver Neukum <[email protected]>
USB: cdc-acm: downgrade message to debug

Oliver Neukum <[email protected]>
USB: cdc-acm: untangle a circular dependency between callback and softint

Oliver Neukum <[email protected]>
cdc-acm: fix BREAK rx code path adding necessary calls

Chunfeng Yun <[email protected]>
usb: xhci-mtk: fix broken streams issue on 0.96 xHCI

Tony Lindgren <[email protected]>
usb: musb: Fix suspend with devices connected for a64

Vincent Palatin <[email protected]>
USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem

Shuah Khan <[email protected]>
usbip: vhci_hcd fix shift out-of-bounds in vhci_hub_control()

Zheyu Ma <[email protected]>
firewire: nosy: Fix a use-after-free bug in nosy_ioctl()

Lv Yunlong <[email protected]>
video: hyperv_fb: Fix a double free in hvfb_probe

Andy Shevchenko <[email protected]>
usb: dwc3: pci: Enable dis_uX_susphy_quirk for Intel Merrifield

Richard Gong <[email protected]>
firmware: stratix10-svc: reset COMMAND_RECONFIG_FLAG_PARTIAL to 0

Dinghao Liu <[email protected]>
extcon: Fix error handling in extcon_dev_register

Krzysztof Kozlowski <[email protected]>
extcon: Add stubs for extcon_register_notifier_all() functions

Sean Christopherson <[email protected]>
KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping

Paolo Bonzini <[email protected]>
KVM: x86: compile out TDP MMU on 32-bit systems

Ben Gardon <[email protected]>
KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

Ben Gardon <[email protected]>
KVM: x86/mmu: Factor out functions to add/remove TDP MMU pages

Ben Gardon <[email protected]>
KVM: x86/mmu: Fix braces in kvm_recover_nx_lpages

Ben Gardon <[email protected]>
KVM: x86/mmu: Don't redundantly clear TDP MMU pt memory

Ben Gardon <[email protected]>
KVM: x86/mmu: Add comment on __tdp_mmu_set_spte

Sean Christopherson <[email protected]>
KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap

Ben Gardon <[email protected]>
KVM: x86/mmu: Protect TDP MMU page table memory with RCU

Ben Gardon <[email protected]>
KVM: x86/mmu: Factor out handling of removed page tables

Ben Gardon <[email protected]>
KVM: x86/mmu: Add lockdep when setting a TDP MMU SPTE

Ben Gardon <[email protected]>
kvm: x86/mmu: Add existing trace points to TDP MMU

Ben Gardon <[email protected]>
KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed

Ben Gardon <[email protected]>
KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter

Ben Gardon <[email protected]>
KVM: x86/mmu: Rename goal_gfn to next_last_level_gfn

Ben Gardon <[email protected]>
KVM: x86/mmu: Merge flush and non-flush tdp_mmu_iter_cond_resched

Ben Gardon <[email protected]>
KVM: x86/mmu: change TDP MMU yield function returns to match cond_resched

Wang Panzhenzhuan <[email protected]>
pinctrl: rockchip: fix restore error in resume

Jason Gunthorpe <[email protected]>
vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends

Thierry Reding <[email protected]>
drm/tegra: sor: Grab runtime PM reference across reset

Thierry Reding <[email protected]>
drm/tegra: dc: Restore coupling of display controllers

Pan Bian <[email protected]>
drm/imx: fix memory leak when fails to init

Tetsuo Handa <[email protected]>
reiserfs: update reiserfs_xattrs_initialized() condition

Xi Ruoyao <[email protected]>
drm/amdgpu: check alignment on CPU page for bo map

Nirmoy Das <[email protected]>
drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()

Qu Huang <[email protected]>
drm/amdkfd: dqm fence memory corruption

Ilya Lipnitskiy <[email protected]>
mm: fix race by making init_zero_pfn() early_initcall

Heiko Carstens <[email protected]>
s390/vdso: fix tod_steering_delta type

Heiko Carstens <[email protected]>
s390/vdso: copy tod_steering_delta value to vdso_data page

Steven Rostedt (VMware) <[email protected]>
tracing: Fix stack trace event size

Adrian Hunter <[email protected]>
PM: runtime: Fix ordering in pm_runtime_get_suppliers()

Adrian Hunter <[email protected]>
PM: runtime: Fix race getting/putting suppliers at probe

Paolo Bonzini <[email protected]>
KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit

Paolo Bonzini <[email protected]>
KVM: SVM: load control fields from VMCB12 before checking them

Max Filippov <[email protected]>
xtensa: move coprocessor_flush to the .text section

Max Filippov <[email protected]>
xtensa: fix uaccess-related livelock in do_page_fault

Jeremy Szu <[email protected]>
ALSA: hda/realtek: fix mute/micmute LEDs for HP 640 G8

Hui Wang <[email protected]>
ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook

Hui Wang <[email protected]>
ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO

Takashi Iwai <[email protected]>
ALSA: hda: Add missing sanity checks in PM prepare/complete callbacks

Takashi Iwai <[email protected]>
ALSA: hda: Re-add dropped snd_poewr_change_state() calls

Ikjoon Jang <[email protected]>
ALSA: usb-audio: Apply sample rate quirk to Logitech Connect

Vitaly Kuznetsov <[email protected]>
ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()

Rafael J. Wysocki <[email protected]>
ACPI: tables: x86: Reserve memory occupied by ACPI tables

Jesper Dangaard Brouer <[email protected]>
bpf: Remove MTU check in __bpf_skb_max_len

Jisheng Zhang <[email protected]>
net: 9p: advance iov on empty read

Tong Zhang <[email protected]>
net: wan/lmc: unregister device when no matching device is found

Alex Elder <[email protected]>
net: ipa: fix register write command validation

Alex Elder <[email protected]>
net: ipa: remove two unused register definitions

Doug Brown <[email protected]>
appletalk: Fix skb allocation size in loopback case

Nathan Rossi <[email protected]>
net: ethernet: aquantia: Handle error cleanup of start on open

Shuah Khan <[email protected]>
ath10k: hold RCU lock when calling ieee80211_find_sta_by_ifaddr()

Johannes Berg <[email protected]>
iwlwifi: pcie: don't disable interrupts for reg_lock

Ido Schimmel <[email protected]>
netdevsim: dev: Initialize FIB module after debugfs

Guo-Feng Fan <[email protected]>
rtw88: coex: 8821c: correct antenna switch function

Wen Gong <[email protected]>
ath11k: add ieee80211_unregister_hw to avoid kernel crash caused by NULL pointer

Luca Pesce <[email protected]>
brcmfmac: clear EAP/association status bits on linkdown events

Sasha Levin <[email protected]>
can: tcan4x5x: fix max register value

Oleksij Rempel <[email protected]>
net: introduce CAN specific pointer in the struct net_device

Marc Kleine-Budde <[email protected]>
can: dev: move driver related infrastructure into separate subdir

Davide Caratti <[email protected]>
flow_dissector: fix TTL and TOS dissection on IPv4 fragments

Sasha Levin <[email protected]>
net: mvpp2: fix interrupt mask/unmask skip condition

Stefan Metzmacher <[email protected]>
io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL

zhangyi (F) <[email protected]>
ext4: do not iput inode under running transaction in ext4_rename()

Peter Zijlstra <[email protected]>
static_call: Align static_call_is_init() patching condition

Stefan Metzmacher <[email protected]>
io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls

Elad Grupi <[email protected]>
nvmet-tcp: fix kmap leak when data digest in use

Waiman Long <[email protected]>
locking/ww_mutex: Fix acquire/release imbalance in ww_acquire_init()/ww_acquire_fini()

Waiman Long <[email protected]>
locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling

Manaf Meethalavalappu Pallikunhi <[email protected]>
thermal/core: Add NULL pointer check before using cooling device stats

Bard Liao <[email protected]>
ASoC: rt711: add snd_soc_component remove callback

Sameer Pujar <[email protected]>
ASoC: rt5659: Update MCLK rate in set_sysclk()

Tong Zhang <[email protected]>
staging: comedi: cb_pcidas64: fix request_irq() warn

Tong Zhang <[email protected]>
staging: comedi: cb_pcidas: fix request_irq() warn

Alexey Dobriyan <[email protected]>
scsi: qla2xxx: Fix broken #endif placement

Lv Yunlong <[email protected]>
scsi: st: Fix a use after free in st_open()

Pavel Begunkov <[email protected]>
io_uring: fix ->flags races by linked timeouts

Laurent Vivier <[email protected]>
vhost: Fix vhost_vq_reset()

Jens Axboe <[email protected]>
kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing

Olga Kornievskaia <[email protected]>
NFSD: fix error handling in NFSv4.0 callbacks

Lucas Tanure <[email protected]>
ASoC: cs42l42: Always wait at least 3ms after reset

Lucas Tanure <[email protected]>
ASoC: cs42l42: Fix mixer volume control

Lucas Tanure <[email protected]>
ASoC: cs42l42: Fix channel width support

Lucas Tanure <[email protected]>
ASoC: cs42l42: Fix Bitclock polarity inversion

Jon Hunter <[email protected]>
ASoC: soc-core: Prevent warning if no DMI table is present

Hans de Goede <[email protected]>
ASoC: es8316: Simplify adc_pga_gain_tlv table

Benjamin Rood <[email protected]>
ASoC: sgtl5000: set DAP_AVC_CTRL register to correct default value on probe

Hans de Goede <[email protected]>
ASoC: rt5651: Fix dac- and adc- vol-tlv values being off by a factor of 10

Hans de Goede <[email protected]>
ASoC: rt5640: Fix dac- and adc- vol-tlv values being off by a factor of 10

Jack Yu <[email protected]>
ASoC: rt1015: fix i2c communication error

Ritesh Harjani <[email protected]>
iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activate

J. Bruce Fields <[email protected]>
rpc: fix NULL dereference on kmalloc failure

Julian Braha <[email protected]>
fs: nfsd: fix kconfig dependency warning for NFSD_V4

Zhaolong Zhang <[email protected]>
ext4: fix bh ref count on error paths

Eric Whitney <[email protected]>
ext4: shrink race window in ext4_should_retry_alloc()

Vivek Goyal <[email protected]>
virtiofs: Fail dax mount if device does not support it

Alexei Starovoitov <[email protected]>
bpf: Fix fexit trampoline.

Pavel Tatashin <[email protected]>
arm64: mm: correct the inside linear map range during hotplug check


-------------

Diffstat:

Documentation/virt/kvm/locking.rst | 9 +-
Makefile | 4 +-
arch/arm64/mm/mmu.c | 20 +-
arch/riscv/include/asm/uaccess.h | 7 +-
arch/s390/include/asm/vdso/data.h | 2 +-
arch/s390/kernel/time.c | 1 +
arch/x86/include/asm/kvm_host.h | 15 +
arch/x86/include/asm/smp.h | 1 +
arch/x86/kernel/acpi/boot.c | 25 +-
arch/x86/kernel/setup.c | 8 +-
arch/x86/kernel/smpboot.c | 2 +-
arch/x86/kvm/Makefile | 3 +-
arch/x86/kvm/mmu/mmu.c | 49 +--
arch/x86/kvm/mmu/mmu_internal.h | 5 +
arch/x86/kvm/mmu/tdp_iter.c | 46 +--
arch/x86/kvm/mmu/tdp_iter.h | 21 +-
arch/x86/kvm/mmu/tdp_mmu.c | 450 +++++++++++++++------
arch/x86/kvm/mmu/tdp_mmu.h | 32 +-
arch/x86/kvm/svm/nested.c | 28 +-
arch/x86/net/bpf_jit_comp.c | 27 +-
arch/xtensa/kernel/coprocessor.S | 64 +--
arch/xtensa/mm/fault.c | 5 +-
drivers/acpi/processor_idle.c | 7 +
drivers/acpi/tables.c | 42 +-
drivers/base/dd.c | 3 +
drivers/base/power/runtime.c | 10 +-
drivers/extcon/extcon.c | 1 +
drivers/firewire/nosy.c | 9 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 10 +-
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6 +-
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 8 +-
drivers/gpu/drm/imx/imx-drm-core.c | 2 +-
drivers/gpu/drm/tegra/dc.c | 20 +-
drivers/gpu/drm/tegra/sor.c | 7 +
drivers/net/can/Makefile | 7 +-
drivers/net/can/dev/Makefile | 7 +
drivers/net/can/{ => dev}/dev.c | 4 +-
drivers/net/can/{ => dev}/rx-offload.c | 0
drivers/net/can/m_can/tcan4x5x.c | 2 +-
drivers/net/can/slcan.c | 4 +-
drivers/net/can/vcan.c | 2 +-
drivers/net/can/vxcan.c | 6 +-
drivers/net/ethernet/aquantia/atlantic/aq_main.c | 4 +-
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 +-
drivers/net/ipa/gsi_reg.h | 10 -
drivers/net/ipa/ipa_cmd.c | 32 +-
drivers/net/netdevsim/dev.c | 40 +-
drivers/net/wan/lmc/lmc_main.c | 2 +
drivers/net/wireless/ath/ath10k/wmi-tlv.c | 7 +-
drivers/net/wireless/ath/ath11k/mac.c | 7 +-
.../broadcom/brcm80211/brcmfmac/cfg80211.c | 7 +-
drivers/net/wireless/intel/iwlwifi/pcie/trans.c | 11 +-
drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 5 +-
drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 22 +-
drivers/net/wireless/realtek/rtw88/rtw8821c.c | 16 +-
drivers/nvme/target/tcp.c | 2 +-
drivers/pinctrl/pinctrl-rockchip.c | 13 +-
drivers/scsi/qla2xxx/qla_target.h | 2 +-
drivers/scsi/st.c | 2 +-
drivers/soc/qcom/qcom-geni-se.c | 74 ----
drivers/staging/comedi/drivers/cb_pcidas.c | 2 +-
drivers/staging/comedi/drivers/cb_pcidas64.c | 2 +-
drivers/staging/rtl8192e/rtllib.h | 2 +-
drivers/staging/rtl8192e/rtllib_rx.c | 2 +-
drivers/thermal/thermal_sysfs.c | 3 +
drivers/tty/serial/qcom_geni_serial.c | 7 -
drivers/usb/class/cdc-acm.c | 61 ++-
drivers/usb/core/quirks.c | 4 +
drivers/usb/dwc2/hcd.c | 5 +-
drivers/usb/dwc3/dwc3-pci.c | 2 +
drivers/usb/dwc3/dwc3-qcom.c | 3 +
drivers/usb/dwc3/gadget.c | 8 +-
drivers/usb/gadget/udc/amd5536udc_pci.c | 10 +-
drivers/usb/host/xhci-mtk.c | 10 +-
drivers/usb/musb/musb_core.c | 12 +-
drivers/usb/usbip/vhci_hcd.c | 2 +
drivers/vfio/pci/Kconfig | 2 +-
drivers/vhost/vhost.c | 2 +-
drivers/video/fbdev/core/fbcon.c | 3 +
drivers/video/fbdev/hyperv_fb.c | 3 -
fs/ext4/balloc.c | 38 +-
fs/ext4/ext4.h | 1 +
fs/ext4/inode.c | 6 +-
fs/ext4/namei.c | 18 +-
fs/ext4/super.c | 5 +
fs/ext4/sysfs.c | 7 +
fs/fuse/virtio_fs.c | 9 +-
fs/io_uring.c | 33 +-
fs/iomap/swapfile.c | 10 +
fs/nfsd/Kconfig | 1 +
fs/nfsd/nfs4callback.c | 1 +
fs/reiserfs/xattr.h | 2 +-
include/linux/acpi.h | 9 +-
include/linux/bpf.h | 24 +-
include/linux/can/can-ml.h | 12 +
include/linux/extcon.h | 23 ++
.../linux/firmware/intel/stratix10-svc-client.h | 2 +-
include/linux/netdevice.h | 34 +-
include/linux/qcom-geni-se.h | 2 -
include/linux/ww_mutex.h | 5 +-
kernel/bpf/bpf_struct_ops.c | 2 +-
kernel/bpf/core.c | 4 +-
kernel/bpf/trampoline.c | 218 +++++++---
kernel/locking/mutex.c | 25 +-
kernel/static_call.c | 14 +-
kernel/trace/trace.c | 3 +-
mm/memory.c | 2 +-
net/9p/client.c | 4 -
net/appletalk/ddp.c | 33 +-
net/can/af_can.c | 34 +-
net/can/j1939/main.c | 22 +-
net/can/j1939/socket.c | 13 +-
net/can/proc.c | 19 +-
net/core/filter.c | 12 +-
net/core/flow_dissector.c | 6 +-
net/sunrpc/auth_gss/svcauth_gss.c | 11 +-
sound/pci/hda/hda_intel.c | 8 +
sound/pci/hda/patch_realtek.c | 4 +-
sound/soc/codecs/cs42l42.c | 74 ++--
sound/soc/codecs/cs42l42.h | 13 +-
sound/soc/codecs/es8316.c | 9 +-
sound/soc/codecs/rt1015.c | 1 +
sound/soc/codecs/rt5640.c | 4 +-
sound/soc/codecs/rt5651.c | 4 +-
sound/soc/codecs/rt5659.c | 5 +
sound/soc/codecs/rt711.c | 8 +
sound/soc/codecs/sgtl5000.c | 2 +-
sound/soc/soc-core.c | 4 +
sound/usb/quirks.c | 1 +
.../testing/selftests/net/forwarding/tc_flower.sh | 38 +-
135 files changed, 1457 insertions(+), 778 deletions(-)



2021-04-05 15:57:38

by Greg Kroah-Hartman

Subject: [PATCH 5.10 044/126] rtw88: coex: 8821c: correct antenna switch function

From: Guo-Feng Fan <[email protected]>

[ Upstream commit adba838af159914eb98fcd55bfd3a89c9a7d41a8 ]

This patch fixes a defect that uses incorrect function to access
registers. Use 8 and 32 bit access function to access 8 and 32 bit long
data respectively.

Signed-off-by: Guo-Feng Fan <[email protected]>
Signed-off-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/realtek/rtw88/rtw8821c.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw88/rtw8821c.c b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
index da2e7415be8f..f9615f76f173 100644
--- a/drivers/net/wireless/realtek/rtw88/rtw8821c.c
+++ b/drivers/net/wireless/realtek/rtw88/rtw8821c.c
@@ -720,8 +720,8 @@ static void rtw8821c_coex_cfg_ant_switch(struct rtw_dev *rtwdev, u8 ctrl_type,
regval = (!polarity_inverse ? 0x1 : 0x2);
}

- rtw_write8_mask(rtwdev, REG_RFE_CTRL8, BIT_MASK_R_RFE_SEL_15,
- regval);
+ rtw_write32_mask(rtwdev, REG_RFE_CTRL8, BIT_MASK_R_RFE_SEL_15,
+ regval);
break;
case COEX_SWITCH_CTRL_BY_PTA:
rtw_write32_clr(rtwdev, REG_LED_CFG, BIT_DPDT_SEL_EN);
@@ -731,8 +731,8 @@ static void rtw8821c_coex_cfg_ant_switch(struct rtw_dev *rtwdev, u8 ctrl_type,
PTA_CTRL_PIN);

regval = (!polarity_inverse ? 0x2 : 0x1);
- rtw_write8_mask(rtwdev, REG_RFE_CTRL8, BIT_MASK_R_RFE_SEL_15,
- regval);
+ rtw_write32_mask(rtwdev, REG_RFE_CTRL8, BIT_MASK_R_RFE_SEL_15,
+ regval);
break;
case COEX_SWITCH_CTRL_BY_ANTDIV:
rtw_write32_clr(rtwdev, REG_LED_CFG, BIT_DPDT_SEL_EN);
@@ -758,11 +758,11 @@ static void rtw8821c_coex_cfg_ant_switch(struct rtw_dev *rtwdev, u8 ctrl_type,
}

if (ctrl_type == COEX_SWITCH_CTRL_BY_BT) {
- rtw_write32_clr(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE1);
- rtw_write32_clr(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE2);
+ rtw_write8_clr(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE1);
+ rtw_write8_clr(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE2);
} else {
- rtw_write32_set(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE1);
- rtw_write32_set(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE2);
+ rtw_write8_set(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE1);
+ rtw_write8_set(rtwdev, REG_CTRL_TYPE, BIT_CTRL_TYPE2);
}
}

--
2.30.1



2021-04-05 16:04:44

by Greg Kroah-Hartman

Subject: [PATCH 5.10 018/126] ASoC: cs42l42: Always wait at least 3ms after reset

From: Lucas Tanure <[email protected]>

[ Upstream commit 19325cfea04446bc79b36bffd4978af15f46a00e ]

This delay is part of the power-up sequence defined in the datasheet.
A runtime_resume is a power-up, so it must also include the delay.

Signed-off-by: Lucas Tanure <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/codecs/cs42l42.c | 3 ++-
sound/soc/codecs/cs42l42.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c
index d5078ce79fad..4d82d24c7828 100644
--- a/sound/soc/codecs/cs42l42.c
+++ b/sound/soc/codecs/cs42l42.c
@@ -1794,7 +1794,7 @@ static int cs42l42_i2c_probe(struct i2c_client *i2c_client,
dev_dbg(&i2c_client->dev, "Found reset GPIO\n");
gpiod_set_value_cansleep(cs42l42->reset_gpio, 1);
}
- mdelay(3);
+ usleep_range(CS42L42_BOOT_TIME_US, CS42L42_BOOT_TIME_US * 2);

/* Request IRQ */
ret = devm_request_threaded_irq(&i2c_client->dev,
@@ -1919,6 +1919,7 @@ static int cs42l42_runtime_resume(struct device *dev)
}

gpiod_set_value_cansleep(cs42l42->reset_gpio, 1);
+ usleep_range(CS42L42_BOOT_TIME_US, CS42L42_BOOT_TIME_US * 2);

regcache_cache_only(cs42l42->regmap, false);
regcache_sync(cs42l42->regmap);
diff --git a/sound/soc/codecs/cs42l42.h b/sound/soc/codecs/cs42l42.h
index 9b017b76828a..866d7c873e3c 100644
--- a/sound/soc/codecs/cs42l42.h
+++ b/sound/soc/codecs/cs42l42.h
@@ -740,6 +740,7 @@
#define CS42L42_FRAC2_VAL(val) (((val) & 0xff0000) >> 16)

#define CS42L42_NUM_SUPPLIES 5
+#define CS42L42_BOOT_TIME_US 3000

static const char *const cs42l42_supply_names[CS42L42_NUM_SUPPLIES] = {
"VA",
--
2.30.1
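
The fix is a small, reusable pattern: release reset, then block for the
datasheet boot time before touching any register. A minimal sketch under
invented names (hypothetical driver, not the cs42l42 code itself):

#include <linux/delay.h>
#include <linux/gpio/consumer.h>

#define BOOT_TIME_US	3000	/* datasheet minimum boot time */

/* Deassert reset, then wait out the boot window before any register I/O.
 * usleep_range() may sleep, which is fine in probe/resume (process
 * context) and is friendlier to the scheduler than a busy mdelay(). */
static void codec_power_up(struct gpio_desc *reset_gpio)
{
        gpiod_set_value_cansleep(reset_gpio, 1);
        usleep_range(BOOT_TIME_US, BOOT_TIME_US * 2);
}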



2021-04-05 16:04:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 014/126] ASoC: soc-core: Prevent warning if no DMI table is present

From: Jon Hunter <[email protected]>

[ Upstream commit 7de14d581dbed57c2b3c6afffa2c3fdc6955a3cd ]

Many systems do not use ACPI and hence do not provide a DMI table. On
non-ACPI systems a warning, such as the following, is printed on boot.

WARNING KERN tegra-audio-graph-card sound: ASoC: no DMI vendor name!

The variable 'dmi_available' is not exported and so currently cannot be
used by kernel modules without adding an accessor. However, it is
possible to use the function is_acpi_device_node() to determine if the
sound card is an ACPI device and hence indicate if we expect a DMI table
to be present. Therefore, call is_acpi_device_node() to see if we are
using ACPI and only parse the DMI table if we are booting with ACPI.

Signed-off-by: Jon Hunter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/soc-core.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 05a085f6dc7c..bf65cba232e6 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -31,6 +31,7 @@
#include <linux/of.h>
#include <linux/of_graph.h>
#include <linux/dmi.h>
+#include <linux/acpi.h>
#include <sound/core.h>
#include <sound/jack.h>
#include <sound/pcm.h>
@@ -1581,6 +1582,9 @@ int snd_soc_set_dmi_name(struct snd_soc_card *card, const char *flavour)
if (card->long_name)
return 0; /* long name already set by driver or from DMI */

+ if (!is_acpi_device_node(card->dev->fwnode))
+ return 0;
+
/* make up dmi long name as: vendor-product-version-board */
vendor = dmi_get_system_info(DMI_BOARD_VENDOR);
if (!vendor || !is_dmi_valid(vendor)) {
--
2.30.1
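
The guard generalizes to any driver that should only consult DMI when
booted via ACPI. A minimal sketch with an invented helper name (the real
change lives in snd_soc_set_dmi_name(), shown in the diff above):

#include <linux/acpi.h>
#include <linux/device.h>
#include <linux/dmi.h>

/* Return the DMI board vendor, or NULL when the device did not come
 * from ACPI and therefore no DMI table is expected. */
static const char *board_vendor_if_acpi(struct device *dev)
{
        if (!is_acpi_device_node(dev->fwnode))
                return NULL;	/* DT/platform boot: skip DMI quietly */
        return dmi_get_system_info(DMI_BOARD_VENDOR);
}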



2021-04-05 16:04:44

by Greg Kroah-Hartman

Subject: [PATCH 5.10 017/126] ASoC: cs42l42: Fix mixer volume control

From: Lucas Tanure <[email protected]>

[ Upstream commit 72d904763ae6a8576e7ad034f9da4f0e3c44bf24 ]

The minimum value is 0x3f (-63 dB), which is also mute.

Signed-off-by: Lucas Tanure <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/codecs/cs42l42.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/soc/codecs/cs42l42.c b/sound/soc/codecs/cs42l42.c
index 4f9ad9547929..d5078ce79fad 100644
--- a/sound/soc/codecs/cs42l42.c
+++ b/sound/soc/codecs/cs42l42.c
@@ -401,7 +401,7 @@ static const struct regmap_config cs42l42_regmap = {
};

static DECLARE_TLV_DB_SCALE(adc_tlv, -9600, 100, false);
-static DECLARE_TLV_DB_SCALE(mixer_tlv, -6200, 100, false);
+static DECLARE_TLV_DB_SCALE(mixer_tlv, -6300, 100, true);

static const char * const cs42l42_hpf_freq_text[] = {
"1.86Hz", "120Hz", "235Hz", "466Hz"
@@ -458,7 +458,7 @@ static const struct snd_kcontrol_new cs42l42_snd_controls[] = {
CS42L42_DAC_HPF_EN_SHIFT, true, false),
SOC_DOUBLE_R_TLV("Mixer Volume", CS42L42_MIXER_CHA_VOL,
CS42L42_MIXER_CHB_VOL, CS42L42_MIXER_CH_VOL_SHIFT,
- 0x3e, 1, mixer_tlv)
+ 0x3f, 1, mixer_tlv)
};

static int cs42l42_hpdrv_evt(struct snd_soc_dapm_widget *w,
--
2.30.1
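
For readers decoding the macro arguments: DECLARE_TLV_DB_SCALE() takes
the minimum level and the step in 0.01 dB units, plus a flag saying
whether the minimum code means mute. A sketch restating the fixed values
under an illustrative name:

#include <sound/tlv.h>

/* Raw codes step in 1 dB (100 x 0.01 dB); the minimum maps to -63 dB,
 * and the trailing 'true' marks that same lowest code as mute, matching
 * the codec behaviour described above. */
static DECLARE_TLV_DB_SCALE(example_mixer_tlv, -6300, 100, true);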



2021-04-05 16:04:45

by Greg Kroah-Hartman

Subject: [PATCH 5.10 021/126] vhost: Fix vhost_vq_reset()

From: Laurent Vivier <[email protected]>

[ Upstream commit beb691e69f4dec7bfe8b81b509848acfd1f0dbf9 ]

vhost_reset_is_le() is vhost_init_is_le(), and in the case of
cross-endian legacy, vhost_init_is_le() depends on vq->user_be.

vq->user_be is set by vhost_disable_cross_endian().

But in vhost_vq_reset(), we have:

vhost_reset_is_le(vq);
vhost_disable_cross_endian(vq);

And so user_be is used before being set.

To fix that, reverse the order of the two lines, as there is no other
dependency between them.

Signed-off-by: Laurent Vivier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/vhost/vhost.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index a262e12c6dc2..5ccb0705beae 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -332,8 +332,8 @@ static void vhost_vq_reset(struct vhost_dev *dev,
vq->error_ctx = NULL;
vq->kick = NULL;
vq->log_ctx = NULL;
- vhost_reset_is_le(vq);
vhost_disable_cross_endian(vq);
+ vhost_reset_is_le(vq);
vq->busyloop_timeout = 0;
vq->umem = NULL;
vq->iotlb = NULL;
--
2.30.1



2021-04-05 16:04:48

by Greg Kroah-Hartman

Subject: [PATCH 5.10 022/126] io_uring: fix ->flags races by linked timeouts

From: Pavel Begunkov <[email protected]>

[ Upstream commit efe814a471e0e58f28f1efaf430c8784a4f36626 ]

It's racy to modify req->flags from a non-owning context; e.g. a linked
timeout calling req_set_fail_links() for the master request might race
with that request setting/clearing flags while being executed
concurrently. Just remove req_set_fail_links(prev) from
io_link_timeout_fn(); io_async_find_and_cancel() and functions down the
line take care of setting the fail bit.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/io_uring.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index dde290eb7dd0..4e53445db73f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6242,7 +6242,6 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
spin_unlock_irqrestore(&ctx->completion_lock, flags);

if (prev) {
- req_set_fail_links(prev);
io_async_find_and_cancel(ctx, req, prev->user_data, -ETIME);
io_put_req_deferred(prev, 1);
} else {
--
2.30.1



2021-04-05 16:04:49

by Greg Kroah-Hartman

Subject: [PATCH 5.10 023/126] scsi: st: Fix a use after free in st_open()

From: Lv Yunlong <[email protected]>

[ Upstream commit c8c165dea4c8f5ad67b1240861e4f6c5395fa4ac ]

In st_open(), if STp->in_use is true, STp will be freed by
scsi_tape_put(). However, STp is still used by DEBC_printk() afterwards.
It is better to call DEBC_printk() before scsi_tape_put().

Link: https://lore.kernel.org/r/[email protected]
Acked-by: Kai Mäkisara <[email protected]>
Signed-off-by: Lv Yunlong <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/st.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index e2e5356a997d..19bc8c923fce 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1269,8 +1269,8 @@ static int st_open(struct inode *inode, struct file *filp)
spin_lock(&st_use_lock);
if (STp->in_use) {
spin_unlock(&st_use_lock);
- scsi_tape_put(STp);
DEBC_printk(STp, "Device already in use.\n");
+ scsi_tape_put(STp);
return (-EBUSY);
}

--
2.30.1
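
The ordering rule being restored is generic to any refcounted object:
the final put may free it, so nothing may dereference it afterwards. A
standalone sketch with a hypothetical 'tape' object (not the st driver's
real structures):

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/printk.h>
#include <linux/slab.h>

struct tape {
        struct kref kref;
        int index;
};

static void tape_release(struct kref *kref)
{
        kfree(container_of(kref, struct tape, kref));
}

/* Log while the reference is still held, only then drop it; kref_put()
 * may free the object, so no access may follow it. */
static int tape_busy(struct tape *t)
{
        pr_debug("st%d: device already in use\n", t->index);
        kref_put(&t->kref, tape_release);
        return -EBUSY;
}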



2021-04-05 16:04:50

by Greg Kroah-Hartman

Subject: [PATCH 5.10 028/126] ASoC: rt711: add snd_soc_component remove callback

From: Bard Liao <[email protected]>

[ Upstream commit 899b12542b0897f92de9ba30944937c39ebb246d ]

We do some IO operations in the snd_soc_component_set_jack callback
function, and snd_soc_component_set_jack() will be called when the soc
component is removed. However, we should not access SoundWire registers
when the bus is suspended.
So set regcache_cache_only(regmap, true) to avoid register access during
the soc component removal process.

Signed-off-by: Bard Liao <[email protected]>
Reviewed-by: Kai Vehmanen <[email protected]>
Reviewed-by: Rander Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/codecs/rt711.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/sound/soc/codecs/rt711.c b/sound/soc/codecs/rt711.c
index a9b1b4180c47..93d86f7558e0 100644
--- a/sound/soc/codecs/rt711.c
+++ b/sound/soc/codecs/rt711.c
@@ -895,6 +895,13 @@ static int rt711_probe(struct snd_soc_component *component)
return 0;
}

+static void rt711_remove(struct snd_soc_component *component)
+{
+ struct rt711_priv *rt711 = snd_soc_component_get_drvdata(component);
+
+ regcache_cache_only(rt711->regmap, true);
+}
+
static const struct snd_soc_component_driver soc_codec_dev_rt711 = {
.probe = rt711_probe,
.set_bias_level = rt711_set_bias_level,
@@ -905,6 +912,7 @@ static const struct snd_soc_component_driver soc_codec_dev_rt711 = {
.dapm_routes = rt711_audio_map,
.num_dapm_routes = ARRAY_SIZE(rt711_audio_map),
.set_jack = rt711_set_jack_detect,
+ .remove = rt711_remove,
};

static int rt711_set_sdw_stream(struct snd_soc_dai *dai, void *sdw_stream,
--
2.30.1



2021-04-05 16:04:50

by Greg Kroah-Hartman

Subject: [PATCH 5.10 030/126] locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling

From: Waiman Long <[email protected]>

[ Upstream commit 5de2055d31ea88fd9ae9709ac95c372a505a60fa ]

The use_ww_ctx flag is passed to mutex_optimistic_spin(), but the
function doesn't use it. The frequent use of the (use_ww_ctx && ww_ctx)
combination is repetitive.

In fact, ww_ctx should not be used at all if !use_ww_ctx. Simplify the
ww_mutex code by dropping use_ww_ctx from mutex_optimistic_spin() and
clearing ww_ctx if !use_ww_ctx. In this way, we can replace (use_ww_ctx &&
ww_ctx) by just (ww_ctx).

Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/locking/mutex.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 5352ce50a97e..2c25b830203c 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -636,7 +636,7 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
*/
static __always_inline bool
mutex_optimistic_spin(struct mutex *lock, struct ww_acquire_ctx *ww_ctx,
- const bool use_ww_ctx, struct mutex_waiter *waiter)
+ struct mutex_waiter *waiter)
{
if (!waiter) {
/*
@@ -712,7 +712,7 @@ fail:
#else
static __always_inline bool
mutex_optimistic_spin(struct mutex *lock, struct ww_acquire_ctx *ww_ctx,
- const bool use_ww_ctx, struct mutex_waiter *waiter)
+ struct mutex_waiter *waiter)
{
return false;
}
@@ -932,6 +932,9 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
struct ww_mutex *ww;
int ret;

+ if (!use_ww_ctx)
+ ww_ctx = NULL;
+
might_sleep();

#ifdef CONFIG_DEBUG_MUTEXES
@@ -939,7 +942,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
#endif

ww = container_of(lock, struct ww_mutex, base);
- if (use_ww_ctx && ww_ctx) {
+ if (ww_ctx) {
if (unlikely(ww_ctx == READ_ONCE(ww->ctx)))
return -EALREADY;

@@ -956,10 +959,10 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);

if (__mutex_trylock(lock) ||
- mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx, NULL)) {
+ mutex_optimistic_spin(lock, ww_ctx, NULL)) {
/* got the lock, yay! */
lock_acquired(&lock->dep_map, ip);
- if (use_ww_ctx && ww_ctx)
+ if (ww_ctx)
ww_mutex_set_context_fastpath(ww, ww_ctx);
preempt_enable();
return 0;
@@ -970,7 +973,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
* After waiting to acquire the wait_lock, try again.
*/
if (__mutex_trylock(lock)) {
- if (use_ww_ctx && ww_ctx)
+ if (ww_ctx)
__ww_mutex_check_waiters(lock, ww_ctx);

goto skip_wait;
@@ -1023,7 +1026,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
goto err;
}

- if (use_ww_ctx && ww_ctx) {
+ if (ww_ctx) {
ret = __ww_mutex_check_kill(lock, &waiter, ww_ctx);
if (ret)
goto err;
@@ -1036,7 +1039,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
* ww_mutex needs to always recheck its position since its waiter
* list is not FIFO ordered.
*/
- if ((use_ww_ctx && ww_ctx) || !first) {
+ if (ww_ctx || !first) {
first = __mutex_waiter_is_first(lock, &waiter);
if (first)
__mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
@@ -1049,7 +1052,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
* or we must see its unlock and acquire.
*/
if (__mutex_trylock(lock) ||
- (first && mutex_optimistic_spin(lock, ww_ctx, use_ww_ctx, &waiter)))
+ (first && mutex_optimistic_spin(lock, ww_ctx, &waiter)))
break;

spin_lock(&lock->wait_lock);
@@ -1058,7 +1061,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
acquired:
__set_current_state(TASK_RUNNING);

- if (use_ww_ctx && ww_ctx) {
+ if (ww_ctx) {
/*
* Wound-Wait; we stole the lock (!first_waiter), check the
* waiters as anyone might want to wound us.
@@ -1078,7 +1081,7 @@ skip_wait:
/* got the lock - cleanup and rejoice! */
lock_acquired(&lock->dep_map, ip);

- if (use_ww_ctx && ww_ctx)
+ if (ww_ctx)
ww_mutex_lock_acquired(ww, ww_ctx);

spin_unlock(&lock->wait_lock);
--
2.30.1
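
The whole cleanup is an instance of a generic C idiom: fold a boolean
into the pointer it guards once, then test only the pointer. A minimal
sketch with invented names:

#include <linux/types.h>

struct ctx;

static int do_work(struct ctx *ctx, const bool use_ctx)
{
        /* Normalize once at entry; callers must not rely on a ctx they
         * explicitly disabled. */
        if (!use_ctx)
                ctx = NULL;

        if (ctx)	/* was: if (use_ctx && ctx) */
                return 1;
        return 0;
}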



2021-04-05 16:04:52

by Greg Kroah-Hartman

Subject: [PATCH 5.10 003/126] virtiofs: Fail dax mount if device does not support it

From: Vivek Goyal <[email protected]>

[ Upstream commit 3f9b9efd82a84f27e95d0414f852caf1fa839e83 ]

Right now "mount -t virtiofs -o dax myfs /mnt/virtiofs" succeeds even
if filesystem deivce does not have a cache window and hence DAX can't
be supported.

This gives a false sense to user that they are using DAX with virtiofs
but fact of the matter is that they are not.

Fix this by returning error if dax can't be supported and user has asked
for it.

Signed-off-by: Vivek Goyal <[email protected]>
Reviewed-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/fuse/virtio_fs.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index d2c0e58c6416..3d83c9e12848 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -1324,8 +1324,15 @@ static int virtio_fs_fill_super(struct super_block *sb, struct fs_context *fsc)

/* virtiofs allocates and installs its own fuse devices */
ctx->fudptr = NULL;
- if (ctx->dax)
+ if (ctx->dax) {
+ if (!fs->dax_dev) {
+ err = -EINVAL;
+ pr_err("virtio-fs: dax can't be enabled as filesystem"
+ " device does not support it.\n");
+ goto err_free_fuse_devs;
+ }
ctx->dax_dev = fs->dax_dev;
+ }
err = fuse_fill_super_common(sb, ctx);
if (err < 0)
goto err_free_fuse_devs;
--
2.30.1



2021-04-05 16:04:52

by Greg Kroah-Hartman

Subject: [PATCH 5.10 025/126] staging: comedi: cb_pcidas: fix request_irq() warn

From: Tong Zhang <[email protected]>

[ Upstream commit 2e5848a3d86f03024ae096478bdb892ab3d79131 ]

request_irq() won't accept a name that contains a slash, so we need to
replace it with something else -- otherwise it triggers a warning and
the entry in /proc/irq/ is not created.
Since the .name might be used by userspace and we don't want to break
userspace, only the parameters passed to request_irq() are changed.
[ 1.630764] name 'pci-das1602/16'
[ 1.630950] WARNING: CPU: 0 PID: 181 at fs/proc/generic.c:180 __xlate_proc_name+0x93/0xb0
[ 1.634009] RIP: 0010:__xlate_proc_name+0x93/0xb0
[ 1.639441] Call Trace:
[ 1.639976] proc_mkdir+0x18/0x20
[ 1.641946] request_threaded_irq+0xfe/0x160
[ 1.642186] cb_pcidas_auto_attach+0xf4/0x610 [cb_pcidas]

Suggested-by: Ian Abbott <[email protected]>
Reviewed-by: Ian Abbott <[email protected]>
Signed-off-by: Tong Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/staging/comedi/drivers/cb_pcidas.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/cb_pcidas.c b/drivers/staging/comedi/drivers/cb_pcidas.c
index d740c4782775..2f20bd56ec6c 100644
--- a/drivers/staging/comedi/drivers/cb_pcidas.c
+++ b/drivers/staging/comedi/drivers/cb_pcidas.c
@@ -1281,7 +1281,7 @@ static int cb_pcidas_auto_attach(struct comedi_device *dev,
devpriv->amcc + AMCC_OP_REG_INTCSR);

ret = request_irq(pcidev->irq, cb_pcidas_interrupt, IRQF_SHARED,
- dev->board_name, dev);
+ "cb_pcidas", dev);
if (ret) {
dev_dbg(dev->class_dev, "unable to allocate irq %d\n",
pcidev->irq);
--
2.30.1



2021-04-05 16:04:55

by Greg Kroah-Hartman

Subject: [PATCH 5.10 032/126] nvmet-tcp: fix kmap leak when data digest in use

From: Elad Grupi <[email protected]>

[ Upstream commit bac04454ef9fada009f0572576837548b190bf94 ]

When the data digest is enabled we should unmap the PDU iovec before
handling the data digest PDU.

Signed-off-by: Elad Grupi <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvme/target/tcp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 8b0485ada315..d658c6e8263a 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1098,11 +1098,11 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
cmd->rbytes_done += ret;
}

+ nvmet_tcp_unmap_pdu_iovec(cmd);
if (queue->data_digest) {
nvmet_tcp_prep_recv_ddgst(cmd);
return 0;
}
- nvmet_tcp_unmap_pdu_iovec(cmd);

if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
cmd->rbytes_done == cmd->req.transfer_len) {
--
2.30.1



2021-04-05 16:04:56

by Greg Kroah-Hartman

Subject: [PATCH 5.10 027/126] ASoC: rt5659: Update MCLK rate in set_sysclk()

From: Sameer Pujar <[email protected]>

[ Upstream commit dbf54a9534350d6aebbb34f5c1c606b81a4f35dd ]

Simple-card/audio-graph-card drivers do not handle the MCLK clock when
it is specified in the codec device node. The expectation here is that
the codec should actually own the MCLK clock and do the necessary setup
in the driver.

Suggested-by: Mark Brown <[email protected]>
Suggested-by: Michael Walle <[email protected]>
Signed-off-by: Sameer Pujar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/codecs/rt5659.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/sound/soc/codecs/rt5659.c b/sound/soc/codecs/rt5659.c
index 41e5917b16a5..91a4ef7f620c 100644
--- a/sound/soc/codecs/rt5659.c
+++ b/sound/soc/codecs/rt5659.c
@@ -3426,12 +3426,17 @@ static int rt5659_set_component_sysclk(struct snd_soc_component *component, int
{
struct rt5659_priv *rt5659 = snd_soc_component_get_drvdata(component);
unsigned int reg_val = 0;
+ int ret;

if (freq == rt5659->sysclk && clk_id == rt5659->sysclk_src)
return 0;

switch (clk_id) {
case RT5659_SCLK_S_MCLK:
+ ret = clk_set_rate(rt5659->mclk, freq);
+ if (ret)
+ return ret;
+
reg_val |= RT5659_SCLK_SRC_MCLK;
break;
case RT5659_SCLK_S_PLL1:
--
2.30.1
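
The shape of the fix, stated on its own: when the machine driver selects
MCLK as the system clock, the codec pushes the requested frequency down
to the clock provider itself. A hedged sketch (hypothetical callback,
not rt5659's full function):

#include <linux/clk.h>

/* clk_set_rate() returns a negative errno if the provider cannot
 * achieve the requested rate; propagate that to the caller. */
static int codec_set_sysclk_mclk(struct clk *mclk, unsigned long freq)
{
        return clk_set_rate(mclk, freq);
}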



2021-04-05 16:04:58

by Greg Kroah-Hartman

Subject: [PATCH 5.10 006/126] fs: nfsd: fix kconfig dependency warning for NFSD_V4

From: Julian Braha <[email protected]>

[ Upstream commit 7005227369079963d25fb2d5d736d0feb2c44cf6 ]

When NFSD_V4 is enabled and CRYPTO is disabled,
Kbuild gives the following warning:

WARNING: unmet direct dependencies detected for CRYPTO_SHA256
Depends on [n]: CRYPTO [=n]
Selected by [y]:
- NFSD_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFSD [=y] && PROC_FS [=y]

WARNING: unmet direct dependencies detected for CRYPTO_MD5
Depends on [n]: CRYPTO [=n]
Selected by [y]:
- NFSD_V4 [=y] && NETWORK_FILESYSTEMS [=y] && NFSD [=y] && PROC_FS [=y]

This is because NFSD_V4 selects CRYPTO_MD5 and CRYPTO_SHA256,
without depending on or selecting CRYPTO, despite those config options
being subordinate to CRYPTO.

Signed-off-by: Julian Braha <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfsd/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
index dbbc583d6273..248f1459c039 100644
--- a/fs/nfsd/Kconfig
+++ b/fs/nfsd/Kconfig
@@ -73,6 +73,7 @@ config NFSD_V4
select NFSD_V3
select FS_POSIX_ACL
select SUNRPC_GSS
+ select CRYPTO
select CRYPTO_MD5
select CRYPTO_SHA256
select GRACE_PERIOD
--
2.30.1



2021-04-05 16:05:07

by Greg Kroah-Hartman

Subject: [PATCH 5.10 046/126] iwlwifi: pcie: don't disable interrupts for reg_lock

From: Johannes Berg <[email protected]>

[ Upstream commit 874020f8adce535cd318af1768ffe744251b6593 ]

The only thing we do touching the device in hard interrupt context
is, at most, writing an interrupt ACK register, which isn't racing
with anything protected by the reg_lock.

Thus, avoid disabling interrupts here for potentially long periods
of time; particularly long hold times have been observed while dumping
firmware memory (leading to lockup warnings on some devices).

Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Link: https://lore.kernel.org/r/iwlwifi.20210210135352.da916ab91298.I064c3e7823b616647293ed97da98edefb9ce9435@changeid
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../net/wireless/intel/iwlwifi/pcie/trans.c | 11 +++++-----
.../net/wireless/intel/iwlwifi/pcie/tx-gen2.c | 5 ++---
drivers/net/wireless/intel/iwlwifi/pcie/tx.c | 22 ++++++++-----------
3 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index 1a222469b5b4..bb990be7c870 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -2026,7 +2026,7 @@ static bool iwl_trans_pcie_grab_nic_access(struct iwl_trans *trans,
int ret;
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);

- spin_lock_irqsave(&trans_pcie->reg_lock, *flags);
+ spin_lock_bh(&trans_pcie->reg_lock);

if (trans_pcie->cmd_hold_nic_awake)
goto out;
@@ -2111,7 +2111,7 @@ static bool iwl_trans_pcie_grab_nic_access(struct iwl_trans *trans,
}

err:
- spin_unlock_irqrestore(&trans_pcie->reg_lock, *flags);
+ spin_unlock_bh(&trans_pcie->reg_lock);
return false;
}

@@ -2149,7 +2149,7 @@ static void iwl_trans_pcie_release_nic_access(struct iwl_trans *trans,
* scheduled on different CPUs (after we drop reg_lock).
*/
out:
- spin_unlock_irqrestore(&trans_pcie->reg_lock, *flags);
+ spin_unlock_bh(&trans_pcie->reg_lock);
}

static int iwl_trans_pcie_read_mem(struct iwl_trans *trans, u32 addr,
@@ -2403,11 +2403,10 @@ static void iwl_trans_pcie_set_bits_mask(struct iwl_trans *trans, u32 reg,
u32 mask, u32 value)
{
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
- unsigned long flags;

- spin_lock_irqsave(&trans_pcie->reg_lock, flags);
+ spin_lock_bh(&trans_pcie->reg_lock);
__iwl_trans_pcie_set_bits_mask(trans, reg, mask, value);
- spin_unlock_irqrestore(&trans_pcie->reg_lock, flags);
+ spin_unlock_bh(&trans_pcie->reg_lock);
}

static const char *get_csr_string(int cmd)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
index baa83a0b8593..8c7138247869 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
@@ -78,7 +78,6 @@ static int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans,
struct iwl_txq *txq = trans->txqs.txq[trans->txqs.cmd.q_id];
struct iwl_device_cmd *out_cmd;
struct iwl_cmd_meta *out_meta;
- unsigned long flags;
void *dup_buf = NULL;
dma_addr_t phys_addr;
int i, cmd_pos, idx;
@@ -291,11 +290,11 @@ static int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans,
if (txq->read_ptr == txq->write_ptr && txq->wd_timeout)
mod_timer(&txq->stuck_timer, jiffies + txq->wd_timeout);

- spin_lock_irqsave(&trans_pcie->reg_lock, flags);
+ spin_lock(&trans_pcie->reg_lock);
/* Increment and update queue's write index */
txq->write_ptr = iwl_txq_inc_wrap(trans, txq->write_ptr);
iwl_txq_inc_wr_ptr(trans, txq);
- spin_unlock_irqrestore(&trans_pcie->reg_lock, flags);
+ spin_unlock(&trans_pcie->reg_lock);

out:
spin_unlock_bh(&txq->lock);
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index ed54d04e4396..50133c09a780 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -321,12 +321,10 @@ static void iwl_pcie_txq_unmap(struct iwl_trans *trans, int txq_id)
txq->read_ptr = iwl_txq_inc_wrap(trans, txq->read_ptr);

if (txq->read_ptr == txq->write_ptr) {
- unsigned long flags;
-
- spin_lock_irqsave(&trans_pcie->reg_lock, flags);
+ spin_lock(&trans_pcie->reg_lock);
if (txq_id == trans->txqs.cmd.q_id)
iwl_pcie_clear_cmd_in_flight(trans);
- spin_unlock_irqrestore(&trans_pcie->reg_lock, flags);
+ spin_unlock(&trans_pcie->reg_lock);
}
}

@@ -931,7 +929,6 @@ static void iwl_pcie_cmdq_reclaim(struct iwl_trans *trans, int txq_id, int idx)
{
struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
struct iwl_txq *txq = trans->txqs.txq[txq_id];
- unsigned long flags;
int nfreed = 0;
u16 r;

@@ -962,9 +959,10 @@ static void iwl_pcie_cmdq_reclaim(struct iwl_trans *trans, int txq_id, int idx)
}

if (txq->read_ptr == txq->write_ptr) {
- spin_lock_irqsave(&trans_pcie->reg_lock, flags);
+ /* BHs are also disabled due to txq->lock */
+ spin_lock(&trans_pcie->reg_lock);
iwl_pcie_clear_cmd_in_flight(trans);
- spin_unlock_irqrestore(&trans_pcie->reg_lock, flags);
+ spin_unlock(&trans_pcie->reg_lock);
}

iwl_pcie_txq_progress(txq);
@@ -1173,7 +1171,6 @@ static int iwl_pcie_enqueue_hcmd(struct iwl_trans *trans,
struct iwl_txq *txq = trans->txqs.txq[trans->txqs.cmd.q_id];
struct iwl_device_cmd *out_cmd;
struct iwl_cmd_meta *out_meta;
- unsigned long flags;
void *dup_buf = NULL;
dma_addr_t phys_addr;
int idx;
@@ -1416,20 +1413,19 @@ static int iwl_pcie_enqueue_hcmd(struct iwl_trans *trans,
if (txq->read_ptr == txq->write_ptr && txq->wd_timeout)
mod_timer(&txq->stuck_timer, jiffies + txq->wd_timeout);

- spin_lock_irqsave(&trans_pcie->reg_lock, flags);
+ spin_lock(&trans_pcie->reg_lock);
ret = iwl_pcie_set_cmd_in_flight(trans, cmd);
if (ret < 0) {
idx = ret;
- spin_unlock_irqrestore(&trans_pcie->reg_lock, flags);
- goto out;
+ goto unlock_reg;
}

/* Increment and update queue's write index */
txq->write_ptr = iwl_txq_inc_wrap(trans, txq->write_ptr);
iwl_pcie_txq_inc_wr_ptr(trans, txq);

- spin_unlock_irqrestore(&trans_pcie->reg_lock, flags);
-
+ unlock_reg:
+ spin_unlock(&trans_pcie->reg_lock);
out:
spin_unlock_bh(&txq->lock);
free_dup_buf:
--
2.30.1
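
The locking swap is worth stating in isolation: if no hard-IRQ path ever
takes the lock, disabling bottom halves is sufficient and interrupts
stay enabled while it is held. A sketch under an invented lock (not the
iwlwifi code):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(reg_lock);

static void touch_regs(void)
{
        spin_lock_bh(&reg_lock);	/* was: spin_lock_irqsave(&reg_lock, flags) */
        /* ... program device registers ... */
        spin_unlock_bh(&reg_lock);
}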



2021-04-05 16:05:12

by Greg Kroah-Hartman

Subject: [PATCH 5.10 007/126] rpc: fix NULL dereference on kmalloc failure

From: J. Bruce Fields <[email protected]>

[ Upstream commit 0ddc942394013f08992fc379ca04cffacbbe3dae ]

I think this is unlikely but possible:

svc_authenticate sets rq_authop and calls svcauth_gss_accept. The
kmalloc(sizeof(*svcdata), GFP_KERNEL) fails, leaving rq_auth_data NULL,
and returning SVC_DENIED.

This causes svc_process_common to go to err_bad_auth, and eventually
call svc_authorise. That calls ->release == svcauth_gss_release, which
tries to dereference rq_auth_data.

Signed-off-by: J. Bruce Fields <[email protected]>
Link: https://lore.kernel.org/linux-nfs/[email protected]/T/#t
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/sunrpc/auth_gss/svcauth_gss.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index bd4678db9d76..6dff64374bfe 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1825,11 +1825,14 @@ static int
svcauth_gss_release(struct svc_rqst *rqstp)
{
struct gss_svc_data *gsd = (struct gss_svc_data *)rqstp->rq_auth_data;
- struct rpc_gss_wire_cred *gc = &gsd->clcred;
+ struct rpc_gss_wire_cred *gc;
struct xdr_buf *resbuf = &rqstp->rq_res;
int stat = -EINVAL;
struct sunrpc_net *sn = net_generic(SVC_NET(rqstp), sunrpc_net_id);

+ if (!gsd)
+ goto out;
+ gc = &gsd->clcred;
if (gc->gc_proc != RPC_GSS_PROC_DATA)
goto out;
/* Release can be called twice, but we only wrap once. */
@@ -1870,10 +1873,10 @@ out_err:
if (rqstp->rq_cred.cr_group_info)
put_group_info(rqstp->rq_cred.cr_group_info);
rqstp->rq_cred.cr_group_info = NULL;
- if (gsd->rsci)
+ if (gsd && gsd->rsci) {
cache_put(&gsd->rsci->h, sn->rsc_cache);
- gsd->rsci = NULL;
-
+ gsd->rsci = NULL;
+ }
return stat;
}

--
2.30.1



2021-04-05 16:05:14

by Greg Kroah-Hartman

Subject: [PATCH 5.10 009/126] ASoC: rt1015: fix i2c communication error

From: Jack Yu <[email protected]>

[ Upstream commit 9e0bdaa9fcb8c64efc1487a7fba07722e7bc515e ]

Remove the 0x100 cache re-sync to solve an I2C communication error.

Signed-off-by: Jack Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/codecs/rt1015.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/sound/soc/codecs/rt1015.c b/sound/soc/codecs/rt1015.c
index 3db07293c70b..2627910060dc 100644
--- a/sound/soc/codecs/rt1015.c
+++ b/sound/soc/codecs/rt1015.c
@@ -209,6 +209,7 @@ static bool rt1015_volatile_register(struct device *dev, unsigned int reg)
case RT1015_VENDOR_ID:
case RT1015_DEVICE_ID:
case RT1015_PRO_ALT:
+ case RT1015_MAN_I2C:
case RT1015_DAC3:
case RT1015_VBAT_TEST_OUT1:
case RT1015_VBAT_TEST_OUT2:
--
2.30.1
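
The mechanism behind the one-line fix: registers a regmap reports as
volatile are never cached, so regcache_sync() will not replay a stale
value to them over I2C. A sketch of such a callback with a hypothetical
register name:

#include <linux/device.h>
#include <linux/types.h>

#define CODEC_MAN_I2C	0x100	/* hypothetical device-managed register */

/* Matches the regmap_config .volatile_reg signature; returning true
 * exempts the register from the cache and from sync-time restores. */
static bool codec_volatile_register(struct device *dev, unsigned int reg)
{
        switch (reg) {
        case CODEC_MAN_I2C:
                return true;
        default:
                return false;
        }
}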



2021-04-05 16:05:18

by Greg Kroah-Hartman

Subject: [PATCH 5.10 045/126] netdevsim: dev: Initialize FIB module after debugfs

From: Ido Schimmel <[email protected]>

[ Upstream commit f57ab5b75f7193e194c83616cd104f41c8350f68 ]

Initialize the dummy FIB offload module after debugfs, so that the FIB
module can create its own directory there.

Signed-off-by: Amit Cohen <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/netdevsim/dev.c | 40 +++++++++++++++++++------------------
1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/drivers/net/netdevsim/dev.c b/drivers/net/netdevsim/dev.c
index e7972e88ffe0..9bbecf4d159b 100644
--- a/drivers/net/netdevsim/dev.c
+++ b/drivers/net/netdevsim/dev.c
@@ -1008,23 +1008,25 @@ static int nsim_dev_reload_create(struct nsim_dev *nsim_dev,
nsim_dev->fw_update_status = true;
nsim_dev->fw_update_overwrite_mask = 0;

- nsim_dev->fib_data = nsim_fib_create(devlink, extack);
- if (IS_ERR(nsim_dev->fib_data))
- return PTR_ERR(nsim_dev->fib_data);
-
nsim_devlink_param_load_driverinit_values(devlink);

err = nsim_dev_dummy_region_init(nsim_dev, devlink);
if (err)
- goto err_fib_destroy;
+ return err;

err = nsim_dev_traps_init(devlink);
if (err)
goto err_dummy_region_exit;

+ nsim_dev->fib_data = nsim_fib_create(devlink, extack);
+ if (IS_ERR(nsim_dev->fib_data)) {
+ err = PTR_ERR(nsim_dev->fib_data);
+ goto err_traps_exit;
+ }
+
err = nsim_dev_health_init(nsim_dev, devlink);
if (err)
- goto err_traps_exit;
+ goto err_fib_destroy;

err = nsim_dev_port_add_all(nsim_dev, nsim_bus_dev->port_count);
if (err)
@@ -1039,12 +1041,12 @@ static int nsim_dev_reload_create(struct nsim_dev *nsim_dev,

err_health_exit:
nsim_dev_health_exit(nsim_dev);
+err_fib_destroy:
+ nsim_fib_destroy(devlink, nsim_dev->fib_data);
err_traps_exit:
nsim_dev_traps_exit(devlink);
err_dummy_region_exit:
nsim_dev_dummy_region_exit(nsim_dev);
-err_fib_destroy:
- nsim_fib_destroy(devlink, nsim_dev->fib_data);
return err;
}

@@ -1076,15 +1078,9 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
if (err)
goto err_devlink_free;

- nsim_dev->fib_data = nsim_fib_create(devlink, NULL);
- if (IS_ERR(nsim_dev->fib_data)) {
- err = PTR_ERR(nsim_dev->fib_data);
- goto err_resources_unregister;
- }
-
err = devlink_register(devlink, &nsim_bus_dev->dev);
if (err)
- goto err_fib_destroy;
+ goto err_resources_unregister;

err = devlink_params_register(devlink, nsim_devlink_params,
ARRAY_SIZE(nsim_devlink_params));
@@ -1104,9 +1100,15 @@ int nsim_dev_probe(struct nsim_bus_dev *nsim_bus_dev)
if (err)
goto err_traps_exit;

+ nsim_dev->fib_data = nsim_fib_create(devlink, NULL);
+ if (IS_ERR(nsim_dev->fib_data)) {
+ err = PTR_ERR(nsim_dev->fib_data);
+ goto err_debugfs_exit;
+ }
+
err = nsim_dev_health_init(nsim_dev, devlink);
if (err)
- goto err_debugfs_exit;
+ goto err_fib_destroy;

err = nsim_bpf_dev_init(nsim_dev);
if (err)
@@ -1124,6 +1126,8 @@ err_bpf_dev_exit:
nsim_bpf_dev_exit(nsim_dev);
err_health_exit:
nsim_dev_health_exit(nsim_dev);
+err_fib_destroy:
+ nsim_fib_destroy(devlink, nsim_dev->fib_data);
err_debugfs_exit:
nsim_dev_debugfs_exit(nsim_dev);
err_traps_exit:
@@ -1135,8 +1139,6 @@ err_params_unregister:
ARRAY_SIZE(nsim_devlink_params));
err_dl_unregister:
devlink_unregister(devlink);
-err_fib_destroy:
- nsim_fib_destroy(devlink, nsim_dev->fib_data);
err_resources_unregister:
devlink_resources_unregister(devlink, NULL);
err_devlink_free:
@@ -1153,10 +1155,10 @@ static void nsim_dev_reload_destroy(struct nsim_dev *nsim_dev)
debugfs_remove(nsim_dev->take_snapshot);
nsim_dev_port_del_all(nsim_dev);
nsim_dev_health_exit(nsim_dev);
+ nsim_fib_destroy(devlink, nsim_dev->fib_data);
nsim_dev_traps_exit(devlink);
nsim_dev_dummy_region_exit(nsim_dev);
mutex_destroy(&nsim_dev->port_list_lock);
- nsim_fib_destroy(devlink, nsim_dev->fib_data);
}

void nsim_dev_remove(struct nsim_bus_dev *nsim_bus_dev)
--
2.30.1
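
The underlying discipline is the usual one for init paths: bring
components up in dependency order (debugfs before FIB, since FIB creates
files there) and unwind in exact reverse on error. A generic sketch with
invented stand-in helpers:

static int init_debugfs(void) { return 0; }	/* stand-ins for the */
static int init_fib(void) { return 0; }		/* real init helpers */
static void exit_debugfs(void) { }

static int dev_setup(void)
{
        int err;

        err = init_debugfs();
        if (err)
                return err;

        err = init_fib();	/* may create entries under debugfs */
        if (err)
                goto err_debugfs;

        return 0;

err_debugfs:
        exit_debugfs();
        return err;
}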



2021-04-05 16:05:18

by Greg Kroah-Hartman

Subject: [PATCH 5.10 034/126] static_call: Align static_call_is_init() patching condition

From: Peter Zijlstra <[email protected]>

[ Upstream commit 698bacefe993ad2922c9d3b1380591ad489355e9 ]

The intent is to avoid writing init code after init (because the text
might have been freed). The code is needlessly different between
jump_label and static_call and not obviously correct.

The existing code relies on the fact that the module loader clears the
init layout, such that within_module_init() always fails, while
jump_label relies on the module state, which is more obvious and
matches the kernel logic.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Jarkko Sakkinen <[email protected]>
Tested-by: Sumit Garg <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/static_call.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/kernel/static_call.c b/kernel/static_call.c
index 49efbdc5b480..f59089a12231 100644
--- a/kernel/static_call.c
+++ b/kernel/static_call.c
@@ -149,6 +149,7 @@ void __static_call_update(struct static_call_key *key, void *tramp, void *func)
};

for (site_mod = &first; site_mod; site_mod = site_mod->next) {
+ bool init = system_state < SYSTEM_RUNNING;
struct module *mod = site_mod->mod;

if (!site_mod->sites) {
@@ -168,6 +169,7 @@ void __static_call_update(struct static_call_key *key, void *tramp, void *func)
if (mod) {
stop = mod->static_call_sites +
mod->num_static_call_sites;
+ init = mod->state == MODULE_STATE_COMING;
}
#endif

@@ -175,16 +177,8 @@ void __static_call_update(struct static_call_key *key, void *tramp, void *func)
site < stop && static_call_key(site) == key; site++) {
void *site_addr = static_call_addr(site);

- if (static_call_is_init(site)) {
- /*
- * Don't write to call sites which were in
- * initmem and have since been freed.
- */
- if (!mod && system_state >= SYSTEM_RUNNING)
- continue;
- if (mod && !within_module_init((unsigned long)site_addr, mod))
- continue;
- }
+ if (!init && static_call_is_init(site))
+ continue;

if (!kernel_text_address((unsigned long)site_addr)) {
/*
--
2.30.1



2021-04-05 16:05:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 024/126] scsi: qla2xxx: Fix broken #endif placement

From: Alexey Dobriyan <[email protected]>

[ Upstream commit 5999b9e5b1f8a2f5417b755130919b3ac96f5550 ]

Only half of the file is under the include guard because the terminating
#endif is placed too early.
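
For reference, a correctly guarded header keeps every declaration inside
the guard (a generic example, not the qla2xxx header itself):

#ifndef EXAMPLE_H
#define EXAMPLE_H

#define EXAMPLE_MAX	16

struct example {
	int x;
};

#endif /* EXAMPLE_H -- must come last; an earlier #endif leaves
	* everything below it unguarded against double inclusion. */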

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Himanshu Madhani <[email protected]>
Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/qla2xxx/qla_target.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h
index 1cff7c69d448..1e94586c7eb2 100644
--- a/drivers/scsi/qla2xxx/qla_target.h
+++ b/drivers/scsi/qla2xxx/qla_target.h
@@ -116,7 +116,6 @@
(min(1270, ((ql) > 0) ? (QLA_TGT_DATASEGS_PER_CMD_24XX + \
QLA_TGT_DATASEGS_PER_CONT_24XX*((ql) - 1)) : 0))
#endif
-#endif

#define GET_TARGET_ID(ha, iocb) ((HAS_EXTENDED_IDS(ha)) \
? le16_to_cpu((iocb)->u.isp2x.target.extended) \
@@ -244,6 +243,7 @@ struct ctio_to_2xxx {
#ifndef CTIO_RET_TYPE
#define CTIO_RET_TYPE 0x17 /* CTIO return entry */
#define ATIO_TYPE7 0x06 /* Accept target I/O entry for 24xx */
+#endif

struct fcp_hdr {
uint8_t r_ctl;
--
2.30.1



2021-04-05 16:05:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 047/126] ath10k: hold RCU lock when calling ieee80211_find_sta_by_ifaddr()

From: Shuah Khan <[email protected]>

[ Upstream commit 09078368d516918666a0122f2533dc73676d3d7e ]

ieee80211_find_sta_by_ifaddr() must be called under the RCU lock and
the resulting pointer is only valid under RCU lock as well.

Fix ath10k_wmi_tlv_op_pull_peer_stats_info() to hold RCU lock before it
calls ieee80211_find_sta_by_ifaddr() and release it when the resulting
pointer is no longer needed.
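
Condensed from the diff below, the required shape of the fix is (a
sketch only; the TDLS request arguments are elided into a comment):

rcu_read_lock();
station = ieee80211_find_sta_by_ifaddr(ar->hw, ev->peer_macaddr.addr,
				       NULL);
if (!station) {
	ath10k_warn(ar, "did not find station from tdls peer event");
	goto exit;
}
/* every use of 'station' stays inside the read-side section */
exit:
rcu_read_unlock();
kfree(tb);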

This problem was found while reviewing code to debug RCU warn from
ath10k_wmi_tlv_parse_peer_stats_info().

Link: https://lore.kernel.org/linux-wireless/[email protected]/
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/ath/ath10k/wmi-tlv.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
index e6135795719a..e7072fc4f487 100644
--- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c
+++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
@@ -576,13 +576,13 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
case WMI_TDLS_TEARDOWN_REASON_TX:
case WMI_TDLS_TEARDOWN_REASON_RSSI:
case WMI_TDLS_TEARDOWN_REASON_PTR_TIMEOUT:
+ rcu_read_lock();
station = ieee80211_find_sta_by_ifaddr(ar->hw,
ev->peer_macaddr.addr,
NULL);
if (!station) {
ath10k_warn(ar, "did not find station from tdls peer event");
- kfree(tb);
- return;
+ goto exit;
}
arvif = ath10k_get_arvif(ar, __le32_to_cpu(ev->vdev_id));
ieee80211_tdls_oper_request(
@@ -593,6 +593,9 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
);
break;
}
+
+exit:
+ rcu_read_unlock();
kfree(tb);
}

--
2.30.1



2021-04-05 16:05:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 043/126] ath11k: add ieee80211_unregister_hw to avoid kernel crash caused by NULL pointer

From: Wen Gong <[email protected]>

[ Upstream commit 0d96968315d7ffbd70d608b29e9bea084210b96d ]

When __ath11k_mac_register() fails after ieee80211_register_hw() has
already succeeded, it sets wiphy->dev.parent to NULL via
SET_IEEE80211_DEV(ar->hw, NULL) at the end of __ath11k_mac_register().
cfg80211_get_drvinfo() is then invoked through the call stack below
while wiphy->dev.parent is NULL, so the kernel crashes.

Call stack to cfg80211_get_drvinfo:
NetworkManager 826 [001] 6696.731371: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)
ffffffffc107d8f1 cfg80211_get_drvinfo+0x1 (/lib/modules/5.10.0-rc1-wt-ath+/kernel/net/wireless-back/cfg80211.ko)
ffffffff9d8fc529 ethtool_get_drvinfo+0x99 (vmlinux)
ffffffff9d90080e dev_ethtool+0x1dbe (vmlinux)
ffffffff9d8b88f7 dev_ioctl+0xb7 (vmlinux)
ffffffff9d8668de sock_do_ioctl+0xae (vmlinux)
ffffffff9d866d60 sock_ioctl+0x350 (vmlinux)
ffffffff9d2ca30e __x64_sys_ioctl+0x8e (vmlinux)
ffffffff9da0dda3 do_syscall_64+0x33 (vmlinux)
ffffffff9dc0008c entry_SYSCALL_64_after_hwframe+0x44 (vmlinux)
7feb5f673007 __GI___ioctl+0x7 (/lib/x86_64-linux-gnu/libc-2.23.so)
0 [unknown] ([unknown])

Code of cfg80211_get_drvinfo(); the pdev (wiphy->dev.parent) is NULL at
the time of the crash:
void cfg80211_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
{
struct wireless_dev *wdev = dev->ieee80211_ptr;
struct device *pdev = wiphy_dev(wdev->wiphy);

if (pdev->driver)
....

kernel crash log:
[ 973.619550] ath11k_pci 0000:05:00.0: failed to perform regd update : -16
[ 973.619555] ath11k_pci 0000:05:00.0: ath11k regd update failed: -16
[ 973.619566] ath11k_pci 0000:05:00.0: failed register the radio with mac80211: -16
[ 973.619618] ath11k_pci 0000:05:00.0: failed to create pdev core: -16
[ 973.636035] BUG: kernel NULL pointer dereference, address: 0000000000000068
[ 973.636046] #PF: supervisor read access in kernel mode
[ 973.636050] #PF: error_code(0x0000) - not-present page
[ 973.636054] PGD 800000012452e067 P4D 800000012452e067 PUD 12452d067 PMD 0
[ 973.636064] Oops: 0000 [#1] SMP PTI
[ 973.636072] CPU: 3 PID: 848 Comm: NetworkManager Kdump: loaded Tainted: G W OE 5.10.0-rc1-wt-ath+ #24
[ 973.636076] Hardware name: LENOVO 418065C/418065C, BIOS 83ET63WW (1.33 ) 07/29/2011
[ 973.636161] RIP: 0010:cfg80211_get_drvinfo+0x25/0xd0 [cfg80211]
[ 973.636169] Code: e9 c9 fe ff ff 66 66 66 66 90 55 53 ba 20 00 00 00 48 8b af 08 03 00 00 48 89 f3 48 8d 7e 04 48 8b 45 00 48 8b 80 90 01 00 00 <48> 8b 40 68 48 85 c0 0f 84 8d 00 00 00 48 8b 30 e8 a6 cc 72 c7 48
[ 973.636174] RSP: 0018:ffffaafb4040bbe0 EFLAGS: 00010286
[ 973.636180] RAX: 0000000000000000 RBX: ffffaafb4040bbfc RCX: 0000000000000000
[ 973.636184] RDX: 0000000000000020 RSI: ffffaafb4040bbfc RDI: ffffaafb4040bc00
[ 973.636188] RBP: ffff8a84c9568950 R08: 722d302e30312e35 R09: 74612d74772d3163
[ 973.636192] R10: 3163722d302e3031 R11: 2b6874612d74772d R12: ffffaafb4040bbfc
[ 973.636196] R13: 00007ffe453707c0 R14: ffff8a84c9568000 R15: 0000000000000000
[ 973.636202] FS: 00007fd3d179b940(0000) GS:ffff8a84fa2c0000(0000) knlGS:0000000000000000
[ 973.636206] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 973.636211] CR2: 0000000000000068 CR3: 00000001153b6002 CR4: 00000000000606e0
[ 973.636215] Call Trace:
[ 973.636234] ethtool_get_drvinfo+0x99/0x1f0
[ 973.636246] dev_ethtool+0x1dbe/0x2be0
[ 973.636256] ? mntput_no_expire+0x35/0x220
[ 973.636264] ? inet_ioctl+0x1ce/0x200
[ 973.636274] ? tomoyo_path_number_perm+0x68/0x1d0
[ 973.636282] ? kmem_cache_alloc+0x3cb/0x430
[ 973.636290] ? dev_ioctl+0xb7/0x570
[ 973.636295] dev_ioctl+0xb7/0x570
[ 973.636307] sock_do_ioctl+0xae/0x150
[ 973.636315] ? sock_ioctl+0x350/0x3c0
[ 973.636319] sock_ioctl+0x350/0x3c0
[ 973.636332] ? __x64_sys_ioctl+0x8e/0xd0
[ 973.636339] ? dlci_ioctl_set+0x30/0x30
[ 973.636346] __x64_sys_ioctl+0x8e/0xd0
[ 973.636359] do_syscall_64+0x33/0x80
[ 973.636368] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Sequence of function calls during wlan load in the success case, when
__ath11k_mac_register() returns 0:

kworker/u16:3-e 2922 [001] 6696.729734: probe:ieee80211_register_hw: (ffffffffc116ae60)
kworker/u16:3-e 2922 [001] 6696.730210: probe:ieee80211_if_add: (ffffffffc1185cc0)
NetworkManager 826 [001] 6696.731345: probe:ethtool_get_drvinfo: (ffffffff9d8fc490)
NetworkManager 826 [001] 6696.731371: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)
NetworkManager 826 [001] 6696.731639: probe:ethtool_get_drvinfo: (ffffffff9d8fc490)
NetworkManager 826 [001] 6696.731653: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)
NetworkManager 826 [001] 6696.732866: probe:ethtool_get_drvinfo: (ffffffff9d8fc490)
NetworkManager 826 [001] 6696.732893: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)
systemd-udevd 3850 [003] 6696.737199: probe:ethtool_get_drvinfo: (ffffffff9d8fc490)
systemd-udevd 3850 [003] 6696.737226: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)
NetworkManager 826 [000] 6696.759950: probe:ethtool_get_drvinfo: (ffffffff9d8fc490)
NetworkManager 826 [000] 6696.759967: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)
NetworkManager 826 [000] 6696.760057: probe:ethtool_get_drvinfo: (ffffffff9d8fc490)
NetworkManager 826 [000] 6696.760062: probe:cfg80211_get_drvinfo: (ffffffffc107d8f0)

After applying this patch the kernel crash is gone. Below is the test
case's sequence of function calls and log when wlan load fails in
ath11k_regd_update() and __ath11k_mac_register() returns an error:

kworker/u16:5-e 192 [001] 215.174388: probe:ieee80211_register_hw: (ffffffffc1131e60)
kworker/u16:5-e 192 [000] 215.174973: probe:ieee80211_if_add: (ffffffffc114ccc0)
NetworkManager 846 [001] 215.175857: probe:ethtool_get_drvinfo: (ffffffff928fc490)
kworker/u16:5-e 192 [000] 215.175867: probe:ieee80211_unregister_hw: (ffffffffc1131970)
NetworkManager 846 [001] 215.175880: probe:cfg80211_get_drvinfo: (ffffffffc107f8f0)
NetworkManager 846 [001] 215.176105: probe:ethtool_get_drvinfo: (ffffffff928fc490)
NetworkManager 846 [001] 215.176118: probe:cfg80211_get_drvinfo: (ffffffffc107f8f0)
[ 215.175859] ath11k_pci 0000:05:00.0: ath11k regd update failed: -16
NetworkManager 846 [001] 215.196420: probe:ethtool_get_drvinfo: (ffffffff928fc490)
NetworkManager 846 [001] 215.196430: probe:cfg80211_get_drvinfo: (ffffffffc107f8f0)
[ 215.258598] ath11k_pci 0000:05:00.0: failed register the radio with mac80211: -16
[ 215.258613] ath11k_pci 0000:05:00.0: failed to create pdev core: -16

When ath11k_regd_update() or ath11k_debugfs_register() fails, mac80211's
ieee80211_unregister_hw() is now called; it waits until
cfg80211_get_drvinfo() has finished, and wiphy->dev.parent is still
valid at that moment. Only afterwards does __ath11k_mac_register() set
wiphy->dev.parent to NULL via SET_IEEE80211_DEV(ar->hw, NULL), so the
kernel crash no longer happens.

Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1
Signed-off-by: Wen Gong <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/ath/ath11k/mac.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c
index ee0edd918560..e9e6b0c4de22 100644
--- a/drivers/net/wireless/ath/ath11k/mac.c
+++ b/drivers/net/wireless/ath/ath11k/mac.c
@@ -6317,17 +6317,20 @@ static int __ath11k_mac_register(struct ath11k *ar)
ret = ath11k_regd_update(ar, true);
if (ret) {
ath11k_err(ar->ab, "ath11k regd update failed: %d\n", ret);
- goto err_free_if_combs;
+ goto err_unregister_hw;
}

ret = ath11k_debugfs_register(ar);
if (ret) {
ath11k_err(ar->ab, "debugfs registration failed: %d\n", ret);
- goto err_free_if_combs;
+ goto err_unregister_hw;
}

return 0;

+err_unregister_hw:
+ ieee80211_unregister_hw(ar->hw);
+
err_free_if_combs:
kfree(ar->hw->wiphy->iface_combinations[0].limits);
kfree(ar->hw->wiphy->iface_combinations);
--
2.30.1



2021-04-05 16:05:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 048/126] net: ethernet: aquantia: Handle error cleanup of start on open

From: Nathan Rossi <[email protected]>

[ Upstream commit 8a28af7a3e85ddf358f8c41e401a33002f7a9587 ]

The aq_nic_start function can fail in a variety of cases, which leaves
the device in a broken state.

An example case where the start function fails is request_threaded_irq(),
which can be interrupted, resulting in -EINTR. This can be manually
triggered by bringing the link up (e.g. ip link set up) and sending a
SIGINT to the initiating process (e.g. Ctrl+C). This puts the device into
a half-configured state. Subsequently bringing the link up again then
causes napi_enable to BUG.

In order to correctly clean up a failed attempt to start the device, call
aq_nic_stop.
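
Condensed from the diff below, the resulting open path reads (a sketch,
not the full function):

err = aq_nic_start(aq_nic);
if (err < 0) {
	/* start failed part-way (e.g. request_threaded_irq interrupted,
	 * returning -EINTR); tear down whatever was brought up so the
	 * next open starts clean instead of hitting BUG() in
	 * napi_enable. */
	aq_nic_stop(aq_nic);
	goto err_exit;
}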

Signed-off-by: Nathan Rossi <[email protected]>
Reviewed-by: Igor Russkikh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/aquantia/atlantic/aq_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_main.c b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
index 8f70a3909929..4af0cd9530de 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_main.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
@@ -71,8 +71,10 @@ static int aq_ndev_open(struct net_device *ndev)
goto err_exit;

err = aq_nic_start(aq_nic);
- if (err < 0)
+ if (err < 0) {
+ aq_nic_stop(aq_nic);
goto err_exit;
+ }

err_exit:
if (err < 0)
--
2.30.1



2021-04-05 16:05:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 049/126] appletalk: Fix skb allocation size in loopback case

From: Doug Brown <[email protected]>

[ Upstream commit 39935dccb21c60f9bbf1bb72d22ab6fd14ae7705 ]

If a DDP broadcast packet is sent out to a non-gateway target, it is
also looped back. There is a potential for the loopback device to have a
longer hardware header length than the original target route's device,
which can result in the skb not being created with enough room for the
loopback device's hardware header. This patch fixes the issue by
determining that a loopback will be necessary prior to allocating the
skb, and if so, ensuring the skb has enough room.

This was discovered while testing a new driver that creates a LocalTalk
network interface (LTALK_HLEN = 1). It caused an skb_under_panic.
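
A standalone model of the sizing rule (hypothetical lengths; the real
code compares the route device's and the loopback route device's
hard_header_len):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	size_t payload = 80;	/* DDP payload */
	size_t dev_hlen = 1;	/* e.g. LocalTalk, LTALK_HLEN = 1 */
	size_t lo_hlen = 14;	/* loopback path may need more headroom */
	/* reserve for the largest header any delivery path will prepend */
	size_t hlen = lo_hlen > dev_hlen ? lo_hlen : dev_hlen;
	unsigned char *buf = malloc(hlen + payload);

	if (!buf)
		return 1;
	printf("allocated %zu bytes (%zu headroom + %zu payload)\n",
	       hlen + payload, hlen, payload);
	free(buf);
	return 0;
}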

Signed-off-by: Doug Brown <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/appletalk/ddp.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 1d48708c5a2e..c94b212d8e7c 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1576,8 +1576,8 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
struct sk_buff *skb;
struct net_device *dev;
struct ddpehdr *ddp;
- int size;
- struct atalk_route *rt;
+ int size, hard_header_len;
+ struct atalk_route *rt, *rt_lo = NULL;
int err;

if (flags & ~(MSG_DONTWAIT|MSG_CMSG_COMPAT))
@@ -1640,7 +1640,22 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
SOCK_DEBUG(sk, "SK %p: Size needed %d, device %s\n",
sk, size, dev->name);

- size += dev->hard_header_len;
+ hard_header_len = dev->hard_header_len;
+ /* Leave room for loopback hardware header if necessary */
+ if (usat->sat_addr.s_node == ATADDR_BCAST &&
+ (dev->flags & IFF_LOOPBACK || !(rt->flags & RTF_GATEWAY))) {
+ struct atalk_addr at_lo;
+
+ at_lo.s_node = 0;
+ at_lo.s_net = 0;
+
+ rt_lo = atrtr_find(&at_lo);
+
+ if (rt_lo && rt_lo->dev->hard_header_len > hard_header_len)
+ hard_header_len = rt_lo->dev->hard_header_len;
+ }
+
+ size += hard_header_len;
release_sock(sk);
skb = sock_alloc_send_skb(sk, size, (flags & MSG_DONTWAIT), &err);
lock_sock(sk);
@@ -1648,7 +1663,7 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
goto out;

skb_reserve(skb, ddp_dl->header_length);
- skb_reserve(skb, dev->hard_header_len);
+ skb_reserve(skb, hard_header_len);
skb->dev = dev;

SOCK_DEBUG(sk, "SK %p: Begin build.\n", sk);
@@ -1699,18 +1714,12 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
/* loop back */
skb_orphan(skb);
if (ddp->deh_dnode == ATADDR_BCAST) {
- struct atalk_addr at_lo;
-
- at_lo.s_node = 0;
- at_lo.s_net = 0;
-
- rt = atrtr_find(&at_lo);
- if (!rt) {
+ if (!rt_lo) {
kfree_skb(skb);
err = -ENETUNREACH;
goto out;
}
- dev = rt->dev;
+ dev = rt_lo->dev;
skb->dev = dev;
}
ddp_dl->request(ddp_dl, skb, dev->dev_addr);
--
2.30.1



2021-04-05 16:05:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 051/126] net: ipa: fix register write command validation

From: Alex Elder <[email protected]>

[ Upstream commit 2d65ed76924bc772d3974b0894d870b1aa63b34a ]

In ipa_cmd_register_write_valid() we verify that values we will
supply to a REGISTER_WRITE IPA immediate command will fit in
the fields that need to hold them. This patch fixes some issues
in that function and ipa_cmd_register_write_offset_valid().

The dev_err() call in ipa_cmd_register_write_offset_valid() has
some printf format errors:
- The name of the register (corresponding to the string format
specifier) was not supplied.
- The IPA base offset and offset need to be supplied separately to
match the other format specifiers.
Also make the ~0 constant used there to compute the maximum
supported offset value explicitly unsigned.
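
The distinction matters because plain ~0 is a signed int. A standalone
demonstration (using bit_count = 20 as an example value):

#include <stdio.h>

int main(void)
{
	unsigned int bit_count = 20;
	/* ~0 is int -1; right-shifting a negative value is
	 * implementation-defined and sign-extends on common compilers,
	 * so the "maximum" keeps all 32 bits set. */
	unsigned int bad = ~0 >> (32 - bit_count);
	/* ~0U shifts in zero bits, producing the intended maximum. */
	unsigned int good = ~0U >> (32 - bit_count);

	printf("~0  >> 12 = 0x%08x\n", bad);	/* 0xffffffff (typically) */
	printf("~0U >> 12 = 0x%08x\n", good);	/* 0x000fffff */
	return 0;
}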

There are two other issues in ipa_cmd_register_write_valid():
- There's no need to check the hash flush register for platforms
(like IPA v4.2) that do not support hashed tables
- The highest possible endpoint number, whose status register
offset is computed, is COUNT - 1, not COUNT.

Fix these problems, and add some additional commentary.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ipa/ipa_cmd.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ipa/ipa_cmd.c b/drivers/net/ipa/ipa_cmd.c
index d92dd3f09b73..46d8b7336d8f 100644
--- a/drivers/net/ipa/ipa_cmd.c
+++ b/drivers/net/ipa/ipa_cmd.c
@@ -1,7 +1,7 @@
// SPDX-License-Identifier: GPL-2.0

/* Copyright (c) 2012-2018, The Linux Foundation. All rights reserved.
- * Copyright (C) 2019-2020 Linaro Ltd.
+ * Copyright (C) 2019-2021 Linaro Ltd.
*/

#include <linux/types.h>
@@ -244,11 +244,15 @@ static bool ipa_cmd_register_write_offset_valid(struct ipa *ipa,
if (ipa->version != IPA_VERSION_3_5_1)
bit_count += hweight32(REGISTER_WRITE_FLAGS_OFFSET_HIGH_FMASK);
BUILD_BUG_ON(bit_count > 32);
- offset_max = ~0 >> (32 - bit_count);
+ offset_max = ~0U >> (32 - bit_count);

+ /* Make sure the offset can be represented by the field(s)
+ * that holds it. Also make sure the offset is not outside
+ * the overall IPA memory range.
+ */
if (offset > offset_max || ipa->mem_offset > offset_max - offset) {
dev_err(dev, "%s offset too large 0x%04x + 0x%04x > 0x%04x)\n",
- ipa->mem_offset + offset, offset_max);
+ name, ipa->mem_offset, offset, offset_max);
return false;
}

@@ -261,12 +265,24 @@ static bool ipa_cmd_register_write_valid(struct ipa *ipa)
const char *name;
u32 offset;

- offset = ipa_reg_filt_rout_hash_flush_offset(ipa->version);
- name = "filter/route hash flush";
- if (!ipa_cmd_register_write_offset_valid(ipa, name, offset))
- return false;
+ /* If hashed tables are supported, ensure the hash flush register
+ * offset will fit in a register write IPA immediate command.
+ */
+ if (ipa->version != IPA_VERSION_4_2) {
+ offset = ipa_reg_filt_rout_hash_flush_offset(ipa->version);
+ name = "filter/route hash flush";
+ if (!ipa_cmd_register_write_offset_valid(ipa, name, offset))
+ return false;
+ }

- offset = IPA_REG_ENDP_STATUS_N_OFFSET(IPA_ENDPOINT_COUNT);
+ /* Each endpoint can have a status endpoint associated with it,
+ * and this is recorded in an endpoint register. If the modem
+ * crashes, we reset the status endpoint for all modem endpoints
+ * using a register write IPA immediate command. Make sure the
+ * worst case (highest endpoint number) offset of that endpoint
+ * fits in the register write command field(s) that must hold it.
+ */
+ offset = IPA_REG_ENDP_STATUS_N_OFFSET(IPA_ENDPOINT_COUNT - 1);
name = "maximal endpoint status";
if (!ipa_cmd_register_write_offset_valid(ipa, name, offset))
return false;
--
2.30.1



2021-04-05 16:05:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 055/126] ACPI: tables: x86: Reserve memory occupied by ACPI tables

From: Rafael J. Wysocki <[email protected]>

commit 1a1c130ab7575498eed5bcf7220037ae09cd1f8a upstream.

The following problem has been reported by George Kennedy:

Since commit 7fef431be9c9 ("mm/page_alloc: place pages to tail
in __free_pages_core()") the following use after free occurs
intermittently when ACPI tables are accessed.

BUG: KASAN: use-after-free in ibft_init+0x134/0xc49
Read of size 4 at addr ffff8880be453004 by task swapper/0/1
CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1-7a7fd0d #1
Call Trace:
dump_stack+0xf6/0x158
print_address_description.constprop.9+0x41/0x60
kasan_report.cold.14+0x7b/0xd4
__asan_report_load_n_noabort+0xf/0x20
ibft_init+0x134/0xc49
do_one_initcall+0xc4/0x3e0
kernel_init_freeable+0x5af/0x66b
kernel_init+0x16/0x1d0
ret_from_fork+0x22/0x30

ACPI tables mapped via kmap() do not have their mapped pages
reserved and the pages can be "stolen" by the buddy allocator.

Apparently, on the affected system, the ACPI table in question is
not located in "reserved" memory, like ACPI NVS or ACPI Data, that
will not be used by the buddy allocator, so the memory occupied by
that table has to be explicitly reserved to prevent the buddy
allocator from using it.

In order to address this problem, rearrange the initialization of the
ACPI tables on x86 to locate the initial tables earlier and reserve
the memory occupied by them.

The other architectures using ACPI should not be affected by this
change.

Link: https://lore.kernel.org/linux-acpi/[email protected]/
Reported-by: George Kennedy <[email protected]>
Tested-by: George Kennedy <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Cc: 5.10+ <[email protected]> # 5.10+
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/acpi/boot.c | 25 ++++++++++++-------------
arch/x86/kernel/setup.c | 8 +++-----
drivers/acpi/tables.c | 42 +++++++++++++++++++++++++++++++++++++++---
include/linux/acpi.h | 9 ++++++++-
4 files changed, 62 insertions(+), 22 deletions(-)

--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -1554,10 +1554,18 @@ void __init acpi_boot_table_init(void)
/*
* Initialize the ACPI boot-time table parser.
*/
- if (acpi_table_init()) {
+ if (acpi_locate_initial_tables())
disable_acpi();
- return;
- }
+ else
+ acpi_reserve_initial_tables();
+}
+
+int __init early_acpi_boot_init(void)
+{
+ if (acpi_disabled)
+ return 1;
+
+ acpi_table_init_complete();

acpi_table_parse(ACPI_SIG_BOOT, acpi_parse_sbf);

@@ -1570,18 +1578,9 @@ void __init acpi_boot_table_init(void)
} else {
printk(KERN_WARNING PREFIX "Disabling ACPI support\n");
disable_acpi();
- return;
+ return 1;
}
}
-}
-
-int __init early_acpi_boot_init(void)
-{
- /*
- * If acpi_disabled, bail out
- */
- if (acpi_disabled)
- return 1;

/*
* Process the Multiple APIC Description Table (MADT), if present
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1051,6 +1051,9 @@ void __init setup_arch(char **cmdline_p)

cleanup_highmap();

+ /* Look for ACPI tables and reserve memory occupied by them. */
+ acpi_boot_table_init();
+
memblock_set_current_limit(ISA_END_ADDRESS);
e820__memblock_setup();

@@ -1136,11 +1139,6 @@ void __init setup_arch(char **cmdline_p)

early_platform_quirks();

- /*
- * Parse the ACPI tables for possible boot-time SMP configuration.
- */
- acpi_boot_table_init();
-
early_acpi_boot_init();

initmem_init();
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -780,7 +780,7 @@ acpi_status acpi_os_table_override(struc
}

/*
- * acpi_table_init()
+ * acpi_locate_initial_tables()
*
* find RSDP, find and checksum SDT/XSDT.
* checksum all tables, print SDT/XSDT
@@ -788,7 +788,7 @@ acpi_status acpi_os_table_override(struc
* result: sdt_entry[] is initialized
*/

-int __init acpi_table_init(void)
+int __init acpi_locate_initial_tables(void)
{
acpi_status status;

@@ -803,9 +803,45 @@ int __init acpi_table_init(void)
status = acpi_initialize_tables(initial_tables, ACPI_MAX_TABLES, 0);
if (ACPI_FAILURE(status))
return -EINVAL;
- acpi_table_initrd_scan();

+ return 0;
+}
+
+void __init acpi_reserve_initial_tables(void)
+{
+ int i;
+
+ for (i = 0; i < ACPI_MAX_TABLES; i++) {
+ struct acpi_table_desc *table_desc = &initial_tables[i];
+ u64 start = table_desc->address;
+ u64 size = table_desc->length;
+
+ if (!start || !size)
+ break;
+
+ pr_info("Reserving %4s table memory at [mem 0x%llx-0x%llx]\n",
+ table_desc->signature.ascii, start, start + size - 1);
+
+ memblock_reserve(start, size);
+ }
+}
+
+void __init acpi_table_init_complete(void)
+{
+ acpi_table_initrd_scan();
check_multiple_madt();
+}
+
+int __init acpi_table_init(void)
+{
+ int ret;
+
+ ret = acpi_locate_initial_tables();
+ if (ret)
+ return ret;
+
+ acpi_table_init_complete();
+
return 0;
}

--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -222,10 +222,14 @@ void __iomem *__acpi_map_table(unsigned
void __acpi_unmap_table(void __iomem *map, unsigned long size);
int early_acpi_boot_init(void);
int acpi_boot_init (void);
+void acpi_boot_table_prepare (void);
void acpi_boot_table_init (void);
int acpi_mps_check (void);
int acpi_numa_init (void);

+int acpi_locate_initial_tables (void);
+void acpi_reserve_initial_tables (void);
+void acpi_table_init_complete (void);
int acpi_table_init (void);
int acpi_table_parse(char *id, acpi_tbl_table_handler handler);
int __init acpi_table_parse_entries(char *id, unsigned long table_size,
@@ -807,9 +811,12 @@ static inline int acpi_boot_init(void)
return 0;
}

+static inline void acpi_boot_table_prepare(void)
+{
+}
+
static inline void acpi_boot_table_init(void)
{
- return;
}

static inline int acpi_mps_check(void)


2021-04-05 16:05:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 008/126] iomap: Fix negative assignment to unsigned sis->pages in iomap_swapfile_activate

From: Ritesh Harjani <[email protected]>

[ Upstream commit 5808fecc572391867fcd929662b29c12e6d08d81 ]

If isi.nr_pages is 0, we turn sis->pages (which is an unsigned int) into
a huge value in iomap_swapfile_activate() by assigning -1. This could
cause a kernel crash on kernel v4.18 (with the signature below), or could
lead to unknown issues on the latest kernel if the fake, oversized swap
gets used.

Fix this issue by returning -EINVAL when nr_pages is 0, since such a
swapfile is invalid anyway. This issue will likely be hit with a
pagesize < blocksize type of configuration.
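
A minimal standalone demonstration of why the -1 assignment is
dangerous (mirroring the types involved; sis->pages is an unsigned int):

#include <stdio.h>

int main(void)
{
	unsigned int nr_pages = 0;		/* models isi.nr_pages == 0 */
	unsigned int pages = nr_pages - 1;	/* models sis->pages = isi.nr_pages - 1 */

	/* wraps to UINT_MAX: the kernel would see a huge bogus swap size */
	printf("pages = %u\n", pages);		/* 4294967295 */
	return 0;
}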

I was able to hit the issue with a tiny swap file using the test script
below.
https://raw.githubusercontent.com/riteshharjani/LinuxStudy/master/scripts/swap-issue.sh

kernel crash analysis on v4.18
==============================
On the v4.18 kernel it causes a kernel panic, since sis->pages becomes a
huge value and isi.nr_extents is 0. When 0 is returned, the file is
treated as a swapfile over NFS and SWP_FILE is set (sis->flags |=
SWP_FILE). Then, when swapoff is called, it invokes
a_ops->swap_deactivate() if (sis->flags & SWP_FILE) is true. Since
a_ops->swap_deactivate() is NULL in the case of XFS, it causes the panic
below.

Panic signature on v4.18 kernel:
=======================================
root@qemu:/home/qemu# [ 8291.723351] XFS (loop2): Unmounting Filesystem
[ 8292.123104] XFS (loop2): Mounting V5 Filesystem
[ 8292.132451] XFS (loop2): Ending clean mount
[ 8292.263362] Adding 4294967232k swap on /mnt1/test/swapfile. Priority:-2 extents:1 across:274877906880k
[ 8292.277834] Unable to handle kernel paging request for instruction fetch
[ 8292.278677] Faulting instruction address: 0x00000000
cpu 0x19: Vector: 400 (Instruction Access) at [c0000009dd5b7ad0]
pc: 0000000000000000
lr: c0000000003eb9dc: destroy_swap_extents+0xfc/0x120
sp: c0000009dd5b7d50
msr: 8000000040009033
current = 0xc0000009b6710080
paca = 0xc00000003ffcb280 irqmask: 0x03 irq_happened: 0x01
pid = 5604, comm = swapoff
Linux version 4.18.0 (riteshh@xxxxxxx) (gcc version 8.4.0 (Ubuntu 8.4.0-1ubuntu1~18.04)) #57 SMP Wed Mar 3 01:33:04 CST 2021
enter ? for help
[link register ] c0000000003eb9dc destroy_swap_extents+0xfc/0x120
[c0000009dd5b7d50] c0000000025a7058 proc_poll_event+0x0/0x4 (unreliable)
[c0000009dd5b7da0] c0000000003f0498 sys_swapoff+0x3f8/0x910
[c0000009dd5b7e30] c00000000000bbe4 system_call+0x5c/0x70
Exception: c01 (System Call) at 00007ffff7d208d8

Signed-off-by: Ritesh Harjani <[email protected]>
[djwong: rework the comment to provide more details]
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/iomap/swapfile.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c
index a648dbf6991e..a5e478de1417 100644
--- a/fs/iomap/swapfile.c
+++ b/fs/iomap/swapfile.c
@@ -170,6 +170,16 @@ int iomap_swapfile_activate(struct swap_info_struct *sis,
return ret;
}

+ /*
+ * If this swapfile doesn't contain even a single page-aligned
+ * contiguous range of blocks, reject this useless swapfile to
+ * prevent confusion later on.
+ */
+ if (isi.nr_pages == 0) {
+ pr_warn("swapon: Cannot find a single usable page in file.\n");
+ return -EINVAL;
+ }
+
*pagespan = 1 + isi.highest_ppage - isi.lowest_ppage;
sis->max = isi.nr_pages;
sis->pages = isi.nr_pages - 1;
--
2.30.1



2021-04-05 16:05:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 050/126] net: ipa: remove two unused register definitions

From: Alex Elder <[email protected]>

[ Upstream commit d5bc5015eb9d64cbd14e467db1a56db1472d0d6c ]

We do not support inter-EE channel or event ring commands. Inter-EE
interrupts are disabled (and never re-enabled) for all channels and
event rings, so we have no need for the GSI registers that clear
those interrupt conditions. So remove their definitions.

Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ipa/gsi_reg.h | 10 ----------
1 file changed, 10 deletions(-)

diff --git a/drivers/net/ipa/gsi_reg.h b/drivers/net/ipa/gsi_reg.h
index 8e0e9350c383..42e5a8b8d324 100644
--- a/drivers/net/ipa/gsi_reg.h
+++ b/drivers/net/ipa/gsi_reg.h
@@ -48,16 +48,6 @@
#define GSI_INTER_EE_N_SRC_EV_CH_IRQ_OFFSET(ee) \
(0x0000c01c + 0x1000 * (ee))

-#define GSI_INTER_EE_SRC_CH_IRQ_CLR_OFFSET \
- GSI_INTER_EE_N_SRC_CH_IRQ_CLR_OFFSET(GSI_EE_AP)
-#define GSI_INTER_EE_N_SRC_CH_IRQ_CLR_OFFSET(ee) \
- (0x0000c028 + 0x1000 * (ee))
-
-#define GSI_INTER_EE_SRC_EV_CH_IRQ_CLR_OFFSET \
- GSI_INTER_EE_N_SRC_EV_CH_IRQ_CLR_OFFSET(GSI_EE_AP)
-#define GSI_INTER_EE_N_SRC_EV_CH_IRQ_CLR_OFFSET(ee) \
- (0x0000c02c + 0x1000 * (ee))
-
#define GSI_CH_C_CNTXT_0_OFFSET(ch) \
GSI_EE_N_CH_C_CNTXT_0_OFFSET((ch), GSI_EE_AP)
#define GSI_EE_N_CH_C_CNTXT_0_OFFSET(ch, ee) \
--
2.30.1



2021-04-05 16:05:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 054/126] bpf: Remove MTU check in __bpf_skb_max_len

From: Jesper Dangaard Brouer <[email protected]>

commit 6306c1189e77a513bf02720450bb43bd4ba5d8ae upstream.

Multiple BPF helpers that can manipulate/increase the size of the SKB use
__bpf_skb_max_len() as the max length. This function limits the size
against the current net_device MTU (skb->dev->mtu).

When a BPF prog grows the packet size, it should not be limited to the
MTU. The MTU is a transmit limitation, and software receiving this packet
should be allowed to increase the size. Furthermore, the current MTU check
in __bpf_skb_max_len() uses the MTU from the ingress/current net_device,
which in the case of redirects is the wrong net_device.

This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real
limit is elsewhere in the system. Jesper's testing[1] showed it was not
possible to exceed 8KiB when expanding the SKB size via a BPF helper. The
limiting factor is the define KMALLOC_MAX_CACHE_SIZE, which is 8192 for
the SLUB allocator (CONFIG_SLUB) in case PAGE_SIZE is 4096. This define is
in effect because this is called from softirq context; see
__gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed
that frames above 16KiB can cause NICs to reset (but not crash). Keep the
sanity limit at this level, as the memory layer can differ based on kernel
config.

[1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@firesoul
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/filter.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3552,11 +3552,7 @@ static int bpf_skb_net_shrink(struct sk_
return 0;
}

-static u32 __bpf_skb_max_len(const struct sk_buff *skb)
-{
- return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len :
- SKB_MAX_ALLOC;
-}
+#define BPF_SKB_MAX_LEN SKB_MAX_ALLOC

BPF_CALL_4(sk_skb_adjust_room, struct sk_buff *, skb, s32, len_diff,
u32, mode, u64, flags)
@@ -3605,7 +3601,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct s
{
u32 len_cur, len_diff_abs = abs(len_diff);
u32 len_min = bpf_skb_net_base_len(skb);
- u32 len_max = __bpf_skb_max_len(skb);
+ u32 len_max = BPF_SKB_MAX_LEN;
__be16 proto = skb->protocol;
bool shrink = len_diff < 0;
u32 off;
@@ -3688,7 +3684,7 @@ static int bpf_skb_trim_rcsum(struct sk_
static inline int __bpf_skb_change_tail(struct sk_buff *skb, u32 new_len,
u64 flags)
{
- u32 max_len = __bpf_skb_max_len(skb);
+ u32 max_len = BPF_SKB_MAX_LEN;
u32 min_len = __bpf_skb_min_len(skb);
int ret;

@@ -3764,7 +3760,7 @@ static const struct bpf_func_proto sk_sk
static inline int __bpf_skb_change_head(struct sk_buff *skb, u32 head_room,
u64 flags)
{
- u32 max_len = __bpf_skb_max_len(skb);
+ u32 max_len = BPF_SKB_MAX_LEN;
u32 new_len = skb->len + head_room;
int ret;



2021-04-05 16:05:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 057/126] ALSA: usb-audio: Apply sample rate quirk to Logitech Connect

From: Ikjoon Jang <[email protected]>

commit 625bd5a616ceda4840cd28f82e957c8ced394b6a upstream.

Logitech ConferenceCam Connect is a compound USB device with UVC and
UAC. It is not 100% reproducible, but sometimes the device keeps
responding with STALL to every control transfer once it receives a
get_freq request.

This patch adds 046d:084c to the snd_usb_get_sample_rate_quirk list.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203419
Signed-off-by: Ikjoon Jang <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/usb/quirks.c | 1 +
1 file changed, 1 insertion(+)

--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -1521,6 +1521,7 @@ bool snd_usb_get_sample_rate_quirk(struc
case USB_ID(0x21b4, 0x0081): /* AudioQuest DragonFly */
case USB_ID(0x2912, 0x30c8): /* Audioengine D1 */
case USB_ID(0x413c, 0xa506): /* Dell AE515 sound bar */
+ case USB_ID(0x046d, 0x084c): /* Logitech ConferenceCam Connect */
return true;
}



2021-04-05 16:05:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 058/126] ALSA: hda: Re-add dropped snd_power_change_state() calls

From: Takashi Iwai <[email protected]>

commit c8f79808cd8eb5bc8d14de129bd6d586d3fce0aa upstream.

The card power state change via snd_power_change_state() at system
suspend/resume seems to have been mistakenly dropped during the PM code
rewrite. The card power state doesn't play much of a role nowadays, but
it's still referred to in a few places such as the HDMI codec driver.

This patch restores them, but in a more appropriate place now in the
prepare and complete callbacks.

Fixes: f5dac54d9d93 ("ALSA: hda: Separate runtime and system suspend")
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/hda_intel.c | 2 ++
1 file changed, 2 insertions(+)

--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -1025,6 +1025,7 @@ static int azx_prepare(struct device *de

chip = card->private_data;
chip->pm_prepared = 1;
+ snd_power_change_state(card, SNDRV_CTL_POWER_D3hot);

flush_work(&azx_bus(chip)->unsol_work);

@@ -1040,6 +1041,7 @@ static void azx_complete(struct device *
struct azx *chip;

chip = card->private_data;
+ snd_power_change_state(card, SNDRV_CTL_POWER_D0);
chip->pm_prepared = 0;
}



2021-04-05 16:05:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 059/126] ALSA: hda: Add missing sanity checks in PM prepare/complete callbacks

From: Takashi Iwai <[email protected]>

commit 66affb7bb0dc0905155a1b2475261aa704d1ddb5 upstream.

The recently added PM prepare and complete callbacks don't have a
sanity check for whether the card instance has been properly
initialized, which may potentially lead to an Oops.

This patch adds the azx_is_pm_ready() call in each place
appropriately like other PM callbacks.

Fixes: f5dac54d9d93 ("ALSA: hda: Separate runtime and system suspend")
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/hda_intel.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -1023,6 +1023,9 @@ static int azx_prepare(struct device *de
struct snd_card *card = dev_get_drvdata(dev);
struct azx *chip;

+ if (!azx_is_pm_ready(card))
+ return 0;
+
chip = card->private_data;
chip->pm_prepared = 1;
snd_power_change_state(card, SNDRV_CTL_POWER_D3hot);
@@ -1040,6 +1043,9 @@ static void azx_complete(struct device *
struct snd_card *card = dev_get_drvdata(dev);
struct azx *chip;

+ if (!azx_is_pm_ready(card))
+ return;
+
chip = card->private_data;
snd_power_change_state(card, SNDRV_CTL_POWER_D0);
chip->pm_prepared = 0;


2021-04-05 16:05:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 061/126] ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook

From: Hui Wang <[email protected]>

commit e54f30befa7990b897189b44a56c1138c6bfdbb5 upstream.

We found that alc_update_headset_mode() is not called on some machines
when unplugging the headset. As a result, the ALC_HEADSET_MODE_UNPLUGGED
mode can't be set and current_headset_type is not cleared; if users plug
in a different type of headset next time, determine_headset_type() will
not be called and the audio jack is set to the headset type from the
previous time.

On the Dell machines which connect the dmic to the PCH, this issue
happens if we open gnome-sound-settings and unplug the headset. Those
machines disable auto-mute via ucm and have no internal mic in the input
source, so update_headset_mode() will not be called by cap_sync_hook or
automute_hook when unplugging; and because gnome-sound-settings is open,
the codec will not enter the runtime_suspend state, so
update_headset_mode() will not be called by alc_resume when unplugging
either. In this case hp_automute_hook is called when unplugging, so add
a call to update_headset_mode() to this function.

Cc: <[email protected]>
Signed-off-by: Hui Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -5440,6 +5440,7 @@ static void alc_update_headset_jack_cb(s
struct hda_jack_callback *jack)
{
snd_hda_gen_hp_automute(codec, jack);
+ alc_update_headset_mode(codec);
}

static void alc_probe_headset_mode(struct hda_codec *codec)


2021-04-05 16:05:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 053/126] net: 9p: advance iov on empty read

From: Jisheng Zhang <[email protected]>

[ Upstream commit d65614a01d24704b016635abf5cc028a54e45a62 ]

I hit the warning below when cat-ing a small (about 80 bytes) txt file
on 9pfs (msize=2097152 is passed as a 9p mount option). The reason is
that we skip iov_iter_advance() when the read count is 0 in the zerocopy
case, so the pipe is never truncated and iov_iter_pipe() then thinks the
pipe is full. Fix it by removing the exception for 0, ensuring that
iov_iter_advance() is called even on an empty read in the zerocopy case.

[ 8.279568] WARNING: CPU: 0 PID: 39 at lib/iov_iter.c:1203 iov_iter_pipe+0x31/0x40
[ 8.280028] Modules linked in:
[ 8.280561] CPU: 0 PID: 39 Comm: cat Not tainted 5.11.0+ #6
[ 8.281260] RIP: 0010:iov_iter_pipe+0x31/0x40
[ 8.281974] Code: 2b 42 54 39 42 5c 76 22 c7 07 20 00 00 00 48 89 57 18 8b 42 50 48 c7 47 08 b
[ 8.283169] RSP: 0018:ffff888000cbbd80 EFLAGS: 00000246
[ 8.283512] RAX: 0000000000000010 RBX: ffff888000117d00 RCX: 0000000000000000
[ 8.283876] RDX: ffff88800031d600 RSI: 0000000000000000 RDI: ffff888000cbbd90
[ 8.284244] RBP: ffff888000cbbe38 R08: 0000000000000000 R09: ffff8880008d2058
[ 8.284605] R10: 0000000000000002 R11: ffff888000375510 R12: 0000000000000050
[ 8.284964] R13: ffff888000cbbe80 R14: 0000000000000050 R15: ffff88800031d600
[ 8.285439] FS: 00007f24fd8af600(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 8.285844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.286150] CR2: 00007f24fd7d7b90 CR3: 0000000000c97000 CR4: 00000000000406b0
[ 8.286710] Call Trace:
[ 8.288279] generic_file_splice_read+0x31/0x1a0
[ 8.289273] ? do_splice_to+0x2f/0x90
[ 8.289511] splice_direct_to_actor+0xcc/0x220
[ 8.289788] ? pipe_to_sendpage+0xa0/0xa0
[ 8.290052] do_splice_direct+0x8b/0xd0
[ 8.290314] do_sendfile+0x1ad/0x470
[ 8.290576] do_syscall_64+0x2d/0x40
[ 8.290818] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 8.291409] RIP: 0033:0x7f24fd7dca0a
[ 8.292511] Code: c3 0f 1f 80 00 00 00 00 4c 89 d2 4c 89 c6 e9 bd fd ff ff 0f 1f 44 00 00 31 8
[ 8.293360] RSP: 002b:00007ffc20932818 EFLAGS: 00000206 ORIG_RAX: 0000000000000028
[ 8.293800] RAX: ffffffffffffffda RBX: 0000000001000000 RCX: 00007f24fd7dca0a
[ 8.294153] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
[ 8.294504] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[ 8.294867] R10: 0000000001000000 R11: 0000000000000206 R12: 0000000000000003
[ 8.295217] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
[ 8.295782] ---[ end trace 63317af81b3ca24b ]---

Signed-off-by: Jisheng Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/9p/client.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/net/9p/client.c b/net/9p/client.c
index 09f1ec589b80..eb42bbb72f52 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -1617,10 +1617,6 @@ p9_client_read_once(struct p9_fid *fid, u64 offset, struct iov_iter *to,
}

p9_debug(P9_DEBUG_9P, "<<< RREAD count %d\n", count);
- if (!count) {
- p9_tag_remove(clnt, req);
- return 0;
- }

if (non_zc) {
int n = copy_to_iter(dataptr, count, to);
--
2.30.1



2021-04-05 16:05:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 035/126] ext4: do not iput inode under running transaction in ext4_rename()

From: zhangyi (F) <[email protected]>

[ Upstream commit 5dccdc5a1916d4266edd251f20bbbb113a5c495f ]

In ext4_rename(), when RENAME_WHITEOUT fails to add the new entry into
the directory, it ends up dropping the newly created whiteout inode under
the running transaction. After commit <9b88f9fb0d2> ("ext4: Do not iput
inode under running transaction"), we follow the assumption that evict()
does not get called from a transaction context, but ext4_rename() breaks
that rule. Although it's not a real problem, it is better to obey the
rule, so this patch adds the inode to the orphan list and stops the
transaction before the final iput().
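
Condensed from the whiteout error branch in the diff below, the ordering
the patch establishes is (a sketch, not the full cleanup path):

drop_nlink(whiteout);
ext4_orphan_add(handle, whiteout);	/* keep it recoverable on crash */
unlock_new_inode(whiteout);
ext4_journal_stop(handle);		/* leave the transaction first */
iput(whiteout);				/* evict() now runs outside the handle */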

Signed-off-by: zhangyi (F) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ext4/namei.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 6c7eba426a67..ab7baf529917 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -3788,14 +3788,14 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
*/
retval = -ENOENT;
if (!old.bh || le32_to_cpu(old.de->inode) != old.inode->i_ino)
- goto end_rename;
+ goto release_bh;

new.bh = ext4_find_entry(new.dir, &new.dentry->d_name,
&new.de, &new.inlined);
if (IS_ERR(new.bh)) {
retval = PTR_ERR(new.bh);
new.bh = NULL;
- goto end_rename;
+ goto release_bh;
}
if (new.bh) {
if (!new.inode) {
@@ -3812,15 +3812,13 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
handle = ext4_journal_start(old.dir, EXT4_HT_DIR, credits);
if (IS_ERR(handle)) {
retval = PTR_ERR(handle);
- handle = NULL;
- goto end_rename;
+ goto release_bh;
}
} else {
whiteout = ext4_whiteout_for_rename(&old, credits, &handle);
if (IS_ERR(whiteout)) {
retval = PTR_ERR(whiteout);
- whiteout = NULL;
- goto end_rename;
+ goto release_bh;
}
}

@@ -3957,16 +3955,18 @@ end_rename:
ext4_resetent(handle, &old,
old.inode->i_ino, old_file_type);
drop_nlink(whiteout);
+ ext4_orphan_add(handle, whiteout);
}
unlock_new_inode(whiteout);
+ ext4_journal_stop(handle);
iput(whiteout);
-
+ } else {
+ ext4_journal_stop(handle);
}
+release_bh:
brelse(old.dir_bh);
brelse(old.bh);
brelse(new.bh);
- if (handle)
- ext4_journal_stop(handle);
return retval;
}

--
2.30.1



2021-04-05 16:05:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 069/126] tracing: Fix stack trace event size

From: Steven Rostedt (VMware) <[email protected]>

commit 9deb193af69d3fd6dd8e47f292b67c805a787010 upstream.

Commit cbc3b92ce037 fixed an issue to modify the macros of the stack trace
event so that user space could parse it properly. Originally the stack
trace format to user space showed that the called stack was a dynamic
array. But it is not actually a dynamic array, in the way that other
dynamic event arrays worked, and this broke user space parsing for it. The
update was to make the array look to have 8 entries in it. Helper
functions were added to make it parse it correctly, as the stack was
dynamic, but was determined by the size of the event stored.

Although this fixed user space on how it read the event, it changed the
internal structure used for the stack trace event. It changed the array
size from [0] to [8] (added 8 entries). This increased the size of the
stack trace event by 8 words. The size reserved on the ring buffer was the
size of the stack trace event plus the number of stack entries found in
the stack trace. That commit caused the amount to be 8 more than what was
needed because it did not expect the caller field to have any size. This
produced 8 entries of garbage (and reading random data) from the stack
trace event:

<idle>-0 [002] d... 1976396.837549: <stack trace>
=> trace_event_raw_event_sched_switch
=> __traceiter_sched_switch
=> __schedule
=> schedule_idle
=> do_idle
=> cpu_startup_entry
=> secondary_startup_64_no_verify
=> 0xc8c5e150ffff93de
=> 0xffff93de
=> 0
=> 0
=> 0xc8c5e17800000000
=> 0x1f30affff93de
=> 0x00000004
=> 0x200000000

Instead, subtract the size of the caller field from the size of the event
to make sure that only the amount needed to store the stack trace is
reserved.
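
A standalone sketch of the size arithmetic (a simplified stand-in
struct, not the real ftrace entry; sizes assume a 64-bit ABI):

#include <stdio.h>

struct stack_entry {			/* stand-in for the event */
	int size;
	unsigned long caller[8];	/* fixed [8] since commit cbc3b92ce037 */
};

int main(void)
{
	unsigned int nr_entries = 3;	/* entries actually captured */
	size_t size = nr_entries * sizeof(unsigned long);
	struct stack_entry e;

	/* old: the header already contains 8 caller slots, so adding
	 * 'size' on top reserves 8 words of garbage */
	printf("old reserve: %zu\n", sizeof(e) + size);
	/* new: strip the caller array from the header before adding the
	 * bytes actually needed for the captured entries */
	printf("new reserve: %zu\n", (sizeof(e) - sizeof(e.caller)) + size);
	return 0;
}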

Link: https://lore.kernel.org/lkml/[email protected]/

Cc: [email protected]
Fixes: cbc3b92ce037 ("tracing: Set kernel_stack's caller size properly")
Reported-by: Vasily Gorbik <[email protected]>
Tested-by: Vasily Gorbik <[email protected]>
Acked-by: Vasily Gorbik <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/trace/trace.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2984,7 +2984,8 @@ static void __ftrace_trace_stack(struct

size = nr_entries * sizeof(unsigned long);
event = __trace_buffer_lock_reserve(buffer, TRACE_STACK,
- sizeof(*entry) + size, flags, pc);
+ (sizeof(*entry) - sizeof(entry->caller)) + size,
+ flags, pc);
if (!event)
goto out;
entry = ring_buffer_event_data(event);


2021-04-05 16:05:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 060/126] ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO

From: Hui Wang <[email protected]>

commit febf22565549ea7111e7d45e8f2d64373cc66b11 upstream.

We found a recording issue on a Dell AIO: users plug in a headset mic
and select headset-mic from the UI, but can't record any sound from it.
The root cause is that determine_headset_type() returns a wrong type,
e.g. users plug in a ctia type headset, but that function returns omtp
type.

On this machine the internal mic is not connected to the codec, and the
"Input Source" is headset mic by default. When users plug in a headset,
determine_headset_type() is called immediately. The codec on this AIO is
alc274, and the delay time for this codec in determine_headset_type() is
only 80ms; that delay is too short to correctly determine the headset
type, and the failure rate is nearly 99% when users plug in the headset
at normal speed.

Other codecs set a delay time of several hundred ms, so change the delay
time to 850ms for the alc2x4 series. After this change the failure rate
is zero unless users plug in the headset slowly on purpose.

Cc: <[email protected]>
Signed-off-by: Hui Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/patch_realtek.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -5256,7 +5256,7 @@ static void alc_determine_headset_type(s
case 0x10ec0274:
case 0x10ec0294:
alc_process_coef_fw(codec, coef0274);
- msleep(80);
+ msleep(850);
val = alc_read_coef_idx(codec, 0x46);
is_ctia = (val & 0x00f0) == 0x00f0;
break;


2021-04-05 16:05:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 039/126] can: dev: move driver related infrastructure into separate subdir

From: Marc Kleine-Budde <[email protected]>

[ Upstream commit 3e77f70e734584e0ad1038e459ed3fd2400f873a ]

This patch moves the CAN driver related infrastructure into a separate subdir.
It will be split into more files in the coming patches.

Reviewed-by: Vincent Mailhol <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/can/Makefile | 7 +------
drivers/net/can/dev/Makefile | 7 +++++++
drivers/net/can/{ => dev}/dev.c | 0
drivers/net/can/{ => dev}/rx-offload.c | 0
4 files changed, 8 insertions(+), 6 deletions(-)
create mode 100644 drivers/net/can/dev/Makefile
rename drivers/net/can/{ => dev}/dev.c (100%)
rename drivers/net/can/{ => dev}/rx-offload.c (100%)

diff --git a/drivers/net/can/Makefile b/drivers/net/can/Makefile
index 22164300122d..a2b4463d8480 100644
--- a/drivers/net/can/Makefile
+++ b/drivers/net/can/Makefile
@@ -7,12 +7,7 @@ obj-$(CONFIG_CAN_VCAN) += vcan.o
obj-$(CONFIG_CAN_VXCAN) += vxcan.o
obj-$(CONFIG_CAN_SLCAN) += slcan.o

-obj-$(CONFIG_CAN_DEV) += can-dev.o
-can-dev-y += dev.o
-can-dev-y += rx-offload.o
-
-can-dev-$(CONFIG_CAN_LEDS) += led.o
-
+obj-y += dev/
obj-y += rcar/
obj-y += spi/
obj-y += usb/
diff --git a/drivers/net/can/dev/Makefile b/drivers/net/can/dev/Makefile
new file mode 100644
index 000000000000..cba92e6bcf6f
--- /dev/null
+++ b/drivers/net/can/dev/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_CAN_DEV) += can-dev.o
+can-dev-y += dev.o
+can-dev-y += rx-offload.o
+
+can-dev-$(CONFIG_CAN_LEDS) += led.o
diff --git a/drivers/net/can/dev.c b/drivers/net/can/dev/dev.c
similarity index 100%
rename from drivers/net/can/dev.c
rename to drivers/net/can/dev/dev.c
diff --git a/drivers/net/can/rx-offload.c b/drivers/net/can/dev/rx-offload.c
similarity index 100%
rename from drivers/net/can/rx-offload.c
rename to drivers/net/can/dev/rx-offload.c
--
2.30.1



2021-04-05 16:05:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 052/126] net: wan/lmc: unregister device when no matching device is found

From: Tong Zhang <[email protected]>

[ Upstream commit 62e69bc419772638369eff8ff81340bde8aceb61 ]

lmc sets the sc->lmc_media pointer when there is a matching device.
However, when no matching device is found, this pointer is NULL and the
subsequent dereference results in a null-ptr-deref.

To fix this issue, unregister the hdlc device and return an error.

[ 4.569359] BUG: KASAN: null-ptr-deref in lmc_init_one.cold+0x2b6/0x55d [lmc]
[ 4.569748] Read of size 8 at addr 0000000000000008 by task modprobe/95
[ 4.570102]
[ 4.570187] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7 #94
[ 4.570527] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-preb4
[ 4.571125] Call Trace:
[ 4.571261] dump_stack+0x7d/0xa3
[ 4.571445] kasan_report.cold+0x10c/0x10e
[ 4.571667] ? lmc_init_one.cold+0x2b6/0x55d [lmc]
[ 4.571932] lmc_init_one.cold+0x2b6/0x55d [lmc]
[ 4.572186] ? lmc_mii_readreg+0xa0/0xa0 [lmc]
[ 4.572432] local_pci_probe+0x6f/0xb0
[ 4.572639] pci_device_probe+0x171/0x240
[ 4.572857] ? pci_device_remove+0xe0/0xe0
[ 4.573080] ? kernfs_create_link+0xb6/0x110
[ 4.573315] ? sysfs_do_create_link_sd.isra.0+0x76/0xe0
[ 4.573598] really_probe+0x161/0x420
[ 4.573799] driver_probe_device+0x6d/0xd0
[ 4.574022] device_driver_attach+0x82/0x90
[ 4.574249] ? device_driver_attach+0x90/0x90
[ 4.574485] __driver_attach+0x60/0x100
[ 4.574694] ? device_driver_attach+0x90/0x90
[ 4.574931] bus_for_each_dev+0xe1/0x140
[ 4.575146] ? subsys_dev_iter_exit+0x10/0x10
[ 4.575387] ? klist_node_init+0x61/0x80
[ 4.575602] bus_add_driver+0x254/0x2a0
[ 4.575812] driver_register+0xd3/0x150
[ 4.576021] ? 0xffffffffc0018000
[ 4.576202] do_one_initcall+0x84/0x250
[ 4.576411] ? trace_event_raw_event_initcall_finish+0x150/0x150
[ 4.576733] ? unpoison_range+0xf/0x30
[ 4.576938] ? ____kasan_kmalloc.constprop.0+0x84/0xa0
[ 4.577219] ? unpoison_range+0xf/0x30
[ 4.577423] ? unpoison_range+0xf/0x30
[ 4.577628] do_init_module+0xf8/0x350
[ 4.577833] load_module+0x3fe6/0x4340
[ 4.578038] ? vm_unmap_ram+0x1d0/0x1d0
[ 4.578247] ? ____kasan_kmalloc.constprop.0+0x84/0xa0
[ 4.578526] ? module_frob_arch_sections+0x20/0x20
[ 4.578787] ? __do_sys_finit_module+0x108/0x170
[ 4.579037] __do_sys_finit_module+0x108/0x170
[ 4.579278] ? __ia32_sys_init_module+0x40/0x40
[ 4.579523] ? file_open_root+0x200/0x200
[ 4.579742] ? do_sys_open+0x85/0xe0
[ 4.579938] ? filp_open+0x50/0x50
[ 4.580125] ? exit_to_user_mode_prepare+0xfc/0x130
[ 4.580390] do_syscall_64+0x33/0x40
[ 4.580586] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 4.580859] RIP: 0033:0x7f1a724c3cf7
[ 4.581054] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 48 891
[ 4.582043] RSP: 002b:00007fff44941c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 4.582447] RAX: ffffffffffffffda RBX: 00000000012ada70 RCX: 00007f1a724c3cf7
[ 4.582827] RDX: 0000000000000000 RSI: 00000000012ac9e0 RDI: 0000000000000003
[ 4.583207] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
[ 4.583587] R10: 00007f1a72527300 R11: 0000000000000246 R12: 00000000012ac9e0
[ 4.583968] R13: 0000000000000000 R14: 00000000012acc90 R15: 0000000000000001
[ 4.584349] ==================================================================

Signed-off-by: Tong Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wan/lmc/lmc_main.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/wan/lmc/lmc_main.c b/drivers/net/wan/lmc/lmc_main.c
index 36600b0a0ab0..1ee4c8a90632 100644
--- a/drivers/net/wan/lmc/lmc_main.c
+++ b/drivers/net/wan/lmc/lmc_main.c
@@ -901,6 +901,8 @@ static int lmc_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
break;
default:
printk(KERN_WARNING "%s: LMC UNKNOWN CARD!\n", dev->name);
+ unregister_hdlc_device(dev);
+ return -EIO;
break;
}

--
2.30.1



2021-04-05 16:05:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 084/126] KVM: x86/mmu: Rename goal_gfn to next_last_level_gfn

From: Ben Gardon <[email protected]>

[ Upstream commit 74953d3530280dc53256054e1906f58d07bfba44 ]

The goal_gfn field in tdp_iter can be misleading as it implies that it
is the iterator's final goal. It is really a target for the lowest GFN
mapped by the leaf-level SPTE that the iterator will traverse towards.
Change the field's name to be more precise.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_iter.c | 20 ++++++++++----------
arch/x86/kvm/mmu/tdp_iter.h | 4 ++--
2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index 87b7e16911db..9917c55b7d24 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -22,21 +22,21 @@ static gfn_t round_gfn_for_level(gfn_t gfn, int level)

/*
* Sets a TDP iterator to walk a pre-order traversal of the paging structure
- * rooted at root_pt, starting with the walk to translate goal_gfn.
+ * rooted at root_pt, starting with the walk to translate next_last_level_gfn.
*/
void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
- int min_level, gfn_t goal_gfn)
+ int min_level, gfn_t next_last_level_gfn)
{
WARN_ON(root_level < 1);
WARN_ON(root_level > PT64_ROOT_MAX_LEVEL);

- iter->goal_gfn = goal_gfn;
+ iter->next_last_level_gfn = next_last_level_gfn;
iter->root_level = root_level;
iter->min_level = min_level;
iter->level = root_level;
iter->pt_path[iter->level - 1] = root_pt;

- iter->gfn = round_gfn_for_level(iter->goal_gfn, iter->level);
+ iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level);
tdp_iter_refresh_sptep(iter);

iter->valid = true;
@@ -82,7 +82,7 @@ static bool try_step_down(struct tdp_iter *iter)

iter->level--;
iter->pt_path[iter->level - 1] = child_pt;
- iter->gfn = round_gfn_for_level(iter->goal_gfn, iter->level);
+ iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level);
tdp_iter_refresh_sptep(iter);

return true;
@@ -106,7 +106,7 @@ static bool try_step_side(struct tdp_iter *iter)
return false;

iter->gfn += KVM_PAGES_PER_HPAGE(iter->level);
- iter->goal_gfn = iter->gfn;
+ iter->next_last_level_gfn = iter->gfn;
iter->sptep++;
iter->old_spte = READ_ONCE(*iter->sptep);

@@ -166,13 +166,13 @@ void tdp_iter_next(struct tdp_iter *iter)
*/
void tdp_iter_refresh_walk(struct tdp_iter *iter)
{
- gfn_t goal_gfn = iter->goal_gfn;
+ gfn_t next_last_level_gfn = iter->next_last_level_gfn;

- if (iter->gfn > goal_gfn)
- goal_gfn = iter->gfn;
+ if (iter->gfn > next_last_level_gfn)
+ next_last_level_gfn = iter->gfn;

tdp_iter_start(iter, iter->pt_path[iter->root_level - 1],
- iter->root_level, iter->min_level, goal_gfn);
+ iter->root_level, iter->min_level, next_last_level_gfn);
}

u64 *tdp_iter_root_pt(struct tdp_iter *iter)
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index 47170d0dc98e..b2dd269c631f 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -15,7 +15,7 @@ struct tdp_iter {
* The iterator will traverse the paging structure towards the mapping
* for this GFN.
*/
- gfn_t goal_gfn;
+ gfn_t next_last_level_gfn;
/* Pointers to the page tables traversed to reach the current SPTE */
u64 *pt_path[PT64_ROOT_MAX_LEVEL];
/* A pointer to the current SPTE */
@@ -52,7 +52,7 @@ struct tdp_iter {
u64 *spte_to_child_pt(u64 pte, int level);

void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
- int min_level, gfn_t goal_gfn);
+ int min_level, gfn_t next_last_level_gfn);
void tdp_iter_next(struct tdp_iter *iter);
void tdp_iter_refresh_walk(struct tdp_iter *iter);
u64 *tdp_iter_root_pt(struct tdp_iter *iter);
--
2.30.1



2021-04-05 16:06:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 064/126] xtensa: move coprocessor_flush to the .text section

From: Max Filippov <[email protected]>

commit ab5eb336411f18fd449a1fb37d36a55ec422603f upstream.

coprocessor_flush is not a part of the fast exception handlers, but it
uses parts of the fast coprocessor handling code, which is why it's in
the same source file. It uses the call0 opcode to invoke those parts, so
there are no limitations on their relative location, but the rest of the
code calls coprocessor_flush with call8, and that doesn't work when the
vectors are placed in a different gigabyte-aligned area than the rest of
the kernel.

Move coprocessor_flush from the .exception.text section to the .text so
that it's reachable from the rest of the kernel with call8.

Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/xtensa/kernel/coprocessor.S | 64 ++++++++++++++++++++-------------------
1 file changed, 33 insertions(+), 31 deletions(-)

--- a/arch/xtensa/kernel/coprocessor.S
+++ b/arch/xtensa/kernel/coprocessor.S
@@ -100,37 +100,6 @@
LOAD_CP_REGS_TAB(7)

/*
- * coprocessor_flush(struct thread_info*, index)
- * a2 a3
- *
- * Save coprocessor registers for coprocessor 'index'.
- * The register values are saved to or loaded from the coprocessor area
- * inside the task_info structure.
- *
- * Note that this function doesn't update the coprocessor_owner information!
- *
- */
-
-ENTRY(coprocessor_flush)
-
- /* reserve 4 bytes on stack to save a0 */
- abi_entry(4)
-
- s32i a0, a1, 0
- movi a0, .Lsave_cp_regs_jump_table
- addx8 a3, a3, a0
- l32i a4, a3, 4
- l32i a3, a3, 0
- add a2, a2, a4
- beqz a3, 1f
- callx0 a3
-1: l32i a0, a1, 0
-
- abi_ret(4)
-
-ENDPROC(coprocessor_flush)
-
-/*
* Entry condition:
*
* a0: trashed, original value saved on stack (PT_AREG0)
@@ -245,6 +214,39 @@ ENTRY(fast_coprocessor)

ENDPROC(fast_coprocessor)

+ .text
+
+/*
+ * coprocessor_flush(struct thread_info*, index)
+ * a2 a3
+ *
+ * Save coprocessor registers for coprocessor 'index'.
+ * The register values are saved to or loaded from the coprocessor area
+ * inside the task_info structure.
+ *
+ * Note that this function doesn't update the coprocessor_owner information!
+ *
+ */
+
+ENTRY(coprocessor_flush)
+
+ /* reserve 4 bytes on stack to save a0 */
+ abi_entry(4)
+
+ s32i a0, a1, 0
+ movi a0, .Lsave_cp_regs_jump_table
+ addx8 a3, a3, a0
+ l32i a4, a3, 4
+ l32i a3, a3, 0
+ add a2, a2, a4
+ beqz a3, 1f
+ callx0 a3
+1: l32i a0, a1, 0
+
+ abi_ret(4)
+
+ENDPROC(coprocessor_flush)
+
.data

ENTRY(coprocessor_owner)


2021-04-05 16:06:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 065/126] KVM: SVM: load control fields from VMCB12 before checking them

From: Paolo Bonzini <[email protected]>

commit a58d9166a756a0f4a6618e4f593232593d6df134 upstream.

Avoid races between check and use of the nested VMCB controls. This,
for example, ensures that the VMRUN intercept is always reflected to the
nested hypervisor, instead of being processed by the host. Without this
patch, it is possible to end up with svm->nested.hsave pointing to
the MSR permission bitmap for nested guests.

This bug is CVE-2021-29657.
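In outline, the fixed ordering in nested_svm_vmrun() is (a sketch
mirroring the hunks below; all later checks and uses operate only on
the host-owned cached copy, closing the check-vs-use window):

	/* copy the guest-writable VMCB12 controls into host memory once */
	load_nested_vmcb_control(svm, &vmcb12->control);

	/* then validate the cached copy, never the guest page again */
	if (!nested_vmcb_check_save(svm, vmcb12) ||
	    !nested_vmcb_check_controls(&svm->nested.ctl)) {
		vmcb12->control.exit_code = SVM_EXIT_ERR;
		/* ... reflect the error back to the nested hypervisor */
	}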

Reported-by: Felix Wilhelm <[email protected]>
Cc: [email protected]
Fixes: 2fcf4876ada ("KVM: nSVM: implement on demand allocation of the nested state")
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/svm/nested.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -246,7 +246,7 @@ static bool nested_vmcb_check_controls(s
return true;
}

-static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb12)
+static bool nested_vmcb_check_save(struct vcpu_svm *svm, struct vmcb *vmcb12)
{
struct kvm_vcpu *vcpu = &svm->vcpu;
bool vmcb12_lma;
@@ -271,7 +271,7 @@ static bool nested_vmcb_checks(struct vc
if (kvm_valid_cr4(&svm->vcpu, vmcb12->save.cr4))
return false;

- return nested_vmcb_check_controls(&vmcb12->control);
+ return true;
}

static void load_nested_vmcb_control(struct vcpu_svm *svm,
@@ -454,7 +454,6 @@ int enter_svm_guest_mode(struct vcpu_svm
int ret;

svm->nested.vmcb12_gpa = vmcb12_gpa;
- load_nested_vmcb_control(svm, &vmcb12->control);
nested_prepare_vmcb_save(svm, vmcb12);
nested_prepare_vmcb_control(svm);

@@ -501,7 +500,10 @@ int nested_svm_vmrun(struct vcpu_svm *sv
if (WARN_ON_ONCE(!svm->nested.initialized))
return -EINVAL;

- if (!nested_vmcb_checks(svm, vmcb12)) {
+ load_nested_vmcb_control(svm, &vmcb12->control);
+
+ if (!nested_vmcb_check_save(svm, vmcb12) ||
+ !nested_vmcb_check_controls(&svm->nested.ctl)) {
vmcb12->control.exit_code = SVM_EXIT_ERR;
vmcb12->control.exit_code_hi = 0;
vmcb12->control.exit_info_1 = 0;


2021-04-05 16:06:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 067/126] PM: runtime: Fix race getting/putting suppliers at probe

From: Adrian Hunter <[email protected]>

commit 9dfacc54a8661bc8be6e08cffee59596ec59f263 upstream.

pm_runtime_put_suppliers() must not decrement rpm_active unless the
consumer is suspended, because otherwise it could suspend the suppliers
of a still-active consumer.

That can happen as follows:

static int driver_probe_device(struct device_driver *drv, struct device *dev)
{
int ret = 0;

if (!device_is_registered(dev))
return -ENODEV;

dev->can_match = true;
pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
drv->bus->name, __func__, dev_name(dev), drv->name);

pm_runtime_get_suppliers(dev);
if (dev->parent)
pm_runtime_get_sync(dev->parent);

At this point, dev can runtime suspend so rpm_put_suppliers() can run,
rpm_active becomes 1 (the lowest value).

pm_runtime_barrier(dev);
if (initcall_debug)
ret = really_probe_debug(dev, drv);
else
ret = really_probe(dev, drv);

The probe callback can have runtime-resumed dev and then done a runtime
put, so dev is awaiting autosuspend, but rpm_active is 2.

pm_request_idle(dev);

if (dev->parent)
pm_runtime_put(dev->parent);

pm_runtime_put_suppliers(dev);

Now pm_runtime_put_suppliers() will put the supplier
i.e. rpm_active 2 -> 1, but consumer can still be active.

return ret;
}

Fix this by checking the runtime status. For any status other than
RPM_SUSPENDED, rpm_active can be considered to be "owned" by
rpm_[get/put]_suppliers(), and pm_runtime_put_suppliers() need do nothing.
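The fix (the hunk below) evaluates both conditions together under
dev->power.lock; refcount_dec_not_one() is what enforces the floor. Its
contract, as a brief note (paraphrased from the refcount API, not from
this patch):

	/* refcount_dec_not_one(r):
	 *   decrements *r and returns true, unless *r is exactly 1,
	 *   in which case it leaves *r untouched and returns false.
	 * So rpm_active can never be driven below its floor of 1, and the
	 * supplier is only released when the consumer is really suspended
	 * AND a surplus reference exists. */
	put = pm_runtime_status_suspended(dev) &&
	      refcount_dec_not_one(&link->rpm_active);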

Reported-by: Asutosh Das <[email protected]>
Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
Signed-off-by: Adrian Hunter <[email protected]>
Cc: 5.1+ <[email protected]> # 5.1+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/base/power/runtime.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1704,6 +1704,8 @@ void pm_runtime_get_suppliers(struct dev
void pm_runtime_put_suppliers(struct device *dev)
{
struct device_link *link;
+ unsigned long flags;
+ bool put;
int idx;

idx = device_links_read_lock();
@@ -1712,7 +1714,11 @@ void pm_runtime_put_suppliers(struct dev
device_links_read_lock_held())
if (link->supplier_preactivated) {
link->supplier_preactivated = false;
- if (refcount_dec_not_one(&link->rpm_active))
+ spin_lock_irqsave(&dev->power.lock, flags);
+ put = pm_runtime_status_suspended(dev) &&
+ refcount_dec_not_one(&link->rpm_active);
+ spin_unlock_irqrestore(&dev->power.lock, flags);
+ if (put)
pm_runtime_put(link->supplier);
}



2021-04-05 16:06:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 068/126] PM: runtime: Fix ordering in pm_runtime_get_suppliers()

From: Adrian Hunter <[email protected]>

commit c0c33442f7203704aef345647e14c2fb86071001 upstream.

rpm_active indicates how many times the supplier usage_count has been
incremented. Consequently it must be updated after pm_runtime_get_sync() of
the supplier, not before.

Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
Signed-off-by: Adrian Hunter <[email protected]>
Cc: 5.1+ <[email protected]> # 5.1+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/base/power/runtime.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1690,8 +1690,8 @@ void pm_runtime_get_suppliers(struct dev
device_links_read_lock_held())
if (link->flags & DL_FLAG_PM_RUNTIME) {
link->supplier_preactivated = true;
- refcount_inc(&link->rpm_active);
pm_runtime_get_sync(link->supplier);
+ refcount_inc(&link->rpm_active);
}

device_links_read_unlock(idx);


2021-04-05 16:06:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 070/126] s390/vdso: copy tod_steering_delta value to vdso_data page

From: Heiko Carstens <[email protected]>

commit 72bbc226ed2ef0a46c165a482861fff00dd6d4e1 upstream.

When converting the vdso assembler code to C it was forgotten to
actually copy the tod_steering_delta value to the vdso_data page, which
in turn means that tod clock steering will not work correctly.

Fix this by simply copying the value whenever it is updated.

Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO")
Cc: <[email protected]> # 5.10
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/s390/kernel/time.c | 1 +
1 file changed, 1 insertion(+)

--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -398,6 +398,7 @@ static void clock_sync_global(unsigned l
tod_steering_delta);
tod_steering_end = now + (abs(tod_steering_delta) << 15);
vdso_data->arch_data.tod_steering_end = tod_steering_end;
+ vdso_data->arch_data.tod_steering_delta = tod_steering_delta;

/* Update LPAR offset. */
if (ptff_query(PTFF_QTO) && ptff(&qto, sizeof(qto), PTFF_QTO) == 0)


2021-04-05 16:06:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 038/126] flow_dissector: fix TTL and TOS dissection on IPv4 fragments

From: Davide Caratti <[email protected]>

[ Upstream commit d2126838050ccd1dadf310ffb78b2204f3b032b9 ]

The following command:

# tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
$tcflags dst_ip 192.0.2.2 ip_ttl 63 action drop

doesn't drop all IPv4 packets that match the configured TTL / destination
address. In particular, if "fragment offset" or "more fragments" have a
non-zero value in the IPv4 header, the setting of FLOW_DISSECTOR_KEY_IP is
simply ignored. Fix this by dissecting IPv4 TTL and TOS before the
fragment info; while at it, add a selftest for tc flower's match on
'ip_ttl' that verifies the correct behavior.
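The shape of the fix, condensed (a sketch of the ETH_P_IP case in the
dissector; the early exit is simplified, the real code signals it
through its dissector return-code state machine):

	case htons(ETH_P_IP):
		/* capture TTL/TOS first, before any fragment handling */
		__skb_flow_dissect_ipv4(skb, flow_dissector,
					target_container, data, iph);

		if (ip_is_fragment(iph)) {
			key_control->flags |= FLOW_DIS_IS_FRAGMENT;
			/* non-first fragments end dissection early, so
			 * anything dissected after this point would never
			 * be filled in for them */
			if (iph->frag_off & htons(IP_OFFSET))
				return;	/* simplified early exit */
		}
		break;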

Fixes: 518d8a2e9bad ("net/flow_dissector: add support for dissection of misc ip header fields")
Reported-by: Shuang Li <[email protected]>
Signed-off-by: Davide Caratti <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/core/flow_dissector.c | 6 +--
.../selftests/net/forwarding/tc_flower.sh | 38 ++++++++++++++++++-
2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index c79be25b2e0c..d48b37b15b27 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1050,6 +1050,9 @@ proto_again:
key_control->addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
}

+ __skb_flow_dissect_ipv4(skb, flow_dissector,
+ target_container, data, iph);
+
if (ip_is_fragment(iph)) {
key_control->flags |= FLOW_DIS_IS_FRAGMENT;

@@ -1066,9 +1069,6 @@ proto_again:
}
}

- __skb_flow_dissect_ipv4(skb, flow_dissector,
- target_container, data, iph);
-
break;
}
case htons(ETH_P_IPV6): {
diff --git a/tools/testing/selftests/net/forwarding/tc_flower.sh b/tools/testing/selftests/net/forwarding/tc_flower.sh
index 058c746ee300..b11d8e6b5bc1 100755
--- a/tools/testing/selftests/net/forwarding/tc_flower.sh
+++ b/tools/testing/selftests/net/forwarding/tc_flower.sh
@@ -3,7 +3,7 @@

ALL_TESTS="match_dst_mac_test match_src_mac_test match_dst_ip_test \
match_src_ip_test match_ip_flags_test match_pcp_test match_vlan_test \
- match_ip_tos_test match_indev_test"
+ match_ip_tos_test match_indev_test match_ip_ttl_test"
NUM_NETIFS=2
source tc_common.sh
source lib.sh
@@ -310,6 +310,42 @@ match_ip_tos_test()
log_test "ip_tos match ($tcflags)"
}

+match_ip_ttl_test()
+{
+ RET=0
+
+ tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
+ $tcflags dst_ip 192.0.2.2 ip_ttl 63 action drop
+ tc filter add dev $h2 ingress protocol ip pref 2 handle 102 flower \
+ $tcflags dst_ip 192.0.2.2 action drop
+
+ $MZ $h1 -c 1 -p 64 -a $h1mac -b $h2mac -A 192.0.2.1 -B 192.0.2.2 \
+ -t ip "ttl=63" -q
+
+ $MZ $h1 -c 1 -p 64 -a $h1mac -b $h2mac -A 192.0.2.1 -B 192.0.2.2 \
+ -t ip "ttl=63,mf,frag=256" -q
+
+ tc_check_packets "dev $h2 ingress" 102 1
+ check_fail $? "Matched on the wrong filter (no check on ttl)"
+
+ tc_check_packets "dev $h2 ingress" 101 2
+ check_err $? "Did not match on correct filter (ttl=63)"
+
+ $MZ $h1 -c 1 -p 64 -a $h1mac -b $h2mac -A 192.0.2.1 -B 192.0.2.2 \
+ -t ip "ttl=255" -q
+
+ tc_check_packets "dev $h2 ingress" 101 3
+ check_fail $? "Matched on a wrong filter (ttl=63)"
+
+ tc_check_packets "dev $h2 ingress" 102 1
+ check_err $? "Did not match on correct filter (no check on ttl)"
+
+ tc filter del dev $h2 ingress protocol ip pref 2 handle 102 flower
+ tc filter del dev $h2 ingress protocol ip pref 1 handle 101 flower
+
+ log_test "ip_ttl match ($tcflags)"
+}
+
match_indev_test()
{
RET=0
--
2.30.1



2021-04-05 16:06:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 087/126] kvm: x86/mmu: Add existing trace points to TDP MMU

From: Ben Gardon <[email protected]>

[ Upstream commit 33dd3574f5fef57c2c6caccf98925d63aa2a8d09 ]

The TDP MMU was initially implemented without some of the usual
tracepoints found in mmu.c. Correct this discrepancy by adding the
missing trace points to the TDP MMU.

Tested: ran the demand paging selftest on an Intel Skylake machine with
all the trace points used by the TDP MMU enabled and observed
them firing with expected values.

This patch can be viewed in Gerrit at:
https://linux-review.googlesource.com/c/virt/kvm/kvm/+/3812

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 0d17457f1c84..61be95c6db20 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -7,6 +7,8 @@
#include "tdp_mmu.h"
#include "spte.h"

+#include <trace/events/kvm.h>
+
#ifdef CONFIG_X86_64
static bool __read_mostly tdp_mmu_enabled = false;
module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);
@@ -149,6 +151,8 @@ static struct kvm_mmu_page *alloc_tdp_mmu_page(struct kvm_vcpu *vcpu, gfn_t gfn,
sp->gfn = gfn;
sp->tdp_mmu_page = true;

+ trace_kvm_mmu_get_page(sp, true);
+
return sp;
}

@@ -319,6 +323,8 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
pt = spte_to_child_pt(old_spte, level);
sp = sptep_to_sp(pt);

+ trace_kvm_mmu_prepare_zap_page(sp);
+
list_del(&sp->link);

if (sp->lpage_disallowed)
@@ -530,11 +536,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
if (unlikely(is_noslot_pfn(pfn))) {
new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
trace_mark_mmio_spte(iter->sptep, iter->gfn, new_spte);
- } else
+ } else {
make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
pfn, iter->old_spte, prefault, true,
map_writable, !shadow_accessed_mask,
&new_spte);
+ trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep);
+ }

if (new_spte == iter->old_spte)
ret = RET_PF_SPURIOUS;
@@ -740,6 +748,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,

tdp_mmu_set_spte_no_acc_track(kvm, &iter, new_spte);
young = 1;
+
+ trace_kvm_age_page(iter.gfn, iter.level, slot, young);
}

return young;
--
2.30.1



2021-04-05 16:06:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 089/126] KVM: x86/mmu: Factor out handling of removed page tables

From: Ben Gardon <[email protected]>

[ Upstream commit a066e61f13cf4b17d043ad8bea0cdde2b1e5ee49 ]

Factor out the code to handle a disconnected subtree of the TDP paging
structure from the code to handle the change to an individual SPTE.
Future commits will build on this to allow asynchronous page freeing.

No functional change intended.

Reviewed-by: Peter Feiner <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>

Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 71 ++++++++++++++++++++++----------------
1 file changed, 42 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index ad9f8f187045..f52a22bc0fe8 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -234,6 +234,45 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
}
}

+/**
+ * handle_removed_tdp_mmu_page - handle a pt removed from the TDP structure
+ *
+ * @kvm: kvm instance
+ * @pt: the page removed from the paging structure
+ *
+ * Given a page table that has been removed from the TDP paging structure,
+ * iterates through the page table to clear SPTEs and free child page tables.
+ */
+static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
+{
+ struct kvm_mmu_page *sp = sptep_to_sp(pt);
+ int level = sp->role.level;
+ gfn_t gfn = sp->gfn;
+ u64 old_child_spte;
+ int i;
+
+ trace_kvm_mmu_prepare_zap_page(sp);
+
+ list_del(&sp->link);
+
+ if (sp->lpage_disallowed)
+ unaccount_huge_nx_page(kvm, sp);
+
+ for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
+ old_child_spte = READ_ONCE(*(pt + i));
+ WRITE_ONCE(*(pt + i), 0);
+ handle_changed_spte(kvm, kvm_mmu_page_as_id(sp),
+ gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)),
+ old_child_spte, 0, level - 1);
+ }
+
+ kvm_flush_remote_tlbs_with_address(kvm, gfn,
+ KVM_PAGES_PER_HPAGE(level));
+
+ free_page((unsigned long)pt);
+ kmem_cache_free(mmu_page_header_cache, sp);
+}
+
/**
* handle_changed_spte - handle bookkeeping associated with an SPTE change
* @kvm: kvm instance
@@ -254,10 +293,6 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
bool was_leaf = was_present && is_last_spte(old_spte, level);
bool is_leaf = is_present && is_last_spte(new_spte, level);
bool pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
- u64 *pt;
- struct kvm_mmu_page *sp;
- u64 old_child_spte;
- int i;

WARN_ON(level > PT64_ROOT_MAX_LEVEL);
WARN_ON(level < PG_LEVEL_4K);
@@ -319,31 +354,9 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
* Recursively handle child PTs if the change removed a subtree from
* the paging structure.
*/
- if (was_present && !was_leaf && (pfn_changed || !is_present)) {
- pt = spte_to_child_pt(old_spte, level);
- sp = sptep_to_sp(pt);
-
- trace_kvm_mmu_prepare_zap_page(sp);
-
- list_del(&sp->link);
-
- if (sp->lpage_disallowed)
- unaccount_huge_nx_page(kvm, sp);
-
- for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
- old_child_spte = READ_ONCE(*(pt + i));
- WRITE_ONCE(*(pt + i), 0);
- handle_changed_spte(kvm, as_id,
- gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)),
- old_child_spte, 0, level - 1);
- }
-
- kvm_flush_remote_tlbs_with_address(kvm, gfn,
- KVM_PAGES_PER_HPAGE(level));
-
- free_page((unsigned long)pt);
- kmem_cache_free(mmu_page_header_cache, sp);
- }
+ if (was_present && !was_leaf && (pfn_changed || !is_present))
+ handle_removed_tdp_mmu_page(kvm,
+ spte_to_child_pt(old_spte, level));
}

static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
--
2.30.1



2021-04-05 16:06:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 091/126] KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap

From: Sean Christopherson <[email protected]>

[ Upstream commit a835429cda91621fca915d80672a157b47738afb ]

When flushing a range of GFNs across multiple roots, ensure any pending
flush from a previous root is honored before yielding while walking the
tables of the current root.

Note, kvm_tdp_mmu_zap_gfn_range() now intentionally overwrites its local
"flush" with the result to avoid redundant flushes. zap_gfn_range()
preserves and returns the incoming "flush", unless of course the flush was
performed prior to yielding and no new flush was triggered.
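Distilled, the resulting accumulation pattern across roots is (this
mirrors kvm_tdp_mmu_zap_gfn_range() in the hunks below):

	bool flush = false;

	/* thread the pending-flush state through every root: if
	 * zap_gfn_range() yields, it performs the pending flush first and
	 * returns false; otherwise it returns the accumulated state */
	for_each_tdp_mmu_root_yield_safe(kvm, root)
		flush = zap_gfn_range(kvm, root, start, end, true, flush);

	return flush;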

Fixes: 1af4a96025b3 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed")
Cc: [email protected]
Reviewed-by: Ben Gardon <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index a54a9ed979d1..34ef3e1a0f84 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -111,7 +111,7 @@ bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
}

static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
- gfn_t start, gfn_t end, bool can_yield);
+ gfn_t start, gfn_t end, bool can_yield, bool flush);

void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)
{
@@ -124,7 +124,7 @@ void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root)

list_del(&root->link);

- zap_gfn_range(kvm, root, 0, max_gfn, false);
+ zap_gfn_range(kvm, root, 0, max_gfn, false, false);

free_page((unsigned long)root->spt);
kmem_cache_free(mmu_page_header_cache, root);
@@ -504,20 +504,21 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm,
* scheduler needs the CPU or there is contention on the MMU lock. If this
* function cannot yield, it will not release the MMU lock or reschedule and
* the caller must ensure it does not supply too large a GFN range, or the
- * operation can cause a soft lockup.
+ * operation can cause a soft lockup. Note, in some use cases a flush may be
+ * required by prior actions. Ensure the pending flush is performed prior to
+ * yielding.
*/
static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
- gfn_t start, gfn_t end, bool can_yield)
+ gfn_t start, gfn_t end, bool can_yield, bool flush)
{
struct tdp_iter iter;
- bool flush_needed = false;

rcu_read_lock();

tdp_root_for_each_pte(iter, root, start, end) {
if (can_yield &&
- tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) {
- flush_needed = false;
+ tdp_mmu_iter_cond_resched(kvm, &iter, flush)) {
+ flush = false;
continue;
}

@@ -535,11 +536,11 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
continue;

tdp_mmu_set_spte(kvm, &iter, 0);
- flush_needed = true;
+ flush = true;
}

rcu_read_unlock();
- return flush_needed;
+ return flush;
}

/*
@@ -554,7 +555,7 @@ bool kvm_tdp_mmu_zap_gfn_range(struct kvm *kvm, gfn_t start, gfn_t end)
bool flush = false;

for_each_tdp_mmu_root_yield_safe(kvm, root)
- flush |= zap_gfn_range(kvm, root, start, end, true);
+ flush = zap_gfn_range(kvm, root, start, end, true, flush);

return flush;
}
@@ -757,7 +758,7 @@ static int zap_gfn_range_hva_wrapper(struct kvm *kvm,
struct kvm_mmu_page *root, gfn_t start,
gfn_t end, unsigned long unused)
{
- return zap_gfn_range(kvm, root, start, end, false);
+ return zap_gfn_range(kvm, root, start, end, false, false);
}

int kvm_tdp_mmu_zap_hva_range(struct kvm *kvm, unsigned long start,
--
2.30.1



2021-04-05 16:06:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 074/126] drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()

From: Nirmoy Das <[email protected]>

commit 5e61b84f9d3ddfba73091f9fbc940caae1c9eb22 upstream.

The offset calculation wasn't correct, as start addresses are in pfns
(page frame numbers), not in bytes.
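As a worked example (assuming 4 KiB CPU pages, i.e. PAGE_SHIFT == 12):
if the mapping kept after the cleared range starts 3 pages into the
original one, the buffer-object offset must advance by 3 << 12 = 0x3000
bytes, not by 3:

	/* after->start and tmp->start are page frame numbers */
	u64 delta_pfn   = after->start - tmp->start;	/* e.g. 3      */
	u64 delta_bytes = delta_pfn << PAGE_SHIFT;	/* 0x3000      */

	after->offset += delta_bytes;	/* the old code added delta_pfn */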

CC: [email protected]
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2435,7 +2435,7 @@ int amdgpu_vm_bo_clear_mappings(struct a
after->start = eaddr + 1;
after->last = tmp->last;
after->offset = tmp->offset;
- after->offset += after->start - tmp->start;
+ after->offset += (after->start - tmp->start) << PAGE_SHIFT;
after->flags = tmp->flags;
after->bo_va = tmp->bo_va;
list_add(&after->list, &tmp->bo_va->invalids);


2021-04-05 16:06:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 095/126] KVM: x86/mmu: Factor out functions to add/remove TDP MMU pages

From: Ben Gardon <[email protected]>

[ Upstream commit a9442f594147f95307f691cfba0c31e25dc79b9d ]

Move the work of adding and removing TDP MMU pages to/from "secondary"
data structures to helper functions. These functions will be built on in
future commits to enable MMU operations to proceed (mostly) in parallel.

No functional change expected.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 47 +++++++++++++++++++++++++++++++-------
1 file changed, 39 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 136311be5890..14d69c01c710 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -262,6 +262,39 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
}
}

+/**
+ * tdp_mmu_link_page - Add a new page to the list of pages used by the TDP MMU
+ *
+ * @kvm: kvm instance
+ * @sp: the new page
+ * @account_nx: This page replaces a NX large page and should be marked for
+ * eventual reclaim.
+ */
+static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ bool account_nx)
+{
+ lockdep_assert_held_write(&kvm->mmu_lock);
+
+ list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
+ if (account_nx)
+ account_huge_nx_page(kvm, sp);
+}
+
+/**
+ * tdp_mmu_unlink_page - Remove page from the list of pages used by the TDP MMU
+ *
+ * @kvm: kvm instance
+ * @sp: the page to be removed
+ */
+static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+ lockdep_assert_held_write(&kvm->mmu_lock);
+
+ list_del(&sp->link);
+ if (sp->lpage_disallowed)
+ unaccount_huge_nx_page(kvm, sp);
+}
+
/**
* handle_removed_tdp_mmu_page - handle a pt removed from the TDP structure
*
@@ -281,10 +314,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)

trace_kvm_mmu_prepare_zap_page(sp);

- list_del(&sp->link);
-
- if (sp->lpage_disallowed)
- unaccount_huge_nx_page(kvm, sp);
+ tdp_mmu_unlink_page(kvm, sp);

for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
old_child_spte = READ_ONCE(*(pt + i));
@@ -704,15 +734,16 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,

if (!is_shadow_present_pte(iter.old_spte)) {
sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
- list_add(&sp->link, &vcpu->kvm->arch.tdp_mmu_pages);
child_pt = sp->spt;
+
+ tdp_mmu_link_page(vcpu->kvm, sp,
+ huge_page_disallowed &&
+ req_level >= iter.level);
+
new_spte = make_nonleaf_spte(child_pt,
!shadow_accessed_mask);

trace_kvm_mmu_get_page(sp, true);
- if (huge_page_disallowed && req_level >= iter.level)
- account_huge_nx_page(vcpu->kvm, sp);
-
tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte);
}
}
--
2.30.1



2021-04-05 16:06:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 093/126] KVM: x86/mmu: Don't redundantly clear TDP MMU pt memory

From: Ben Gardon <[email protected]>

[ Upstream commit 734e45b329d626d2c14e2bcf8be3d069a33c3316 ]

The KVM MMU caches already guarantee that shadow page table memory will
be zeroed, so there is no reason to re-zero the page in the TDP MMU page
fault handler.

No functional change intended.

Reviewed-by: Peter Feiner <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f88404033e0c..136311be5890 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -706,7 +706,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
list_add(&sp->link, &vcpu->kvm->arch.tdp_mmu_pages);
child_pt = sp->spt;
- clear_page(child_pt);
new_spte = make_nonleaf_spte(child_pt,
!shadow_accessed_mask);

--
2.30.1



2021-04-05 16:06:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 097/126] KVM: x86: compile out TDP MMU on 32-bit systems

From: Paolo Bonzini <[email protected]>

[ Upstream commit 897218ff7cf19290ec2d69652ce673d8ed6fedeb ]

The TDP MMU assumes that it can do atomic accesses to 64-bit PTEs.
Rather than just disabling it, compile it out completely on 32-bit
systems so that it is possible to use, for example, 64-bit xchg.

To limit the number of stubs, wrap all accesses to tdp_mmu_enabled
or tdp_mmu_page with a function. Calls to all other functions in
tdp_mmu.c are eliminated and do not even reach the linker.

Reviewed-by: Sean Christopherson <[email protected]>
Tested-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/Makefile | 3 ++-
arch/x86/kvm/mmu/mmu.c | 36 ++++++++++++++++-----------------
arch/x86/kvm/mmu/mmu_internal.h | 2 ++
arch/x86/kvm/mmu/tdp_mmu.c | 29 +-------------------------
arch/x86/kvm/mmu/tdp_mmu.h | 32 +++++++++++++++++++++++++----
6 files changed, 53 insertions(+), 51 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 47cd8f9b3fe7..af858f495e75 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1001,6 +1001,7 @@ struct kvm_arch {
struct kvm_pmu_event_filter *pmu_event_filter;
struct task_struct *nx_lpage_recovery_thread;

+#ifdef CONFIG_X86_64
/*
* Whether the TDP MMU is enabled for this VM. This contains a
* snapshot of the TDP MMU module parameter from when the VM was
@@ -1027,6 +1028,7 @@ struct kvm_arch {
* the thread holds the MMU lock in write mode.
*/
spinlock_t tdp_mmu_pages_lock;
+#endif /* CONFIG_X86_64 */
};

struct kvm_vm_stat {
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index b804444e16d4..1d1e31917a88 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -16,7 +16,8 @@ kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
hyperv.o debugfs.o mmu/mmu.o mmu/page_track.o \
- mmu/spte.o mmu/tdp_iter.o mmu/tdp_mmu.o
+ mmu/spte.o
+kvm-$(CONFIG_X86_64) += mmu/tdp_iter.o mmu/tdp_mmu.o

kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
vmx/evmcs.o vmx/nested.o vmx/posted_intr.o
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0f45ad05f895..94e6bf004576 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1225,7 +1225,7 @@ static void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
{
struct kvm_rmap_head *rmap_head;

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot,
slot->base_gfn + gfn_offset, mask, true);
while (mask) {
@@ -1254,7 +1254,7 @@ void kvm_mmu_clear_dirty_pt_masked(struct kvm *kvm,
{
struct kvm_rmap_head *rmap_head;

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_clear_dirty_pt_masked(kvm, slot,
slot->base_gfn + gfn_offset, mask, false);
while (mask) {
@@ -1301,7 +1301,7 @@ bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
write_protected |= __rmap_write_protect(kvm, rmap_head, true);
}

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
write_protected |=
kvm_tdp_mmu_write_protect_gfn(kvm, slot, gfn);

@@ -1513,7 +1513,7 @@ int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end,

r = kvm_handle_hva_range(kvm, start, end, 0, kvm_unmap_rmapp);

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
r |= kvm_tdp_mmu_zap_hva_range(kvm, start, end);

return r;
@@ -1525,7 +1525,7 @@ int kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)

r = kvm_handle_hva(kvm, hva, (unsigned long)&pte, kvm_set_pte_rmapp);

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
r |= kvm_tdp_mmu_set_spte_hva(kvm, hva, &pte);

return r;
@@ -1580,7 +1580,7 @@ int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
int young = false;

young = kvm_handle_hva_range(kvm, start, end, 0, kvm_age_rmapp);
- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
young |= kvm_tdp_mmu_age_hva_range(kvm, start, end);

return young;
@@ -1591,7 +1591,7 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
int young = false;

young = kvm_handle_hva(kvm, hva, 0, kvm_test_age_rmapp);
- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
young |= kvm_tdp_mmu_test_age_hva(kvm, hva);

return young;
@@ -3153,7 +3153,7 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
sp = to_shadow_page(*root_hpa & PT64_BASE_ADDR_MASK);

if (kvm_mmu_put_root(kvm, sp)) {
- if (sp->tdp_mmu_page)
+ if (is_tdp_mmu_page(sp))
kvm_tdp_mmu_free_root(kvm, sp);
else if (sp->role.invalid)
kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);
@@ -3247,7 +3247,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
hpa_t root;
unsigned i;

- if (vcpu->kvm->arch.tdp_mmu_enabled) {
+ if (is_tdp_mmu_enabled(vcpu->kvm)) {
root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);

if (!VALID_PAGE(root))
@@ -5434,7 +5434,7 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)

kvm_zap_obsolete_pages(kvm);

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_zap_all(kvm);

spin_unlock(&kvm->mmu_lock);
@@ -5497,7 +5497,7 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
}
}

- if (kvm->arch.tdp_mmu_enabled) {
+ if (is_tdp_mmu_enabled(kvm)) {
flush = kvm_tdp_mmu_zap_gfn_range(kvm, gfn_start, gfn_end);
if (flush)
kvm_flush_remote_tlbs(kvm);
@@ -5521,7 +5521,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
spin_lock(&kvm->mmu_lock);
flush = slot_handle_level(kvm, memslot, slot_rmap_write_protect,
start_level, KVM_MAX_HUGEPAGE_LEVEL, false);
- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_4K);
spin_unlock(&kvm->mmu_lock);

@@ -5587,7 +5587,7 @@ void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm,
slot_handle_leaf(kvm, (struct kvm_memory_slot *)memslot,
kvm_mmu_zap_collapsible_spte, true);

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_zap_collapsible_sptes(kvm, memslot);
spin_unlock(&kvm->mmu_lock);
}
@@ -5614,7 +5614,7 @@ void kvm_mmu_slot_leaf_clear_dirty(struct kvm *kvm,

spin_lock(&kvm->mmu_lock);
flush = slot_handle_leaf(kvm, memslot, __rmap_clear_dirty, false);
- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
flush |= kvm_tdp_mmu_clear_dirty_slot(kvm, memslot);
spin_unlock(&kvm->mmu_lock);

@@ -5637,7 +5637,7 @@ void kvm_mmu_slot_largepage_remove_write_access(struct kvm *kvm,
spin_lock(&kvm->mmu_lock);
flush = slot_handle_large_level(kvm, memslot, slot_rmap_write_protect,
false);
- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
flush |= kvm_tdp_mmu_wrprot_slot(kvm, memslot, PG_LEVEL_2M);
spin_unlock(&kvm->mmu_lock);

@@ -5653,7 +5653,7 @@ void kvm_mmu_slot_set_dirty(struct kvm *kvm,

spin_lock(&kvm->mmu_lock);
flush = slot_handle_all_level(kvm, memslot, __rmap_set_dirty, false);
- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
flush |= kvm_tdp_mmu_slot_set_dirty(kvm, memslot);
spin_unlock(&kvm->mmu_lock);

@@ -5681,7 +5681,7 @@ void kvm_mmu_zap_all(struct kvm *kvm)

kvm_mmu_commit_zap_page(kvm, &invalid_list);

- if (kvm->arch.tdp_mmu_enabled)
+ if (is_tdp_mmu_enabled(kvm))
kvm_tdp_mmu_zap_all(kvm);

spin_unlock(&kvm->mmu_lock);
@@ -5992,7 +5992,7 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
struct kvm_mmu_page,
lpage_disallowed_link);
WARN_ON_ONCE(!sp->lpage_disallowed);
- if (sp->tdp_mmu_page) {
+ if (is_tdp_mmu_page(sp)) {
kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn,
sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level));
} else {
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 7f599cc64178..cf67fa6fb8fe 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -56,10 +56,12 @@ struct kvm_mmu_page {
/* Number of writes since the last time traversal visited this page. */
atomic_t write_flooding_count;

+#ifdef CONFIG_X86_64
bool tdp_mmu_page;

/* Used for freeing the page asyncronously if it is a TDP MMU page. */
struct rcu_head rcu_head;
+#endif
};

extern struct kmem_cache *mmu_page_header_cache;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index eb38f74af3f2..075b9d63bd57 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -10,24 +10,13 @@
#include <asm/cmpxchg.h>
#include <trace/events/kvm.h>

-#ifdef CONFIG_X86_64
static bool __read_mostly tdp_mmu_enabled = false;
module_param_named(tdp_mmu, tdp_mmu_enabled, bool, 0644);
-#endif
-
-static bool is_tdp_mmu_enabled(void)
-{
-#ifdef CONFIG_X86_64
- return tdp_enabled && READ_ONCE(tdp_mmu_enabled);
-#else
- return false;
-#endif /* CONFIG_X86_64 */
-}

/* Initializes the TDP MMU for the VM, if enabled. */
void kvm_mmu_init_tdp_mmu(struct kvm *kvm)
{
- if (!is_tdp_mmu_enabled())
+ if (!tdp_enabled || !READ_ONCE(tdp_mmu_enabled))
return;

/* This should not be changed for the lifetime of the VM. */
@@ -96,22 +85,6 @@ static inline struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
#define for_each_tdp_mmu_root(_kvm, _root) \
list_for_each_entry(_root, &_kvm->arch.tdp_mmu_roots, link)

-bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
-{
- struct kvm_mmu_page *sp;
-
- if (!kvm->arch.tdp_mmu_enabled)
- return false;
- if (WARN_ON(!VALID_PAGE(hpa)))
- return false;
-
- sp = to_shadow_page(hpa);
- if (WARN_ON(!sp))
- return false;
-
- return sp->tdp_mmu_page && sp->root_count;
-}
-
static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
gfn_t start, gfn_t end, bool can_yield, bool flush);

diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index cbbdbadd1526..b4b65e3699b3 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -5,10 +5,6 @@

#include <linux/kvm_host.h>

-void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
-void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
-
-bool is_tdp_mmu_root(struct kvm *kvm, hpa_t root);
hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
void kvm_tdp_mmu_free_root(struct kvm *kvm, struct kvm_mmu_page *root);

@@ -47,4 +43,32 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
int *root_level);

+#ifdef CONFIG_X86_64
+void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
+void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
+static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return kvm->arch.tdp_mmu_enabled; }
+static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return sp->tdp_mmu_page; }
+#else
+static inline void kvm_mmu_init_tdp_mmu(struct kvm *kvm) {}
+static inline void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) {}
+static inline bool is_tdp_mmu_enabled(struct kvm *kvm) { return false; }
+static inline bool is_tdp_mmu_page(struct kvm_mmu_page *sp) { return false; }
+#endif
+
+static inline bool is_tdp_mmu_root(struct kvm *kvm, hpa_t hpa)
+{
+ struct kvm_mmu_page *sp;
+
+ if (!is_tdp_mmu_enabled(kvm))
+ return false;
+ if (WARN_ON(!VALID_PAGE(hpa)))
+ return false;
+
+ sp = to_shadow_page(hpa);
+ if (WARN_ON(!sp))
+ return false;
+
+ return is_tdp_mmu_page(sp) && sp->root_count;
+}
+
#endif /* __KVM_X86_MMU_TDP_MMU_H */
--
2.30.1



2021-04-05 16:06:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 094/126] KVM: x86/mmu: Fix braces in kvm_recover_nx_lpages

From: Ben Gardon <[email protected]>

[ Upstream commit 8d1a182ea791f0111b0258c8f3eb8d77af0a8386 ]

No functional change intended.

Fixes: 29cf0f5007a2 ("kvm: x86/mmu: NX largepage recovery for TDP MMU")
Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dacbd13d32c6..0f45ad05f895 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5992,10 +5992,10 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
struct kvm_mmu_page,
lpage_disallowed_link);
WARN_ON_ONCE(!sp->lpage_disallowed);
- if (sp->tdp_mmu_page)
+ if (sp->tdp_mmu_page) {
kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn,
sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level));
- else {
+ } else {
kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
WARN_ON_ONCE(sp->lpage_disallowed);
}
--
2.30.1



2021-04-05 16:06:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 098/126] KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping

From: Sean Christopherson <[email protected]>

[ Upstream commit 048f49809c526348775425420fb5b8e84fd9a133 ]

Honor the "flush needed" return from kvm_tdp_mmu_zap_gfn_range(), which
does the flush itself if and only if it yields (which it will never do in
this particular scenario), and otherwise expects the caller to do the
flush. If pages are zapped from the TDP MMU but not the legacy MMU, then
no flush will occur.
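For context, the helper's contract (a simplified sketch of
kvm_mmu_remote_flush_or_zap() from mmu.c, not a verbatim quote):

	static bool kvm_mmu_remote_flush_or_zap(struct kvm *kvm,
						struct list_head *invalid_list,
						bool remote_flush)
	{
		if (!remote_flush && list_empty(invalid_list))
			return false;

		if (!list_empty(invalid_list))
			kvm_mmu_commit_zap_page(kvm, invalid_list); /* flushes */
		else
			kvm_flush_remote_tlbs(kvm);
		return true;
	}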

Fixes: 29cf0f5007a2 ("kvm: x86/mmu: NX largepage recovery for TDP MMU")
Cc: [email protected]
Cc: Ben Gardon <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Ben Gardon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 94e6bf004576..e69248820d01 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5972,6 +5972,8 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
struct kvm_mmu_page *sp;
unsigned int ratio;
LIST_HEAD(invalid_list);
+ bool flush = false;
+ gfn_t gfn_end;
ulong to_zap;

rcu_idx = srcu_read_lock(&kvm->srcu);
@@ -5993,19 +5995,20 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
lpage_disallowed_link);
WARN_ON_ONCE(!sp->lpage_disallowed);
if (is_tdp_mmu_page(sp)) {
- kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn,
- sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level));
+ gfn_end = sp->gfn + KVM_PAGES_PER_HPAGE(sp->role.level);
+ flush = kvm_tdp_mmu_zap_gfn_range(kvm, sp->gfn, gfn_end);
} else {
kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
WARN_ON_ONCE(sp->lpage_disallowed);
}

if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);
cond_resched_lock(&kvm->mmu_lock);
+ flush = false;
}
}
- kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ kvm_mmu_remote_flush_or_zap(kvm, &invalid_list, flush);

spin_unlock(&kvm->mmu_lock);
srcu_read_unlock(&kvm->srcu, rcu_idx);
--
2.30.1



2021-04-05 16:06:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 099/126] extcon: Add stubs for extcon_register_notifier_all() functions

From: Krzysztof Kozlowski <[email protected]>

[ Upstream commit c9570d4a5efd04479b3cd09c39b571eb031d94f4 ]

Add stubs for the extcon_register_notifier_all() family of functions for
the !CONFIG_EXTCON case. This is useful for compile testing and for
drivers which use extcon but do not require it (and therefore do not
depend on CONFIG_EXTCON).
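A hypothetical consumer that now builds either way (the driver, callback
and phandle index are illustrative, not taken from the patch):

	#include <linux/extcon.h>
	#include <linux/platform_device.h>

	static int foo_evt(struct notifier_block *nb, unsigned long event,
			   void *ptr)
	{
		return NOTIFY_OK;
	}

	static struct notifier_block foo_nb = { .notifier_call = foo_evt };

	static int foo_probe(struct platform_device *pdev)
	{
		struct extcon_dev *edev;

		edev = extcon_get_edev_by_phandle(&pdev->dev, 0);
		if (IS_ERR(edev)) {
			if (PTR_ERR(edev) == -EPROBE_DEFER)
				return -EPROBE_DEFER;
			edev = NULL;	/* extcon is optional for this driver */
		}

		/* with CONFIG_EXTCON=n this resolves to the new stub and
		 * simply returns 0 */
		return edev ? devm_extcon_register_notifier_all(&pdev->dev,
								edev, &foo_nb)
			    : 0;
	}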

Fixes: 815429b39d94 ("extcon: Add new extcon_register_notifier_all() to monitor all external connectors")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/extcon.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/include/linux/extcon.h b/include/linux/extcon.h
index fd183fb9c20f..0c19010da77f 100644
--- a/include/linux/extcon.h
+++ b/include/linux/extcon.h
@@ -271,6 +271,29 @@ static inline void devm_extcon_unregister_notifier(struct device *dev,
struct extcon_dev *edev, unsigned int id,
struct notifier_block *nb) { }

+static inline int extcon_register_notifier_all(struct extcon_dev *edev,
+ struct notifier_block *nb)
+{
+ return 0;
+}
+
+static inline int extcon_unregister_notifier_all(struct extcon_dev *edev,
+ struct notifier_block *nb)
+{
+ return 0;
+}
+
+static inline int devm_extcon_register_notifier_all(struct device *dev,
+ struct extcon_dev *edev,
+ struct notifier_block *nb)
+{
+ return 0;
+}
+
+static inline void devm_extcon_unregister_notifier_all(struct device *dev,
+ struct extcon_dev *edev,
+ struct notifier_block *nb) { }
+
static inline struct extcon_dev *extcon_get_extcon_dev(const char *extcon_name)
{
return ERR_PTR(-ENODEV);
--
2.30.2



2021-04-05 16:06:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 075/126] drm/amdgpu: check alignment on CPU page for bo map

From: Xi Ruoyao <[email protected]>

commit e3512fb67093fabdf27af303066627b921ee9bd8 upstream.

The page table of AMDGPU requires an alignment to CPU page so we should
check ioctl parameters for it. Return -EINVAL if some parameter is
unaligned to CPU page, instead of corrupt the page table sliently.
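Two different mask conventions are in play here (a clarifying note,
assuming 4 KiB CPU pages): the kernel's PAGE_MASK keeps the aligned high
bits, so the new check tests the low bits with its complement, while the
old AMDGPU_GPU_PAGE_MASK check only covered the 4 KiB GPU page:

	/* (addr & ~PAGE_MASK) != 0  <=>  addr is not CPU-page aligned.
	 * With 4 KiB pages: PAGE_MASK == ~0xfffUL, so ~PAGE_MASK == 0xfff. */
	if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK ||
	    size == 0 || size & ~PAGE_MASK)
		return -EINVAL;	/* e.g. saddr == 0x1800: 0x1800 & 0xfff == 0x800 */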

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Xi Ruoyao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2223,8 +2223,8 @@ int amdgpu_vm_bo_map(struct amdgpu_devic
uint64_t eaddr;

/* validate the parameters */
- if (saddr & AMDGPU_GPU_PAGE_MASK || offset & AMDGPU_GPU_PAGE_MASK ||
- size == 0 || size & AMDGPU_GPU_PAGE_MASK)
+ if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK ||
+ size == 0 || size & ~PAGE_MASK)
return -EINVAL;

/* make sure object fit at this offset */
@@ -2289,8 +2289,8 @@ int amdgpu_vm_bo_replace_map(struct amdg
int r;

/* validate the parameters */
- if (saddr & AMDGPU_GPU_PAGE_MASK || offset & AMDGPU_GPU_PAGE_MASK ||
- size == 0 || size & AMDGPU_GPU_PAGE_MASK)
+ if (saddr & ~PAGE_MASK || offset & ~PAGE_MASK ||
+ size == 0 || size & ~PAGE_MASK)
return -EINVAL;

/* make sure object fit at this offset */


2021-04-05 16:06:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 105/126] usbip: vhci_hcd fix shift out-of-bounds in vhci_hub_control()

From: Shuah Khan <[email protected]>

commit 1cc5ed25bdade86de2650a82b2730108a76de20c upstream.

Fix shift out-of-bounds in vhci_hub_control() SetPortFeature handling.

UBSAN: shift-out-of-bounds in drivers/usb/usbip/vhci_hcd.c:605:42
shift exponent 768 is too large for 32-bit type 'int'
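The underlying pattern (simplified; wValue comes from the SetPortFeature
request, which USB/IP lets the peer choose): shifting a 32-bit int by an
exponent of 32 or more is undefined behavior, hence the bound check:

	/* SetPortFeature: wValue selects a feature bit in a 32-bit
	 * port-status word */
	if (wValue >= 32)	/* the fix: 1 << 768 is UB on a 32-bit int */
		goto error;
	vhci_hcd->port_status[rhport] |= (1 << wValue);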

Reported-by: [email protected]
Cc: [email protected]
Signed-off-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/usb/usbip/vhci_hcd.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -594,6 +594,8 @@ static int vhci_hub_control(struct usb_h
pr_err("invalid port number %d\n", wIndex);
goto error;
}
+ if (wValue >= 32)
+ goto error;
if (hcd->speed == HCD_USB3) {
if ((vhci_hcd->port_status[rhport] &
USB_SS_PORT_STAT_POWER) != 0) {


2021-04-05 16:06:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 102/126] usb: dwc3: pci: Enable dis_uX_susphy_quirk for Intel Merrifield

From: Andy Shevchenko <[email protected]>

[ Upstream commit b522f830d35189e0283fa4d5b4b3ef8d7a78cfcb ]

It seems that on the Intel Merrifield platform the USB PHY shouldn't be
suspended. Otherwise it can't be enabled by simply changing the cable in
the connector.

Enable the corresponding quirks for the platform in question.

Fixes: e5f4ca3fce90 ("usb: dwc3: ulpi: Fix USB2.0 HS/FS/LS PHY suspend regression")
Suggested-by: Serge Semin <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/dwc3/dwc3-pci.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/usb/dwc3/dwc3-pci.c b/drivers/usb/dwc3/dwc3-pci.c
index bae6a70664c8..598daed8086f 100644
--- a/drivers/usb/dwc3/dwc3-pci.c
+++ b/drivers/usb/dwc3/dwc3-pci.c
@@ -118,6 +118,8 @@ static const struct property_entry dwc3_pci_intel_properties[] = {
static const struct property_entry dwc3_pci_mrfld_properties[] = {
PROPERTY_ENTRY_STRING("dr_mode", "otg"),
PROPERTY_ENTRY_STRING("linux,extcon-name", "mrfld_bcove_pwrsrc"),
+ PROPERTY_ENTRY_BOOL("snps,dis_u3_susphy_quirk"),
+ PROPERTY_ENTRY_BOOL("snps,dis_u2_susphy_quirk"),
PROPERTY_ENTRY_BOOL("linux,sysdev_is_parent"),
{}
};
--
2.30.2



2021-04-05 16:06:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 071/126] s390/vdso: fix tod_steering_delta type

From: Heiko Carstens <[email protected]>

commit b24bacd67ffddd9192c4745500fd6f73dbfe565e upstream.

The s390 specific vdso function __arch_get_hw_counter() is supposed to
consider tod clock steering.

If a tod clock steering event happens and the tod clock is set to a
new value, __arch_get_hw_counter() will not return the real tod clock
value but slowly drift it from the old delta until the returned value
finally matches the real tod clock value again.

Unfortunately the type of tod_steering_delta is unsigned while it is
supposed to be signed. Whether the vdso code drifts the clock value
forwards or backwards depends on whether tod_steering_delta is negative
or positive.

Worst case is now that instead of drifting the clock slowly it will
jump in the opposite direction by a factor of two.

Fix this by simply making tod_steering_delta signed.
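A small illustration of the failure mode (values are made up; the
steering math in the real vdso is more involved):

	/* a steering event wants to drift the clock back by 5 units */
	__s64 delta = -5;

	/* read back through an unsigned field, the same bit pattern is a
	 * huge positive value ... */
	__u64 wrong = (__u64)delta;	/* 0xfffffffffffffffb */

	/* ... so a correction that should slowly subtract ends up adding:
	 * the clock jumps the other way by the same magnitude, doubling
	 * the error instead of removing it */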

Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO")
Cc: <[email protected]> # 5.10
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/s390/include/asm/vdso/data.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/s390/include/asm/vdso/data.h
+++ b/arch/s390/include/asm/vdso/data.h
@@ -6,7 +6,7 @@
#include <vdso/datapage.h>

struct arch_vdso_data {
- __u64 tod_steering_delta;
+ __s64 tod_steering_delta;
__u64 tod_steering_end;
};



2021-04-05 16:06:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 119/126] soc: qcom-geni-se: Cleanup the code to remove proxy votes

From: Roja Rani Yarubandi <[email protected]>

commit 29d96eb261345c8d888e248ae79484e681be2faa upstream.

This reverts commit 048eb908a1f2 ("soc: qcom-geni-se: Add interconnect
support to fix earlycon crash")

ICC core and platforms drivers supports sync_state feature, which
ensures that the default ICC BW votes from the bootloader is not
removed until all it's consumers are probes.

The proxy votes were needed in case other QUP child drivers
I2C, SPI probes before UART, they can turn off the QUP-CORE clock
which is shared resources for all QUP driver, this causes unclocked
access to HW from earlycon.

Given the above support from ICC, there is no longer a need to maintain
proxy votes on the QUP-CORE ICC node from the QUP wrapper driver for the
early console use case; the default votes won't be removed until the
real console is probed.
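
For reference, providers opt in to this by pointing their driver's
sync_state callback at the ICC core's icc_sync_state() helper, which
keeps the bootloader bandwidth votes until every consumer has probed.
A rough sketch (driver and match-table names are hypothetical):

  static struct platform_driver qcom_icc_provider_driver = {
          .probe  = qcom_icc_probe,
          .remove = qcom_icc_remove,
          .driver = {
                  .name           = "qnoc-example",
                  .of_match_table = qcom_icc_of_match,
                  /* keep default votes until all consumers probe */
                  .sync_state     = icc_sync_state,
          },
  };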

Cc: [email protected]
Fixes: 266cd33b5913 ("interconnect: qcom: Ensure that the floor bandwidth value is enforced")
Fixes: 7d3b0b0d8184 ("interconnect: qcom: Use icc_sync_state")
Signed-off-by: Roja Rani Yarubandi <[email protected]>
Signed-off-by: Akash Asthana <[email protected]>
Reviewed-by: Matthias Kaehlcke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/soc/qcom/qcom-geni-se.c | 74 ----------------------------------
drivers/tty/serial/qcom_geni_serial.c | 7 ---
include/linux/qcom-geni-se.h | 2 --
3 files changed, 83 deletions(-)

--- a/drivers/soc/qcom/qcom-geni-se.c
+++ b/drivers/soc/qcom/qcom-geni-se.c
@@ -3,7 +3,6 @@

#include <linux/acpi.h>
#include <linux/clk.h>
-#include <linux/console.h>
#include <linux/slab.h>
#include <linux/dma-mapping.h>
#include <linux/io.h>
@@ -91,14 +90,11 @@ struct geni_wrapper {
struct device *dev;
void __iomem *base;
struct clk_bulk_data ahb_clks[NUM_AHB_CLKS];
- struct geni_icc_path to_core;
};

static const char * const icc_path_names[] = {"qup-core", "qup-config",
"qup-memory"};

-static struct geni_wrapper *earlycon_wrapper;
-
#define QUP_HW_VER_REG 0x4

/* Common SE registers */
@@ -828,44 +824,11 @@ int geni_icc_disable(struct geni_se *se)
}
EXPORT_SYMBOL(geni_icc_disable);

-void geni_remove_earlycon_icc_vote(void)
-{
- struct platform_device *pdev;
- struct geni_wrapper *wrapper;
- struct device_node *parent;
- struct device_node *child;
-
- if (!earlycon_wrapper)
- return;
-
- wrapper = earlycon_wrapper;
- parent = of_get_next_parent(wrapper->dev->of_node);
- for_each_child_of_node(parent, child) {
- if (!of_device_is_compatible(child, "qcom,geni-se-qup"))
- continue;
-
- pdev = of_find_device_by_node(child);
- if (!pdev)
- continue;
-
- wrapper = platform_get_drvdata(pdev);
- icc_put(wrapper->to_core.path);
- wrapper->to_core.path = NULL;
-
- }
- of_node_put(parent);
-
- earlycon_wrapper = NULL;
-}
-EXPORT_SYMBOL(geni_remove_earlycon_icc_vote);
-
static int geni_se_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
struct resource *res;
struct geni_wrapper *wrapper;
- struct console __maybe_unused *bcon;
- bool __maybe_unused has_earlycon = false;
int ret;

wrapper = devm_kzalloc(dev, sizeof(*wrapper), GFP_KERNEL);
@@ -888,43 +851,6 @@ static int geni_se_probe(struct platform
}
}

-#ifdef CONFIG_SERIAL_EARLYCON
- for_each_console(bcon) {
- if (!strcmp(bcon->name, "qcom_geni")) {
- has_earlycon = true;
- break;
- }
- }
- if (!has_earlycon)
- goto exit;
-
- wrapper->to_core.path = devm_of_icc_get(dev, "qup-core");
- if (IS_ERR(wrapper->to_core.path))
- return PTR_ERR(wrapper->to_core.path);
- /*
- * Put minmal BW request on core clocks on behalf of early console.
- * The vote will be removed earlycon exit function.
- *
- * Note: We are putting vote on each QUP wrapper instead only to which
- * earlycon is connected because QUP core clock of different wrapper
- * share same voltage domain. If core1 is put to 0, then core2 will
- * also run at 0, if not voted. Default ICC vote will be removed ASA
- * we touch any of the core clock.
- * core1 = core2 = max(core1, core2)
- */
- ret = icc_set_bw(wrapper->to_core.path, GENI_DEFAULT_BW,
- GENI_DEFAULT_BW);
- if (ret) {
- dev_err(&pdev->dev, "%s: ICC BW voting failed for core: %d\n",
- __func__, ret);
- return ret;
- }
-
- if (of_get_compatible_child(pdev->dev.of_node, "qcom,geni-debug-uart"))
- earlycon_wrapper = wrapper;
- of_node_put(pdev->dev.of_node);
-exit:
-#endif
dev_set_drvdata(dev, wrapper);
dev_dbg(dev, "GENI SE Driver probed\n");
return devm_of_platform_populate(dev);
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1177,12 +1177,6 @@ static inline void qcom_geni_serial_enab
struct console *con) { }
#endif

-static int qcom_geni_serial_earlycon_exit(struct console *con)
-{
- geni_remove_earlycon_icc_vote();
- return 0;
-}
-
static struct qcom_geni_private_data earlycon_private_data;

static int __init qcom_geni_serial_earlycon_setup(struct earlycon_device *dev,
@@ -1233,7 +1227,6 @@ static int __init qcom_geni_serial_early
writel(stop_bit_len, uport->membase + SE_UART_TX_STOP_BIT_LEN);

dev->con->write = qcom_geni_serial_earlycon_write;
- dev->con->exit = qcom_geni_serial_earlycon_exit;
dev->con->setup = NULL;
qcom_geni_serial_enable_early_read(&se, dev->con);

--- a/include/linux/qcom-geni-se.h
+++ b/include/linux/qcom-geni-se.h
@@ -462,7 +462,5 @@ void geni_icc_set_tag(struct geni_se *se
int geni_icc_enable(struct geni_se *se);

int geni_icc_disable(struct geni_se *se);
-
-void geni_remove_earlycon_icc_vote(void);
#endif
#endif


2021-04-05 16:06:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 111/126] USB: cdc-acm: downgrade message to debug

From: Oliver Neukum <[email protected]>

commit e4c77070ad45fc940af1d7fb1e637c349e848951 upstream.

This failure is so common that logging an error here amounts
to spamming log files.

Reviewed-by: Bruno Thomsen <[email protected]>
Signed-off-by: Oliver Neukum <[email protected]>
Cc: stable <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/usb/class/cdc-acm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -659,7 +659,8 @@ static void acm_port_dtr_rts(struct tty_

res = acm_set_control(acm, val);
if (res && (acm->ctrl_caps & USB_CDC_CAP_LINE))
- dev_err(&acm->control->dev, "failed to set dtr/rts\n");
+ /* This is broken in too many devices to spam the logs */
+ dev_dbg(&acm->control->dev, "failed to set dtr/rts\n");
}

static int acm_port_activate(struct tty_port *port, struct tty_struct *tty)


2021-04-05 16:07:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 124/126] riscv: evaluate put_user() arg before enabling user access

From: Ben Dooks <[email protected]>

commit 285a76bb2cf51b0c74c634f2aaccdb93e1f2a359 upstream.

The <asm/uaccess.h> header has a problem with put_user(a, ptr) if
'a' is not a simple variable but, for example, a function call. This
can lead to the compiler producing code like so:

1: enable_user_access()
2: evaluate 'a' into register 'r'
3: put 'r' to 'ptr'
4: disable_user_acess()

The issue is that 'a' is now being evaluated with the user memory
protections disabled. So we try to force the evaluation by assigning
'x' to __val at the start, and hoping the compiler barriers in
enable_user_access() do the job of ordering step 2 before step 1.

This has shown up in a bug where 'a' sleeps and thus schedules out
and loses the SR_SUM flag. This isn't sufficient to fully fix the
issue, but should reduce the window of opportunity. The first instance
of this we found is in schedule_tail(), where the code does:

$ less -N kernel/sched/core.c

4263 if (current->set_child_tid)
4264 put_user(task_pid_vnr(current), current->set_child_tid);

Here, the task_pid_vnr(current) is called within the block that has
enabled the user memory access. This can be made worse with KASAN
which makes task_pid_vnr() a rather large call with plenty of
opportunity to sleep.
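
A toy illustration of the fix (the enable/disable helpers here are
simplified stand-ins for the real __enable_user_access() and
__disable_user_access(), not the actual <asm/uaccess.h> code): forcing
the evaluation into a local variable moves step 2 in front of step 1.

  #define toy_put_user(x, ptr)                              \
  ({                                                        \
          /* step 2 first: evaluate with protections on */  \
          __typeof__(*(ptr)) __val = (x);                   \
          long __err = 0;                                   \
                                                            \
          enable_user_access();   /* then step 1 */         \
          *(ptr) = __val;         /* step 3 */              \
          disable_user_access();  /* step 4 */              \
          __err;                                            \
  })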

Signed-off-by: Ben Dooks <[email protected]>
Reported-by: [email protected]
Suggested-by: Arnd Bergman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

--
Changes since v1:
- fixed formatting and updated the patch description with more info

Changes since v2:
- fixed commenting on __put_user() ([email protected])

Change since v3:
- fixed RFC in patch title. Should be ready to merge.

Signed-off-by: Palmer Dabbelt <[email protected]>
---
arch/riscv/include/asm/uaccess.h | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -306,7 +306,9 @@ do { \
* data types like structures or arrays.
*
* @ptr must have pointer-to-simple-variable type, and @x must be assignable
- * to the result of dereferencing @ptr.
+ * to the result of dereferencing @ptr. The value of @x is copied to avoid
+ * re-ordering where @x is evaluated inside the block that enables user-space
+ * access (thus bypassing user space protection if @x is a function).
*
* Caller must check the pointer with access_ok() before calling this
* function.
@@ -316,12 +318,13 @@ do { \
#define __put_user(x, ptr) \
({ \
__typeof__(*(ptr)) __user *__gu_ptr = (ptr); \
+ __typeof__(*__gu_ptr) __val = (x); \
long __pu_err = 0; \
\
__chk_user_ptr(__gu_ptr); \
\
__enable_user_access(); \
- __put_user_nocheck(x, __gu_ptr, __pu_err); \
+ __put_user_nocheck(__val, __gu_ptr, __pu_err); \
__disable_user_access(); \
\
__pu_err; \


2021-04-05 16:07:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 123/126] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()

From: Du Cheng <[email protected]>

commit 01faae5193d6190b7b3aa93dae43f514e866d652 upstream.

Add a NULL check on the ops->cursor function pointer before
dereferencing it.

Reported-by: [email protected]
Cc: stable <[email protected]>
Signed-off-by: Du Cheng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/video/fbdev/core/fbcon.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -1344,6 +1344,9 @@ static void fbcon_cursor(struct vc_data

ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1;

+ if (!ops->cursor)
+ return;
+
ops->cursor(vc, info, mode, get_color(vc, info, c, 1),
get_color(vc, info, c, 0));
}


2021-04-05 16:07:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 092/126] KVM: x86/mmu: Add comment on __tdp_mmu_set_spte

From: Ben Gardon <[email protected]>

[ Upstream commit fe43fa2f407b9d513f7bcf18142e14e1bf1508d6 ]

__tdp_mmu_set_spte is a very important function in the TDP MMU which
already accepts several arguments and will take more in future commits.
To offset this complexity, add a comment to the function describing each
of the arguments.

No functional change intended.

Reviewed-by: Peter Feiner <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 34ef3e1a0f84..f88404033e0c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -395,6 +395,22 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
new_spte, level);
}

+/*
+ * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
+ * @kvm: kvm instance
+ * @iter: a tdp_iter instance currently on the SPTE that should be set
+ * @new_spte: The value the SPTE should be set to
+ * @record_acc_track: Notify the MM subsystem of changes to the accessed state
+ * of the page. Should be set unless handling an MMU
+ * notifier for access tracking. Leaving record_acc_track
+ * unset in that case prevents page accesses from being
+ * double counted.
+ * @record_dirty_log: Record the page as dirty in the dirty bitmap if
+ * appropriate for the change being made. Should be set
+ * unless performing certain dirty logging operations.
+ * Leaving record_dirty_log unset in that case prevents page
+ * writes from being double counted.
+ */
static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
u64 new_spte, bool record_acc_track,
bool record_dirty_log)
--
2.30.1



2021-04-05 16:07:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 100/126] extcon: Fix error handling in extcon_dev_register

From: Dinghao Liu <[email protected]>

[ Upstream commit d3bdd1c3140724967ca4136755538fa7c05c2b4e ]

When devm_kcalloc() fails, we should execute device_unregister()
to unregister edev->dev from system.

Fixes: 046050f6e623e ("extcon: Update the prototype of extcon_register_notifier() with enum extcon")
Signed-off-by: Dinghao Liu <[email protected]>
Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/extcon/extcon.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/extcon/extcon.c b/drivers/extcon/extcon.c
index 0a6438cbb3f3..e7a9561a826d 100644
--- a/drivers/extcon/extcon.c
+++ b/drivers/extcon/extcon.c
@@ -1241,6 +1241,7 @@ int extcon_dev_register(struct extcon_dev *edev)
sizeof(*edev->nh), GFP_KERNEL);
if (!edev->nh) {
ret = -ENOMEM;
+ device_unregister(&edev->dev);
goto err_dev;
}

--
2.30.2



2021-04-05 16:07:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 101/126] firmware: stratix10-svc: reset COMMAND_RECONFIG_FLAG_PARTIAL to 0

From: Richard Gong <[email protected]>

[ Upstream commit 2e8496f31d0be8f43849b2980b069f3a9805d047 ]

Clean up the COMMAND_RECONFIG_FLAG_PARTIAL flag by resetting it to 0,
which aligns with the firmware settings.

Fixes: 36847f9e3e56 ("firmware: stratix10-svc: correct reconfig flag and timeout values")
Signed-off-by: Richard Gong <[email protected]>
Reviewed-by: Tom Rix <[email protected]>
Signed-off-by: Moritz Fischer <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/firmware/intel/stratix10-svc-client.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/firmware/intel/stratix10-svc-client.h b/include/linux/firmware/intel/stratix10-svc-client.h
index a93d85932eb9..f843c6a10cf3 100644
--- a/include/linux/firmware/intel/stratix10-svc-client.h
+++ b/include/linux/firmware/intel/stratix10-svc-client.h
@@ -56,7 +56,7 @@
* COMMAND_RECONFIG_FLAG_PARTIAL:
* Set to FPGA configuration type (full or partial).
*/
-#define COMMAND_RECONFIG_FLAG_PARTIAL 1
+#define COMMAND_RECONFIG_FLAG_PARTIAL 0

/**
* Timeout settings for service clients:
--
2.30.2



2021-04-05 16:28:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 062/126] ALSA: hda/realtek: fix mute/micmute LEDs for HP 640 G8

From: Jeremy Szu <[email protected]>

commit 417eadfdd9e25188465280edf3668ed163fda2d0 upstream.

The HP EliteBook 640 G8 Notebook PC uses the ALC236 codec, which uses
0x02 to control the mute LED and 0x01 to control the micmute LED.
Therefore, add a quirk to make them work.

Signed-off-by: Jeremy Szu <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)

--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -8058,6 +8058,7 @@ static const struct snd_pci_quirk alc269
ALC285_FIXUP_HP_GPIO_AMP_INIT),
SND_PCI_QUIRK(0x103c, 0x87c8, "HP", ALC287_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x87e5, "HP ProBook 440 G8 Notebook PC", ALC236_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x87f2, "HP ProBook 640 G8 Notebook PC", ALC236_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x87f4, "HP", ALC287_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x87f5, "HP", ALC287_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x87f7, "HP Spectre x360 14", ALC245_FIXUP_HP_X360_AMP),


2021-04-05 16:28:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 036/126] io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL

From: Stefan Metzmacher <[email protected]>

[ Upstream commit 0031275d119efe16711cd93519b595e6f9b4b330 ]

Without that it's not safe to use them in a linked combination with
others.

Now combinations like IORING_OP_SENDMSG followed by IORING_OP_SPLICE
should be possible.

We already handle short reads and writes for the following opcodes:

- IORING_OP_READV
- IORING_OP_READ_FIXED
- IORING_OP_READ
- IORING_OP_WRITEV
- IORING_OP_WRITE_FIXED
- IORING_OP_WRITE
- IORING_OP_SPLICE
- IORING_OP_TEE

Now we have it for these as well:

- IORING_OP_SENDMSG
- IORING_OP_SEND
- IORING_OP_RECVMSG
- IORING_OP_RECV

For IORING_OP_RECVMSG we also check for the MSG_TRUNC and MSG_CTRUNC
flags in order to call req_set_fail_links().

There might be applications around that depend on the behavior
that even short send[msg]()/recv[msg]() returns continue an
IOSQE_IO_LINK chain.

It's very unlikely that such applications pass in MSG_WAITALL,
which is only defined in 'man 2 recvmsg', but not in 'man 2 sendmsg'.

It's expected that the low-level sock_sendmsg() call just ignores
MSG_WAITALL, as MSG_ZEROCOPY is also ignored without SO_ZEROCOPY
being explicitly set.

We also expect the caller to know about the implicit truncation to
MAX_RW_COUNT, which we don't detect.
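
From userspace, the new failure semantics are what make such linked
chains reliable. A hedged liburing-style sketch (assuming an initialized
ring and sockfd/outfd/buf set up elsewhere):

  struct io_uring_sqe *sqe;

  sqe = io_uring_get_sqe(&ring);
  io_uring_prep_recv(sqe, sockfd, buf, sizeof(buf), MSG_WAITALL);
  sqe->flags |= IOSQE_IO_LINK;  /* a short recv now fails the link */

  sqe = io_uring_get_sqe(&ring);
  io_uring_prep_write(sqe, outfd, buf, sizeof(buf), 0);

  io_uring_submit(&ring);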

cc: [email protected]
Link: https://lore.kernel.org/r/c4e1a4cc0d905314f4d5dc567e65a7b09621aab3.1615908477.git.metze@samba.org
Signed-off-by: Stefan Metzmacher <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/io_uring.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index fe2dfdab0acd..4ccf99cb8cdc 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4401,6 +4401,7 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
struct io_async_msghdr iomsg, *kmsg;
struct socket *sock;
unsigned flags;
+ int min_ret = 0;
int ret;

sock = sock_from_file(req->file, &ret);
@@ -4427,6 +4428,9 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
else if (force_nonblock)
flags |= MSG_DONTWAIT;

+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+
ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
if (force_nonblock && ret == -EAGAIN)
return io_setup_async_msg(req, kmsg);
@@ -4436,7 +4440,7 @@ static int io_sendmsg(struct io_kiocb *req, bool force_nonblock,
if (kmsg->iov != kmsg->fast_iov)
kfree(kmsg->iov);
req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
+ if (ret < min_ret)
req_set_fail_links(req);
__io_req_complete(req, ret, 0, cs);
return 0;
@@ -4450,6 +4454,7 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
struct iovec iov;
struct socket *sock;
unsigned flags;
+ int min_ret = 0;
int ret;

sock = sock_from_file(req->file, &ret);
@@ -4471,6 +4476,9 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
else if (force_nonblock)
flags |= MSG_DONTWAIT;

+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&msg.msg_iter);
+
msg.msg_flags = flags;
ret = sock_sendmsg(sock, &msg);
if (force_nonblock && ret == -EAGAIN)
@@ -4478,7 +4486,7 @@ static int io_send(struct io_kiocb *req, bool force_nonblock,
if (ret == -ERESTARTSYS)
ret = -EINTR;

- if (ret < 0)
+ if (ret < min_ret)
req_set_fail_links(req);
__io_req_complete(req, ret, 0, cs);
return 0;
@@ -4630,6 +4638,7 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
struct socket *sock;
struct io_buffer *kbuf;
unsigned flags;
+ int min_ret = 0;
int ret, cflags = 0;

sock = sock_from_file(req->file, &ret);
@@ -4665,6 +4674,9 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
else if (force_nonblock)
flags |= MSG_DONTWAIT;

+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&kmsg->msg.msg_iter);
+
ret = __sys_recvmsg_sock(sock, &kmsg->msg, req->sr_msg.umsg,
kmsg->uaddr, flags);
if (force_nonblock && ret == -EAGAIN)
@@ -4677,7 +4689,7 @@ static int io_recvmsg(struct io_kiocb *req, bool force_nonblock,
if (kmsg->iov != kmsg->fast_iov)
kfree(kmsg->iov);
req->flags &= ~REQ_F_NEED_CLEANUP;
- if (ret < 0)
+ if (ret < min_ret || ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))))
req_set_fail_links(req);
__io_req_complete(req, ret, cflags, cs);
return 0;
@@ -4693,6 +4705,7 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
struct socket *sock;
struct iovec iov;
unsigned flags;
+ int min_ret = 0;
int ret, cflags = 0;

sock = sock_from_file(req->file, &ret);
@@ -4723,6 +4736,9 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
else if (force_nonblock)
flags |= MSG_DONTWAIT;

+ if (flags & MSG_WAITALL)
+ min_ret = iov_iter_count(&msg.msg_iter);
+
ret = sock_recvmsg(sock, &msg, flags);
if (force_nonblock && ret == -EAGAIN)
return -EAGAIN;
@@ -4731,7 +4747,7 @@ static int io_recv(struct io_kiocb *req, bool force_nonblock,
out_free:
if (req->flags & REQ_F_BUFFER_SELECTED)
cflags = io_put_recv_kbuf(req);
- if (ret < 0)
+ if (ret < min_ret || ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))))
req_set_fail_links(req);
__io_req_complete(req, ret, cflags, cs);
return 0;
--
2.30.1



2021-04-05 16:29:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 063/126] xtensa: fix uaccess-related livelock in do_page_fault

From: Max Filippov <[email protected]>

commit 7b9acbb6aad4f54623dcd4bd4b1a60fe0c727b09 upstream.

If a uaccess (e.g. get_user()) triggers a fault and there's a
fault signal pending, the handler will return to the uaccess without
having performed a uaccess fault fixup, and so the CPU will immediately
execute the uaccess instruction again, whereupon it will livelock
bouncing between that instruction and the fault handler.

https://lore.kernel.org/lkml/[email protected]/

Cc: [email protected]
Reported-by: Mark Rutland <[email protected]>
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/xtensa/mm/fault.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -112,8 +112,11 @@ good_area:
*/
fault = handle_mm_fault(vma, address, flags, regs);

- if (fault_signal_pending(fault, regs))
+ if (fault_signal_pending(fault, regs)) {
+ if (!user_mode(regs))
+ goto bad_page_fault;
return;
+ }

if (unlikely(fault & VM_FAULT_ERROR)) {
if (fault & VM_FAULT_OOM)


2021-04-05 16:34:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 037/126] net: mvpp2: fix interrupt mask/unmask skip condition

[ Upstream commit 7867299cde34e9c2d2c676f2a384a9d5853b914d ]

The condition should also be skipped if the CPU ID is equal to nthreads.
The patch doesn't fix any actual issue since
nthreads = min_t(unsigned int, num_present_cpus(), MVPP2_MAX_THREADS).
On all current Armada platforms, the number of CPUs is
less than MVPP2_MAX_THREADS.

Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
Reported-by: Russell King <[email protected]>
Signed-off-by: Stefan Chulski <[email protected]>
Reviewed-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index d1f7b51cab62..f5333fc27e14 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -1153,7 +1153,7 @@ static void mvpp2_interrupts_unmask(void *arg)
u32 val;

/* If the thread isn't used, don't do anything */
- if (smp_processor_id() > port->priv->nthreads)
+ if (smp_processor_id() >= port->priv->nthreads)
return;

val = MVPP2_CAUSE_MISC_SUM_MASK |
@@ -2287,7 +2287,7 @@ static void mvpp2_txq_sent_counter_clear(void *arg)
int queue;

/* If the thread isn't used, don't do anything */
- if (smp_processor_id() > port->priv->nthreads)
+ if (smp_processor_id() >= port->priv->nthreads)
return;

for (queue = 0; queue < port->ntxqs; queue++) {
--
2.30.1



2021-04-05 16:34:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 041/126] can: tcan4x5x: fix max register value

[ Upstream commit 6e1caaf8ed22eb700cc47ec353816eee33186c1c ]

This patch fixes the max register value for the regmap.

Reviewed-by: Dan Murphy <[email protected]>
Tested-by: Sean Nyekjaer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/can/m_can/tcan4x5x.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/m_can/tcan4x5x.c b/drivers/net/can/m_can/tcan4x5x.c
index 01f5b6e03a2d..f169d9090e52 100644
--- a/drivers/net/can/m_can/tcan4x5x.c
+++ b/drivers/net/can/m_can/tcan4x5x.c
@@ -88,7 +88,7 @@

#define TCAN4X5X_MRAM_START 0x8000
#define TCAN4X5X_MCAN_OFFSET 0x1000
-#define TCAN4X5X_MAX_REGISTER 0x8fff
+#define TCAN4X5X_MAX_REGISTER 0x8ffc

#define TCAN4X5X_CLEAR_ALL_INT 0xffffffff
#define TCAN4X5X_SET_ALL_INT 0xffffffff
--
2.30.1



2021-04-05 16:34:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 040/126] net: introduce CAN specific pointer in the struct net_device

From: Oleksij Rempel <[email protected]>

[ Upstream commit 4e096a18867a5a989b510f6999d9c6b6622e8f7b ]

Since 20dd3850bcf8 ("can: Speed up CAN frame receiption by using
ml_priv") the CAN framework uses per device specific data in the AF_CAN
protocol. For this purpose the struct net_device->ml_priv is used. Later
the ml_priv usage in CAN was extended for other users, one of them being
CAN_J1939.

Later in the kernel, ml_priv was converted to a union, used by other
drivers. E.g. the tun driver started storing its stats pointer.

Since tun devices can claim to be a CAN device, CAN specific protocols
will wrongly interpret this pointer, which will cause system crashes.
Mostly this issue is visible in the CAN_J1939 stack.

To fix this issue, we request a dedicated CAN pointer within the
net_device struct.
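
The typed accessor makes every CAN-side lookup self-checking; a short
sketch of the resulting pattern (the handler in the last line is a
hypothetical helper, not part of the patch):

  struct can_ml_priv *can_ml = can_get_ml_priv(dev);

  if (!can_ml)
          return;  /* ml_priv is not CAN-typed, e.g. a tun device */

  /* safe: dev->ml_priv really is a struct can_ml_priv */
  handle_rcv_lists(&can_ml->dev_rcv_lists);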

Reported-by: [email protected]
Fixes: 20dd3850bcf8 ("can: Speed up CAN frame receiption by using ml_priv")
Fixes: ffd956eef69b ("can: introduce CAN midlayer private and allocate it automatically")
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Fixes: 497a5757ce4e ("tun: switch to net core provided statistics counters")
Signed-off-by: Oleksij Rempel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/can/dev/dev.c | 4 +++-
drivers/net/can/slcan.c | 4 +++-
drivers/net/can/vcan.c | 2 +-
drivers/net/can/vxcan.c | 6 +++++-
include/linux/can/can-ml.h | 12 ++++++++++++
include/linux/netdevice.h | 34 +++++++++++++++++++++++++++++++++-
net/can/af_can.c | 34 ++--------------------------------
net/can/j1939/main.c | 22 ++++++++--------------
net/can/j1939/socket.c | 13 ++++---------
net/can/proc.c | 19 +++++++++++++------
10 files changed, 84 insertions(+), 66 deletions(-)

diff --git a/drivers/net/can/dev/dev.c b/drivers/net/can/dev/dev.c
index dc9b4aae3abb..2b38a99884f2 100644
--- a/drivers/net/can/dev/dev.c
+++ b/drivers/net/can/dev/dev.c
@@ -747,6 +747,7 @@ EXPORT_SYMBOL_GPL(alloc_can_err_skb);
struct net_device *alloc_candev_mqs(int sizeof_priv, unsigned int echo_skb_max,
unsigned int txqs, unsigned int rxqs)
{
+ struct can_ml_priv *can_ml;
struct net_device *dev;
struct can_priv *priv;
int size;
@@ -778,7 +779,8 @@ struct net_device *alloc_candev_mqs(int sizeof_priv, unsigned int echo_skb_max,
priv = netdev_priv(dev);
priv->dev = dev;

- dev->ml_priv = (void *)priv + ALIGN(sizeof_priv, NETDEV_ALIGN);
+ can_ml = (void *)priv + ALIGN(sizeof_priv, NETDEV_ALIGN);
+ can_set_ml_priv(dev, can_ml);

if (echo_skb_max) {
priv->echo_skb_max = echo_skb_max;
diff --git a/drivers/net/can/slcan.c b/drivers/net/can/slcan.c
index b4a39f0449ba..6471a71c2ee6 100644
--- a/drivers/net/can/slcan.c
+++ b/drivers/net/can/slcan.c
@@ -516,6 +516,7 @@ static struct slcan *slc_alloc(void)
int i;
char name[IFNAMSIZ];
struct net_device *dev = NULL;
+ struct can_ml_priv *can_ml;
struct slcan *sl;
int size;

@@ -538,7 +539,8 @@ static struct slcan *slc_alloc(void)

dev->base_addr = i;
sl = netdev_priv(dev);
- dev->ml_priv = (void *)sl + ALIGN(sizeof(*sl), NETDEV_ALIGN);
+ can_ml = (void *)sl + ALIGN(sizeof(*sl), NETDEV_ALIGN);
+ can_set_ml_priv(dev, can_ml);

/* Initialize channel control data */
sl->magic = SLCAN_MAGIC;
diff --git a/drivers/net/can/vcan.c b/drivers/net/can/vcan.c
index 39ca14b0585d..067705e2850b 100644
--- a/drivers/net/can/vcan.c
+++ b/drivers/net/can/vcan.c
@@ -153,7 +153,7 @@ static void vcan_setup(struct net_device *dev)
dev->addr_len = 0;
dev->tx_queue_len = 0;
dev->flags = IFF_NOARP;
- dev->ml_priv = netdev_priv(dev);
+ can_set_ml_priv(dev, netdev_priv(dev));

/* set flags according to driver capabilities */
if (echo)
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index b1baa4ac1d53..7000c6cd1e48 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -141,6 +141,8 @@ static const struct net_device_ops vxcan_netdev_ops = {

static void vxcan_setup(struct net_device *dev)
{
+ struct can_ml_priv *can_ml;
+
dev->type = ARPHRD_CAN;
dev->mtu = CANFD_MTU;
dev->hard_header_len = 0;
@@ -149,7 +151,9 @@ static void vxcan_setup(struct net_device *dev)
dev->flags = (IFF_NOARP|IFF_ECHO);
dev->netdev_ops = &vxcan_netdev_ops;
dev->needs_free_netdev = true;
- dev->ml_priv = netdev_priv(dev) + ALIGN(sizeof(struct vxcan_priv), NETDEV_ALIGN);
+
+ can_ml = netdev_priv(dev) + ALIGN(sizeof(struct vxcan_priv), NETDEV_ALIGN);
+ can_set_ml_priv(dev, can_ml);
}

/* forward declaration for rtnl_create_link() */
diff --git a/include/linux/can/can-ml.h b/include/linux/can/can-ml.h
index 2f5d731ae251..8afa92d15a66 100644
--- a/include/linux/can/can-ml.h
+++ b/include/linux/can/can-ml.h
@@ -44,6 +44,7 @@

#include <linux/can.h>
#include <linux/list.h>
+#include <linux/netdevice.h>

#define CAN_SFF_RCV_ARRAY_SZ (1 << CAN_SFF_ID_BITS)
#define CAN_EFF_RCV_HASH_BITS 10
@@ -65,4 +66,15 @@ struct can_ml_priv {
#endif
};

+static inline struct can_ml_priv *can_get_ml_priv(struct net_device *dev)
+{
+ return netdev_get_ml_priv(dev, ML_PRIV_CAN);
+}
+
+static inline void can_set_ml_priv(struct net_device *dev,
+ struct can_ml_priv *ml_priv)
+{
+ netdev_set_ml_priv(dev, ml_priv, ML_PRIV_CAN);
+}
+
#endif /* CAN_ML_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 8753e98a8d58..e37480b5f4c0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1600,6 +1600,12 @@ enum netdev_priv_flags {
#define IFF_L3MDEV_RX_HANDLER IFF_L3MDEV_RX_HANDLER
#define IFF_LIVE_RENAME_OK IFF_LIVE_RENAME_OK

+/* Specifies the type of the struct net_device::ml_priv pointer */
+enum netdev_ml_priv_type {
+ ML_PRIV_NONE,
+ ML_PRIV_CAN,
+};
+
/**
* struct net_device - The DEVICE structure.
*
@@ -1795,6 +1801,7 @@ enum netdev_priv_flags {
* @nd_net: Network namespace this network device is inside
*
* @ml_priv: Mid-layer private
+ * @ml_priv_type: Mid-layer private type
* @lstats: Loopback statistics
* @tstats: Tunnel statistics
* @dstats: Dummy statistics
@@ -2107,8 +2114,10 @@ struct net_device {
possible_net_t nd_net;

/* mid-layer private */
+ void *ml_priv;
+ enum netdev_ml_priv_type ml_priv_type;
+
union {
- void *ml_priv;
struct pcpu_lstats __percpu *lstats;
struct pcpu_sw_netstats __percpu *tstats;
struct pcpu_dstats __percpu *dstats;
@@ -2298,6 +2307,29 @@ static inline void netdev_reset_rx_headroom(struct net_device *dev)
netdev_set_rx_headroom(dev, -1);
}

+static inline void *netdev_get_ml_priv(struct net_device *dev,
+ enum netdev_ml_priv_type type)
+{
+ if (dev->ml_priv_type != type)
+ return NULL;
+
+ return dev->ml_priv;
+}
+
+static inline void netdev_set_ml_priv(struct net_device *dev,
+ void *ml_priv,
+ enum netdev_ml_priv_type type)
+{
+ WARN(dev->ml_priv_type && dev->ml_priv_type != type,
+ "Overwriting already set ml_priv_type (%u) with different ml_priv_type (%u)!\n",
+ dev->ml_priv_type, type);
+ WARN(!dev->ml_priv_type && dev->ml_priv,
+ "Overwriting already set ml_priv and ml_priv_type is ML_PRIV_NONE!\n");
+
+ dev->ml_priv = ml_priv;
+ dev->ml_priv_type = type;
+}
+
/*
* Net namespace inlines
*/
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 4c343b43067f..1c95ede2c9a6 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -304,8 +304,8 @@ static struct can_dev_rcv_lists *can_dev_rcv_lists_find(struct net *net,
struct net_device *dev)
{
if (dev) {
- struct can_ml_priv *ml_priv = dev->ml_priv;
- return &ml_priv->dev_rcv_lists;
+ struct can_ml_priv *can_ml = can_get_ml_priv(dev);
+ return &can_ml->dev_rcv_lists;
} else {
return net->can.rx_alldev_list;
}
@@ -790,25 +790,6 @@ void can_proto_unregister(const struct can_proto *cp)
}
EXPORT_SYMBOL(can_proto_unregister);

-/* af_can notifier to create/remove CAN netdevice specific structs */
-static int can_notifier(struct notifier_block *nb, unsigned long msg,
- void *ptr)
-{
- struct net_device *dev = netdev_notifier_info_to_dev(ptr);
-
- if (dev->type != ARPHRD_CAN)
- return NOTIFY_DONE;
-
- switch (msg) {
- case NETDEV_REGISTER:
- WARN(!dev->ml_priv,
- "No CAN mid layer private allocated, please fix your driver and use alloc_candev()!\n");
- break;
- }
-
- return NOTIFY_DONE;
-}
-
static int can_pernet_init(struct net *net)
{
spin_lock_init(&net->can.rcvlists_lock);
@@ -876,11 +857,6 @@ static const struct net_proto_family can_family_ops = {
.owner = THIS_MODULE,
};

-/* notifier block for netdevice event */
-static struct notifier_block can_netdev_notifier __read_mostly = {
- .notifier_call = can_notifier,
-};
-
static struct pernet_operations can_pernet_ops __read_mostly = {
.init = can_pernet_init,
.exit = can_pernet_exit,
@@ -911,17 +887,12 @@ static __init int can_init(void)
err = sock_register(&can_family_ops);
if (err)
goto out_sock;
- err = register_netdevice_notifier(&can_netdev_notifier);
- if (err)
- goto out_notifier;

dev_add_pack(&can_packet);
dev_add_pack(&canfd_packet);

return 0;

-out_notifier:
- sock_unregister(PF_CAN);
out_sock:
unregister_pernet_subsys(&can_pernet_ops);
out_pernet:
@@ -935,7 +906,6 @@ static __exit void can_exit(void)
/* protocol unregister */
dev_remove_pack(&canfd_packet);
dev_remove_pack(&can_packet);
- unregister_netdevice_notifier(&can_netdev_notifier);
sock_unregister(PF_CAN);

unregister_pernet_subsys(&can_pernet_ops);
diff --git a/net/can/j1939/main.c b/net/can/j1939/main.c
index 137054bff9ec..e52330f628c9 100644
--- a/net/can/j1939/main.c
+++ b/net/can/j1939/main.c
@@ -140,9 +140,9 @@ static struct j1939_priv *j1939_priv_create(struct net_device *ndev)
static inline void j1939_priv_set(struct net_device *ndev,
struct j1939_priv *priv)
{
- struct can_ml_priv *can_ml_priv = ndev->ml_priv;
+ struct can_ml_priv *can_ml = can_get_ml_priv(ndev);

- can_ml_priv->j1939_priv = priv;
+ can_ml->j1939_priv = priv;
}

static void __j1939_priv_release(struct kref *kref)
@@ -211,12 +211,9 @@ static void __j1939_rx_release(struct kref *kref)
/* get pointer to priv without increasing ref counter */
static inline struct j1939_priv *j1939_ndev_to_priv(struct net_device *ndev)
{
- struct can_ml_priv *can_ml_priv = ndev->ml_priv;
+ struct can_ml_priv *can_ml = can_get_ml_priv(ndev);

- if (!can_ml_priv)
- return NULL;
-
- return can_ml_priv->j1939_priv;
+ return can_ml->j1939_priv;
}

static struct j1939_priv *j1939_priv_get_by_ndev_locked(struct net_device *ndev)
@@ -225,9 +222,6 @@ static struct j1939_priv *j1939_priv_get_by_ndev_locked(struct net_device *ndev)

lockdep_assert_held(&j1939_netdev_lock);

- if (ndev->type != ARPHRD_CAN)
- return NULL;
-
priv = j1939_ndev_to_priv(ndev);
if (priv)
j1939_priv_get(priv);
@@ -348,15 +342,16 @@ static int j1939_netdev_notify(struct notifier_block *nb,
unsigned long msg, void *data)
{
struct net_device *ndev = netdev_notifier_info_to_dev(data);
+ struct can_ml_priv *can_ml = can_get_ml_priv(ndev);
struct j1939_priv *priv;

+ if (!can_ml)
+ goto notify_done;
+
priv = j1939_priv_get_by_ndev(ndev);
if (!priv)
goto notify_done;

- if (ndev->type != ARPHRD_CAN)
- goto notify_put;
-
switch (msg) {
case NETDEV_DOWN:
j1939_cancel_active_session(priv, NULL);
@@ -365,7 +360,6 @@ static int j1939_netdev_notify(struct notifier_block *nb,
break;
}

-notify_put:
j1939_priv_put(priv);

notify_done:
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index f23966526a88..56aa66147d5a 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -12,6 +12,7 @@

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/can/can-ml.h>
#include <linux/can/core.h>
#include <linux/can/skb.h>
#include <linux/errqueue.h>
@@ -453,6 +454,7 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len)
j1939_jsk_del(priv, jsk);
j1939_local_ecu_put(priv, jsk->addr.src_name, jsk->addr.sa);
} else {
+ struct can_ml_priv *can_ml;
struct net_device *ndev;

ndev = dev_get_by_index(net, addr->can_ifindex);
@@ -461,15 +463,8 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len)
goto out_release_sock;
}

- if (ndev->type != ARPHRD_CAN) {
- dev_put(ndev);
- ret = -ENODEV;
- goto out_release_sock;
- }
-
- if (!ndev->ml_priv) {
- netdev_warn_once(ndev,
- "No CAN mid layer private allocated, please fix your driver and use alloc_candev()!\n");
+ can_ml = can_get_ml_priv(ndev);
+ if (!can_ml) {
dev_put(ndev);
ret = -ENODEV;
goto out_release_sock;
diff --git a/net/can/proc.c b/net/can/proc.c
index 5ea8695f507e..b15760b5c1cc 100644
--- a/net/can/proc.c
+++ b/net/can/proc.c
@@ -322,8 +322,11 @@ static int can_rcvlist_proc_show(struct seq_file *m, void *v)

/* receive list for registered CAN devices */
for_each_netdev_rcu(net, dev) {
- if (dev->type == ARPHRD_CAN && dev->ml_priv)
- can_rcvlist_proc_show_one(m, idx, dev, dev->ml_priv);
+ struct can_ml_priv *can_ml = can_get_ml_priv(dev);
+
+ if (can_ml)
+ can_rcvlist_proc_show_one(m, idx, dev,
+ &can_ml->dev_rcv_lists);
}

rcu_read_unlock();
@@ -375,8 +378,10 @@ static int can_rcvlist_sff_proc_show(struct seq_file *m, void *v)

/* sff receive list for registered CAN devices */
for_each_netdev_rcu(net, dev) {
- if (dev->type == ARPHRD_CAN && dev->ml_priv) {
- dev_rcv_lists = dev->ml_priv;
+ struct can_ml_priv *can_ml = can_get_ml_priv(dev);
+
+ if (can_ml) {
+ dev_rcv_lists = &can_ml->dev_rcv_lists;
can_rcvlist_proc_show_array(m, dev, dev_rcv_lists->rx_sff,
ARRAY_SIZE(dev_rcv_lists->rx_sff));
}
@@ -406,8 +411,10 @@ static int can_rcvlist_eff_proc_show(struct seq_file *m, void *v)

/* eff receive list for registered CAN devices */
for_each_netdev_rcu(net, dev) {
- if (dev->type == ARPHRD_CAN && dev->ml_priv) {
- dev_rcv_lists = dev->ml_priv;
+ struct can_ml_priv *can_ml = can_get_ml_priv(dev);
+
+ if (can_ml) {
+ dev_rcv_lists = &can_ml->dev_rcv_lists;
can_rcvlist_proc_show_array(m, dev, dev_rcv_lists->rx_eff,
ARRAY_SIZE(dev_rcv_lists->rx_eff));
}
--
2.30.1



2021-04-05 16:34:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 073/126] drm/amdkfd: dqm fence memory corruption

From: Qu Huang <[email protected]>

commit e92049ae4548ba09e53eaa9c8f6964b07ea274c9 upstream.

The amdgpu driver uses a 4-byte data type as DQM fence memory,
and transmits the GPU address of the fence memory to the microcode
through the query status PM4 message. However, the query status
PM4 message definition and the microcode processing are all
based on 8 bytes. The fence memory only allocates 4 bytes,
but the microcode writes 8 bytes, so there is memory corruption.
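
Illustration of the mismatch (simplified, not the driver code; the
value being written is arbitrary):

  u32 *fence = kzalloc(sizeof(u32), GFP_KERNEL);  /* 4 bytes */

  /* the microcode treats the same GPU address as 64 bits wide: */
  *(u64 *)fence = fence_value;  /* 8-byte write: heap corruption */

  /* the fix widens the allocation and pointers to 64 bits throughout */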

Changes since v1:
* Change dqm->fence_addr as a u64 pointer to fix this issue,
also fix up query_status and amdkfd_fence_wait_timeout function
uses 64 bit fence value to make them consistent.

Signed-off-by: Qu Huang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6 +++---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 8 ++++----
7 files changed, 12 insertions(+), 12 deletions(-)

--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -155,7 +155,7 @@ static int dbgdev_diq_submit_ib(struct k

/* Wait till CP writes sync code: */
status = amdkfd_fence_wait_timeout(
- (unsigned int *) rm_state,
+ rm_state,
QUEUESTATE__ACTIVE, 1500);

kfd_gtt_sa_free(dbgdev->dev, mem_obj);
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1167,7 +1167,7 @@ static int start_cpsch(struct device_que
if (retval)
goto fail_allocate_vidmem;

- dqm->fence_addr = dqm->fence_mem->cpu_ptr;
+ dqm->fence_addr = (uint64_t *)dqm->fence_mem->cpu_ptr;
dqm->fence_gpu_addr = dqm->fence_mem->gpu_addr;

init_interrupts(dqm);
@@ -1340,8 +1340,8 @@ out:
return retval;
}

-int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
- unsigned int fence_value,
+int amdkfd_fence_wait_timeout(uint64_t *fence_addr,
+ uint64_t fence_value,
unsigned int timeout_ms)
{
unsigned long end_jiffies = msecs_to_jiffies(timeout_ms) + jiffies;
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -192,7 +192,7 @@ struct device_queue_manager {
uint16_t vmid_pasid[VMID_NUM];
uint64_t pipelines_addr;
uint64_t fence_gpu_addr;
- unsigned int *fence_addr;
+ uint64_t *fence_addr;
struct kfd_mem_obj *fence_mem;
bool active_runlist;
int sched_policy;
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -345,7 +345,7 @@ fail_create_runlist_ib:
}

int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
- uint32_t fence_value)
+ uint64_t fence_value)
{
uint32_t *buffer, size;
int retval = 0;
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
@@ -283,7 +283,7 @@ static int pm_unmap_queues_v9(struct pac
}

static int pm_query_status_v9(struct packet_manager *pm, uint32_t *buffer,
- uint64_t fence_address, uint32_t fence_value)
+ uint64_t fence_address, uint64_t fence_value)
{
struct pm4_mes_query_status *packet;

--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c
@@ -263,7 +263,7 @@ static int pm_unmap_queues_vi(struct pac
}

static int pm_query_status_vi(struct packet_manager *pm, uint32_t *buffer,
- uint64_t fence_address, uint32_t fence_value)
+ uint64_t fence_address, uint64_t fence_value)
{
struct pm4_mes_query_status *packet;

--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1006,8 +1006,8 @@ int pqm_get_wave_state(struct process_qu
u32 *ctl_stack_used_size,
u32 *save_area_used_size);

-int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
- unsigned int fence_value,
+int amdkfd_fence_wait_timeout(uint64_t *fence_addr,
+ uint64_t fence_value,
unsigned int timeout_ms);

/* Packet Manager */
@@ -1043,7 +1043,7 @@ struct packet_manager_funcs {
uint32_t filter_param, bool reset,
unsigned int sdma_engine);
int (*query_status)(struct packet_manager *pm, uint32_t *buffer,
- uint64_t fence_address, uint32_t fence_value);
+ uint64_t fence_address, uint64_t fence_value);
int (*release_mem)(uint64_t gpu_addr, uint32_t *buffer);

/* Packet sizes */
@@ -1065,7 +1065,7 @@ int pm_send_set_resources(struct packet_
struct scheduling_resources *res);
int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues);
int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
- uint32_t fence_value);
+ uint64_t fence_value);

int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
enum kfd_unmap_queues_filter mode,


2021-04-05 16:34:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 042/126] brcmfmac: clear EAP/association status bits on linkdown events

From: Luca Pesce <[email protected]>

[ Upstream commit e862a3e4088070de352fdafe9bd9e3ae0a95a33c ]

This ensures that previous association attempts do not leave stale
statuses behind for subsequent attempts.

This fixes the WARN_ON(!cr->bss) from __cfg80211_connect_result() when
connecting to an AP after a previous connection failure (e.g. where EAP fails
due to an incorrect psk but association succeeded). In some scenarios, indeed,
brcmf_is_linkup() was reporting a link up event too early due to a stale
BRCMF_VIF_STATUS_ASSOC_SUCCESS bit, thus reporting to cfg80211 a connection
result with a zeroed bssid (vif->profile.bssid is still empty) and causing the
WARN_ON due to the call to cfg80211_get_bss() with the empty bssid.

Signed-off-by: Luca Pesce <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
.../net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
index 0ee421f30aa2..23e6422c2251 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/cfg80211.c
@@ -5611,7 +5611,8 @@ static bool brcmf_is_linkup(struct brcmf_cfg80211_vif *vif,
return false;
}

-static bool brcmf_is_linkdown(const struct brcmf_event_msg *e)
+static bool brcmf_is_linkdown(struct brcmf_cfg80211_vif *vif,
+ const struct brcmf_event_msg *e)
{
u32 event = e->event_code;
u16 flags = e->flags;
@@ -5620,6 +5621,8 @@ static bool brcmf_is_linkdown(const struct brcmf_event_msg *e)
(event == BRCMF_E_DISASSOC_IND) ||
((event == BRCMF_E_LINK) && (!(flags & BRCMF_EVENT_MSG_LINK)))) {
brcmf_dbg(CONN, "Processing link down\n");
+ clear_bit(BRCMF_VIF_STATUS_EAP_SUCCESS, &vif->sme_state);
+ clear_bit(BRCMF_VIF_STATUS_ASSOC_SUCCESS, &vif->sme_state);
return true;
}
return false;
@@ -6067,7 +6070,7 @@ brcmf_notify_connect_status(struct brcmf_if *ifp,
} else
brcmf_bss_connect_done(cfg, ndev, e, true);
brcmf_net_setcarrier(ifp, true);
- } else if (brcmf_is_linkdown(e)) {
+ } else if (brcmf_is_linkdown(ifp->vif, e)) {
brcmf_dbg(CONN, "Linkdown\n");
if (!brcmf_is_ibssmode(ifp->vif) &&
test_bit(BRCMF_VIF_STATUS_CONNECTED,
--
2.30.1



2021-04-05 16:35:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 082/126] KVM: x86/mmu: change TDP MMU yield function returns to match cond_resched

From: Ben Gardon <[email protected]>

[ Upstream commit e28a436ca4f65384cceaf3f4da0e00aa74244e6a ]

Currently the TDP MMU yield / cond_resched functions either return
nothing or return true if the TLBs were not flushed. These are confusing
semantics, especially when making control flow decisions in calling
functions.

To clean things up, change both functions to have the same
return value semantics as cond_resched: true if the thread yielded,
false if it did not. If the function yielded in the _flush_ version,
then the TLBs will have been flushed.
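
A sketch of the calling pattern this enables (simplified from the
functions touched in the diff below):

  tdp_root_for_each_pte(iter, root, start, end) {
          if (tdp_mmu_iter_flush_cond_resched(kvm, &iter))
                  continue;  /* yielded: iterator reset, TLBs flushed */

          /* ... operate on iter.old_spte ... */
  }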

Reviewed-by: Peter Feiner <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 39 ++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index ffa0bd0e033f..22efd016f05e 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -405,8 +405,15 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
_mmu->shadow_root_level, _start, _end)

/*
- * Flush the TLB if the process should drop kvm->mmu_lock.
- * Return whether the caller still needs to flush the tlb.
+ * Flush the TLB and yield if the MMU lock is contended or this thread needs to
+ * return control to the scheduler.
+ *
+ * If this function yields, it will also reset the tdp_iter's walk over the
+ * paging structure and the calling function should allow the iterator to
+ * continue its traversal from the paging structure root.
+ *
+ * Return true if this function yielded, the TLBs were flushed, and the
+ * iterator's traversal was reset. Return false if a yield was not needed.
*/
static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
{
@@ -414,18 +421,32 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it
kvm_flush_remote_tlbs(kvm);
cond_resched_lock(&kvm->mmu_lock);
tdp_iter_refresh_walk(iter);
- return false;
- } else {
return true;
}
+
+ return false;
}

-static void tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
+/*
+ * Yield if the MMU lock is contended or this thread needs to return control
+ * to the scheduler.
+ *
+ * If this function yields, it will also reset the tdp_iter's walk over the
+ * paging structure and the calling function should allow the iterator to
+ * continue its traversal from the paging structure root.
+ *
+ * Return true if this function yielded and the iterator's traversal was reset.
+ * Return false if a yield was not needed.
+ */
+static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
{
if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
cond_resched_lock(&kvm->mmu_lock);
tdp_iter_refresh_walk(iter);
+ return true;
}
+
+ return false;
}

/*
@@ -461,10 +482,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,

tdp_mmu_set_spte(kvm, &iter, 0);

- if (can_yield)
- flush_needed = tdp_mmu_iter_flush_cond_resched(kvm, &iter);
- else
- flush_needed = true;
+ flush_needed = !can_yield ||
+ !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
}
return flush_needed;
}
@@ -1061,7 +1080,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,

tdp_mmu_set_spte(kvm, &iter, 0);

- spte_set = tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+ spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
}

if (spte_set)
--
2.30.1



2021-04-05 16:35:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 083/126] KVM: x86/mmu: Merge flush and non-flush tdp_mmu_iter_cond_resched

From: Ben Gardon <[email protected]>

[ Upstream commit e139a34ef9d5627a41e1c02210229082140d1f92 ]

The flushing and non-flushing variants of tdp_mmu_iter_cond_resched have
almost identical implementations. Merge the two functions and add a
flush parameter.

Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 42 ++++++++++++--------------------------
1 file changed, 13 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 22efd016f05e..3b14d0008f92 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -404,33 +404,13 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
for_each_tdp_pte(_iter, __va(_mmu->root_hpa), \
_mmu->shadow_root_level, _start, _end)

-/*
- * Flush the TLB and yield if the MMU lock is contended or this thread needs to
- * return control to the scheduler.
- *
- * If this function yields, it will also reset the tdp_iter's walk over the
- * paging structure and the calling function should allow the iterator to
- * continue its traversal from the paging structure root.
- *
- * Return true if this function yielded, the TLBs were flushed, and the
- * iterator's traversal was reset. Return false if a yield was not needed.
- */
-static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
-{
- if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
- kvm_flush_remote_tlbs(kvm);
- cond_resched_lock(&kvm->mmu_lock);
- tdp_iter_refresh_walk(iter);
- return true;
- }
-
- return false;
-}
-
/*
* Yield if the MMU lock is contended or this thread needs to return control
* to the scheduler.
*
+ * If this function should yield and flush is set, it will perform a remote
+ * TLB flush before yielding.
+ *
* If this function yields, it will also reset the tdp_iter's walk over the
* paging structure and the calling function should allow the iterator to
* continue its traversal from the paging structure root.
@@ -438,9 +418,13 @@ static bool tdp_mmu_iter_flush_cond_resched(struct kvm *kvm, struct tdp_iter *it
* Return true if this function yielded and the iterator's traversal was reset.
* Return false if a yield was not needed.
*/
-static bool tdp_mmu_iter_cond_resched(struct kvm *kvm, struct tdp_iter *iter)
+static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm,
+ struct tdp_iter *iter, bool flush)
{
if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+ if (flush)
+ kvm_flush_remote_tlbs(kvm);
+
cond_resched_lock(&kvm->mmu_lock);
tdp_iter_refresh_walk(iter);
return true;
@@ -483,7 +467,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte(kvm, &iter, 0);

flush_needed = !can_yield ||
- !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+ !tdp_mmu_iter_cond_resched(kvm, &iter, true);
}
return flush_needed;
}
@@ -852,7 +836,7 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
spte_set = true;

- tdp_mmu_iter_cond_resched(kvm, &iter);
+ tdp_mmu_iter_cond_resched(kvm, &iter, false);
}
return spte_set;
}
@@ -911,7 +895,7 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
spte_set = true;

- tdp_mmu_iter_cond_resched(kvm, &iter);
+ tdp_mmu_iter_cond_resched(kvm, &iter, false);
}
return spte_set;
}
@@ -1027,7 +1011,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte(kvm, &iter, new_spte);
spte_set = true;

- tdp_mmu_iter_cond_resched(kvm, &iter);
+ tdp_mmu_iter_cond_resched(kvm, &iter, false);
}

return spte_set;
@@ -1080,7 +1064,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,

tdp_mmu_set_spte(kvm, &iter, 0);

- spte_set = !tdp_mmu_iter_flush_cond_resched(kvm, &iter);
+ spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter, true);
}

if (spte_set)
--
2.30.1



2021-04-05 16:35:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 085/126] KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter

From: Ben Gardon <[email protected]>

[ Upstream commit ed5e484b79e8a9b8be714bd85b6fc70bd6dc99a7 ]

In some functions the TDP iter risks not making forward progress if two
threads livelock yielding to one another. This is possible if two threads
are trying to execute wrprot_gfn_range. Each could write protect an entry
and then yield. This would reset the tdp_iter's walk over the paging
structure and the loop would end up repeating the same entry over and
over, preventing either thread from making forward progress.

Fix this issue by only yielding if the loop has made forward progress
since the last yield.

Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Reviewed-by: Peter Feiner <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>

Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_iter.c | 18 +-----------------
arch/x86/kvm/mmu/tdp_iter.h | 7 ++++++-
arch/x86/kvm/mmu/tdp_mmu.c | 21 ++++++++++++++++-----
3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index 9917c55b7d24..1a09d212186b 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -31,6 +31,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
WARN_ON(root_level > PT64_ROOT_MAX_LEVEL);

iter->next_last_level_gfn = next_last_level_gfn;
+ iter->yielded_gfn = iter->next_last_level_gfn;
iter->root_level = root_level;
iter->min_level = min_level;
iter->level = root_level;
@@ -158,23 +159,6 @@ void tdp_iter_next(struct tdp_iter *iter)
iter->valid = false;
}

-/*
- * Restart the walk over the paging structure from the root, starting from the
- * highest gfn the iterator had previously reached. Assumes that the entire
- * paging structure, except the root page, may have been completely torn down
- * and rebuilt.
- */
-void tdp_iter_refresh_walk(struct tdp_iter *iter)
-{
- gfn_t next_last_level_gfn = iter->next_last_level_gfn;
-
- if (iter->gfn > next_last_level_gfn)
- next_last_level_gfn = iter->gfn;
-
- tdp_iter_start(iter, iter->pt_path[iter->root_level - 1],
- iter->root_level, iter->min_level, next_last_level_gfn);
-}
-
u64 *tdp_iter_root_pt(struct tdp_iter *iter)
{
return iter->pt_path[iter->root_level - 1];
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index b2dd269c631f..d480c540ee27 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -16,6 +16,12 @@ struct tdp_iter {
* for this GFN.
*/
gfn_t next_last_level_gfn;
+ /*
+ * The next_last_level_gfn at the time when the thread last
+ * yielded. Only yielding when the next_last_level_gfn !=
+ * yielded_gfn helps ensure forward progress.
+ */
+ gfn_t yielded_gfn;
/* Pointers to the page tables traversed to reach the current SPTE */
u64 *pt_path[PT64_ROOT_MAX_LEVEL];
/* A pointer to the current SPTE */
@@ -54,7 +60,6 @@ u64 *spte_to_child_pt(u64 pte, int level);
void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
int min_level, gfn_t next_last_level_gfn);
void tdp_iter_next(struct tdp_iter *iter);
-void tdp_iter_refresh_walk(struct tdp_iter *iter);
u64 *tdp_iter_root_pt(struct tdp_iter *iter);

#endif /* __KVM_X86_MMU_TDP_ITER_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 3b14d0008f92..f0bc5d3ce3d4 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -412,8 +412,9 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
* TLB flush before yielding.
*
* If this function yields, it will also reset the tdp_iter's walk over the
- * paging structure and the calling function should allow the iterator to
- * continue its traversal from the paging structure root.
+ * paging structure and the calling function should skip to the next
+ * iteration to allow the iterator to continue its traversal from the
+ * paging structure root.
*
* Return true if this function yielded and the iterator's traversal was reset.
* Return false if a yield was not needed.
@@ -421,12 +422,22 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm,
static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm,
struct tdp_iter *iter, bool flush)
{
+ /* Ensure forward progress has been made before yielding. */
+ if (iter->next_last_level_gfn == iter->yielded_gfn)
+ return false;
+
if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
if (flush)
kvm_flush_remote_tlbs(kvm);

cond_resched_lock(&kvm->mmu_lock);
- tdp_iter_refresh_walk(iter);
+
+ WARN_ON(iter->gfn > iter->next_last_level_gfn);
+
+ tdp_iter_start(iter, iter->pt_path[iter->root_level - 1],
+ iter->root_level, iter->min_level,
+ iter->next_last_level_gfn);
+
return true;
}

@@ -466,8 +477,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,

tdp_mmu_set_spte(kvm, &iter, 0);

- flush_needed = !can_yield ||
- !tdp_mmu_iter_cond_resched(kvm, &iter, true);
+ flush_needed = !(can_yield &&
+ tdp_mmu_iter_cond_resched(kvm, &iter, true));
}
return flush_needed;
}
--
2.30.1



2021-04-05 16:35:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 090/126] KVM: x86/mmu: Protect TDP MMU page table memory with RCU

From: Ben Gardon <[email protected]>

[ Upstream commit 7cca2d0b7e7d9f3cd740d41afdc00051c9b508a0 ]

In order to enable concurrent modifications to the paging structures in
the TDP MMU, threads must be able to safely remove pages of page table
memory while other threads are traversing the same memory. To ensure
threads do not access PT memory after it is freed, protect PT memory
with RCU.

Protecting concurrent accesses to page table memory from use-after-free
bugs could also have been accomplished using
walk_shadow_page_lockless_begin/end() and READING_SHADOW_PAGE_TABLES,
coupled with the barriers in a TLB flush. The use of RCU for this case
has several distinct advantages over that approach.
1. Disabling interrupts for long running operations is not desirable.
Future commits will allow operations besides page faults to operate
without the exclusive protection of the MMU lock and those operations
are too long to disable interrupts for their duration.
2. The use of RCU here avoids long blocking / spinning operations in
performance-critical paths. By freeing memory with an asynchronous
RCU API we avoid the longer wait times TLB flushes experience when
overlapping with a thread in walk_shadow_page_lockless_begin/end().
3. RCU provides a separation of concerns when removing memory from the
paging structure. Because the RCU callback to free memory can be
scheduled immediately after a TLB flush, there's no need for the
thread to manually free a queue of pages later, as commit_zap_pages
does.
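
A rough sketch of the resulting pattern, condensed from the hunks
below: writers defer the actual free through call_rcu(), and lockless
walkers only touch page table memory inside an RCU read-side section:

	/* Writer side: free page table memory only after a grace period. */
	static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
	{
		struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page,
						       rcu_head);

		free_page((unsigned long)sp->spt);
		kmem_cache_free(mmu_page_header_cache, sp);
	}

	/* ...after unlinking a page from the paging structure: */
	call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);

	/* Reader side: pages queued with call_rcu() cannot be freed
	 * while the traversal is inside the read-side section. */
	rcu_read_lock();
	tdp_root_for_each_pte(iter, root, start, end) {
		/* ... */
	}
	rcu_read_unlock();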

Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU")
Reviewed-by: Peter Feiner <[email protected]>
Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>

Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/mmu_internal.h | 3 ++
arch/x86/kvm/mmu/tdp_iter.c | 16 +++---
arch/x86/kvm/mmu/tdp_iter.h | 10 ++--
arch/x86/kvm/mmu/tdp_mmu.c | 95 +++++++++++++++++++++++++++++----
4 files changed, 103 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index bfc6389edc28..7f599cc64178 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -57,6 +57,9 @@ struct kvm_mmu_page {
atomic_t write_flooding_count;

bool tdp_mmu_page;
+
+ /* Used for freeing the page asynchronously if it is a TDP MMU page. */
+ struct rcu_head rcu_head;
};

extern struct kmem_cache *mmu_page_header_cache;
diff --git a/arch/x86/kvm/mmu/tdp_iter.c b/arch/x86/kvm/mmu/tdp_iter.c
index 1a09d212186b..e5f148106e20 100644
--- a/arch/x86/kvm/mmu/tdp_iter.c
+++ b/arch/x86/kvm/mmu/tdp_iter.c
@@ -12,7 +12,7 @@ static void tdp_iter_refresh_sptep(struct tdp_iter *iter)
{
iter->sptep = iter->pt_path[iter->level - 1] +
SHADOW_PT_INDEX(iter->gfn << PAGE_SHIFT, iter->level);
- iter->old_spte = READ_ONCE(*iter->sptep);
+ iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep));
}

static gfn_t round_gfn_for_level(gfn_t gfn, int level)
@@ -35,7 +35,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
iter->root_level = root_level;
iter->min_level = min_level;
iter->level = root_level;
- iter->pt_path[iter->level - 1] = root_pt;
+ iter->pt_path[iter->level - 1] = (tdp_ptep_t)root_pt;

iter->gfn = round_gfn_for_level(iter->next_last_level_gfn, iter->level);
tdp_iter_refresh_sptep(iter);
@@ -48,7 +48,7 @@ void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
* address of the child page table referenced by the SPTE. Returns null if
* there is no such entry.
*/
-u64 *spte_to_child_pt(u64 spte, int level)
+tdp_ptep_t spte_to_child_pt(u64 spte, int level)
{
/*
* There's no child entry if this entry isn't present or is a
@@ -57,7 +57,7 @@ u64 *spte_to_child_pt(u64 spte, int level)
if (!is_shadow_present_pte(spte) || is_last_spte(spte, level))
return NULL;

- return __va(spte_to_pfn(spte) << PAGE_SHIFT);
+ return (tdp_ptep_t)__va(spte_to_pfn(spte) << PAGE_SHIFT);
}

/*
@@ -66,7 +66,7 @@ u64 *spte_to_child_pt(u64 spte, int level)
*/
static bool try_step_down(struct tdp_iter *iter)
{
- u64 *child_pt;
+ tdp_ptep_t child_pt;

if (iter->level == iter->min_level)
return false;
@@ -75,7 +75,7 @@ static bool try_step_down(struct tdp_iter *iter)
* Reread the SPTE before stepping down to avoid traversing into page
* tables that are no longer linked from this entry.
*/
- iter->old_spte = READ_ONCE(*iter->sptep);
+ iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep));

child_pt = spte_to_child_pt(iter->old_spte, iter->level);
if (!child_pt)
@@ -109,7 +109,7 @@ static bool try_step_side(struct tdp_iter *iter)
iter->gfn += KVM_PAGES_PER_HPAGE(iter->level);
iter->next_last_level_gfn = iter->gfn;
iter->sptep++;
- iter->old_spte = READ_ONCE(*iter->sptep);
+ iter->old_spte = READ_ONCE(*rcu_dereference(iter->sptep));

return true;
}
@@ -159,7 +159,7 @@ void tdp_iter_next(struct tdp_iter *iter)
iter->valid = false;
}

-u64 *tdp_iter_root_pt(struct tdp_iter *iter)
+tdp_ptep_t tdp_iter_root_pt(struct tdp_iter *iter)
{
return iter->pt_path[iter->root_level - 1];
}
diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index d480c540ee27..4cc177d75c4a 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -7,6 +7,8 @@

#include "mmu.h"

+typedef u64 __rcu *tdp_ptep_t;
+
/*
* A TDP iterator performs a pre-order walk over a TDP paging structure.
*/
@@ -23,9 +25,9 @@ struct tdp_iter {
*/
gfn_t yielded_gfn;
/* Pointers to the page tables traversed to reach the current SPTE */
- u64 *pt_path[PT64_ROOT_MAX_LEVEL];
+ tdp_ptep_t pt_path[PT64_ROOT_MAX_LEVEL];
/* A pointer to the current SPTE */
- u64 *sptep;
+ tdp_ptep_t sptep;
/* The lowest GFN mapped by the current SPTE */
gfn_t gfn;
/* The level of the root page given to the iterator */
@@ -55,11 +57,11 @@ struct tdp_iter {
#define for_each_tdp_pte(iter, root, root_level, start, end) \
for_each_tdp_pte_min_level(iter, root, root_level, PG_LEVEL_4K, start, end)

-u64 *spte_to_child_pt(u64 pte, int level);
+tdp_ptep_t spte_to_child_pt(u64 pte, int level);

void tdp_iter_start(struct tdp_iter *iter, u64 *root_pt, int root_level,
int min_level, gfn_t next_last_level_gfn);
void tdp_iter_next(struct tdp_iter *iter);
-u64 *tdp_iter_root_pt(struct tdp_iter *iter);
+tdp_ptep_t tdp_iter_root_pt(struct tdp_iter *iter);

#endif /* __KVM_X86_MMU_TDP_ITER_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f52a22bc0fe8..a54a9ed979d1 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -42,6 +42,12 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
return;

WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
+
+ /*
+ * Ensure that all the outstanding RCU callbacks to free shadow pages
+ * can run before the VM is torn down.
+ */
+ rcu_barrier();
}

static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
@@ -196,6 +202,28 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
return __pa(root->spt);
}

+static void tdp_mmu_free_sp(struct kvm_mmu_page *sp)
+{
+ free_page((unsigned long)sp->spt);
+ kmem_cache_free(mmu_page_header_cache, sp);
+}
+
+/*
+ * This is called through call_rcu in order to free TDP page table memory
+ * safely with respect to other kernel threads that may be operating on
+ * the memory.
+ * By only accessing TDP MMU page table memory in an RCU read critical
+ * section, and freeing it after a grace period, lockless access to that
+ * memory won't use it after it is freed.
+ */
+static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
+{
+ struct kvm_mmu_page *sp = container_of(head, struct kvm_mmu_page,
+ rcu_head);
+
+ tdp_mmu_free_sp(sp);
+}
+
static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
u64 old_spte, u64 new_spte, int level);

@@ -269,8 +297,7 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
kvm_flush_remote_tlbs_with_address(kvm, gfn,
KVM_PAGES_PER_HPAGE(level));

- free_page((unsigned long)pt);
- kmem_cache_free(mmu_page_header_cache, sp);
+ call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
}

/**
@@ -372,13 +399,13 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
u64 new_spte, bool record_acc_track,
bool record_dirty_log)
{
- u64 *root_pt = tdp_iter_root_pt(iter);
+ tdp_ptep_t root_pt = tdp_iter_root_pt(iter);
struct kvm_mmu_page *root = sptep_to_sp(root_pt);
int as_id = kvm_mmu_page_as_id(root);

lockdep_assert_held(&kvm->mmu_lock);

- WRITE_ONCE(*iter->sptep, new_spte);
+ WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte);

__handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
iter->level);
@@ -448,10 +475,13 @@ static inline bool tdp_mmu_iter_cond_resched(struct kvm *kvm,
return false;

if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+ rcu_read_unlock();
+
if (flush)
kvm_flush_remote_tlbs(kvm);

cond_resched_lock(&kvm->mmu_lock);
+ rcu_read_lock();

WARN_ON(iter->gfn > iter->next_last_level_gfn);

@@ -482,6 +512,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
struct tdp_iter iter;
bool flush_needed = false;

+ rcu_read_lock();
+
tdp_root_for_each_pte(iter, root, start, end) {
if (can_yield &&
tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) {
@@ -505,6 +537,8 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte(kvm, &iter, 0);
flush_needed = true;
}
+
+ rcu_read_unlock();
return flush_needed;
}

@@ -550,13 +584,15 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,

if (unlikely(is_noslot_pfn(pfn))) {
new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
- trace_mark_mmio_spte(iter->sptep, iter->gfn, new_spte);
+ trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
+ new_spte);
} else {
make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
pfn, iter->old_spte, prefault, true,
map_writable, !shadow_accessed_mask,
&new_spte);
- trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep);
+ trace_kvm_mmu_set_spte(iter->level, iter->gfn,
+ rcu_dereference(iter->sptep));
}

if (new_spte == iter->old_spte)
@@ -579,7 +615,8 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
if (unlikely(is_mmio_spte(new_spte)))
ret = RET_PF_EMULATE;

- trace_kvm_mmu_set_spte(iter->level, iter->gfn, iter->sptep);
+ trace_kvm_mmu_set_spte(iter->level, iter->gfn,
+ rcu_dereference(iter->sptep));
if (!prefault)
vcpu->stat.pf_fixed++;

@@ -617,6 +654,9 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
huge_page_disallowed, &req_level);

trace_kvm_mmu_spte_requested(gpa, level, pfn);
+
+ rcu_read_lock();
+
tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
if (nx_huge_page_workaround_enabled)
disallowed_hugepage_adjust(iter.old_spte, gfn,
@@ -642,7 +682,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
* because the new value informs the !present
* path below.
*/
- iter.old_spte = READ_ONCE(*iter.sptep);
+ iter.old_spte = READ_ONCE(*rcu_dereference(iter.sptep));
}

if (!is_shadow_present_pte(iter.old_spte)) {
@@ -661,11 +701,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
}
}

- if (WARN_ON(iter.level != level))
+ if (WARN_ON(iter.level != level)) {
+ rcu_read_unlock();
return RET_PF_RETRY;
+ }

ret = tdp_mmu_map_handle_target_level(vcpu, write, map_writable, &iter,
pfn, prefault);
+ rcu_read_unlock();

return ret;
}
@@ -736,6 +779,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,
int young = 0;
u64 new_spte = 0;

+ rcu_read_lock();
+
tdp_root_for_each_leaf_pte(iter, root, start, end) {
/*
* If we have a non-accessed entry we don't need to change the
@@ -767,6 +812,8 @@ static int age_gfn_range(struct kvm *kvm, struct kvm_memory_slot *slot,
trace_kvm_age_page(iter.gfn, iter.level, slot, young);
}

+ rcu_read_unlock();
+
return young;
}

@@ -812,6 +859,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot,
u64 new_spte;
int need_flush = 0;

+ rcu_read_lock();
+
WARN_ON(pte_huge(*ptep));

new_pfn = pte_pfn(*ptep);
@@ -840,6 +889,8 @@ static int set_tdp_spte(struct kvm *kvm, struct kvm_memory_slot *slot,
if (need_flush)
kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);

+ rcu_read_unlock();
+
return 0;
}

@@ -863,6 +914,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
u64 new_spte;
bool spte_set = false;

+ rcu_read_lock();
+
BUG_ON(min_level > KVM_MAX_HUGEPAGE_LEVEL);

for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
@@ -879,6 +932,8 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
spte_set = true;
}
+
+ rcu_read_unlock();
return spte_set;
}

@@ -920,6 +975,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
u64 new_spte;
bool spte_set = false;

+ rcu_read_lock();
+
tdp_root_for_each_leaf_pte(iter, root, start, end) {
if (tdp_mmu_iter_cond_resched(kvm, &iter, false))
continue;
@@ -939,6 +996,8 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
spte_set = true;
}
+
+ rcu_read_unlock();
return spte_set;
}

@@ -980,6 +1039,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,
struct tdp_iter iter;
u64 new_spte;

+ rcu_read_lock();
+
tdp_root_for_each_leaf_pte(iter, root, gfn + __ffs(mask),
gfn + BITS_PER_LONG) {
if (!mask)
@@ -1005,6 +1066,8 @@ static void clear_dirty_pt_masked(struct kvm *kvm, struct kvm_mmu_page *root,

mask &= ~(1UL << (iter.gfn - gfn));
}
+
+ rcu_read_unlock();
}

/*
@@ -1044,6 +1107,8 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
u64 new_spte;
bool spte_set = false;

+ rcu_read_lock();
+
tdp_root_for_each_pte(iter, root, start, end) {
if (tdp_mmu_iter_cond_resched(kvm, &iter, false))
continue;
@@ -1057,6 +1122,7 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
spte_set = true;
}

+ rcu_read_unlock();
return spte_set;
}

@@ -1094,6 +1160,8 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
kvm_pfn_t pfn;
bool spte_set = false;

+ rcu_read_lock();
+
tdp_root_for_each_pte(iter, root, start, end) {
if (tdp_mmu_iter_cond_resched(kvm, &iter, spte_set)) {
spte_set = false;
@@ -1115,6 +1183,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
spte_set = true;
}

+ rcu_read_unlock();
if (spte_set)
kvm_flush_remote_tlbs(kvm);
}
@@ -1151,6 +1220,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root,
u64 new_spte;
bool spte_set = false;

+ rcu_read_lock();
+
tdp_root_for_each_leaf_pte(iter, root, gfn, gfn + 1) {
if (!is_writable_pte(iter.old_spte))
break;
@@ -1162,6 +1233,8 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root,
spte_set = true;
}

+ rcu_read_unlock();
+
return spte_set;
}

@@ -1202,10 +1275,14 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,

*root_level = vcpu->arch.mmu->shadow_root_level;

+ rcu_read_lock();
+
tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
leaf = iter.level;
sptes[leaf - 1] = iter.old_spte;
}

+ rcu_read_unlock();
+
return leaf;
}
--
2.30.1



2021-04-05 16:36:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 088/126] KVM: x86/mmu: Add lockdep when setting a TDP MMU SPTE

From: Ben Gardon <[email protected]>

[ Upstream commit 3a9a4aa5657471a02ffb7f9b7f3b7a468b3f257b ]

Add lockdep to __tdp_mmu_set_spte to ensure that SPTEs are only modified
under the MMU lock.

No functional change intended.
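
As a condensed sketch of the assertion (the full function below takes
more parameters), lockdep will splat at runtime, with
CONFIG_PROVE_LOCKING, for any caller that reaches the store without
holding the MMU lock:

	static inline void __tdp_mmu_set_spte(struct kvm *kvm,
					      struct tdp_iter *iter,
					      u64 new_spte)
	{
		lockdep_assert_held(&kvm->mmu_lock);

		WRITE_ONCE(*iter->sptep, new_spte);
		/* ... bookkeeping for the SPTE change ... */
	}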

Reviewed-by: Peter Feiner <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 61be95c6db20..ad9f8f187045 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -363,6 +363,8 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
struct kvm_mmu_page *root = sptep_to_sp(root_pt);
int as_id = kvm_mmu_page_as_id(root);

+ lockdep_assert_held(&kvm->mmu_lock);
+
WRITE_ONCE(*iter->sptep, new_spte);

__handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
--
2.30.1



2021-04-05 16:40:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

From: Ben Gardon <[email protected]>

[ Upstream commit 9a77daacc87dee9fd63e31243f21894132ed8407 ]

To prepare for handling page faults in parallel, change the TDP MMU
page fault handler to use atomic operations to set SPTEs so that changes
are not lost if multiple threads attempt to modify the same SPTE.
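
Condensed from tdp_mmu_set_spte_atomic() below, the core of the change
is a compare-and-exchange against the SPTE value the fault handler
originally read, so a concurrent modification is detected and the fault
retried rather than silently overwritten:

	static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm,
						   struct tdp_iter *iter,
						   u64 new_spte)
	{
		lockdep_assert_held_read(&kvm->mmu_lock);

		/* Install new_spte only if the SPTE still holds the
		 * value we read; otherwise another thread won the race. */
		if (cmpxchg64(rcu_dereference(iter->sptep), iter->old_spte,
			      new_spte) != iter->old_spte)
			return false;

		/* ... bookkeeping for old_spte -> new_spte ... */
		return true;
	}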

Reviewed-by: Peter Feiner <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>

Message-Id: <[email protected]>
[Document new locking rules. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
Documentation/virt/kvm/locking.rst | 9 +-
arch/x86/include/asm/kvm_host.h | 13 +++
arch/x86/kvm/mmu/tdp_mmu.c | 142 ++++++++++++++++++++++-------
3 files changed, 130 insertions(+), 34 deletions(-)

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index b21a34c34a21..0aa4817b466d 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -16,7 +16,14 @@ The acquisition orders for mutexes are as follows:
- kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
them together is quite rare.

-On x86, vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock.
+On x86:
+
+- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
+
+- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock is
+ taken inside kvm->arch.mmu_lock, and cannot be taken without already
+ holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise
+ there's no need to take kvm->arch.tdp_mmu_pages_lock at all).

Everything else is a leaf: no other lock is taken inside the critical
sections.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 02d4c74d30e2..47cd8f9b3fe7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1014,6 +1014,19 @@ struct kvm_arch {
struct list_head tdp_mmu_roots;
/* List of struct tdp_mmu_pages not being used as roots */
struct list_head tdp_mmu_pages;
+
+ /*
+ * Protects accesses to the following fields when the MMU lock
+ * is held in read mode:
+ * - tdp_mmu_pages (above)
+ * - the link field of struct kvm_mmu_pages used by the TDP MMU
+ * - lpage_disallowed_mmu_pages
+ * - the lpage_disallowed_link field of struct kvm_mmu_pages used
+ * by the TDP MMU
+ * It is acceptable, but not necessary, to acquire this lock when
+ * the thread holds the MMU lock in write mode.
+ */
+ spinlock_t tdp_mmu_pages_lock;
};

struct kvm_vm_stat {
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 14d69c01c710..eb38f74af3f2 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -7,6 +7,7 @@
#include "tdp_mmu.h"
#include "spte.h"

+#include <asm/cmpxchg.h>
#include <trace/events/kvm.h>

#ifdef CONFIG_X86_64
@@ -33,6 +34,7 @@ void kvm_mmu_init_tdp_mmu(struct kvm *kvm)
kvm->arch.tdp_mmu_enabled = true;

INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
+ spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
}

@@ -225,7 +227,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
}

static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
- u64 old_spte, u64 new_spte, int level);
+ u64 old_spte, u64 new_spte, int level,
+ bool shared);

static int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
{
@@ -267,17 +270,26 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
*
* @kvm: kvm instance
* @sp: the new page
+ * @shared: This operation may not be running under the exclusive use of
+ * the MMU lock and the operation must synchronize with other
+ * threads that might be adding or removing pages.
* @account_nx: This page replaces a NX large page and should be marked for
* eventual reclaim.
*/
static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
- bool account_nx)
+ bool shared, bool account_nx)
{
- lockdep_assert_held_write(&kvm->mmu_lock);
+ if (shared)
+ spin_lock(&kvm->arch.tdp_mmu_pages_lock);
+ else
+ lockdep_assert_held_write(&kvm->mmu_lock);

list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
if (account_nx)
account_huge_nx_page(kvm, sp);
+
+ if (shared)
+ spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
}

/**
@@ -285,14 +297,24 @@ static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
*
* @kvm: kvm instance
* @sp: the page to be removed
+ * @shared: This operation may not be running under the exclusive use of
+ * the MMU lock and the operation must synchronize with other
+ * threads that might be adding or removing pages.
*/
-static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+ bool shared)
{
- lockdep_assert_held_write(&kvm->mmu_lock);
+ if (shared)
+ spin_lock(&kvm->arch.tdp_mmu_pages_lock);
+ else
+ lockdep_assert_held_write(&kvm->mmu_lock);

list_del(&sp->link);
if (sp->lpage_disallowed)
unaccount_huge_nx_page(kvm, sp);
+
+ if (shared)
+ spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
}

/**
@@ -300,28 +322,39 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp)
*
* @kvm: kvm instance
* @pt: the page removed from the paging structure
+ * @shared: This operation may not be running under the exclusive use
+ * of the MMU lock and the operation must synchronize with other
+ * threads that might be modifying SPTEs.
*
* Given a page table that has been removed from the TDP paging structure,
* iterates through the page table to clear SPTEs and free child page tables.
*/
-static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
+static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt,
+ bool shared)
{
struct kvm_mmu_page *sp = sptep_to_sp(pt);
int level = sp->role.level;
gfn_t gfn = sp->gfn;
u64 old_child_spte;
+ u64 *sptep;
int i;

trace_kvm_mmu_prepare_zap_page(sp);

- tdp_mmu_unlink_page(kvm, sp);
+ tdp_mmu_unlink_page(kvm, sp, shared);

for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
- old_child_spte = READ_ONCE(*(pt + i));
- WRITE_ONCE(*(pt + i), 0);
+ sptep = pt + i;
+
+ if (shared) {
+ old_child_spte = xchg(sptep, 0);
+ } else {
+ old_child_spte = READ_ONCE(*sptep);
+ WRITE_ONCE(*sptep, 0);
+ }
handle_changed_spte(kvm, kvm_mmu_page_as_id(sp),
gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)),
- old_child_spte, 0, level - 1);
+ old_child_spte, 0, level - 1, shared);
}

kvm_flush_remote_tlbs_with_address(kvm, gfn,
@@ -338,12 +371,16 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
* @old_spte: The value of the SPTE before the change
* @new_spte: The value of the SPTE after the change
* @level: the level of the PT the SPTE is part of in the paging structure
+ * @shared: This operation may not be running under the exclusive use of
+ * the MMU lock and the operation must synchronize with other
+ * threads that might be modifying SPTEs.
*
* Handle bookkeeping that might result from the modification of a SPTE.
* This function must be called for all TDP SPTE modifications.
*/
static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
- u64 old_spte, u64 new_spte, int level)
+ u64 old_spte, u64 new_spte, int level,
+ bool shared)
{
bool was_present = is_shadow_present_pte(old_spte);
bool is_present = is_shadow_present_pte(new_spte);
@@ -413,18 +450,51 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
*/
if (was_present && !was_leaf && (pfn_changed || !is_present))
handle_removed_tdp_mmu_page(kvm,
- spte_to_child_pt(old_spte, level));
+ spte_to_child_pt(old_spte, level), shared);
}

static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
- u64 old_spte, u64 new_spte, int level)
+ u64 old_spte, u64 new_spte, int level,
+ bool shared)
{
- __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level);
+ __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
+ shared);
handle_changed_spte_acc_track(old_spte, new_spte, level);
handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
new_spte, level);
}

+/*
+ * tdp_mmu_set_spte_atomic - Set a TDP MMU SPTE atomically and handle the
+ * associated bookkeeping
+ *
+ * @kvm: kvm instance
+ * @iter: a tdp_iter instance currently on the SPTE that should be set
+ * @new_spte: The value the SPTE should be set to
+ * Returns: true if the SPTE was set, false if it was not. If false is returned,
+ * this function will have no side-effects.
+ */
+static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm,
+ struct tdp_iter *iter,
+ u64 new_spte)
+{
+ u64 *root_pt = tdp_iter_root_pt(iter);
+ struct kvm_mmu_page *root = sptep_to_sp(root_pt);
+ int as_id = kvm_mmu_page_as_id(root);
+
+ lockdep_assert_held_read(&kvm->mmu_lock);
+
+ if (cmpxchg64(rcu_dereference(iter->sptep), iter->old_spte,
+ new_spte) != iter->old_spte)
+ return false;
+
+ handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
+ iter->level, true);
+
+ return true;
+}
+
+
/*
* __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
* @kvm: kvm instance
@@ -454,7 +524,7 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte);

__handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
- iter->level);
+ iter->level, false);
if (record_acc_track)
handle_changed_spte_acc_track(iter->old_spte, new_spte,
iter->level);
@@ -629,23 +699,18 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
int ret = 0;
int make_spte_ret = 0;

- if (unlikely(is_noslot_pfn(pfn))) {
+ if (unlikely(is_noslot_pfn(pfn)))
new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
- trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
- new_spte);
- } else {
+ else
make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
pfn, iter->old_spte, prefault, true,
map_writable, !shadow_accessed_mask,
&new_spte);
- trace_kvm_mmu_set_spte(iter->level, iter->gfn,
- rcu_dereference(iter->sptep));
- }

if (new_spte == iter->old_spte)
ret = RET_PF_SPURIOUS;
- else
- tdp_mmu_set_spte(vcpu->kvm, iter, new_spte);
+ else if (!tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte))
+ return RET_PF_RETRY;

/*
* If the page fault was caused by a write but the page is write
@@ -659,8 +724,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
}

/* If a MMIO SPTE is installed, the MMIO will need to be emulated. */
- if (unlikely(is_mmio_spte(new_spte)))
+ if (unlikely(is_mmio_spte(new_spte))) {
+ trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
+ new_spte);
ret = RET_PF_EMULATE;
+ } else
+ trace_kvm_mmu_set_spte(iter->level, iter->gfn,
+ rcu_dereference(iter->sptep));

trace_kvm_mmu_set_spte(iter->level, iter->gfn,
rcu_dereference(iter->sptep));
@@ -719,7 +789,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
*/
if (is_shadow_present_pte(iter.old_spte) &&
is_large_pte(iter.old_spte)) {
- tdp_mmu_set_spte(vcpu->kvm, &iter, 0);
+ if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, 0))
+ break;

kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn,
KVM_PAGES_PER_HPAGE(iter.level));
@@ -736,19 +807,24 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
child_pt = sp->spt;

- tdp_mmu_link_page(vcpu->kvm, sp,
- huge_page_disallowed &&
- req_level >= iter.level);
-
new_spte = make_nonleaf_spte(child_pt,
!shadow_accessed_mask);

- trace_kvm_mmu_get_page(sp, true);
- tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte);
+ if (tdp_mmu_set_spte_atomic(vcpu->kvm, &iter,
+ new_spte)) {
+ tdp_mmu_link_page(vcpu->kvm, sp, true,
+ huge_page_disallowed &&
+ req_level >= iter.level);
+
+ trace_kvm_mmu_get_page(sp, true);
+ } else {
+ tdp_mmu_free_sp(sp);
+ break;
+ }
}
}

- if (WARN_ON(iter.level != level)) {
+ if (iter.level != level) {
rcu_read_unlock();
return RET_PF_RETRY;
}
--
2.30.1



2021-04-05 16:40:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 104/126] firewire: nosy: Fix a use-after-free bug in nosy_ioctl()

From: Zheyu Ma <[email protected]>

[ Upstream commit 829933ef05a951c8ff140e814656d73e74915faf ]

For each device, the nosy driver allocates a pcilynx structure.
A use-after-free might happen in the following scenario:

1. Open nosy device for the first time and call ioctl with command
NOSY_IOC_START, then a new client A will be malloced and added to
doubly linked list.
2. Open nosy device for the second time and call ioctl with command
NOSY_IOC_START, then a new client B will be malloced and added to
doubly linked list.
3. Call ioctl with command NOSY_IOC_START for client A, then client A
will be readded to the doubly linked list. Now the doubly linked
list is messed up.
4. Close the first nosy device and nosy_release will be called. In
nosy_release, client A will be unlinked and freed.
5. Close the second nosy device, and client A will be referenced,
resulting in UAF.

The root cause of this bug is that an element already present in the
doubly linked list is inserted into it a second time.

Fix this bug by adding a check before inserting a client. If a client
is already in the linked list, don't insert it.
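
Assuming, as the driver does elsewhere, that client->link is set up
with INIT_LIST_HEAD() and removed with list_del_init(), list_empty()
on the node itself reports whether it is currently linked. A sketch of
the guarded insertion (see the hunk below):

	spin_lock_irq(client_list_lock);
	if (list_empty(&client->link)) {
		/* Not linked yet: safe to insert. */
		list_add_tail(&client->link, &client->lynx->client_list);
		ret = 0;
	} else {
		/* Repeated NOSY_IOC_START: refuse rather than corrupt
		 * the list. */
		ret = -EBUSY;
	}
	spin_unlock_irq(client_list_lock);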

The following KASAN report reveals it:

BUG: KASAN: use-after-free in nosy_release+0x1ea/0x210
Write of size 8 at addr ffff888102ad7360 by task poc
CPU: 3 PID: 337 Comm: poc Not tainted 5.12.0-rc5+ #6
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
nosy_release+0x1ea/0x210
__fput+0x1e2/0x840
task_work_run+0xe8/0x180
exit_to_user_mode_prepare+0x114/0x120
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae

Allocated by task 337:
nosy_open+0x154/0x4d0
misc_open+0x2ec/0x410
chrdev_open+0x20d/0x5a0
do_dentry_open+0x40f/0xe80
path_openat+0x1cf9/0x37b0
do_filp_open+0x16d/0x390
do_sys_openat2+0x11d/0x360
__x64_sys_open+0xfd/0x1a0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 337:
kfree+0x8f/0x210
nosy_release+0x158/0x210
__fput+0x1e2/0x840
task_work_run+0xe8/0x180
exit_to_user_mode_prepare+0x114/0x120
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff888102ad7300 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 96 bytes inside of 128-byte region [ffff888102ad7300, ffff888102ad7380)

[ Modified to use 'list_empty()' inside proper lock - Linus ]

Link: https://lore.kernel.org/lkml/[email protected]/
Reported-and-tested-by: 马哲宇 (Zheyu Ma) <[email protected]>
Signed-off-by: Zheyu Ma <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Stefan Richter <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/firewire/nosy.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/firewire/nosy.c b/drivers/firewire/nosy.c
index 5fd6a60b6741..88ed971e32c0 100644
--- a/drivers/firewire/nosy.c
+++ b/drivers/firewire/nosy.c
@@ -346,6 +346,7 @@ nosy_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
struct client *client = file->private_data;
spinlock_t *client_list_lock = &client->lynx->client_list_lock;
struct nosy_stats stats;
+ int ret;

switch (cmd) {
case NOSY_IOC_GET_STATS:
@@ -360,11 +361,15 @@ nosy_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
return 0;

case NOSY_IOC_START:
+ ret = -EBUSY;
spin_lock_irq(client_list_lock);
- list_add_tail(&client->link, &client->lynx->client_list);
+ if (list_empty(&client->link)) {
+ list_add_tail(&client->link, &client->lynx->client_list);
+ ret = 0;
+ }
spin_unlock_irq(client_list_lock);

- return 0;
+ return ret;

case NOSY_IOC_STOP:
spin_lock_irq(client_list_lock);
--
2.30.2



2021-04-05 16:40:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 066/126] KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit

From: Paolo Bonzini <[email protected]>

commit 3c346c0c60ab06a021d1c0884a0ef494bc4ee3a7 upstream.

Fixing nested_vmcb_check_save to avoid all TOC/TOU races
is a bit harder in released kernels, so do the bare minimum
and prevent EFER.SVME from being cleared. This matters
because svm_set_efer frees the data structures for nested
virtualization if EFER.SVME is cleared.

Also check that EFER.SVME remains set after a nested vmexit;
clearing it could happen if the bit is zero in the save area
that is passed to KVM_SET_NESTED_STATE (the save area of the
nested state corresponds to the nested hypervisor's state
and is restored on the next nested vmexit).

Cc: [email protected]
Fixes: 2fcf4876ada ("KVM: nSVM: implement on demand allocation of the nested state")
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/svm/nested.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -251,6 +251,13 @@ static bool nested_vmcb_check_save(struc
struct kvm_vcpu *vcpu = &svm->vcpu;
bool vmcb12_lma;

+ /*
+ * FIXME: these should be done after copying the fields,
+ * to avoid TOC/TOU races. For these save area checks
+ * the possible damage is limited since kvm_set_cr0 and
+ * kvm_set_cr4 handle failure; EFER_SVME is an exception
+ * so it is force-set later in nested_prepare_vmcb_save.
+ */
if ((vmcb12->save.efer & EFER_SVME) == 0)
return false;

@@ -396,7 +403,14 @@ static void nested_prepare_vmcb_save(str
svm->vmcb->save.gdtr = vmcb12->save.gdtr;
svm->vmcb->save.idtr = vmcb12->save.idtr;
kvm_set_rflags(&svm->vcpu, vmcb12->save.rflags);
- svm_set_efer(&svm->vcpu, vmcb12->save.efer);
+
+ /*
+ * Force-set EFER_SVME even though it is checked earlier on the
+ * VMCB12, because the guest can flip the bit between the check
+ * and now. Clearing EFER_SVME would call svm_free_nested.
+ */
+ svm_set_efer(&svm->vcpu, vmcb12->save.efer | EFER_SVME);
+
svm_set_cr0(&svm->vcpu, vmcb12->save.cr0);
svm_set_cr4(&svm->vcpu, vmcb12->save.cr4);
svm->vmcb->save.cr2 = svm->vcpu.arch.cr2 = vmcb12->save.cr2;
@@ -1207,6 +1221,8 @@ static int svm_set_nested_state(struct k
*/
if (!(save->cr0 & X86_CR0_PG))
goto out_free;
+ if (!(save->efer & EFER_SVME))
+ goto out_free;

/*
* All checks done, we can enter guest mode. L1 control fields


2021-04-05 16:40:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 086/126] KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed

From: Ben Gardon <[email protected]>

[ Upstream commit 1af4a96025b33587ca953c7ef12a1b20c6e70412 ]

Given certain conditions, some TDP MMU functions may not yield
reliably / frequently enough. For example, if a paging structure was
very large but had few, if any, writable entries, wrprot_gfn_range
could traverse many entries before finding a writable entry and yielding,
because the check for yielding only happens after an SPTE is modified.

Fix this issue by moving the yield to the beginning of the loop.
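
Schematically, the resched check moves from the bottom of each loop
body to the top, so it runs for every entry visited rather than only
after an SPTE was changed:

	tdp_root_for_each_pte(iter, root, start, end) {
		/* Yield check first: reached even for entries that are
		 * skipped without modification. */
		if (tdp_mmu_iter_cond_resched(kvm, &iter, false))
			continue;

		if (!is_shadow_present_pte(iter.old_spte))
			continue;

		/* ... modify the SPTE ... */
	}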

Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
Reviewed-by: Peter Feiner <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>

Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/mmu/tdp_mmu.c | 32 ++++++++++++++++++++++----------
1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index f0bc5d3ce3d4..0d17457f1c84 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -462,6 +462,12 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
bool flush_needed = false;

tdp_root_for_each_pte(iter, root, start, end) {
+ if (can_yield &&
+ tdp_mmu_iter_cond_resched(kvm, &iter, flush_needed)) {
+ flush_needed = false;
+ continue;
+ }
+
if (!is_shadow_present_pte(iter.old_spte))
continue;

@@ -476,9 +482,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
continue;

tdp_mmu_set_spte(kvm, &iter, 0);
-
- flush_needed = !(can_yield &&
- tdp_mmu_iter_cond_resched(kvm, &iter, true));
+ flush_needed = true;
}
return flush_needed;
}
@@ -838,6 +842,9 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,

for_each_tdp_pte_min_level(iter, root->spt, root->role.level,
min_level, start, end) {
+ if (tdp_mmu_iter_cond_resched(kvm, &iter, false))
+ continue;
+
if (!is_shadow_present_pte(iter.old_spte) ||
!is_last_spte(iter.old_spte, iter.level))
continue;
@@ -846,8 +853,6 @@ static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,

tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
spte_set = true;
-
- tdp_mmu_iter_cond_resched(kvm, &iter, false);
}
return spte_set;
}
@@ -891,6 +896,9 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
bool spte_set = false;

tdp_root_for_each_leaf_pte(iter, root, start, end) {
+ if (tdp_mmu_iter_cond_resched(kvm, &iter, false))
+ continue;
+
if (spte_ad_need_write_protect(iter.old_spte)) {
if (is_writable_pte(iter.old_spte))
new_spte = iter.old_spte & ~PT_WRITABLE_MASK;
@@ -905,8 +913,6 @@ static bool clear_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,

tdp_mmu_set_spte_no_dirty_log(kvm, &iter, new_spte);
spte_set = true;
-
- tdp_mmu_iter_cond_resched(kvm, &iter, false);
}
return spte_set;
}
@@ -1014,6 +1020,9 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
bool spte_set = false;

tdp_root_for_each_pte(iter, root, start, end) {
+ if (tdp_mmu_iter_cond_resched(kvm, &iter, false))
+ continue;
+
if (!is_shadow_present_pte(iter.old_spte))
continue;

@@ -1021,8 +1030,6 @@ static bool set_dirty_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,

tdp_mmu_set_spte(kvm, &iter, new_spte);
spte_set = true;
-
- tdp_mmu_iter_cond_resched(kvm, &iter, false);
}

return spte_set;
@@ -1063,6 +1070,11 @@ static void zap_collapsible_spte_range(struct kvm *kvm,
bool spte_set = false;

tdp_root_for_each_pte(iter, root, start, end) {
+ if (tdp_mmu_iter_cond_resched(kvm, &iter, spte_set)) {
+ spte_set = false;
+ continue;
+ }
+
if (!is_shadow_present_pte(iter.old_spte) ||
!is_last_spte(iter.old_spte, iter.level))
continue;
@@ -1075,7 +1087,7 @@ static void zap_collapsible_spte_range(struct kvm *kvm,

tdp_mmu_set_spte(kvm, &iter, 0);

- spte_set = !tdp_mmu_iter_cond_resched(kvm, &iter, true);
+ spte_set = true;
}

if (spte_set)
--
2.30.1



2021-04-05 16:40:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 077/126] drm/imx: fix memory leak when fails to init

From: Pan Bian <[email protected]>

commit 69c3ed7282a143439bbc2d03dc00d49c68fcb629 upstream.

Put DRM device on initialization failure path rather than directly
return error code.

Fixes: a67d5088ceb8 ("drm/imx: drop explicit drm_mode_config_cleanup")
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/imx/imx-drm-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/imx/imx-drm-core.c b/drivers/gpu/drm/imx/imx-drm-core.c
index d1a9841adeed..e6a88c8cbd69 100644
--- a/drivers/gpu/drm/imx/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/imx-drm-core.c
@@ -215,7 +215,7 @@ static int imx_drm_bind(struct device *dev)

ret = drmm_mode_config_init(drm);
if (ret)
- return ret;
+ goto err_kms;

ret = drm_vblank_init(drm, MAX_CRTC);
if (ret)
--
2.31.1



2021-04-05 16:41:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 113/126] USB: cdc-acm: fix use-after-free after probe failure

From: Johan Hovold <[email protected]>

commit 4e49bf376c0451ad2eae2592e093659cde12be9a upstream.

If tty-device registration fails the driver would fail to release the
data interface. When the device is later disconnected, the disconnect
callback would still be called for the data interface and would go about
releasing already freed resources.
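
The fix below undoes the probe-time claim of the data interface and
clears its driver data, so that the later disconnect() call for that
interface returns early instead of touching freed memory:

	if (!acm->combined_interfaces) {
		/* Make disconnect() bail out early for this interface. */
		usb_set_intfdata(data_interface, NULL);
		/* Undo the claim taken earlier during probe. */
		usb_driver_release_interface(&acm_driver, data_interface);
	}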

Fixes: c93d81955005 ("usb: cdc-acm: fix error handling in acm_probe()")
Cc: [email protected] # 3.9
Cc: Alexey Khoroshilov <[email protected]>
Acked-by: Oliver Neukum <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/usb/class/cdc-acm.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/drivers/usb/class/cdc-acm.c
+++ b/drivers/usb/class/cdc-acm.c
@@ -1516,6 +1516,11 @@ skip_countries:

return 0;
alloc_fail6:
+ if (!acm->combined_interfaces) {
+ /* Clear driver data so that disconnect() returns early. */
+ usb_set_intfdata(data_interface, NULL);
+ usb_driver_release_interface(&acm_driver, data_interface);
+ }
if (acm->country_codes) {
device_remove_file(&acm->control->dev,
&dev_attr_wCountryCodes);


2021-04-05 16:41:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 126/126] bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG

From: Stanislav Fomichev <[email protected]>

commit b9082970478009b778aa9b22d5561eef35b53b63 upstream.

__bpf_arch_text_poke() rewrites only the atomic nop5, while emit_nops(xxx, 5)
emits a non-atomic one, which breaks fentry/fexit with the k8 nops:

P6_NOP5 == P6_NOP5_ATOMIC (0f1f440000 == 0f1f440000)
K8_NOP5 != K8_NOP5_ATOMIC (6666906690 != 6666666690)

Can be reproduced by doing "ideal_nops = k8_nops" in arch_init_ideal_nops()
and running the fexit_bpf2bpf selftest.

Fixes: e21aa341785c ("bpf: Fix fexit trampoline.")
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1811,7 +1811,8 @@ int arch_prepare_bpf_trampoline(struct b
/* remember return value in a stack for bpf prog to access */
emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
im->ip_after_call = prog;
- emit_nops(&prog, 5);
+ memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE);
+ prog += X86_PATCH_SIZE;
}

if (fmod_ret->nr_progs) {


2021-04-05 16:41:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 5.10 125/126] Revert "kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing"

From: Jens Axboe <[email protected]>

commit d3dc04cd81e0eaf50b2d09ab051a13300e587439 upstream.

This reverts commit 15b2219facadec583c24523eed40fa45865f859f.

Before IO threads accepted signals, having the freezer use signals to wake
up an IO thread would cause it to loop without any way to clear the
pending signal. That is no longer the case, so stop special-casing
PF_IO_WORKER in the freezer.

Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/freezer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -134,7 +134,7 @@ bool freeze_task(struct task_struct *p)
return false;
}

- if (!(p->flags & (PF_KTHREAD | PF_IO_WORKER)))
+ if (!(p->flags & PF_KTHREAD))
fake_signal_wake_up(p);
else
wake_up_state(p, TASK_INTERRUPTIBLE);


2021-04-05 22:58:35

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 5.10 047/126] ath10k: hold RCU lock when calling ieee80211_find_sta_by_ifaddr()

Hi!

> Fix ath10k_wmi_tlv_op_pull_peer_stats_info() to hold RCU lock before it
> calls ieee80211_find_sta_by_ifaddr() and release it when the resulting
> pointer is no longer needed.

It does that. But it also does the unlock even if it did not take the
lock:

> +++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
> @@ -576,13 +576,13 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
> case WMI_TDLS_TEARDOWN_REASON_TX:
> case WMI_TDLS_TEARDOWN_REASON_RSSI:
> case WMI_TDLS_TEARDOWN_REASON_PTR_TIMEOUT:
> + rcu_read_lock();
> station = ieee80211_find_sta_by_ifaddr(ar->hw,
> ev->peer_macaddr.addr,
> NULL);
> if (!station) {
> ath10k_warn(ar, "did not find station from tdls peer event");
> - kfree(tb);
> - return;
> + goto exit;
> }
> arvif = ath10k_get_arvif(ar, __le32_to_cpu(ev->vdev_id));
> ieee80211_tdls_oper_request(
> @@ -593,6 +593,9 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
> );
> break;
> }
> +
> +exit:
> + rcu_read_unlock();
> kfree(tb);
> }

The switch only takes the lock in 3 branches, but it is released
unconditionally at the end.

Something like this?

Best regards,
Pavel

Signed-off-by: Pavel Machek (CIP) <[email protected]>

diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
index e7072fc4f487..e03ff56d938b 100644
--- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c
+++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
@@ -582,20 +582,19 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
NULL);
if (!station) {
ath10k_warn(ar, "did not find station from tdls peer event");
- goto exit;
- }
- arvif = ath10k_get_arvif(ar, __le32_to_cpu(ev->vdev_id));
- ieee80211_tdls_oper_request(
+ } else {
+ arvif = ath10k_get_arvif(ar, __le32_to_cpu(ev->vdev_id));
+ ieee80211_tdls_oper_request(
arvif->vif, station->addr,
NL80211_TDLS_TEARDOWN,
WLAN_REASON_TDLS_TEARDOWN_UNREACHABLE,
GFP_ATOMIC
);
+ }
+ rcu_read_unlock();
break;
}

-exit:
- rcu_read_unlock();
kfree(tb);
}



--
http://www.livejournal.com/~pavelmachek



2021-04-05 22:59:36

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 5.10 048/126] net: ethernet: aquantia: Handle error cleanup of start on open

Hi!

> From: Nathan Rossi <[email protected]>
>
> [ Upstream commit 8a28af7a3e85ddf358f8c41e401a33002f7a9587 ]
>
> The aq_nic_start function can fail in a variety of cases which leaves
> the device in broken state.
>
> An example case where the start function fails is the
> request_threaded_irq which can be interrupted, resulting in a EINTR
> result. This can be manually triggered by bringing the link up (e.g. ip
> link set up) and triggering a SIGINT on the initiating process (e.g.
> Ctrl+C). This would put the device into a half configured state.
> Subsequently bringing the link up again would cause the napi_enable to
> BUG.
>
> In order to correctly clean up the failed attempt to start a device call
> aq_nic_stop.

No.

> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_main.c
> @@ -71,8 +71,10 @@ static int aq_ndev_open(struct net_device *ndev)
> goto err_exit;
>
> err = aq_nic_start(aq_nic);
> - if (err < 0)
> + if (err < 0) {
> + aq_nic_stop(aq_nic);
> goto err_exit;
> + }
>
> err_exit:
> if (err < 0)

First, take a look at the goto. Does not need to be there.

Second, check the crazy calling convention. If nic_start() fails, it
should clean up after itself.

Then, check the code. nic_stop() undoes initialization that was not
even done in nic_start().

This introduces more problems than it solves.

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2021-04-06 00:16:26

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 5.10 039/126] can: dev: move driver related infrastructure into separate subdir

Hi!

> From: Marc Kleine-Budde <[email protected]>
>
> [ Upstream commit 3e77f70e734584e0ad1038e459ed3fd2400f873a ]
>
> This patch moves the CAN driver related infrastructure into a separate subdir.
> It will be split into more files in the coming patches.

I don't think this is suitable for stable. I don't think any of the
follow up patches depend on it...?

Best regards,
Pavel

> drivers/net/can/Makefile | 7 +------
> drivers/net/can/dev/Makefile | 7 +++++++
> drivers/net/can/{ => dev}/dev.c | 0
> drivers/net/can/{ => dev}/rx-offload.c | 0

--
http://www.livejournal.com/~pavelmachek



2021-04-06 03:21:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 5.10 039/126] can: dev: move driver related infrastructure into separate subdir

On Mon, Apr 05, 2021 at 05:30:29PM +0200, Pavel Machek wrote:
> Hi!
>
> > From: Marc Kleine-Budde <[email protected]>
> >
> > [ Upstream commit 3e77f70e734584e0ad1038e459ed3fd2400f873a ]
> >
> > This patch moves the CAN driver related infrastructure into a separate subdir.
> > It will be split into more files in the coming patches.
>
> I don't think this is suitable for stable. I don't think any of the
> follow up patches depend on it...?

Yes it does.

2021-04-06 06:06:35

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review



On 4/5/2021 1:52 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.28-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit kernels:

Tested-by: Florian Fainelli <[email protected]>
--
Florian

2021-04-06 08:10:13

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review

On Mon, Apr 05, 2021 at 10:52:42AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 156 pass: 156 fail: 0
Qemu test results:
total: 454 pass: 454 fail: 0

Tested-by: Guenter Roeck <[email protected]>

Guenter

2021-04-06 08:46:04

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review

On Mon, 5 Apr 2021 at 14:37, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.28-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h


Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <[email protected]>

## Build
* kernel: 5.10.28-rc1
* git: ['https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git',
'https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc']
* git branch: linux-5.10.y
* git commit: 17879c574df0acdfddd69065b191a3c1b1f00d48
* git describe: v5.10.27-127-g17879c574df0
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.10.y/build/v5.10.27-127-g17879c574df0

## No regressions (compared to v5.10.27-82-g948d7f2d279f)

## No fixes (compared to v5.10.27-82-g948d7f2d279f)

## Test result summary
total: 75519, pass: 63466, fail: 1846, skip: 9956, xfail: 251,

## Build Summary
* arc: 10 total, 10 passed, 0 failed
* arm: 192 total, 192 passed, 0 failed
* arm64: 26 total, 26 passed, 0 failed
* dragonboard-410c: 1 total, 1 passed, 0 failed
* hi6220-hikey: 1 total, 1 passed, 0 failed
* i386: 25 total, 25 passed, 0 failed
* juno-r2: 1 total, 1 passed, 0 failed
* mips: 45 total, 45 passed, 0 failed
* parisc: 9 total, 9 passed, 0 failed
* powerpc: 27 total, 27 passed, 0 failed
* riscv: 30 total, 30 passed, 0 failed
* s390: 27 total, 27 passed, 0 failed
* sh: 18 total, 18 passed, 0 failed
* sparc: 9 total, 9 passed, 0 failed
* x15: 1 total, 1 passed, 0 failed
* x86: 1 total, 1 passed, 0 failed
* x86_64: 35 total, 35 passed, 0 failed

## Test suites summary
* fwts
* igt-gpu-tools
* install-android-platform-tools-r2600
* kselftest-
* kselftest-android
* kselftest-bpf
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-efivarfs
* kselftest-filesystems
* kselftest-firmware
* kselftest-fpu
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-livepatch
* kselftest-lkdtm
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-vm
* kselftest-vsyscall-mode-native-
* kselftest-vsyscall-mode-none-
* kselftest-x86
* kselftest-zram
* kunit
* kvm-unit-tests
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-controllers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-open-posix-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* network-basic-tests
* perf
* rcutorture
* ssuite
* v4l2-compliance

--
Linaro LKFT
https://lkft.linaro.org

2021-04-06 12:07:50

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review

On 4/5/21 2:52 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.28-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2021-04-06 12:23:39

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 5.10 047/126] ath10k: hold RCU lock when calling ieee80211_find_sta_by_ifaddr()

On 4/5/21 9:34 AM, Pavel Machek wrote:
> Hi!
>
>> Fix ath10k_wmi_tlv_op_pull_peer_stats_info() to hold RCU lock before it
>> calls ieee80211_find_sta_by_ifaddr() and release it when the resulting
>> pointer is no longer needed.
>
> It does that. But it also does the unlock even if it did not take the
> lock:
>

There is only one path after it takes the lock.

>> +++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
>> @@ -576,13 +576,13 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
>> case WMI_TDLS_TEARDOWN_REASON_TX:
>> case WMI_TDLS_TEARDOWN_REASON_RSSI:
>> case WMI_TDLS_TEARDOWN_REASON_PTR_TIMEOUT:
>> + rcu_read_lock();

Takes the lock here:

>> station = ieee80211_find_sta_by_ifaddr(ar->hw,
>> ev->peer_macaddr.addr,
>> NULL);
>> if (!station) {
>> ath10k_warn(ar, "did not find station from tdls peer event");
>> - kfree(tb);
>> - return;
>> + goto exit;

Releases it if something fails

>> }
>> arvif = ath10k_get_arvif(ar, __le32_to_cpu(ev->vdev_id));
>> ieee80211_tdls_oper_request(
>> @@ -593,6 +593,9 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
>> );
>> break;
>> }
>> +

falls through here.

>> +exit:
>> + rcu_read_unlock();
>> kfree(tb);
>> }
>
> The switch only takes the lock in 3 branches, but it is released
> unconditionally at the end.
>

I don't see any other switch cases. However, a default case is missing.

Perhaps it could be done this way (simpler than what you proposed):

diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
index d97b33f789e4..7efbe03fbca8 100644
--- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c
+++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
@@ -592,6 +592,9 @@ static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
 				GFP_ATOMIC
 				);
 		break;
+	default:
+		kfree(tb);
+		return;
 	}
 
 exit:
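
For clarity, here is a condensed sketch of the control flow that results
from combining the backported RCU fix with the proposed default case.
This is an illustration only, with the parsing and TDLS-request details
elided; it is not the literal driver code:

static void ath10k_wmi_event_tdls_peer(struct ath10k *ar, struct sk_buff *skb)
{
	struct ieee80211_sta *station;
	const void **tb;

	/* ... allocate and parse 'tb' from 'skb', obtain 'ev' from it ... */

	switch (peer_reason /* extracted from 'ev' */) {
	case WMI_TDLS_TEARDOWN_REASON_TX:
	case WMI_TDLS_TEARDOWN_REASON_RSSI:
	case WMI_TDLS_TEARDOWN_REASON_PTR_TIMEOUT:
		rcu_read_lock();	/* only these cases take the lock */
		station = ieee80211_find_sta_by_ifaddr(ar->hw,
						       ev->peer_macaddr.addr,
						       NULL);
		if (!station) {
			ath10k_warn(ar, "did not find station from tdls peer event");
			goto exit;	/* unlock and free below */
		}
		/* ... ieee80211_tdls_oper_request(...) ... */
		break;			/* falls through to exit: */
	default:
		kfree(tb);		/* the lock was never taken */
		return;
	}

exit:
	rcu_read_unlock();
	kfree(tb);
}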

thanks,
-- Shuah

2021-04-06 13:33:11

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On 05/04/21 10:54, Greg Kroah-Hartman wrote:
> From: Ben Gardon <[email protected]>
>
> [ Upstream commit 9a77daacc87dee9fd63e31243f21894132ed8407 ]
>
> To prepare for handling page faults in parallel, change the TDP MMU
> page fault handler to use atomic operations to set SPTEs so that changes
> are not lost if multiple threads attempt to modify the same SPTE.
>
> Reviewed-by: Peter Feiner <[email protected]>
> Signed-off-by: Ben Gardon <[email protected]>
>
> Message-Id: <[email protected]>
> [Document new locking rules. - Paolo]
> Signed-off-by: Paolo Bonzini <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> Documentation/virt/kvm/locking.rst | 9 +-
> arch/x86/include/asm/kvm_host.h | 13 +++
> arch/x86/kvm/mmu/tdp_mmu.c | 142 ++++++++++++++++++++++-------
> 3 files changed, 130 insertions(+), 34 deletions(-)
>
> diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> index b21a34c34a21..0aa4817b466d 100644
> --- a/Documentation/virt/kvm/locking.rst
> +++ b/Documentation/virt/kvm/locking.rst
> @@ -16,7 +16,14 @@ The acquisition orders for mutexes are as follows:
> - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
> them together is quite rare.
>
> -On x86, vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock.
> +On x86:
> +
> +- vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
> +
> +- kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock is
> + taken inside kvm->arch.mmu_lock, and cannot be taken without already
> + holding kvm->arch.mmu_lock (typically with ``read_lock``, otherwise
> + there's no need to take kvm->arch.tdp_mmu_pages_lock at all).
>
> Everything else is a leaf: no other lock is taken inside the critical
> sections.
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 02d4c74d30e2..47cd8f9b3fe7 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1014,6 +1014,19 @@ struct kvm_arch {
> struct list_head tdp_mmu_roots;
> /* List of struct tdp_mmu_pages not being used as roots */
> struct list_head tdp_mmu_pages;
> +
> + /*
> + * Protects accesses to the following fields when the MMU lock
> + * is held in read mode:
> + * - tdp_mmu_pages (above)
> + * - the link field of struct kvm_mmu_pages used by the TDP MMU
> + * - lpage_disallowed_mmu_pages
> + * - the lpage_disallowed_link field of struct kvm_mmu_pages used
> + * by the TDP MMU
> + * It is acceptable, but not necessary, to acquire this lock when
> + * the thread holds the MMU lock in write mode.
> + */
> + spinlock_t tdp_mmu_pages_lock;
> };
>
> struct kvm_vm_stat {
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 14d69c01c710..eb38f74af3f2 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -7,6 +7,7 @@
> #include "tdp_mmu.h"
> #include "spte.h"
>
> +#include <asm/cmpxchg.h>
> #include <trace/events/kvm.h>
>
> #ifdef CONFIG_X86_64
> @@ -33,6 +34,7 @@ void kvm_mmu_init_tdp_mmu(struct kvm *kvm)
> kvm->arch.tdp_mmu_enabled = true;
>
> INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
> + spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
> INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
> }
>
> @@ -225,7 +227,8 @@ static void tdp_mmu_free_sp_rcu_callback(struct rcu_head *head)
> }
>
> static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> - u64 old_spte, u64 new_spte, int level);
> + u64 old_spte, u64 new_spte, int level,
> + bool shared);
>
> static int kvm_mmu_page_as_id(struct kvm_mmu_page *sp)
> {
> @@ -267,17 +270,26 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
> *
> * @kvm: kvm instance
> * @sp: the new page
> + * @shared: This operation may not be running under the exclusive use of
> + * the MMU lock and the operation must synchronize with other
> + * threads that might be adding or removing pages.
> * @account_nx: This page replaces a NX large page and should be marked for
> * eventual reclaim.
> */
> static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> - bool account_nx)
> + bool shared, bool account_nx)
> {
> - lockdep_assert_held_write(&kvm->mmu_lock);
> + if (shared)
> + spin_lock(&kvm->arch.tdp_mmu_pages_lock);
> + else
> + lockdep_assert_held_write(&kvm->mmu_lock);
>
> list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
> if (account_nx)
> account_huge_nx_page(kvm, sp);
> +
> + if (shared)
> + spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> }
>
> /**
> @@ -285,14 +297,24 @@ static void tdp_mmu_link_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> *
> * @kvm: kvm instance
> * @sp: the page to be removed
> + * @shared: This operation may not be running under the exclusive use of
> + * the MMU lock and the operation must synchronize with other
> + * threads that might be adding or removing pages.
> */
> -static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp)
> +static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp,
> + bool shared)
> {
> - lockdep_assert_held_write(&kvm->mmu_lock);
> + if (shared)
> + spin_lock(&kvm->arch.tdp_mmu_pages_lock);
> + else
> + lockdep_assert_held_write(&kvm->mmu_lock);
>
> list_del(&sp->link);
> if (sp->lpage_disallowed)
> unaccount_huge_nx_page(kvm, sp);
> +
> + if (shared)
> + spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> }
>
> /**
> @@ -300,28 +322,39 @@ static void tdp_mmu_unlink_page(struct kvm *kvm, struct kvm_mmu_page *sp)
> *
> * @kvm: kvm instance
> * @pt: the page removed from the paging structure
> + * @shared: This operation may not be running under the exclusive use
> + * of the MMU lock and the operation must synchronize with other
> + * threads that might be modifying SPTEs.
> *
> * Given a page table that has been removed from the TDP paging structure,
> * iterates through the page table to clear SPTEs and free child page tables.
> */
> -static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
> +static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt,
> + bool shared)
> {
> struct kvm_mmu_page *sp = sptep_to_sp(pt);
> int level = sp->role.level;
> gfn_t gfn = sp->gfn;
> u64 old_child_spte;
> + u64 *sptep;
> int i;
>
> trace_kvm_mmu_prepare_zap_page(sp);
>
> - tdp_mmu_unlink_page(kvm, sp);
> + tdp_mmu_unlink_page(kvm, sp, shared);
>
> for (i = 0; i < PT64_ENT_PER_PAGE; i++) {
> - old_child_spte = READ_ONCE(*(pt + i));
> - WRITE_ONCE(*(pt + i), 0);
> + sptep = pt + i;
> +
> + if (shared) {
> + old_child_spte = xchg(sptep, 0);
> + } else {
> + old_child_spte = READ_ONCE(*sptep);
> + WRITE_ONCE(*sptep, 0);
> + }
> handle_changed_spte(kvm, kvm_mmu_page_as_id(sp),
> gfn + (i * KVM_PAGES_PER_HPAGE(level - 1)),
> - old_child_spte, 0, level - 1);
> + old_child_spte, 0, level - 1, shared);
> }
>
> kvm_flush_remote_tlbs_with_address(kvm, gfn,
> @@ -338,12 +371,16 @@ static void handle_removed_tdp_mmu_page(struct kvm *kvm, u64 *pt)
> * @old_spte: The value of the SPTE before the change
> * @new_spte: The value of the SPTE after the change
> * @level: the level of the PT the SPTE is part of in the paging structure
> + * @shared: This operation may not be running under the exclusive use of
> + * the MMU lock and the operation must synchronize with other
> + * threads that might be modifying SPTEs.
> *
> * Handle bookkeeping that might result from the modification of a SPTE.
> * This function must be called for all TDP SPTE modifications.
> */
> static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> - u64 old_spte, u64 new_spte, int level)
> + u64 old_spte, u64 new_spte, int level,
> + bool shared)
> {
> bool was_present = is_shadow_present_pte(old_spte);
> bool is_present = is_shadow_present_pte(new_spte);
> @@ -413,18 +450,51 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> */
> if (was_present && !was_leaf && (pfn_changed || !is_present))
> handle_removed_tdp_mmu_page(kvm,
> - spte_to_child_pt(old_spte, level));
> + spte_to_child_pt(old_spte, level), shared);
> }
>
> static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> - u64 old_spte, u64 new_spte, int level)
> + u64 old_spte, u64 new_spte, int level,
> + bool shared)
> {
> - __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level);
> + __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level,
> + shared);
> handle_changed_spte_acc_track(old_spte, new_spte, level);
> handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte,
> new_spte, level);
> }
>
> +/*
> + * tdp_mmu_set_spte_atomic - Set a TDP MMU SPTE atomically and handle the
> + * associated bookkeeping
> + *
> + * @kvm: kvm instance
> + * @iter: a tdp_iter instance currently on the SPTE that should be set
> + * @new_spte: The value the SPTE should be set to
> + * Returns: true if the SPTE was set, false if it was not. If false is returned,
> + * this function will have no side-effects.
> + */
> +static inline bool tdp_mmu_set_spte_atomic(struct kvm *kvm,
> + struct tdp_iter *iter,
> + u64 new_spte)
> +{
> + u64 *root_pt = tdp_iter_root_pt(iter);
> + struct kvm_mmu_page *root = sptep_to_sp(root_pt);
> + int as_id = kvm_mmu_page_as_id(root);
> +
> + lockdep_assert_held_read(&kvm->mmu_lock);
> +
> + if (cmpxchg64(rcu_dereference(iter->sptep), iter->old_spte,
> + new_spte) != iter->old_spte)
> + return false;
> +
> + handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
> + iter->level, true);
> +
> + return true;
> +}
> +
> +
> /*
> * __tdp_mmu_set_spte - Set a TDP MMU SPTE and handle the associated bookkeeping
> * @kvm: kvm instance
> @@ -454,7 +524,7 @@ static inline void __tdp_mmu_set_spte(struct kvm *kvm, struct tdp_iter *iter,
> WRITE_ONCE(*rcu_dereference(iter->sptep), new_spte);
>
> __handle_changed_spte(kvm, as_id, iter->gfn, iter->old_spte, new_spte,
> - iter->level);
> + iter->level, false);
> if (record_acc_track)
> handle_changed_spte_acc_track(iter->old_spte, new_spte,
> iter->level);
> @@ -629,23 +699,18 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
> int ret = 0;
> int make_spte_ret = 0;
>
> - if (unlikely(is_noslot_pfn(pfn))) {
> + if (unlikely(is_noslot_pfn(pfn)))
> new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
> - trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
> - new_spte);
> - } else {
> + else
> make_spte_ret = make_spte(vcpu, ACC_ALL, iter->level, iter->gfn,
> pfn, iter->old_spte, prefault, true,
> map_writable, !shadow_accessed_mask,
> &new_spte);
> - trace_kvm_mmu_set_spte(iter->level, iter->gfn,
> - rcu_dereference(iter->sptep));
> - }
>
> if (new_spte == iter->old_spte)
> ret = RET_PF_SPURIOUS;
> - else
> - tdp_mmu_set_spte(vcpu->kvm, iter, new_spte);
> + else if (!tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte))
> + return RET_PF_RETRY;
>
> /*
> * If the page fault was caused by a write but the page is write
> @@ -659,8 +724,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, int write,
> }
>
> /* If a MMIO SPTE is installed, the MMIO will need to be emulated. */
> - if (unlikely(is_mmio_spte(new_spte)))
> + if (unlikely(is_mmio_spte(new_spte))) {
> + trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
> + new_spte);
> ret = RET_PF_EMULATE;
> + } else
> + trace_kvm_mmu_set_spte(iter->level, iter->gfn,
> + rcu_dereference(iter->sptep));
>
> trace_kvm_mmu_set_spte(iter->level, iter->gfn,
> rcu_dereference(iter->sptep));
> @@ -719,7 +789,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> */
> if (is_shadow_present_pte(iter.old_spte) &&
> is_large_pte(iter.old_spte)) {
> - tdp_mmu_set_spte(vcpu->kvm, &iter, 0);
> + if (!tdp_mmu_set_spte_atomic(vcpu->kvm, &iter, 0))
> + break;
>
> kvm_flush_remote_tlbs_with_address(vcpu->kvm, iter.gfn,
> KVM_PAGES_PER_HPAGE(iter.level));
> @@ -736,19 +807,24 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
> sp = alloc_tdp_mmu_page(vcpu, iter.gfn, iter.level);
> child_pt = sp->spt;
>
> - tdp_mmu_link_page(vcpu->kvm, sp,
> - huge_page_disallowed &&
> - req_level >= iter.level);
> -
> new_spte = make_nonleaf_spte(child_pt,
> !shadow_accessed_mask);
>
> - trace_kvm_mmu_get_page(sp, true);
> - tdp_mmu_set_spte(vcpu->kvm, &iter, new_spte);
> + if (tdp_mmu_set_spte_atomic(vcpu->kvm, &iter,
> + new_spte)) {
> + tdp_mmu_link_page(vcpu->kvm, sp, true,
> + huge_page_disallowed &&
> + req_level >= iter.level);
> +
> + trace_kvm_mmu_get_page(sp, true);
> + } else {
> + tdp_mmu_free_sp(sp);
> + break;
> + }
> }
> }
>
> - if (WARN_ON(iter.level != level)) {
> + if (iter.level != level) {
> rcu_read_unlock();
> return RET_PF_RETRY;
> }
>

Whoa no, you have included basically a whole new feature, except for the
final patch that actually enables the feature. The whole new MMU is
still not meant to be used in production and development is still
happening as of 5.13.

Were all these patches (82-97) included just to enable patch 98 ("KVM:
x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")? Same
for 105-120 in 5.11.

Paolo
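
The behavioral core of the quoted patch is the switch from a plain store
of the SPTE to a compare-and-swap that can fail and force the fault to
be retried. A minimal user-space sketch of that pattern, using C11
atomics in place of the kernel's cmpxchg64(); the function and variable
names here are illustrative, not taken from the patch:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a 64-bit shadow page-table entry shared by vCPU threads. */
static _Atomic uint64_t spte;

/*
 * Mirrors the shape of tdp_mmu_set_spte_atomic(): succeed only if the
 * entry still holds the value this thread read earlier; otherwise
 * report failure so the caller can retry the fault.
 */
static bool set_spte_atomic(uint64_t old_spte, uint64_t new_spte)
{
	uint64_t expected = old_spte;

	return atomic_compare_exchange_strong(&spte, &expected, new_spte);
}

With a plain store, two threads faulting on the same entry could each
write their own value and one update would be silently lost; with the
compare-and-swap, the loser observes the mismatch, returns false, and
the fault handler retries (RET_PF_RETRY in the patch).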

2021-04-06 22:51:31

by Andrei Rabusov

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review

On Mon, Apr 05, 2021 at 10:52:42AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.28-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Fine with my i686 gcc 10.2
No selftest regressions were found
Selftest results [ok/not ok]: [1436/80]

Tested-by: Andrei Rabusov <[email protected]>

P.S.
Sorry, as always I forgot to do reply-all instead of just reply

2021-04-06 22:52:43

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On Tue, Apr 06, 2021 at 08:09:26AM +0200, Paolo Bonzini wrote:
>On 05/04/21 10:54, Greg Kroah-Hartman wrote:
>>From: Ben Gardon <[email protected]>
>>
>>[ Upstream commit 9a77daacc87dee9fd63e31243f21894132ed8407 ]
>>
>>To prepare for handling page faults in parallel, change the TDP MMU
>>page fault handler to use atomic operations to set SPTEs so that changes
>>are not lost if multiple threads attempt to modify the same SPTE.
>>
>>Reviewed-by: Peter Feiner <[email protected]>
>>Signed-off-by: Ben Gardon <[email protected]>
>>
>Whoa no, you have included basically a whole new feature, except for
>the final patch that actually enables the feature. The whole new MMU

Right, we would usually grab dependencies rather than modify the
patch. It means we diverge less from upstream, and custom backports tend
to be buggier than just grabbing dependencies.

>is still not meant to be used in production and development is still
>happening as of 5.13.

Unrelated to this discussion, but how are folks supposed to know which
feature can and which feature can't be used in production? If it's a
released kernel, in theory anyone can pick up 5.12 and use it in
production.

>Were all these patches (82-97) included just to enable patch 98 ("KVM:
>x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")?
>Same for 105-120 in 5.11.

Yup. Is there anything wrong with those patches?

--
Thanks,
Sasha

2021-04-07 08:04:36

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On Tue, Apr 06, 2021 at 05:48:50PM +0200, Paolo Bonzini wrote:
>On 06/04/21 15:49, Sasha Levin wrote:
>>Yup. Is there anything wrong with those patches?
>
>The big issue, and the one that you ignore every time we discuss
>this topic, is that this particular subset of 17 has AFAIK never been
>tested by anyone.

A few of the CI systems that run on stable(-rc) releases run
kvm-unit-tests, which passed. So yes, this was tested.

>There's plenty of locking changes in here; one patch that you didn't
>backport has this in its commit message:
>
> This isn't technically a bug fix in the current code [...] but that
> is all very, very subtle, and will break at the slightest sneeze,
>
>meaning that the locking in 5.10 and 5.11 was also less robust to
>changes elsewhere in the code.
>
>Let's also talk about the process and the timing. I got the "failed
>to apply" automated message last Friday and I was going to work on the
>backport today since yesterday was a holiday here. I was *never* CCed

There are a few more "FAILED:" mails that need attention that are older
than this one; I hope they're also in the queue.

>on a post of this backport for maintainers to review; you guys

You're looking at it; this is the -rc cycle for stable kernels.

>*literally* took random subsets of patches from a feature that is new
>and in active development, and hoped that they worked on a past
>release.

Right, I looked at what needed to be backported, took it back to 5.4,
and ran kvm-unit-tests on it.

What other hoops should we jump through so we won't need to "hope"
anymore?

>I could be happy because you just provided me with a perfect example
>of why to use my employer's franken-kernel instead of upstream stable
>kernels... ;) but this is not how a world-class operating system is
>developed. Who cares if a VM breaks or even if my laptop panics; but
>I'd seriously fear for my data if you applied the same attitude to XFS
>or ext4fs.
>
>For now, please drop all 17 patches from 5.10 and 5.11. I'll send a
>tested backport as soon as possible.

Sure, I'll drop them. Please let us know when a backport is available.

--
Thanks,
Sasha

2021-04-07 08:09:36

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On 06/04/21 20:01, Sasha Levin wrote:
> On Tue, Apr 06, 2021 at 05:48:50PM +0200, Paolo Bonzini wrote:
>> On 06/04/21 15:49, Sasha Levin wrote:
>>> Yup. Is there anything wrong with those patches?
>>
>> The big issue, and the one that you ignore every time we discuss
>> this topic, is that this particular subset of 17 has AFAIK never been
>> tested by anyone.
>
> A few of the CI systems that run on stable(-rc) releases run
> kvm-unit-tests, which passed. So yes, this was tested.

The code doesn't run unless those CI systems set the module parameter
that gates the experimental code (they don't). A compile test is better
than nothing, I guess, but code that never ran cannot be considered
tested. Again, I don't expect that anyone would notice a botched
backport to 5.10 or 5.11 of this code, but that isn't an excuse for a
poor process.

> Right, I looked at what needed to be backported, took it back to 5.4,
> and ran kvm-unit-tests on it.

I guess that's a typo since the code was added in 5.10, but anyway...

> What other hoops should we jump through so we won't need to "hope"
> anymore?

... you should jump through _fewer_ hoops. You are not expected to know
the status of the code in each and every subsystem, not even Linus does.

If a patch doesn't (more or less trivially) apply, the maintainer should
take action. Distro maintainers can also jump in and post the backport
to subsystem mailing lists. If the stable kernel loses a patch because
a maintainer doesn't have the time to do a backport, it's not the end of
the world.

Paolo

2021-04-07 09:26:00

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On 06/04/21 15:49, Sasha Levin wrote:
> On Tue, Apr 06, 2021 at 08:09:26AM +0200, Paolo Bonzini wrote:
>> Whoa no, you have included basically a whole new feature, except for
>> the final patch that actually enables the feature.  The whole new MMU
>
> Right, we would usually grab dependencies rather than modify the
> patch. It means we diverge less from upstream, and custom backports tend
> to be buggier than just grabbing dependencies.

In general I can't disagree. However, you *are* modifying at least
commit 048f49809c in your backport, so it's not clear where you draw the
line and why you didn't ask for help (more on this below).

Only the first five patches here are actual prerequisites for an easy
backport of the two commits that actually matter (a835429cda91, "KVM:
x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap";
and 048f49809c52, "KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU
during NX zapping", 2021-03-30). Everything after "KVM: x86/mmu: Yield
in TDU MMU iter even if no SPTES changed" could be dropped without
making it any harder to backport those two.

>> is still not meant to be used in production and development is still
>> happening as of 5.13.
>
> Unrelated to this discussion, but how are folks supposed to know which
> feature can and which feature can't be used in production? If it's a
> released kernel, in theory anyone can pick up 5.12 and use it in
> production.

It's not enabled by default and requires turning on a module parameter.

That also means that something like CKI will not notice if anything's
wrong with these patches. It also means that I could just shrug and
hope that no one ever runs this code in 5.10 and 5.11 (which is actually
the most likely case), but it is the process that is *just wrong*.

>> Were all these patches (82-97) included just to enable patch 98 ("KVM:
>> x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping")? Same
>> for 105-120 in 5.11.
>
> Yup. Is there anything wrong with those patches?

The big issue, and the one that you ignore every time we discuss this
topic, is that this particular subset of 17 has AFAIK never been tested
by anyone.

There's plenty of locking changes in here; one patch that you didn't
backport has this in its commit message:

This isn't technically a bug fix in the current code [...] but that
is all very, very subtle, and will break at the slightest sneeze,

meaning that the locking in 5.10 and 5.11 was also less robust to
changes elsewhere in the code.

Let's also talk about the process and the timing. I got the "failed to
apply" automated message last Friday and I was going to work on the
backport today since yesterday was a holiday here. I was *never* CCed
on a post of this backport for maintainers to review; you guys
*literally* took random subsets of patches from a feature that is new
and in active development, and hoped that they worked on a past release.

I could be happy because you just provided me with a perfect example of
why to use my employer's franken-kernel instead of upstream stable
kernels... ;) but this is not how a world-class operating system is
developed. Who cares if a VM breaks or even if my laptop panics; but
I'd seriously fear for my data if you applied the same attitude to XFS
or ext4fs.

For now, please drop all 17 patches from 5.10 and 5.11. I'll send a
tested backport as soon as possible.

Paolo

2021-04-07 10:54:14

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On Tue, Apr 06, 2021 at 08:28:27PM +0200, Paolo Bonzini wrote:
>If a patch doesn't (more or less trivially) apply, the maintainer
>should take action. Distro maintainers can also jump in and post the
>backport to subsystem mailing lists. If the stable kernel loses a
>patch because a maintainer doesn't have the time to do a backport,
>it's not the end of the world.

This quickly went from a "world class" to "fuck it".

It's understandable that maintainers don't have all the time in the
world for this, but are you really asking not to backport fixes to
stable trees because you don't have the time for it and don't want
anyone else to do it instead?

Maybe consider designating someone who knows the subsystem well and does
have time for this?

--
Thanks,
Sasha

2021-04-07 15:16:14

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On 06/04/21 21:44, Sasha Levin wrote:
> On Tue, Apr 06, 2021 at 08:28:27PM +0200, Paolo Bonzini wrote:
>> If a patch doesn't (more or less trivially) apply, the maintainer
>> should take action.  Distro maintainers can also jump in and post the
>> backport to subsystem mailing lists.  If the stable kernel loses a
>> patch because a maintainer doesn't have the time to do a backport,
>> it's not the end of the world.
>
> This quickly went from a "world class" to "fuck it".

Known bugs are better than unknown bugs. If something is reported on
4.19 and the stable@ backports were only done up to 5.4 because the
backport was a bit more messy, it's okay. If a user comes up with a
weird bug that I've never seen and that is caused by a botched
backport, it's much worse.

In this specific case we're talking of 5.10; but in many cases users of
really old kernels, let's say 4.4-4.9 at this point, won't care about
having *all* bugfixes. Some newer (and thus more buggy) features may be
so distant from the mainline in those kernels, or so immature, that no
one will consider them while staying on such an old kernel.

Again, kernel necrophilia pays my bills, so I have some experience there. :)

> It's understandable that maintainers don't have all the time in the
> world for this, but are you really asking not to backport fixes to
> stable trees because you don't have the time for it and don't want
> anyone else to do it instead?

If a bug is serious I *will* do the backport; I literally did this
specific backport on the first working day after the failure report.
But not all bugs are created equal and neither are all stable@-worthy
bugs. I try to use the "Fixes" tag correctly, but sometimes a bug that
is *technically* 10 years old may not be worthwhile, or even possible,
to fix in 5.4.

That said... one thing that would be really, really awesome would be a
website where you navigate a Linux checkout and for each directory you
can choose to get a list of stable patches that were Cc-ed to stable@
and failed to apply. A pipedream maybe, but also a great internship
project. :)

Paolo

> Maybe consider designating someone who knows the subsystem well and does
> have time for this?

2021-04-07 16:07:20

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH 5.10 096/126] KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map

On Tue, Apr 06, 2021 at 10:43:52PM +0200, Paolo Bonzini wrote:
>On 06/04/21 21:44, Sasha Levin wrote:
>>On Tue, Apr 06, 2021 at 08:28:27PM +0200, Paolo Bonzini wrote:
>>>If a patch doesn't (more or less trivially) apply, the maintainer
>>>should take action. Distro maintainers can also jump in and post
>>>the backport to subsystem mailing lists. If the stable kernel
>>>loses a patch because a maintainer doesn't have the time to do a
>>>backport, it's not the end of the world.
>>
>>This quickly went from a "world class" to "fuck it".
>
>Known bugs are better than unknown bugs. If something is reported on
>4.19 and the stable@ backports were only done up to 5.4 because the
>backport was a bit more messy, it's okay. If a user comes up with a
>weird bug that I've never seen and that is caused by a botched
>backport, it's much worse.
>
>In this specific case we're talking of 5.10; but in many cases users
>of really old kernels, let's say 4.4-4.9 at this point, won't care
>about having *all* bugfixes. Some newer (and thus more buggy)
>features may be so distant from the mainline in those kernels, or so
>immature, that no one will consider them while staying on such an old
>kernel.
>
>Again, kernel necrophilia pays my bills, so I have some experience there. :)

I'm with you on older kernels: if it's too complex users should just be
upgrading.

>>It's understandable that maintainers don't have all the time in the
>>world for this, but are you really asking not to backport fixes to
>>stable trees because you don't have the time for it and don't want
>>anyone else to do it instead?
>
>If a bug is serious I *will* do the backport; I literally did this
>specific backport on the first working day after the failure report.
>But not all bugs are created equal and neither are all stable@-worthy
>bugs. I try to use the "Fixes" tag correctly, but sometimes a bug
>that is *technically* 10 years old may not be worthwhile, or even
>possible, to fix in 5.4.

Agree here too: I'm not advocating to go overboard on backporting stuff
to really old kernels, but I do think it would be great to have most
fixes (even ones not necessarily tagged for stable) on recent kernels
that users should actually be using, and this is where my suggestion to
let more folks deal with the less important stuff comes from.

>That said... one thing that would be really, really awesome would be a
>website where you navigate a Linux checkout and for each directory you
>can choose to get a list of stable patches that were Cc-ed to stable@
>and failed to apply. A pipedream maybe, but also a great internship
>project. :)

I have scripts that do this audit, but sadly my web design skills are
not good enough to make it look pretty :)

I've attached a list for virt/kvm/ and arch/x86/kvm/ on all supported
LTS kernel as an example.

--
Thanks,
Sasha


Attachments:
kvm_missing.txt (25.42 kB)

2021-04-07 17:22:16

by Zou Wei

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review



On 2021/4/5 16:52, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.28-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Tested on arm64 and x86 for 5.10.28-rc1,

Kernel repo:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Branch: linux-5.10.y
Version: 5.10.28-rc1
Commit: 17879c574df0acdfddd69065b191a3c1b1f00d48
Compiler: gcc version 7.3.0 (GCC)

arm64:
--------------------------------------------------------------------
Testcase Result Summary:
total: 4720
passed: 4720
failed: 0
timeout: 0
--------------------------------------------------------------------

x86:
--------------------------------------------------------------------
Testcase Result Summary:
total: 4720
passed: 4720
failed: 0
timeout: 0
--------------------------------------------------------------------

Tested-by: Hulk Robot <[email protected]>

2021-04-07 20:31:29

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review

Hi Greg,

On Mon, Apr 5, 2021 at 10:09 AM Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 07 Apr 2021 08:50:09 +0000.
> Anything received after that time might be too late.

Booted on my test laptop. No regression.

Tested-by: Sudip Mukherjee <[email protected]>


--
Regards
Sudip

2021-04-07 20:34:44

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH 5.10 000/126] 5.10.28-rc1 review

Hi!

> This is the start of the stable review cycle for the 5.10.28 release.
> There are 126 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.

CIP testing did not find any kernel problems here (Siemens boards
are unavailable):

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-5.10.y

Tested-by: Pavel Machek (CIP) <[email protected]>

Best regards,
Pavel

--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


Attachments:
signature.asc (188.00 B)
Digital signature