This is the start of the stable review cycle for the 4.19.106 release.
There are 191 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <[email protected]>
Linux 4.19.106-rc1
Alex Deucher <[email protected]>
drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2)
Ido Schimmel <[email protected]>
mlxsw: spectrum_dpipe: Add missing error path
Michael S. Tsirkin <[email protected]>
virtio_balloon: prevent pfn array overflow
Steve French <[email protected]>
cifs: log warning message (once) if out of disk space
Vasily Averin <[email protected]>
help_next should increase position index
Wenwen Wang <[email protected]>
NFS: Fix memory leaks
Alex Deucher <[email protected]>
drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage
Alex Deucher <[email protected]>
drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency
Zhiqiang Liu <[email protected]>
brd: check and limit max_part par
Shubhrajyoti Datta <[email protected]>
microblaze: Prevent the overflow of the start
Andrei Otcheretianski <[email protected]>
iwlwifi: mvm: Fix thermal zone registration
Zenghui Yu <[email protected]>
irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL
Coly Li <[email protected]>
bcache: explicity type cast in bset_bkey_last()
Yunfeng Ye <[email protected]>
reiserfs: prevent NULL pointer dereference in reiserfs_insert_item()
Nathan Chancellor <[email protected]>
lib/scatterlist.c: adjust indentation in __sg_alloc_table
wangyan <[email protected]>
ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
Daniel Vetter <[email protected]>
radeon: insert 10ms sleep in dce5_crtc_load_lut
Vasily Averin <[email protected]>
trigger_next should increase position index
Vasily Averin <[email protected]>
ftrace: fpid_next() should increase position index
Ben Skeggs <[email protected]>
drm/nouveau/disp/nv50-: prevent oops when no channel method map provided
Marc Zyngier <[email protected]>
irqchip/gic-v3: Only provision redistributors that are enabled in ACPI
Arnd Bergmann <[email protected]>
rbd: work around -Wuninitialized warning
Xiubo Li <[email protected]>
ceph: check availability of mds cluster on mount after wait timeout
Vasily Averin <[email protected]>
bpf: map_seq_next should always increase position index
Ronnie Sahlberg <[email protected]>
cifs: fix NULL dereference in match_prepath
Colin Ian King <[email protected]>
iwlegacy: ensure loop counter addr does not wrap and cause an infinite loop
Nathan Chancellor <[email protected]>
hostap: Adjust indentation in prism2_hostapd_add_sta
Vincenzo Frascino <[email protected]>
ARM: 8951/1: Fix Kexec compilation issue.
zhangyi (F) <[email protected]>
jbd2: make sure ESHUTDOWN to be recorded in the journal superblock
zhangyi (F) <[email protected]>
jbd2: switch to use jbd2_journal_abort() when failed to submit the commit record
Lorenz Bauer <[email protected]>
selftests: bpf: Reset global state between reuseport test runs
Lu Baolu <[email protected]>
iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
Liang Chen <[email protected]>
bcache: cached_dev_free needs to put the sb page
Oliver O'Halloran <[email protected]>
powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOV
Ben Skeggs <[email protected]>
drm/nouveau/mmu: fix comptag memory leak
Peter Große <[email protected]>
ALSA: hda - Add docking station support for Lenovo Thinkpad T420s
Colin Ian King <[email protected]>
driver core: platform: fix u32 greater or equal to zero comparison
Vasily Gorbik <[email protected]>
s390/ftrace: generate traced function stack frame
Vasily Gorbik <[email protected]>
s390: adjust -mpacked-stack support check for clang 10
Masami Hiramatsu <[email protected]>
x86/decoder: Add TEST opcode to Group3-2
Masahiro Yamada <[email protected]>
kbuild: use -S instead of -E for precise cc-option test in Kconfig
Kai Vehmanen <[email protected]>
ALSA: hda/hdmi - add retry logic to parse_intel_hdmi()
John Garry <[email protected]>
irqchip/mbigen: Set driver .suppress_bind_attrs to avoid remove problems
Brandon Maier <[email protected]>
remoteproc: Initialize rproc_class before use
Jessica Yu <[email protected]>
module: avoid setting info->name early in case we can fall back to info->mod->name
Anand Jain <[email protected]>
btrfs: device stats, log when stats are zeroed
David Sterba <[email protected]>
btrfs: safely advance counter when looking up bio csums
Johannes Thumshirn <[email protected]>
btrfs: fix possible NULL-pointer dereference in integrity checks
yu kuai <[email protected]>
pwm: Remove set but not set variable 'pwm'
Dan Carpenter <[email protected]>
ide: serverworks: potential overflow in svwks_set_pio_mode()
Dan Carpenter <[email protected]>
cmd64x: potential buffer overflow in cmd64x_program_timings()
Uwe Kleine-König <[email protected]>
pwm: omap-dmtimer: Remove PWM chip in .remove before making it unfunctional
Ard Biesheuvel <[email protected]>
x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
Chao Yu <[email protected]>
f2fs: fix memleak of kobject
Hanjun Guo <[email protected]>
ACPI/IORT: Fix 'Number of IDs' handling in iort_id_map()
Thomas Gleixner <[email protected]>
watchdog/softlockup: Enforce that timestamp is valid on boot
Jun Lei <[email protected]>
drm/amd/display: fixup DML dependencies
Sami Tolvanen <[email protected]>
arm64: fix alternatives with LLVM's integrated assembler
Nick Black <[email protected]>
scsi: iscsi: Don't destroy session if there are outstanding connections
Jaegeuk Kim <[email protected]>
f2fs: free sysfs kobject
Jaegeuk Kim <[email protected]>
f2fs: set I_LINKABLE early to avoid wrong access by vfs
Will Deacon <[email protected]>
iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
Tony Lindgren <[email protected]>
usb: musb: omap2430: Get rid of musb .set_vbus for omap2430 glue
Navid Emamdoost <[email protected]>
drm/vmwgfx: prevent memory leak in vmw_cmdbuf_res_add
Ben Skeggs <[email protected]>
drm/nouveau/fault/gv100-: fix memory leak on module unload
YueHaibing <[email protected]>
drm/nouveau/drm/ttm: Remove set but not used variable 'mem'
YueHaibing <[email protected]>
drm/nouveau: Fix copy-paste error in nouveau_fence_wait_uevent_handler
Ben Skeggs <[email protected]>
drm/nouveau/gr/gk20a,gm200-: add terminators to method lists read from fw
Dan Carpenter <[email protected]>
drm/nouveau/secboot/gm20b: initialize pointer in gm20b_secboot_new()
Arnd Bergmann <[email protected]>
vme: bridges: reduce stack usage
Li RongQing <[email protected]>
bpf: Return -EBADRQC for invalid map type in __bpf_tx_xdp_map
Geert Uytterhoeven <[email protected]>
driver core: Print device when resources present in really_probe()
Simon Schwartz <[email protected]>
driver core: platform: Prevent resouce overflow from causing infinite loops
Arnd Bergmann <[email protected]>
visorbus: fix uninitialized variable access
Nathan Chancellor <[email protected]>
tty: synclink_gt: Adjust indentation in several functions
Nathan Chancellor <[email protected]>
tty: synclinkmp: Adjust indentation in several functions
Jason Gunthorpe <[email protected]>
RDMA/uverbs: Remove needs_kfree_rcu from uverbs_obj_type_class
Chen Zhou <[email protected]>
ASoC: atmel: fix build error with CONFIG_SND_ATMEL_SOC_DMA=m
Arnd Bergmann <[email protected]>
wan: ixp4xx_hss: fix compile-testing on 64-bit
Changbin Du <[email protected]>
x86/nmi: Remove irq_work from the long duration NMI handler
Philipp Zabel <[email protected]>
Input: edt-ft5x06 - work around first register access error
Paul E. McKenney <[email protected]>
rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls
Ard Biesheuvel <[email protected]>
efi/x86: Don't panic or BUG() on non-critical error conditions
Dmitry Osipenko <[email protected]>
soc/tegra: fuse: Correct straps' address for older Tegra124 device trees
Mike Marciniszyn <[email protected]>
IB/hfi1: Add software counter for ctxt0 seq drop
Arnd Bergmann <[email protected]>
staging: rtl8188: avoid excessive stack usage
Jan Kara <[email protected]>
udf: Fix free space reporting for metadata and virtual partitions
Shuah Khan <[email protected]>
usbip: Fix unsafe unaligned pointer usage
Benjamin Gaignard <[email protected]>
ARM: dts: stm32: Add power-supply for DSI panel on stm32f469-disco
Dingchen Zhang <[email protected]>
drm: remove the newline for CRC source name.
Arnd Bergmann <[email protected]>
mlx5: work around high stack usage with gcc
Jason Ekstrand <[email protected]>
ACPI: button: Add DMI quirk for Razer Blade Stealth 13 late 2019 lid switch
Shile Zhang <[email protected]>
x86/unwind/orc: Fix !CONFIG_MODULES build warning
Andrey Zhizhikin <[email protected]>
tools lib api fs: Fix gcc9 stringop-truncation compilation error
Takashi Iwai <[email protected]>
ALSA: sh: Fix compile warning wrt const
Kunihiko Hayashi <[email protected]>
clk: uniphier: Add SCSSI clock gate for each channel
Takashi Iwai <[email protected]>
ALSA: sh: Fix unused variable warnings
Icenowy Zheng <[email protected]>
clk: sunxi-ng: add mux and pll notifiers for A64 CPU clock
Jiewei Ke <[email protected]>
RDMA/rxe: Fix error type of mmap_offset
Kunihiko Hayashi <[email protected]>
reset: uniphier: Add SCSSI reset control for each channel
Geert Uytterhoeven <[email protected]>
pinctrl: sh-pfc: sh7269: Fix CAN function GPIOs
Chanwoo Choi <[email protected]>
PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency
Valdis Kletnieks <[email protected]>
x86/vdso: Provide missing include file
Vinay Kumar Yadav <[email protected]>
crypto: chtls - Fixed memory leak
Sascha Hauer <[email protected]>
dmaengine: imx-sdma: Fix memory leak
Logan Gunthorpe <[email protected]>
dmaengine: Store module owner in dma_device struct
Jaihind Yadav <[email protected]>
selinux: ensure we cleanup the internal AVC counters on error in avc_update()
Geert Uytterhoeven <[email protected]>
ARM: dts: r8a7779: Add device node for ARM global timer
Bibby Hsieh <[email protected]>
drm/mediatek: handle events when enabling/disabling crtc
Nathan Chancellor <[email protected]>
scsi: aic7xxx: Adjust indentation in ahc_find_syncrate
Can Guo <[email protected]>
scsi: ufs: Complete pending requests in host reset and restore path
Erik Kaneda <[email protected]>
ACPICA: Disassembler: create buffer fields in ACPI_PARSE_LOAD_PASS1
Aditya Pakki <[email protected]>
orinoco: avoid assertion in case of NULL pointer
Phong Tran <[email protected]>
rtlwifi: rtl_pci: Fix -Wcast-function-type
Phong Tran <[email protected]>
iwlegacy: Fix -Wcast-function-type
Phong Tran <[email protected]>
ipw2x00: Fix -Wcast-function-type
Phong Tran <[email protected]>
b43legacy: Fix -Wcast-function-type
Nathan Chancellor <[email protected]>
ALSA: usx2y: Adjust indentation in snd_usX2Y_hwdep_dsp_status
Xin Long <[email protected]>
netfilter: nft_tunnel: add the missing ERSPAN_VERSION nla_policy
Arnd Bergmann <[email protected]>
isdn: don't mark kcapi_proc_exit as __exit
Aditya Pakki <[email protected]>
fore200e: Fix incorrect checks of NULL pointer dereference
Heiner Kallweit <[email protected]>
r8169: check that Realtek PHY driver module is loaded
Jan Kara <[email protected]>
reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling
Nathan Chancellor <[email protected]>
media: v4l2-device.h: Explicitly compare grp{id,mask} to zero in v4l2_device macros
Daniel Drake <[email protected]>
PCI: Increase D3 delay for AMD Ryzen5/7 XHCI controllers
Daniel Drake <[email protected]>
PCI: Add generic quirk for increasing D3hot delay
Forest Crossman <[email protected]>
media: cx23885: Add support for AVerMedia CE310B
Wei Liu <[email protected]>
PCI: iproc: Apply quirk_paxc_bridge() for module as well as built-in
Andrey Smirnov <[email protected]>
ARM: dts: imx6: rdu2: Limit USBH1 to Full Speed
Andrey Smirnov <[email protected]>
ARM: dts: imx6: rdu2: Disable WP for USDHC2 and USDHC3
Daniel Jordan <[email protected]>
padata: always acquire cpu_hotplug_lock before pinst->lock
Manu Gautam <[email protected]>
arm64: dts: qcom: msm8996: Disable USB2 PHY suspend by core
Paul Moore <[email protected]>
selinux: ensure we cleanup the internal AVC counters on error in avc_insert()
Andre Przywara <[email protected]>
arm: dts: allwinner: H3: Add PMU node
Andre Przywara <[email protected]>
arm64: dts: allwinner: H6: Add PMU mode
Stephen Smalley <[email protected]>
selinux: fall back to ref-walk if audit is required
Mao Wenan <[email protected]>
NFC: port100: Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().
Rasmus Villemoes <[email protected]>
net/wan/fsl_ucc_hdlc: reject muram offsets above 64K
Miquel Raynal <[email protected]>
regulator: rk808: Lower log level on optional GPIOs being not available
Nathan Chancellor <[email protected]>
drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG
yu kuai <[email protected]>
drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table
Douglas Anderson <[email protected]>
clk: qcom: rcg2: Don't crash if our parent can't be found; return an error
Masahiro Yamada <[email protected]>
kconfig: fix broken dependency in randconfig-generated .config
Christian Borntraeger <[email protected]>
KVM: s390: ENOTSUPP -> EOPNOTSUPP fixups
Sun Ke <[email protected]>
nbd: add a flush_workqueue in nbd_start_device
Harry Wentland <[email protected]>
drm/amd/display: Retrain dongles when SINK_COUNT becomes non-zero
Rakesh Pillai <[email protected]>
ath10k: Correct the DMA direction for management tx buffers
zhangyi (F) <[email protected]>
ext4, jbd2: ensure panic when aborting with zero errno
Vincenzo Frascino <[email protected]>
ARM: 8952/1: Disable kmemleak on XIP kernels
Steven Rostedt (VMware) <[email protected]>
tracing: Fix very unlikely race of registering two stat tracers
Luis Henriques <[email protected]>
tracing: Fix tracing_stat return values in error handling paths
Oliver O'Halloran <[email protected]>
powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov()
Niklas Schnelle <[email protected]>
s390/pci: Fix possible deadlock in recover_store()
Uwe Kleine-König <[email protected]>
pwm: omap-dmtimer: Simplify error handling
Arvind Sankar <[email protected]>
x86/sysfb: Fix check for bad VRAM size
Kai Li <[email protected]>
jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal
Siddhesh Poyarekar <[email protected]>
kselftest: Minimise dependency of get_size on C library interfaces
Colin Ian King <[email protected]>
clocksource/drivers/bcm2835_timer: Fix memory leak of timer
John Keeping <[email protected]>
usb: dwc2: Fix IN FIFO allocation
Jia-Ju Bai <[email protected]>
usb: gadget: udc: fix possible sleep-in-atomic-context bugs in gr_probe()
Jia-Ju Bai <[email protected]>
uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()
David S. Miller <[email protected]>
sparc: Add .exit.data section.
Tiezhu Yang <[email protected]>
MIPS: Loongson: Fix potential NULL dereference in loongson3_platform_init()
Ard Biesheuvel <[email protected]>
efi/x86: Map the entire EFI vendor string before copying it
Hans de Goede <[email protected]>
pinctrl: baytrail: Do not clear IRQ flags on direct-irq enabled pins
Jia-Ju Bai <[email protected]>
media: sti: bdisp: fix a possible sleep-in-atomic-context bug in bdisp_device_run()
Sergey Senozhatsky <[email protected]>
char/random: silence a lockdep splat with printk()
Jacob Pan <[email protected]>
iommu/vt-d: Fix off-by-one in PASID allocation
Jia-Ju Bai <[email protected]>
gpio: gpio-grgpio: fix possible sleep-in-atomic-context bugs in grgpio_irq_map/unmap()
Oliver O'Halloran <[email protected]>
powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE number
Eugen Hristev <[email protected]>
media: i2c: mt9v032: fix enum mbus codes and frame sizes
Christophe JAILLET <[email protected]>
pxa168fb: Fix the function used to release some memory in an error handling path
Geert Uytterhoeven <[email protected]>
pinctrl: sh-pfc: sh7264: Fix CAN function GPIOs
Vladimir Oltean <[email protected]>
gianfar: Fix TX timestamping with a stacked DSA driver
Takashi Sakamoto <[email protected]>
ALSA: ctl: allow TLV read operation for callback type of element in locked case
Ritesh Harjani <[email protected]>
ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT
Zahari Petkov <[email protected]>
leds: pca963x: Fix open-drain initialization
Dan Carpenter <[email protected]>
brcmfmac: Fix use after free in brcmf_sdio_readframes()
Peter Zijlstra <[email protected]>
cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order
Rasmus Villemoes <[email protected]>
soc: fsl: qe: change return type of cpm_muram_alloc() to s32
J. Bruce Fields <[email protected]>
nfsd4: avoid NULL deference on strange COPY compounds
Paul Kocialkowski <[email protected]>
drm/gma500: Fixup fbdev stolen size usage evaluation
Sean Christopherson <[email protected]>
KVM: nVMX: Use correct root level for nested EPT shadow page tables
Sasha Levin <[email protected]>
Revert "KVM: VMX: Add non-canonical check on writes to RTIT address MSRs"
Sasha Levin <[email protected]>
Revert "KVM: nVMX: Use correct root level for nested EPT shadow page tables"
Davide Caratti <[email protected]>
net/sched: flower: add missing validation of TCA_FLOWER_FLAGS
Davide Caratti <[email protected]>
net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS
Per Forlin <[email protected]>
net: dsa: tag_qca: Make sure there is headroom for tag
Eric Dumazet <[email protected]>
net/smc: fix leak of kernel memory to user space
Firo Yang <[email protected]>
enic: prevent waking up stopped tx queues over watchdog reset
Toke Høiland-Jørgensen <[email protected]>
core: Don't skip generic XDP program execution for cloned SKBs
-------------
Diffstat:
Makefile | 4 +-
arch/arm/Kconfig | 4 +-
arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi | 7 +-
arch/arm/boot/dts/r8a7779.dtsi | 8 +
arch/arm/boot/dts/stm32f469-disco.dts | 8 +
arch/arm/boot/dts/sun8i-h3.dtsi | 15 +-
arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 10 +
arch/arm64/boot/dts/qcom/msm8996.dtsi | 4 +
arch/arm64/include/asm/alternative.h | 32 +-
arch/microblaze/kernel/cpu/cache.c | 3 +-
arch/mips/loongson64/loongson-3/platform.c | 3 +
arch/powerpc/kernel/eeh_driver.c | 6 -
arch/powerpc/kernel/pci_dn.c | 15 +-
arch/powerpc/platforms/powernv/pci-ioda.c | 48 +-
arch/powerpc/platforms/powernv/pci.c | 18 -
arch/s390/Makefile | 2 +-
arch/s390/kernel/mcount.S | 15 +-
arch/s390/kvm/interrupt.c | 6 +-
arch/s390/pci/pci_sysfs.c | 63 +-
arch/sh/include/cpu-sh2a/cpu/sh7269.h | 11 +-
arch/sparc/kernel/vmlinux.lds.S | 6 +-
arch/x86/entry/vdso/vdso32-setup.c | 1 +
arch/x86/include/asm/nmi.h | 1 -
arch/x86/kernel/nmi.c | 20 +-
arch/x86/kernel/sysfb_simplefb.c | 2 +-
arch/x86/kernel/unwind_orc.c | 3 +-
arch/x86/kvm/vmx.c | 3 +
arch/x86/kvm/vmx/vmx.c | 8036 --------------------
arch/x86/lib/x86-opcode-map.txt | 2 +-
arch/x86/mm/pageattr.c | 8 +-
arch/x86/platform/efi/efi.c | 41 +-
arch/x86/platform/efi/efi_64.c | 9 +-
drivers/acpi/acpica/dsfield.c | 2 +-
drivers/acpi/acpica/dswload.c | 21 +
drivers/acpi/arm64/iort.c | 57 +-
drivers/acpi/button.c | 11 +
drivers/atm/fore200e.c | 25 +-
drivers/base/dd.c | 5 +-
drivers/base/platform.c | 12 +-
drivers/block/brd.c | 22 +-
drivers/block/nbd.c | 10 +
drivers/block/rbd.c | 2 +-
drivers/char/random.c | 5 +-
drivers/clk/qcom/clk-rcg2.c | 3 +
drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 28 +-
drivers/clk/uniphier/clk-uniphier-peri.c | 13 +-
drivers/clocksource/bcm2835_timer.c | 5 +-
drivers/crypto/chelsio/chtls/chtls_cm.c | 27 +-
drivers/crypto/chelsio/chtls/chtls_cm.h | 21 +
drivers/crypto/chelsio/chtls/chtls_hw.c | 3 +
drivers/devfreq/Kconfig | 3 +-
drivers/devfreq/event/Kconfig | 2 +-
drivers/dma/dmaengine.c | 4 +-
drivers/dma/imx-sdma.c | 19 +-
drivers/gpio/gpio-grgpio.c | 10 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 19 +-
drivers/gpu/drm/amd/amdgpu/soc15_common.h | 1 +
drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 34 +-
drivers/gpu/drm/amd/display/dc/core/dc_link.c | 3 +-
.../gpu/drm/amd/display/dc/dml/dml_common_defs.c | 2 +-
.../gpu/drm/amd/display/dc/dml/dml_inline_defs.h | 2 +-
.../amd/display/dc/{calcs => inc}/dcn_calc_math.h | 0
drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 23 +-
drivers/gpu/drm/drm_debugfs_crc.c | 4 +-
drivers/gpu/drm/gma500/framebuffer.c | 8 +-
drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 8 +
drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
drivers/gpu/drm/nouveau/nouveau_ttm.c | 4 -
drivers/gpu/drm/nouveau/nvkm/core/memory.c | 2 +-
.../gpu/drm/nouveau/nvkm/engine/disp/channv50.c | 2 +
drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c | 21 +-
drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c | 1 +
.../gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c | 5 +-
drivers/gpu/drm/radeon/radeon_display.c | 2 +
drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +-
drivers/ide/cmd64x.c | 3 +
drivers/ide/serverworks.c | 6 +
drivers/infiniband/core/rdma_core.c | 23 +-
drivers/infiniband/hw/hfi1/chip.c | 10 +
drivers/infiniband/hw/hfi1/chip.h | 1 +
drivers/infiniband/hw/hfi1/driver.c | 1 +
drivers/infiniband/hw/hfi1/hfi.h | 2 +
drivers/infiniband/sw/rxe/rxe_verbs.h | 2 +-
drivers/input/touchscreen/edt-ft5x06.c | 7 +
drivers/iommu/arm-smmu-v3.c | 3 +-
drivers/iommu/dmar.c | 1 -
drivers/iommu/intel-svm.c | 2 +-
drivers/irqchip/irq-gic-v3-its.c | 2 +-
drivers/irqchip/irq-gic-v3.c | 9 +-
drivers/irqchip/irq-mbigen.c | 1 +
drivers/isdn/capi/kcapi_proc.c | 2 +-
drivers/leds/leds-pca963x.c | 8 +-
drivers/md/bcache/bset.h | 3 +-
drivers/md/bcache/super.c | 3 +
drivers/media/i2c/mt9v032.c | 10 +-
drivers/media/pci/cx23885/cx23885-cards.c | 24 +
drivers/media/pci/cx23885/cx23885-video.c | 3 +-
drivers/media/pci/cx23885/cx23885.h | 1 +
drivers/media/platform/sti/bdisp/bdisp-hw.c | 6 +-
drivers/net/ethernet/cisco/enic/enic_main.c | 2 +-
drivers/net/ethernet/freescale/gianfar.c | 10 +-
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +
.../net/ethernet/mellanox/mlxsw/spectrum_dpipe.c | 3 +-
drivers/net/ethernet/realtek/r8169.c | 9 +
drivers/net/wan/fsl_ucc_hdlc.c | 5 +
drivers/net/wan/ixp4xx_hss.c | 4 +-
drivers/net/wireless/ath/ath10k/wmi.c | 2 +-
drivers/net/wireless/broadcom/b43legacy/main.c | 5 +-
.../wireless/broadcom/brcm80211/brcmfmac/sdio.c | 1 +
drivers/net/wireless/intel/ipw2x00/ipw2100.c | 7 +-
drivers/net/wireless/intel/ipw2x00/ipw2200.c | 5 +-
drivers/net/wireless/intel/iwlegacy/3945-mac.c | 5 +-
drivers/net/wireless/intel/iwlegacy/4965-mac.c | 5 +-
drivers/net/wireless/intel/iwlegacy/common.c | 2 +-
drivers/net/wireless/intel/iwlwifi/mvm/tt.c | 4 +-
drivers/net/wireless/intersil/hostap/hostap_ap.c | 2 +-
.../net/wireless/intersil/orinoco/orinoco_usb.c | 3 +-
drivers/net/wireless/realtek/rtlwifi/pci.c | 10 +-
drivers/nfc/port100.c | 2 +-
drivers/pci/controller/pcie-iproc.c | 24 +
drivers/pci/quirks.c | 61 +-
drivers/pinctrl/intel/pinctrl-baytrail.c | 8 +-
drivers/pinctrl/sh-pfc/pfc-sh7264.c | 9 +-
drivers/pinctrl/sh-pfc/pfc-sh7269.c | 39 +-
drivers/pwm/pwm-omap-dmtimer.c | 35 +-
drivers/pwm/pwm-pca9685.c | 4 -
drivers/regulator/rk808-regulator.c | 2 +-
drivers/remoteproc/remoteproc_core.c | 2 +-
drivers/reset/reset-uniphier.c | 13 +-
drivers/scsi/aic7xxx/aic7xxx_core.c | 2 +-
drivers/scsi/iscsi_tcp.c | 4 +
drivers/scsi/scsi_transport_iscsi.c | 26 +-
drivers/scsi/ufs/ufshcd.c | 24 +-
drivers/scsi/ufs/ufshcd.h | 2 +
drivers/soc/fsl/qe/qe_common.c | 29 +-
drivers/soc/tegra/fuse/tegra-apbmisc.c | 2 +-
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 9 +-
drivers/tty/synclink_gt.c | 18 +-
drivers/tty/synclinkmp.c | 24 +-
drivers/uio/uio_dmem_genirq.c | 6 +-
drivers/usb/dwc2/gadget.c | 3 +-
drivers/usb/gadget/udc/gr_udc.c | 16 +-
drivers/usb/musb/omap2430.c | 2 -
drivers/video/fbdev/pxa168fb.c | 6 +-
drivers/virtio/virtio_balloon.c | 2 +
drivers/visorbus/visorchipset.c | 11 +-
drivers/vme/bridges/vme_fake.c | 30 +-
fs/btrfs/check-integrity.c | 3 +-
fs/btrfs/file-item.c | 3 +-
fs/btrfs/volumes.c | 2 +
fs/ceph/mds_client.c | 3 +-
fs/ceph/super.c | 5 +
fs/cifs/connect.c | 6 +-
fs/cifs/smb2pdu.c | 3 +
fs/ext4/file.c | 10 +-
fs/f2fs/namei.c | 27 +-
fs/f2fs/sysfs.c | 12 +-
fs/jbd2/checkpoint.c | 2 +-
fs/jbd2/commit.c | 4 +-
fs/jbd2/journal.c | 24 +-
fs/nfs/nfs42proc.c | 4 +-
fs/nfsd/nfs4proc.c | 3 +-
fs/ocfs2/journal.h | 8 +-
fs/orangefs/orangefs-debugfs.c | 1 +
fs/reiserfs/stree.c | 3 +-
fs/reiserfs/super.c | 2 +-
fs/udf/super.c | 22 +-
include/linux/dmaengine.h | 2 +
include/linux/list_nulls.h | 8 +-
include/linux/rculist_nulls.h | 8 +-
include/media/v4l2-device.h | 12 +-
include/rdma/uverbs_types.h | 1 -
include/soc/fsl/qe/qe.h | 16 +-
kernel/bpf/inode.c | 3 +-
kernel/cpu.c | 13 +-
kernel/module.c | 9 +-
kernel/padata.c | 4 +-
kernel/trace/ftrace.c | 5 +-
kernel/trace/trace_events_trigger.c | 5 +-
kernel/trace/trace_stat.c | 31 +-
kernel/watchdog.c | 10 +-
lib/scatterlist.c | 2 +-
net/core/dev.c | 4 +-
net/core/filter.c | 2 +-
net/dsa/tag_qca.c | 2 +-
net/netfilter/nft_tunnel.c | 3 +-
net/sched/cls_flower.c | 1 +
net/sched/cls_matchall.c | 1 +
net/smc/smc_diag.c | 5 +-
scripts/Kconfig.include | 2 +-
scripts/kconfig/confdata.c | 2 +-
security/selinux/avc.c | 77 +-
security/selinux/hooks.c | 11 +-
security/selinux/include/avc.h | 8 +-
sound/core/control.c | 5 +-
sound/pci/hda/patch_conexant.c | 1 +
sound/pci/hda/patch_hdmi.c | 7 +-
sound/sh/aica.c | 4 +-
sound/sh/sh_dac_audio.c | 3 -
sound/soc/atmel/Kconfig | 2 +
sound/usb/usx2y/usX2Yhwdep.c | 2 +-
tools/lib/api/fs/fs.c | 4 +-
tools/objtool/arch/x86/lib/x86-opcode-map.txt | 2 +-
.../testing/selftests/bpf/test_select_reuseport.c | 16 +-
tools/testing/selftests/size/get_size.c | 24 +-
tools/usb/usbip/src/usbip_network.c | 40 +-
tools/usb/usbip/src/usbip_network.h | 12 +-
207 files changed, 1280 insertions(+), 8742 deletions(-)
From: Vladimir Oltean <[email protected]>
[ Upstream commit c26a2c2ddc0115eb088873f5c309cf46b982f522 ]
The driver wrongly assumes that it is the only entity that can set the
SKBTX_IN_PROGRESS bit of the current skb. Therefore, in the
gfar_clean_tx_ring function, where the TX timestamp is collected if
necessary, the aforementioned bit is used to discriminate whether or not
the TX timestamp should be delivered to the socket's error queue.
But a stacked driver such as a DSA switch can also set the
SKBTX_IN_PROGRESS bit, which is actually exactly what it should do in
order to denote that the hardware timestamping process is undergoing.
Therefore, gianfar would misinterpret the "in progress" bit as being its
own, and deliver a second skb clone in the socket's error queue,
completely throwing off a PTP process which is not expecting to receive
it, _even though_ TX timestamping is not enabled for gianfar.
There have been discussions [0] as to whether non-MAC drivers need or
not to set SKBTX_IN_PROGRESS at all (whose purpose is to avoid sending 2
timestamps, a sw and a hw one, to applications which only expect one).
But as of this patch, there are at least 2 PTP drivers that would break
in conjunction with gianfar: the sja1105 DSA switch and the felix
switch, by way of its ocelot core driver.
So regardless of that conclusion, fix the gianfar driver to not do stuff
based on flags set by others and not intended for it.
[0]: https://www.spinics.net/lists/netdev/msg619699.html
Fixes: f0ee7acfcdd4 ("gianfar: Add hardware TX timestamping support")
Signed-off-by: Vladimir Oltean <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/freescale/gianfar.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index c97c4edfa31bc..cf2d1e846a692 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2685,13 +2685,17 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
skb_dirtytx = tx_queue->skb_dirtytx;
while ((skb = tx_queue->tx_skbuff[skb_dirtytx])) {
+ bool do_tstamp;
+
+ do_tstamp = (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+ priv->hwts_tx_en;
frags = skb_shinfo(skb)->nr_frags;
/* When time stamping, one additional TxBD must be freed.
* Also, we need to dma_unmap_single() the TxPAL.
*/
- if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
+ if (unlikely(do_tstamp))
nr_txbds = frags + 2;
else
nr_txbds = frags + 1;
@@ -2705,7 +2709,7 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
(lstatus & BD_LENGTH_MASK))
break;
- if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS)) {
+ if (unlikely(do_tstamp)) {
next = next_txbd(bdp, base, tx_ring_size);
buflen = be16_to_cpu(next->length) +
GMAC_FCB_LEN + GMAC_TXPAL_LEN;
@@ -2715,7 +2719,7 @@ static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
dma_unmap_single(priv->dev, be32_to_cpu(bdp->bufPtr),
buflen, DMA_TO_DEVICE);
- if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS)) {
+ if (unlikely(do_tstamp)) {
struct skb_shared_hwtstamps shhwtstamps;
u64 *ns = (u64 *)(((uintptr_t)skb->data + 0x10) &
~0x7UL);
--
2.20.1
From: Eugen Hristev <[email protected]>
[ Upstream commit 1451d5ae351d938a0ab1677498c893f17b9ee21d ]
This driver supports both the mt9v032 (color) and the mt9v022 (mono)
sensors. Depending on which sensor is used, the format from the sensor is
different. The format.code inside the dev struct holds this information.
The enum mbus and enum frame sizes need to take into account both type of
sensors, not just the color one. To solve this, use the format.code in
these functions instead of the hardcoded bayer color format (which is only
used for mt9v032).
[Sakari Ailus: rewrapped commit message]
Suggested-by: Wenyou Yang <[email protected]>
Signed-off-by: Eugen Hristev <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Sakari Ailus <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/media/i2c/mt9v032.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/media/i2c/mt9v032.c b/drivers/media/i2c/mt9v032.c
index f74730d24d8fe..04788692c9ff4 100644
--- a/drivers/media/i2c/mt9v032.c
+++ b/drivers/media/i2c/mt9v032.c
@@ -431,10 +431,12 @@ static int mt9v032_enum_mbus_code(struct v4l2_subdev *subdev,
struct v4l2_subdev_pad_config *cfg,
struct v4l2_subdev_mbus_code_enum *code)
{
+ struct mt9v032 *mt9v032 = to_mt9v032(subdev);
+
if (code->index > 0)
return -EINVAL;
- code->code = MEDIA_BUS_FMT_SGRBG10_1X10;
+ code->code = mt9v032->format.code;
return 0;
}
@@ -442,7 +444,11 @@ static int mt9v032_enum_frame_size(struct v4l2_subdev *subdev,
struct v4l2_subdev_pad_config *cfg,
struct v4l2_subdev_frame_size_enum *fse)
{
- if (fse->index >= 3 || fse->code != MEDIA_BUS_FMT_SGRBG10_1X10)
+ struct mt9v032 *mt9v032 = to_mt9v032(subdev);
+
+ if (fse->index >= 3)
+ return -EINVAL;
+ if (mt9v032->format.code != fse->code)
return -EINVAL;
fse->min_width = MT9V032_WINDOW_WIDTH_DEF / (1 << fse->index);
--
2.20.1
From: Oliver O'Halloran <[email protected]>
[ Upstream commit 3b5b9997b331e77ce967eba2c4bc80dc3134a7fe ]
On pseries there is a bug with adding hotplugged devices to an IOMMU
group. For a number of dumb reasons fixing that bug first requires
re-working how VFs are configured on PowerNV. For background, on
PowerNV we use the pcibios_sriov_enable() hook to do two things:
1. Create a pci_dn structure for each of the VFs, and
2. Configure the PHB's internal BARs so the MMIO range for each VF
maps to a unique PE.
Roughly speaking a PE is the hardware counterpart to a Linux IOMMU
group since all the devices in a PE share the same IOMMU table. A PE
also defines the set of devices that should be isolated in response to
a PCI error (i.e. bad DMA, UR/CA, AER events, etc). When isolated all
MMIO and DMA traffic to and from devicein the PE is blocked by the
root complex until the PE is recovered by the OS.
The requirement to block MMIO causes a giant headache because the P8
PHB generally uses a fixed mapping between MMIO addresses and PEs. As
a result we need to delay configuring the IOMMU groups for device
until after MMIO resources are assigned. For physical devices (i.e.
non-VFs) the PE assignment is done in pcibios_setup_bridge() which is
called immediately after the MMIO resources for downstream
devices (and the bridge's windows) are assigned. For VFs the setup is
more complicated because:
a) pcibios_setup_bridge() is not called again when VFs are activated, and
b) The pci_dev for VFs are created by generic code which runs after
pcibios_sriov_enable() is called.
The work around for this is a two step process:
1. A fixup in pcibios_add_device() is used to initialised the cached
pe_number in pci_dn, then
2. A bus notifier then adds the device to the IOMMU group for the PE
specified in pci_dn->pe_number.
A side effect fixing the pseries bug mentioned in the first paragraph
is moving the fixup out of pcibios_add_device() and into
pcibios_bus_add_device(), which is called much later. This results in
step 2. failing because pci_dn->pe_number won't be initialised when
the bus notifier is run.
We can fix this by removing the need for the fixup. The PE for a VF is
known before the VF is even scanned so we can initialise
pci_dn->pe_number pcibios_sriov_enable() instead. Unfortunately,
moving the initialisation causes two problems:
1. We trip the WARN_ON() in the current fixup code, and
2. The EEH core clears pdn->pe_number when recovering a VF and
relies on the fixup to correctly re-set it.
The only justification for either of these is a comment in
eeh_rmv_device() suggesting that pdn->pe_number *must* be set to
IODA_INVALID_PE in order for the VF to be scanned. However, this
comment appears to have no basis in reality. Both bugs can be fixed by
just deleting the code.
Tested-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/powerpc/kernel/eeh_driver.c | 6 ------
arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
arch/powerpc/platforms/powernv/pci.c | 4 ----
3 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index af1f3d5f9a0f7..377d23f58197a 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -554,12 +554,6 @@ static void *eeh_rmv_device(struct eeh_dev *edev, void *userdata)
pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
edev->pdev = NULL;
-
- /*
- * We have to set the VF PE number to invalid one, which is
- * required to plug the VF successfully.
- */
- pdn->pe_number = IODA_INVALID_PE;
#endif
if (rmv_data)
list_add(&edev->rmv_list, &rmv_data->edev_list);
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index ee63749a2d47e..28adfe4dd04c5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1552,6 +1552,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
/* Reserve PE for each VF */
for (vf_index = 0; vf_index < num_vfs; vf_index++) {
+ int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
+ int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
+ struct pci_dn *vf_pdn;
+
if (pdn->m64_single_mode)
pe_num = pdn->pe_num_map[vf_index];
else
@@ -1564,13 +1568,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
pe->pbus = NULL;
pe->parent_dev = pdev;
pe->mve_number = -1;
- pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
- pci_iov_virtfn_devfn(pdev, vf_index);
+ pe->rid = (vf_bus << 8) | vf_devfn;
pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
hose->global_number, pdev->bus->number,
- PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
- PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
+ PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
if (pnv_ioda_configure_pe(phb, pe)) {
/* XXX What do we do here ? */
@@ -1584,6 +1586,15 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
list_add_tail(&pe->list, &phb->ioda.pe_list);
mutex_unlock(&phb->ioda.pe_list_mutex);
+ /* associate this pe to it's pdn */
+ list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
+ if (vf_pdn->busno == vf_bus &&
+ vf_pdn->devfn == vf_devfn) {
+ vf_pdn->pe_number = pe_num;
+ break;
+ }
+ }
+
pnv_pci_ioda2_setup_dma_pe(phb, pe);
}
}
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index c846300b78368..aa95b8e0f66ad 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -822,16 +822,12 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
struct pnv_phb *phb = hose->private_data;
#ifdef CONFIG_PCI_IOV
struct pnv_ioda_pe *pe;
- struct pci_dn *pdn;
/* Fix the VF pdn PE number */
if (pdev->is_virtfn) {
- pdn = pci_get_pdn(pdev);
- WARN_ON(pdn->pe_number != IODA_INVALID_PE);
list_for_each_entry(pe, &phb->ioda.pe_list, list) {
if (pe->rid == ((pdev->bus->number << 8) |
(pdev->devfn & 0xff))) {
- pdn->pe_number = pe->pe_number;
pe->pdev = pdev;
break;
}
--
2.20.1
From: Firo Yang <[email protected]>
[ Upstream commit 0f90522591fd09dd201065c53ebefdfe3c6b55cb ]
Recent months, our customer reported several kernel crashes all
preceding with following message:
NETDEV WATCHDOG: eth2 (enic): transmit queue 0 timed out
Error message of one of those crashes:
BUG: unable to handle kernel paging request at ffffffffa007e090
After analyzing severl vmcores, I found that most of crashes are
caused by memory corruption. And all the corrupted memory areas
are overwritten by data of network packets. Moreover, I also found
that the tx queues were enabled over watchdog reset.
After going through the source code, I found that in enic_stop(),
the tx queues stopped by netif_tx_disable() could be woken up over
a small time window between netif_tx_disable() and the
napi_disable() by the following code path:
napi_poll->
enic_poll_msix_wq->
vnic_cq_service->
enic_wq_service->
netif_wake_subqueue(enic->netdev, q_number)->
test_and_clear_bit(__QUEUE_STATE_DRV_XOFF, &txq->state)
In turn, upper netowrk stack could queue skb to ENIC NIC though
enic_hard_start_xmit(). And this might introduce some race condition.
Our customer comfirmed that this kind of kernel crash doesn't occur over
90 days since they applied this patch.
Signed-off-by: Firo Yang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/cisco/enic/enic_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2013,10 +2013,10 @@ static int enic_stop(struct net_device *
napi_disable(&enic->napi[i]);
netif_carrier_off(netdev);
- netif_tx_disable(netdev);
if (vnic_dev_get_intr_mode(enic->vdev) == VNIC_DEV_INTR_MODE_MSIX)
for (i = 0; i < enic->wq_count; i++)
napi_disable(&enic->napi[enic_cq_wq(enic, i)]);
+ netif_tx_disable(netdev);
if (!enic_is_dynamic(enic) && !enic_is_sriov_vf(enic))
enic_dev_del_station_addr(enic);
From: Jacob Pan <[email protected]>
[ Upstream commit 39d630e332144028f56abba83d94291978e72df1 ]
PASID allocator uses IDR which is exclusive for the end of the
allocation range. There is no need to decrement pasid_max.
Fixes: af39507305fb ("iommu/vt-d: Apply global PASID in SVA")
Reported-by: Eric Auger <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/intel-svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index fd8730b2cd46e..5944d3b4dca37 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -377,7 +377,7 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, int flags, struct svm_dev_
/* Do not use PASID 0 in caching mode (virtualised IOMMU) */
ret = intel_pasid_alloc_id(svm,
!!cap_caching_mode(iommu->cap),
- pasid_max - 1, GFP_KERNEL);
+ pasid_max, GFP_KERNEL);
if (ret < 0) {
kfree(svm);
kfree(sdev);
--
2.20.1
From: Sergey Senozhatsky <[email protected]>
[ Upstream commit 1b710b1b10eff9d46666064ea25f079f70bc67a8 ]
Sergey didn't like the locking order,
uart_port->lock -> tty_port->lock
uart_write (uart_port->lock)
__uart_start
pl011_start_tx
pl011_tx_chars
uart_write_wakeup
tty_port_tty_wakeup
tty_port_default
tty_port_tty_get (tty_port->lock)
but those code is so old, and I have no clue how to de-couple it after
checking other locks in the splat. There is an onging effort to make all
printk() as deferred, so until that happens, workaround it for now as a
short-term fix.
LTP: starting iogen01 (export LTPROOT; rwtest -N iogen01 -i 120s -s
read,write -Da -Dv -n 2 500b:$TMPDIR/doio.f1.$$
1000b:$TMPDIR/doio.f2.$$)
WARNING: possible circular locking dependency detected
------------------------------------------------------
doio/49441 is trying to acquire lock:
ffff008b7cff7290 (&(&zone->lock)->rlock){..-.}, at: rmqueue+0x138/0x2050
but task is already holding lock:
60ff000822352818 (&pool->lock/1){-.-.}, at: start_flush_work+0xd8/0x3f0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&pool->lock/1){-.-.}:
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
__queue_work+0x4b4/0xa10
queue_work_on+0xac/0x11c
tty_schedule_flip+0x84/0xbc
tty_flip_buffer_push+0x1c/0x28
pty_write+0x98/0xd0
n_tty_write+0x450/0x60c
tty_write+0x338/0x474
__vfs_write+0x88/0x214
vfs_write+0x12c/0x1a4
redirected_tty_write+0x90/0xdc
do_loop_readv_writev+0x140/0x180
do_iter_write+0xe0/0x10c
vfs_writev+0x134/0x1cc
do_writev+0xbc/0x130
__arm64_sys_writev+0x58/0x8c
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
-> #3 (&(&port->lock)->rlock){-.-.}:
lock_acquire+0x320/0x360
_raw_spin_lock_irqsave+0x7c/0x9c
tty_port_tty_get+0x24/0x60
tty_port_default_wakeup+0x1c/0x3c
tty_port_tty_wakeup+0x34/0x40
uart_write_wakeup+0x28/0x44
pl011_tx_chars+0x1b8/0x270
pl011_start_tx+0x24/0x70
__uart_start+0x5c/0x68
uart_write+0x164/0x1c8
do_output_char+0x33c/0x348
n_tty_write+0x4bc/0x60c
tty_write+0x338/0x474
redirected_tty_write+0xc0/0xdc
do_loop_readv_writev+0x140/0x180
do_iter_write+0xe0/0x10c
vfs_writev+0x134/0x1cc
do_writev+0xbc/0x130
__arm64_sys_writev+0x58/0x8c
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
-> #2 (&port_lock_key){-.-.}:
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
pl011_console_write+0xec/0x2cc
console_unlock+0x794/0x96c
vprintk_emit+0x260/0x31c
vprintk_default+0x54/0x7c
vprintk_func+0x218/0x254
printk+0x7c/0xa4
register_console+0x734/0x7b0
uart_add_one_port+0x734/0x834
pl011_register_port+0x6c/0xac
sbsa_uart_probe+0x234/0x2ec
platform_drv_probe+0xd4/0x124
really_probe+0x250/0x71c
driver_probe_device+0xb4/0x200
__device_attach_driver+0xd8/0x188
bus_for_each_drv+0xbc/0x110
__device_attach+0x120/0x220
device_initial_probe+0x20/0x2c
bus_probe_device+0x54/0x100
device_add+0xae8/0xc2c
platform_device_add+0x278/0x3b8
platform_device_register_full+0x238/0x2ac
acpi_create_platform_device+0x2dc/0x3a8
acpi_bus_attach+0x390/0x3cc
acpi_bus_attach+0x108/0x3cc
acpi_bus_attach+0x108/0x3cc
acpi_bus_attach+0x108/0x3cc
acpi_bus_scan+0x7c/0xb0
acpi_scan_init+0xe4/0x304
acpi_init+0x100/0x114
do_one_initcall+0x348/0x6a0
do_initcall_level+0x190/0x1fc
do_basic_setup+0x34/0x4c
kernel_init_freeable+0x19c/0x260
kernel_init+0x18/0x338
ret_from_fork+0x10/0x18
-> #1 (console_owner){-...}:
lock_acquire+0x320/0x360
console_lock_spinning_enable+0x6c/0x7c
console_unlock+0x4f8/0x96c
vprintk_emit+0x260/0x31c
vprintk_default+0x54/0x7c
vprintk_func+0x218/0x254
printk+0x7c/0xa4
get_random_u64+0x1c4/0x1dc
shuffle_pick_tail+0x40/0xac
__free_one_page+0x424/0x710
free_one_page+0x70/0x120
__free_pages_ok+0x61c/0xa94
__free_pages_core+0x1bc/0x294
memblock_free_pages+0x38/0x48
__free_pages_memory+0xcc/0xfc
__free_memory_core+0x70/0x78
free_low_memory_core_early+0x148/0x18c
memblock_free_all+0x18/0x54
mem_init+0xb4/0x17c
mm_init+0x14/0x38
start_kernel+0x19c/0x530
-> #0 (&(&zone->lock)->rlock){..-.}:
validate_chain+0xf6c/0x2e2c
__lock_acquire+0x868/0xc2c
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
rmqueue+0x138/0x2050
get_page_from_freelist+0x474/0x688
__alloc_pages_nodemask+0x3b4/0x18dc
alloc_pages_current+0xd0/0xe0
alloc_slab_page+0x2b4/0x5e0
new_slab+0xc8/0x6bc
___slab_alloc+0x3b8/0x640
kmem_cache_alloc+0x4b4/0x588
__debug_object_init+0x778/0x8b4
debug_object_init_on_stack+0x40/0x50
start_flush_work+0x16c/0x3f0
__flush_work+0xb8/0x124
flush_work+0x20/0x30
xlog_cil_force_lsn+0x88/0x204 [xfs]
xfs_log_force_lsn+0x128/0x1b8 [xfs]
xfs_file_fsync+0x3c4/0x488 [xfs]
vfs_fsync_range+0xb0/0xd0
generic_write_sync+0x80/0xa0 [xfs]
xfs_file_buffered_aio_write+0x66c/0x6e4 [xfs]
xfs_file_write_iter+0x1a0/0x218 [xfs]
__vfs_write+0x1cc/0x214
vfs_write+0x12c/0x1a4
ksys_write+0xb0/0x120
__arm64_sys_write+0x54/0x88
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
other info that might help us debug this:
Chain exists of:
&(&zone->lock)->rlock --> &(&port->lock)->rlock --> &pool->lock/1
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&pool->lock/1);
lock(&(&port->lock)->rlock);
lock(&pool->lock/1);
lock(&(&zone->lock)->rlock);
*** DEADLOCK ***
4 locks held by doio/49441:
#0: a0ff00886fc27408 (sb_writers#8){.+.+}, at: vfs_write+0x118/0x1a4
#1: 8fff00080810dfe0 (&xfs_nondir_ilock_class){++++}, at:
xfs_ilock+0x2a8/0x300 [xfs]
#2: ffff9000129f2390 (rcu_read_lock){....}, at:
rcu_lock_acquire+0x8/0x38
#3: 60ff000822352818 (&pool->lock/1){-.-.}, at:
start_flush_work+0xd8/0x3f0
stack backtrace:
CPU: 48 PID: 49441 Comm: doio Tainted: G W
Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS
L50_5.13_1.11 06/18/2019
Call trace:
dump_backtrace+0x0/0x248
show_stack+0x20/0x2c
dump_stack+0xe8/0x150
print_circular_bug+0x368/0x380
check_noncircular+0x28c/0x294
validate_chain+0xf6c/0x2e2c
__lock_acquire+0x868/0xc2c
lock_acquire+0x320/0x360
_raw_spin_lock+0x64/0x80
rmqueue+0x138/0x2050
get_page_from_freelist+0x474/0x688
__alloc_pages_nodemask+0x3b4/0x18dc
alloc_pages_current+0xd0/0xe0
alloc_slab_page+0x2b4/0x5e0
new_slab+0xc8/0x6bc
___slab_alloc+0x3b8/0x640
kmem_cache_alloc+0x4b4/0x588
__debug_object_init+0x778/0x8b4
debug_object_init_on_stack+0x40/0x50
start_flush_work+0x16c/0x3f0
__flush_work+0xb8/0x124
flush_work+0x20/0x30
xlog_cil_force_lsn+0x88/0x204 [xfs]
xfs_log_force_lsn+0x128/0x1b8 [xfs]
xfs_file_fsync+0x3c4/0x488 [xfs]
vfs_fsync_range+0xb0/0xd0
generic_write_sync+0x80/0xa0 [xfs]
xfs_file_buffered_aio_write+0x66c/0x6e4 [xfs]
xfs_file_write_iter+0x1a0/0x218 [xfs]
__vfs_write+0x1cc/0x214
vfs_write+0x12c/0x1a4
ksys_write+0xb0/0x120
__arm64_sys_write+0x54/0x88
el0_svc_handler+0x170/0x240
el0_sync_handler+0x150/0x250
el0_sync+0x164/0x180
Reviewed-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Qian Cai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/char/random.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 53e822793d463..28b110cd39775 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1609,8 +1609,9 @@ static void _warn_unseeded_randomness(const char *func_name, void *caller,
print_once = true;
#endif
if (__ratelimit(&unseeded_warning))
- pr_notice("random: %s called from %pS with crng_init=%d\n",
- func_name, caller, crng_init);
+ printk_deferred(KERN_NOTICE "random: %s called from %pS "
+ "with crng_init=%d\n", func_name, caller,
+ crng_init);
}
/*
--
2.20.1
From: Ard Biesheuvel <[email protected]>
[ Upstream commit ffc2760bcf2dba0dbef74013ed73eea8310cc52c ]
Fix a couple of issues with the way we map and copy the vendor string:
- we map only 2 bytes, which usually works since you get at least a
page, but if the vendor string happens to cross a page boundary,
a crash will result
- only call early_memunmap() if early_memremap() succeeded, or we will
call it with a NULL address which it doesn't like,
- while at it, switch to early_memremap_ro(), and array indexing rather
than pointer dereferencing to read the CHAR16 characters.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Fixes: 5b83683f32b1 ("x86: EFI runtime service support")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/platform/efi/efi.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 335a62e74a2e9..5b0275310070e 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -480,7 +480,6 @@ void __init efi_init(void)
efi_char16_t *c16;
char vendor[100] = "unknown";
int i = 0;
- void *tmp;
#ifdef CONFIG_X86_32
if (boot_params.efi_info.efi_systab_hi ||
@@ -505,14 +504,16 @@ void __init efi_init(void)
/*
* Show what we know for posterity
*/
- c16 = tmp = early_memremap(efi.systab->fw_vendor, 2);
+ c16 = early_memremap_ro(efi.systab->fw_vendor,
+ sizeof(vendor) * sizeof(efi_char16_t));
if (c16) {
- for (i = 0; i < sizeof(vendor) - 1 && *c16; ++i)
- vendor[i] = *c16++;
+ for (i = 0; i < sizeof(vendor) - 1 && c16[i]; ++i)
+ vendor[i] = c16[i];
vendor[i] = '\0';
- } else
+ early_memunmap(c16, sizeof(vendor) * sizeof(efi_char16_t));
+ } else {
pr_err("Could not map the firmware vendor!\n");
- early_memunmap(tmp, 2);
+ }
pr_info("EFI v%u.%.02u by %s\n",
efi.systab->hdr.revision >> 16,
--
2.20.1
From: Tiezhu Yang <[email protected]>
[ Upstream commit 72d052e28d1d2363f9107be63ef3a3afdea6143c ]
If kzalloc fails, it should return -ENOMEM, otherwise may trigger a NULL
pointer dereference.
Fixes: 3adeb2566b9b ("MIPS: Loongson: Improve LEFI firmware interface")
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/loongson64/loongson-3/platform.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/mips/loongson64/loongson-3/platform.c b/arch/mips/loongson64/loongson-3/platform.c
index 25a97cc0ee336..0db4cc3196ebd 100644
--- a/arch/mips/loongson64/loongson-3/platform.c
+++ b/arch/mips/loongson64/loongson-3/platform.c
@@ -31,6 +31,9 @@ static int __init loongson3_platform_init(void)
continue;
pdev = kzalloc(sizeof(struct platform_device), GFP_KERNEL);
+ if (!pdev)
+ return -ENOMEM;
+
pdev->name = loongson_sysconf.sensors[i].name;
pdev->id = loongson_sysconf.sensors[i].id;
pdev->dev.platform_data = &loongson_sysconf.sensors[i];
--
2.20.1
From: David S. Miller <[email protected]>
[ Upstream commit 548f0b9a5f4cffa0cecf62eb12aa8db682e4eee6 ]
This fixes build errors of all sorts.
Also, emit .exit.text unconditionally.
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/sparc/kernel/vmlinux.lds.S | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/sparc/kernel/vmlinux.lds.S b/arch/sparc/kernel/vmlinux.lds.S
index 61afd787bd0c7..59b6df13ddead 100644
--- a/arch/sparc/kernel/vmlinux.lds.S
+++ b/arch/sparc/kernel/vmlinux.lds.S
@@ -172,12 +172,14 @@ SECTIONS
}
PERCPU_SECTION(SMP_CACHE_BYTES)
-#ifdef CONFIG_JUMP_LABEL
. = ALIGN(PAGE_SIZE);
.exit.text : {
EXIT_TEXT
}
-#endif
+
+ .exit.data : {
+ EXIT_DATA
+ }
. = ALIGN(PAGE_SIZE);
__init_end = .;
--
2.20.1
From: Jia-Ju Bai <[email protected]>
[ Upstream commit b74351287d4bd90636c3f48bc188c2f53824c2d4 ]
The driver may sleep while holding a spinlock.
The function call path (from bottom to top) in Linux 4.19 is:
kernel/irq/manage.c, 523:
synchronize_irq in disable_irq
drivers/uio/uio_dmem_genirq.c, 140:
disable_irq in uio_dmem_genirq_irqcontrol
drivers/uio/uio_dmem_genirq.c, 134:
_raw_spin_lock_irqsave in uio_dmem_genirq_irqcontrol
synchronize_irq() can sleep at runtime.
To fix this bug, disable_irq() is called without holding the spinlock.
This bug is found by a static analysis tool STCheck written by myself.
Signed-off-by: Jia-Ju Bai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/uio/uio_dmem_genirq.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/uio/uio_dmem_genirq.c b/drivers/uio/uio_dmem_genirq.c
index e1134a4d97f3f..a00b4aee6c799 100644
--- a/drivers/uio/uio_dmem_genirq.c
+++ b/drivers/uio/uio_dmem_genirq.c
@@ -135,11 +135,13 @@ static int uio_dmem_genirq_irqcontrol(struct uio_info *dev_info, s32 irq_on)
if (irq_on) {
if (test_and_clear_bit(0, &priv->flags))
enable_irq(dev_info->irq);
+ spin_unlock_irqrestore(&priv->lock, flags);
} else {
- if (!test_and_set_bit(0, &priv->flags))
+ if (!test_and_set_bit(0, &priv->flags)) {
+ spin_unlock_irqrestore(&priv->lock, flags);
disable_irq(dev_info->irq);
+ }
}
- spin_unlock_irqrestore(&priv->lock, flags);
return 0;
}
--
2.20.1
From: John Keeping <[email protected]>
[ Upstream commit 644139f8b64d818f6345351455f14471510879a5 ]
On chips with fewer FIFOs than endpoints (for example RK3288 which has 9
endpoints, but only 6 which are cabable of input), the DPTXFSIZN
registers above the FIFO count may return invalid values.
With logging added on startup, I see:
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=1 sz=256
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=2 sz=128
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=3 sz=128
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=4 sz=64
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=5 sz=64
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=6 sz=32
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=7 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=8 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=9 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=10 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=11 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=12 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=13 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=14 sz=0
dwc2 ff580000.usb: dwc2_hsotg_init_fifo: ep=15 sz=0
but:
# cat /sys/kernel/debug/ff580000.usb/fifo
Non-periodic FIFOs:
RXFIFO: Size 275
NPTXFIFO: Size 16, Start 0x00000113
Periodic TXFIFOs:
DPTXFIFO 1: Size 256, Start 0x00000123
DPTXFIFO 2: Size 128, Start 0x00000223
DPTXFIFO 3: Size 128, Start 0x000002a3
DPTXFIFO 4: Size 64, Start 0x00000323
DPTXFIFO 5: Size 64, Start 0x00000363
DPTXFIFO 6: Size 32, Start 0x000003a3
DPTXFIFO 7: Size 0, Start 0x000003e3
DPTXFIFO 8: Size 0, Start 0x000003a3
DPTXFIFO 9: Size 256, Start 0x00000123
so it seems that FIFO 9 is mirroring FIFO 1.
Fix the allocation by using the FIFO count instead of the endpoint count
when selecting a FIFO for an endpoint.
Acked-by: Minas Harutyunyan <[email protected]>
Signed-off-by: John Keeping <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/dwc2/gadget.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c
index f64d1cd08fb67..17f3e7b4d4fed 100644
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -3918,11 +3918,12 @@ static int dwc2_hsotg_ep_enable(struct usb_ep *ep,
* a unique tx-fifo even if it is non-periodic.
*/
if (dir_in && hsotg->dedicated_fifos) {
+ unsigned fifo_count = dwc2_hsotg_tx_fifo_count(hsotg);
u32 fifo_index = 0;
u32 fifo_size = UINT_MAX;
size = hs_ep->ep.maxpacket * hs_ep->mc;
- for (i = 1; i < hsotg->num_of_eps; ++i) {
+ for (i = 1; i <= fifo_count; ++i) {
if (hsotg->fifo_map & (1 << i))
continue;
val = dwc2_readl(hsotg, DPTXFSIZN(i));
--
2.20.1
From: Siddhesh Poyarekar <[email protected]>
[ Upstream commit 6b64a650f0b2ae3940698f401732988699eecf7a ]
It was observed[1] on arm64 that __builtin_strlen led to an infinite
loop in the get_size selftest. This is because __builtin_strlen (and
other builtins) may sometimes result in a call to the C library
function. The C library implementation of strlen uses an IFUNC
resolver to load the most efficient strlen implementation for the
underlying machine and hence has a PLT indirection even for static
binaries. Because this binary avoids the C library startup routines,
the PLT initialization never happens and hence the program gets stuck
in an infinite loop.
On x86_64 the __builtin_strlen just happens to expand inline and avoid
the call but that is not always guaranteed.
Further, while testing on x86_64 (Fedora 31), it was observed that the
test also failed with a segfault inside write() because the generated
code for the write function in glibc seems to access TLS before the
syscall (probably due to the cancellation point check) and fails
because TLS is not initialised.
To mitigate these problems, this patch reduces the interface with the
C library to just the syscall function. The syscall function still
sets errno on failure, which is undesirable but for now it only
affects cases where syscalls fail.
[1] https://bugs.linaro.org/show_bug.cgi?id=5479
Signed-off-by: Siddhesh Poyarekar <[email protected]>
Reported-by: Masami Hiramatsu <[email protected]>
Tested-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Tim Bird <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/size/get_size.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/size/get_size.c b/tools/testing/selftests/size/get_size.c
index d4b59ab979a09..f55943b6d1e2a 100644
--- a/tools/testing/selftests/size/get_size.c
+++ b/tools/testing/selftests/size/get_size.c
@@ -12,23 +12,35 @@
* own execution. It also attempts to have as few dependencies
* on kernel features as possible.
*
- * It should be statically linked, with startup libs avoided.
- * It uses no library calls, and only the following 3 syscalls:
+ * It should be statically linked, with startup libs avoided. It uses
+ * no library calls except the syscall() function for the following 3
+ * syscalls:
* sysinfo(), write(), and _exit()
*
* For output, it avoids printf (which in some C libraries
* has large external dependencies) by implementing it's own
* number output and print routines, and using __builtin_strlen()
+ *
+ * The test may crash if any of the above syscalls fails because in some
+ * libc implementations (e.g. the GNU C Library) errno is saved in
+ * thread-local storage, which does not get initialized due to avoiding
+ * startup libs.
*/
#include <sys/sysinfo.h>
#include <unistd.h>
+#include <sys/syscall.h>
#define STDOUT_FILENO 1
static int print(const char *s)
{
- return write(STDOUT_FILENO, s, __builtin_strlen(s));
+ size_t len = 0;
+
+ while (s[len] != '\0')
+ len++;
+
+ return syscall(SYS_write, STDOUT_FILENO, s, len);
}
static inline char *num_to_str(unsigned long num, char *buf, int len)
@@ -80,12 +92,12 @@ void _start(void)
print("TAP version 13\n");
print("# Testing system size.\n");
- ccode = sysinfo(&info);
+ ccode = syscall(SYS_sysinfo, &info);
if (ccode < 0) {
print("not ok 1");
print(test_name);
print(" ---\n reason: \"could not get sysinfo\"\n ...\n");
- _exit(ccode);
+ syscall(SYS_exit, ccode);
}
print("ok 1");
print(test_name);
@@ -101,5 +113,5 @@ void _start(void)
print(" ...\n");
print("1..1\n");
- _exit(0);
+ syscall(SYS_exit, 0);
}
--
2.20.1
From: Sean Christopherson <[email protected]>
[ Upstream commit 148d735eb55d32848c3379e460ce365f2c1cbe4b ]
Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU
currently also hardcodes the page walk level for nested EPT to be 4
levels. The L2 guest is all but guaranteed to soft hang on its first
instruction when L1 is using EPT, as KVM will construct 4-level page
tables and then tell hardware to use 5-level page tables.
Fixes: 855feb673640 ("KVM: MMU: Add 5 level EPT & Shadow page table support.")
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/vmx.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 2660c01eadaeb..aead984d89ad6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5302,6 +5302,9 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
static int get_ept_level(struct kvm_vcpu *vcpu)
{
+ /* Nested EPT currently only supports 4-level walks. */
+ if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
+ return 4;
if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
return 5;
return 4;
--
2.20.1
From: Uwe Kleine-König <[email protected]>
[ Upstream commit c4cf7aa57eb83b108d2d9c6c37c143388fee2a4d ]
Instead of doing error handling in the middle of ->probe(), move error
handling and freeing the reference to timer to the end.
This fixes a resource leak as dm_timer wasn't freed when allocating
*omap failed.
Implementation note: The put: label was never reached without a goto and
ret being unequal to 0, so the removed return statement is fine.
Fixes: 6604c6556db9 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pwm/pwm-omap-dmtimer.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)
diff --git a/drivers/pwm/pwm-omap-dmtimer.c b/drivers/pwm/pwm-omap-dmtimer.c
index f45798679e3c0..2d585c101d52e 100644
--- a/drivers/pwm/pwm-omap-dmtimer.c
+++ b/drivers/pwm/pwm-omap-dmtimer.c
@@ -301,15 +301,10 @@ static int pwm_omap_dmtimer_probe(struct platform_device *pdev)
goto put;
}
-put:
- of_node_put(timer);
- if (ret < 0)
- return ret;
-
omap = devm_kzalloc(&pdev->dev, sizeof(*omap), GFP_KERNEL);
if (!omap) {
- pdata->free(dm_timer);
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto err_alloc_omap;
}
omap->pdata = pdata;
@@ -342,13 +337,28 @@ put:
ret = pwmchip_add(&omap->chip);
if (ret < 0) {
dev_err(&pdev->dev, "failed to register PWM\n");
- omap->pdata->free(omap->dm_timer);
- return ret;
+ goto err_pwmchip_add;
}
+ of_node_put(timer);
+
platform_set_drvdata(pdev, omap);
return 0;
+
+err_pwmchip_add:
+
+ /*
+ * *omap is allocated using devm_kzalloc,
+ * so no free necessary here
+ */
+err_alloc_omap:
+
+ pdata->free(dm_timer);
+put:
+ of_node_put(timer);
+
+ return ret;
}
static int pwm_omap_dmtimer_remove(struct platform_device *pdev)
--
2.20.1
From: Christophe JAILLET <[email protected]>
[ Upstream commit 3c911fe799d1c338d94b78e7182ad452c37af897 ]
In the probe function, some resources are allocated using 'dma_alloc_wc()',
they should be released with 'dma_free_wc()', not 'dma_free_coherent()'.
We already use 'dma_free_wc()' in the remove function, but not in the
error handling path of the probe function.
Also, remove a useless 'PAGE_ALIGN()'. 'info->fix.smem_len' is already
PAGE_ALIGNed.
Fixes: 638772c7553f ("fb: add support of LCD display controller on pxa168/910 (base layer)")
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Lubomir Rintel <[email protected]>
CC: YueHaibing <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/video/fbdev/pxa168fb.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/video/fbdev/pxa168fb.c b/drivers/video/fbdev/pxa168fb.c
index d059d04c63acd..20195d3dbf088 100644
--- a/drivers/video/fbdev/pxa168fb.c
+++ b/drivers/video/fbdev/pxa168fb.c
@@ -769,8 +769,8 @@ failed_free_cmap:
failed_free_clk:
clk_disable_unprepare(fbi->clk);
failed_free_fbmem:
- dma_free_coherent(fbi->dev, info->fix.smem_len,
- info->screen_base, fbi->fb_start_dma);
+ dma_free_wc(fbi->dev, info->fix.smem_len,
+ info->screen_base, fbi->fb_start_dma);
failed_free_info:
kfree(info);
@@ -804,7 +804,7 @@ static int pxa168fb_remove(struct platform_device *pdev)
irq = platform_get_irq(pdev, 0);
- dma_free_wc(fbi->dev, PAGE_ALIGN(info->fix.smem_len),
+ dma_free_wc(fbi->dev, info->fix.smem_len,
info->screen_base, info->fix.smem_start);
clk_disable_unprepare(fbi->clk);
--
2.20.1
From: Luis Henriques <[email protected]>
[ Upstream commit afccc00f75bbbee4e4ae833a96c2d29a7259c693 ]
tracing_stat_init() was always returning '0', even on the error paths. It
now returns -ENODEV if tracing_init_dentry() fails or -ENOMEM if it fails
to created the 'trace_stat' debugfs directory.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: ed6f1c996bfe4 ("tracing: Check return value of tracing_init_dentry()")
Signed-off-by: Luis Henriques <[email protected]>
[ Pulled from the archeological digging of my INBOX ]
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/trace/trace_stat.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index 75bf1bcb4a8a5..bf68af63538b4 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -278,18 +278,22 @@ static int tracing_stat_init(void)
d_tracing = tracing_init_dentry();
if (IS_ERR(d_tracing))
- return 0;
+ return -ENODEV;
stat_dir = tracefs_create_dir("trace_stat", d_tracing);
- if (!stat_dir)
+ if (!stat_dir) {
pr_warn("Could not create tracefs 'trace_stat' entry\n");
+ return -ENOMEM;
+ }
return 0;
}
static int init_stat_file(struct stat_session *session)
{
- if (!stat_dir && tracing_stat_init())
- return -ENODEV;
+ int ret;
+
+ if (!stat_dir && (ret = tracing_stat_init()))
+ return ret;
session->file = tracefs_create_file(session->ts->name, 0644,
stat_dir,
--
2.20.1
From: Steven Rostedt (VMware) <[email protected]>
[ Upstream commit dfb6cd1e654315168e36d947471bd2a0ccd834ae ]
Looking through old emails in my INBOX, I came across a patch from Luis
Henriques that attempted to fix a race of two stat tracers registering the
same stat trace (extremely unlikely, as this is done in the kernel, and
probably doesn't even exist). The submitted patch wasn't quite right as it
needed to deal with clean up a bit better (if two stat tracers were the
same, it would have the same files).
But to make the code cleaner, all we needed to do is to keep the
all_stat_sessions_mutex held for most of the registering function.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 002bb86d8d42f ("tracing/ftrace: separate events tracing and stats tracing engine")
Reported-by: Luis Henriques <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/trace/trace_stat.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index bf68af63538b4..92b76f9e25edd 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -306,7 +306,7 @@ static int init_stat_file(struct stat_session *session)
int register_stat_tracer(struct tracer_stat *trace)
{
struct stat_session *session, *node;
- int ret;
+ int ret = -EINVAL;
if (!trace)
return -EINVAL;
@@ -317,17 +317,15 @@ int register_stat_tracer(struct tracer_stat *trace)
/* Already registered? */
mutex_lock(&all_stat_sessions_mutex);
list_for_each_entry(node, &all_stat_sessions, session_list) {
- if (node->ts == trace) {
- mutex_unlock(&all_stat_sessions_mutex);
- return -EINVAL;
- }
+ if (node->ts == trace)
+ goto out;
}
- mutex_unlock(&all_stat_sessions_mutex);
+ ret = -ENOMEM;
/* Init the session */
session = kzalloc(sizeof(*session), GFP_KERNEL);
if (!session)
- return -ENOMEM;
+ goto out;
session->ts = trace;
INIT_LIST_HEAD(&session->session_list);
@@ -336,15 +334,16 @@ int register_stat_tracer(struct tracer_stat *trace)
ret = init_stat_file(session);
if (ret) {
destroy_session(session);
- return ret;
+ goto out;
}
+ ret = 0;
/* Register */
- mutex_lock(&all_stat_sessions_mutex);
list_add_tail(&session->session_list, &all_stat_sessions);
+ out:
mutex_unlock(&all_stat_sessions_mutex);
- return 0;
+ return ret;
}
void unregister_stat_tracer(struct tracer_stat *trace)
--
2.20.1
From: J. Bruce Fields <[email protected]>
[ Upstream commit d781e3df710745fbbaee4eb07fd5b64331a1b175 ]
With cross-server COPY we've introduced the possibility that the current
or saved filehandle might not have fh_dentry/fh_export filled in, but we
missed a place that assumed it was. I think this could be triggered by
a compound like:
PUTFH(foreign filehandle)
GETATTR
SAVEFH
COPY
First, check_if_stalefh_allowed sets no_verify on the first (PUTFH) op.
Then op_func = nfsd4_putfh runs and leaves current_fh->fh_export NULL.
need_wrongsec_check returns true, since this PUTFH has OP_IS_PUTFH_LIKE
set and GETATTR does not have OP_HANDLES_WRONGSEC set.
We should probably also consider tightening the checks in
check_if_stalefh_allowed and double-checking that we don't assume the
filehandle is verified elsewhere in the compound. But I think this
fixes the immediate issue.
Reported-by: Dan Carpenter <[email protected]>
Fixes: 4e48f1cccab3 "NFSD: allow inter server COPY to have... "
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfsd/nfs4proc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index f35aa9f88b5ec..895123518fd42 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1789,7 +1789,8 @@ nfsd4_proc_compound(struct svc_rqst *rqstp)
if (op->opdesc->op_flags & OP_CLEAR_STATEID)
clear_current_stateid(cstate);
- if (need_wrongsec_check(rqstp))
+ if (current_fh->fh_export &&
+ need_wrongsec_check(rqstp))
op->status = check_nfsd_access(current_fh->fh_export, rqstp);
}
encode_op:
--
2.20.1
From: Zahari Petkov <[email protected]>
[ Upstream commit 697529091ac7a0a90ca349b914bb30641c13c753 ]
Before commit bb29b9cccd95 ("leds: pca963x: Add bindings to invert
polarity") Mode register 2 was initialized directly with either 0x01
or 0x05 for open-drain or totem pole (push-pull) configuration.
Afterwards, MODE2 initialization started using bitwise operations on
top of the default MODE2 register value (0x05). Using bitwise OR for
setting OUTDRV with 0x01 and 0x05 does not produce correct results.
When open-drain is used, instead of setting OUTDRV to 0, the driver
keeps it as 1:
Open-drain: 0x05 | 0x01 -> 0x05 (0b101 - incorrect)
Totem pole: 0x05 | 0x05 -> 0x05 (0b101 - correct but still wrong)
Now OUTDRV setting uses correct bitwise operations for initialization:
Open-drain: 0x05 & ~0x04 -> 0x01 (0b001 - correct)
Totem pole: 0x05 | 0x04 -> 0x05 (0b101 - correct)
Additional MODE2 register definitions are introduced now as well.
Fixes: bb29b9cccd95 ("leds: pca963x: Add bindings to invert polarity")
Signed-off-by: Zahari Petkov <[email protected]>
Signed-off-by: Pavel Machek <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/leds/leds-pca963x.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/leds/leds-pca963x.c b/drivers/leds/leds-pca963x.c
index 5c0908113e388..bbcde13b77f16 100644
--- a/drivers/leds/leds-pca963x.c
+++ b/drivers/leds/leds-pca963x.c
@@ -43,6 +43,8 @@
#define PCA963X_LED_PWM 0x2 /* Controlled through PWM */
#define PCA963X_LED_GRP_PWM 0x3 /* Controlled through PWM/GRPPWM */
+#define PCA963X_MODE2_OUTDRV 0x04 /* Open-drain or totem pole */
+#define PCA963X_MODE2_INVRT 0x10 /* Normal or inverted direction */
#define PCA963X_MODE2_DMBLNK 0x20 /* Enable blinking */
#define PCA963X_MODE1 0x00
@@ -462,12 +464,12 @@ static int pca963x_probe(struct i2c_client *client,
PCA963X_MODE2);
/* Configure output: open-drain or totem pole (push-pull) */
if (pdata->outdrv == PCA963X_OPEN_DRAIN)
- mode2 |= 0x01;
+ mode2 &= ~PCA963X_MODE2_OUTDRV;
else
- mode2 |= 0x05;
+ mode2 |= PCA963X_MODE2_OUTDRV;
/* Configure direction: normal or inverted */
if (pdata->dir == PCA963X_INVERTED)
- mode2 |= 0x10;
+ mode2 |= PCA963X_MODE2_INVRT;
i2c_smbus_write_byte_data(pca963x->chip->client, PCA963X_MODE2,
mode2);
}
--
2.20.1
From: Jia-Ju Bai <[email protected]>
[ Upstream commit bb6d42061a05d71dd73f620582d9e09c8fbf7f5b ]
The driver may sleep while holding a spinlock.
The function call path (from bottom to top) in Linux 4.19 is:
drivers/media/platform/sti/bdisp/bdisp-hw.c, 385:
msleep in bdisp_hw_reset
drivers/media/platform/sti/bdisp/bdisp-v4l2.c, 341:
bdisp_hw_reset in bdisp_device_run
drivers/media/platform/sti/bdisp/bdisp-v4l2.c, 317:
_raw_spin_lock_irqsave in bdisp_device_run
To fix this bug, msleep() is replaced with udelay().
This bug is found by a static analysis tool STCheck written by myself.
Signed-off-by: Jia-Ju Bai <[email protected]>
Reviewed-by: Fabien Dessenne <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/media/platform/sti/bdisp/bdisp-hw.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/media/platform/sti/bdisp/bdisp-hw.c b/drivers/media/platform/sti/bdisp/bdisp-hw.c
index 26d9fa7aeb5f2..d57f659d740a1 100644
--- a/drivers/media/platform/sti/bdisp/bdisp-hw.c
+++ b/drivers/media/platform/sti/bdisp/bdisp-hw.c
@@ -14,8 +14,8 @@
#define MAX_SRC_WIDTH 2048
/* Reset & boot poll config */
-#define POLL_RST_MAX 50
-#define POLL_RST_DELAY_MS 20
+#define POLL_RST_MAX 500
+#define POLL_RST_DELAY_MS 2
enum bdisp_target_plan {
BDISP_RGB,
@@ -382,7 +382,7 @@ int bdisp_hw_reset(struct bdisp_dev *bdisp)
for (i = 0; i < POLL_RST_MAX; i++) {
if (readl(bdisp->regs + BLT_STA1) & BLT_STA1_IDLE)
break;
- msleep(POLL_RST_DELAY_MS);
+ udelay(POLL_RST_DELAY_MS * 1000);
}
if (i == POLL_RST_MAX)
dev_err(bdisp->dev, "Reset timeout\n");
--
2.20.1
From: Dan Carpenter <[email protected]>
[ Upstream commit 216b44000ada87a63891a8214c347e05a4aea8fe ]
The brcmu_pkt_buf_free_skb() function frees "pkt" so it leads to a
static checker warning:
drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c:1974 brcmf_sdio_readframes()
error: dereferencing freed memory 'pkt'
It looks like there was supposed to be a continue after we free "pkt".
Fixes: 4754fceeb9a6 ("brcmfmac: streamline SDIO read frame routine")
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Franky Lin <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
index 5c3b62e619807..e0211321fe9e8 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
@@ -1934,6 +1934,7 @@ static uint brcmf_sdio_readframes(struct brcmf_sdio *bus, uint maxframes)
BRCMF_SDIO_FT_NORMAL)) {
rd->len = 0;
brcmu_pkt_buf_free_skb(pkt);
+ continue;
}
bus->sdcnt.rx_readahead_cnt++;
if (rd->len != roundup(rd_new.len, 16)) {
--
2.20.1
From: Rasmus Villemoes <[email protected]>
[ Upstream commit 148587a59f6b85831695e0497d9dd1af5f0495af ]
Qiang Zhao points out that these offsets get written to 16-bit
registers, and there are some QE platforms with more than 64K
muram. So it is possible that qe_muram_alloc() gives us an allocation
that can't actually be used by the hardware, so detect and reject
that.
Reported-by: Qiang Zhao <[email protected]>
Reviewed-by: Timur Tabi <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
Acked-by: David S. Miller <[email protected]>
Signed-off-by: Li Yang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wan/fsl_ucc_hdlc.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index daeab33f623e7..9ab04ef532f34 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -242,6 +242,11 @@ static int uhdlc_init(struct ucc_hdlc_private *priv)
ret = -ENOMEM;
goto free_riptr;
}
+ if (riptr != (u16)riptr || tiptr != (u16)tiptr) {
+ dev_err(priv->dev, "MURAM allocation out of addressable range\n");
+ ret = -ENOMEM;
+ goto free_tiptr;
+ }
/* Set RIPTR, TIPTR */
iowrite16be(riptr, &priv->ucc_pram->riptr);
--
2.20.1
From: Jia-Ju Bai <[email protected]>
[ Upstream commit 9c1ed62ae0690dfe5d5e31d8f70e70a95cb48e52 ]
The driver may sleep while holding a spinlock.
The function call path (from bottom to top) in Linux 4.19 is:
drivers/usb/gadget/udc/core.c, 1175:
kzalloc(GFP_KERNEL) in usb_add_gadget_udc_release
drivers/usb/gadget/udc/core.c, 1272:
usb_add_gadget_udc_release in usb_add_gadget_udc
drivers/usb/gadget/udc/gr_udc.c, 2186:
usb_add_gadget_udc in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 2183:
spin_lock in gr_probe
drivers/usb/gadget/udc/core.c, 1195:
mutex_lock in usb_add_gadget_udc_release
drivers/usb/gadget/udc/core.c, 1272:
usb_add_gadget_udc_release in usb_add_gadget_udc
drivers/usb/gadget/udc/gr_udc.c, 2186:
usb_add_gadget_udc in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 2183:
spin_lock in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 212:
debugfs_create_file in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 2197:
gr_dfs_create in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 2183:
spin_lock in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 2114:
devm_request_threaded_irq in gr_request_irq
drivers/usb/gadget/udc/gr_udc.c, 2202:
gr_request_irq in gr_probe
drivers/usb/gadget/udc/gr_udc.c, 2183:
spin_lock in gr_probe
kzalloc(GFP_KERNEL), mutex_lock(), debugfs_create_file() and
devm_request_threaded_irq() can sleep at runtime.
To fix these possible bugs, usb_add_gadget_udc(), gr_dfs_create() and
gr_request_irq() are called without handling the spinlock.
These bugs are found by a static analysis tool STCheck written by myself.
Signed-off-by: Jia-Ju Bai <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/gadget/udc/gr_udc.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/gadget/udc/gr_udc.c b/drivers/usb/gadget/udc/gr_udc.c
index 729e60e495641..e50108f9a374e 100644
--- a/drivers/usb/gadget/udc/gr_udc.c
+++ b/drivers/usb/gadget/udc/gr_udc.c
@@ -2180,8 +2180,6 @@ static int gr_probe(struct platform_device *pdev)
return -ENOMEM;
}
- spin_lock(&dev->lock);
-
/* Inside lock so that no gadget can use this udc until probe is done */
retval = usb_add_gadget_udc(dev->dev, &dev->gadget);
if (retval) {
@@ -2190,15 +2188,21 @@ static int gr_probe(struct platform_device *pdev)
}
dev->added = 1;
+ spin_lock(&dev->lock);
+
retval = gr_udc_init(dev);
- if (retval)
+ if (retval) {
+ spin_unlock(&dev->lock);
goto out;
-
- gr_dfs_create(dev);
+ }
/* Clear all interrupt enables that might be left on since last boot */
gr_disable_interrupts_and_pullup(dev);
+ spin_unlock(&dev->lock);
+
+ gr_dfs_create(dev);
+
retval = gr_request_irq(dev, dev->irq);
if (retval) {
dev_err(dev->dev, "Failed to request irq %d\n", dev->irq);
@@ -2227,8 +2231,6 @@ static int gr_probe(struct platform_device *pdev)
dev_info(dev->dev, "regs: %p, irq %d\n", dev->regs, dev->irq);
out:
- spin_unlock(&dev->lock);
-
if (retval)
gr_remove(pdev);
--
2.20.1
From: Mao Wenan <[email protected]>
[ Upstream commit 718eae277e62a26e5862eb72a830b5e0fe37b04a ]
Convert cpu_to_le16(le16_to_cpu(frame->datalen) + len) to
use le16_add_cpu(), which is more concise and does the same thing.
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Mao Wenan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nfc/port100.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nfc/port100.c b/drivers/nfc/port100.c
index 60ae382f50da9..06bb226c62ef4 100644
--- a/drivers/nfc/port100.c
+++ b/drivers/nfc/port100.c
@@ -574,7 +574,7 @@ static void port100_tx_update_payload_len(void *_frame, int len)
{
struct port100_frame *frame = _frame;
- frame->datalen = cpu_to_le16(le16_to_cpu(frame->datalen) + len);
+ le16_add_cpu(&frame->datalen, len);
}
static bool port100_rx_frame_is_valid(void *_frame)
--
2.20.1
From: Davide Caratti <[email protected]>
[ Upstream commit e2debf0852c4d66ba1a8bde12869b196094c70a7 ]
unlike other classifiers that can be offloaded (i.e. users can set flags
like 'skip_hw' and 'skip_sw'), 'cls_flower' doesn't validate the size of
netlink attribute 'TCA_FLOWER_FLAGS' provided by user: add a proper entry
to fl_policy.
Fixes: 5b33f48842fa ("net/flower: Introduce hardware offload support")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/cls_flower.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -486,6 +486,7 @@ static const struct nla_policy fl_policy
[TCA_FLOWER_KEY_ENC_IP_TTL_MASK] = { .type = NLA_U8 },
[TCA_FLOWER_KEY_ENC_OPTS] = { .type = NLA_NESTED },
[TCA_FLOWER_KEY_ENC_OPTS_MASK] = { .type = NLA_NESTED },
+ [TCA_FLOWER_FLAGS] = { .type = NLA_U32 },
};
static const struct nla_policy
From: Vincenzo Frascino <[email protected]>
[ Upstream commit bc420c6ceefbb86cbbc8c00061bd779c17fa6997 ]
Kmemleak relies on specific symbols to register the read only data
during init (e.g. __start_ro_after_init).
Trying to build an XIP kernel on arm results in the linking error
reported below because when this option is selected read only data
after init are not allowed since .data is read only (.rodata).
arm-linux-gnueabihf-ld: mm/kmemleak.o: in function `kmemleak_init':
kmemleak.c:(.init.text+0x148): undefined reference to `__end_ro_after_init'
arm-linux-gnueabihf-ld: kmemleak.c:(.init.text+0x14c):
undefined reference to `__end_ro_after_init'
arm-linux-gnueabihf-ld: kmemleak.c:(.init.text+0x150):
undefined reference to `__start_ro_after_init'
arm-linux-gnueabihf-ld: kmemleak.c:(.init.text+0x156):
undefined reference to `__start_ro_after_init'
arm-linux-gnueabihf-ld: kmemleak.c:(.init.text+0x162):
undefined reference to `__start_ro_after_init'
arm-linux-gnueabihf-ld: kmemleak.c:(.init.text+0x16a):
undefined reference to `__start_ro_after_init'
linux/Makefile:1078: recipe for target 'vmlinux' failed
Fix the issue enabling kmemleak only on non XIP kernels.
Signed-off-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 185e552f14610..9f03befdfbf06 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -61,7 +61,7 @@ config ARM
select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32
select HAVE_CONTEXT_TRACKING
select HAVE_C_RECORDMCOUNT
- select HAVE_DEBUG_KMEMLEAK
+ select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL
select HAVE_DMA_CONTIGUOUS if MMU
select HAVE_DYNAMIC_FTRACE if (!XIP_KERNEL) && !CPU_ENDIAN_BE32 && MMU
select HAVE_DYNAMIC_FTRACE_WITH_REGS if HAVE_DYNAMIC_FTRACE
--
2.20.1
From: Peter Zijlstra <[email protected]>
[ Upstream commit 45178ac0cea853fe0e405bf11e101bdebea57b15 ]
Paul reported a very sporadic, rcutorture induced, workqueue failure.
When the planets align, the workqueue rescuer's self-migrate fails and
then triggers a WARN for running a work on the wrong CPU.
Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call
could be ignored! When stopper->enabled is false, stop_machine will
insta complete the work, without actually doing the work. Worse, it
will not WARN about this (we really should fix this).
It turns out there is a small window where a freshly online'ed CPU is
marked 'online' but doesn't yet have the stopper task running:
BP AP
bringup_cpu()
__cpu_up(cpu, idle) --> start_secondary()
...
cpu_startup_entry()
bringup_wait_for_ap()
wait_for_ap_thread() <-- cpuhp_online_idle()
while (1)
do_idle()
... available to run kthreads ...
stop_machine_unpark()
stopper->enable = true;
Close this by moving the stop_machine_unpark() into
cpuhp_online_idle(), such that the stopper thread is ready before we
start the idle loop and schedule.
Reported-by: "Paul E. McKenney" <[email protected]>
Debugged-by: Tejun Heo <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: "Paul E. McKenney" <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/cpu.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 8d6b8b5493f9f..2d850eaaf82e3 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -493,8 +493,7 @@ static int bringup_wait_for_ap(unsigned int cpu)
if (WARN_ON_ONCE((!cpu_online(cpu))))
return -ECANCELED;
- /* Unpark the stopper thread and the hotplug thread of the target cpu */
- stop_machine_unpark(cpu);
+ /* Unpark the hotplug thread of the target cpu */
kthread_unpark(st->thread);
/*
@@ -1048,8 +1047,8 @@ void notify_cpu_starting(unsigned int cpu)
/*
* Called from the idle task. Wake up the controlling task which brings the
- * stopper and the hotplug thread of the upcoming CPU up and then delegates
- * the rest of the online bringup to the hotplug thread.
+ * hotplug thread of the upcoming CPU up and then delegates the rest of the
+ * online bringup to the hotplug thread.
*/
void cpuhp_online_idle(enum cpuhp_state state)
{
@@ -1059,6 +1058,12 @@ void cpuhp_online_idle(enum cpuhp_state state)
if (state != CPUHP_AP_ONLINE_IDLE)
return;
+ /*
+ * Unpart the stopper thread before we start the idle loop (and start
+ * scheduling); this ensures the stopper task is always available.
+ */
+ stop_machine_unpark(smp_processor_id());
+
st->state = CPUHP_AP_ONLINE_IDLE;
complete_ap_thread(st, true);
}
--
2.20.1
From: Andre Przywara <[email protected]>
[ Upstream commit 7aa9b9eb7d6a8fde7acbe0446444f7e3fae1fe3b ]
Add the Performance Monitoring Unit (PMU) device tree node to the H6
.dtsi, which tells DT users which interrupts are triggered by PMU
overflow events on each core. The numbers come from the manual and have
been checked in U-Boot and with perf in Linux.
Tested with perf record and taskset on a Pine H64.
Signed-off-by: Andre Przywara <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
index 72813e7aefb8a..bd43912696111 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi
@@ -69,6 +69,16 @@
clock-output-names = "osc32k";
};
+ pmu {
+ compatible = "arm,cortex-a53-pmu",
+ "arm,armv8-pmuv3";
+ interrupts = <GIC_SPI 140 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 141 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 142 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 143 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
+ };
+
psci {
compatible = "arm,psci-0.2";
method = "smc";
--
2.20.1
From: Andre Przywara <[email protected]>
[ Upstream commit 0388a110747bec0c9d9de995842bb2a03a26aae1 ]
Add the Performance Monitoring Unit (PMU) device tree node to the H3
.dtsi, which tells DT users which interrupts are triggered by PMU
overflow events on each core. The numbers come from the manual and have
been checked in U-Boot and with perf in Linux.
Tested with perf record and taskset on an OrangePi Zero.
Signed-off-by: Andre Przywara <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/boot/dts/sun8i-h3.dtsi | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/arm/boot/dts/sun8i-h3.dtsi b/arch/arm/boot/dts/sun8i-h3.dtsi
index 9233ba30a857c..11172fbdc03aa 100644
--- a/arch/arm/boot/dts/sun8i-h3.dtsi
+++ b/arch/arm/boot/dts/sun8i-h3.dtsi
@@ -80,7 +80,7 @@
#cooling-cells = <2>;
};
- cpu@1 {
+ cpu1: cpu@1 {
compatible = "arm,cortex-a7";
device_type = "cpu";
reg = <1>;
@@ -90,7 +90,7 @@
#cooling-cells = <2>;
};
- cpu@2 {
+ cpu2: cpu@2 {
compatible = "arm,cortex-a7";
device_type = "cpu";
reg = <2>;
@@ -100,7 +100,7 @@
#cooling-cells = <2>;
};
- cpu@3 {
+ cpu3: cpu@3 {
compatible = "arm,cortex-a7";
device_type = "cpu";
reg = <3>;
@@ -111,6 +111,15 @@
};
};
+ pmu {
+ compatible = "arm,cortex-a7-pmu";
+ interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-affinity = <&cpu0>, <&cpu1>, <&cpu2>, <&cpu3>;
+ };
+
timer {
compatible = "arm,armv7-timer";
interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_LEVEL_LOW)>,
--
2.20.1
From: Daniel Jordan <[email protected]>
[ Upstream commit 38228e8848cd7dd86ccb90406af32de0cad24be3 ]
lockdep complains when padata's paths to update cpumasks via CPU hotplug
and sysfs are both taken:
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
======================================================
WARNING: possible circular locking dependency detected
5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted
------------------------------------------------------
bash/205 is trying to acquire lock:
ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120
but task is already holding lock:
ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120
which lock already depends on the new lock.
padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent
order. Which should be first? CPU hotplug calls into padata with
cpu_hotplug_lock already held, so it should have priority.
Fixes: 6751fb3c0e0c ("padata: Use get_online_cpus/put_online_cpus")
Signed-off-by: Daniel Jordan <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Steffen Klassert <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/padata.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/padata.c b/kernel/padata.c
index cfab62923c452..c280cb153915f 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -671,8 +671,8 @@ int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
struct cpumask *serial_mask, *parallel_mask;
int err = -EINVAL;
- mutex_lock(&pinst->lock);
get_online_cpus();
+ mutex_lock(&pinst->lock);
switch (cpumask_type) {
case PADATA_CPU_PARALLEL:
@@ -690,8 +690,8 @@ int padata_set_cpumask(struct padata_instance *pinst, int cpumask_type,
err = __padata_set_cpumasks(pinst, parallel_mask, serial_mask);
out:
- put_online_cpus();
mutex_unlock(&pinst->lock);
+ put_online_cpus();
return err;
}
--
2.20.1
From: Oliver O'Halloran <[email protected]>
[ Upstream commit 965c94f309be58fbcc6c8d3e4f123376c5970d79 ]
An ioda_pe for each VF is allocated in pnv_pci_sriov_enable() before
the pci_dev for the VF is created. We need to set the pe->pdev pointer
at some point after the pci_dev is created. Currently we do that in:
pcibios_bus_add_device()
pnv_pci_dma_dev_setup() (via phb->ops.dma_dev_setup)
/* fixup is done here */
pnv_pci_ioda_dma_dev_setup() (via pnv_phb->dma_dev_setup)
The fixup needs to be done before setting up DMA for for the VF's PE,
but there's no real reason to delay it until this point. Move the
fixup into pnv_pci_ioda_fixup_iov() so the ordering is:
pcibios_add_device()
pnv_pci_ioda_fixup_iov() (via ppc_md.pcibios_fixup_sriov)
pcibios_bus_add_device()
...
This isn't strictly required, but it's a slightly more logical place
to do the fixup and it simplifies pnv_pci_dma_dev_setup().
Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/powerpc/platforms/powernv/pci-ioda.c | 29 +++++++++++++++++++----
arch/powerpc/platforms/powernv/pci.c | 14 -----------
2 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 28adfe4dd04c5..ecd211c5f24a5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3015,9 +3015,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
struct pci_dn *pdn;
int mul, total_vfs;
- if (!pdev->is_physfn || pci_dev_is_added(pdev))
- return;
-
pdn = pci_get_pdn(pdev);
pdn->vfs_expanded = 0;
pdn->m64_single_mode = false;
@@ -3092,6 +3089,30 @@ truncate_iov:
res->end = res->start - 1;
}
}
+
+static void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
+{
+ if (WARN_ON(pci_dev_is_added(pdev)))
+ return;
+
+ if (pdev->is_virtfn) {
+ struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
+
+ /*
+ * VF PEs are single-device PEs so their pdev pointer needs to
+ * be set. The pdev doesn't exist when the PE is allocated (in
+ * (pcibios_sriov_enable()) so we fix it up here.
+ */
+ pe->pdev = pdev;
+ WARN_ON(!(pe->flags & PNV_IODA_PE_VF));
+ } else if (pdev->is_physfn) {
+ /*
+ * For PFs adjust their allocated IOV resources to match what
+ * the PHB can support using it's M64 BAR table.
+ */
+ pnv_pci_ioda_fixup_iov_resources(pdev);
+ }
+}
#endif /* CONFIG_PCI_IOV */
static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
@@ -3985,7 +4006,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
ppc_md.pcibios_default_alignment = pnv_pci_default_alignment;
#ifdef CONFIG_PCI_IOV
- ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov_resources;
+ ppc_md.pcibios_fixup_sriov = pnv_pci_ioda_fixup_iov;
ppc_md.pcibios_iov_resource_alignment = pnv_pci_iov_resource_alignment;
ppc_md.pcibios_sriov_enable = pnv_pcibios_sriov_enable;
ppc_md.pcibios_sriov_disable = pnv_pcibios_sriov_disable;
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index aa95b8e0f66ad..b6fa900af5da5 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -820,20 +820,6 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
{
struct pci_controller *hose = pci_bus_to_host(pdev->bus);
struct pnv_phb *phb = hose->private_data;
-#ifdef CONFIG_PCI_IOV
- struct pnv_ioda_pe *pe;
-
- /* Fix the VF pdn PE number */
- if (pdev->is_virtfn) {
- list_for_each_entry(pe, &phb->ioda.pe_list, list) {
- if (pe->rid == ((pdev->bus->number << 8) |
- (pdev->devfn & 0xff))) {
- pe->pdev = pdev;
- break;
- }
- }
- }
-#endif /* CONFIG_PCI_IOV */
if (phb && phb->dma_dev_setup)
phb->dma_dev_setup(phb, pdev);
--
2.20.1
From: Harry Wentland <[email protected]>
[ Upstream commit 3eb6d7aca53d81ce888624f09cd44dc0302161e8 ]
[WHY]
Two years ago the patch referenced by the Fixes tag stopped running
dp_verify_link_cap_with_retries during DP detection when the reason
for the detection was a short-pulse interrupt. This effectively meant
that we were no longer doing the verify_link_cap training on active
dongles when their SINK_COUNT changed from 0 to 1.
A year ago this was partly remedied with:
commit 80adaebd2d41 ("drm/amd/display: Don't skip link training for empty dongle")
This made sure that we trained the dongle on initial hotplug (without
connected downstream devices).
This is all fine and dandy if it weren't for the fact that there are
some dongles on the market that don't like link training when SINK_COUNT
is 0 These dongles will in fact indicate a SINK_COUNT of 0 immediately
after hotplug, even when a downstream device is connected, and then
trigger a shortpulse interrupt indicating a SINK_COUNT change to 1.
In order to play nicely we will need our policy to not link train an
active DP dongle when SINK_COUNT is 0 but ensure we train it when the
SINK_COUNT changes to 1.
[HOW]
Call dp_verify_link_cap_with_retries on detection even when the detection
is triggered from a short pulse interrupt.
With this change we can also revert this commit which we'll do in a separate
follow-up change:
commit 80adaebd2d41 ("drm/amd/display: Don't skip link training for empty dongle")
Fixes: 0301ccbaf67d ("drm/amd/display: DP Compliance 400.1.1 failure")
Suggested-by: Louis Li <[email protected]>
Tested-by: Louis Li <[email protected]>
Cc: Wenjing Liu <[email protected]>
Cc: Hersen Wu <[email protected]>
Cc: Eric Yang <[email protected]>
Reviewed-by: Wenjing Liu <[email protected]>
Signed-off-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/display/dc/core/dc_link.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
index 2f42964fb9f45..3abc0294c05f5 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
@@ -780,8 +780,7 @@ bool dc_link_detect(struct dc_link *link, enum dc_detect_reason reason)
same_edid = is_same_edid(&prev_sink->dc_edid, &sink->dc_edid);
if (link->connector_signal == SIGNAL_TYPE_DISPLAY_PORT &&
- sink_caps.transaction_type == DDC_TRANSACTION_TYPE_I2C_OVER_AUX &&
- reason != DETECT_REASON_HPDRX) {
+ sink_caps.transaction_type == DDC_TRANSACTION_TYPE_I2C_OVER_AUX) {
/*
* TODO debug why Dell 2413 doesn't like
* two link trainings
--
2.20.1
From: Niklas Schnelle <[email protected]>
[ Upstream commit 576c75e36c689bec6a940e807bae27291ab0c0de ]
With zpci_disable() working, lockdep detected a potential deadlock
(lockdep output at the end).
The deadlock is between recovering a PCI function via the
/sys/bus/pci/devices/<dev>/recover
attribute vs powering it off via
/sys/bus/pci/slots/<slot>/power.
The fix is analogous to the changes in commit 0ee223b2e1f6 ("scsi: core:
Avoid that SCSI device removal through sysfs triggers a deadlock")
that fixed a potential deadlock on removing a SCSI device via sysfs.
[ 204.830107] ======================================================
[ 204.830109] WARNING: possible circular locking dependency detected
[ 204.830111] 5.5.0-rc2-06072-gbc03ecc9a672 #6 Tainted: G W
[ 204.830112] ------------------------------------------------------
[ 204.830113] bash/1034 is trying to acquire lock:
[ 204.830115] 0000000192a1a610 (kn->count#200){++++}, at: kernfs_remove_by_name_ns+0x5c/0xa8
[ 204.830122]
but task is already holding lock:
[ 204.830123] 00000000c16134a8 (pci_rescan_remove_lock){+.+.}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48
[ 204.830128]
which lock already depends on the new lock.
[ 204.830129]
the existing dependency chain (in reverse order) is:
[ 204.830130]
-> #1 (pci_rescan_remove_lock){+.+.}:
[ 204.830134] validate_chain+0x93a/0xd08
[ 204.830136] __lock_acquire+0x4ae/0x9d0
[ 204.830137] lock_acquire+0x114/0x280
[ 204.830140] __mutex_lock+0xa2/0x960
[ 204.830142] mutex_lock_nested+0x32/0x40
[ 204.830145] recover_store+0x4c/0xa8
[ 204.830147] kernfs_fop_write+0xe6/0x218
[ 204.830151] vfs_write+0xb0/0x1b8
[ 204.830152] ksys_write+0x6c/0xf8
[ 204.830154] system_call+0xd8/0x2d8
[ 204.830155]
-> #0 (kn->count#200){++++}:
[ 204.830187] check_noncircular+0x1e6/0x240
[ 204.830189] check_prev_add+0xfc/0xdb0
[ 204.830190] validate_chain+0x93a/0xd08
[ 204.830192] __lock_acquire+0x4ae/0x9d0
[ 204.830193] lock_acquire+0x114/0x280
[ 204.830194] __kernfs_remove.part.0+0x2e4/0x360
[ 204.830196] kernfs_remove_by_name_ns+0x5c/0xa8
[ 204.830198] remove_files.isra.0+0x4c/0x98
[ 204.830199] sysfs_remove_group+0x66/0xc8
[ 204.830201] sysfs_remove_groups+0x46/0x68
[ 204.830204] device_remove_attrs+0x52/0x90
[ 204.830207] device_del+0x182/0x418
[ 204.830208] pci_remove_bus_device+0x8a/0x130
[ 204.830210] pci_stop_and_remove_bus_device_locked+0x3a/0x48
[ 204.830212] disable_slot+0x68/0x100
[ 204.830213] power_write_file+0x7c/0x130
[ 204.830215] kernfs_fop_write+0xe6/0x218
[ 204.830217] vfs_write+0xb0/0x1b8
[ 204.830218] ksys_write+0x6c/0xf8
[ 204.830220] system_call+0xd8/0x2d8
[ 204.830221]
other info that might help us debug this:
[ 204.830223] Possible unsafe locking scenario:
[ 204.830224] CPU0 CPU1
[ 204.830225] ---- ----
[ 204.830226] lock(pci_rescan_remove_lock);
[ 204.830227] lock(kn->count#200);
[ 204.830229] lock(pci_rescan_remove_lock);
[ 204.830231] lock(kn->count#200);
[ 204.830233]
*** DEADLOCK ***
[ 204.830234] 4 locks held by bash/1034:
[ 204.830235] #0: 00000001b6fbc498 (sb_writers#4){.+.+}, at: vfs_write+0x158/0x1b8
[ 204.830239] #1: 000000018c9f5090 (&of->mutex){+.+.}, at: kernfs_fop_write+0xaa/0x218
[ 204.830242] #2: 00000001f7da0810 (kn->count#235){.+.+}, at: kernfs_fop_write+0xb6/0x218
[ 204.830245] #3: 00000000c16134a8 (pci_rescan_remove_lock){+.+.}, at: pci_stop_and_remove_bus_device_locked+0x26/0x48
[ 204.830248]
stack backtrace:
[ 204.830250] CPU: 2 PID: 1034 Comm: bash Tainted: G W 5.5.0-rc2-06072-gbc03ecc9a672 #6
[ 204.830252] Hardware name: IBM 8561 T01 703 (LPAR)
[ 204.830253] Call Trace:
[ 204.830257] [<00000000c05e10c0>] show_stack+0x88/0xf0
[ 204.830260] [<00000000c112dca4>] dump_stack+0xa4/0xe0
[ 204.830261] [<00000000c0694c06>] check_noncircular+0x1e6/0x240
[ 204.830263] [<00000000c0695bec>] check_prev_add+0xfc/0xdb0
[ 204.830264] [<00000000c06971da>] validate_chain+0x93a/0xd08
[ 204.830266] [<00000000c06994c6>] __lock_acquire+0x4ae/0x9d0
[ 204.830267] [<00000000c069867c>] lock_acquire+0x114/0x280
[ 204.830269] [<00000000c09ca15c>] __kernfs_remove.part.0+0x2e4/0x360
[ 204.830270] [<00000000c09cb5c4>] kernfs_remove_by_name_ns+0x5c/0xa8
[ 204.830272] [<00000000c09cee14>] remove_files.isra.0+0x4c/0x98
[ 204.830274] [<00000000c09cf2ae>] sysfs_remove_group+0x66/0xc8
[ 204.830276] [<00000000c09cf356>] sysfs_remove_groups+0x46/0x68
[ 204.830278] [<00000000c0e3dfe2>] device_remove_attrs+0x52/0x90
[ 204.830280] [<00000000c0e40382>] device_del+0x182/0x418
[ 204.830281] [<00000000c0dcfd7a>] pci_remove_bus_device+0x8a/0x130
[ 204.830283] [<00000000c0dcfe92>] pci_stop_and_remove_bus_device_locked+0x3a/0x48
[ 204.830285] [<00000000c0de7190>] disable_slot+0x68/0x100
[ 204.830286] [<00000000c0de6514>] power_write_file+0x7c/0x130
[ 204.830288] [<00000000c09cc846>] kernfs_fop_write+0xe6/0x218
[ 204.830290] [<00000000c08f3480>] vfs_write+0xb0/0x1b8
[ 204.830291] [<00000000c08f378c>] ksys_write+0x6c/0xf8
[ 204.830293] [<00000000c1154374>] system_call+0xd8/0x2d8
[ 204.830294] INFO: lockdep is turned off.
Signed-off-by: Niklas Schnelle <[email protected]>
Reviewed-by: Peter Oberparleiter <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/pci/pci_sysfs.c | 63 ++++++++++++++++++++++++++-------------
1 file changed, 42 insertions(+), 21 deletions(-)
diff --git a/arch/s390/pci/pci_sysfs.c b/arch/s390/pci/pci_sysfs.c
index 430c14b006d17..0e11fc023fe78 100644
--- a/arch/s390/pci/pci_sysfs.c
+++ b/arch/s390/pci/pci_sysfs.c
@@ -13,6 +13,8 @@
#include <linux/stat.h>
#include <linux/pci.h>
+#include "../../../drivers/pci/pci.h"
+
#include <asm/sclp.h>
#define zpci_attr(name, fmt, member) \
@@ -40,31 +42,50 @@ zpci_attr(segment3, "0x%02x\n", pfip[3]);
static ssize_t recover_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
{
+ struct kernfs_node *kn;
struct pci_dev *pdev = to_pci_dev(dev);
struct zpci_dev *zdev = to_zpci(pdev);
- int ret;
-
- if (!device_remove_file_self(dev, attr))
- return count;
-
+ int ret = 0;
+
+ /* Can't use device_remove_self() here as that would lead us to lock
+ * the pci_rescan_remove_lock while holding the device' kernfs lock.
+ * This would create a possible deadlock with disable_slot() which is
+ * not directly protected by the device' kernfs lock but takes it
+ * during the device removal which happens under
+ * pci_rescan_remove_lock.
+ *
+ * This is analogous to sdev_store_delete() in
+ * drivers/scsi/scsi_sysfs.c
+ */
+ kn = sysfs_break_active_protection(&dev->kobj, &attr->attr);
+ WARN_ON_ONCE(!kn);
+ /* device_remove_file() serializes concurrent calls ignoring all but
+ * the first
+ */
+ device_remove_file(dev, attr);
+
+ /* A concurrent call to recover_store() may slip between
+ * sysfs_break_active_protection() and the sysfs file removal.
+ * Once it unblocks from pci_lock_rescan_remove() the original pdev
+ * will already be removed.
+ */
pci_lock_rescan_remove();
- pci_stop_and_remove_bus_device(pdev);
- ret = zpci_disable_device(zdev);
- if (ret)
- goto error;
-
- ret = zpci_enable_device(zdev);
- if (ret)
- goto error;
-
- pci_rescan_bus(zdev->bus);
+ if (pci_dev_is_added(pdev)) {
+ pci_stop_and_remove_bus_device(pdev);
+ ret = zpci_disable_device(zdev);
+ if (ret)
+ goto out;
+
+ ret = zpci_enable_device(zdev);
+ if (ret)
+ goto out;
+ pci_rescan_bus(zdev->bus);
+ }
+out:
pci_unlock_rescan_remove();
-
- return count;
-
-error:
- pci_unlock_rescan_remove();
- return ret;
+ if (kn)
+ sysfs_unbreak_active_protection(kn);
+ return ret ? ret : count;
}
static DEVICE_ATTR_WO(recover);
--
2.20.1
From: Wei Liu <[email protected]>
[ Upstream commit 574f29036fce385e28617547955dd6911d375025 ]
Previously quirk_paxc_bridge() was applied when the iproc driver was
built-in, but not when it was compiled as a module.
This happened because it was under #ifdef CONFIG_PCIE_IPROC_PLATFORM:
PCIE_IPROC_PLATFORM=y causes CONFIG_PCIE_IPROC_PLATFORM to be defined, but
PCIE_IPROC_PLATFORM=m causes CONFIG_PCIE_IPROC_PLATFORM_MODULE to be
defined.
Move quirk_paxc_bridge() to pcie-iproc.c and drop the #ifdef so the quirk
is always applied, whether iproc is built-in or a module.
[bhelgaas: commit log, move to pcie-iproc.c, not pcie-iproc-platform.c]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wei Liu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pci/controller/pcie-iproc.c | 24 ++++++++++++++++++++++++
drivers/pci/quirks.c | 26 --------------------------
2 files changed, 24 insertions(+), 26 deletions(-)
diff --git a/drivers/pci/controller/pcie-iproc.c b/drivers/pci/controller/pcie-iproc.c
index 9d5cbc75d5ae0..ec86414216f97 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -1526,6 +1526,30 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd802,
DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd804,
quirk_paxc_disable_msi_parsing);
+static void quirk_paxc_bridge(struct pci_dev *pdev)
+{
+ /*
+ * The PCI config space is shared with the PAXC root port and the first
+ * Ethernet device. So, we need to workaround this by telling the PCI
+ * code that the bridge is not an Ethernet device.
+ */
+ if (pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
+ pdev->class = PCI_CLASS_BRIDGE_PCI << 8;
+
+ /*
+ * MPSS is not being set properly (as it is currently 0). This is
+ * because that area of the PCI config space is hard coded to zero, and
+ * is not modifiable by firmware. Set this to 2 (e.g., 512 byte MPS)
+ * so that the MPS can be set to the real max value.
+ */
+ pdev->pcie_mpss = 2;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16cd, quirk_paxc_bridge);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16f0, quirk_paxc_bridge);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd750, quirk_paxc_bridge);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd802, quirk_paxc_bridge);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd804, quirk_paxc_bridge);
+
MODULE_AUTHOR("Ray Jui <[email protected]>");
MODULE_DESCRIPTION("Broadcom iPROC PCIe common driver");
MODULE_LICENSE("GPL v2");
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 5b4c36ab15962..84f10cda539ea 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2358,32 +2358,6 @@ DECLARE_PCI_FIXUP_ENABLE(PCI_VENDOR_ID_BROADCOM,
PCI_DEVICE_ID_TIGON3_5719,
quirk_brcm_5719_limit_mrrs);
-#ifdef CONFIG_PCIE_IPROC_PLATFORM
-static void quirk_paxc_bridge(struct pci_dev *pdev)
-{
- /*
- * The PCI config space is shared with the PAXC root port and the first
- * Ethernet device. So, we need to workaround this by telling the PCI
- * code that the bridge is not an Ethernet device.
- */
- if (pdev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
- pdev->class = PCI_CLASS_BRIDGE_PCI << 8;
-
- /*
- * MPSS is not being set properly (as it is currently 0). This is
- * because that area of the PCI config space is hard coded to zero, and
- * is not modifiable by firmware. Set this to 2 (e.g., 512 byte MPS)
- * so that the MPS can be set to the real max value.
- */
- pdev->pcie_mpss = 2;
-}
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16cd, quirk_paxc_bridge);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0x16f0, quirk_paxc_bridge);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd750, quirk_paxc_bridge);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd802, quirk_paxc_bridge);
-DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_BROADCOM, 0xd804, quirk_paxc_bridge);
-#endif
-
/*
* Originally in EDAC sources for i82875P: Intel tells BIOS developers to
* hide device 6 which configures the overflow device access containing the
--
2.20.1
From: Forest Crossman <[email protected]>
[ Upstream commit dc4cac67e13515835ed8081d510aa507aacb013b ]
The AVerMedia CE310B is a simple composite + S-Video + stereo audio
capture card, and uses only the CX23888 to perform all of these
functions.
I've tested both video inputs and the audio interface and confirmed that
they're all working. However, there are some issues:
* Sometimes when I switch inputs the video signal turns black and can't
be recovered until the system is rebooted. I haven't been able to
determine the cause of this behavior, nor have I found a solution to
fix it or any workarounds other than rebooting.
* The card sometimes seems to have trouble syncing to the video signal,
and some of the VBI data appears as noise at the top of the frame, but
I assume that to be a result of my very noisy RF environment and the
card's unshielded input traces rather than a configuration issue.
Signed-off-by: Forest Crossman <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/media/pci/cx23885/cx23885-cards.c | 24 +++++++++++++++++++++++
drivers/media/pci/cx23885/cx23885-video.c | 3 ++-
drivers/media/pci/cx23885/cx23885.h | 1 +
3 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/drivers/media/pci/cx23885/cx23885-cards.c b/drivers/media/pci/cx23885/cx23885-cards.c
index ed3210dc50bc2..642aefdbb7bb6 100644
--- a/drivers/media/pci/cx23885/cx23885-cards.c
+++ b/drivers/media/pci/cx23885/cx23885-cards.c
@@ -811,6 +811,25 @@ struct cx23885_board cx23885_boards[] = {
.name = "Hauppauge WinTV-Starburst2",
.portb = CX23885_MPEG_DVB,
},
+ [CX23885_BOARD_AVERMEDIA_CE310B] = {
+ .name = "AVerMedia CE310B",
+ .porta = CX23885_ANALOG_VIDEO,
+ .force_bff = 1,
+ .input = {{
+ .type = CX23885_VMUX_COMPOSITE1,
+ .vmux = CX25840_VIN1_CH1 |
+ CX25840_NONE_CH2 |
+ CX25840_NONE0_CH3,
+ .amux = CX25840_AUDIO7,
+ }, {
+ .type = CX23885_VMUX_SVIDEO,
+ .vmux = CX25840_VIN8_CH1 |
+ CX25840_NONE_CH2 |
+ CX25840_VIN7_CH3 |
+ CX25840_SVIDEO_ON,
+ .amux = CX25840_AUDIO7,
+ } },
+ },
};
const unsigned int cx23885_bcount = ARRAY_SIZE(cx23885_boards);
@@ -1134,6 +1153,10 @@ struct cx23885_subid cx23885_subids[] = {
.subvendor = 0x0070,
.subdevice = 0xf02a,
.card = CX23885_BOARD_HAUPPAUGE_STARBURST2,
+ }, {
+ .subvendor = 0x1461,
+ .subdevice = 0x3100,
+ .card = CX23885_BOARD_AVERMEDIA_CE310B,
},
};
const unsigned int cx23885_idcount = ARRAY_SIZE(cx23885_subids);
@@ -2358,6 +2381,7 @@ void cx23885_card_setup(struct cx23885_dev *dev)
case CX23885_BOARD_DVBSKY_T982:
case CX23885_BOARD_VIEWCAST_260E:
case CX23885_BOARD_VIEWCAST_460E:
+ case CX23885_BOARD_AVERMEDIA_CE310B:
dev->sd_cx25840 = v4l2_i2c_new_subdev(&dev->v4l2_dev,
&dev->i2c_bus[2].i2c_adap,
"cx25840", 0x88 >> 1, NULL);
diff --git a/drivers/media/pci/cx23885/cx23885-video.c b/drivers/media/pci/cx23885/cx23885-video.c
index f8a3deadc77a1..2a20c7165e1e8 100644
--- a/drivers/media/pci/cx23885/cx23885-video.c
+++ b/drivers/media/pci/cx23885/cx23885-video.c
@@ -268,7 +268,8 @@ static int cx23885_video_mux(struct cx23885_dev *dev, unsigned int input)
(dev->board == CX23885_BOARD_MYGICA_X8507) ||
(dev->board == CX23885_BOARD_AVERMEDIA_HC81R) ||
(dev->board == CX23885_BOARD_VIEWCAST_260E) ||
- (dev->board == CX23885_BOARD_VIEWCAST_460E)) {
+ (dev->board == CX23885_BOARD_VIEWCAST_460E) ||
+ (dev->board == CX23885_BOARD_AVERMEDIA_CE310B)) {
/* Configure audio routing */
v4l2_subdev_call(dev->sd_cx25840, audio, s_routing,
INPUT(input)->amux, 0, 0);
diff --git a/drivers/media/pci/cx23885/cx23885.h b/drivers/media/pci/cx23885/cx23885.h
index cf965efabe666..7bbd62cc993ef 100644
--- a/drivers/media/pci/cx23885/cx23885.h
+++ b/drivers/media/pci/cx23885/cx23885.h
@@ -111,6 +111,7 @@
#define CX23885_BOARD_HAUPPAUGE_STARBURST2 59
#define CX23885_BOARD_HAUPPAUGE_QUADHD_DVB_885 60
#define CX23885_BOARD_HAUPPAUGE_QUADHD_ATSC_885 61
+#define CX23885_BOARD_AVERMEDIA_CE310B 62
#define GPIO_0 0x00000001
#define GPIO_1 0x00000002
--
2.20.1
From: Daniel Drake <[email protected]>
[ Upstream commit 62fe23df067715a21c4aef44068efe7ceaa8f627 ]
Separate the D3 delay increase functionality out of quirk_radeon_pm() into
its own function so that it can be shared with other quirks, including the
AMD Ryzen XHCI quirk that will be introduced in a followup commit.
Tweak the function name and message to indicate more clearly that the delay
relates to a D3hot-to-D0 transition.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Daniel Drake <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pci/quirks.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 84f10cda539ea..798d46a266037 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1848,16 +1848,21 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x2609, quirk_intel_pcie_pm);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x260a, quirk_intel_pcie_pm);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x260b, quirk_intel_pcie_pm);
+static void quirk_d3hot_delay(struct pci_dev *dev, unsigned int delay)
+{
+ if (dev->d3_delay >= delay)
+ return;
+
+ dev->d3_delay = delay;
+ pci_info(dev, "extending delay after power-on from D3hot to %d msec\n",
+ dev->d3_delay);
+}
+
static void quirk_radeon_pm(struct pci_dev *dev)
{
if (dev->subsystem_vendor == PCI_VENDOR_ID_APPLE &&
- dev->subsystem_device == 0x00e2) {
- if (dev->d3_delay < 20) {
- dev->d3_delay = 20;
- pci_info(dev, "extending delay after power-on from D3 to %d msec\n",
- dev->d3_delay);
- }
- }
+ dev->subsystem_device == 0x00e2)
+ quirk_d3hot_delay(dev, 20);
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6741, quirk_radeon_pm);
--
2.20.1
From: Manu Gautam <[email protected]>
[ Upstream commit d026c96b25b7ce5df89526aad2df988d553edb4d ]
QUSB2 PHY on msm8996 doesn't work well when autosuspend by
dwc3 core using USB2PHYCFG register is enabled. One of the
issue seen is that PHY driver reports PLL lock failure and
fails phy_init() if dwc3 core has USB2 PHY suspend enabled.
Fix this by using quirks to disable USB2 PHY LPM/suspend and
dwc3 core already takes care of explicitly suspending PHY
during suspend if quirks are specified.
Signed-off-by: Manu Gautam <[email protected]>
Signed-off-by: Paolo Pisati <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm64/boot/dts/qcom/msm8996.dtsi | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
index 8c86c41a0d25f..3e7baabf64507 100644
--- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
@@ -918,6 +918,8 @@
interrupts = <0 138 IRQ_TYPE_LEVEL_HIGH>;
phys = <&hsusb_phy2>;
phy-names = "usb2-phy";
+ snps,dis_u2_susphy_quirk;
+ snps,dis_enblslpm_quirk;
};
};
@@ -947,6 +949,8 @@
interrupts = <0 131 IRQ_TYPE_LEVEL_HIGH>;
phys = <&hsusb_phy1>, <&ssusb_phy_0>;
phy-names = "usb2-phy", "usb3-phy";
+ snps,dis_u2_susphy_quirk;
+ snps,dis_enblslpm_quirk;
};
};
--
2.20.1
From: Andrey Smirnov <[email protected]>
[ Upstream commit 6bb1e09c4c375db29770444f689f35f5cbe696bc ]
Cabling used to connect devices to USBH1 on RDU2 does not meet USB
spec cable quality and cable length requirements to operate at High
Speed, so limit the port to Full Speed only.
Reported-by: Chris Healy <[email protected]>
Reviewed-by: Chris Healy <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
Signed-off-by: Andrey Smirnov <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: Lucas Stach <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Shawn Guo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi b/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
index 56d6e82b75337..bc5f2de02d433 100644
--- a/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
@@ -804,6 +804,7 @@
&usbh1 {
vbus-supply = <®_5p0v_main>;
disable-over-current;
+ maximum-speed = "full-speed";
status = "okay";
};
--
2.20.1
From: Paul Moore <[email protected]>
[ Upstream commit d8db60cb23e49a92cf8cada3297395c7fa50fdf8 ]
Fix avc_insert() to call avc_node_kill() if we've already allocated
an AVC node and the code fails to insert the node in the cache.
Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls")
Reported-by: [email protected]
Suggested-by: Stephen Smalley <[email protected]>
Acked-by: Stephen Smalley <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
security/selinux/avc.c | 51 ++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 27 deletions(-)
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 0622cae510461..83eef39c8a799 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -689,40 +689,37 @@ static struct avc_node *avc_insert(struct selinux_avc *avc,
struct avc_node *pos, *node = NULL;
int hvalue;
unsigned long flag;
+ spinlock_t *lock;
+ struct hlist_head *head;
if (avc_latest_notif_update(avc, avd->seqno, 1))
- goto out;
+ return NULL;
node = avc_alloc_node(avc);
- if (node) {
- struct hlist_head *head;
- spinlock_t *lock;
- int rc = 0;
-
- hvalue = avc_hash(ssid, tsid, tclass);
- avc_node_populate(node, ssid, tsid, tclass, avd);
- rc = avc_xperms_populate(node, xp_node);
- if (rc) {
- kmem_cache_free(avc_node_cachep, node);
- return NULL;
- }
- head = &avc->avc_cache.slots[hvalue];
- lock = &avc->avc_cache.slots_lock[hvalue];
+ if (!node)
+ return NULL;
- spin_lock_irqsave(lock, flag);
- hlist_for_each_entry(pos, head, list) {
- if (pos->ae.ssid == ssid &&
- pos->ae.tsid == tsid &&
- pos->ae.tclass == tclass) {
- avc_node_replace(avc, node, pos);
- goto found;
- }
+ avc_node_populate(node, ssid, tsid, tclass, avd);
+ if (avc_xperms_populate(node, xp_node)) {
+ avc_node_kill(avc, node);
+ return NULL;
+ }
+
+ hvalue = avc_hash(ssid, tsid, tclass);
+ head = &avc->avc_cache.slots[hvalue];
+ lock = &avc->avc_cache.slots_lock[hvalue];
+ spin_lock_irqsave(lock, flag);
+ hlist_for_each_entry(pos, head, list) {
+ if (pos->ae.ssid == ssid &&
+ pos->ae.tsid == tsid &&
+ pos->ae.tclass == tclass) {
+ avc_node_replace(avc, node, pos);
+ goto found;
}
- hlist_add_head_rcu(&node->list, head);
-found:
- spin_unlock_irqrestore(lock, flag);
}
-out:
+ hlist_add_head_rcu(&node->list, head);
+found:
+ spin_unlock_irqrestore(lock, flag);
return node;
}
--
2.20.1
From: Rakesh Pillai <[email protected]>
[ Upstream commit 6ba8b3b6bd772f575f7736c8fd893c6981fcce16 ]
The management packets, send to firmware via WMI, are
mapped using the direction DMA_TO_DEVICE. Currently in
case of wmi cleanup, these buffers are being unmapped
using an incorrect DMA direction. This can cause unwanted
behavior when the host driver is handling a restart
of the wlan firmware.
We might see a trace like below
[<ffffff8008098b18>] __dma_inv_area+0x28/0x58
[<ffffff8001176734>] ath10k_wmi_mgmt_tx_clean_up_pending+0x60/0xb0 [ath10k_core]
[<ffffff80088c7c50>] idr_for_each+0x78/0xe4
[<ffffff80011766a4>] ath10k_wmi_detach+0x4c/0x7c [ath10k_core]
[<ffffff8001163d7c>] ath10k_core_stop+0x58/0x68 [ath10k_core]
[<ffffff800114fb74>] ath10k_halt+0xec/0x13c [ath10k_core]
[<ffffff8001165110>] ath10k_core_restart+0x11c/0x1a8 [ath10k_core]
[<ffffff80080c36bc>] process_one_work+0x16c/0x31c
Fix the incorrect DMA direction during the wmi
management tx buffer cleanup.
Tested HW: WCN3990
Tested FW: WLAN.HL.3.1-00784-QCAHLSWMTPLZ-1
Fixes: dc405152bb6 ("ath10k: handle mgmt tx completion event")
Signed-off-by: Rakesh Pillai <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/ath/ath10k/wmi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 0f6ff7a78e49d..3372dfa0deccf 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -9193,7 +9193,7 @@ static int ath10k_wmi_mgmt_tx_clean_up_pending(int msdu_id, void *ptr,
msdu = pkt_addr->vaddr;
dma_unmap_single(ar->dev, pkt_addr->paddr,
- msdu->len, DMA_FROM_DEVICE);
+ msdu->len, DMA_TO_DEVICE);
ieee80211_free_txskb(ar->hw, msdu);
return 0;
--
2.20.1
From: Sun Ke <[email protected]>
[ Upstream commit 5c0dd228b5fc30a3b732c7ae2657e0161ec7ed80 ]
When kzalloc fail, may cause trying to destroy the
workqueue from inside the workqueue.
If num_connections is m (2 < m), and NO.1 ~ NO.n
(1 < n < m) kzalloc are successful. The NO.(n + 1)
failed. Then, nbd_start_device will return ENOMEM
to nbd_start_device_ioctl, and nbd_start_device_ioctl
will return immediately without running flush_workqueue.
However, we still have n recv threads. If nbd_release
run first, recv threads may have to drop the last
config_refs and try to destroy the workqueue from
inside the workqueue.
To fix it, add a flush_workqueue in nbd_start_device.
Fixes: e9e006f5fcf2 ("nbd: fix max number of supported devs")
Signed-off-by: Sun Ke <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/block/nbd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b9d321bdaa8a7..226103af30f05 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1216,6 +1216,16 @@ static int nbd_start_device(struct nbd_device *nbd)
args = kzalloc(sizeof(*args), GFP_KERNEL);
if (!args) {
sock_shutdown(nbd);
+ /*
+ * If num_connections is m (2 < m),
+ * and NO.1 ~ NO.n(1 < n < m) kzallocs are successful.
+ * But NO.(n + 1) failed. We still have n recv threads.
+ * So, add flush_workqueue here to prevent recv threads
+ * dropping the last config_refs and trying to destroy
+ * the workqueue from inside the workqueue.
+ */
+ if (i)
+ flush_workqueue(nbd->recv_workq);
return -ENOMEM;
}
sk_set_memalloc(config->socks[i]->sock->sk);
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit df4654bd6e42125d9b85ce3a26eaca2935290b98 ]
Clang warns:
../sound/usb/usx2y/usX2Yhwdep.c:122:3: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
info->version = USX2Y_DRIVER_VERSION;
^
../sound/usb/usx2y/usX2Yhwdep.c:120:2: note: previous statement is here
if (us428->chip_status & USX2Y_STAT_CHIP_INIT)
^
1 warning generated.
This warning occurs because there is a space before the tab on this
line. Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.
This was introduced before the beginning of git history so no fixes tag.
Link: https://github.com/ClangBuiltLinux/linux/issues/831
Signed-off-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/usb/usx2y/usX2Yhwdep.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/usb/usx2y/usX2Yhwdep.c b/sound/usb/usx2y/usX2Yhwdep.c
index c1dd9a7b48df6..36b3459703640 100644
--- a/sound/usb/usx2y/usX2Yhwdep.c
+++ b/sound/usb/usx2y/usX2Yhwdep.c
@@ -131,7 +131,7 @@ static int snd_usX2Y_hwdep_dsp_status(struct snd_hwdep *hw,
info->num_dsps = 2; // 0: Prepad Data, 1: FPGA Code
if (us428->chip_status & USX2Y_STAT_CHIP_INIT)
info->chip_ready = 1;
- info->version = USX2Y_DRIVER_VERSION;
+ info->version = USX2Y_DRIVER_VERSION;
return 0;
}
--
2.20.1
From: Phong Tran <[email protected]>
[ Upstream commit da5e57e8a6a3e69dac2937ba63fa86355628fbb2 ]
correct usage prototype of callback in tasklet_init().
Report by https://github.com/KSPP/linux/issues/20
Signed-off-by: Phong Tran <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/intel/iwlegacy/3945-mac.c | 5 +++--
drivers/net/wireless/intel/iwlegacy/4965-mac.c | 5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlegacy/3945-mac.c b/drivers/net/wireless/intel/iwlegacy/3945-mac.c
index 57e3b6cca2341..b536ec20eaccb 100644
--- a/drivers/net/wireless/intel/iwlegacy/3945-mac.c
+++ b/drivers/net/wireless/intel/iwlegacy/3945-mac.c
@@ -1392,8 +1392,9 @@ il3945_dump_nic_error_log(struct il_priv *il)
}
static void
-il3945_irq_tasklet(struct il_priv *il)
+il3945_irq_tasklet(unsigned long data)
{
+ struct il_priv *il = (struct il_priv *)data;
u32 inta, handled = 0;
u32 inta_fh;
unsigned long flags;
@@ -3419,7 +3420,7 @@ il3945_setup_deferred_work(struct il_priv *il)
timer_setup(&il->watchdog, il_bg_watchdog, 0);
tasklet_init(&il->irq_tasklet,
- (void (*)(unsigned long))il3945_irq_tasklet,
+ il3945_irq_tasklet,
(unsigned long)il);
}
diff --git a/drivers/net/wireless/intel/iwlegacy/4965-mac.c b/drivers/net/wireless/intel/iwlegacy/4965-mac.c
index 280cd8ae1696d..6fc51c74cdb86 100644
--- a/drivers/net/wireless/intel/iwlegacy/4965-mac.c
+++ b/drivers/net/wireless/intel/iwlegacy/4965-mac.c
@@ -4360,8 +4360,9 @@ il4965_synchronize_irq(struct il_priv *il)
}
static void
-il4965_irq_tasklet(struct il_priv *il)
+il4965_irq_tasklet(unsigned long data)
{
+ struct il_priv *il = (struct il_priv *)data;
u32 inta, handled = 0;
u32 inta_fh;
unsigned long flags;
@@ -6257,7 +6258,7 @@ il4965_setup_deferred_work(struct il_priv *il)
timer_setup(&il->watchdog, il_bg_watchdog, 0);
tasklet_init(&il->irq_tasklet,
- (void (*)(unsigned long))il4965_irq_tasklet,
+ il4965_irq_tasklet,
(unsigned long)il);
}
--
2.20.1
From: Phong Tran <[email protected]>
[ Upstream commit cb775c88da5d48a85d99d95219f637b6fad2e0e9 ]
correct usage prototype of callback in tasklet_init().
Report by https://github.com/KSPP/linux/issues/20
Signed-off-by: Phong Tran <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/realtek/rtlwifi/pci.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c b/drivers/net/wireless/realtek/rtlwifi/pci.c
index 5d1fda16fc8c4..83749578fa8b3 100644
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c
+++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
@@ -1082,13 +1082,15 @@ done:
return ret;
}
-static void _rtl_pci_irq_tasklet(struct ieee80211_hw *hw)
+static void _rtl_pci_irq_tasklet(unsigned long data)
{
+ struct ieee80211_hw *hw = (struct ieee80211_hw *)data;
_rtl_pci_tx_chk_waitq(hw);
}
-static void _rtl_pci_prepare_bcn_tasklet(struct ieee80211_hw *hw)
+static void _rtl_pci_prepare_bcn_tasklet(unsigned long data)
{
+ struct ieee80211_hw *hw = (struct ieee80211_hw *)data;
struct rtl_priv *rtlpriv = rtl_priv(hw);
struct rtl_pci *rtlpci = rtl_pcidev(rtl_pcipriv(hw));
struct rtl_mac *mac = rtl_mac(rtl_priv(hw));
@@ -1214,10 +1216,10 @@ static void _rtl_pci_init_struct(struct ieee80211_hw *hw,
/*task */
tasklet_init(&rtlpriv->works.irq_tasklet,
- (void (*)(unsigned long))_rtl_pci_irq_tasklet,
+ _rtl_pci_irq_tasklet,
(unsigned long)hw);
tasklet_init(&rtlpriv->works.irq_prepare_bcn_tasklet,
- (void (*)(unsigned long))_rtl_pci_prepare_bcn_tasklet,
+ _rtl_pci_prepare_bcn_tasklet,
(unsigned long)hw);
INIT_WORK(&rtlpriv->works.lps_change_work,
rtl_lps_change_work_callback);
--
2.20.1
From: Andrey Smirnov <[email protected]>
[ Upstream commit cd58a174e58649426fb43d7456e5f7d7eab58af1 ]
RDU2 production units come with resistor connecting WP pin to
correpsonding GPIO DNPed for both SD card slots. Drop any WP related
configuration and mark both slots with "disable-wp".
Reported-by: Chris Healy <[email protected]>
Reviewed-by: Chris Healy <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
Signed-off-by: Andrey Smirnov <[email protected]>
Cc: Shawn Guo <[email protected]>
Cc: Fabio Estevam <[email protected]>
Cc: Lucas Stach <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Shawn Guo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi b/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
index 315d0e7615f33..56d6e82b75337 100644
--- a/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
@@ -657,7 +657,7 @@
pinctrl-0 = <&pinctrl_usdhc2>;
bus-width = <4>;
cd-gpios = <&gpio2 2 GPIO_ACTIVE_LOW>;
- wp-gpios = <&gpio2 3 GPIO_ACTIVE_HIGH>;
+ disable-wp;
vmmc-supply = <®_3p3v_sd>;
vqmmc-supply = <®_3p3v>;
no-1-8-v;
@@ -670,7 +670,7 @@
pinctrl-0 = <&pinctrl_usdhc3>;
bus-width = <4>;
cd-gpios = <&gpio2 0 GPIO_ACTIVE_LOW>;
- wp-gpios = <&gpio2 1 GPIO_ACTIVE_HIGH>;
+ disable-wp;
vmmc-supply = <®_3p3v_sd>;
vqmmc-supply = <®_3p3v>;
no-1-8-v;
@@ -1081,7 +1081,6 @@
MX6QDL_PAD_SD2_DAT1__SD2_DATA1 0x17059
MX6QDL_PAD_SD2_DAT2__SD2_DATA2 0x17059
MX6QDL_PAD_SD2_DAT3__SD2_DATA3 0x17059
- MX6QDL_PAD_NANDF_D3__GPIO2_IO03 0x40010040
MX6QDL_PAD_NANDF_D2__GPIO2_IO02 0x40010040
>;
};
@@ -1094,7 +1093,6 @@
MX6QDL_PAD_SD3_DAT1__SD3_DATA1 0x17059
MX6QDL_PAD_SD3_DAT2__SD3_DATA2 0x17059
MX6QDL_PAD_SD3_DAT3__SD3_DATA3 0x17059
- MX6QDL_PAD_NANDF_D1__GPIO2_IO01 0x40010040
MX6QDL_PAD_NANDF_D0__GPIO2_IO00 0x40010040
>;
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit 4dbc96ad65c45cdd4e895ed7ae4c151b780790c5 ]
Clang warns:
../drivers/scsi/aic7xxx/aic7xxx_core.c:2317:5: warning: misleading
indentation; statement is not part of the previous 'if'
[-Wmisleading-indentation]
if ((syncrate->sxfr_u2 & ST_SXFR) != 0)
^
../drivers/scsi/aic7xxx/aic7xxx_core.c:2310:4: note: previous statement
is here
if (syncrate == &ahc_syncrates[maxsync])
^
1 warning generated.
This warning occurs because there is a space amongst the tabs on this
line. Remove it so that the indentation is consistent with the Linux kernel
coding style and clang no longer warns.
This has been a problem since the beginning of git history hence no fixes
tag.
Link: https://github.com/ClangBuiltLinux/linux/issues/817
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/aic7xxx/aic7xxx_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/aic7xxx/aic7xxx_core.c b/drivers/scsi/aic7xxx/aic7xxx_core.c
index 915a34f141e4f..49e02e8745533 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_core.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_core.c
@@ -2321,7 +2321,7 @@ ahc_find_syncrate(struct ahc_softc *ahc, u_int *period,
* At some speeds, we only support
* ST transfers.
*/
- if ((syncrate->sxfr_u2 & ST_SXFR) != 0)
+ if ((syncrate->sxfr_u2 & ST_SXFR) != 0)
*ppr_options &= ~MSG_EXT_PPR_DT_REQ;
break;
}
--
2.20.1
From: Aditya Pakki <[email protected]>
[ Upstream commit bbd20c939c8aa3f27fa30e86691af250bf92973a ]
In fore200e_send and fore200e_close, the pointers from the arguments
are dereferenced in the variable declaration block and then checked
for NULL. The patch fixes these issues by avoiding NULL pointer
dereferences.
Signed-off-by: Aditya Pakki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/atm/fore200e.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/drivers/atm/fore200e.c b/drivers/atm/fore200e.c
index 99a38115b0a8f..86aab14872fd0 100644
--- a/drivers/atm/fore200e.c
+++ b/drivers/atm/fore200e.c
@@ -1504,12 +1504,14 @@ fore200e_open(struct atm_vcc *vcc)
static void
fore200e_close(struct atm_vcc* vcc)
{
- struct fore200e* fore200e = FORE200E_DEV(vcc->dev);
struct fore200e_vcc* fore200e_vcc;
+ struct fore200e* fore200e;
struct fore200e_vc_map* vc_map;
unsigned long flags;
ASSERT(vcc);
+ fore200e = FORE200E_DEV(vcc->dev);
+
ASSERT((vcc->vpi >= 0) && (vcc->vpi < 1<<FORE200E_VPI_BITS));
ASSERT((vcc->vci >= 0) && (vcc->vci < 1<<FORE200E_VCI_BITS));
@@ -1554,10 +1556,10 @@ fore200e_close(struct atm_vcc* vcc)
static int
fore200e_send(struct atm_vcc *vcc, struct sk_buff *skb)
{
- struct fore200e* fore200e = FORE200E_DEV(vcc->dev);
- struct fore200e_vcc* fore200e_vcc = FORE200E_VCC(vcc);
+ struct fore200e* fore200e;
+ struct fore200e_vcc* fore200e_vcc;
struct fore200e_vc_map* vc_map;
- struct host_txq* txq = &fore200e->host_txq;
+ struct host_txq* txq;
struct host_txq_entry* entry;
struct tpd* tpd;
struct tpd_haddr tpd_haddr;
@@ -1570,9 +1572,18 @@ fore200e_send(struct atm_vcc *vcc, struct sk_buff *skb)
unsigned char* data;
unsigned long flags;
- ASSERT(vcc);
- ASSERT(fore200e);
- ASSERT(fore200e_vcc);
+ if (!vcc)
+ return -EINVAL;
+
+ fore200e = FORE200E_DEV(vcc->dev);
+ fore200e_vcc = FORE200E_VCC(vcc);
+
+ if (!fore200e)
+ return -EINVAL;
+
+ txq = &fore200e->host_txq;
+ if (!fore200e_vcc)
+ return -EINVAL;
if (!test_bit(ATM_VF_READY, &vcc->flags)) {
DPRINTK(1, "VC %d.%d.%d not ready for tx\n", vcc->itf, vcc->vpi, vcc->vpi);
--
2.20.1
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit 8443ffd1bbd5be74e9b12db234746d12e8ea93e2 ]
Add a device node for the global timer, which is part of the Cortex-A9
MPCore.
The global timer can serve as an accurate (4 ns) clock source for
scheduling and delay loops.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/boot/dts/r8a7779.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm/boot/dts/r8a7779.dtsi b/arch/arm/boot/dts/r8a7779.dtsi
index 03919714645ae..f1c9b2bc542c5 100644
--- a/arch/arm/boot/dts/r8a7779.dtsi
+++ b/arch/arm/boot/dts/r8a7779.dtsi
@@ -68,6 +68,14 @@
<0xf0000100 0x100>;
};
+ timer@f0000200 {
+ compatible = "arm,cortex-a9-global-timer";
+ reg = <0xf0000200 0x100>;
+ interrupts = <GIC_PPI 11
+ (GIC_CPU_MASK_SIMPLE(4) | IRQ_TYPE_EDGE_RISING)>;
+ clocks = <&cpg_clocks R8A7779_CLK_ZS>;
+ };
+
timer@f0000600 {
compatible = "arm,cortex-a9-twd-timer";
reg = <0xf0000600 0x20>;
--
2.20.1
From: Heiner Kallweit <[email protected]>
[ Upstream commit f325937735498afb054a0195291bbf68d0b60be5 ]
Some users complained about problems with r8169 and it turned out that
the generic PHY driver was used instead instead of the dedicated one.
In all cases reason was that r8169.ko was in initramfs, but realtek.ko
not. Manually adding realtek.ko to initramfs fixed the issues.
Root cause seems to be that tools like dracut and genkernel don't
consider softdeps. Add a check for loaded Realtek PHY driver module
and provide the user with a hint if it's not loaded.
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/realtek/r8169.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 4ab87fe845427..6ea43e48d5f97 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7433,6 +7433,15 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
int chipset, region, i;
int jumbo_max, rc;
+ /* Some tools for creating an initramfs don't consider softdeps, then
+ * r8169.ko may be in initramfs, but realtek.ko not. Then the generic
+ * PHY driver is used that doesn't work with most chip versions.
+ */
+ if (!driver_find("RTL8201CP Ethernet", &mdio_bus_type)) {
+ dev_err(&pdev->dev, "realtek.ko not loaded, maybe it needs to be added to initramfs?\n");
+ return -ENOENT;
+ }
+
dev = devm_alloc_etherdev(&pdev->dev, sizeof (*tp));
if (!dev)
return -ENOMEM;
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit afb34781620274236bd9fc9246e22f6963ef5262 ]
When building with Clang + -Wtautological-constant-compare, several of
the ivtv and cx18 drivers warn along the lines of:
drivers/media/pci/cx18/cx18-driver.c:1005:21: warning: converting the
result of '<<' to a boolean always evaluates to true
[-Wtautological-constant-compare]
cx18_call_hw(cx, CX18_HW_GPIO_RESET_CTRL,
^
drivers/media/pci/cx18/cx18-cards.h:18:37: note: expanded from macro
'CX18_HW_GPIO_RESET_CTRL'
#define CX18_HW_GPIO_RESET_CTRL (1 << 6)
^
1 warning generated.
This warning happens because the shift operation is implicitly converted
to a boolean in v4l2_device_mask_call_all before being negated. This can
be solved by just comparing the mask result to 0 explicitly so that
there is no boolean conversion. The ultimate goal is to enable
-Wtautological-compare globally because there are several subwarnings
that would be helpful to have.
For visual consistency and avoidance of these warnings in the future,
all of the implicitly boolean conversions in the v4l2_device macros
are converted to explicit ones as well.
Link: https://github.com/ClangBuiltLinux/linux/issues/752
Reviewed-by: Ezequiel Garcia <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/media/v4l2-device.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/media/v4l2-device.h b/include/media/v4l2-device.h
index b330e4a08a6b8..40840fec337ca 100644
--- a/include/media/v4l2-device.h
+++ b/include/media/v4l2-device.h
@@ -372,7 +372,7 @@ static inline void v4l2_subdev_notify(struct v4l2_subdev *sd,
struct v4l2_subdev *__sd; \
\
__v4l2_device_call_subdevs_p(v4l2_dev, __sd, \
- !(grpid) || __sd->grp_id == (grpid), o, f , \
+ (grpid) == 0 || __sd->grp_id == (grpid), o, f , \
##args); \
} while (0)
@@ -404,7 +404,7 @@ static inline void v4l2_subdev_notify(struct v4l2_subdev *sd,
({ \
struct v4l2_subdev *__sd; \
__v4l2_device_call_subdevs_until_err_p(v4l2_dev, __sd, \
- !(grpid) || __sd->grp_id == (grpid), o, f , \
+ (grpid) == 0 || __sd->grp_id == (grpid), o, f , \
##args); \
})
@@ -432,8 +432,8 @@ static inline void v4l2_subdev_notify(struct v4l2_subdev *sd,
struct v4l2_subdev *__sd; \
\
__v4l2_device_call_subdevs_p(v4l2_dev, __sd, \
- !(grpmsk) || (__sd->grp_id & (grpmsk)), o, f , \
- ##args); \
+ (grpmsk) == 0 || (__sd->grp_id & (grpmsk)), o, \
+ f , ##args); \
} while (0)
/**
@@ -463,8 +463,8 @@ static inline void v4l2_subdev_notify(struct v4l2_subdev *sd,
({ \
struct v4l2_subdev *__sd; \
__v4l2_device_call_subdevs_until_err_p(v4l2_dev, __sd, \
- !(grpmsk) || (__sd->grp_id & (grpmsk)), o, f , \
- ##args); \
+ (grpmsk) == 0 || (__sd->grp_id & (grpmsk)), o, \
+ f , ##args); \
})
--
2.20.1
From: Douglas Anderson <[email protected]>
[ Upstream commit 908b050114d8fefdddc57ec9fbc213c3690e7f5f ]
When I got my clock parenting slightly wrong I ended up with a crash
that looked like this:
Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000
...
pc : clk_hw_get_rate+0x14/0x44
...
Call trace:
clk_hw_get_rate+0x14/0x44
_freq_tbl_determine_rate+0x94/0xfc
clk_rcg2_determine_rate+0x2c/0x38
clk_core_determine_round_nolock+0x4c/0x88
clk_core_round_rate_nolock+0x6c/0xa8
clk_core_round_rate_nolock+0x9c/0xa8
clk_core_set_rate_nolock+0x70/0x180
clk_set_rate+0x3c/0x6c
of_clk_set_defaults+0x254/0x360
platform_drv_probe+0x28/0xb0
really_probe+0x120/0x2dc
driver_probe_device+0x64/0xfc
device_driver_attach+0x4c/0x6c
__driver_attach+0xac/0xc0
bus_for_each_dev+0x84/0xcc
driver_attach+0x2c/0x38
bus_add_driver+0xfc/0x1d0
driver_register+0x64/0xf8
__platform_driver_register+0x4c/0x58
msm_drm_register+0x5c/0x60
...
It turned out that clk_hw_get_parent_by_index() was returning NULL and
we weren't checking. Let's check it so that we don't crash.
Fixes: ac269395cdd8 ("clk: qcom: Convert to clk_hw based provider APIs")
Signed-off-by: Douglas Anderson <[email protected]>
Reviewed-by: Matthias Kaehlcke <[email protected]>
Link: https://lkml.kernel.org/r/20200203103049.v4.1.I7487325fe8e701a68a07d3be8a6a4b571eca9cfa@changeid
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/qcom/clk-rcg2.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
index 51b2388d80ac9..ee693e15d9ebc 100644
--- a/drivers/clk/qcom/clk-rcg2.c
+++ b/drivers/clk/qcom/clk-rcg2.c
@@ -203,6 +203,9 @@ static int _freq_tbl_determine_rate(struct clk_hw *hw, const struct freq_tbl *f,
clk_flags = clk_hw_get_flags(hw);
p = clk_hw_get_parent_by_index(hw, index);
+ if (!p)
+ return -EINVAL;
+
if (clk_flags & CLK_SET_RATE_PARENT) {
rate = f->freq;
if (f->pre_div) {
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit a63141e31764f8daf3f29e8e2d450dcf9199d1c8 ]
Commit b0f3cd3191cd ("drm/amdgpu: remove unnecessary JPEG2.0 code from
VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function:
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r'
is used uninitialized whenever 'while' loop exits because its condition
is false [-Wsometimes-uninitialized]
SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
while ((tmp_ & (mask)) != (expected_value)) { \
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use
occurs here
if (r)
^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the
condition if it is always true
SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r);
^
../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note:
expanded from macro 'SOC15_WAIT_ON_RREG'
while ((tmp_ & (mask)) != (expected_value)) { \
^
../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the
variable 'r' to silence this warning
int r;
^
= 0
1 warning generated.
To prevent warnings like this from happening in the future, make the
SOC15_WAIT_ON_RREG macro initialize its ret variable before the while
loop that can time out. This macro's return value is always checked so
it should set ret in both the success and fail path.
Link: https://github.com/ClangBuiltLinux/linux/issues/776
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/soc15_common.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
index 0942f492d2e19..9d444f7fdfcdd 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
@@ -51,6 +51,7 @@
do { \
uint32_t tmp_ = RREG32(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg); \
uint32_t loop = adev->usec_timeout; \
+ ret = 0; \
while ((tmp_ & (mask)) != (expected_value)) { \
udelay(2); \
tmp_ = RREG32(adev->reg_offset[ip##_HWIP][inst][reg##_BASE_IDX] + reg); \
--
2.20.1
From: Xin Long <[email protected]>
[ Upstream commit 0705f95c332081036d85f26691e9d3cd7d901c31 ]
ERSPAN_VERSION is an attribute parsed in kernel side, nla_policy
type should be added for it, like other attributes.
Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support")
Signed-off-by: Xin Long <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/netfilter/nft_tunnel.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nft_tunnel.c b/net/netfilter/nft_tunnel.c
index e5444f3ff43fc..5e66042ac3466 100644
--- a/net/netfilter/nft_tunnel.c
+++ b/net/netfilter/nft_tunnel.c
@@ -218,8 +218,9 @@ static int nft_tunnel_obj_vxlan_init(const struct nlattr *attr,
}
static const struct nla_policy nft_tunnel_opts_erspan_policy[NFTA_TUNNEL_KEY_ERSPAN_MAX + 1] = {
+ [NFTA_TUNNEL_KEY_ERSPAN_VERSION] = { .type = NLA_U32 },
[NFTA_TUNNEL_KEY_ERSPAN_V1_INDEX] = { .type = NLA_U32 },
- [NFTA_TUNNEL_KEY_ERSPAN_V2_DIR] = { .type = NLA_U8 },
+ [NFTA_TUNNEL_KEY_ERSPAN_V2_DIR] = { .type = NLA_U8 },
[NFTA_TUNNEL_KEY_ERSPAN_V2_HWID] = { .type = NLA_U8 },
};
--
2.20.1
From: Phong Tran <[email protected]>
[ Upstream commit ebd77feb27e91bb5fe35a7818b7c13ea7435fb98 ]
correct usage prototype of callback in tasklet_init().
Report by https://github.com/KSPP/linux/issues/20
Signed-off-by: Phong Tran <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/intel/ipw2x00/ipw2100.c | 7 ++++---
drivers/net/wireless/intel/ipw2x00/ipw2200.c | 5 +++--
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/intel/ipw2x00/ipw2100.c b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
index 910db46db6a12..a3a470976a5c7 100644
--- a/drivers/net/wireless/intel/ipw2x00/ipw2100.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2100.c
@@ -3220,8 +3220,9 @@ static void ipw2100_tx_send_data(struct ipw2100_priv *priv)
}
}
-static void ipw2100_irq_tasklet(struct ipw2100_priv *priv)
+static void ipw2100_irq_tasklet(unsigned long data)
{
+ struct ipw2100_priv *priv = (struct ipw2100_priv *)data;
struct net_device *dev = priv->net_dev;
unsigned long flags;
u32 inta, tmp;
@@ -6025,7 +6026,7 @@ static void ipw2100_rf_kill(struct work_struct *work)
spin_unlock_irqrestore(&priv->low_lock, flags);
}
-static void ipw2100_irq_tasklet(struct ipw2100_priv *priv);
+static void ipw2100_irq_tasklet(unsigned long data);
static const struct net_device_ops ipw2100_netdev_ops = {
.ndo_open = ipw2100_open,
@@ -6155,7 +6156,7 @@ static struct net_device *ipw2100_alloc_device(struct pci_dev *pci_dev,
INIT_DELAYED_WORK(&priv->rf_kill, ipw2100_rf_kill);
INIT_DELAYED_WORK(&priv->scan_event, ipw2100_scan_event);
- tasklet_init(&priv->irq_tasklet, (void (*)(unsigned long))
+ tasklet_init(&priv->irq_tasklet,
ipw2100_irq_tasklet, (unsigned long)priv);
/* NOTE: We do not start the deferred work for status checks yet */
diff --git a/drivers/net/wireless/intel/ipw2x00/ipw2200.c b/drivers/net/wireless/intel/ipw2x00/ipw2200.c
index 9644e7b93645f..04aee2fdba375 100644
--- a/drivers/net/wireless/intel/ipw2x00/ipw2200.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2200.c
@@ -1959,8 +1959,9 @@ static void notify_wx_assoc_event(struct ipw_priv *priv)
wireless_send_event(priv->net_dev, SIOCGIWAP, &wrqu, NULL);
}
-static void ipw_irq_tasklet(struct ipw_priv *priv)
+static void ipw_irq_tasklet(unsigned long data)
{
+ struct ipw_priv *priv = (struct ipw_priv *)data;
u32 inta, inta_mask, handled = 0;
unsigned long flags;
int rc = 0;
@@ -10694,7 +10695,7 @@ static int ipw_setup_deferred_work(struct ipw_priv *priv)
INIT_WORK(&priv->qos_activate, ipw_bg_qos_activate);
#endif /* CONFIG_IPW2200_QOS */
- tasklet_init(&priv->irq_tasklet, (void (*)(unsigned long))
+ tasklet_init(&priv->irq_tasklet,
ipw_irq_tasklet, (unsigned long)priv);
return ret;
--
2.20.1
From: Kunihiko Hayashi <[email protected]>
[ Upstream commit 1ec09a2ec67a0baa46a3ccac041dbcdbc6db2cb9 ]
SCSSI has clock gates for each channel in the SoCs newer than Pro4,
so this adds missing clock gates for channel 1, 2 and 3. And more, this
moves MCSSI clock ID after SCSSI.
Fixes: ff388ee36516 ("clk: uniphier: add clock frequency support for SPI")
Signed-off-by: Kunihiko Hayashi <[email protected]>
Acked-by: Masahiro Yamada <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/uniphier/clk-uniphier-peri.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/clk/uniphier/clk-uniphier-peri.c b/drivers/clk/uniphier/clk-uniphier-peri.c
index 89b3ac378b3f9..8b75dc116a98c 100644
--- a/drivers/clk/uniphier/clk-uniphier-peri.c
+++ b/drivers/clk/uniphier/clk-uniphier-peri.c
@@ -27,8 +27,8 @@
#define UNIPHIER_PERI_CLK_FI2C(idx, ch) \
UNIPHIER_CLK_GATE("i2c" #ch, (idx), "i2c", 0x24, 24 + (ch))
-#define UNIPHIER_PERI_CLK_SCSSI(idx) \
- UNIPHIER_CLK_GATE("scssi", (idx), "spi", 0x20, 17)
+#define UNIPHIER_PERI_CLK_SCSSI(idx, ch) \
+ UNIPHIER_CLK_GATE("scssi" #ch, (idx), "spi", 0x20, 17 + (ch))
#define UNIPHIER_PERI_CLK_MCSSI(idx) \
UNIPHIER_CLK_GATE("mcssi", (idx), "spi", 0x24, 14)
@@ -44,7 +44,7 @@ const struct uniphier_clk_data uniphier_ld4_peri_clk_data[] = {
UNIPHIER_PERI_CLK_I2C(6, 2),
UNIPHIER_PERI_CLK_I2C(7, 3),
UNIPHIER_PERI_CLK_I2C(8, 4),
- UNIPHIER_PERI_CLK_SCSSI(11),
+ UNIPHIER_PERI_CLK_SCSSI(11, 0),
{ /* sentinel */ }
};
@@ -60,7 +60,10 @@ const struct uniphier_clk_data uniphier_pro4_peri_clk_data[] = {
UNIPHIER_PERI_CLK_FI2C(8, 4),
UNIPHIER_PERI_CLK_FI2C(9, 5),
UNIPHIER_PERI_CLK_FI2C(10, 6),
- UNIPHIER_PERI_CLK_SCSSI(11),
- UNIPHIER_PERI_CLK_MCSSI(12),
+ UNIPHIER_PERI_CLK_SCSSI(11, 0),
+ UNIPHIER_PERI_CLK_SCSSI(12, 1),
+ UNIPHIER_PERI_CLK_SCSSI(13, 2),
+ UNIPHIER_PERI_CLK_SCSSI(14, 3),
+ UNIPHIER_PERI_CLK_MCSSI(15),
{ /* sentinel */ }
};
--
2.20.1
From: Andrey Zhizhikin <[email protected]>
[ Upstream commit 6794200fa3c9c3e6759dae099145f23e4310f4f7 ]
GCC9 introduced string hardening mechanisms, which exhibits the error
during fs api compilation:
error: '__builtin_strncpy' specified bound 4096 equals destination size
[-Werror=stringop-truncation]
This comes when the length of copy passed to strncpy is is equal to
destination size, which could potentially lead to buffer overflow.
There is a need to mitigate this potential issue by limiting the size of
destination by 1 and explicitly terminate the destination with NULL.
Signed-off-by: Andrey Zhizhikin <[email protected]>
Reviewed-by: Petr Mladek <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/lib/api/fs/fs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/lib/api/fs/fs.c b/tools/lib/api/fs/fs.c
index 7aba8243a0e7c..bd021a0eeef8c 100644
--- a/tools/lib/api/fs/fs.c
+++ b/tools/lib/api/fs/fs.c
@@ -210,6 +210,7 @@ static bool fs__env_override(struct fs *fs)
size_t name_len = strlen(fs->name);
/* name + "_PATH" + '\0' */
char upper_name[name_len + 5 + 1];
+
memcpy(upper_name, fs->name, name_len);
mem_toupper(upper_name, name_len);
strcpy(&upper_name[name_len], "_PATH");
@@ -219,7 +220,8 @@ static bool fs__env_override(struct fs *fs)
return false;
fs->found = true;
- strncpy(fs->path, override_path, sizeof(fs->path));
+ strncpy(fs->path, override_path, sizeof(fs->path) - 1);
+ fs->path[sizeof(fs->path) - 1] = '\0';
return true;
}
--
2.20.1
From: Erik Kaneda <[email protected]>
[ Upstream commit 5ddbd77181dfca61b16d2e2222382ea65637f1b9 ]
ACPICA commit 29cc8dbc5463a93625bed87d7550a8bed8913bf4
create_buffer_field is a deferred op that is typically processed in
load pass 2. However, disassembly of control method contents walk the
parse tree with ACPI_PARSE_LOAD_PASS1 and AML_CREATE operators are
processed in a later walk. This is a problem when there is a control
method that has the same name as the AML_CREATE object. In this case,
any use of the name segment will be detected as a method call rather
than a reference to a buffer field. If this is detected as a method
call, it can result in a mal-formed parse tree if the control methods
have parameters.
This change in processing AML_CREATE ops earlier solves this issue by
inserting the named object in the ACPI namespace so that references
to this name would be detected as a name string rather than a method
call.
Link: https://github.com/acpica/acpica/commit/29cc8dbc
Reported-by: Elia Geretto <[email protected]>
Tested-by: Elia Geretto <[email protected]>
Signed-off-by: Bob Moore <[email protected]>
Signed-off-by: Erik Kaneda <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/acpi/acpica/dsfield.c | 2 +-
drivers/acpi/acpica/dswload.c | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/acpica/dsfield.c b/drivers/acpi/acpica/dsfield.c
index 30fe89545d6ab..bcc6a7acc5762 100644
--- a/drivers/acpi/acpica/dsfield.c
+++ b/drivers/acpi/acpica/dsfield.c
@@ -244,7 +244,7 @@ cleanup:
* FUNCTION: acpi_ds_get_field_names
*
* PARAMETERS: info - create_field info structure
- * ` walk_state - Current method state
+ * walk_state - Current method state
* arg - First parser arg for the field name list
*
* RETURN: Status
diff --git a/drivers/acpi/acpica/dswload.c b/drivers/acpi/acpica/dswload.c
index d06c414462822..ba53662f12179 100644
--- a/drivers/acpi/acpica/dswload.c
+++ b/drivers/acpi/acpica/dswload.c
@@ -412,6 +412,27 @@ acpi_status acpi_ds_load1_end_op(struct acpi_walk_state *walk_state)
ACPI_DEBUG_PRINT((ACPI_DB_DISPATCH, "Op=%p State=%p\n", op,
walk_state));
+ /*
+ * Disassembler: handle create field operators here.
+ *
+ * create_buffer_field is a deferred op that is typically processed in load
+ * pass 2. However, disassembly of control method contents walk the parse
+ * tree with ACPI_PARSE_LOAD_PASS1 and AML_CREATE operators are processed
+ * in a later walk. This is a problem when there is a control method that
+ * has the same name as the AML_CREATE object. In this case, any use of the
+ * name segment will be detected as a method call rather than a reference
+ * to a buffer field.
+ *
+ * This earlier creation during disassembly solves this issue by inserting
+ * the named object in the ACPI namespace so that references to this name
+ * would be a name string rather than a method call.
+ */
+ if ((walk_state->parse_flags & ACPI_PARSE_DISASSEMBLE) &&
+ (walk_state->op_info->flags & AML_CREATE)) {
+ status = acpi_ds_create_buffer_field(op, walk_state);
+ return_ACPI_STATUS(status);
+ }
+
/* We are only interested in opcodes that have an associated name */
if (!(walk_state->op_info->flags & (AML_NAMED | AML_FIELD))) {
--
2.20.1
From: Shuah Khan <[email protected]>
[ Upstream commit 585c91f40d201bc564d4e76b83c05b3b5363fe7e ]
Fix unsafe unaligned pointer usage in usbip network interfaces. usbip tool
build fails with new gcc -Werror=address-of-packed-member checks.
usbip_network.c: In function ‘usbip_net_pack_usb_device’:
usbip_network.c:79:32: error: taking address of packed member of ‘struct usbip_usb_device’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
79 | usbip_net_pack_uint32_t(pack, &udev->busnum);
Fix with minor changes to pass by value instead of by address.
Signed-off-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/usb/usbip/src/usbip_network.c | 40 +++++++++++++++++------------
tools/usb/usbip/src/usbip_network.h | 12 +++------
2 files changed, 27 insertions(+), 25 deletions(-)
diff --git a/tools/usb/usbip/src/usbip_network.c b/tools/usb/usbip/src/usbip_network.c
index 8ffcd47d96385..902f55208e239 100644
--- a/tools/usb/usbip/src/usbip_network.c
+++ b/tools/usb/usbip/src/usbip_network.c
@@ -62,39 +62,39 @@ void usbip_setup_port_number(char *arg)
info("using port %d (\"%s\")", usbip_port, usbip_port_string);
}
-void usbip_net_pack_uint32_t(int pack, uint32_t *num)
+uint32_t usbip_net_pack_uint32_t(int pack, uint32_t num)
{
uint32_t i;
if (pack)
- i = htonl(*num);
+ i = htonl(num);
else
- i = ntohl(*num);
+ i = ntohl(num);
- *num = i;
+ return i;
}
-void usbip_net_pack_uint16_t(int pack, uint16_t *num)
+uint16_t usbip_net_pack_uint16_t(int pack, uint16_t num)
{
uint16_t i;
if (pack)
- i = htons(*num);
+ i = htons(num);
else
- i = ntohs(*num);
+ i = ntohs(num);
- *num = i;
+ return i;
}
void usbip_net_pack_usb_device(int pack, struct usbip_usb_device *udev)
{
- usbip_net_pack_uint32_t(pack, &udev->busnum);
- usbip_net_pack_uint32_t(pack, &udev->devnum);
- usbip_net_pack_uint32_t(pack, &udev->speed);
+ udev->busnum = usbip_net_pack_uint32_t(pack, udev->busnum);
+ udev->devnum = usbip_net_pack_uint32_t(pack, udev->devnum);
+ udev->speed = usbip_net_pack_uint32_t(pack, udev->speed);
- usbip_net_pack_uint16_t(pack, &udev->idVendor);
- usbip_net_pack_uint16_t(pack, &udev->idProduct);
- usbip_net_pack_uint16_t(pack, &udev->bcdDevice);
+ udev->idVendor = usbip_net_pack_uint16_t(pack, udev->idVendor);
+ udev->idProduct = usbip_net_pack_uint16_t(pack, udev->idProduct);
+ udev->bcdDevice = usbip_net_pack_uint16_t(pack, udev->bcdDevice);
}
void usbip_net_pack_usb_interface(int pack __attribute__((unused)),
@@ -141,6 +141,14 @@ ssize_t usbip_net_send(int sockfd, void *buff, size_t bufflen)
return usbip_net_xmit(sockfd, buff, bufflen, 1);
}
+static inline void usbip_net_pack_op_common(int pack,
+ struct op_common *op_common)
+{
+ op_common->version = usbip_net_pack_uint16_t(pack, op_common->version);
+ op_common->code = usbip_net_pack_uint16_t(pack, op_common->code);
+ op_common->status = usbip_net_pack_uint32_t(pack, op_common->status);
+}
+
int usbip_net_send_op_common(int sockfd, uint32_t code, uint32_t status)
{
struct op_common op_common;
@@ -152,7 +160,7 @@ int usbip_net_send_op_common(int sockfd, uint32_t code, uint32_t status)
op_common.code = code;
op_common.status = status;
- PACK_OP_COMMON(1, &op_common);
+ usbip_net_pack_op_common(1, &op_common);
rc = usbip_net_send(sockfd, &op_common, sizeof(op_common));
if (rc < 0) {
@@ -176,7 +184,7 @@ int usbip_net_recv_op_common(int sockfd, uint16_t *code, int *status)
goto err;
}
- PACK_OP_COMMON(0, &op_common);
+ usbip_net_pack_op_common(0, &op_common);
if (op_common.version != USBIP_VERSION) {
err("USBIP Kernel and tool version mismatch: %d %d:",
diff --git a/tools/usb/usbip/src/usbip_network.h b/tools/usb/usbip/src/usbip_network.h
index 555215eae43e9..83b4c5344f721 100644
--- a/tools/usb/usbip/src/usbip_network.h
+++ b/tools/usb/usbip/src/usbip_network.h
@@ -32,12 +32,6 @@ struct op_common {
} __attribute__((packed));
-#define PACK_OP_COMMON(pack, op_common) do {\
- usbip_net_pack_uint16_t(pack, &(op_common)->version);\
- usbip_net_pack_uint16_t(pack, &(op_common)->code);\
- usbip_net_pack_uint32_t(pack, &(op_common)->status);\
-} while (0)
-
/* ---------------------------------------------------------------------- */
/* Dummy Code */
#define OP_UNSPEC 0x00
@@ -163,11 +157,11 @@ struct op_devlist_reply_extra {
} while (0)
#define PACK_OP_DEVLIST_REPLY(pack, reply) do {\
- usbip_net_pack_uint32_t(pack, &(reply)->ndev);\
+ (reply)->ndev = usbip_net_pack_uint32_t(pack, (reply)->ndev);\
} while (0)
-void usbip_net_pack_uint32_t(int pack, uint32_t *num);
-void usbip_net_pack_uint16_t(int pack, uint16_t *num);
+uint32_t usbip_net_pack_uint32_t(int pack, uint32_t num);
+uint16_t usbip_net_pack_uint16_t(int pack, uint16_t num);
void usbip_net_pack_usb_device(int pack, struct usbip_usb_device *udev);
void usbip_net_pack_usb_interface(int pack, struct usbip_usb_interface *uinf);
--
2.20.1
From: Jan Kara <[email protected]>
[ Upstream commit a4a8b99ec819ca60b49dc582a4287ef03411f117 ]
Free space on filesystems with metadata or virtual partition maps
currently gets misreported. This is because these partitions are just
remapped onto underlying real partitions from which keep track of free
blocks. Take this remapping into account when counting free blocks as
well.
Reviewed-by: Pali Rohár <[email protected]>
Reported-by: Pali Rohár <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/udf/super.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/fs/udf/super.c b/fs/udf/super.c
index 6fd0f14e9dd22..1676a175cd7a8 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -2469,17 +2469,29 @@ static unsigned int udf_count_free_table(struct super_block *sb,
static unsigned int udf_count_free(struct super_block *sb)
{
unsigned int accum = 0;
- struct udf_sb_info *sbi;
+ struct udf_sb_info *sbi = UDF_SB(sb);
struct udf_part_map *map;
+ unsigned int part = sbi->s_partition;
+ int ptype = sbi->s_partmaps[part].s_partition_type;
+
+ if (ptype == UDF_METADATA_MAP25) {
+ part = sbi->s_partmaps[part].s_type_specific.s_metadata.
+ s_phys_partition_ref;
+ } else if (ptype == UDF_VIRTUAL_MAP15 || ptype == UDF_VIRTUAL_MAP20) {
+ /*
+ * Filesystems with VAT are append-only and we cannot write to
+ * them. Let's just report 0 here.
+ */
+ return 0;
+ }
- sbi = UDF_SB(sb);
if (sbi->s_lvid_bh) {
struct logicalVolIntegrityDesc *lvid =
(struct logicalVolIntegrityDesc *)
sbi->s_lvid_bh->b_data;
- if (le32_to_cpu(lvid->numOfPartitions) > sbi->s_partition) {
+ if (le32_to_cpu(lvid->numOfPartitions) > part) {
accum = le32_to_cpu(
- lvid->freeSpaceTable[sbi->s_partition]);
+ lvid->freeSpaceTable[part]);
if (accum == 0xFFFFFFFF)
accum = 0;
}
@@ -2488,7 +2500,7 @@ static unsigned int udf_count_free(struct super_block *sb)
if (accum)
return accum;
- map = &sbi->s_partmaps[sbi->s_partition];
+ map = &sbi->s_partmaps[part];
if (map->s_partition_flags & UDF_PART_FLAG_UNALLOC_BITMAP) {
accum += udf_count_free_bitmap(sb,
map->s_uspace.s_bitmap);
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit 42ae1a5c76691928ed217c7e40269db27f5225e9 ]
In some configurations, gcc tries too hard to optimize this code:
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c: In function 'mlx5e_grp_sw_update_stats':
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:302:1: error: the frame size of 1336 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
As was stated in the bug report, the reason is that gcc runs into a corner
case in the register allocator that is rather hard to fix in a good way.
As there is an easy way to work around it, just add a comment and the
barrier that stops gcc from trying to overoptimize the function.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92657
Cc: Adhemerval Zanella <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 8255d797ea943..9a68dee588c1a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -211,6 +211,9 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
s->tx_tls_resync_bytes += sq_stats->tls_resync_bytes;
#endif
s->tx_cqes += sq_stats->cqes;
+
+ /* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92657 */
+ barrier();
}
}
--
2.20.1
From: Sascha Hauer <[email protected]>
[ Upstream commit 02939cd167095f16328a1bd5cab5a90b550606df ]
The current descriptor is not on any list of the virtual DMA channel.
Once sdma_terminate_all() is called when a descriptor is currently
in flight then this one is forgotten to be freed. We have to call
vchan_terminate_vdesc() on this descriptor to re-add it to the lists.
Now that we also free the currently running descriptor we can (and
actually have to) remove the current descriptor from its list also
for the cyclic case.
Signed-off-by: Sascha Hauer <[email protected]>
Reviewed-by: Robin Gong <[email protected]>
Tested-by: Robin Gong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/dma/imx-sdma.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/dma/imx-sdma.c b/drivers/dma/imx-sdma.c
index ceb82e74f5b4e..d66a7fdff898e 100644
--- a/drivers/dma/imx-sdma.c
+++ b/drivers/dma/imx-sdma.c
@@ -738,12 +738,8 @@ static void sdma_start_desc(struct sdma_channel *sdmac)
return;
}
sdmac->desc = desc = to_sdma_desc(&vd->tx);
- /*
- * Do not delete the node in desc_issued list in cyclic mode, otherwise
- * the desc allocated will never be freed in vchan_dma_desc_free_list
- */
- if (!(sdmac->flags & IMX_DMA_SG_LOOP))
- list_del(&vd->node);
+
+ list_del(&vd->node);
sdma->channel_control[channel].base_bd_ptr = desc->bd_phys;
sdma->channel_control[channel].current_bd_ptr = desc->bd_phys;
@@ -1044,7 +1040,6 @@ static void sdma_channel_terminate_work(struct work_struct *work)
spin_lock_irqsave(&sdmac->vc.lock, flags);
vchan_get_all_descriptors(&sdmac->vc, &head);
- sdmac->desc = NULL;
spin_unlock_irqrestore(&sdmac->vc.lock, flags);
vchan_dma_desc_free_list(&sdmac->vc, &head);
}
@@ -1052,11 +1047,19 @@ static void sdma_channel_terminate_work(struct work_struct *work)
static int sdma_disable_channel_async(struct dma_chan *chan)
{
struct sdma_channel *sdmac = to_sdma_chan(chan);
+ unsigned long flags;
+
+ spin_lock_irqsave(&sdmac->vc.lock, flags);
sdma_disable_channel(chan);
- if (sdmac->desc)
+ if (sdmac->desc) {
+ vchan_terminate_vdesc(&sdmac->desc->vd);
+ sdmac->desc = NULL;
schedule_work(&sdmac->terminate_worker);
+ }
+
+ spin_unlock_irqrestore(&sdmac->vc.lock, flags);
return 0;
}
--
2.20.1
From: Bibby Hsieh <[email protected]>
[ Upstream commit 411f5c1eacfebb1f6e40b653d29447cdfe7282aa ]
The driver currently handles vblank events only when updating planes on
an already enabled CRTC. The atomic update API however allows requesting
an event when enabling or disabling a CRTC. This currently leads to
event objects being leaked in the kernel and to events not being sent
out. Fix it.
Signed-off-by: Bibby Hsieh <[email protected]>
Signed-off-by: CK Hu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 92ecb9bf982cf..b86ee7d25af36 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -308,6 +308,7 @@ err_pm_runtime_put:
static void mtk_crtc_ddp_hw_fini(struct mtk_drm_crtc *mtk_crtc)
{
struct drm_device *drm = mtk_crtc->base.dev;
+ struct drm_crtc *crtc = &mtk_crtc->base;
int i;
DRM_DEBUG_DRIVER("%s\n", __func__);
@@ -329,6 +330,13 @@ static void mtk_crtc_ddp_hw_fini(struct mtk_drm_crtc *mtk_crtc)
mtk_disp_mutex_unprepare(mtk_crtc->mutex);
pm_runtime_put(drm->dev);
+
+ if (crtc->state->event && !crtc->state->active) {
+ spin_lock_irq(&crtc->dev->event_lock);
+ drm_crtc_send_vblank_event(crtc, crtc->state->event);
+ crtc->state->event = NULL;
+ spin_unlock_irq(&crtc->dev->event_lock);
+ }
}
static void mtk_crtc_ddp_config(struct drm_crtc *crtc)
--
2.20.1
From: Ard Biesheuvel <[email protected]>
[ Upstream commit e2d68a955e49d61fd0384f23e92058dc9b79be5e ]
The logic in __efi_enter_virtual_mode() does a number of steps in
sequence, all of which may fail in one way or the other. In most
cases, we simply print an error and disable EFI runtime services
support, but in some cases, we BUG() or panic() and bring down the
system when encountering conditions that we could easily handle in
the same way.
While at it, replace a pointless page-to-virt-phys conversion with
one that goes straight from struct page to physical.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/platform/efi/efi.c | 28 ++++++++++++++--------------
arch/x86/platform/efi/efi_64.c | 9 +++++----
2 files changed, 19 insertions(+), 18 deletions(-)
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 5b0275310070e..e7f19dec16b97 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -930,16 +930,14 @@ static void __init __efi_enter_virtual_mode(void)
if (efi_alloc_page_tables()) {
pr_err("Failed to allocate EFI page tables\n");
- clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
- return;
+ goto err;
}
efi_merge_regions();
new_memmap = efi_map_regions(&count, &pg_shift);
if (!new_memmap) {
pr_err("Error reallocating memory, EFI runtime non-functional!\n");
- clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
- return;
+ goto err;
}
pa = __pa(new_memmap);
@@ -953,8 +951,7 @@ static void __init __efi_enter_virtual_mode(void)
if (efi_memmap_init_late(pa, efi.memmap.desc_size * count)) {
pr_err("Failed to remap late EFI memory map\n");
- clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
- return;
+ goto err;
}
if (efi_enabled(EFI_DBG)) {
@@ -962,12 +959,11 @@ static void __init __efi_enter_virtual_mode(void)
efi_print_memmap();
}
- BUG_ON(!efi.systab);
+ if (WARN_ON(!efi.systab))
+ goto err;
- if (efi_setup_page_tables(pa, 1 << pg_shift)) {
- clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
- return;
- }
+ if (efi_setup_page_tables(pa, 1 << pg_shift))
+ goto err;
efi_sync_low_kernel_mappings();
@@ -987,9 +983,9 @@ static void __init __efi_enter_virtual_mode(void)
}
if (status != EFI_SUCCESS) {
- pr_alert("Unable to switch EFI into virtual mode (status=%lx)!\n",
- status);
- panic("EFI call to SetVirtualAddressMap() failed!");
+ pr_err("Unable to switch EFI into virtual mode (status=%lx)!\n",
+ status);
+ goto err;
}
/*
@@ -1016,6 +1012,10 @@ static void __init __efi_enter_virtual_mode(void)
/* clean DUMMY object */
efi_delete_dummy_variable();
+ return;
+
+err:
+ clear_bit(EFI_RUNTIME_SERVICES, &efi.flags);
}
void __init efi_enter_virtual_mode(void)
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index ee5d08f25ce45..6db8f3598c800 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -389,11 +389,12 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
return 0;
page = alloc_page(GFP_KERNEL|__GFP_DMA32);
- if (!page)
- panic("Unable to allocate EFI runtime stack < 4GB\n");
+ if (!page) {
+ pr_err("Unable to allocate EFI runtime stack < 4GB\n");
+ return 1;
+ }
- efi_scratch.phys_stack = virt_to_phys(page_address(page));
- efi_scratch.phys_stack += PAGE_SIZE; /* stack grows down */
+ efi_scratch.phys_stack = page_to_phys(page + 1); /* stack grows down */
npages = (_etext - _text) >> PAGE_SHIFT;
text = __pa(_text);
--
2.20.1
From: Paul E. McKenney <[email protected]>
[ Upstream commit 860c8802ace14c646864795e057349c9fb2d60ad ]
Eric Dumazet supplied a KCSAN report of a bug that forces use
of hlist_unhashed_lockless() from sk_unhashed():
------------------------------------------------------------------------
BUG: KCSAN: data-race in inet_unhash / inet_unhash
write to 0xffff8880a69a0170 of 8 bytes by interrupt on cpu 1:
__hlist_nulls_del include/linux/list_nulls.h:88 [inline]
hlist_nulls_del_init_rcu include/linux/rculist_nulls.h:36 [inline]
__sk_nulls_del_node_init_rcu include/net/sock.h:676 [inline]
inet_unhash+0x38f/0x4a0 net/ipv4/inet_hashtables.c:612
tcp_set_state+0xfa/0x3e0 net/ipv4/tcp.c:2249
tcp_done+0x93/0x1e0 net/ipv4/tcp.c:3854
tcp_write_err+0x7e/0xc0 net/ipv4/tcp_timer.c:56
tcp_retransmit_timer+0x9b8/0x16d0 net/ipv4/tcp_timer.c:479
tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:599
tcp_write_timer+0xd1/0xf0 net/ipv4/tcp_timer.c:619
call_timer_fn+0x5f/0x2f0 kernel/time/timer.c:1404
expire_timers kernel/time/timer.c:1449 [inline]
__run_timers kernel/time/timer.c:1773 [inline]
__run_timers kernel/time/timer.c:1740 [inline]
run_timer_softirq+0xc0c/0xcd0 kernel/time/timer.c:1786
__do_softirq+0x115/0x33f kernel/softirq.c:292
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0xbb/0xe0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:536 [inline]
smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
native_safe_halt+0xe/0x10 arch/x86/kernel/paravirt.c:71
arch_cpu_idle+0x1f/0x30 arch/x86/kernel/process.c:571
default_idle_call+0x1e/0x40 kernel/sched/idle.c:94
cpuidle_idle_call kernel/sched/idle.c:154 [inline]
do_idle+0x1af/0x280 kernel/sched/idle.c:263
cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355
start_secondary+0x208/0x260 arch/x86/kernel/smpboot.c:264
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
read to 0xffff8880a69a0170 of 8 bytes by interrupt on cpu 0:
sk_unhashed include/net/sock.h:607 [inline]
inet_unhash+0x3d/0x4a0 net/ipv4/inet_hashtables.c:592
tcp_set_state+0xfa/0x3e0 net/ipv4/tcp.c:2249
tcp_done+0x93/0x1e0 net/ipv4/tcp.c:3854
tcp_write_err+0x7e/0xc0 net/ipv4/tcp_timer.c:56
tcp_retransmit_timer+0x9b8/0x16d0 net/ipv4/tcp_timer.c:479
tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:599
tcp_write_timer+0xd1/0xf0 net/ipv4/tcp_timer.c:619
call_timer_fn+0x5f/0x2f0 kernel/time/timer.c:1404
expire_timers kernel/time/timer.c:1449 [inline]
__run_timers kernel/time/timer.c:1773 [inline]
__run_timers kernel/time/timer.c:1740 [inline]
run_timer_softirq+0xc0c/0xcd0 kernel/time/timer.c:1786
__do_softirq+0x115/0x33f kernel/softirq.c:292
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0xbb/0xe0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:536 [inline]
smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
native_safe_halt+0xe/0x10 arch/x86/kernel/paravirt.c:71
arch_cpu_idle+0x1f/0x30 arch/x86/kernel/process.c:571
default_idle_call+0x1e/0x40 kernel/sched/idle.c:94
cpuidle_idle_call kernel/sched/idle.c:154 [inline]
do_idle+0x1af/0x280 kernel/sched/idle.c:263
cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355
rest_init+0xec/0xf6 init/main.c:452
arch_call_rest_init+0x17/0x37
start_kernel+0x838/0x85e init/main.c:786
x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490
x86_64_start_kernel+0x72/0x76 arch/x86/kernel/head64.c:471
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc6+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
------------------------------------------------------------------------
This commit therefore replaces C-language assignments with WRITE_ONCE()
in include/linux/list_nulls.h and include/linux/rculist_nulls.h.
Reported-by: Eric Dumazet <[email protected]> # For KCSAN
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/list_nulls.h | 8 ++++----
include/linux/rculist_nulls.h | 8 ++++----
2 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
index 3ef96743db8da..1ecd35664e0d3 100644
--- a/include/linux/list_nulls.h
+++ b/include/linux/list_nulls.h
@@ -72,10 +72,10 @@ static inline void hlist_nulls_add_head(struct hlist_nulls_node *n,
struct hlist_nulls_node *first = h->first;
n->next = first;
- n->pprev = &h->first;
+ WRITE_ONCE(n->pprev, &h->first);
h->first = n;
if (!is_a_nulls(first))
- first->pprev = &n->next;
+ WRITE_ONCE(first->pprev, &n->next);
}
static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
@@ -85,13 +85,13 @@ static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
WRITE_ONCE(*pprev, next);
if (!is_a_nulls(next))
- next->pprev = pprev;
+ WRITE_ONCE(next->pprev, pprev);
}
static inline void hlist_nulls_del(struct hlist_nulls_node *n)
{
__hlist_nulls_del(n);
- n->pprev = LIST_POISON2;
+ WRITE_ONCE(n->pprev, LIST_POISON2);
}
/**
diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
index 61974c4c566be..90f2e2232c6d7 100644
--- a/include/linux/rculist_nulls.h
+++ b/include/linux/rculist_nulls.h
@@ -34,7 +34,7 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n)
{
if (!hlist_nulls_unhashed(n)) {
__hlist_nulls_del(n);
- n->pprev = NULL;
+ WRITE_ONCE(n->pprev, NULL);
}
}
@@ -66,7 +66,7 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n)
static inline void hlist_nulls_del_rcu(struct hlist_nulls_node *n)
{
__hlist_nulls_del(n);
- n->pprev = LIST_POISON2;
+ WRITE_ONCE(n->pprev, LIST_POISON2);
}
/**
@@ -94,10 +94,10 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n,
struct hlist_nulls_node *first = h->first;
n->next = first;
- n->pprev = &h->first;
+ WRITE_ONCE(n->pprev, &h->first);
rcu_assign_pointer(hlist_nulls_first_rcu(h), n);
if (!is_a_nulls(first))
- first->pprev = &n->next;
+ WRITE_ONCE(first->pprev, &n->next);
}
/**
--
2.20.1
From: zhangyi (F) <[email protected]>
[ Upstream commit 51f57b01e4a3c7d7bdceffd84de35144e8c538e7 ]
JBD2_REC_ERR flag used to indicate the errno has been updated when jbd2
aborted, and then __ext4_abort() and ext4_handle_error() can invoke
panic if ERRORS_PANIC is specified. But if the journal has been aborted
with zero errno, jbd2_journal_abort() didn't set this flag so we can
no longer panic. Fix this by always record the proper errno in the
journal superblock.
Fixes: 4327ba52afd03 ("ext4, jbd2: ensure entering into panic after recording an error in superblock")
Signed-off-by: zhangyi (F) <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/jbd2/checkpoint.c | 2 +-
fs/jbd2/journal.c | 15 ++++-----------
2 files changed, 5 insertions(+), 12 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 26f8d7e46462e..66409cbd3ed54 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -165,7 +165,7 @@ void __jbd2_log_wait_for_space(journal_t *journal)
"journal space in %s\n", __func__,
journal->j_devname);
WARN_ON(1);
- jbd2_journal_abort(journal, 0);
+ jbd2_journal_abort(journal, -EIO);
}
write_lock(&journal->j_state_lock);
} else {
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 568ca0ca0127c..1a96287f92647 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2142,12 +2142,10 @@ static void __journal_abort_soft (journal_t *journal, int errno)
__jbd2_journal_abort_hard(journal);
- if (errno) {
- jbd2_journal_update_sb_errno(journal);
- write_lock(&journal->j_state_lock);
- journal->j_flags |= JBD2_REC_ERR;
- write_unlock(&journal->j_state_lock);
- }
+ jbd2_journal_update_sb_errno(journal);
+ write_lock(&journal->j_state_lock);
+ journal->j_flags |= JBD2_REC_ERR;
+ write_unlock(&journal->j_state_lock);
}
/**
@@ -2189,11 +2187,6 @@ static void __journal_abort_soft (journal_t *journal, int errno)
* failure to disk. ext3_error, for example, now uses this
* functionality.
*
- * Errors which originate from within the journaling layer will NOT
- * supply an errno; a null errno implies that absolutely no further
- * writes are done to the journal (unless there are any already in
- * progress).
- *
*/
void jbd2_journal_abort(journal_t *journal, int errno)
--
2.20.1
From: yu kuai <[email protected]>
[ Upstream commit bae028e3e521e8cb8caf2cc16a455ce4c55f2332 ]
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c: In function
'amdgpu_atombios_get_connector_info_from_object_table':
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:376:26: warning: variable
'grph_obj_num' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:376:13: warning: variable
'grph_obj_id' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:341:37: warning: variable
'con_obj_type' set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c:341:24: warning: variable
'con_obj_num' set but not used [-Wunused-but-set-variable]
They are never used, so can be removed.
Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
Signed-off-by: yu kuai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c | 19 ++-----------------
1 file changed, 2 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index bf872f694f509..d1fbaea91f580 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -337,17 +337,9 @@ bool amdgpu_atombios_get_connector_info_from_object_table(struct amdgpu_device *
path_size += le16_to_cpu(path->usSize);
if (device_support & le16_to_cpu(path->usDeviceTag)) {
- uint8_t con_obj_id, con_obj_num, con_obj_type;
-
- con_obj_id =
+ uint8_t con_obj_id =
(le16_to_cpu(path->usConnObjectId) & OBJECT_ID_MASK)
>> OBJECT_ID_SHIFT;
- con_obj_num =
- (le16_to_cpu(path->usConnObjectId) & ENUM_ID_MASK)
- >> ENUM_ID_SHIFT;
- con_obj_type =
- (le16_to_cpu(path->usConnObjectId) &
- OBJECT_TYPE_MASK) >> OBJECT_TYPE_SHIFT;
/* Skip TV/CV support */
if ((le16_to_cpu(path->usDeviceTag) ==
@@ -372,14 +364,7 @@ bool amdgpu_atombios_get_connector_info_from_object_table(struct amdgpu_device *
router.ddc_valid = false;
router.cd_valid = false;
for (j = 0; j < ((le16_to_cpu(path->usSize) - 8) / 2); j++) {
- uint8_t grph_obj_id, grph_obj_num, grph_obj_type;
-
- grph_obj_id =
- (le16_to_cpu(path->usGraphicObjIds[j]) &
- OBJECT_ID_MASK) >> OBJECT_ID_SHIFT;
- grph_obj_num =
- (le16_to_cpu(path->usGraphicObjIds[j]) &
- ENUM_ID_MASK) >> ENUM_ID_SHIFT;
+ uint8_t grph_obj_type=
grph_obj_type =
(le16_to_cpu(path->usGraphicObjIds[j]) &
OBJECT_TYPE_MASK) >> OBJECT_TYPE_SHIFT;
--
2.20.1
From: Changbin Du <[email protected]>
[ Upstream commit 248ed51048c40d36728e70914e38bffd7821da57 ]
First, printk() is NMI-context safe now since the safe printk() has been
implemented and it already has an irq_work to make NMI-context safe.
Second, this NMI irq_work actually does not work if a NMI handler causes
panic by watchdog timeout. It has no chance to run in such case, while
the safe printk() will flush its per-cpu buffers before panicking.
While at it, repurpose the irq_work callback into a function which
concentrates the NMI duration checking and makes the code easier to
follow.
[ bp: Massage. ]
Signed-off-by: Changbin Du <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/include/asm/nmi.h | 1 -
arch/x86/kernel/nmi.c | 20 +++++++++-----------
2 files changed, 9 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
index 75ded1d13d98d..9d5d949e662e1 100644
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -41,7 +41,6 @@ struct nmiaction {
struct list_head list;
nmi_handler_t handler;
u64 max_duration;
- struct irq_work irq_work;
unsigned long flags;
const char *name;
};
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 086cf1d1d71d8..0f8b9b900b0e7 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -102,18 +102,22 @@ static int __init nmi_warning_debugfs(void)
}
fs_initcall(nmi_warning_debugfs);
-static void nmi_max_handler(struct irq_work *w)
+static void nmi_check_duration(struct nmiaction *action, u64 duration)
{
- struct nmiaction *a = container_of(w, struct nmiaction, irq_work);
+ u64 whole_msecs = READ_ONCE(action->max_duration);
int remainder_ns, decimal_msecs;
- u64 whole_msecs = READ_ONCE(a->max_duration);
+
+ if (duration < nmi_longest_ns || duration < action->max_duration)
+ return;
+
+ action->max_duration = duration;
remainder_ns = do_div(whole_msecs, (1000 * 1000));
decimal_msecs = remainder_ns / 1000;
printk_ratelimited(KERN_INFO
"INFO: NMI handler (%ps) took too long to run: %lld.%03d msecs\n",
- a->handler, whole_msecs, decimal_msecs);
+ action->handler, whole_msecs, decimal_msecs);
}
static int nmi_handle(unsigned int type, struct pt_regs *regs)
@@ -140,11 +144,7 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
delta = sched_clock() - delta;
trace_nmi_handler(a->handler, (int)delta, thishandled);
- if (delta < nmi_longest_ns || delta < a->max_duration)
- continue;
-
- a->max_duration = delta;
- irq_work_queue(&a->irq_work);
+ nmi_check_duration(a, delta);
}
rcu_read_unlock();
@@ -162,8 +162,6 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
if (!action->handler)
return -EINVAL;
- init_irq_work(&action->irq_work, nmi_max_handler);
-
raw_spin_lock_irqsave(&desc->lock, flags);
/*
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit 504c28c853ec5c626900b914b5833daf0581a344 ]
Change the driver to use portable integer types to avoid
warnings during compile testing:
drivers/net/wan/ixp4xx_hss.c:863:21: error: cast to 'u32 *' (aka 'unsigned int *') from smaller integer type 'int' [-Werror,-Wint-to-pointer-cast]
memcpy_swab32(mem, (u32 *)((int)skb->data & ~3), bytes / 4);
^
drivers/net/wan/ixp4xx_hss.c:979:12: error: incompatible pointer types passing 'u32 *' (aka 'unsigned int *') to parameter of type 'dma_addr_t *' (aka 'unsigned long long *') [-Werror,-Wincompatible-pointer-types]
&port->desc_tab_phys)))
^~~~~~~~~~~~~~~~~~~~
include/linux/dmapool.h:27:20: note: passing argument to parameter 'handle' here
dma_addr_t *handle);
^
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wan/ixp4xx_hss.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wan/ixp4xx_hss.c b/drivers/net/wan/ixp4xx_hss.c
index 6a505c26a3e74..a269ed63d90f7 100644
--- a/drivers/net/wan/ixp4xx_hss.c
+++ b/drivers/net/wan/ixp4xx_hss.c
@@ -261,7 +261,7 @@ struct port {
struct hss_plat_info *plat;
buffer_t *rx_buff_tab[RX_DESCS], *tx_buff_tab[TX_DESCS];
struct desc *desc_tab; /* coherent */
- u32 desc_tab_phys;
+ dma_addr_t desc_tab_phys;
unsigned int id;
unsigned int clock_type, clock_rate, loopback;
unsigned int initialized, carrier;
@@ -861,7 +861,7 @@ static int hss_hdlc_xmit(struct sk_buff *skb, struct net_device *dev)
dev->stats.tx_dropped++;
return NETDEV_TX_OK;
}
- memcpy_swab32(mem, (u32 *)((int)skb->data & ~3), bytes / 4);
+ memcpy_swab32(mem, (u32 *)((uintptr_t)skb->data & ~3), bytes / 4);
dev_kfree_skb(skb);
#endif
--
2.20.1
From: Takashi Iwai <[email protected]>
[ Upstream commit 5da116f164ce265e397b8f59af5c39e4a61d61a5 ]
Remove unused variables that are left over after the conversion of new
PCM ops:
sound/sh/sh_dac_audio.c:166:26: warning: unused variable 'runtime'
sound/sh/sh_dac_audio.c:186:26: warning: unused variable 'runtime'
sound/sh/sh_dac_audio.c:205:26: warning: unused variable 'runtime'
Fixes: 1cc2f8ba0b3e ("ALSA: sh: Convert to the new PCM ops")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/sh/sh_dac_audio.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/sound/sh/sh_dac_audio.c b/sound/sh/sh_dac_audio.c
index 834b2574786f5..6251b5e1b64a2 100644
--- a/sound/sh/sh_dac_audio.c
+++ b/sound/sh/sh_dac_audio.c
@@ -190,7 +190,6 @@ static int snd_sh_dac_pcm_copy(struct snd_pcm_substream *substream,
{
/* channel is not used (interleaved data) */
struct snd_sh_dac *chip = snd_pcm_substream_chip(substream);
- struct snd_pcm_runtime *runtime = substream->runtime;
if (copy_from_user_toio(chip->data_buffer + pos, src, count))
return -EFAULT;
@@ -210,7 +209,6 @@ static int snd_sh_dac_pcm_copy_kernel(struct snd_pcm_substream *substream,
{
/* channel is not used (interleaved data) */
struct snd_sh_dac *chip = snd_pcm_substream_chip(substream);
- struct snd_pcm_runtime *runtime = substream->runtime;
memcpy_toio(chip->data_buffer + pos, src, count);
chip->buffer_end = chip->data_buffer + pos + count;
@@ -229,7 +227,6 @@ static int snd_sh_dac_pcm_silence(struct snd_pcm_substream *substream,
{
/* channel is not used (interleaved data) */
struct snd_sh_dac *chip = snd_pcm_substream_chip(substream);
- struct snd_pcm_runtime *runtime = substream->runtime;
memset_io(chip->data_buffer + pos, 0, count);
chip->buffer_end = chip->data_buffer + pos + count;
--
2.20.1
From: Jason Gunthorpe <[email protected]>
[ Upstream commit 8bdf9dd984c18375d1090ddeb1792511f619c5c1 ]
After device disassociation the uapi_objects are destroyed and freed,
however it is still possible that core code can be holding a kref on the
uobject. When it finally goes to uverbs_uobject_free() via the kref_put()
it can trigger a use-after-free on the uapi_object.
Since needs_kfree_rcu is a micro optimization that only benefits file
uobjects, just get rid of it. There is no harm in using kfree_rcu even if
it isn't required, and the number of involved objects is small.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michael Guralnik <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/core/rdma_core.c | 23 +----------------------
include/rdma/uverbs_types.h | 1 -
2 files changed, 1 insertion(+), 23 deletions(-)
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index c4118bcd51035..c2c9bd72b350f 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -49,13 +49,7 @@ void uverbs_uobject_get(struct ib_uobject *uobject)
static void uverbs_uobject_free(struct kref *ref)
{
- struct ib_uobject *uobj =
- container_of(ref, struct ib_uobject, ref);
-
- if (uobj->uapi_object->type_class->needs_kfree_rcu)
- kfree_rcu(uobj, rcu);
- else
- kfree(uobj);
+ kfree_rcu(container_of(ref, struct ib_uobject, ref), rcu);
}
void uverbs_uobject_put(struct ib_uobject *uobject)
@@ -753,20 +747,6 @@ const struct uverbs_obj_type_class uverbs_idr_class = {
.lookup_put = lookup_put_idr_uobject,
.destroy_hw = destroy_hw_idr_uobject,
.remove_handle = remove_handle_idr_uobject,
- /*
- * When we destroy an object, we first just lock it for WRITE and
- * actually DESTROY it in the finalize stage. So, the problematic
- * scenario is when we just started the finalize stage of the
- * destruction (nothing was executed yet). Now, the other thread
- * fetched the object for READ access, but it didn't lock it yet.
- * The DESTROY thread continues and starts destroying the object.
- * When the other thread continue - without the RCU, it would
- * access freed memory. However, the rcu_read_lock delays the free
- * until the rcu_read_lock of the READ operation quits. Since the
- * exclusive lock of the object is still taken by the DESTROY flow, the
- * READ operation will get -EBUSY and it'll just bail out.
- */
- .needs_kfree_rcu = true,
};
EXPORT_SYMBOL(uverbs_idr_class);
@@ -954,7 +934,6 @@ const struct uverbs_obj_type_class uverbs_fd_class = {
.lookup_put = lookup_put_fd_uobject,
.destroy_hw = destroy_hw_fd_uobject,
.remove_handle = remove_handle_fd_uobject,
- .needs_kfree_rcu = false,
};
EXPORT_SYMBOL(uverbs_fd_class);
diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
index acb1bfa3cc99a..f70155cc73979 100644
--- a/include/rdma/uverbs_types.h
+++ b/include/rdma/uverbs_types.h
@@ -97,7 +97,6 @@ struct uverbs_obj_type_class {
int __must_check (*destroy_hw)(struct ib_uobject *uobj,
enum rdma_remove_reason why);
void (*remove_handle)(struct ib_uobject *uobj);
- u8 needs_kfree_rcu;
};
struct uverbs_obj_type {
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit 1feedf61e7265128244f6993f23421f33dd93dbc ]
Clang warns:
../drivers/tty/synclinkmp.c:1456:3: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
if (C_CRTSCTS(tty)) {
^
../drivers/tty/synclinkmp.c:1453:2: note: previous statement is here
if (I_IXOFF(tty))
^
../drivers/tty/synclinkmp.c:2473:8: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
info->port.tty->hw_stopped = 0;
^
../drivers/tty/synclinkmp.c:2471:7: note: previous statement is here
if ( debug_level >= DEBUG_LEVEL_ISR )
^
../drivers/tty/synclinkmp.c:2482:8: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
info->port.tty->hw_stopped = 1;
^
../drivers/tty/synclinkmp.c:2480:7: note: previous statement is here
if ( debug_level >= DEBUG_LEVEL_ISR )
^
../drivers/tty/synclinkmp.c:2809:3: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
if (I_BRKINT(info->port.tty) || I_PARMRK(info->port.tty))
^
../drivers/tty/synclinkmp.c:2807:2: note: previous statement is here
if (I_INPCK(info->port.tty))
^
../drivers/tty/synclinkmp.c:3246:3: warning: misleading indentation;
statement is not part of the previous 'else' [-Wmisleading-indentation]
set_signals(info);
^
../drivers/tty/synclinkmp.c:3244:2: note: previous statement is here
else
^
5 warnings generated.
The indentation on these lines is not at all consistent, tabs and spaces
are mixed together. Convert to just using tabs to be consistent with the
Linux kernel coding style and eliminate these warnings from clang.
Link: https://github.com/ClangBuiltLinux/linux/issues/823
Signed-off-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/tty/synclinkmp.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/drivers/tty/synclinkmp.c b/drivers/tty/synclinkmp.c
index 1e4d5b9c981a4..57c2c647af61e 100644
--- a/drivers/tty/synclinkmp.c
+++ b/drivers/tty/synclinkmp.c
@@ -1454,10 +1454,10 @@ static void throttle(struct tty_struct * tty)
if (I_IXOFF(tty))
send_xchar(tty, STOP_CHAR(tty));
- if (C_CRTSCTS(tty)) {
+ if (C_CRTSCTS(tty)) {
spin_lock_irqsave(&info->lock,flags);
info->serial_signals &= ~SerialSignal_RTS;
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
}
}
@@ -1483,10 +1483,10 @@ static void unthrottle(struct tty_struct * tty)
send_xchar(tty, START_CHAR(tty));
}
- if (C_CRTSCTS(tty)) {
+ if (C_CRTSCTS(tty)) {
spin_lock_irqsave(&info->lock,flags);
info->serial_signals |= SerialSignal_RTS;
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
}
}
@@ -2471,7 +2471,7 @@ static void isr_io_pin( SLMP_INFO *info, u16 status )
if (status & SerialSignal_CTS) {
if ( debug_level >= DEBUG_LEVEL_ISR )
printk("CTS tx start...");
- info->port.tty->hw_stopped = 0;
+ info->port.tty->hw_stopped = 0;
tx_start(info);
info->pending_bh |= BH_TRANSMIT;
return;
@@ -2480,7 +2480,7 @@ static void isr_io_pin( SLMP_INFO *info, u16 status )
if (!(status & SerialSignal_CTS)) {
if ( debug_level >= DEBUG_LEVEL_ISR )
printk("CTS tx stop...");
- info->port.tty->hw_stopped = 1;
+ info->port.tty->hw_stopped = 1;
tx_stop(info);
}
}
@@ -2807,8 +2807,8 @@ static void change_params(SLMP_INFO *info)
info->read_status_mask2 = OVRN;
if (I_INPCK(info->port.tty))
info->read_status_mask2 |= PE | FRME;
- if (I_BRKINT(info->port.tty) || I_PARMRK(info->port.tty))
- info->read_status_mask1 |= BRKD;
+ if (I_BRKINT(info->port.tty) || I_PARMRK(info->port.tty))
+ info->read_status_mask1 |= BRKD;
if (I_IGNPAR(info->port.tty))
info->ignore_status_mask2 |= PE | FRME;
if (I_IGNBRK(info->port.tty)) {
@@ -3178,7 +3178,7 @@ static int tiocmget(struct tty_struct *tty)
unsigned long flags;
spin_lock_irqsave(&info->lock,flags);
- get_signals(info);
+ get_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
result = ((info->serial_signals & SerialSignal_RTS) ? TIOCM_RTS : 0) |
@@ -3216,7 +3216,7 @@ static int tiocmset(struct tty_struct *tty,
info->serial_signals &= ~SerialSignal_DTR;
spin_lock_irqsave(&info->lock,flags);
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
return 0;
@@ -3228,7 +3228,7 @@ static int carrier_raised(struct tty_port *port)
unsigned long flags;
spin_lock_irqsave(&info->lock,flags);
- get_signals(info);
+ get_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
return (info->serial_signals & SerialSignal_DCD) ? 1 : 0;
@@ -3244,7 +3244,7 @@ static void dtr_rts(struct tty_port *port, int on)
info->serial_signals |= SerialSignal_RTS | SerialSignal_DTR;
else
info->serial_signals &= ~(SerialSignal_RTS | SerialSignal_DTR);
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
}
--
2.20.1
From: Vinay Kumar Yadav <[email protected]>
[ Upstream commit 93e23eb2ed6c11b4f483c8111ac155ec2b1f3042 ]
Freed work request skbs when connection terminates.
enqueue_wr()/ dequeue_wr() is shared between softirq
and application contexts, should be protected by socket
lock. Moved dequeue_wr() to appropriate file.
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/crypto/chelsio/chtls/chtls_cm.c | 27 +++++++++++++------------
drivers/crypto/chelsio/chtls/chtls_cm.h | 21 +++++++++++++++++++
drivers/crypto/chelsio/chtls/chtls_hw.c | 3 +++
3 files changed, 38 insertions(+), 13 deletions(-)
diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c b/drivers/crypto/chelsio/chtls/chtls_cm.c
index 8b749c721c871..28d24118c6450 100644
--- a/drivers/crypto/chelsio/chtls/chtls_cm.c
+++ b/drivers/crypto/chelsio/chtls/chtls_cm.c
@@ -731,6 +731,14 @@ static int chtls_close_listsrv_rpl(struct chtls_dev *cdev, struct sk_buff *skb)
return 0;
}
+static void chtls_purge_wr_queue(struct sock *sk)
+{
+ struct sk_buff *skb;
+
+ while ((skb = dequeue_wr(sk)) != NULL)
+ kfree_skb(skb);
+}
+
static void chtls_release_resources(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
@@ -745,6 +753,11 @@ static void chtls_release_resources(struct sock *sk)
kfree_skb(csk->txdata_skb_cache);
csk->txdata_skb_cache = NULL;
+ if (csk->wr_credits != csk->wr_max_credits) {
+ chtls_purge_wr_queue(sk);
+ chtls_reset_wr_list(csk);
+ }
+
if (csk->l2t_entry) {
cxgb4_l2t_release(csk->l2t_entry);
csk->l2t_entry = NULL;
@@ -1714,6 +1727,7 @@ static void chtls_peer_close(struct sock *sk, struct sk_buff *skb)
else
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
+ kfree_skb(skb);
}
static void chtls_close_con_rpl(struct sock *sk, struct sk_buff *skb)
@@ -2041,19 +2055,6 @@ rel_skb:
return 0;
}
-static struct sk_buff *dequeue_wr(struct sock *sk)
-{
- struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
- struct sk_buff *skb = csk->wr_skb_head;
-
- if (likely(skb)) {
- /* Don't bother clearing the tail */
- csk->wr_skb_head = WR_SKB_CB(skb)->next_wr;
- WR_SKB_CB(skb)->next_wr = NULL;
- }
- return skb;
-}
-
static void chtls_rx_ack(struct sock *sk, struct sk_buff *skb)
{
struct cpl_fw4_ack *hdr = cplhdr(skb) + RSS_HDR;
diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.h b/drivers/crypto/chelsio/chtls/chtls_cm.h
index 78eb3afa3a80d..4282d8a4eae48 100644
--- a/drivers/crypto/chelsio/chtls/chtls_cm.h
+++ b/drivers/crypto/chelsio/chtls/chtls_cm.h
@@ -188,6 +188,12 @@ static inline void chtls_kfree_skb(struct sock *sk, struct sk_buff *skb)
kfree_skb(skb);
}
+static inline void chtls_reset_wr_list(struct chtls_sock *csk)
+{
+ csk->wr_skb_head = NULL;
+ csk->wr_skb_tail = NULL;
+}
+
static inline void enqueue_wr(struct chtls_sock *csk, struct sk_buff *skb)
{
WR_SKB_CB(skb)->next_wr = NULL;
@@ -200,4 +206,19 @@ static inline void enqueue_wr(struct chtls_sock *csk, struct sk_buff *skb)
WR_SKB_CB(csk->wr_skb_tail)->next_wr = skb;
csk->wr_skb_tail = skb;
}
+
+static inline struct sk_buff *dequeue_wr(struct sock *sk)
+{
+ struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+ struct sk_buff *skb = NULL;
+
+ skb = csk->wr_skb_head;
+
+ if (likely(skb)) {
+ /* Don't bother clearing the tail */
+ csk->wr_skb_head = WR_SKB_CB(skb)->next_wr;
+ WR_SKB_CB(skb)->next_wr = NULL;
+ }
+ return skb;
+}
#endif
diff --git a/drivers/crypto/chelsio/chtls/chtls_hw.c b/drivers/crypto/chelsio/chtls/chtls_hw.c
index 4909607558644..64d24823c65aa 100644
--- a/drivers/crypto/chelsio/chtls/chtls_hw.c
+++ b/drivers/crypto/chelsio/chtls/chtls_hw.c
@@ -361,6 +361,7 @@ int chtls_setkey(struct chtls_sock *csk, u32 keylen, u32 optname)
kwr->sc_imm.cmd_more = cpu_to_be32(ULPTX_CMD_V(ULP_TX_SC_IMM));
kwr->sc_imm.len = cpu_to_be32(klen);
+ lock_sock(sk);
/* key info */
kctx = (struct _key_ctx *)(kwr + 1);
ret = chtls_key_info(csk, kctx, keylen, optname);
@@ -399,8 +400,10 @@ int chtls_setkey(struct chtls_sock *csk, u32 keylen, u32 optname)
csk->tlshws.txkey = keyid;
}
+ release_sock(sk);
return ret;
out_notcb:
+ release_sock(sk);
free_tls_keyid(sk);
out_nokey:
kfree_skb(skb);
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit 7483e7a939c074d887450ef1c4d9ccc5909405f8 ]
With CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3, the stack usage in vme_fake
grows above the warning limit:
drivers/vme/bridges/vme_fake.c: In function 'fake_master_read':
drivers/vme/bridges/vme_fake.c:610:1: error: the frame size of 1160 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
drivers/vme/bridges/vme_fake.c: In function 'fake_master_write':
drivers/vme/bridges/vme_fake.c:797:1: error: the frame size of 1160 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
The problem is that in some configurations, each call to
fake_vmereadX() puts another variable on the stack.
Reduce the amount of inlining to get back to the previous state,
with no function using more than 200 bytes each.
Signed-off-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/vme/bridges/vme_fake.c | 30 ++++++++++++++++++------------
1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/vme/bridges/vme_fake.c b/drivers/vme/bridges/vme_fake.c
index 7d83691047f47..685a43bdc2a1d 100644
--- a/drivers/vme/bridges/vme_fake.c
+++ b/drivers/vme/bridges/vme_fake.c
@@ -418,8 +418,9 @@ static void fake_lm_check(struct fake_driver *bridge, unsigned long long addr,
}
}
-static u8 fake_vmeread8(struct fake_driver *bridge, unsigned long long addr,
- u32 aspace, u32 cycle)
+static noinline_for_stack u8 fake_vmeread8(struct fake_driver *bridge,
+ unsigned long long addr,
+ u32 aspace, u32 cycle)
{
u8 retval = 0xff;
int i;
@@ -450,8 +451,9 @@ static u8 fake_vmeread8(struct fake_driver *bridge, unsigned long long addr,
return retval;
}
-static u16 fake_vmeread16(struct fake_driver *bridge, unsigned long long addr,
- u32 aspace, u32 cycle)
+static noinline_for_stack u16 fake_vmeread16(struct fake_driver *bridge,
+ unsigned long long addr,
+ u32 aspace, u32 cycle)
{
u16 retval = 0xffff;
int i;
@@ -482,8 +484,9 @@ static u16 fake_vmeread16(struct fake_driver *bridge, unsigned long long addr,
return retval;
}
-static u32 fake_vmeread32(struct fake_driver *bridge, unsigned long long addr,
- u32 aspace, u32 cycle)
+static noinline_for_stack u32 fake_vmeread32(struct fake_driver *bridge,
+ unsigned long long addr,
+ u32 aspace, u32 cycle)
{
u32 retval = 0xffffffff;
int i;
@@ -613,8 +616,9 @@ out:
return retval;
}
-static void fake_vmewrite8(struct fake_driver *bridge, u8 *buf,
- unsigned long long addr, u32 aspace, u32 cycle)
+static noinline_for_stack void fake_vmewrite8(struct fake_driver *bridge,
+ u8 *buf, unsigned long long addr,
+ u32 aspace, u32 cycle)
{
int i;
unsigned long long start, end, offset;
@@ -643,8 +647,9 @@ static void fake_vmewrite8(struct fake_driver *bridge, u8 *buf,
}
-static void fake_vmewrite16(struct fake_driver *bridge, u16 *buf,
- unsigned long long addr, u32 aspace, u32 cycle)
+static noinline_for_stack void fake_vmewrite16(struct fake_driver *bridge,
+ u16 *buf, unsigned long long addr,
+ u32 aspace, u32 cycle)
{
int i;
unsigned long long start, end, offset;
@@ -673,8 +678,9 @@ static void fake_vmewrite16(struct fake_driver *bridge, u16 *buf,
}
-static void fake_vmewrite32(struct fake_driver *bridge, u32 *buf,
- unsigned long long addr, u32 aspace, u32 cycle)
+static noinline_for_stack void fake_vmewrite32(struct fake_driver *bridge,
+ u32 *buf, unsigned long long addr,
+ u32 aspace, u32 cycle)
{
int i;
unsigned long long start, end, offset;
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit c497ae2077c055b85c1bf04f3d182a84bd8f365b ]
The rtl8188 copy of the os_dep support code causes a
warning about a very significant stack usage in the translate_scan()
function:
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c: In function 'translate_scan':
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c:306:1: error: the frame size of 1560 bytes is larger than 1400 bytes [-Werror=frame-larger-than=]
Use the same trick as in the rtl8723bs copy of the same function, and
allocate it dynamically.
Signed-off-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
index bee3c3a7a7a99..2db4444267a74 100644
--- a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
+++ b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
@@ -229,18 +229,21 @@ static char *translate_scan(struct adapter *padapter,
/* parsing WPA/WPA2 IE */
{
- u8 buf[MAX_WPA_IE_LEN];
+ u8 *buf;
u8 wpa_ie[255], rsn_ie[255];
u16 wpa_len = 0, rsn_len = 0;
u8 *p;
+ buf = kzalloc(MAX_WPA_IE_LEN, GFP_ATOMIC);
+ if (!buf)
+ return start;
+
rtw_get_sec_ie(pnetwork->network.ies, pnetwork->network.ie_length, rsn_ie, &rsn_len, wpa_ie, &wpa_len);
RT_TRACE(_module_rtl871x_mlme_c_, _drv_info_, ("rtw_wx_get_scan: ssid =%s\n", pnetwork->network.Ssid.Ssid));
RT_TRACE(_module_rtl871x_mlme_c_, _drv_info_, ("rtw_wx_get_scan: wpa_len =%d rsn_len =%d\n", wpa_len, rsn_len));
if (wpa_len > 0) {
p = buf;
- memset(buf, 0, MAX_WPA_IE_LEN);
p += sprintf(p, "wpa_ie=");
for (i = 0; i < wpa_len; i++)
p += sprintf(p, "%02x", wpa_ie[i]);
@@ -257,7 +260,6 @@ static char *translate_scan(struct adapter *padapter,
}
if (rsn_len > 0) {
p = buf;
- memset(buf, 0, MAX_WPA_IE_LEN);
p += sprintf(p, "rsn_ie=");
for (i = 0; i < rsn_len; i++)
p += sprintf(p, "%02x", rsn_ie[i]);
@@ -271,6 +273,7 @@ static char *translate_scan(struct adapter *padapter,
iwe.u.data.length = rsn_len;
start = iwe_stream_add_point(info, start, stop, &iwe, rsn_ie);
}
+ kfree(buf);
}
{/* parsing WPS IE */
--
2.20.1
From: Mike Marciniszyn <[email protected]>
[ Upstream commit 5ffd048698ea5139743acd45e8ab388a683642b8 ]
All other code paths increment some form of drop counter.
This was missed in the original implementation.
Fixes: 82c2611daaf0 ("staging/rdma/hfi1: Handle packets with invalid RHF on context 0")
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Kaike Wan <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/hw/hfi1/chip.c | 10 ++++++++++
drivers/infiniband/hw/hfi1/chip.h | 1 +
drivers/infiniband/hw/hfi1/driver.c | 1 +
drivers/infiniband/hw/hfi1/hfi.h | 2 ++
4 files changed, 14 insertions(+)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index b09a4b1cf397b..1221faea75a68 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -1687,6 +1687,14 @@ static u64 access_sw_pio_drain(const struct cntr_entry *entry,
return dd->verbs_dev.n_piodrain;
}
+static u64 access_sw_ctx0_seq_drop(const struct cntr_entry *entry,
+ void *context, int vl, int mode, u64 data)
+{
+ struct hfi1_devdata *dd = context;
+
+ return dd->ctx0_seq_drop;
+}
+
static u64 access_sw_vtx_wait(const struct cntr_entry *entry,
void *context, int vl, int mode, u64 data)
{
@@ -4247,6 +4255,8 @@ static struct cntr_entry dev_cntrs[DEV_CNTR_LAST] = {
access_sw_cpu_intr),
[C_SW_CPU_RCV_LIM] = CNTR_ELEM("RcvLimit", 0, 0, CNTR_NORMAL,
access_sw_cpu_rcv_limit),
+[C_SW_CTX0_SEQ_DROP] = CNTR_ELEM("SeqDrop0", 0, 0, CNTR_NORMAL,
+ access_sw_ctx0_seq_drop),
[C_SW_VTX_WAIT] = CNTR_ELEM("vTxWait", 0, 0, CNTR_NORMAL,
access_sw_vtx_wait),
[C_SW_PIO_WAIT] = CNTR_ELEM("PioWait", 0, 0, CNTR_NORMAL,
diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
index 36b04d6300e54..c9a352d8a7e13 100644
--- a/drivers/infiniband/hw/hfi1/chip.h
+++ b/drivers/infiniband/hw/hfi1/chip.h
@@ -909,6 +909,7 @@ enum {
C_DC_PG_STS_TX_MBE_CNT,
C_SW_CPU_INTR,
C_SW_CPU_RCV_LIM,
+ C_SW_CTX0_SEQ_DROP,
C_SW_VTX_WAIT,
C_SW_PIO_WAIT,
C_SW_PIO_DRAIN,
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index d5277c23cba60..769e114567a03 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -734,6 +734,7 @@ static noinline int skip_rcv_packet(struct hfi1_packet *packet, int thread)
{
int ret;
+ packet->rcd->dd->ctx0_seq_drop++;
/* Set up for the next packet */
packet->rhqoff += packet->rsize;
if (packet->rhqoff >= packet->maxcnt)
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index ab981874c71cc..e38de547785d0 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1093,6 +1093,8 @@ struct hfi1_devdata {
char *boardname; /* human readable board info */
+ u64 ctx0_seq_drop;
+
/* reset value */
u64 z_int_counter;
u64 z_rcv_limit;
--
2.20.1
From: Chanwoo Choi <[email protected]>
[ Upstream commit eff5d31f7407fa9d31fb840106f1593399457298 ]
To build test, add COMPILE_TEST depedency to both ARM_RK3399_DMC_DEVFREQ
and DEVFREQ_EVENT_ROCKCHIP_DFI configuration. And ARM_RK3399_DMC_DEVFREQ
used the SMCCC interface so that add HAVE_ARM_SMCCC dependency to prevent
the build break.
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/devfreq/Kconfig | 3 ++-
drivers/devfreq/event/Kconfig | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index 6a172d338f6dc..4c4ec68b0566d 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -103,7 +103,8 @@ config ARM_TEGRA_DEVFREQ
config ARM_RK3399_DMC_DEVFREQ
tristate "ARM RK3399 DMC DEVFREQ Driver"
- depends on ARCH_ROCKCHIP
+ depends on (ARCH_ROCKCHIP && HAVE_ARM_SMCCC) || \
+ (COMPILE_TEST && HAVE_ARM_SMCCC)
select DEVFREQ_EVENT_ROCKCHIP_DFI
select DEVFREQ_GOV_SIMPLE_ONDEMAND
select PM_DEVFREQ_EVENT
diff --git a/drivers/devfreq/event/Kconfig b/drivers/devfreq/event/Kconfig
index cd949800eed96..8851bc4e8e3e1 100644
--- a/drivers/devfreq/event/Kconfig
+++ b/drivers/devfreq/event/Kconfig
@@ -33,7 +33,7 @@ config DEVFREQ_EVENT_EXYNOS_PPMU
config DEVFREQ_EVENT_ROCKCHIP_DFI
tristate "ROCKCHIP DFI DEVFREQ event Driver"
- depends on ARCH_ROCKCHIP
+ depends on ARCH_ROCKCHIP || COMPILE_TEST
help
This add the devfreq-event driver for Rockchip SoC. It provides DFI
(DDR Monitor Module) driver to count ddr load.
--
2.20.1
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit 02aeb2f21530c98fc3ca51028eda742a3fafbd9f ]
pinmux_func_gpios[] contains a hole due to the missing function GPIO
definition for the "CTX0&CTX1" signal, which is the logical "AND" of the
first two CAN outputs.
A closer look reveals other issues:
- Some functionality is available on alternative pins, but the
PINMUX_DATA() entries is using the wrong marks,
- Several configurations are missing.
Fix this by:
- Renaming CTX0CTX1CTX2_MARK, CRX0CRX1_PJ22_MARK, and
CRX0CRX1CRX2_PJ20_MARK to CTX0_CTX1_CTX2_MARK, CRX0_CRX1_PJ22_MARK,
resp. CRX0_CRX1_CRX2_PJ20_MARK for consistency with the
corresponding enum IDs,
- Adding all missing enum IDs and marks,
- Use the right (*_PJ2x) variants for alternative pins,
- Adding all missing configurations to pinmux_data[],
- Adding all missing function GPIO definitions to pinmux_func_gpios[].
See SH7268 Group, SH7269 Group User’s Manual: Hardware, Rev. 2.00:
[1] Table 1.4 List of Pins
[2] Figure 23.29 Connection Example when Using Channels 0 and 1 as One
Channel (64 Mailboxes × 1 Channel) and Channel 2 as One Channel
(32 Mailboxes × 1 Channel),
[3] Figure 23.30 Connection Example when Using Channels 0, 1, and 2 as
One Channel (96 Mailboxes × 1 Channel),
[4] Table 48.3 Multiplexed Pins (Port B),
[5] Table 48.4 Multiplexed Pins (Port C),
[6] Table 48.10 Multiplexed Pins (Port J),
[7] Section 48.2.4 Port B Control Registers 0 to 5 (PBCR0 to PBCR5).
Signed-off-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/sh/include/cpu-sh2a/cpu/sh7269.h | 11 ++++++--
drivers/pinctrl/sh-pfc/pfc-sh7269.c | 39 ++++++++++++++++++---------
2 files changed, 36 insertions(+), 14 deletions(-)
diff --git a/arch/sh/include/cpu-sh2a/cpu/sh7269.h b/arch/sh/include/cpu-sh2a/cpu/sh7269.h
index d516e5d488180..b887cc402b712 100644
--- a/arch/sh/include/cpu-sh2a/cpu/sh7269.h
+++ b/arch/sh/include/cpu-sh2a/cpu/sh7269.h
@@ -78,8 +78,15 @@ enum {
GPIO_FN_WDTOVF,
/* CAN */
- GPIO_FN_CTX1, GPIO_FN_CRX1, GPIO_FN_CTX0, GPIO_FN_CTX0_CTX1,
- GPIO_FN_CRX0, GPIO_FN_CRX0_CRX1, GPIO_FN_CRX0_CRX1_CRX2,
+ GPIO_FN_CTX2, GPIO_FN_CRX2,
+ GPIO_FN_CTX1, GPIO_FN_CRX1,
+ GPIO_FN_CTX0, GPIO_FN_CRX0,
+ GPIO_FN_CTX0_CTX1, GPIO_FN_CRX0_CRX1,
+ GPIO_FN_CTX0_CTX1_CTX2, GPIO_FN_CRX0_CRX1_CRX2,
+ GPIO_FN_CTX2_PJ21, GPIO_FN_CRX2_PJ20,
+ GPIO_FN_CTX1_PJ23, GPIO_FN_CRX1_PJ22,
+ GPIO_FN_CTX0_CTX1_PJ23, GPIO_FN_CRX0_CRX1_PJ22,
+ GPIO_FN_CTX0_CTX1_CTX2_PJ21, GPIO_FN_CRX0_CRX1_CRX2_PJ20,
/* DMAC */
GPIO_FN_TEND0, GPIO_FN_DACK0, GPIO_FN_DREQ0,
diff --git a/drivers/pinctrl/sh-pfc/pfc-sh7269.c b/drivers/pinctrl/sh-pfc/pfc-sh7269.c
index cfdb4fc177c3e..3df0c0d139d08 100644
--- a/drivers/pinctrl/sh-pfc/pfc-sh7269.c
+++ b/drivers/pinctrl/sh-pfc/pfc-sh7269.c
@@ -740,13 +740,12 @@ enum {
CRX0_MARK, CTX0_MARK,
CRX1_MARK, CTX1_MARK,
CRX2_MARK, CTX2_MARK,
- CRX0_CRX1_MARK,
- CRX0_CRX1_CRX2_MARK,
- CTX0CTX1CTX2_MARK,
+ CRX0_CRX1_MARK, CTX0_CTX1_MARK,
+ CRX0_CRX1_CRX2_MARK, CTX0_CTX1_CTX2_MARK,
CRX1_PJ22_MARK, CTX1_PJ23_MARK,
CRX2_PJ20_MARK, CTX2_PJ21_MARK,
- CRX0CRX1_PJ22_MARK,
- CRX0CRX1CRX2_PJ20_MARK,
+ CRX0_CRX1_PJ22_MARK, CTX0_CTX1_PJ23_MARK,
+ CRX0_CRX1_CRX2_PJ20_MARK, CTX0_CTX1_CTX2_PJ21_MARK,
/* VDC */
DV_CLK_MARK,
@@ -824,6 +823,7 @@ static const u16 pinmux_data[] = {
PINMUX_DATA(CS3_MARK, PC8MD_001),
PINMUX_DATA(TXD7_MARK, PC8MD_010),
PINMUX_DATA(CTX1_MARK, PC8MD_011),
+ PINMUX_DATA(CTX0_CTX1_MARK, PC8MD_100),
PINMUX_DATA(PC7_DATA, PC7MD_000),
PINMUX_DATA(CKE_MARK, PC7MD_001),
@@ -836,11 +836,12 @@ static const u16 pinmux_data[] = {
PINMUX_DATA(CAS_MARK, PC6MD_001),
PINMUX_DATA(SCK7_MARK, PC6MD_010),
PINMUX_DATA(CTX0_MARK, PC6MD_011),
+ PINMUX_DATA(CTX0_CTX1_CTX2_MARK, PC6MD_100),
PINMUX_DATA(PC5_DATA, PC5MD_000),
PINMUX_DATA(RAS_MARK, PC5MD_001),
PINMUX_DATA(CRX0_MARK, PC5MD_011),
- PINMUX_DATA(CTX0CTX1CTX2_MARK, PC5MD_100),
+ PINMUX_DATA(CTX0_CTX1_CTX2_MARK, PC5MD_100),
PINMUX_DATA(IRQ0_PC_MARK, PC5MD_101),
PINMUX_DATA(PC4_DATA, PC4MD_00),
@@ -1292,30 +1293,32 @@ static const u16 pinmux_data[] = {
PINMUX_DATA(LCD_DATA23_PJ23_MARK, PJ23MD_010),
PINMUX_DATA(LCD_TCON6_MARK, PJ23MD_011),
PINMUX_DATA(IRQ3_PJ_MARK, PJ23MD_100),
- PINMUX_DATA(CTX1_MARK, PJ23MD_101),
+ PINMUX_DATA(CTX1_PJ23_MARK, PJ23MD_101),
+ PINMUX_DATA(CTX0_CTX1_PJ23_MARK, PJ23MD_110),
PINMUX_DATA(PJ22_DATA, PJ22MD_000),
PINMUX_DATA(DV_DATA22_MARK, PJ22MD_001),
PINMUX_DATA(LCD_DATA22_PJ22_MARK, PJ22MD_010),
PINMUX_DATA(LCD_TCON5_MARK, PJ22MD_011),
PINMUX_DATA(IRQ2_PJ_MARK, PJ22MD_100),
- PINMUX_DATA(CRX1_MARK, PJ22MD_101),
- PINMUX_DATA(CRX0_CRX1_MARK, PJ22MD_110),
+ PINMUX_DATA(CRX1_PJ22_MARK, PJ22MD_101),
+ PINMUX_DATA(CRX0_CRX1_PJ22_MARK, PJ22MD_110),
PINMUX_DATA(PJ21_DATA, PJ21MD_000),
PINMUX_DATA(DV_DATA21_MARK, PJ21MD_001),
PINMUX_DATA(LCD_DATA21_PJ21_MARK, PJ21MD_010),
PINMUX_DATA(LCD_TCON4_MARK, PJ21MD_011),
PINMUX_DATA(IRQ1_PJ_MARK, PJ21MD_100),
- PINMUX_DATA(CTX2_MARK, PJ21MD_101),
+ PINMUX_DATA(CTX2_PJ21_MARK, PJ21MD_101),
+ PINMUX_DATA(CTX0_CTX1_CTX2_PJ21_MARK, PJ21MD_110),
PINMUX_DATA(PJ20_DATA, PJ20MD_000),
PINMUX_DATA(DV_DATA20_MARK, PJ20MD_001),
PINMUX_DATA(LCD_DATA20_PJ20_MARK, PJ20MD_010),
PINMUX_DATA(LCD_TCON3_MARK, PJ20MD_011),
PINMUX_DATA(IRQ0_PJ_MARK, PJ20MD_100),
- PINMUX_DATA(CRX2_MARK, PJ20MD_101),
- PINMUX_DATA(CRX0CRX1CRX2_PJ20_MARK, PJ20MD_110),
+ PINMUX_DATA(CRX2_PJ20_MARK, PJ20MD_101),
+ PINMUX_DATA(CRX0_CRX1_CRX2_PJ20_MARK, PJ20MD_110),
PINMUX_DATA(PJ19_DATA, PJ19MD_000),
PINMUX_DATA(DV_DATA19_MARK, PJ19MD_001),
@@ -1666,12 +1669,24 @@ static const struct pinmux_func pinmux_func_gpios[] = {
GPIO_FN(WDTOVF),
/* CAN */
+ GPIO_FN(CTX2),
+ GPIO_FN(CRX2),
GPIO_FN(CTX1),
GPIO_FN(CRX1),
GPIO_FN(CTX0),
GPIO_FN(CRX0),
+ GPIO_FN(CTX0_CTX1),
GPIO_FN(CRX0_CRX1),
+ GPIO_FN(CTX0_CTX1_CTX2),
GPIO_FN(CRX0_CRX1_CRX2),
+ GPIO_FN(CTX2_PJ21),
+ GPIO_FN(CRX2_PJ20),
+ GPIO_FN(CTX1_PJ23),
+ GPIO_FN(CRX1_PJ22),
+ GPIO_FN(CTX0_CTX1_PJ23),
+ GPIO_FN(CRX0_CRX1_PJ22),
+ GPIO_FN(CTX0_CTX1_CTX2_PJ21),
+ GPIO_FN(CRX0_CRX1_CRX2_PJ20),
/* DMAC */
GPIO_FN(TEND0),
--
2.20.1
From: Ben Skeggs <[email protected]>
[ Upstream commit 633cc9beeb6f9b5fa2f17a2a9d0e2790cb6c3de7 ]
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c
index 16ad91c91a7be..f18ce6ff5b7ed 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fault/base.c
@@ -150,6 +150,7 @@ nvkm_fault_dtor(struct nvkm_subdev *subdev)
struct nvkm_fault *fault = nvkm_fault(subdev);
int i;
+ nvkm_notify_fini(&fault->nrpfb);
nvkm_event_fini(&fault->event);
for (i = 0; i < fault->buffer_nr; i++) {
--
2.20.1
From: Hanjun Guo <[email protected]>
[ Upstream commit 3c23b83a88d00383e1d498cfa515249aa2fe0238 ]
The IORT specification [0] (Section 3, table 4, page 9) defines the
'Number of IDs' as 'The number of IDs in the range minus one'.
However, the IORT ID mapping function iort_id_map() treats the 'Number
of IDs' field as if it were the full IDs mapping count, with the
following check in place to detect out of boundary input IDs:
InputID >= Input base + Number of IDs
This check is flawed in that it considers the 'Number of IDs' field as
the full number of IDs mapping and disregards the 'minus one' from
the IDs count.
The correct check in iort_id_map() should be implemented as:
InputID > Input base + Number of IDs
this implements the specification correctly but unfortunately it breaks
existing firmwares that erroneously set the 'Number of IDs' as the full
IDs mapping count rather than IDs mapping count minus one.
e.g.
PCI hostbridge mapping entry 1:
Input base: 0x1000
ID Count: 0x100
Output base: 0x1000
Output reference: 0xC4 //ITS reference
PCI hostbridge mapping entry 2:
Input base: 0x1100
ID Count: 0x100
Output base: 0x2000
Output reference: 0xD4 //ITS reference
Two mapping entries which the second entry's Input base = the first
entry's Input base + ID count, so for InputID 0x1100 and with the
correct InputID check in place in iort_id_map() the kernel would map
the InputID to ITS 0xC4 not 0xD4 as it would be expected.
Therefore, to keep supporting existing flawed firmwares, introduce a
workaround that instructs the kernel to use the old InputID range check
logic in iort_id_map(), so that we can support both firmwares written
with the flawed 'Number of IDs' logic and the correct one as defined in
the specifications.
[0]: http://infocenter.arm.com/help/topic/com.arm.doc.den0049d/DEN0049D_IO_Remapping_Table.pdf
Reported-by: Pankaj Bansal <[email protected]>
Link: https://lore.kernel.org/linux-acpi/[email protected]/
Signed-off-by: Hanjun Guo <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Cc: Pankaj Bansal <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Sudeep Holla <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Robin Murphy <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/acpi/arm64/iort.c | 57 +++++++++++++++++++++++++++++++++++++--
1 file changed, 55 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index e11b5da6f828f..7d86468300b78 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -306,6 +306,59 @@ out:
return status;
}
+struct iort_workaround_oem_info {
+ char oem_id[ACPI_OEM_ID_SIZE + 1];
+ char oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
+ u32 oem_revision;
+};
+
+static bool apply_id_count_workaround;
+
+static struct iort_workaround_oem_info wa_info[] __initdata = {
+ {
+ .oem_id = "HISI ",
+ .oem_table_id = "HIP07 ",
+ .oem_revision = 0,
+ }, {
+ .oem_id = "HISI ",
+ .oem_table_id = "HIP08 ",
+ .oem_revision = 0,
+ }
+};
+
+static void __init
+iort_check_id_count_workaround(struct acpi_table_header *tbl)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
+ if (!memcmp(wa_info[i].oem_id, tbl->oem_id, ACPI_OEM_ID_SIZE) &&
+ !memcmp(wa_info[i].oem_table_id, tbl->oem_table_id, ACPI_OEM_TABLE_ID_SIZE) &&
+ wa_info[i].oem_revision == tbl->oem_revision) {
+ apply_id_count_workaround = true;
+ pr_warn(FW_BUG "ID count for ID mapping entry is wrong, applying workaround\n");
+ break;
+ }
+ }
+}
+
+static inline u32 iort_get_map_max(struct acpi_iort_id_mapping *map)
+{
+ u32 map_max = map->input_base + map->id_count;
+
+ /*
+ * The IORT specification revision D (Section 3, table 4, page 9) says
+ * Number of IDs = The number of IDs in the range minus one, but the
+ * IORT code ignored the "minus one", and some firmware did that too,
+ * so apply a workaround here to keep compatible with both the spec
+ * compliant and non-spec compliant firmwares.
+ */
+ if (apply_id_count_workaround)
+ map_max--;
+
+ return map_max;
+}
+
static int iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in,
u32 *rid_out)
{
@@ -322,8 +375,7 @@ static int iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in,
return -ENXIO;
}
- if (rid_in < map->input_base ||
- (rid_in >= map->input_base + map->id_count))
+ if (rid_in < map->input_base || rid_in > iort_get_map_max(map))
return -ENXIO;
*rid_out = map->output_base + (rid_in - map->input_base);
@@ -1542,5 +1594,6 @@ void __init acpi_iort_init(void)
return;
}
+ iort_check_id_count_workaround(iort_table);
iort_init_platform_devices();
}
--
2.20.1
From: Chao Yu <[email protected]>
[ Upstream commit fe396ad8e7526f059f7b8c7290d33a1b84adacab ]
If kobject_init_and_add() failed, caller needs to invoke kobject_put()
to release kobject explicitly.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/f2fs/sysfs.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index b405548202d3b..9a59f49ba4050 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -658,10 +658,12 @@ int __init f2fs_init_sysfs(void)
ret = kobject_init_and_add(&f2fs_feat, &f2fs_feat_ktype,
NULL, "features");
- if (ret)
+ if (ret) {
+ kobject_put(&f2fs_feat);
kset_unregister(&f2fs_kset);
- else
+ } else {
f2fs_proc_root = proc_mkdir("fs/f2fs", NULL);
+ }
return ret;
}
@@ -682,8 +684,11 @@ int f2fs_register_sysfs(struct f2fs_sb_info *sbi)
init_completion(&sbi->s_kobj_unregister);
err = kobject_init_and_add(&sbi->s_kobj, &f2fs_sb_ktype, NULL,
"%s", sb->s_id);
- if (err)
+ if (err) {
+ kobject_put(&sbi->s_kobj);
+ wait_for_completion(&sbi->s_kobj_unregister);
return err;
+ }
if (f2fs_proc_root)
sbi->s_proc = proc_mkdir(sb->s_id, f2fs_proc_root);
--
2.20.1
From: Ard Biesheuvel <[email protected]>
[ Upstream commit 75fbef0a8b6b4bb19b9a91b5214f846c2dc5139e ]
The following commit:
15f003d20782 ("x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()")
modified kernel_map_pages_in_pgd() to manage writable permissions
of memory mappings in the EFI page table in a different way, but
in the process, it removed the ability to clear NX attributes from
read-only mappings, by clobbering the clear mask if _PAGE_RW is not
being requested.
Failure to remove the NX attribute from read-only mappings is
unlikely to be a security issue, but it does prevent us from
tightening the permissions in the EFI page tables going forward,
so let's fix it now.
Fixes: 15f003d20782 ("x86/mm/pat: Don't implicitly allow _PAGE_RW in kernel_map_pages_in_pgd()
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/mm/pageattr.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index e2d4b25c7aa44..101f3ad0d6ad1 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -2126,19 +2126,13 @@ int kernel_map_pages_in_pgd(pgd_t *pgd, u64 pfn, unsigned long address,
.pgd = pgd,
.numpages = numpages,
.mask_set = __pgprot(0),
- .mask_clr = __pgprot(0),
+ .mask_clr = __pgprot(~page_flags & (_PAGE_NX|_PAGE_RW)),
.flags = 0,
};
if (!(__supported_pte_mask & _PAGE_NX))
goto out;
- if (!(page_flags & _PAGE_NX))
- cpa.mask_clr = __pgprot(_PAGE_NX);
-
- if (!(page_flags & _PAGE_RW))
- cpa.mask_clr = __pgprot(_PAGE_RW);
-
if (!(page_flags & _PAGE_ENC))
cpa.mask_clr = pgprot_encrypted(cpa.mask_clr);
--
2.20.1
From: Ben Skeggs <[email protected]>
[ Upstream commit 7adc77aa0e11f25b0e762859219c70852cd8d56f ]
Method init is typically ordered by class in the FW image as ThreeD,
TwoD, Compute.
Due to a bug in parsing the FW into our internal format, we've been
accidentally sending Twod + Compute methods to the ThreeD class, as
well as Compute methods to the TwoD class - oops.
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../gpu/drm/nouveau/nvkm/engine/gr/gk20a.c | 21 ++++++++++---------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c
index 500cb08dd6080..b57ab5cea9a10 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/gk20a.c
@@ -143,23 +143,24 @@ gk20a_gr_av_to_method(struct gf100_gr *gr, const char *fw_name,
nent = (fuc.size / sizeof(struct gk20a_fw_av));
- pack = vzalloc((sizeof(*pack) * max_classes) +
- (sizeof(*init) * (nent + 1)));
+ pack = vzalloc((sizeof(*pack) * (max_classes + 1)) +
+ (sizeof(*init) * (nent + max_classes + 1)));
if (!pack) {
ret = -ENOMEM;
goto end;
}
- init = (void *)(pack + max_classes);
+ init = (void *)(pack + max_classes + 1);
- for (i = 0; i < nent; i++) {
- struct gf100_gr_init *ent = &init[i];
+ for (i = 0; i < nent; i++, init++) {
struct gk20a_fw_av *av = &((struct gk20a_fw_av *)fuc.data)[i];
u32 class = av->addr & 0xffff;
u32 addr = (av->addr & 0xffff0000) >> 14;
if (prevclass != class) {
- pack[classidx].init = ent;
+ if (prevclass) /* Add terminator to the method list. */
+ init++;
+ pack[classidx].init = init;
pack[classidx].type = class;
prevclass = class;
if (++classidx >= max_classes) {
@@ -169,10 +170,10 @@ gk20a_gr_av_to_method(struct gf100_gr *gr, const char *fw_name,
}
}
- ent->addr = addr;
- ent->data = av->data;
- ent->count = 1;
- ent->pitch = 1;
+ init->addr = addr;
+ init->data = av->data;
+ init->count = 1;
+ init->pitch = 1;
}
*ppack = pack;
--
2.20.1
From: Dan Carpenter <[email protected]>
[ Upstream commit ce1f31b4c0b9551dd51874dd5364654ed4ca13ae ]
The "drive->dn" variable is a u8 controlled by root.
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/ide/serverworks.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/ide/serverworks.c b/drivers/ide/serverworks.c
index a97affca18abe..0f57d45484d1d 100644
--- a/drivers/ide/serverworks.c
+++ b/drivers/ide/serverworks.c
@@ -114,6 +114,9 @@ static void svwks_set_pio_mode(ide_hwif_t *hwif, ide_drive_t *drive)
struct pci_dev *dev = to_pci_dev(hwif->dev);
const u8 pio = drive->pio_mode - XFER_PIO_0;
+ if (drive->dn >= ARRAY_SIZE(drive_pci))
+ return;
+
pci_write_config_byte(dev, drive_pci[drive->dn], pio_modes[pio]);
if (svwks_csb_check(dev)) {
@@ -140,6 +143,9 @@ static void svwks_set_dma_mode(ide_hwif_t *hwif, ide_drive_t *drive)
u8 ultra_enable = 0, ultra_timing = 0, dma_timing = 0;
+ if (drive->dn >= ARRAY_SIZE(drive_pci2))
+ return;
+
pci_read_config_byte(dev, (0x56|hwif->channel), &ultra_timing);
pci_read_config_byte(dev, 0x54, &ultra_enable);
--
2.20.1
From: yu kuai <[email protected]>
[ Upstream commit 9871abffc81048e20f02e15d6aa4558a44ad53ea ]
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/pwm/pwm-pca9685.c: In function ‘pca9685_pwm_gpio_free’:
drivers/pwm/pwm-pca9685.c:162:21: warning: variable ‘pwm’ set but not used [-Wunused-but-set-variable]
It is never used, and so can be removed. In that case, hold and release
the lock 'pca->lock' can be removed since nothing will be done between
them.
Fixes: e926b12c611c ("pwm: Clear chip_data in pwm_put()")
Signed-off-by: yu kuai <[email protected]>
Acked-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pwm/pwm-pca9685.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/pwm/pwm-pca9685.c b/drivers/pwm/pwm-pca9685.c
index 567f5e2771c47..e1e5dfcb16f36 100644
--- a/drivers/pwm/pwm-pca9685.c
+++ b/drivers/pwm/pwm-pca9685.c
@@ -170,13 +170,9 @@ static void pca9685_pwm_gpio_set(struct gpio_chip *gpio, unsigned int offset,
static void pca9685_pwm_gpio_free(struct gpio_chip *gpio, unsigned int offset)
{
struct pca9685 *pca = gpiochip_get_data(gpio);
- struct pwm_device *pwm;
pca9685_pwm_gpio_set(gpio, offset, 0);
pm_runtime_put(pca->chip.dev);
- mutex_lock(&pca->lock);
- pwm = &pca->chip.pwms[offset];
- mutex_unlock(&pca->lock);
}
static int pca9685_pwm_gpio_get_direction(struct gpio_chip *chip,
--
2.20.1
From: David Sterba <[email protected]>
[ Upstream commit 4babad10198fa73fe73239d02c2e99e3333f5f5c ]
Dan's smatch tool reports
fs/btrfs/file-item.c:295 btrfs_lookup_bio_sums()
warn: should this be 'count == -1'
which points to the while (count--) loop. With count == 0 the check
itself could decrement it to -1. There's a WARN_ON a few lines below
that has never been seen in practice though.
It turns out that the value of page_bytes_left matches the count (by
sectorsize multiples). The loop never reaches the state where count
would go to -1, because page_bytes_left == 0 is found first and this
breaks out.
For clarity, use only plain check on count (and only for positive
value), decrement safely inside the loop. Any other discrepancy after
the whole bio list processing should be reported by the exising
WARN_ON_ONCE as well.
Reported-by: Dan Carpenter <[email protected]>
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/file-item.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 4cf2817ab1202..f9e280d0b44f3 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -275,7 +275,8 @@ found:
csum += count * csum_size;
nblocks -= count;
next:
- while (count--) {
+ while (count > 0) {
+ count--;
disk_bytenr += fs_info->sectorsize;
offset += fs_info->sectorsize;
page_bytes_left -= fs_info->sectorsize;
--
2.20.1
From: Navid Emamdoost <[email protected]>
[ Upstream commit 40efb09a7f53125719e49864da008495e39aaa1e ]
In vmw_cmdbuf_res_add if drm_ht_insert_item fails the allocated memory
for cres should be released.
Fixes: 18e4a4669c50 ("drm/vmwgfx: Fix compat shader namespace")
Signed-off-by: Navid Emamdoost <[email protected]>
Reviewed-by: Thomas Hellstrom <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c
index 3b75af9bf85f3..f27bd7cff579d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c
@@ -210,8 +210,10 @@ int vmw_cmdbuf_res_add(struct vmw_cmdbuf_res_manager *man,
cres->hash.key = user_key | (res_type << 24);
ret = drm_ht_insert_item(&man->resources, &cres->hash);
- if (unlikely(ret != 0))
+ if (unlikely(ret != 0)) {
+ kfree(cres);
goto out_invalid_key;
+ }
cres->state = VMW_CMDBUF_RES_ADD;
cres->res = vmw_resource_reference(res);
--
2.20.1
From: Brandon Maier <[email protected]>
[ Upstream commit a8f40111d184098cd2b3dc0c7170c42250a5fa09 ]
The remoteproc_core and remoteproc drivers all initialize with module_init().
However remoteproc drivers need the rproc_class during their probe. If one of
the remoteproc drivers runs init and gets through probe before
remoteproc_init() runs, a NULL pointer access of rproc_class's `glue_dirs`
spinlock occurs.
> Unable to handle kernel NULL pointer dereference at virtual address 000000dc
> pgd = c0004000
> [000000dc] *pgd=00000000
> Internal error: Oops: 5 [#1] PREEMPT ARM
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.14.106-rt56 #1
> Hardware name: Generic OMAP36xx (Flattened Device Tree)
> task: c6050000 task.stack: c604a000
> PC is at rt_spin_lock+0x40/0x6c
> LR is at rt_spin_lock+0x28/0x6c
> pc : [<c0523c90>] lr : [<c0523c78>] psr: 60000013
> sp : c604bdc0 ip : 00000000 fp : 00000000
> r10: 00000000 r9 : c61c7c10 r8 : c6269c20
> r7 : c0905888 r6 : c6269c20 r5 : 00000000 r4 : 000000d4
> r3 : 000000dc r2 : c6050000 r1 : 00000002 r0 : 000000d4
> Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
...
> [<c0523c90>] (rt_spin_lock) from [<c03b65a4>] (get_device_parent+0x54/0x17c)
> [<c03b65a4>] (get_device_parent) from [<c03b6bec>] (device_add+0xe0/0x5b4)
> [<c03b6bec>] (device_add) from [<c042adf4>] (rproc_add+0x18/0xd8)
> [<c042adf4>] (rproc_add) from [<c01110e4>] (my_rproc_probe+0x158/0x204)
> [<c01110e4>] (my_rproc_probe) from [<c03bb6b8>] (platform_drv_probe+0x34/0x70)
> [<c03bb6b8>] (platform_drv_probe) from [<c03b9dd4>] (driver_probe_device+0x2c8/0x420)
> [<c03b9dd4>] (driver_probe_device) from [<c03ba02c>] (__driver_attach+0x100/0x11c)
> [<c03ba02c>] (__driver_attach) from [<c03b7d08>] (bus_for_each_dev+0x7c/0xc0)
> [<c03b7d08>] (bus_for_each_dev) from [<c03b910c>] (bus_add_driver+0x1cc/0x264)
> [<c03b910c>] (bus_add_driver) from [<c03ba714>] (driver_register+0x78/0xf8)
> [<c03ba714>] (driver_register) from [<c010181c>] (do_one_initcall+0x100/0x190)
> [<c010181c>] (do_one_initcall) from [<c0800de8>] (kernel_init_freeable+0x130/0x1d0)
> [<c0800de8>] (kernel_init_freeable) from [<c051eee8>] (kernel_init+0x8/0x114)
> [<c051eee8>] (kernel_init) from [<c01175b0>] (ret_from_fork+0x14/0x24)
> Code: e2843008 e3c2203f f5d3f000 e5922010 (e193cf9f)
> ---[ end trace 0000000000000002 ]---
Signed-off-by: Brandon Maier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/remoteproc/remoteproc_core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
index aa6206706fe33..abbef17c97ee2 100644
--- a/drivers/remoteproc/remoteproc_core.c
+++ b/drivers/remoteproc/remoteproc_core.c
@@ -1786,7 +1786,7 @@ static int __init remoteproc_init(void)
return 0;
}
-module_init(remoteproc_init);
+subsys_initcall(remoteproc_init);
static void __exit remoteproc_exit(void)
{
--
2.20.1
From: John Garry <[email protected]>
[ Upstream commit d6152e6ec9e2171280436f7b31a571509b9287e1 ]
The following crash can be seen for setting
CONFIG_DEBUG_TEST_DRIVER_REMOVE=y for DT FW (which some people still use):
Hisilicon MBIGEN-V2 60080000.interrupt-controller: Failed to create mbi-gen irqdomain
Hisilicon MBIGEN-V2: probe of 60080000.interrupt-controller failed with error -12
[...]
Unable to handle kernel paging request at virtual address 0000000000005008
Mem abort info:
ESR = 0x96000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000041fb9990000
[0000000000005008] pgd=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc6-00002-g3fc42638a506-dirty #1622
Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
pstate: 40000085 (nZcv daIf -PAN -UAO)
pc : mbigen_set_type+0x38/0x60
lr : __irq_set_trigger+0x6c/0x188
sp : ffff800014b4b400
x29: ffff800014b4b400 x28: 0000000000000007
x27: 0000000000000000 x26: 0000000000000000
x25: ffff041fd83bd0d4 x24: ffff041fd83bd188
x23: 0000000000000000 x22: ffff80001193ce00
x21: 0000000000000004 x20: 0000000000000000
x19: ffff041fd83bd000 x18: ffffffffffffffff
x17: 0000000000000000 x16: 0000000000000000
x15: ffff8000119098c8 x14: ffff041fb94ec91c
x13: ffff041fb94ec1a1 x12: 0000000000000030
x11: 0101010101010101 x10: 0000000000000040
x9 : 0000000000000000 x8 : ffff041fb98c6680
x7 : ffff800014b4b380 x6 : ffff041fd81636c8
x5 : 0000000000000000 x4 : 000000000000025f
x3 : 0000000000005000 x2 : 0000000000005008
x1 : 0000000000000004 x0 : 0000000080000000
Call trace:
mbigen_set_type+0x38/0x60
__setup_irq+0x744/0x900
request_threaded_irq+0xe0/0x198
pcie_pme_probe+0x98/0x118
pcie_port_probe_service+0x38/0x78
really_probe+0xa0/0x3e0
driver_probe_device+0x58/0x100
__device_attach_driver+0x90/0xb0
bus_for_each_drv+0x64/0xc8
__device_attach+0xd8/0x138
device_initial_probe+0x10/0x18
bus_probe_device+0x90/0x98
device_add+0x4c4/0x770
device_register+0x1c/0x28
pcie_port_device_register+0x1e4/0x4f0
pcie_portdrv_probe+0x34/0xd8
local_pci_probe+0x3c/0xa0
pci_device_probe+0x128/0x1c0
really_probe+0xa0/0x3e0
driver_probe_device+0x58/0x100
__device_attach_driver+0x90/0xb0
bus_for_each_drv+0x64/0xc8
__device_attach+0xd8/0x138
device_attach+0x10/0x18
pci_bus_add_device+0x4c/0xb8
pci_bus_add_devices+0x38/0x88
pci_host_probe+0x3c/0xc0
pci_host_common_probe+0xf0/0x208
hisi_pcie_almost_ecam_probe+0x24/0x30
platform_drv_probe+0x50/0xa0
really_probe+0xa0/0x3e0
driver_probe_device+0x58/0x100
device_driver_attach+0x6c/0x90
__driver_attach+0x84/0xc8
bus_for_each_dev+0x74/0xc8
driver_attach+0x20/0x28
bus_add_driver+0x148/0x1f0
driver_register+0x60/0x110
__platform_driver_register+0x40/0x48
hisi_pcie_almost_ecam_driver_init+0x1c/0x24
The specific problem here is that the mbigen driver real probe has failed
as the mbigen_of_create_domain()->of_platform_device_create() call fails,
the reason for that being that we never destroyed the platform device
created during the remove test dry run and there is some conflict.
Since we generally would never want to unbind this driver, and to save
adding a driver tear down path for that, just set the driver
.suppress_bind_attrs member to avoid this possibility.
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Hanjun Guo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/irqchip/irq-mbigen.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/irqchip/irq-mbigen.c b/drivers/irqchip/irq-mbigen.c
index 98b6e1d4b1a68..f7fdbf5d183b9 100644
--- a/drivers/irqchip/irq-mbigen.c
+++ b/drivers/irqchip/irq-mbigen.c
@@ -381,6 +381,7 @@ static struct platform_driver mbigen_platform_driver = {
.name = "Hisilicon MBIGEN-V2",
.of_match_table = mbigen_of_match,
.acpi_match_table = ACPI_PTR(mbigen_acpi_match),
+ .suppress_bind_attrs = true,
},
.probe = mbigen_device_probe,
};
--
2.20.1
From: Valdis Klētnieks <[email protected]>
[ Upstream commit bff47c2302cc249bcd550b17067f8dddbd4b6f77 ]
When building with C=1, sparse issues a warning:
CHECK arch/x86/entry/vdso/vdso32-setup.c
arch/x86/entry/vdso/vdso32-setup.c:28:28: warning: symbol 'vdso32_enabled' was not declared. Should it be static?
Provide the missing header file.
Signed-off-by: Valdis Kletnieks <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/36224.1575599767@turing-police
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/entry/vdso/vdso32-setup.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/entry/vdso/vdso32-setup.c b/arch/x86/entry/vdso/vdso32-setup.c
index 42d4c89f990ed..ddff0ca6f5098 100644
--- a/arch/x86/entry/vdso/vdso32-setup.c
+++ b/arch/x86/entry/vdso/vdso32-setup.c
@@ -11,6 +11,7 @@
#include <linux/smp.h>
#include <linux/kernel.h>
#include <linux/mm_types.h>
+#include <linux/elf.h>
#include <asm/processor.h>
#include <asm/vdso.h>
--
2.20.1
From: Vasily Gorbik <[email protected]>
[ Upstream commit 253b3c4b2920e07ce9e2b18800b9b65245e2fafa ]
clang 10 introduces -mpacked-stack compiler option implementation. At the
same time currently it does not support a combination of -mpacked-stack
and -mbackchain. This leads to the following build error:
clang: error: unsupported option '-mpacked-stack with -mbackchain' for
target 's390x-ibm-linux'
If/when clang adds support for a combination of -mpacked-stack and
-mbackchain it would also require -msoft-float (like gcc does). According
to Ulrich Weigand "stack slot assigned to the kernel backchain overlaps
the stack slot assigned to the FPR varargs (both are required to be
placed immediately after the saved r15 slot if present)."
Extend -mpacked-stack compiler option support check to include all 3
options -mpacked-stack -mbackchain -msoft-float which must present to
support -mpacked-stack with -mbackchain.
Acked-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index e6c2e8925fefa..4bccde36cb161 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -63,7 +63,7 @@ cflags-y += -Wa,-I$(srctree)/arch/$(ARCH)/include
#
cflags-$(CONFIG_FRAME_POINTER) += -fno-optimize-sibling-calls
-ifeq ($(call cc-option-yn,-mpacked-stack),y)
+ifeq ($(call cc-option-yn,-mpacked-stack -mbackchain -msoft-float),y)
cflags-$(CONFIG_PACK_STACK) += -mpacked-stack -D__PACK_STACK
aflags-$(CONFIG_PACK_STACK) += -D__PACK_STACK
endif
--
2.20.1
From: Icenowy Zheng <[email protected]>
[ Upstream commit ec97faff743b398e21f74a54c81333f3390093aa ]
The A64 PLL_CPU clock has the same instability if some factor changed
without the PLL gated like other SoCs with sun6i-style CCU, e.g. A33,
H3.
Add the mux and pll notifiers for A64 CPU clock to workaround the
problem.
Fixes: c6a0637460c2 ("clk: sunxi-ng: Add A64 clocks")
Signed-off-by: Icenowy Zheng <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 28 ++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
index dec4a130390a3..9ac6c299e0744 100644
--- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
+++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
@@ -901,11 +901,26 @@ static const struct sunxi_ccu_desc sun50i_a64_ccu_desc = {
.num_resets = ARRAY_SIZE(sun50i_a64_ccu_resets),
};
+static struct ccu_pll_nb sun50i_a64_pll_cpu_nb = {
+ .common = &pll_cpux_clk.common,
+ /* copy from pll_cpux_clk */
+ .enable = BIT(31),
+ .lock = BIT(28),
+};
+
+static struct ccu_mux_nb sun50i_a64_cpu_nb = {
+ .common = &cpux_clk.common,
+ .cm = &cpux_clk.mux,
+ .delay_us = 1, /* > 8 clock cycles at 24 MHz */
+ .bypass_index = 1, /* index of 24 MHz oscillator */
+};
+
static int sun50i_a64_ccu_probe(struct platform_device *pdev)
{
struct resource *res;
void __iomem *reg;
u32 val;
+ int ret;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
reg = devm_ioremap_resource(&pdev->dev, res);
@@ -919,7 +934,18 @@ static int sun50i_a64_ccu_probe(struct platform_device *pdev)
writel(0x515, reg + SUN50I_A64_PLL_MIPI_REG);
- return sunxi_ccu_probe(pdev->dev.of_node, reg, &sun50i_a64_ccu_desc);
+ ret = sunxi_ccu_probe(pdev->dev.of_node, reg, &sun50i_a64_ccu_desc);
+ if (ret)
+ return ret;
+
+ /* Gate then ungate PLL CPU after any rate changes */
+ ccu_pll_notifier_register(&sun50i_a64_pll_cpu_nb);
+
+ /* Reparent CPU during PLL CPU rate changes */
+ ccu_mux_notifier_register(pll_cpux_clk.common.hw.clk,
+ &sun50i_a64_cpu_nb);
+
+ return 0;
}
static const struct of_device_id sun50i_a64_ccu_ids[] = {
--
2.20.1
From: Jaihind Yadav <[email protected]>
[ Upstream commit 030b995ad9ece9fa2d218af4429c1c78c2342096 ]
In AVC update we don't call avc_node_kill() when avc_xperms_populate()
fails, resulting in the avc->avc_cache.active_nodes counter having a
false value. In last patch this changes was missed , so correcting it.
Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls")
Signed-off-by: Jaihind Yadav <[email protected]>
Signed-off-by: Ravi Kumar Siddojigari <[email protected]>
[PM: merge fuzz, minor description cleanup]
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
security/selinux/avc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 83eef39c8a799..d52be7b9f08c8 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -896,7 +896,7 @@ static int avc_update_node(struct selinux_avc *avc,
if (orig->ae.xp_node) {
rc = avc_xperms_populate(node, orig->ae.xp_node);
if (rc) {
- kmem_cache_free(avc_node_cachep, node);
+ avc_node_kill(avc, node);
goto out_unlock;
}
}
--
2.20.1
From: Tony Lindgren <[email protected]>
[ Upstream commit 91b6dec32e5c25fbdbb564d1e5af23764ec17ef1 ]
We currently have musb_set_vbus() called from two different paths. Mostly
it gets called from the USB PHY via omap_musb_set_mailbox(), but in some
cases it can get also called from musb_stage0_irq() rather via .set_vbus:
(musb_set_host [musb_hdrc])
(omap2430_musb_set_vbus [omap2430])
(musb_stage0_irq [musb_hdrc])
(musb_interrupt [musb_hdrc])
(omap2430_musb_interrupt [omap2430])
This is racy and will not work with introducing generic helper functions
for musb_set_host() and musb_set_peripheral(). We want to get rid of the
busy loops in favor of usleep_range().
Let's just get rid of .set_vbus for omap2430 glue layer and let the PHY
code handle VBUS with musb_set_vbus(). Note that in the follow-up patch
we can completely remove omap2430_musb_set_vbus(), but let's do it in a
separate patch as this change may actually turn out to be needed as a
fix.
Reported-by: Pavel Machek <[email protected]>
Acked-by: Pavel Machek <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: Bin Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/musb/omap2430.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/usb/musb/omap2430.c b/drivers/usb/musb/omap2430.c
index b1dd81fb5f55e..24e622c056383 100644
--- a/drivers/usb/musb/omap2430.c
+++ b/drivers/usb/musb/omap2430.c
@@ -361,8 +361,6 @@ static const struct musb_platform_ops omap2430_ops = {
.init = omap2430_musb_init,
.exit = omap2430_musb_exit,
- .set_vbus = omap2430_musb_set_vbus,
-
.enable = omap2430_musb_enable,
.disable = omap2430_musb_disable,
--
2.20.1
From: Thomas Gleixner <[email protected]>
[ Upstream commit 11e31f608b499f044f24b20be73f1dcab3e43f8a ]
Robert reported that during boot the watchdog timestamp is set to 0 for one
second which is the indicator for a watchdog reset.
The reason for this is that the timestamp is in seconds and the time is
taken from sched clock and divided by ~1e9. sched clock starts at 0 which
means that for the first second during boot the watchdog timestamp is 0,
i.e. reset.
Use ULONG_MAX as the reset indicator value so the watchdog works correctly
right from the start. ULONG_MAX would only conflict with a real timestamp
if the system reaches an uptime of 136 years on 32bit and almost eternity
on 64bit.
Reported-by: Robert Richter <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/watchdog.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index bbc4940f21af6..6d60701dc6361 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -161,6 +161,8 @@ static void lockup_detector_update_enable(void)
#ifdef CONFIG_SOFTLOCKUP_DETECTOR
+#define SOFTLOCKUP_RESET ULONG_MAX
+
/* Global variables, exported for sysctl */
unsigned int __read_mostly softlockup_panic =
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE;
@@ -267,7 +269,7 @@ notrace void touch_softlockup_watchdog_sched(void)
* Preemption can be enabled. It doesn't matter which CPU's timestamp
* gets zeroed here, so use the raw_ operation.
*/
- raw_cpu_write(watchdog_touch_ts, 0);
+ raw_cpu_write(watchdog_touch_ts, SOFTLOCKUP_RESET);
}
notrace void touch_softlockup_watchdog(void)
@@ -291,14 +293,14 @@ void touch_all_softlockup_watchdogs(void)
* the softlockup check.
*/
for_each_cpu(cpu, &watchdog_allowed_mask)
- per_cpu(watchdog_touch_ts, cpu) = 0;
+ per_cpu(watchdog_touch_ts, cpu) = SOFTLOCKUP_RESET;
wq_watchdog_touch(-1);
}
void touch_softlockup_watchdog_sync(void)
{
__this_cpu_write(softlockup_touch_sync, true);
- __this_cpu_write(watchdog_touch_ts, 0);
+ __this_cpu_write(watchdog_touch_ts, SOFTLOCKUP_RESET);
}
static int is_softlockup(unsigned long touch_ts)
@@ -376,7 +378,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
/* .. and repeat */
hrtimer_forward_now(hrtimer, ns_to_ktime(sample_period));
- if (touch_ts == 0) {
+ if (touch_ts == SOFTLOCKUP_RESET) {
if (unlikely(__this_cpu_read(softlockup_touch_sync))) {
/*
* If the time stamp was touched atomically
--
2.20.1
From: Oliver O'Halloran <[email protected]>
[ Upstream commit 1fb4124ca9d456656a324f1ee29b7bf942f59ac8 ]
When disabling virtual functions on an SR-IOV adapter we currently do not
correctly remove the EEH state for the now-dead virtual functions. When
removing the pci_dn that was created for the VF when SR-IOV was enabled
we free the corresponding eeh_dev without removing it from the child device
list of the eeh_pe that contained it. This can result in crashes due to the
use-after-free.
Signed-off-by: Oliver O'Halloran <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Tested-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/powerpc/kernel/pci_dn.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index ab147a1909c8b..7cecc3bd953b7 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -257,9 +257,22 @@ void remove_dev_pci_data(struct pci_dev *pdev)
continue;
#ifdef CONFIG_EEH
- /* Release EEH device for the VF */
+ /*
+ * Release EEH state for this VF. The PCI core
+ * has already torn down the pci_dev for this VF, but
+ * we're responsible to removing the eeh_dev since it
+ * has the same lifetime as the pci_dn that spawned it.
+ */
edev = pdn_to_eeh_dev(pdn);
if (edev) {
+ /*
+ * We allocate pci_dn's for the totalvfs count,
+ * but only only the vfs that were activated
+ * have a configured PE.
+ */
+ if (edev->pe)
+ eeh_rmv_from_parent_pe(edev);
+
pdn->edev = NULL;
kfree(edev);
}
--
2.20.1
From: Uwe Kleine-König <[email protected]>
[ Upstream commit 43efdc8f0e6d7088ec61bd55a73bf853f002d043 ]
In the old code (e.g.) mutex_destroy() was called before
pwmchip_remove(). Between these two calls it is possible that a PWM
callback is used which tries to grab the mutex.
Fixes: 6604c6556db9 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pwm/pwm-omap-dmtimer.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/pwm/pwm-omap-dmtimer.c b/drivers/pwm/pwm-omap-dmtimer.c
index 2d585c101d52e..c6e710a713d31 100644
--- a/drivers/pwm/pwm-omap-dmtimer.c
+++ b/drivers/pwm/pwm-omap-dmtimer.c
@@ -364,6 +364,11 @@ put:
static int pwm_omap_dmtimer_remove(struct platform_device *pdev)
{
struct pwm_omap_dmtimer_chip *omap = platform_get_drvdata(pdev);
+ int ret;
+
+ ret = pwmchip_remove(&omap->chip);
+ if (ret)
+ return ret;
if (pm_runtime_active(&omap->dm_timer_pdev->dev))
omap->pdata->stop(omap->dm_timer);
@@ -372,7 +377,7 @@ static int pwm_omap_dmtimer_remove(struct platform_device *pdev)
mutex_destroy(&omap->mutex);
- return pwmchip_remove(&omap->chip);
+ return 0;
}
static const struct of_device_id pwm_omap_dmtimer_of_match[] = {
--
2.20.1
From: Liang Chen <[email protected]>
[ Upstream commit e8547d42095e58bee658f00fef8e33d2a185c927 ]
Same as cache device, the buffer page needs to be put while
freeing cached_dev. Otherwise a page would be leaked every
time a cached_dev is stopped.
Signed-off-by: Liang Chen <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Coly Li <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/md/bcache/super.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index c45d9ad010770..5b5cbfadd003d 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1226,6 +1226,9 @@ static void cached_dev_free(struct closure *cl)
mutex_unlock(&bch_register_lock);
+ if (dc->sb_bio.bi_inline_vecs[0].bv_page)
+ put_page(bio_first_page_all(&dc->sb_bio));
+
if (!IS_ERR_OR_NULL(dc->bdev))
blkdev_put(dc->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
--
2.20.1
From: Lu Baolu <[email protected]>
[ Upstream commit 857f081426e5aa38313426c13373730f1345fe95 ]
Address field in device TLB invalidation descriptor is qualified
by the S field. If S field is zero, a single page at page address
specified by address [63:12] is requested to be invalidated. If S
field is set, the least significant bit in the address field with
value 0b (say bit N) indicates the invalidation address range. The
spec doesn't require the address [N - 1, 0] to be cleared, hence
remove the unnecessary WARN_ON_ONCE().
Otherwise, the caller might set "mask = MAX_AGAW_PFN_WIDTH" in order
to invalidating all the cached mappings on an endpoint, and below
overflow error will be triggered.
[...]
UBSAN: Undefined behaviour in drivers/iommu/dmar.c:1354:3
shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
Reported-and-tested-by: Frank <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/dmar.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 7f9824b0609e7..72994d67bc5b9 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -1345,7 +1345,6 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
struct qi_desc desc;
if (mask) {
- WARN_ON_ONCE(addr & ((1ULL << (VTD_PAGE_SHIFT + mask)) - 1));
addr |= (1ULL << (VTD_PAGE_SHIFT + mask - 1)) - 1;
desc.high = QI_DEV_IOTLB_ADDR(addr) | QI_DEV_IOTLB_SIZE;
} else
--
2.20.1
From: Dan Carpenter <[email protected]>
[ Upstream commit 117fcc3053606d8db5cef8821dca15022ae578bb ]
The "drive->dn" value is a u8 and it is controlled by root only, but
it could be out of bounds here so let's check.
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/ide/cmd64x.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/ide/cmd64x.c b/drivers/ide/cmd64x.c
index b127ed60c7336..9dde8390da09b 100644
--- a/drivers/ide/cmd64x.c
+++ b/drivers/ide/cmd64x.c
@@ -65,6 +65,9 @@ static void cmd64x_program_timings(ide_drive_t *drive, u8 mode)
struct ide_timing t;
u8 arttim = 0;
+ if (drive->dn >= ARRAY_SIZE(drwtim_regs))
+ return;
+
ide_timing_compute(drive, mode, &t, T, 0);
/*
--
2.20.1
From: Lorenz Bauer <[email protected]>
[ Upstream commit 51bad0f05616c43d6d34b0a19bcc9bdab8e8fb39 ]
Currently, there is a lot of false positives if a single reuseport test
fails. This is because expected_results and the result map are not cleared.
Zero both after individual test runs, which fixes the mentioned false
positives.
Fixes: 91134d849a0e ("bpf: Test BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Lorenz Bauer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
.../selftests/bpf/test_select_reuseport.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_select_reuseport.c b/tools/testing/selftests/bpf/test_select_reuseport.c
index 75646d9b34aaa..cdbbdab2725fc 100644
--- a/tools/testing/selftests/bpf/test_select_reuseport.c
+++ b/tools/testing/selftests/bpf/test_select_reuseport.c
@@ -30,7 +30,7 @@
#define REUSEPORT_ARRAY_SIZE 32
static int result_map, tmp_index_ovr_map, linum_map, data_check_map;
-static enum result expected_results[NR_RESULTS];
+static __u32 expected_results[NR_RESULTS];
static int sk_fds[REUSEPORT_ARRAY_SIZE];
static int reuseport_array, outer_map;
static int select_by_skb_data_prog;
@@ -610,7 +610,19 @@ static void setup_per_test(int type, unsigned short family, bool inany)
static void cleanup_per_test(void)
{
- int i, err;
+ int i, err, zero = 0;
+
+ memset(expected_results, 0, sizeof(expected_results));
+
+ for (i = 0; i < NR_RESULTS; i++) {
+ err = bpf_map_update_elem(result_map, &i, &zero, BPF_ANY);
+ RET_IF(err, "reset elem in result_map",
+ "i:%u err:%d errno:%d\n", i, err, errno);
+ }
+
+ err = bpf_map_update_elem(linum_map, &zero, &zero, BPF_ANY);
+ RET_IF(err, "reset line number in linum_map", "err:%d errno:%d\n",
+ err, errno);
for (i = 0; i < REUSEPORT_ARRAY_SIZE; i++)
close(sk_fds[i]);
--
2.20.1
From: Colin Ian King <[email protected]>
[ Upstream commit 0707cfa5c3ef58effb143db9db6d6e20503f9dec ]
Currently the check that a u32 variable i is >= 0 is always true because
the unsigned variable will never be negative, causing the loop to run
forever. Fix this by changing the pre-decrement check to a zero check on
i followed by a decrement of i.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 39cc539f90d0 ("driver core: platform: Prevent resouce overflow from causing infinite loops")
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/base/platform.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 1d3a50ac21664..d1f901b58f755 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -427,7 +427,7 @@ int platform_device_add(struct platform_device *pdev)
pdev->id = PLATFORM_DEVID_AUTO;
}
- while (--i >= 0) {
+ while (i--) {
struct resource *r = &pdev->resource[i];
if (r->parent)
release_resource(r);
--
2.20.1
From: zhangyi (F) <[email protected]>
[ Upstream commit 0e98c084a21177ef136149c6a293b3d1eb33ff92 ]
Commit fb7c02445c49 ("ext4: pass -ESHUTDOWN code to jbd2 layer") want
to allow jbd2 layer to distinguish shutdown journal abort from other
error cases. So the ESHUTDOWN should be taken precedence over any other
errno which has already been recoded after EXT4_FLAGS_SHUTDOWN is set,
but it only update errno in the journal suoerblock now if the old errno
is 0.
Fixes: fb7c02445c49 ("ext4: pass -ESHUTDOWN code to jbd2 layer")
Signed-off-by: zhangyi (F) <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/jbd2/journal.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 1a96287f92647..a15a22d209090 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -2133,8 +2133,7 @@ static void __journal_abort_soft (journal_t *journal, int errno)
if (journal->j_flags & JBD2_ABORT) {
write_unlock(&journal->j_state_lock);
- if (!old_errno && old_errno != -ESHUTDOWN &&
- errno == -ESHUTDOWN)
+ if (old_errno != -ESHUTDOWN && errno == -ESHUTDOWN)
jbd2_journal_update_sb_errno(journal);
return;
}
--
2.20.1
From: Jaegeuk Kim <[email protected]>
[ Upstream commit 5b1dbb082f196278f82b6a15a13848efacb9ff11 ]
This patch moves setting I_LINKABLE early in rename2(whiteout) to avoid the
below warning.
[ 3189.163385] WARNING: CPU: 3 PID: 59523 at fs/inode.c:358 inc_nlink+0x32/0x40
[ 3189.246979] Call Trace:
[ 3189.248707] f2fs_init_inode_metadata+0x2d6/0x440 [f2fs]
[ 3189.251399] f2fs_add_inline_entry+0x162/0x8c0 [f2fs]
[ 3189.254010] f2fs_add_dentry+0x69/0xe0 [f2fs]
[ 3189.256353] f2fs_do_add_link+0xc5/0x100 [f2fs]
[ 3189.258774] f2fs_rename2+0xabf/0x1010 [f2fs]
[ 3189.261079] vfs_rename+0x3f8/0xaa0
[ 3189.263056] ? tomoyo_path_rename+0x44/0x60
[ 3189.265283] ? do_renameat2+0x49b/0x550
[ 3189.267324] do_renameat2+0x49b/0x550
[ 3189.269316] __x64_sys_renameat2+0x20/0x30
[ 3189.271441] do_syscall_64+0x5a/0x230
[ 3189.273410] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3189.275848] RIP: 0033:0x7f270b4d9a49
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/f2fs/namei.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 0ace2c2e3de93..4f0cc0c79d1ee 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -769,6 +769,7 @@ static int __f2fs_tmpfile(struct inode *dir, struct dentry *dentry,
if (whiteout) {
f2fs_i_links_write(inode, false);
+ inode->i_state |= I_LINKABLE;
*whiteout = inode;
} else {
d_tmpfile(dentry, inode);
@@ -835,6 +836,12 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
F2FS_I(old_dentry->d_inode)->i_projid)))
return -EXDEV;
+ if (flags & RENAME_WHITEOUT) {
+ err = f2fs_create_whiteout(old_dir, &whiteout);
+ if (err)
+ return err;
+ }
+
err = dquot_initialize(old_dir);
if (err)
goto out;
@@ -865,17 +872,11 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
}
}
- if (flags & RENAME_WHITEOUT) {
- err = f2fs_create_whiteout(old_dir, &whiteout);
- if (err)
- goto out_dir;
- }
-
if (new_inode) {
err = -ENOTEMPTY;
if (old_dir_entry && !f2fs_empty_dir(new_inode))
- goto out_whiteout;
+ goto out_dir;
err = -ENOENT;
new_entry = f2fs_find_entry(new_dir, &new_dentry->d_name,
@@ -883,7 +884,7 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (!new_entry) {
if (IS_ERR(new_page))
err = PTR_ERR(new_page);
- goto out_whiteout;
+ goto out_dir;
}
f2fs_balance_fs(sbi, true);
@@ -915,7 +916,7 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
err = f2fs_add_link(new_dentry, old_inode);
if (err) {
f2fs_unlock_op(sbi);
- goto out_whiteout;
+ goto out_dir;
}
if (old_dir_entry)
@@ -939,7 +940,7 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
if (IS_ERR(old_page))
err = PTR_ERR(old_page);
f2fs_unlock_op(sbi);
- goto out_whiteout;
+ goto out_dir;
}
}
}
@@ -958,7 +959,6 @@ static int f2fs_rename(struct inode *old_dir, struct dentry *old_dentry,
f2fs_delete_entry(old_entry, old_page, old_dir, NULL);
if (whiteout) {
- whiteout->i_state |= I_LINKABLE;
set_inode_flag(whiteout, FI_INC_LINK);
err = f2fs_add_link(old_dentry, whiteout);
if (err)
@@ -992,15 +992,14 @@ put_out_dir:
f2fs_unlock_op(sbi);
if (new_page)
f2fs_put_page(new_page, 0);
-out_whiteout:
- if (whiteout)
- iput(whiteout);
out_dir:
if (old_dir_entry)
f2fs_put_page(old_dir_page, 0);
out_old:
f2fs_put_page(old_page, 0);
out:
+ if (whiteout)
+ iput(whiteout);
return err;
}
--
2.20.1
From: Jaegeuk Kim <[email protected]>
[ Upstream commit 820d366736c949ffe698d3b3fe1266a91da1766d ]
Detected kmemleak.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/f2fs/sysfs.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 98887187af4cd..b405548202d3b 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -711,4 +711,5 @@ void f2fs_unregister_sysfs(struct f2fs_sb_info *sbi)
remove_proc_entry(sbi->sb->s_id, f2fs_proc_root);
}
kobject_del(&sbi->s_kobj);
+ kobject_put(&sbi->s_kobj);
}
--
2.20.1
From: Masahiro Yamada <[email protected]>
[ Upstream commit 3bed1b7b9d79ca40e41e3af130931a3225e951a3 ]
Currently, -E (stop after the preprocessing stage) is used to check
whether the given compiler flag is supported.
While it is faster than -S (or -c), it can be false-positive. You need
to run the compilation proper to check the flag more precisely.
For example, -E and -S disagree about the support of
"--param asan-instrument-allocas=1".
$ gcc -Werror --param asan-instrument-allocas=1 -E -x c /dev/null -o /dev/null
$ echo $?
0
$ gcc -Werror --param asan-instrument-allocas=1 -S -x c /dev/null -o /dev/null
cc1: error: invalid --param name ‘asan-instrument-allocas’; did you mean ‘asan-instrument-writes’?
$ echo $?
1
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
scripts/Kconfig.include | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include
index 3b2861f47709b..79455ad6b3863 100644
--- a/scripts/Kconfig.include
+++ b/scripts/Kconfig.include
@@ -20,7 +20,7 @@ success = $(if-success,$(1),y,n)
# $(cc-option,<flag>)
# Return y if the compiler supports <flag>, n otherwise
-cc-option = $(success,$(CC) -Werror $(CLANG_FLAGS) $(1) -E -x c /dev/null -o /dev/null)
+cc-option = $(success,$(CC) -Werror $(CLANG_FLAGS) $(1) -S -x c /dev/null -o /dev/null)
# $(ld-option,<flag>)
# Return y if the linker supports <flag>, n otherwise
--
2.20.1
From: Nick Black <[email protected]>
[ Upstream commit 54155ed4199c7aa3fd20866648024ab63c96d579 ]
A faulty userspace that calls destroy_session() before destroying the
connections can trigger the failure. This patch prevents the issue by
refusing to destroy the session if there are outstanding connections.
------------[ cut here ]------------
kernel BUG at mm/slub.c:306!
invalid opcode: 0000 [#1] SMP PTI
CPU: 1 PID: 1224 Comm: iscsid Not tainted 5.4.0-rc2.iscsi+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__slab_free+0x181/0x350
[...]
[ 1209.686056] RSP: 0018:ffffa93d4074fae0 EFLAGS: 00010246
[ 1209.686694] RAX: ffff934efa5ad800 RBX: 000000008010000a RCX: ffff934efa5ad800
[ 1209.687651] RDX: ffff934efa5ad800 RSI: ffffeb4041e96b00 RDI: ffff934efd402c40
[ 1209.688582] RBP: ffffa93d4074fb80 R08: 0000000000000001 R09: ffffffffbb5dfa26
[ 1209.689425] R10: ffff934efa5ad800 R11: 0000000000000001 R12: ffffeb4041e96b00
[ 1209.690285] R13: ffff934efa5ad800 R14: ffff934efd402c40 R15: 0000000000000000
[ 1209.691213] FS: 00007f7945dfb540(0000) GS:ffff934efda80000(0000) knlGS:0000000000000000
[ 1209.692316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1209.693013] CR2: 000055877fd3da80 CR3: 0000000077384000 CR4: 00000000000006e0
[ 1209.693897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1209.694773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1209.695631] Call Trace:
[ 1209.695957] ? __wake_up_common_lock+0x8a/0xc0
[ 1209.696712] iscsi_pool_free+0x26/0x40
[ 1209.697263] iscsi_session_teardown+0x2f/0xf0
[ 1209.698117] iscsi_sw_tcp_session_destroy+0x45/0x60
[ 1209.698831] iscsi_if_rx+0xd88/0x14e0
[ 1209.699370] netlink_unicast+0x16f/0x200
[ 1209.699932] netlink_sendmsg+0x21a/0x3e0
[ 1209.700446] sock_sendmsg+0x4f/0x60
[ 1209.700902] ___sys_sendmsg+0x2ae/0x320
[ 1209.701451] ? cp_new_stat+0x150/0x180
[ 1209.701922] __sys_sendmsg+0x59/0xa0
[ 1209.702357] do_syscall_64+0x52/0x160
[ 1209.702812] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1209.703419] RIP: 0033:0x7f7946433914
[...]
[ 1209.706084] RSP: 002b:00007fffb99f2378 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1209.706994] RAX: ffffffffffffffda RBX: 000055bc869eac20 RCX: 00007f7946433914
[ 1209.708082] RDX: 0000000000000000 RSI: 00007fffb99f2390 RDI: 0000000000000005
[ 1209.709120] RBP: 00007fffb99f2390 R08: 000055bc84fe9320 R09: 00007fffb99f1f07
[ 1209.710110] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000038
[ 1209.711085] R13: 000055bc8502306e R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace a2d933ede7f730d8 ]---
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Nick Black <[email protected]>
Co-developed-by: Salman Qazi <[email protected]>
Signed-off-by: Salman Qazi <[email protected]>
Co-developed-by: Junho Ryu <[email protected]>
Signed-off-by: Junho Ryu <[email protected]>
Co-developed-by: Khazhismel Kumykov <[email protected]>
Signed-off-by: Khazhismel Kumykov <[email protected]>
Co-developed-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Reviewed-by: Lee Duncan <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/iscsi_tcp.c | 4 ++++
drivers/scsi/scsi_transport_iscsi.c | 26 +++++++++++++++++++++++---
2 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 55181d28291e7..7212e3a13fe6b 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -892,6 +892,10 @@ free_host:
static void iscsi_sw_tcp_session_destroy(struct iscsi_cls_session *cls_session)
{
struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
+ struct iscsi_session *session = cls_session->dd_data;
+
+ if (WARN_ON_ONCE(session->leadconn))
+ return;
iscsi_tcp_r2tpool_free(cls_session->dd_data);
iscsi_session_teardown(cls_session);
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 4c4781e5974f0..c0fb9e7890807 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2945,6 +2945,24 @@ iscsi_set_path(struct iscsi_transport *transport, struct iscsi_uevent *ev)
return err;
}
+static int iscsi_session_has_conns(int sid)
+{
+ struct iscsi_cls_conn *conn;
+ unsigned long flags;
+ int found = 0;
+
+ spin_lock_irqsave(&connlock, flags);
+ list_for_each_entry(conn, &connlist, conn_list) {
+ if (iscsi_conn_get_sid(conn) == sid) {
+ found = 1;
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&connlock, flags);
+
+ return found;
+}
+
static int
iscsi_set_iface_params(struct iscsi_transport *transport,
struct iscsi_uevent *ev, uint32_t len)
@@ -3522,10 +3540,12 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
break;
case ISCSI_UEVENT_DESTROY_SESSION:
session = iscsi_session_lookup(ev->u.d_session.sid);
- if (session)
- transport->destroy_session(session);
- else
+ if (!session)
err = -EINVAL;
+ else if (iscsi_session_has_conns(ev->u.d_session.sid))
+ err = -EBUSY;
+ else
+ transport->destroy_session(session);
break;
case ISCSI_UEVENT_UNBIND_SESSION:
session = iscsi_session_lookup(ev->u.d_session.sid);
--
2.20.1
From: Sami Tolvanen <[email protected]>
[ Upstream commit c54f90c2627cc316d365e3073614731e17dbc631 ]
LLVM's integrated assembler fails with the following error when
building KVM:
<inline asm>:12:6: error: expected absolute expression
.if kvm_update_va_mask == 0
^
<inline asm>:21:6: error: expected absolute expression
.if kvm_update_va_mask == 0
^
<inline asm>:24:2: error: unrecognized instruction mnemonic
NOT_AN_INSTRUCTION
^
LLVM ERROR: Error parsing inline asm
These errors come from ALTERNATIVE_CB and __ALTERNATIVE_CFG,
which test for the existence of the callback parameter in inline
assembly using the following expression:
" .if " __stringify(cb) " == 0\n"
This works with GNU as, but isn't supported by LLVM. This change
splits __ALTERNATIVE_CFG and ALTINSTR_ENTRY into separate macros
to fix the LLVM build.
Link: https://github.com/ClangBuiltLinux/linux/issues/472
Signed-off-by: Sami Tolvanen <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm64/include/asm/alternative.h | 32 ++++++++++++++++++----------
1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 4b650ec1d7dd1..887a8512bf10b 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -35,13 +35,16 @@ void apply_alternatives_module(void *start, size_t length);
static inline void apply_alternatives_module(void *start, size_t length) { }
#endif
-#define ALTINSTR_ENTRY(feature,cb) \
+#define ALTINSTR_ENTRY(feature) \
" .word 661b - .\n" /* label */ \
- " .if " __stringify(cb) " == 0\n" \
" .word 663f - .\n" /* new instruction */ \
- " .else\n" \
+ " .hword " __stringify(feature) "\n" /* feature bit */ \
+ " .byte 662b-661b\n" /* source len */ \
+ " .byte 664f-663f\n" /* replacement len */
+
+#define ALTINSTR_ENTRY_CB(feature, cb) \
+ " .word 661b - .\n" /* label */ \
" .word " __stringify(cb) "- .\n" /* callback */ \
- " .endif\n" \
" .hword " __stringify(feature) "\n" /* feature bit */ \
" .byte 662b-661b\n" /* source len */ \
" .byte 664f-663f\n" /* replacement len */
@@ -62,15 +65,14 @@ static inline void apply_alternatives_module(void *start, size_t length) { }
*
* Alternatives with callbacks do not generate replacement instructions.
*/
-#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled, cb) \
+#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled) \
".if "__stringify(cfg_enabled)" == 1\n" \
"661:\n\t" \
oldinstr "\n" \
"662:\n" \
".pushsection .altinstructions,\"a\"\n" \
- ALTINSTR_ENTRY(feature,cb) \
+ ALTINSTR_ENTRY(feature) \
".popsection\n" \
- " .if " __stringify(cb) " == 0\n" \
".pushsection .altinstr_replacement, \"a\"\n" \
"663:\n\t" \
newinstr "\n" \
@@ -78,17 +80,25 @@ static inline void apply_alternatives_module(void *start, size_t length) { }
".popsection\n\t" \
".org . - (664b-663b) + (662b-661b)\n\t" \
".org . - (662b-661b) + (664b-663b)\n" \
- ".else\n\t" \
+ ".endif\n"
+
+#define __ALTERNATIVE_CFG_CB(oldinstr, feature, cfg_enabled, cb) \
+ ".if "__stringify(cfg_enabled)" == 1\n" \
+ "661:\n\t" \
+ oldinstr "\n" \
+ "662:\n" \
+ ".pushsection .altinstructions,\"a\"\n" \
+ ALTINSTR_ENTRY_CB(feature, cb) \
+ ".popsection\n" \
"663:\n\t" \
"664:\n\t" \
- ".endif\n" \
".endif\n"
#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...) \
- __ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg), 0)
+ __ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
#define ALTERNATIVE_CB(oldinstr, cb) \
- __ALTERNATIVE_CFG(oldinstr, "NOT_AN_INSTRUCTION", ARM64_CB_PATCH, 1, cb)
+ __ALTERNATIVE_CFG_CB(oldinstr, ARM64_CB_PATCH, 1, cb)
#else
#include <asm/assembler.h>
--
2.20.1
From: Jun Lei <[email protected]>
[ Upstream commit 34ad0230062c39cdcba564d16d122c0fb467a7d6 ]
[why]
Need to fix DML portability issues to enable SW unit testing around DML
[how]
Move calcs into dc include folder since multiple components reference it
Remove relative paths to external dependencies
Signed-off-by: Jun Lei <[email protected]>
Reviewed-by: Anthony Koo <[email protected]>
Acked-by: Harry Wentland <[email protected]>
Acked-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.c | 2 +-
drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h | 2 +-
drivers/gpu/drm/amd/display/dc/{calcs => inc}/dcn_calc_math.h | 0
3 files changed, 2 insertions(+), 2 deletions(-)
rename drivers/gpu/drm/amd/display/dc/{calcs => inc}/dcn_calc_math.h (100%)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.c b/drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.c
index b953b02a15121..723af0b2dda04 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dml_common_defs.c
@@ -24,7 +24,7 @@
*/
#include "dml_common_defs.h"
-#include "../calcs/dcn_calc_math.h"
+#include "dcn_calc_math.h"
#include "dml_inline_defs.h"
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h b/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h
index e8ce08567cd8e..e4f595a3038c4 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h
+++ b/drivers/gpu/drm/amd/display/dc/dml/dml_inline_defs.h
@@ -27,7 +27,7 @@
#define __DML_INLINE_DEFS_H__
#include "dml_common_defs.h"
-#include "../calcs/dcn_calc_math.h"
+#include "dcn_calc_math.h"
#include "dml_logger.h"
static inline double dml_min(double a, double b)
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.h b/drivers/gpu/drm/amd/display/dc/inc/dcn_calc_math.h
similarity index 100%
rename from drivers/gpu/drm/amd/display/dc/calcs/dcn_calc_math.h
rename to drivers/gpu/drm/amd/display/dc/inc/dcn_calc_math.h
--
2.20.1
From: Colin Ian King <[email protected]>
[ Upstream commit c2f9a4e4a5abfc84c01b738496b3fd2d471e0b18 ]
The loop counter addr is a u16 where as the upper limit of the loop
is an int. In the unlikely event that the il->cfg->eeprom_size is
greater than 64K then we end up with an infinite loop since addr will
wrap around an never reach upper loop limit. Fix this by making addr
an int.
Addresses-Coverity: ("Infinite loop")
Fixes: be663ab67077 ("iwlwifi: split the drivers for agn and legacy devices 3945/4965")
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Stanislaw Gruszka <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/intel/iwlegacy/common.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlegacy/common.c b/drivers/net/wireless/intel/iwlegacy/common.c
index 6514baf799fef..e16f2597c2199 100644
--- a/drivers/net/wireless/intel/iwlegacy/common.c
+++ b/drivers/net/wireless/intel/iwlegacy/common.c
@@ -717,7 +717,7 @@ il_eeprom_init(struct il_priv *il)
u32 gp = _il_rd(il, CSR_EEPROM_GP);
int sz;
int ret;
- u16 addr;
+ int addr;
/* allocate eeprom */
sz = il->cfg->eeprom_size;
--
2.20.1
From: YueHaibing <[email protected]>
[ Upstream commit 2e4534a22794746b11a794b2229b8d58797eccce ]
drivers/gpu/drm/nouveau/nouveau_ttm.c: In function nouveau_vram_manager_new:
drivers/gpu/drm/nouveau/nouveau_ttm.c:66:22: warning: variable mem set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/nouveau/nouveau_ttm.c: In function nouveau_gart_manager_new:
drivers/gpu/drm/nouveau/nouveau_ttm.c:106:22: warning: variable mem set but not used [-Wunused-but-set-variable]
They are not used any more, so remove it.
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_ttm.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index e4b977cc84528..37715a2a2f3f9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -63,14 +63,12 @@ nouveau_vram_manager_new(struct ttm_mem_type_manager *man,
{
struct nouveau_bo *nvbo = nouveau_bo(bo);
struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
- struct nouveau_mem *mem;
int ret;
if (drm->client.device.info.ram_size == 0)
return -ENOMEM;
ret = nouveau_mem_new(&drm->master, nvbo->kind, nvbo->comp, reg);
- mem = nouveau_mem(reg);
if (ret)
return ret;
@@ -103,11 +101,9 @@ nouveau_gart_manager_new(struct ttm_mem_type_manager *man,
{
struct nouveau_bo *nvbo = nouveau_bo(bo);
struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
- struct nouveau_mem *mem;
int ret;
ret = nouveau_mem_new(&drm->master, nvbo->kind, nvbo->comp, reg);
- mem = nouveau_mem(reg);
if (ret)
return ret;
--
2.20.1
From: wangyan <[email protected]>
[ Upstream commit 9f16ca48fc818a17de8be1f75d08e7f4addc4497 ]
I found a NULL pointer dereference in ocfs2_update_inode_fsync_trans(),
handle->h_transaction may be NULL in this situation:
ocfs2_file_write_iter
->__generic_file_write_iter
->generic_perform_write
->ocfs2_write_begin
->ocfs2_write_begin_nolock
->ocfs2_write_cluster_by_desc
->ocfs2_write_cluster
->ocfs2_mark_extent_written
->ocfs2_change_extent_flag
->ocfs2_split_extent
->ocfs2_try_to_merge_extent
->ocfs2_extend_rotate_transaction
->ocfs2_extend_trans
->jbd2_journal_restart
->jbd2__journal_restart
// handle->h_transaction is NULL here
->handle->h_transaction = NULL;
->start_this_handle
/* journal aborted due to storage
network disconnection, return error */
->return -EROFS;
/* line 3806 in ocfs2_try_to_merge_extent (),
it will ignore ret error. */
->ret = 0;
->...
->ocfs2_write_end
->ocfs2_write_end_nolock
->ocfs2_update_inode_fsync_trans
// NULL pointer dereference
->oi->i_sync_tid = handle->h_transaction->t_tid;
The information of NULL pointer dereference as follows:
JBD2: Detected IO errors while flushing file data on dm-11-45
Aborting journal on device dm-11-45.
JBD2: Error -5 detected when updating journal superblock for dm-11-45.
(dd,22081,3):ocfs2_extend_trans:474 ERROR: status = -30
(dd,22081,3):ocfs2_try_to_merge_extent:3877 ERROR: status = -30
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000008
Mem abort info:
ESR = 0x96000004
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e74e1338
[0000000000000008] pgd=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
Process dd (pid: 22081, stack limit = 0x00000000584f35a9)
CPU: 3 PID: 22081 Comm: dd Kdump: loaded
Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
pstate: 60400009 (nZCv daif +PAN -UAO)
pc : ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
lr : ocfs2_write_end_nolock+0x2a0/0x550 [ocfs2]
sp : ffff0000459fba70
x29: ffff0000459fba70 x28: 0000000000000000
x27: ffff807ccf7f1000 x26: 0000000000000001
x25: ffff807bdff57970 x24: ffff807caf1d4000
x23: ffff807cc79e9000 x22: 0000000000001000
x21: 000000006c6cd000 x20: ffff0000091d9000
x19: ffff807ccb239db0 x18: ffffffffffffffff
x17: 000000000000000e x16: 0000000000000007
x15: ffff807c5e15bd78 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000001
x9 : 0000000000000228 x8 : 000000000000000c
x7 : 0000000000000fff x6 : ffff807a308ed6b0
x5 : ffff7e01f10967c0 x4 : 0000000000000018
x3 : d0bc661572445600 x2 : 0000000000000000
x1 : 000000001b2e0200 x0 : 0000000000000000
Call trace:
ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
ocfs2_write_end+0x4c/0x80 [ocfs2]
generic_perform_write+0x108/0x1a8
__generic_file_write_iter+0x158/0x1c8
ocfs2_file_write_iter+0x668/0x950 [ocfs2]
__vfs_write+0x11c/0x190
vfs_write+0xac/0x1c0
ksys_write+0x6c/0xd8
__arm64_sys_write+0x24/0x30
el0_svc_common+0x78/0x130
el0_svc_handler+0x38/0x78
el0_svc+0x8/0xc
To prevent NULL pointer dereference in this situation, we use
is_handle_aborted() before using handle->h_transaction->t_tid.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yan Wang <[email protected]>
Reviewed-by: Jun Piao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ocfs2/journal.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 497a4171ef61f..bfb50fc51528f 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -637,9 +637,11 @@ static inline void ocfs2_update_inode_fsync_trans(handle_t *handle,
{
struct ocfs2_inode_info *oi = OCFS2_I(inode);
- oi->i_sync_tid = handle->h_transaction->t_tid;
- if (datasync)
- oi->i_datasync_tid = handle->h_transaction->t_tid;
+ if (!is_handle_aborted(handle)) {
+ oi->i_sync_tid = handle->h_transaction->t_tid;
+ if (datasync)
+ oi->i_datasync_tid = handle->h_transaction->t_tid;
+ }
}
#endif /* OCFS2_JOURNAL_H */
--
2.20.1
From: Will Deacon <[email protected]>
[ Upstream commit d71e01716b3606a6648df7e5646ae12c75babde4 ]
If, for some bizarre reason, the compiler decided to split up the write
of STE DWORD 0, we could end up making a partial structure valid.
Although this probably won't happen, follow the example of the
context-descriptor code and use WRITE_ONCE() to ensure atomicity of the
write.
Reported-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/arm-smmu-v3.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index eff1f3aa5ef43..6b7664052b5be 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1185,7 +1185,8 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_device *smmu, u32 sid,
}
arm_smmu_sync_ste_for_sid(smmu, sid);
- dst[0] = cpu_to_le64(val);
+ /* See comment in arm_smmu_write_ctx_desc() */
+ WRITE_ONCE(dst[0], cpu_to_le64(val));
arm_smmu_sync_ste_for_sid(smmu, sid);
/* It's likely that we'll want to use the new STE soon */
--
2.20.1
From: Coly Li <[email protected]>
[ Upstream commit 7c02b0055f774ed9afb6e1c7724f33bf148ffdc0 ]
In bset.h, macro bset_bkey_last() is defined as,
bkey_idx((struct bkey *) (i)->d, (i)->keys)
Parameter i can be variable type of data structure, the macro always
works once the type of struct i has member 'd' and 'keys'.
bset_bkey_last() is also used in macro csum_set() to calculate the
checksum of a on-disk data structure. When csum_set() is used to
calculate checksum of on-disk bcache super block, the parameter 'i'
data type is struct cache_sb_disk. Inside struct cache_sb_disk (also in
struct cache_sb) the member keys is __u16 type. But bkey_idx() expects
unsigned int (a 32bit width), so there is problem when sending
parameters via stack to call bkey_idx().
Sparse tool from Intel 0day kbuild system reports this incompatible
problem. bkey_idx() is part of user space API, so the simplest fix is
to cast the (i)->keys to unsigned int type in macro bset_bkey_last().
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Coly Li <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/md/bcache/bset.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/bcache/bset.h b/drivers/md/bcache/bset.h
index c71365e7c1fac..a50dcfda656f5 100644
--- a/drivers/md/bcache/bset.h
+++ b/drivers/md/bcache/bset.h
@@ -397,7 +397,8 @@ void bch_btree_keys_stats(struct btree_keys *b, struct bset_stats *state);
/* Bkey utility code */
-#define bset_bkey_last(i) bkey_idx((struct bkey *) (i)->d, (i)->keys)
+#define bset_bkey_last(i) bkey_idx((struct bkey *) (i)->d, \
+ (unsigned int)(i)->keys)
static inline struct bkey *bset_bkey_idx(struct bset *i, unsigned int idx)
{
--
2.20.1
From: Andrei Otcheretianski <[email protected]>
[ Upstream commit baa6cf8450b72dcab11f37c47efce7c5b9b8ad0f ]
Use a unique name when registering a thermal zone. Otherwise, with
multiple NICS, we hit the following warning during the unregistration.
WARNING: CPU: 2 PID: 3525 at fs/sysfs/group.c:255
RIP: 0010:sysfs_remove_group+0x80/0x90
Call Trace:
dpm_sysfs_remove+0x57/0x60
device_del+0x5a/0x350
? sscanf+0x4e/0x70
device_unregister+0x1a/0x60
hwmon_device_unregister+0x4a/0xa0
thermal_remove_hwmon_sysfs+0x175/0x1d0
thermal_zone_device_unregister+0x188/0x1e0
iwl_mvm_thermal_exit+0xe7/0x100 [iwlmvm]
iwl_op_mode_mvm_stop+0x27/0x180 [iwlmvm]
_iwl_op_mode_stop.isra.3+0x2b/0x50 [iwlwifi]
iwl_opmode_deregister+0x90/0xa0 [iwlwifi]
__exit_compat+0x10/0x2c7 [iwlmvm]
__x64_sys_delete_module+0x13f/0x270
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Andrei Otcheretianski <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/intel/iwlwifi/mvm/tt.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
index 1232f63278eb6..319103f4b432e 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/tt.c
@@ -739,7 +739,8 @@ static struct thermal_zone_device_ops tzone_ops = {
static void iwl_mvm_thermal_zone_register(struct iwl_mvm *mvm)
{
int i;
- char name[] = "iwlwifi";
+ char name[16];
+ static atomic_t counter = ATOMIC_INIT(0);
if (!iwl_mvm_is_tt_in_fw(mvm)) {
mvm->tz_device.tzone = NULL;
@@ -749,6 +750,7 @@ static void iwl_mvm_thermal_zone_register(struct iwl_mvm *mvm)
BUILD_BUG_ON(ARRAY_SIZE(name) >= THERMAL_NAME_LENGTH);
+ sprintf(name, "iwlwifi_%u", atomic_inc_return(&counter) & 0xFF);
mvm->tz_device.tzone = thermal_zone_device_register(name,
IWL_MAX_DTS_TRIPS,
IWL_WRITABLE_TRIPS_MSK,
--
2.20.1
From: Zhiqiang Liu <[email protected]>
[ Upstream commit c8ab422553c81a0eb070329c63725df1cd1425bc ]
In brd_init func, rd_nr num of brd_device are firstly allocated
and add in brd_devices, then brd_devices are traversed to add each
brd_device by calling add_disk func. When allocating brd_device,
the disk->first_minor is set to i * max_part, if rd_nr * max_part
is larger than MINORMASK, two different brd_device may have the same
devt, then only one of them can be successfully added.
when rmmod brd.ko, it will cause oops when calling brd_exit.
Follow those steps:
# modprobe brd rd_nr=3 rd_size=102400 max_part=1048576
# rmmod brd
then, the oops will appear.
Oops log:
[ 726.613722] Call trace:
[ 726.614175] kernfs_find_ns+0x24/0x130
[ 726.614852] kernfs_find_and_get_ns+0x44/0x68
[ 726.615749] sysfs_remove_group+0x38/0xb0
[ 726.616520] blk_trace_remove_sysfs+0x1c/0x28
[ 726.617320] blk_unregister_queue+0x98/0x100
[ 726.618105] del_gendisk+0x144/0x2b8
[ 726.618759] brd_exit+0x68/0x560 [brd]
[ 726.619501] __arm64_sys_delete_module+0x19c/0x2a0
[ 726.620384] el0_svc_common+0x78/0x130
[ 726.621057] el0_svc_handler+0x38/0x78
[ 726.621738] el0_svc+0x8/0xc
[ 726.622259] Code: aa0203f6 aa0103f7 aa1e03e0 d503201f (7940e260)
Here, we add brd_check_and_reset_par func to check and limit max_part par.
--
V5->V6:
- remove useless code
V4->V5:(suggested by Ming Lei)
- make sure max_part is not larger than DISK_MAX_PARTS
V3->V4:(suggested by Ming Lei)
- remove useless change
- add one limit of max_part
V2->V3: (suggested by Ming Lei)
- clear .minors when running out of consecutive minor space in brd_alloc
- remove limit of rd_nr
V1->V2:
- add more checks in brd_check_par_valid as suggested by Ming Lei.
Signed-off-by: Zhiqiang Liu <[email protected]>
Reviewed-by: Bob Liu <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/block/brd.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 17defbf4f332c..02e8fff3f8283 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -463,6 +463,25 @@ static struct kobject *brd_probe(dev_t dev, int *part, void *data)
return kobj;
}
+static inline void brd_check_and_reset_par(void)
+{
+ if (unlikely(!max_part))
+ max_part = 1;
+
+ /*
+ * make sure 'max_part' can be divided exactly by (1U << MINORBITS),
+ * otherwise, it is possiable to get same dev_t when adding partitions.
+ */
+ if ((1U << MINORBITS) % max_part != 0)
+ max_part = 1UL << fls(max_part);
+
+ if (max_part > DISK_MAX_PARTS) {
+ pr_info("brd: max_part can't be larger than %d, reset max_part = %d.\n",
+ DISK_MAX_PARTS, DISK_MAX_PARTS);
+ max_part = DISK_MAX_PARTS;
+ }
+}
+
static int __init brd_init(void)
{
struct brd_device *brd, *next;
@@ -486,8 +505,7 @@ static int __init brd_init(void)
if (register_blkdev(RAMDISK_MAJOR, "ramdisk"))
return -EIO;
- if (unlikely(!max_part))
- max_part = 1;
+ brd_check_and_reset_par();
for (i = 0; i < rd_nr; i++) {
brd = brd_alloc(i);
--
2.20.1
From: Alex Deucher <[email protected]>
[ Upstream commit 4d0a72b66065dd7e274bad6aa450196d42fd8f84 ]
Only send non-0 clocks to DC for validation. This mirrors
what the windows driver does.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/963
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
index 1546bc49004f8..3fa6e8123b8eb 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
@@ -994,12 +994,15 @@ static int smu10_get_clock_by_type_with_latency(struct pp_hwmgr *hwmgr,
clocks->num_levels = 0;
for (i = 0; i < pclk_vol_table->count; i++) {
- clocks->data[i].clocks_in_khz = pclk_vol_table->entries[i].clk * 10;
- clocks->data[i].latency_in_us = latency_required ?
- smu10_get_mem_latency(hwmgr,
- pclk_vol_table->entries[i].clk) :
- 0;
- clocks->num_levels++;
+ if (pclk_vol_table->entries[i].clk) {
+ clocks->data[clocks->num_levels].clocks_in_khz =
+ pclk_vol_table->entries[i].clk * 10;
+ clocks->data[clocks->num_levels].latency_in_us = latency_required ?
+ smu10_get_mem_latency(hwmgr,
+ pclk_vol_table->entries[i].clk) :
+ 0;
+ clocks->num_levels++;
+ }
}
return 0;
--
2.20.1
From: Ronnie Sahlberg <[email protected]>
[ Upstream commit fe1292686333d1dadaf84091f585ee903b9ddb84 ]
RHBZ: 1760879
Fix an oops in match_prepath() by making sure that the prepath string is not
NULL before we pass it into strcmp().
This is similar to other checks we make for example in cifs_root_iget()
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/cifs/connect.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 576cf71576da1..6c62ce40608a1 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -3342,8 +3342,10 @@ match_prepath(struct super_block *sb, struct cifs_mnt_data *mnt_data)
{
struct cifs_sb_info *old = CIFS_SB(sb);
struct cifs_sb_info *new = mnt_data->cifs_sb;
- bool old_set = old->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH;
- bool new_set = new->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH;
+ bool old_set = (old->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH) &&
+ old->prepath;
+ bool new_set = (new->mnt_cifs_flags & CIFS_MOUNT_USE_PREFIX_PATH) &&
+ new->prepath;
if (old_set && new_set && !strcmp(new->prepath, old->prepath))
return 1;
--
2.20.1
From: Steve French <[email protected]>
[ Upstream commit d6fd41905ec577851734623fb905b1763801f5ef ]
We ran into a confusing problem where an application wasn't checking
return code on close and so user didn't realize that the application
ran out of disk space. log a warning message (once) in these
cases. For example:
[ 8407.391909] Out of space writing to \\oleg-server\small-share
Signed-off-by: Steve French <[email protected]>
Reported-by: Oleg Kravtsov <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/cifs/smb2pdu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 0d4e4d97e6cf5..e2d2b749c8f38 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -3425,6 +3425,9 @@ smb2_writev_callback(struct mid_q_entry *mid)
wdata->cfile->fid.persistent_fid,
tcon->tid, tcon->ses->Suid, wdata->offset,
wdata->bytes, wdata->result);
+ if (wdata->result == -ENOSPC)
+ printk_once(KERN_WARNING "Out of space writing to %s\n",
+ tcon->treeName);
} else
trace_smb3_write_done(0 /* no xid */,
wdata->cfile->fid.persistent_fid,
--
2.20.1
From: Ido Schimmel <[email protected]>
[ Upstream commit 3a99cbb6fa7bca1995586ec2dc21b0368aad4937 ]
In case devlink_dpipe_entry_ctx_prepare() failed, release RTNL that was
previously taken and free the memory allocated by
mlxsw_sp_erif_entry_prepare().
Fixes: 2ba5999f009d ("mlxsw: spectrum: Add Support for erif table entries access")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_dpipe.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dpipe.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dpipe.c
index 41e607a14846d..4fe193c4fa55d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_dpipe.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_dpipe.c
@@ -215,7 +215,7 @@ mlxsw_sp_dpipe_table_erif_entries_dump(void *priv, bool counters_enabled,
start_again:
err = devlink_dpipe_entry_ctx_prepare(dump_ctx);
if (err)
- return err;
+ goto err_ctx_prepare;
j = 0;
for (; i < rif_count; i++) {
struct mlxsw_sp_rif *rif = mlxsw_sp_rif_by_index(mlxsw_sp, i);
@@ -247,6 +247,7 @@ start_again:
return 0;
err_entry_append:
err_entry_get:
+err_ctx_prepare:
rtnl_unlock();
devlink_dpipe_entry_clear(&entry);
return err;
--
2.20.1
From: Alex Deucher <[email protected]>
[ Upstream commit c37243579d6c881c575dcfb54cf31c9ded88f946 ]
We might get different numbers of clocks from powerplay depending
on what the OEM has populated.
v2: add assert for at least one level
Bug: https://gitlab.freedesktop.org/drm/amd/issues/963
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../gpu/drm/amd/display/dc/calcs/dcn_calcs.c | 34 +++++++++++++------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
index 6342f64993512..b0956c360393e 100644
--- a/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
+++ b/drivers/gpu/drm/amd/display/dc/calcs/dcn_calcs.c
@@ -1346,6 +1346,7 @@ void dcn_bw_update_from_pplib(struct dc *dc)
struct dc_context *ctx = dc->ctx;
struct dm_pp_clock_levels_with_voltage fclks = {0}, dcfclks = {0};
bool res;
+ unsigned vmin0p65_idx, vmid0p72_idx, vnom0p8_idx, vmax0p9_idx;
/* TODO: This is not the proper way to obtain fabric_and_dram_bandwidth, should be min(fclk, memclk) */
res = dm_pp_get_clock_levels_by_type_with_voltage(
@@ -1357,17 +1358,28 @@ void dcn_bw_update_from_pplib(struct dc *dc)
res = verify_clock_values(&fclks);
if (res) {
- ASSERT(fclks.num_levels >= 3);
- dc->dcn_soc->fabric_and_dram_bandwidth_vmin0p65 = 32 * (fclks.data[0].clocks_in_khz / 1000.0) / 1000.0;
- dc->dcn_soc->fabric_and_dram_bandwidth_vmid0p72 = dc->dcn_soc->number_of_channels *
- (fclks.data[fclks.num_levels - (fclks.num_levels > 2 ? 3 : 2)].clocks_in_khz / 1000.0)
- * ddr4_dram_factor_single_Channel / 1000.0;
- dc->dcn_soc->fabric_and_dram_bandwidth_vnom0p8 = dc->dcn_soc->number_of_channels *
- (fclks.data[fclks.num_levels - 2].clocks_in_khz / 1000.0)
- * ddr4_dram_factor_single_Channel / 1000.0;
- dc->dcn_soc->fabric_and_dram_bandwidth_vmax0p9 = dc->dcn_soc->number_of_channels *
- (fclks.data[fclks.num_levels - 1].clocks_in_khz / 1000.0)
- * ddr4_dram_factor_single_Channel / 1000.0;
+ ASSERT(fclks.num_levels);
+
+ vmin0p65_idx = 0;
+ vmid0p72_idx = fclks.num_levels -
+ (fclks.num_levels > 2 ? 3 : (fclks.num_levels > 1 ? 2 : 1));
+ vnom0p8_idx = fclks.num_levels - (fclks.num_levels > 1 ? 2 : 1);
+ vmax0p9_idx = fclks.num_levels - 1;
+
+ dc->dcn_soc->fabric_and_dram_bandwidth_vmin0p65 =
+ 32 * (fclks.data[vmin0p65_idx].clocks_in_khz / 1000.0) / 1000.0;
+ dc->dcn_soc->fabric_and_dram_bandwidth_vmid0p72 =
+ dc->dcn_soc->number_of_channels *
+ (fclks.data[vmid0p72_idx].clocks_in_khz / 1000.0)
+ * ddr4_dram_factor_single_Channel / 1000.0;
+ dc->dcn_soc->fabric_and_dram_bandwidth_vnom0p8 =
+ dc->dcn_soc->number_of_channels *
+ (fclks.data[vnom0p8_idx].clocks_in_khz / 1000.0)
+ * ddr4_dram_factor_single_Channel / 1000.0;
+ dc->dcn_soc->fabric_and_dram_bandwidth_vmax0p9 =
+ dc->dcn_soc->number_of_channels *
+ (fclks.data[vmax0p9_idx].clocks_in_khz / 1000.0)
+ * ddr4_dram_factor_single_Channel / 1000.0;
} else
BREAK_TO_DEBUGGER();
--
2.20.1
From: Ben Skeggs <[email protected]>
[ Upstream commit 0e6176c6d286316e9431b4f695940cfac4ffe6c2 ]
The implementations for most channel types contains a map of methods to
priv registers in order to provide debugging info when a disp exception
has been raised.
This info is missing from the implementation of PIO channels as they're
rather simplistic already, however, if an exception is raised by one of
them, we'd end up triggering a NULL-pointer deref. Not ideal...
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206299
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
index bcf32d92ee5a9..50e3539f33d22 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/channv50.c
@@ -74,6 +74,8 @@ nv50_disp_chan_mthd(struct nv50_disp_chan *chan, int debug)
if (debug > subdev->debug)
return;
+ if (!mthd)
+ return;
for (i = 0; (list = mthd->data[i].mthd) != NULL; i++) {
u32 base = chan->head * mthd->addr;
--
2.20.1
From: Vasily Averin <[email protected]>
[ Upstream commit e4075e8bdffd93a9b6d6e1d52fabedceeca5a91b ]
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
Without patch:
# dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid
dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset
id
no pid
2+1 records in
2+1 records out
10 bytes copied, 0.000213285 s, 46.9 kB/s
Notice the "id" followed by "no pid".
With the patch:
# dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid
dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset
id
0+1 records in
0+1 records out
3 bytes copied, 0.000202112 s, 14.8 kB/s
Notice that it only prints "id" and not the "no pid" afterward.
Link: http://lkml.kernel.org/r/[email protected]
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/trace/ftrace.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 53795237e9751..0c379cd40bea3 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6525,9 +6525,10 @@ static void *fpid_next(struct seq_file *m, void *v, loff_t *pos)
struct trace_array *tr = m->private;
struct trace_pid_list *pid_list = rcu_dereference_sched(tr->function_pids);
- if (v == FTRACE_NO_PIDS)
+ if (v == FTRACE_NO_PIDS) {
+ (*pos)++;
return NULL;
-
+ }
return trace_pid_next(pid_list, v, pos);
}
--
2.20.1
From: Shubhrajyoti Datta <[email protected]>
[ Upstream commit 061d2c1d593076424c910cb1b64ecdb5c9a6923f ]
In case the start + cache size is more than the max int the
start overflows.
Prevent the same.
Signed-off-by: Shubhrajyoti Datta <[email protected]>
Signed-off-by: Michal Simek <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/microblaze/kernel/cpu/cache.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/microblaze/kernel/cpu/cache.c b/arch/microblaze/kernel/cpu/cache.c
index 0bde47e4fa694..dcba53803fa5f 100644
--- a/arch/microblaze/kernel/cpu/cache.c
+++ b/arch/microblaze/kernel/cpu/cache.c
@@ -92,7 +92,8 @@ static inline void __disable_dcache_nomsr(void)
#define CACHE_LOOP_LIMITS(start, end, cache_line_length, cache_size) \
do { \
int align = ~(cache_line_length - 1); \
- end = min(start + cache_size, end); \
+ if (start < UINT_MAX - cache_size) \
+ end = min(start + cache_size, end); \
start &= align; \
} while (0)
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit b61156fba74f659d0bc2de8f2dbf5bad9f4b8faf ]
Clang warns:
../drivers/net/wireless/intersil/hostap/hostap_ap.c:2511:3: warning:
misleading indentation; statement is not part of the previous 'if'
[-Wmisleading-indentation]
if (sta->tx_supp_rates & WLAN_RATE_5M5)
^
../drivers/net/wireless/intersil/hostap/hostap_ap.c:2509:2: note:
previous statement is here
if (sta->tx_supp_rates & WLAN_RATE_2M)
^
1 warning generated.
This warning occurs because there is a space before the tab on this
line. Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.
Fixes: ff1d2767d5a4 ("Add HostAP wireless driver.")
Link: https://github.com/ClangBuiltLinux/linux/issues/813
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/intersil/hostap/hostap_ap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intersil/hostap/hostap_ap.c b/drivers/net/wireless/intersil/hostap/hostap_ap.c
index 0094b1d2b5770..3ec46f48cfde1 100644
--- a/drivers/net/wireless/intersil/hostap/hostap_ap.c
+++ b/drivers/net/wireless/intersil/hostap/hostap_ap.c
@@ -2508,7 +2508,7 @@ static int prism2_hostapd_add_sta(struct ap_data *ap,
sta->supported_rates[0] = 2;
if (sta->tx_supp_rates & WLAN_RATE_2M)
sta->supported_rates[1] = 4;
- if (sta->tx_supp_rates & WLAN_RATE_5M5)
+ if (sta->tx_supp_rates & WLAN_RATE_5M5)
sta->supported_rates[2] = 11;
if (sta->tx_supp_rates & WLAN_RATE_11M)
sta->supported_rates[3] = 22;
--
2.20.1
From: Marc Zyngier <[email protected]>
[ Upstream commit 926b5dfa6b8dc666ff398044af6906b156e1d949 ]
We currently allocate redistributor region structures for
individual redistributors when ACPI doesn't present us with
compact MMIO regions covering multiple redistributors.
It turns out that we allocate these structures even when
the redistributor is flagged as disabled by ACPI. It works
fine until someone actually tries to tarse one of these
structures, and access the corresponding MMIO region.
Instead, track the number of enabled redistributors, and
only allocate what is required. This makes sure that there
is no invalid data to misuse.
Signed-off-by: Marc Zyngier <[email protected]>
Reported-by: Heyi Guo <[email protected]>
Tested-by: Heyi Guo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index d5912f1ec8848..ac888d7a0b00a 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1347,6 +1347,7 @@ static struct
struct redist_region *redist_regs;
u32 nr_redist_regions;
bool single_redist;
+ int enabled_rdists;
u32 maint_irq;
int maint_irq_mode;
phys_addr_t vcpu_base;
@@ -1441,8 +1442,10 @@ static int __init gic_acpi_match_gicc(struct acpi_subtable_header *header,
* If GICC is enabled and has valid gicr base address, then it means
* GICR base is presented via GICC
*/
- if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address)
+ if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
+ acpi_data.enabled_rdists++;
return 0;
+ }
/*
* It's perfectly valid firmware can pass disabled GICC entry, driver
@@ -1472,8 +1475,10 @@ static int __init gic_acpi_count_gicr_regions(void)
count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
gic_acpi_match_gicc, 0);
- if (count > 0)
+ if (count > 0) {
acpi_data.single_redist = true;
+ count = acpi_data.enabled_rdists;
+ }
return count;
}
--
2.20.1
From: Alex Deucher <[email protected]>
[ Upstream commit 1064ad4aeef94f51ca230ac639a9e996fb7867a0 ]
Cull out 0 clocks to avoid a warning in DC.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/963
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
index 3fa6e8123b8eb..48e31711bc68f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
@@ -1048,9 +1048,11 @@ static int smu10_get_clock_by_type_with_voltage(struct pp_hwmgr *hwmgr,
clocks->num_levels = 0;
for (i = 0; i < pclk_vol_table->count; i++) {
- clocks->data[i].clocks_in_khz = pclk_vol_table->entries[i].clk * 10;
- clocks->data[i].voltage_in_mv = pclk_vol_table->entries[i].vol;
- clocks->num_levels++;
+ if (pclk_vol_table->entries[i].clk) {
+ clocks->data[clocks->num_levels].clocks_in_khz = pclk_vol_table->entries[i].clk * 10;
+ clocks->data[clocks->num_levels].voltage_in_mv = pclk_vol_table->entries[i].vol;
+ clocks->num_levels++;
+ }
}
return 0;
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit a55e601b2f02df5db7070e9a37bd655c9c576a52 ]
gcc -O3 warns about a dummy variable that is passed
down into rbd_img_fill_nodata without being initialized:
drivers/block/rbd.c: In function 'rbd_img_fill_nodata':
drivers/block/rbd.c:2573:13: error: 'dummy' is used uninitialized in this function [-Werror=uninitialized]
fctx->iter = *fctx->pos;
Since this is a dummy, I assume the warning is harmless, but
it's better to initialize it anyway and avoid the warning.
Fixes: mmtom ("init/Kconfig: enable -O3 for all arches")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Ilya Dryomov <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/block/rbd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index b942f4c8cea8c..d3ad1b8c133e6 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -2097,7 +2097,7 @@ static int rbd_img_fill_nodata(struct rbd_img_request *img_req,
u64 off, u64 len)
{
struct ceph_file_extent ex = { off, len };
- union rbd_img_fill_iter dummy;
+ union rbd_img_fill_iter dummy = {};
struct rbd_img_fill_ctx fctx = {
.pos_type = OBJ_REQUEST_NODATA,
.pos = &dummy,
--
2.20.1
From: Vasily Averin <[email protected]>
[ Upstream commit 9f198a2ac543eaaf47be275531ad5cbd50db3edf ]
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Mike Marshall <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/orangefs/orangefs-debugfs.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/orangefs/orangefs-debugfs.c b/fs/orangefs/orangefs-debugfs.c
index 0732cb08173e9..e24738c691f66 100644
--- a/fs/orangefs/orangefs-debugfs.c
+++ b/fs/orangefs/orangefs-debugfs.c
@@ -305,6 +305,7 @@ static void *help_start(struct seq_file *m, loff_t *pos)
static void *help_next(struct seq_file *m, void *v, loff_t *pos)
{
+ (*pos)++;
gossip_debug(GOSSIP_DEBUGFS_DEBUG, "help_next: start\n");
return NULL;
--
2.20.1
From: Michael S. Tsirkin <[email protected]>
[ Upstream commit 6e9826e77249355c09db6ba41cd3f84e89f4b614 ]
Make sure, at build time, that pfn array is big enough to hold a single
page. It happens to be true since the PAGE_SHIFT value at the moment is
20, which is 1M - exactly 256 4K balloon pages.
Signed-off-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/virtio/virtio_balloon.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 14ac36ca8fbd3..1afcbef397ab7 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -126,6 +126,8 @@ static void set_page_pfns(struct virtio_balloon *vb,
{
unsigned int i;
+ BUILD_BUG_ON(VIRTIO_BALLOON_PAGES_PER_PAGE > VIRTIO_BALLOON_ARRAY_PFNS_MAX);
+
/*
* Set balloon pfns pointing at this page.
* Note that the first pfn points at start of the page.
--
2.20.1
From: Vasily Averin <[email protected]>
[ Upstream commit 90435a7891a2259b0f74c5a1bc5600d0d64cba8f ]
If seq_file .next fuction does not change position index,
read after some lseek can generate an unexpected output.
See also: https://bugzilla.kernel.org/show_bug.cgi?id=206283
v1 -> v2: removed missed increment in end of function
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/bpf/inode.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index dc9d7ac8228db..c04815bb15cc1 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -198,6 +198,7 @@ static void *map_seq_next(struct seq_file *m, void *v, loff_t *pos)
void *key = map_iter(m)->key;
void *prev_key;
+ (*pos)++;
if (map_iter(m)->done)
return NULL;
@@ -210,8 +211,6 @@ static void *map_seq_next(struct seq_file *m, void *v, loff_t *pos)
map_iter(m)->done = true;
return NULL;
}
-
- ++(*pos);
return key;
}
--
2.20.1
From: Xiubo Li <[email protected]>
[ Upstream commit 97820058fb2831a4b203981fa2566ceaaa396103 ]
If all the MDS daemons are down for some reason, then the first mount
attempt will fail with EIO after the mount request times out. A mount
attempt will also fail with EIO if all of the MDS's are laggy.
This patch changes the code to return -EHOSTUNREACH in these situations
and adds a pr_info error message to help the admin determine the cause.
URL: https://tracker.ceph.com/issues/4386
Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ceph/mds_client.c | 3 +--
fs/ceph/super.c | 5 +++++
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 09db6d08614d2..a2e903203bf9f 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2343,8 +2343,7 @@ static void __do_request(struct ceph_mds_client *mdsc,
if (!(mdsc->fsc->mount_options->flags &
CEPH_MOUNT_OPT_MOUNTWAIT) &&
!ceph_mdsmap_is_cluster_available(mdsc->mdsmap)) {
- err = -ENOENT;
- pr_info("probably no mds server is up\n");
+ err = -EHOSTUNREACH;
goto finish;
}
}
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 2bd0b1ed9708e..c4314f4492401 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -1106,6 +1106,11 @@ static struct dentry *ceph_mount(struct file_system_type *fs_type,
return res;
out_splat:
+ if (!ceph_mdsmap_is_cluster_available(fsc->mdsc->mdsmap)) {
+ pr_info("No mds server is up or the cluster is laggy\n");
+ err = -EHOSTUNREACH;
+ }
+
ceph_mdsc_close_sessions(fsc->mdsc);
deactivate_locked_super(sb);
goto out_final;
--
2.20.1
From: Wenwen Wang <[email protected]>
[ Upstream commit 123c23c6a7b7ecd2a3d6060bea1d94019f71fd66 ]
In _nfs42_proc_copy(), 'res->commit_res.verf' is allocated through
kzalloc() if 'args->sync' is true. In the following code, if
'res->synchronous' is false, handle_async_copy() will be invoked. If an
error occurs during the invocation, the following code will not be executed
and the error will be returned . However, the allocated
'res->commit_res.verf' is not deallocated, leading to a memory leak. This
is also true if the invocation of process_copy_commit() returns an error.
To fix the above leaks, redirect the execution to the 'out' label if an
error is encountered.
Signed-off-by: Wenwen Wang <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/nfs42proc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 94f98e190e632..526441de89c1d 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -283,14 +283,14 @@ static ssize_t _nfs42_proc_copy(struct file *src,
status = handle_async_copy(res, server, src, dst,
&args->src_stateid);
if (status)
- return status;
+ goto out;
}
if ((!res->synchronous || !args->sync) &&
res->write_res.verifier.committed != NFS_FILE_SYNC) {
status = process_copy_commit(dst, pos_dst, res);
if (status)
- return status;
+ goto out;
}
truncate_pagecache_range(dst_inode, pos_dst,
--
2.20.1
From: Yunfeng Ye <[email protected]>
[ Upstream commit aacee5446a2a1aa35d0a49dab289552578657fb4 ]
The variable inode may be NULL in reiserfs_insert_item(), but there is
no check before accessing the member of inode.
Fix this by adding NULL pointer check before calling reiserfs_debug().
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yunfeng Ye <[email protected]>
Cc: zhengbin <[email protected]>
Cc: Hu Shiyuan <[email protected]>
Cc: Feilong Lin <[email protected]>
Cc: Jan Kara <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/reiserfs/stree.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/reiserfs/stree.c b/fs/reiserfs/stree.c
index 0037aea97d39a..2946713cb00d6 100644
--- a/fs/reiserfs/stree.c
+++ b/fs/reiserfs/stree.c
@@ -2250,7 +2250,8 @@ error_out:
/* also releases the path */
unfix_nodes(&s_ins_balance);
#ifdef REISERQUOTA_DEBUG
- reiserfs_debug(th->t_super, REISERFS_DEBUG_CODE,
+ if (inode)
+ reiserfs_debug(th->t_super, REISERFS_DEBUG_CODE,
"reiserquota insert_item(): freeing %u id=%u type=%c",
quota_bytes, inode->i_uid, head2type(ih));
#endif
--
2.20.1
From: Vasily Averin <[email protected]>
[ Upstream commit 6722b23e7a2ace078344064a9735fb73e554e9ef ]
if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.
Without patch:
# dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger
dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset
n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
# Available triggers:
# traceon traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
6+1 records in
6+1 records out
206 bytes copied, 0.00027916 s, 738 kB/s
Notice the printing of "# Available triggers:..." after the line.
With the patch:
# dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger
dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset
n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist
2+1 records in
2+1 records out
88 bytes copied, 0.000526867 s, 167 kB/s
It only prints the end of the file, and does not restart.
Link: http://lkml.kernel.org/r/[email protected]
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/trace/trace_events_trigger.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index b05d1b6a62911..9300e8bbf08ac 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -115,9 +115,10 @@ static void *trigger_next(struct seq_file *m, void *t, loff_t *pos)
{
struct trace_event_file *event_file = event_file_data(m->private);
- if (t == SHOW_AVAILABLE_TRIGGERS)
+ if (t == SHOW_AVAILABLE_TRIGGERS) {
+ (*pos)++;
return NULL;
-
+ }
return seq_list_next(t, &event_file->triggers, pos);
}
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit 4e456fee215677584cafa7f67298a76917e89c64 ]
Clang warns:
../lib/scatterlist.c:314:5: warning: misleading indentation; statement
is not part of the previous 'if' [-Wmisleading-indentation]
return -ENOMEM;
^
../lib/scatterlist.c:311:4: note: previous statement is here
if (prv)
^
1 warning generated.
This warning occurs because there is a space before the tab on this
line. Remove it so that the indentation is consistent with the Linux
kernel coding style and clang no longer warns.
Link: http://lkml.kernel.org/r/[email protected]
Link: https://github.com/ClangBuiltLinux/linux/issues/830
Fixes: edce6820a9fd ("scatterlist: prevent invalid free when alloc fails")
Signed-off-by: Nathan Chancellor <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
lib/scatterlist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 8c3036c37ba0e..60e7eca2f4bed 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -305,7 +305,7 @@ int __sg_alloc_table(struct sg_table *table, unsigned int nents,
if (prv)
table->nents = ++table->orig_nents;
- return -ENOMEM;
+ return -ENOMEM;
}
sg_init_table(sg, alloc_size);
--
2.20.1
From: zhangyi (F) <[email protected]>
[ Upstream commit d0a186e0d3e7ac05cc77da7c157dae5aa59f95d9 ]
We invoke jbd2_journal_abort() to abort the journal and record errno
in the jbd2 superblock when committing journal transaction besides the
failure on submitting the commit record. But there is no need for the
case and we can also invoke jbd2_journal_abort() instead of
__jbd2_journal_abort_hard().
Fixes: 818d276ceb83a ("ext4: Add the journal checksum feature")
Signed-off-by: zhangyi (F) <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/jbd2/commit.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
index c321fa06081ce..4200a6fe9599c 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -781,7 +781,7 @@ start_journal_io:
err = journal_submit_commit_record(journal, commit_transaction,
&cbh, crc32_sum);
if (err)
- __jbd2_journal_abort_hard(journal);
+ jbd2_journal_abort(journal, err);
}
blk_finish_plug(&plug);
@@ -874,7 +874,7 @@ start_journal_io:
err = journal_submit_commit_record(journal, commit_transaction,
&cbh, crc32_sum);
if (err)
- __jbd2_journal_abort_hard(journal);
+ jbd2_journal_abort(journal, err);
}
if (cbh)
err = journal_wait_on_commit_record(journal, cbh);
--
2.20.1
From: Peter Große <[email protected]>
[ Upstream commit ef7d84caa5928b40b1c93a26dbe5a3f12737c6ab ]
Lenovo Thinkpad T420s uses the same codec as T420, so apply the
same quirk to enable audio output on a docking station.
Signed-off-by: Peter Große <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/pci/hda/patch_conexant.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index 5500dd437b44d..78bb96263bc27 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -935,6 +935,7 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
SND_PCI_QUIRK(0x17aa, 0x215f, "Lenovo T510", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21ce, "Lenovo T420", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21cf, "Lenovo T520", CXT_PINCFG_LENOVO_TP410),
+ SND_PCI_QUIRK(0x17aa, 0x21d2, "Lenovo T420s", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21da, "Lenovo X220", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x21db, "Lenovo X220-tablet", CXT_PINCFG_LENOVO_TP410),
SND_PCI_QUIRK(0x17aa, 0x38af, "Lenovo IdeaPad Z560", CXT_FIXUP_MUTE_LED_EAPD),
--
2.20.1
From: Ben Skeggs <[email protected]>
[ Upstream commit 35e4909b6a2b4005ced3c4238da60d926b78fdea ]
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/core/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/core/memory.c b/drivers/gpu/drm/nouveau/nvkm/core/memory.c
index e85a08ecd9da5..4cc186262d344 100644
--- a/drivers/gpu/drm/nouveau/nvkm/core/memory.c
+++ b/drivers/gpu/drm/nouveau/nvkm/core/memory.c
@@ -91,8 +91,8 @@ nvkm_memory_tags_get(struct nvkm_memory *memory, struct nvkm_device *device,
}
refcount_set(&tags->refcount, 1);
+ *ptags = memory->tags = tags;
mutex_unlock(&fb->subdev.mutex);
- *ptags = tags;
return 0;
}
--
2.20.1
From: Vincenzo Frascino <[email protected]>
[ Upstream commit 76950f7162cad51d2200ebd22c620c14af38f718 ]
To perform the reserve_crashkernel() operation kexec uses SECTION_SIZE to
find a memblock in a range.
SECTION_SIZE is not defined for nommu systems. Trying to compile kexec in
these conditions results in a build error:
linux/arch/arm/kernel/setup.c: In function ‘reserve_crashkernel’:
linux/arch/arm/kernel/setup.c:1016:25: error: ‘SECTION_SIZE’ undeclared
(first use in this function); did you mean ‘SECTIONS_WIDTH’?
crash_size, SECTION_SIZE);
^~~~~~~~~~~~
SECTIONS_WIDTH
linux/arch/arm/kernel/setup.c:1016:25: note: each undeclared identifier
is reported only once for each function it appears in
linux/scripts/Makefile.build:265: recipe for target 'arch/arm/kernel/setup.o'
failed
Make KEXEC depend on MMU to fix the compilation issue.
Signed-off-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Russell King <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 9f03befdfbf06..e2f7c50dbace5 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2008,7 +2008,7 @@ config XIP_DEFLATED_DATA
config KEXEC
bool "Kexec system call (EXPERIMENTAL)"
depends on (!SMP || PM_SLEEP_SMP)
- depends on !CPU_V7M
+ depends on MMU
select KEXEC_CORE
help
kexec is a system call that implements the ability to shutdown your
--
2.20.1
From: Kai Vehmanen <[email protected]>
[ Upstream commit 2928fa0a97ebb9549cb877fdc99aed9b95438c3a ]
The initial snd_hda_get_sub_node() can fail on certain
devices (e.g. some Chromebook models using Intel GLK).
The failure rate is very low, but as this is is part of
the probe process, end-user impact is high.
In observed cases, related hardware status registers have
expected values, but the node query still fails. Retrying
the node query does seem to help, so fix the problem by
adding retry logic to the query. This does not impact
non-Intel platforms.
BugLink: https://github.com/thesofproject/linux/issues/1642
Signed-off-by: Kai Vehmanen <[email protected]>
Reviewed-by: Takashi Iwai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/pci/hda/patch_hdmi.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/sound/pci/hda/patch_hdmi.c b/sound/pci/hda/patch_hdmi.c
index c827a2a89cc3d..c67fadd5aae53 100644
--- a/sound/pci/hda/patch_hdmi.c
+++ b/sound/pci/hda/patch_hdmi.c
@@ -2604,9 +2604,12 @@ static int alloc_intel_hdmi(struct hda_codec *codec)
/* parse and post-process for Intel codecs */
static int parse_intel_hdmi(struct hda_codec *codec)
{
- int err;
+ int err, retries = 3;
+
+ do {
+ err = hdmi_parse_codec(codec);
+ } while (err < 0 && retries--);
- err = hdmi_parse_codec(codec);
if (err < 0) {
generic_spec_free(codec);
return err;
--
2.20.1
From: Anand Jain <[email protected]>
[ Upstream commit a69976bc69308aa475d0ba3b8b3efd1d013c0460 ]
We had a report indicating that some read errors aren't reported by the
device stats in the userland. It is important to have the errors
reported in the device stat as user land scripts might depend on it to
take the reasonable corrective actions. But to debug these issue we need
to be really sure that request to reset the device stat did not come
from the userland itself. So log an info message when device error reset
happens.
For example:
BTRFS info (device sdc): device stats zeroed by btrfs(9223)
Reported-by: [email protected]
Link: https://www.spinics.net/lists/linux-btrfs/msg96528.html
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Anand Jain <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/volumes.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5bbcdcff68a9e..9c3b394b99fa2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7260,6 +7260,8 @@ int btrfs_get_dev_stats(struct btrfs_fs_info *fs_info,
else
btrfs_dev_stat_reset(dev, i);
}
+ btrfs_info(fs_info, "device stats zeroed by %s (%d)",
+ current->comm, task_pid_nr(current));
} else {
for (i = 0; i < BTRFS_DEV_STAT_VALUES_MAX; i++)
if (stats->nr_items > i)
--
2.20.1
From: Zenghui Yu <[email protected]>
[ Upstream commit 107945227ac5d4c37911c7841b27c64b489ce9a9 ]
It looks like an obvious mistake to use its_mapc_cmd descriptor when
building the INVALL command block. It so far worked by luck because
both its_mapc_cmd.col and its_invall_cmd.col sit at the same offset of
the ITS command descriptor, but we should not rely on it.
Fixes: cc2d3216f53c ("irqchip: GICv3: ITS command queue")
Signed-off-by: Zenghui Yu <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 050d6e040128d..bf7b69449b438 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -578,7 +578,7 @@ static struct its_collection *its_build_invall_cmd(struct its_node *its,
struct its_cmd_desc *desc)
{
its_encode_cmd(cmd, GITS_CMD_INVALL);
- its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
+ its_encode_collection(cmd, desc->its_invall_cmd.col->col_id);
its_fixup_cmd(cmd);
--
2.20.1
From: Jessica Yu <[email protected]>
[ Upstream commit 708e0ada1916be765b7faa58854062f2bc620bbf ]
In setup_load_info(), info->name (which contains the name of the module,
mostly used for early logging purposes before the module gets set up)
gets unconditionally assigned if .modinfo is missing despite the fact
that there is an if (!info->name) check near the end of the function.
Avoid assigning a placeholder string to info->name if .modinfo doesn't
exist, so that we can fall back to info->mod->name later on.
Fixes: 5fdc7db6448a ("module: setup load info before module_sig_check()")
Reviewed-by: Miroslav Benes <[email protected]>
Signed-off-by: Jessica Yu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/module.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/kernel/module.c b/kernel/module.c
index 70a75a7216abb..20fc0efc679c0 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2980,9 +2980,7 @@ static int setup_load_info(struct load_info *info, int flags)
/* Try to find a name early so we can log errors with a module name */
info->index.info = find_sec(info, ".modinfo");
- if (!info->index.info)
- info->name = "(missing .modinfo section)";
- else
+ if (info->index.info)
info->name = get_modinfo(info, "name");
/* Find internal symbols and strings. */
@@ -2997,14 +2995,15 @@ static int setup_load_info(struct load_info *info, int flags)
}
if (info->index.sym == 0) {
- pr_warn("%s: module has no symbols (stripped?)\n", info->name);
+ pr_warn("%s: module has no symbols (stripped?)\n",
+ info->name ?: "(missing .modinfo section or name field)");
return -ENOEXEC;
}
info->index.mod = find_sec(info, ".gnu.linkonce.this_module");
if (!info->index.mod) {
pr_warn("%s: No module found in object\n",
- info->name ?: "(missing .modinfo name field)");
+ info->name ?: "(missing .modinfo section or name field)");
return -ENOEXEC;
}
/* This is temporary: point mod into copy of data. */
--
2.20.1
From: Daniel Vetter <[email protected]>
[ Upstream commit ec3d65082d7dabad6fa8f66a8ef166f2d522d6b2 ]
Per at least one tester this is enough magic to recover the regression
introduced for some people (but not all) in
commit b8e2b0199cc377617dc238f5106352c06dcd3fa2
Author: Peter Rosin <[email protected]>
Date: Tue Jul 4 12:36:57 2017 +0200
drm/fb-helper: factor out pseudo-palette
which for radeon had the side-effect of refactoring out a seemingly
redudant writing of the color palette.
10ms in a fairly slow modeset path feels like an acceptable form of
duct-tape, so maybe worth a shot and see what sticks.
Cc: Alex Deucher <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/radeon/radeon_display.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index d8e2d7b3b8365..7d1e14f0140a2 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -121,6 +121,8 @@ static void dce5_crtc_load_lut(struct drm_crtc *crtc)
DRM_DEBUG_KMS("%d\n", radeon_crtc->crtc_id);
+ msleep(10);
+
WREG32(NI_INPUT_CSC_CONTROL + radeon_crtc->crtc_offset,
(NI_INPUT_CSC_GRPH_MODE(NI_INPUT_CSC_BYPASS) |
NI_INPUT_CSC_OVL_MODE(NI_INPUT_CSC_BYPASS)));
--
2.20.1
From: Dan Carpenter <[email protected]>
[ Upstream commit 3613a9bea95a1470dd42e4ed1cc7d86ebe0a2dc0 ]
We accidentally set "psb" which is a no-op instead of "*psb" so it
generates a static checker warning. We should probably set it before
the first error return so that it's always initialized.
Fixes: 923f1bd27bf1 ("drm/nouveau/secboot/gm20b: add secure boot support")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c
index df8b919dcf09b..ace6fefba4280 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/secboot/gm20b.c
@@ -108,6 +108,7 @@ gm20b_secboot_new(struct nvkm_device *device, int index,
struct gm200_secboot *gsb;
struct nvkm_acr *acr;
+ *psb = NULL;
acr = acr_r352_new(BIT(NVKM_SECBOOT_FALCON_FECS) |
BIT(NVKM_SECBOOT_FALCON_PMU));
if (IS_ERR(acr))
@@ -116,10 +117,8 @@ gm20b_secboot_new(struct nvkm_device *device, int index,
acr->optional_falcons = BIT(NVKM_SECBOOT_FALCON_PMU);
gsb = kzalloc(sizeof(*gsb), GFP_KERNEL);
- if (!gsb) {
- psb = NULL;
+ if (!gsb)
return -ENOMEM;
- }
*psb = &gsb->base;
ret = nvkm_secboot_ctor(&gm20b_secboot, acr, device, index, &gsb->base);
--
2.20.1
From: Jiewei Ke <[email protected]>
[ Upstream commit 6ca18d8927d468c763571f78c9a7387a69ffa020 ]
The type of mmap_offset should be u64 instead of int to match the type of
mminfo.offset. If otherwise, after we create several thousands of CQs, it
will run into overflow issues.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jiewei Ke <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/sw/rxe/rxe_verbs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 6a75f96b90962..b4e24362edbb0 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -407,7 +407,7 @@ struct rxe_dev {
struct list_head pending_mmaps;
spinlock_t mmap_offset_lock; /* guard mmap_offset */
- int mmap_offset;
+ u64 mmap_offset;
atomic64_t stats_counters[RXE_NUM_OF_COUNTERS];
--
2.20.1
From: Li RongQing <[email protected]>
[ Upstream commit 0a29275b6300f39f78a87f2038bbfe5bdbaeca47 ]
A negative value should be returned if map->map_type is invalid
although that is impossible now, but if we run into such situation
in future, then xdpbuff could be leaked.
Daniel Borkmann suggested:
-EBADRQC should be returned to stay consistent with generic XDP
for the tracepoint output and not to be confused with -EOPNOTSUPP
from other locations like dev_map_enqueue() when ndo_xdp_xmit is
missing and such.
Suggested-by: Daniel Borkmann <[email protected]>
Signed-off-by: Li RongQing <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
net/core/filter.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 9daf1a4118b51..40b3af05c883c 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3207,7 +3207,7 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
return err;
}
default:
- break;
+ return -EBADRQC;
}
return 0;
}
--
2.20.1
From: Vasily Gorbik <[email protected]>
[ Upstream commit 45f7a0da600d3c409b5ad8d5ddddacd98ddc8840 ]
Currently backtrace from ftraced function does not contain ftraced
function itself. e.g. for "path_openat":
arch_stack_walk+0x15c/0x2d8
stack_trace_save+0x50/0x68
stack_trace_call+0x15e/0x3d8
ftrace_graph_caller+0x0/0x1c <-- ftrace code
do_filp_open+0x7c/0xe8 <-- ftraced function caller
do_open_execat+0x76/0x1b8
open_exec+0x52/0x78
load_elf_binary+0x180/0x1160
search_binary_handler+0x8e/0x288
load_script+0x2a8/0x2b8
search_binary_handler+0x8e/0x288
__do_execve_file.isra.39+0x6fa/0xb40
__s390x_sys_execve+0x56/0x68
system_call+0xdc/0x2d8
Ftraced function is expected in the backtrace by ftrace kselftests, which
are now failing. It would also be nice to have it for clarity reasons.
"ftrace_caller" itself is called without stack frame allocated for it
and does not store its caller (ftraced function). Instead it simply
allocates a stack frame for "ftrace_trace_function" and sets backchain
to point to ftraced function stack frame (which contains ftraced function
caller in saved r14).
To fix this issue make "ftrace_caller" allocate a stack frame
for itself just to store ftraced function for the stack unwinder.
As a result backtrace looks like the following:
arch_stack_walk+0x15c/0x2d8
stack_trace_save+0x50/0x68
stack_trace_call+0x15e/0x3d8
ftrace_graph_caller+0x0/0x1c <-- ftrace code
path_openat+0x6/0xd60 <-- ftraced function
do_filp_open+0x7c/0xe8 <-- ftraced function caller
do_open_execat+0x76/0x1b8
open_exec+0x52/0x78
load_elf_binary+0x180/0x1160
search_binary_handler+0x8e/0x288
load_script+0x2a8/0x2b8
search_binary_handler+0x8e/0x288
__do_execve_file.isra.39+0x6fa/0xb40
__s390x_sys_execve+0x56/0x68
system_call+0xdc/0x2d8
Reported-by: Sven Schnelle <[email protected]>
Tested-by: Sven Schnelle <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/kernel/mcount.S | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
index e93fbf02490cf..83afd5b78e16a 100644
--- a/arch/s390/kernel/mcount.S
+++ b/arch/s390/kernel/mcount.S
@@ -25,6 +25,12 @@ ENTRY(ftrace_stub)
#define STACK_PTREGS (STACK_FRAME_OVERHEAD)
#define STACK_PTREGS_GPRS (STACK_PTREGS + __PT_GPRS)
#define STACK_PTREGS_PSW (STACK_PTREGS + __PT_PSW)
+#ifdef __PACK_STACK
+/* allocate just enough for r14, r15 and backchain */
+#define TRACED_FUNC_FRAME_SIZE 24
+#else
+#define TRACED_FUNC_FRAME_SIZE STACK_FRAME_OVERHEAD
+#endif
ENTRY(_mcount)
BR_EX %r14
@@ -38,9 +44,16 @@ ENTRY(ftrace_caller)
#if !(defined(CC_USING_HOTPATCH) || defined(CC_USING_NOP_MCOUNT))
aghi %r0,MCOUNT_RETURN_FIXUP
#endif
- aghi %r15,-STACK_FRAME_SIZE
+ # allocate stack frame for ftrace_caller to contain traced function
+ aghi %r15,-TRACED_FUNC_FRAME_SIZE
stg %r1,__SF_BACKCHAIN(%r15)
+ stg %r0,(__SF_GPRS+8*8)(%r15)
+ stg %r15,(__SF_GPRS+9*8)(%r15)
+ # allocate pt_regs and stack frame for ftrace_trace_function
+ aghi %r15,-STACK_FRAME_SIZE
stg %r1,(STACK_PTREGS_GPRS+15*8)(%r15)
+ aghi %r1,-TRACED_FUNC_FRAME_SIZE
+ stg %r1,__SF_BACKCHAIN(%r15)
stg %r0,(STACK_PTREGS_PSW+8)(%r15)
stmg %r2,%r14,(STACK_PTREGS_GPRS+2*8)(%r15)
#ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
--
2.20.1
From: Simon Schwartz <[email protected]>
[ Upstream commit 39cc539f90d035a293240c9443af50be55ee81b8 ]
num_resources in the platform_device struct is declared as a u32. The
for loops that iterate over num_resources use an int as the counter,
which can cause infinite loops on architectures with smaller ints.
Change the loop counters to u32.
Signed-off-by: Simon Schwartz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/base/platform.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index e9be1f56929af..1d3a50ac21664 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -27,6 +27,7 @@
#include <linux/limits.h>
#include <linux/property.h>
#include <linux/kmemleak.h>
+#include <linux/types.h>
#include "base.h"
#include "power/power.h"
@@ -67,7 +68,7 @@ void __weak arch_setup_pdev_archdata(struct platform_device *pdev)
struct resource *platform_get_resource(struct platform_device *dev,
unsigned int type, unsigned int num)
{
- int i;
+ u32 i;
for (i = 0; i < dev->num_resources; i++) {
struct resource *r = &dev->resource[i];
@@ -162,7 +163,7 @@ struct resource *platform_get_resource_byname(struct platform_device *dev,
unsigned int type,
const char *name)
{
- int i;
+ u32 i;
for (i = 0; i < dev->num_resources; i++) {
struct resource *r = &dev->resource[i];
@@ -359,7 +360,8 @@ EXPORT_SYMBOL_GPL(platform_device_add_properties);
*/
int platform_device_add(struct platform_device *pdev)
{
- int i, ret;
+ u32 i;
+ int ret;
if (!pdev)
return -EINVAL;
@@ -446,7 +448,7 @@ EXPORT_SYMBOL_GPL(platform_device_add);
*/
void platform_device_del(struct platform_device *pdev)
{
- int i;
+ u32 i;
if (pdev) {
device_remove_properties(&pdev->dev);
--
2.20.1
From: Masami Hiramatsu <[email protected]>
[ Upstream commit 8b7e20a7ba54836076ff35a28349dabea4cec48f ]
Add TEST opcode to Group3-2 reg=001b as same as Group3-1 does.
Commit
12a78d43de76 ("x86/decoder: Add new TEST instruction pattern")
added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did
not add f7 XX/001/XXX (Group 3-2).
Actually, this TEST opcode variant (ModRM.reg /1) is not described in
the Intel SDM Vol2 but in AMD64 Architecture Programmer's Manual Vol.3,
Appendix A.2 Table A-6. ModRM.reg Extensions for the Primary Opcode Map.
Without this fix, Randy found a warning by insn_decoder_test related
to this issue as below.
HOSTCC arch/x86/tools/insn_decoder_test
HOSTCC arch/x86/tools/insn_sanity
TEST posttest
arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx)
arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2
arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures
TEST posttest
arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c)
To fix this error, add the TEST opcode according to AMD64 APM Vol.3.
[ bp: Massage commit message. ]
Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Tested-by: Randy Dunlap <[email protected]>
Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/lib/x86-opcode-map.txt | 2 +-
tools/objtool/arch/x86/lib/x86-opcode-map.txt | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index 0a0e9112f2842..5cb9f009f2be3 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -909,7 +909,7 @@ EndTable
GrpTable: Grp3_2
0: TEST Ev,Iz
-1:
+1: TEST Ev,Iz
2: NOT Ev
3: NEG Ev
4: MUL rAX,Ev
diff --git a/tools/objtool/arch/x86/lib/x86-opcode-map.txt b/tools/objtool/arch/x86/lib/x86-opcode-map.txt
index 0a0e9112f2842..5cb9f009f2be3 100644
--- a/tools/objtool/arch/x86/lib/x86-opcode-map.txt
+++ b/tools/objtool/arch/x86/lib/x86-opcode-map.txt
@@ -909,7 +909,7 @@ EndTable
GrpTable: Grp3_2
0: TEST Ev,Iz
-1:
+1: TEST Ev,Iz
2: NOT Ev
3: NEG Ev
4: MUL rAX,Ev
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit caf82f727e69b647f09d57a1fc56e69d22a5f483 ]
The setup_crash_devices_work_queue function only partially initializes
the message it sends to chipset_init, leading to undefined behavior:
drivers/visorbus/visorchipset.c: In function 'setup_crash_devices_work_queue':
drivers/visorbus/visorchipset.c:333:6: error: '((unsigned char*)&msg.hdr.flags)[0]' is used uninitialized in this function [-Werror=uninitialized]
if (inmsg->hdr.flags.response_expected)
Set up the entire structure, zero-initializing the 'response_expected'
flag.
This was apparently found by the patch that added the -O3 build option
in Kconfig.
Fixes: 12e364b9f08a ("staging: visorchipset driver to provide registration and other services")
Signed-off-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/visorbus/visorchipset.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/visorbus/visorchipset.c b/drivers/visorbus/visorchipset.c
index ca752b8f495fa..cb1eb7e05f871 100644
--- a/drivers/visorbus/visorchipset.c
+++ b/drivers/visorbus/visorchipset.c
@@ -1210,14 +1210,17 @@ static void setup_crash_devices_work_queue(struct work_struct *work)
{
struct controlvm_message local_crash_bus_msg;
struct controlvm_message local_crash_dev_msg;
- struct controlvm_message msg;
+ struct controlvm_message msg = {
+ .hdr.id = CONTROLVM_CHIPSET_INIT,
+ .cmd.init_chipset = {
+ .bus_count = 23,
+ .switch_count = 0,
+ },
+ };
u32 local_crash_msg_offset;
u16 local_crash_msg_count;
/* send init chipset msg */
- msg.hdr.id = CONTROLVM_CHIPSET_INIT;
- msg.cmd.init_chipset.bus_count = 23;
- msg.cmd.init_chipset.switch_count = 0;
chipset_init(&msg);
/* get saved message count */
if (visorchannel_read(chipset_dev->controlvm_channel,
--
2.20.1
From: Nathan Chancellor <[email protected]>
[ Upstream commit 446e76873b5e4e70bdee5db2f2a894d5b4a7d081 ]
Clang warns:
../drivers/tty/synclink_gt.c:1337:3: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
if (C_CRTSCTS(tty)) {
^
../drivers/tty/synclink_gt.c:1335:2: note: previous statement is here
if (I_IXOFF(tty))
^
../drivers/tty/synclink_gt.c:2563:3: warning: misleading indentation;
statement is not part of the previous 'if' [-Wmisleading-indentation]
if (I_BRKINT(info->port.tty) || I_PARMRK(info->port.tty))
^
../drivers/tty/synclink_gt.c:2561:2: note: previous statement is here
if (I_INPCK(info->port.tty))
^
../drivers/tty/synclink_gt.c:3221:3: warning: misleading indentation;
statement is not part of the previous 'else' [-Wmisleading-indentation]
set_signals(info);
^
../drivers/tty/synclink_gt.c:3219:2: note: previous statement is here
else
^
3 warnings generated.
The indentation on these lines is not at all consistent, tabs and spaces
are mixed together. Convert to just using tabs to be consistent with the
Linux kernel coding style and eliminate these warnings from clang.
Link: https://github.com/ClangBuiltLinux/linux/issues/822
Signed-off-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/tty/synclink_gt.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/tty/synclink_gt.c b/drivers/tty/synclink_gt.c
index b88ecf102764e..e9779b03ee565 100644
--- a/drivers/tty/synclink_gt.c
+++ b/drivers/tty/synclink_gt.c
@@ -1335,10 +1335,10 @@ static void throttle(struct tty_struct * tty)
DBGINFO(("%s throttle\n", info->device_name));
if (I_IXOFF(tty))
send_xchar(tty, STOP_CHAR(tty));
- if (C_CRTSCTS(tty)) {
+ if (C_CRTSCTS(tty)) {
spin_lock_irqsave(&info->lock,flags);
info->signals &= ~SerialSignal_RTS;
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
}
}
@@ -1360,10 +1360,10 @@ static void unthrottle(struct tty_struct * tty)
else
send_xchar(tty, START_CHAR(tty));
}
- if (C_CRTSCTS(tty)) {
+ if (C_CRTSCTS(tty)) {
spin_lock_irqsave(&info->lock,flags);
info->signals |= SerialSignal_RTS;
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
}
}
@@ -2561,8 +2561,8 @@ static void change_params(struct slgt_info *info)
info->read_status_mask = IRQ_RXOVER;
if (I_INPCK(info->port.tty))
info->read_status_mask |= MASK_PARITY | MASK_FRAMING;
- if (I_BRKINT(info->port.tty) || I_PARMRK(info->port.tty))
- info->read_status_mask |= MASK_BREAK;
+ if (I_BRKINT(info->port.tty) || I_PARMRK(info->port.tty))
+ info->read_status_mask |= MASK_BREAK;
if (I_IGNPAR(info->port.tty))
info->ignore_status_mask |= MASK_PARITY | MASK_FRAMING;
if (I_IGNBRK(info->port.tty)) {
@@ -3193,7 +3193,7 @@ static int tiocmset(struct tty_struct *tty,
info->signals &= ~SerialSignal_DTR;
spin_lock_irqsave(&info->lock,flags);
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
return 0;
}
@@ -3204,7 +3204,7 @@ static int carrier_raised(struct tty_port *port)
struct slgt_info *info = container_of(port, struct slgt_info, port);
spin_lock_irqsave(&info->lock,flags);
- get_signals(info);
+ get_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
return (info->signals & SerialSignal_DCD) ? 1 : 0;
}
@@ -3219,7 +3219,7 @@ static void dtr_rts(struct tty_port *port, int on)
info->signals |= SerialSignal_RTS | SerialSignal_DTR;
else
info->signals &= ~(SerialSignal_RTS | SerialSignal_DTR);
- set_signals(info);
+ set_signals(info);
spin_unlock_irqrestore(&info->lock,flags);
}
--
2.20.1
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit 7c35e699c88bd60734277b26962783c60e04b494 ]
If a device already has devres items attached before probing, a warning
backtrace is printed. However, this backtrace does not reveal the
offending device, leaving the user uninformed. Furthermore, using
WARN_ON() causes systems with panic-on-warn to reboot.
Fix this by replacing the WARN_ON() by a dev_crit() message.
Abort probing the device, to prevent doing more damage to the device's
resources.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/base/dd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 11d24a552ee49..5f6416e6ba96b 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -470,7 +470,10 @@ static int really_probe(struct device *dev, struct device_driver *drv)
atomic_inc(&probe_count);
pr_debug("bus: '%s': %s: probing driver %s with device %s\n",
drv->bus->name, __func__, drv->name, dev_name(dev));
- WARN_ON(!list_empty(&dev->devres_head));
+ if (!list_empty(&dev->devres_head)) {
+ dev_crit(dev, "Resources present before probing\n");
+ return -EBUSY;
+ }
re_probe:
dev->driver = drv;
--
2.20.1
From: Johannes Thumshirn <[email protected]>
[ Upstream commit 3dbd351df42109902fbcebf27104149226a4fcd9 ]
A user reports a possible NULL-pointer dereference in
btrfsic_process_superblock(). We are assigning state->fs_info to a local
fs_info variable and afterwards checking for the presence of state.
While we would BUG_ON() a NULL state anyways, we can also just remove
the local fs_info copy, as fs_info is only used once as the first
argument for btrfs_num_copies(). There we can just pass in
state->fs_info as well.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205003
Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/btrfs/check-integrity.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index 833cf3c35b4df..3b77c8ab5357e 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -629,7 +629,6 @@ static struct btrfsic_dev_state *btrfsic_dev_state_hashtable_lookup(dev_t dev,
static int btrfsic_process_superblock(struct btrfsic_state *state,
struct btrfs_fs_devices *fs_devices)
{
- struct btrfs_fs_info *fs_info = state->fs_info;
struct btrfs_super_block *selected_super;
struct list_head *dev_head = &fs_devices->devices;
struct btrfs_device *device;
@@ -700,7 +699,7 @@ static int btrfsic_process_superblock(struct btrfsic_state *state,
break;
}
- num_copies = btrfs_num_copies(fs_info, next_bytenr,
+ num_copies = btrfs_num_copies(state->fs_info, next_bytenr,
state->metablock_size);
if (state->print_mask & BTRFSIC_PRINT_MASK_NUM_COPIES)
pr_info("num_copies(log_bytenr=%llu) = %d\n",
--
2.20.1
From: Philipp Zabel <[email protected]>
[ Upstream commit e112324cc0422c046f1cf54c56f333d34fa20885 ]
The EP0700MLP1 returns bogus data on the first register read access
(reading the threshold parameter from register 0x00):
edt_ft5x06 2-0038: crc error: 0xfc expected, got 0x40
It ignores writes until then. This patch adds a dummy read after which
the number of sensors and parameter read/writes work correctly.
Signed-off-by: Philipp Zabel <[email protected]>
Signed-off-by: Marco Felsch <[email protected]>
Tested-by: Andy Shevchenko <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/input/touchscreen/edt-ft5x06.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/input/touchscreen/edt-ft5x06.c b/drivers/input/touchscreen/edt-ft5x06.c
index 1e18ca0d1b4e1..3fdaa644a82c1 100644
--- a/drivers/input/touchscreen/edt-ft5x06.c
+++ b/drivers/input/touchscreen/edt-ft5x06.c
@@ -968,6 +968,7 @@ static int edt_ft5x06_ts_probe(struct i2c_client *client,
{
const struct edt_i2c_chip_data *chip_data;
struct edt_ft5x06_ts_data *tsdata;
+ u8 buf[2] = { 0xfc, 0x00 };
struct input_dev *input;
unsigned long irq_flags;
int error;
@@ -1037,6 +1038,12 @@ static int edt_ft5x06_ts_probe(struct i2c_client *client,
return error;
}
+ /*
+ * Dummy read access. EP0700MLP1 returns bogus data on the first
+ * register read access and ignores writes.
+ */
+ edt_ft5x06_ts_readwrite(tsdata->client, 2, buf, 2, buf);
+
edt_ft5x06_ts_set_regs(tsdata);
edt_ft5x06_ts_get_defaults(&client->dev, tsdata);
edt_ft5x06_ts_get_parameters(tsdata);
--
2.20.1
From: Benjamin Gaignard <[email protected]>
[ Upstream commit 0ff15a86d0c5a3f004fee2e92d65b88e56a3bc58 ]
Add a fixed regulator and use it as power supply for DSI panel.
Fixes: 18c8866266 ("ARM: dts: stm32: Add display support on stm32f469-disco")
Signed-off-by: Benjamin Gaignard <[email protected]>
Signed-off-by: Alexandre Torgue <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/boot/dts/stm32f469-disco.dts | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm/boot/dts/stm32f469-disco.dts b/arch/arm/boot/dts/stm32f469-disco.dts
index 3ee768cb86fc9..eea979ef5512f 100644
--- a/arch/arm/boot/dts/stm32f469-disco.dts
+++ b/arch/arm/boot/dts/stm32f469-disco.dts
@@ -75,6 +75,13 @@
regulator-max-microvolt = <3300000>;
};
+ vdd_dsi: vdd-dsi {
+ compatible = "regulator-fixed";
+ regulator-name = "vdd_dsi";
+ regulator-min-microvolt = <3300000>;
+ regulator-max-microvolt = <3300000>;
+ };
+
soc {
dma-ranges = <0xc0000000 0x0 0x10000000>;
};
@@ -154,6 +161,7 @@
compatible = "orisetech,otm8009a";
reg = <0>; /* dsi virtual channel (0..3) */
reset-gpios = <&gpioh 7 GPIO_ACTIVE_LOW>;
+ power-supply = <&vdd_dsi>;
status = "okay";
port {
--
2.20.1
From: Shile Zhang <[email protected]>
[ Upstream commit 22a7fa8848c5e881d87ef2f7f3c2ea77b286e6f9 ]
To fix follwowing warning due to ORC sort moved to build time:
arch/x86/kernel/unwind_orc.c:210:12: warning: ‘orc_sort_cmp’ defined but not used [-Wunused-function]
arch/x86/kernel/unwind_orc.c:190:13: warning: ‘orc_sort_swap’ defined but not used [-Wunused-function]
Signed-off-by: Shile Zhang <[email protected]>
Reported-by: Stephen Rothwell <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kernel/unwind_orc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 89be1be1790c4..a3cb70fbe941a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -176,6 +176,8 @@ static struct orc_entry *orc_find(unsigned long ip)
return orc_ftrace_find(ip);
}
+#ifdef CONFIG_MODULES
+
static void orc_sort_swap(void *_a, void *_b, int size)
{
struct orc_entry *orc_a, *orc_b;
@@ -218,7 +220,6 @@ static int orc_sort_cmp(const void *_a, const void *_b)
return orc_a->sp_reg == ORC_REG_UNDEFINED && !orc_a->end ? -1 : 1;
}
-#ifdef CONFIG_MODULES
void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
void *_orc, size_t orc_size)
{
--
2.20.1
From: Jason Ekstrand <[email protected]>
[ Upstream commit 0528904926aab19bffb2068879aa44db166c6d5f ]
Running evemu-record on the lid switch event shows that the lid reports
the first "close" but then never reports an "open". This causes systemd
to continuously re-suspend the laptop every 30s. Resetting the _LID to
"open" fixes the issue.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/acpi/button.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index a25d77b3a16ad..d5c19e25ddf59 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -102,6 +102,17 @@ static const struct dmi_system_id lid_blacklst[] = {
},
.driver_data = (void *)(long)ACPI_BUTTON_LID_INIT_OPEN,
},
+ {
+ /*
+ * Razer Blade Stealth 13 late 2019, notification of the LID device
+ * only happens on close, not on open and _LID always returns closed.
+ */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Razer"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "Razer Blade Stealth 13 Late 2019"),
+ },
+ .driver_data = (void *)(long)ACPI_BUTTON_LID_INIT_OPEN,
+ },
{}
};
--
2.20.1
From: Logan Gunthorpe <[email protected]>
[ Upstream commit dae7a589c18a4d979d5f14b09374e871b995ceb1 ]
dma_chan_to_owner() dereferences the driver from the struct device to
obtain the owner and call module_[get|put](). However, if the backing
device is unbound before the dma_device is unregistered, the driver
will be cleared and this will cause a NULL pointer dereference.
Instead, store a pointer to the owner module in the dma_device struct
so the module reference can be properly put when the channel is put, even
if the backing device was destroyed first.
This change helps to support a safer unbind of DMA engines.
If the dma_device is unregistered in the driver's remove function,
there's no guarantee that there are no existing clients and a users
action may trigger the WARN_ONCE in dma_async_device_unregister()
which is unlikely to leave the system in a consistent state.
Instead, a better approach is to allow the backing driver to go away
and fail any subsequent requests to it.
Signed-off-by: Logan Gunthorpe <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/dma/dmaengine.c | 4 +++-
include/linux/dmaengine.h | 2 ++
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index f1a441ab395d7..8a52a5efee4f5 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -190,7 +190,7 @@ __dma_device_satisfies_mask(struct dma_device *device,
static struct module *dma_chan_to_owner(struct dma_chan *chan)
{
- return chan->device->dev->driver->owner;
+ return chan->device->owner;
}
/**
@@ -923,6 +923,8 @@ int dma_async_device_register(struct dma_device *device)
return -EIO;
}
+ device->owner = device->dev->driver->owner;
+
if (dma_has_cap(DMA_MEMCPY, device->cap_mask) && !device->device_prep_dma_memcpy) {
dev_err(device->dev,
"Device claims capability %s, but op is not defined\n",
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 0647f436f88c2..50128c36f0b48 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -686,6 +686,7 @@ struct dma_filter {
* @fill_align: alignment shift for memset operations
* @dev_id: unique device ID
* @dev: struct device reference for dma mapping api
+ * @owner: owner module (automatically set based on the provided dev)
* @src_addr_widths: bit mask of src addr widths the device supports
* Width is specified in bytes, e.g. for a device supporting
* a width of 4 the mask should have BIT(4) set.
@@ -749,6 +750,7 @@ struct dma_device {
int dev_id;
struct device *dev;
+ struct module *owner;
u32 src_addr_widths;
u32 dst_addr_widths;
--
2.20.1
From: Miquel Raynal <[email protected]>
[ Upstream commit b8a039d37792067c1a380dc710361905724b9b2f ]
RK808 can leverage a couple of GPIOs to tweak the ramp rate during DVS
(Dynamic Voltage Scaling). These GPIOs are entirely optional but a
dev_warn() appeared when cleaning this driver to use a more up-to-date
gpiod API. At least reduce the log level to 'info' as it is totally
fine to not populate these GPIO on a hardware design.
This change is trivial but it is worth not polluting the logs during
bringup phase by having real warnings and errors sorted out
correctly.
Fixes: a13eaf02e2d6 ("regulator: rk808: make better use of the gpiod API")
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/regulator/rk808-regulator.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/regulator/rk808-regulator.c b/drivers/regulator/rk808-regulator.c
index 213b68743cc89..92498ac503035 100644
--- a/drivers/regulator/rk808-regulator.c
+++ b/drivers/regulator/rk808-regulator.c
@@ -714,7 +714,7 @@ static int rk808_regulator_dt_parse_pdata(struct device *dev,
}
if (!pdata->dvs_gpio[i]) {
- dev_warn(dev, "there is no dvs%d gpio\n", i);
+ dev_info(dev, "there is no dvs%d gpio\n", i);
continue;
}
--
2.20.1
From: Kunihiko Hayashi <[email protected]>
[ Upstream commit f4aec227e985e31d2fdc5608daf48e3de19157b7 ]
SCSSI has reset controls for each channel in the SoCs newer than Pro4,
so this adds missing reset controls for channel 1, 2 and 3. And more, this
moves MCSSI reset ID after SCSSI.
Fixes: 6b39fd590aeb ("reset: uniphier: add reset control support for SPI")
Signed-off-by: Kunihiko Hayashi <[email protected]>
Acked-by: Masahiro Yamada <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/reset/reset-uniphier.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/reset/reset-uniphier.c b/drivers/reset/reset-uniphier.c
index 5605745663ae0..adbecb2c7cd3a 100644
--- a/drivers/reset/reset-uniphier.c
+++ b/drivers/reset/reset-uniphier.c
@@ -202,8 +202,8 @@ static const struct uniphier_reset_data uniphier_pro5_sd_reset_data[] = {
#define UNIPHIER_PERI_RESET_FI2C(id, ch) \
UNIPHIER_RESETX((id), 0x114, 24 + (ch))
-#define UNIPHIER_PERI_RESET_SCSSI(id) \
- UNIPHIER_RESETX((id), 0x110, 17)
+#define UNIPHIER_PERI_RESET_SCSSI(id, ch) \
+ UNIPHIER_RESETX((id), 0x110, 17 + (ch))
#define UNIPHIER_PERI_RESET_MCSSI(id) \
UNIPHIER_RESETX((id), 0x114, 14)
@@ -218,7 +218,7 @@ static const struct uniphier_reset_data uniphier_ld4_peri_reset_data[] = {
UNIPHIER_PERI_RESET_I2C(6, 2),
UNIPHIER_PERI_RESET_I2C(7, 3),
UNIPHIER_PERI_RESET_I2C(8, 4),
- UNIPHIER_PERI_RESET_SCSSI(11),
+ UNIPHIER_PERI_RESET_SCSSI(11, 0),
UNIPHIER_RESET_END,
};
@@ -234,8 +234,11 @@ static const struct uniphier_reset_data uniphier_pro4_peri_reset_data[] = {
UNIPHIER_PERI_RESET_FI2C(8, 4),
UNIPHIER_PERI_RESET_FI2C(9, 5),
UNIPHIER_PERI_RESET_FI2C(10, 6),
- UNIPHIER_PERI_RESET_SCSSI(11),
- UNIPHIER_PERI_RESET_MCSSI(12),
+ UNIPHIER_PERI_RESET_SCSSI(11, 0),
+ UNIPHIER_PERI_RESET_SCSSI(12, 1),
+ UNIPHIER_PERI_RESET_SCSSI(13, 2),
+ UNIPHIER_PERI_RESET_SCSSI(14, 3),
+ UNIPHIER_PERI_RESET_MCSSI(15),
UNIPHIER_RESET_END,
};
--
2.20.1
From: Takashi Iwai <[email protected]>
[ Upstream commit f1dd4795b1523fbca7ab4344dd5a8bb439cc770d ]
A long-standing compile warning was seen during build test:
sound/sh/aica.c: In function 'load_aica_firmware':
sound/sh/aica.c:521:25: warning: passing argument 2 of 'spu_memload' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
Fixes: 198de43d758c ("[ALSA] Add ALSA support for the SEGA Dreamcast PCM device")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/sh/aica.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/sh/aica.c b/sound/sh/aica.c
index ad3f71358486a..69ac44b335602 100644
--- a/sound/sh/aica.c
+++ b/sound/sh/aica.c
@@ -117,10 +117,10 @@ static void spu_memset(u32 toi, u32 what, int length)
}
/* spu_memload - write to SPU address space */
-static void spu_memload(u32 toi, void *from, int length)
+static void spu_memload(u32 toi, const void *from, int length)
{
unsigned long flags;
- u32 *froml = from;
+ const u32 *froml = from;
u32 __iomem *to = (u32 __iomem *) (SPU_MEMORY_BASE + toi);
int i;
u32 val;
--
2.20.1
From: YueHaibing <[email protected]>
[ Upstream commit 1eb013473bff5f95b6fe1ca4dd7deda47257b9c2 ]
Like other cases, it should use rcu protected 'chan' rather
than 'fence->channel' in nouveau_fence_wait_uevent_handler.
Fixes: 0ec5f02f0e2c ("drm/nouveau: prevent stale fence->channel pointers, and protect with rcu")
Signed-off-by: YueHaibing <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 412d49bc6e560..ba3883aed4567 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -157,7 +157,7 @@ nouveau_fence_wait_uevent_handler(struct nvif_notify *notify)
fence = list_entry(fctx->pending.next, typeof(*fence), head);
chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock));
- if (nouveau_fence_update(fence->channel, fctx))
+ if (nouveau_fence_update(chan, fctx))
ret = NVIF_NOTIFY_DROP;
}
spin_unlock_irqrestore(&fctx->lock, flags);
--
2.20.1
From: Can Guo <[email protected]>
[ Upstream commit 2df74b6985b51e77756e2e8faa16c45ca3ba53c5 ]
In UFS host reset and restore path, before probe, we stop and start the
host controller once. After host controller is stopped, the pending
requests, if any, are cleared from the doorbell, but no completion IRQ
would be raised due to the hba is stopped. These pending requests shall be
completed along with the first NOP_OUT command (as it is the first command
which can raise a transfer completion IRQ) sent during probe. Since the
OCSs of these pending requests are not SUCCESS (because they are not yet
literally finished), their UPIUs shall be dumped. When there are multiple
pending requests, the UPIU dump can be overwhelming and may lead to
stability issues because it is in atomic context. Therefore, before probe,
complete these pending requests right after host controller is stopped and
silence the UPIU dump from them.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Alim Akhtar <[email protected]>
Reviewed-by: Bean Huo <[email protected]>
Tested-by: Bean Huo <[email protected]>
Signed-off-by: Can Guo <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/ufs/ufshcd.c | 24 ++++++++++--------------
drivers/scsi/ufs/ufshcd.h | 2 ++
2 files changed, 12 insertions(+), 14 deletions(-)
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index f4fcaee41dc26..b3dee24917a86 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -4809,7 +4809,7 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
break;
} /* end of switch */
- if (host_byte(result) != DID_OK)
+ if ((host_byte(result) != DID_OK) && !hba->silence_err_logs)
ufshcd_print_trs(hba, 1 << lrbp->task_tag, true);
return result;
}
@@ -5341,8 +5341,8 @@ static void ufshcd_err_handler(struct work_struct *work)
/*
* if host reset is required then skip clearing the pending
- * transfers forcefully because they will automatically get
- * cleared after link startup.
+ * transfers forcefully because they will get cleared during
+ * host reset and restore
*/
if (needs_reset)
goto skip_pending_xfer_clear;
@@ -5996,9 +5996,15 @@ static int ufshcd_host_reset_and_restore(struct ufs_hba *hba)
int err;
unsigned long flags;
- /* Reset the host controller */
+ /*
+ * Stop the host controller and complete the requests
+ * cleared by h/w
+ */
spin_lock_irqsave(hba->host->host_lock, flags);
ufshcd_hba_stop(hba, false);
+ hba->silence_err_logs = true;
+ ufshcd_complete_requests(hba);
+ hba->silence_err_logs = false;
spin_unlock_irqrestore(hba->host->host_lock, flags);
/* scale up clocks to max frequency before full reinitialization */
@@ -6032,22 +6038,12 @@ out:
static int ufshcd_reset_and_restore(struct ufs_hba *hba)
{
int err = 0;
- unsigned long flags;
int retries = MAX_HOST_RESET_RETRIES;
do {
err = ufshcd_host_reset_and_restore(hba);
} while (err && --retries);
- /*
- * After reset the door-bell might be cleared, complete
- * outstanding requests in s/w here.
- */
- spin_lock_irqsave(hba->host->host_lock, flags);
- ufshcd_transfer_req_compl(hba);
- ufshcd_tmc_handler(hba);
- spin_unlock_irqrestore(hba->host->host_lock, flags);
-
return err;
}
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index 33fdd3f281ae8..4554a4b725b54 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -489,6 +489,7 @@ struct ufs_stats {
* @uic_error: UFS interconnect layer error status
* @saved_err: sticky error mask
* @saved_uic_err: sticky UIC error mask
+ * @silence_err_logs: flag to silence error logs
* @dev_cmd: ufs device management command information
* @last_dme_cmd_tstamp: time stamp of the last completed DME command
* @auto_bkops_enabled: to track whether bkops is enabled in device
@@ -645,6 +646,7 @@ struct ufs_hba {
u32 saved_err;
u32 saved_uic_err;
struct ufs_stats ufs_stats;
+ bool silence_err_logs;
/* Device management request data */
struct ufs_dev_cmd dev_cmd;
--
2.20.1
From: Aditya Pakki <[email protected]>
[ Upstream commit c705f9fc6a1736dcf6ec01f8206707c108dca824 ]
In ezusb_init, if upriv is NULL, the code crashes. However, the caller
in ezusb_probe can handle the error and print the failure message.
The patch replaces the BUG_ON call to error return.
Signed-off-by: Aditya Pakki <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/intersil/orinoco/orinoco_usb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intersil/orinoco/orinoco_usb.c b/drivers/net/wireless/intersil/orinoco/orinoco_usb.c
index 2c7dd2a7350c1..b704e4bce171d 100644
--- a/drivers/net/wireless/intersil/orinoco/orinoco_usb.c
+++ b/drivers/net/wireless/intersil/orinoco/orinoco_usb.c
@@ -1364,7 +1364,8 @@ static int ezusb_init(struct hermes *hw)
int retval;
BUG_ON(in_interrupt());
- BUG_ON(!upriv);
+ if (!upriv)
+ return -EINVAL;
upriv->reply_count = 0;
/* Write the MAGIC number on the simulated registers to keep
--
2.20.1
From: Arnd Bergmann <[email protected]>
[ Upstream commit b33bdf8020c94438269becc6dace9ed49257c4ba ]
As everybody pointed out by now, my patch to clean up CAPI introduced
a link time warning, as the two parts of the capi driver are now in
one module and the exit function may need to be called in the error
path of the init function:
>> WARNING: drivers/isdn/capi/kernelcapi.o(.text+0xea4): Section mismatch in reference from the function kcapi_exit() to the function .exit.text:kcapi_proc_exit()
The function kcapi_exit() references a function in an exit section.
Often the function kcapi_proc_exit() has valid usage outside the exit section
and the fix is to remove the __exit annotation of kcapi_proc_exit.
Remove the incorrect __exit annotation.
Reported-by: kbuild test robot <[email protected]>
Reported-by: kernelci.org bot <[email protected]>
Reported-by: Olof's autobuilder <[email protected]>
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/isdn/capi/kcapi_proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/isdn/capi/kcapi_proc.c b/drivers/isdn/capi/kcapi_proc.c
index c94bd12c0f7c6..28cd051f1dfd9 100644
--- a/drivers/isdn/capi/kcapi_proc.c
+++ b/drivers/isdn/capi/kcapi_proc.c
@@ -239,7 +239,7 @@ kcapi_proc_init(void)
proc_create_seq("capi/driver", 0, NULL, &seq_capi_driver_ops);
}
-void __exit
+void
kcapi_proc_exit(void)
{
remove_proc_entry("capi/driver", NULL);
--
2.20.1
From: Chen Zhou <[email protected]>
[ Upstream commit 8fea78029f5e6ed734ae1957bef23cfda1af4354 ]
If CONFIG_SND_ATMEL_SOC_DMA=m, build error:
sound/soc/atmel/atmel_ssc_dai.o: In function `atmel_ssc_set_audio':
(.text+0x7cd): undefined reference to `atmel_pcm_dma_platform_register'
Function atmel_pcm_dma_platform_register is defined under
CONFIG SND_ATMEL_SOC_DMA, so select SND_ATMEL_SOC_DMA in
CONFIG SND_ATMEL_SOC_SSC, same to CONFIG_SND_ATMEL_SOC_PDC.
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Chen Zhou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/soc/atmel/Kconfig | 2 ++
1 file changed, 2 insertions(+)
diff --git a/sound/soc/atmel/Kconfig b/sound/soc/atmel/Kconfig
index 64b784e96f844..dad778e5884bd 100644
--- a/sound/soc/atmel/Kconfig
+++ b/sound/soc/atmel/Kconfig
@@ -25,6 +25,8 @@ config SND_ATMEL_SOC_DMA
config SND_ATMEL_SOC_SSC_DMA
tristate
+ select SND_ATMEL_SOC_DMA
+ select SND_ATMEL_SOC_PDC
config SND_ATMEL_SOC_SSC
tristate
--
2.20.1
From: Daniel Drake <[email protected]>
[ Upstream commit 3030df209aa8cf831b9963829bd9f94900ee8032 ]
On Asus UX434DA (AMD Ryzen7 3700U) and Asus X512DK (AMD Ryzen5 3500U), the
XHCI controller fails to resume from runtime suspend or s2idle, and USB
becomes unusable from that point.
xhci_hcd 0000:03:00.4: Refused to change power state, currently in D3
xhci_hcd 0000:03:00.4: enabling device (0000 -> 0002)
xhci_hcd 0000:03:00.4: WARN: xHC restore state timeout
xhci_hcd 0000:03:00.4: PCI post-resume error -110!
xhci_hcd 0000:03:00.4: HC died; cleaning up
During suspend, a transition to D3cold is attempted, however the affected
platforms do not seem to cut the power to the PCI device when in this
state, so the device stays in D3hot.
Upon resume, the D3hot-to-D0 transition is successful only if the D3 delay
is increased to 20ms. The transition failure does not appear to be
detectable as a CRS condition. Add a PCI quirk to increase the delay on the
affected hardware.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205587
Link: http://lkml.kernel.org/r/CAD8Lp47Vh69gQjROYG69=waJgL7hs1PwnLonL9+27S_TcRhixA@mail.gmail.com
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Daniel Drake <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pci/quirks.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 798d46a266037..419dda6dbd166 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1866,6 +1866,22 @@ static void quirk_radeon_pm(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x6741, quirk_radeon_pm);
+/*
+ * Ryzen5/7 XHCI controllers fail upon resume from runtime suspend or s2idle.
+ * https://bugzilla.kernel.org/show_bug.cgi?id=205587
+ *
+ * The kernel attempts to transition these devices to D3cold, but that seems
+ * to be ineffective on the platforms in question; the PCI device appears to
+ * remain on in D3hot state. The D3hot-to-D0 transition then requires an
+ * extended delay in order to succeed.
+ */
+static void quirk_ryzen_xhci_d3hot(struct pci_dev *dev)
+{
+ quirk_d3hot_delay(dev, 20);
+}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e0, quirk_ryzen_xhci_d3hot);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15e1, quirk_ryzen_xhci_d3hot);
+
#ifdef CONFIG_X86_IO_APIC
static int dmi_disable_ioapicreroute(const struct dmi_system_id *d)
{
--
2.20.1
From: Dmitry Osipenko <[email protected]>
[ Upstream commit 2d9ea1934f8ef0dfb862d103389562cc28b4fc03 ]
Trying to read out Chip ID before APBMISC registers are mapped won't
succeed, in a result Tegra124 gets a wrong address for the HW straps
register if machine uses an old outdated device tree.
Fixes: 297c4f3dcbff ("soc/tegra: fuse: Restrict legacy code to 32-bit ARM")
Signed-off-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/soc/tegra/fuse/tegra-apbmisc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/soc/tegra/fuse/tegra-apbmisc.c b/drivers/soc/tegra/fuse/tegra-apbmisc.c
index e5a4d8f98b10e..d1cbb0fe1691a 100644
--- a/drivers/soc/tegra/fuse/tegra-apbmisc.c
+++ b/drivers/soc/tegra/fuse/tegra-apbmisc.c
@@ -135,7 +135,7 @@ void __init tegra_init_apbmisc(void)
apbmisc.flags = IORESOURCE_MEM;
/* strapping options */
- if (tegra_get_chip_id() == TEGRA124) {
+ if (of_machine_is_compatible("nvidia,tegra124")) {
straps.start = 0x7000e864;
straps.end = 0x7000e867;
} else {
--
2.20.1
From: Dingchen Zhang <[email protected]>
[ Upstream commit 72a848f5c46bab4c921edc9cbffd1ab273b2be17 ]
userspace may transfer a newline, and this terminating newline
is replaced by a '\0' to avoid followup issues.
'len-1' is the index to replace the newline of CRC source name.
v3: typo fix (Sam)
v2: update patch subject, body and format. (Sam)
Cc: Leo Li <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Signed-off-by: Dingchen Zhang <[email protected]>
Reviewed-by: Sam Ravnborg <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/drm_debugfs_crc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_debugfs_crc.c b/drivers/gpu/drm/drm_debugfs_crc.c
index c88e5ff41add6..a3c756710845e 100644
--- a/drivers/gpu/drm/drm_debugfs_crc.c
+++ b/drivers/gpu/drm/drm_debugfs_crc.c
@@ -101,8 +101,8 @@ static ssize_t crc_control_write(struct file *file, const char __user *ubuf,
if (IS_ERR(source))
return PTR_ERR(source);
- if (source[len] == '\n')
- source[len] = '\0';
+ if (source[len - 1] == '\n')
+ source[len - 1] = '\0';
spin_lock_irq(&crc->lock);
--
2.20.1
From: Stephen Smalley <[email protected]>
[ Upstream commit 0188d5c025ca8fe756ba3193bd7d150139af5a88 ]
commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware")
passed down the rcu flag to the SELinux AVC, but failed to adjust the
test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY.
Previously, we only returned -ECHILD if generating an audit record with
LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission.
Move the handling of MAY_NOT_BLOCK to avc_audit() and its inlined
equivalent in selinux_inode_permission() immediately after we determine
that audit is required, and always fall back to ref-walk in this case.
Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware")
Reported-by: Will Deacon <[email protected]>
Suggested-by: Al Viro <[email protected]>
Signed-off-by: Stephen Smalley <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
security/selinux/avc.c | 24 +++++-------------------
security/selinux/hooks.c | 11 +++++++----
security/selinux/include/avc.h | 8 +++++---
3 files changed, 17 insertions(+), 26 deletions(-)
diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 5de18a6d5c3f0..0622cae510461 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -496,7 +496,7 @@ static inline int avc_xperms_audit(struct selinux_state *state,
if (likely(!audited))
return 0;
return slow_avc_audit(state, ssid, tsid, tclass, requested,
- audited, denied, result, ad, 0);
+ audited, denied, result, ad);
}
static void avc_node_free(struct rcu_head *rhead)
@@ -766,8 +766,7 @@ static void avc_audit_post_callback(struct audit_buffer *ab, void *a)
noinline int slow_avc_audit(struct selinux_state *state,
u32 ssid, u32 tsid, u16 tclass,
u32 requested, u32 audited, u32 denied, int result,
- struct common_audit_data *a,
- unsigned int flags)
+ struct common_audit_data *a)
{
struct common_audit_data stack_data;
struct selinux_audit_data sad;
@@ -777,17 +776,6 @@ noinline int slow_avc_audit(struct selinux_state *state,
a->type = LSM_AUDIT_DATA_NONE;
}
- /*
- * When in a RCU walk do the audit on the RCU retry. This is because
- * the collection of the dname in an inode audit message is not RCU
- * safe. Note this may drop some audits when the situation changes
- * during retry. However this is logically just as if the operation
- * happened a little later.
- */
- if ((a->type == LSM_AUDIT_DATA_INODE) &&
- (flags & MAY_NOT_BLOCK))
- return -ECHILD;
-
sad.tclass = tclass;
sad.requested = requested;
sad.ssid = ssid;
@@ -860,16 +848,14 @@ static int avc_update_node(struct selinux_avc *avc,
/*
* If we are in a non-blocking code path, e.g. VFS RCU walk,
* then we must not add permissions to a cache entry
- * because we cannot safely audit the denial. Otherwise,
+ * because we will not audit the denial. Otherwise,
* during the subsequent blocking retry (e.g. VFS ref walk), we
* will find the permissions already granted in the cache entry
* and won't audit anything at all, leading to silent denials in
* permissive mode that only appear when in enforcing mode.
*
- * See the corresponding handling in slow_avc_audit(), and the
- * logic in selinux_inode_follow_link and selinux_inode_permission
- * for the VFS MAY_NOT_BLOCK flag, which is transliterated into
- * AVC_NONBLOCKING for avc_has_perm_noaudit().
+ * See the corresponding handling of MAY_NOT_BLOCK in avc_audit()
+ * and selinux_inode_permission().
*/
if (flags & AVC_NONBLOCKING)
return 0;
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 040c843968dc6..c574285966f9d 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3171,8 +3171,7 @@ static int selinux_inode_follow_link(struct dentry *dentry, struct inode *inode,
static noinline int audit_inode_permission(struct inode *inode,
u32 perms, u32 audited, u32 denied,
- int result,
- unsigned flags)
+ int result)
{
struct common_audit_data ad;
struct inode_security_struct *isec = inode->i_security;
@@ -3183,7 +3182,7 @@ static noinline int audit_inode_permission(struct inode *inode,
rc = slow_avc_audit(&selinux_state,
current_sid(), isec->sid, isec->sclass, perms,
- audited, denied, result, &ad, flags);
+ audited, denied, result, &ad);
if (rc)
return rc;
return 0;
@@ -3230,7 +3229,11 @@ static int selinux_inode_permission(struct inode *inode, int mask)
if (likely(!audited))
return rc;
- rc2 = audit_inode_permission(inode, perms, audited, denied, rc, flags);
+ /* fall back to ref-walk if we have to generate audit */
+ if (flags & MAY_NOT_BLOCK)
+ return -ECHILD;
+
+ rc2 = audit_inode_permission(inode, perms, audited, denied, rc);
if (rc2)
return rc2;
return rc;
diff --git a/security/selinux/include/avc.h b/security/selinux/include/avc.h
index 74ea50977c201..cf4cc3ef959b5 100644
--- a/security/selinux/include/avc.h
+++ b/security/selinux/include/avc.h
@@ -100,8 +100,7 @@ static inline u32 avc_audit_required(u32 requested,
int slow_avc_audit(struct selinux_state *state,
u32 ssid, u32 tsid, u16 tclass,
u32 requested, u32 audited, u32 denied, int result,
- struct common_audit_data *a,
- unsigned flags);
+ struct common_audit_data *a);
/**
* avc_audit - Audit the granting or denial of permissions.
@@ -135,9 +134,12 @@ static inline int avc_audit(struct selinux_state *state,
audited = avc_audit_required(requested, avd, result, 0, &denied);
if (likely(!audited))
return 0;
+ /* fall back to ref-walk if we have to generate audit */
+ if (flags & MAY_NOT_BLOCK)
+ return -ECHILD;
return slow_avc_audit(state, ssid, tsid, tclass,
requested, audited, denied, result,
- a, flags);
+ a);
}
#define AVC_STRICT 1 /* Ignore permissive mode. */
--
2.20.1
From: Masahiro Yamada <[email protected]>
[ Upstream commit c8fb7d7e48d11520ad24808cfce7afb7b9c9f798 ]
Running randconfig on arm64 using KCONFIG_SEED=0x40C5E904 (e.g. on v5.5)
produces the .config with CONFIG_EFI=y and CONFIG_CPU_BIG_ENDIAN=y,
which does not meet the !CONFIG_CPU_BIG_ENDIAN dependency.
This is because the user choice for CONFIG_CPU_LITTLE_ENDIAN vs
CONFIG_CPU_BIG_ENDIAN is set by randomize_choice_values() after the
value of CONFIG_EFI is calculated.
When this happens, the has_changed flag should be set.
Currently, it takes the result from the last iteration. It should
accumulate all the results of the loop.
Fixes: 3b9a19e08960 ("kconfig: loop as long as we changed some symbols in randconfig")
Reported-by: Vincenzo Frascino <[email protected]>
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
scripts/kconfig/confdata.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/kconfig/confdata.c b/scripts/kconfig/confdata.c
index 0dde19cf74865..2caf5fac102a2 100644
--- a/scripts/kconfig/confdata.c
+++ b/scripts/kconfig/confdata.c
@@ -1314,7 +1314,7 @@ bool conf_set_all_new_symbols(enum conf_def_mode mode)
sym_calc_value(csym);
if (mode == def_random)
- has_changed = randomize_choice_values(csym);
+ has_changed |= randomize_choice_values(csym);
else {
set_all_choice_values(csym);
has_changed = true;
--
2.20.1
From: Paul Kocialkowski <[email protected]>
[ Upstream commit fd1a5e521c3c083bb43ea731aae0f8b95f12b9bd ]
psbfb_probe performs an evaluation of the required size from the stolen
GTT memory, but gets it wrong in two distinct ways:
- The resulting size must be page-size-aligned;
- The size to allocate is derived from the surface dimensions, not the fb
dimensions.
When two connectors are connected with different modes, the smallest will
be stored in the fb dimensions, but the size that needs to be allocated must
match the largest (surface) dimensions. This is what is used in the actual
allocation code.
Fix this by correcting the evaluation to conform to the two points above.
It allows correctly switching to 16bpp when one connector is e.g. 1920x1080
and the other is 1024x768.
Signed-off-by: Paul Kocialkowski <[email protected]>
Signed-off-by: Patrik Jakobsson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/gma500/framebuffer.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/gma500/framebuffer.c b/drivers/gpu/drm/gma500/framebuffer.c
index adefae58b5fcd..b4035ef72af8a 100644
--- a/drivers/gpu/drm/gma500/framebuffer.c
+++ b/drivers/gpu/drm/gma500/framebuffer.c
@@ -480,6 +480,7 @@ static int psbfb_probe(struct drm_fb_helper *helper,
container_of(helper, struct psb_fbdev, psb_fb_helper);
struct drm_device *dev = psb_fbdev->psb_fb_helper.dev;
struct drm_psb_private *dev_priv = dev->dev_private;
+ unsigned int fb_size;
int bytespp;
bytespp = sizes->surface_bpp / 8;
@@ -489,8 +490,11 @@ static int psbfb_probe(struct drm_fb_helper *helper,
/* If the mode will not fit in 32bit then switch to 16bit to get
a console on full resolution. The X mode setting server will
allocate its own 32bit GEM framebuffer */
- if (ALIGN(sizes->fb_width * bytespp, 64) * sizes->fb_height >
- dev_priv->vram_stolen_size) {
+ fb_size = ALIGN(sizes->surface_width * bytespp, 64) *
+ sizes->surface_height;
+ fb_size = ALIGN(fb_size, PAGE_SIZE);
+
+ if (fb_size > dev_priv->vram_stolen_size) {
sizes->surface_bpp = 16;
sizes->surface_depth = 16;
}
--
2.20.1
From: Kai Li <[email protected]>
[ Upstream commit a09decff5c32060639a685581c380f51b14e1fc2 ]
If the journal is dirty when the filesystem is mounted, jbd2 will replay
the journal but the journal superblock will not be updated by
journal_reset() because JBD2_ABORT flag is still set (it was set in
journal_init_common()). This is problematic because when a new transaction
is then committed, it will be recorded in block 1 (journal->j_tail was set
to 1 in journal_reset()). If unclean shutdown happens again before the
journal superblock is updated, the new recorded transaction will not be
replayed during the next mount (because of stale sb->s_start and
sb->s_sequence values) which can lead to filesystem corruption.
Fixes: 85e0c4e89c1b ("jbd2: if the journal is aborted then don't allow update of the log tail")
Signed-off-by: Kai Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/jbd2/journal.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 1a2339f2cb49b..568ca0ca0127c 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1701,6 +1701,11 @@ int jbd2_journal_load(journal_t *journal)
journal->j_devname);
return -EFSCORRUPTED;
}
+ /*
+ * clear JBD2_ABORT flag initialized in journal_init_common
+ * here to update log tail information with the newest seq.
+ */
+ journal->j_flags &= ~JBD2_ABORT;
/* OK, we've finished with the dynamic journal bits:
* reinitialise the dynamic contents of the superblock in memory
@@ -1708,7 +1713,6 @@ int jbd2_journal_load(journal_t *journal)
if (journal_reset(journal))
goto recovery_error;
- journal->j_flags &= ~JBD2_ABORT;
journal->j_flags |= JBD2_LOADED;
return 0;
--
2.20.1
From: Christian Borntraeger <[email protected]>
[ Upstream commit c611990844c28c61ca4b35ff69d3a2ae95ccd486 ]
There is no ENOTSUPP for userspace.
Reported-by: Julian Wiedmann <[email protected]>
Fixes: 519783935451 ("KVM: s390: introduce ais mode modify function")
Fixes: 2c1a48f2e5ed ("KVM: S390: add new group for flic")
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: Thomas Huth <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/kvm/interrupt.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 05ea466b9e403..3515f2b55eb9e 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -2109,7 +2109,7 @@ static int flic_ais_mode_get_all(struct kvm *kvm, struct kvm_device_attr *attr)
return -EINVAL;
if (!test_kvm_facility(kvm, 72))
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
mutex_lock(&fi->ais_lock);
ais.simm = fi->simm;
@@ -2412,7 +2412,7 @@ static int modify_ais_mode(struct kvm *kvm, struct kvm_device_attr *attr)
int ret = 0;
if (!test_kvm_facility(kvm, 72))
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
if (copy_from_user(&req, (void __user *)attr->addr, sizeof(req)))
return -EFAULT;
@@ -2492,7 +2492,7 @@ static int flic_ais_mode_set_all(struct kvm *kvm, struct kvm_device_attr *attr)
struct kvm_s390_ais_all ais;
if (!test_kvm_facility(kvm, 72))
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
if (copy_from_user(&ais, (void __user *)attr->addr, sizeof(ais)))
return -EFAULT;
--
2.20.1
From: Phong Tran <[email protected]>
[ Upstream commit 475eec112e4267232d10f4afe2f939a241692b6c ]
correct usage prototype of callback in tasklet_init().
Report by https://github.com/KSPP/linux/issues/20
Tested-by: Larry Finger <[email protected]>
Signed-off-by: Phong Tran <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/broadcom/b43legacy/main.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/broadcom/b43legacy/main.c b/drivers/net/wireless/broadcom/b43legacy/main.c
index 55f411925960e..770cc218ca4bd 100644
--- a/drivers/net/wireless/broadcom/b43legacy/main.c
+++ b/drivers/net/wireless/broadcom/b43legacy/main.c
@@ -1304,8 +1304,9 @@ static void handle_irq_ucode_debug(struct b43legacy_wldev *dev)
}
/* Interrupt handler bottom-half */
-static void b43legacy_interrupt_tasklet(struct b43legacy_wldev *dev)
+static void b43legacy_interrupt_tasklet(unsigned long data)
{
+ struct b43legacy_wldev *dev = (struct b43legacy_wldev *)data;
u32 reason;
u32 dma_reason[ARRAY_SIZE(dev->dma_reason)];
u32 merged_dma_reason = 0;
@@ -3775,7 +3776,7 @@ static int b43legacy_one_core_attach(struct ssb_device *dev,
b43legacy_set_status(wldev, B43legacy_STAT_UNINIT);
wldev->bad_frames_preempt = modparam_bad_frames_preempt;
tasklet_init(&wldev->isr_tasklet,
- (void (*)(unsigned long))b43legacy_interrupt_tasklet,
+ b43legacy_interrupt_tasklet,
(unsigned long)wldev);
if (modparam_pio)
wldev->__using_pio = true;
--
2.20.1
From: Hans de Goede <[email protected]>
[ Upstream commit a23680594da7a9e2696dbcf4f023e9273e2fa40b ]
Suspending Goodix touchscreens requires changing the interrupt pin to
output before sending them a power-down command. Followed by wiggling
the interrupt pin to wake the device up, after which it is put back
in input mode.
On Bay Trail devices with a Goodix touchscreen direct-irq mode is used
in combination with listing the pin as a normal GpioIo resource.
This works fine, until the goodix driver gets rmmod-ed and then insmod-ed
again. In this case byt_gpio_disable_free() calls
byt_gpio_clear_triggering() which clears the IRQ flags and after that the
(direct) IRQ no longer triggers.
This commit fixes this by adding a check for the BYT_DIRECT_IRQ_EN flag
to byt_gpio_clear_triggering().
Note that byt_gpio_clear_triggering() only gets called from
byt_gpio_disable_free() for direct-irq enabled pins, as these are excluded
from the irq_valid mask by byt_init_irq_valid_mask().
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Mika Westerberg <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pinctrl/intel/pinctrl-baytrail.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/pinctrl/intel/pinctrl-baytrail.c b/drivers/pinctrl/intel/pinctrl-baytrail.c
index 021e28ff1194e..a760d8bda0af7 100644
--- a/drivers/pinctrl/intel/pinctrl-baytrail.c
+++ b/drivers/pinctrl/intel/pinctrl-baytrail.c
@@ -950,7 +950,13 @@ static void byt_gpio_clear_triggering(struct byt_gpio *vg, unsigned int offset)
raw_spin_lock_irqsave(&byt_lock, flags);
value = readl(reg);
- value &= ~(BYT_TRIG_POS | BYT_TRIG_NEG | BYT_TRIG_LVL);
+
+ /* Do not clear direct-irq enabled IRQs (from gpio_disable_free) */
+ if (value & BYT_DIRECT_IRQ_EN)
+ /* nothing to do */ ;
+ else
+ value &= ~(BYT_TRIG_POS | BYT_TRIG_NEG | BYT_TRIG_LVL);
+
writel(value, reg);
raw_spin_unlock_irqrestore(&byt_lock, flags);
}
--
2.20.1
From: Jan Kara <[email protected]>
[ Upstream commit 4d5c1adaf893b8aa52525d2b81995e949bcb3239 ]
When we fail to allocate string for journal device name we jump to
'error' label which tries to unlock reiserfs write lock which is not
held. Jump to 'error_unlocked' instead.
Fixes: f32485be8397 ("reiserfs: delay reiserfs lock until journal initialization")
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/reiserfs/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 6280efeceb0a2..de5eda33c92a0 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -1954,7 +1954,7 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent)
if (!sbi->s_jdev) {
SWARN(silent, s, "", "Cannot allocate memory for "
"journal device name");
- goto error;
+ goto error_unlocked;
}
}
#ifdef CONFIG_QUOTA
--
2.20.1
From: Jia-Ju Bai <[email protected]>
[ Upstream commit e36eaf94be8f7bc4e686246eed3cf92d845e2ef8 ]
The driver may sleep while holding a spinlock.
The function call path (from bottom to top) in Linux 4.19 is:
drivers/gpio/gpio-grgpio.c, 261:
request_irq in grgpio_irq_map
drivers/gpio/gpio-grgpio.c, 255:
_raw_spin_lock_irqsave in grgpio_irq_map
drivers/gpio/gpio-grgpio.c, 318:
free_irq in grgpio_irq_unmap
drivers/gpio/gpio-grgpio.c, 299:
_raw_spin_lock_irqsave in grgpio_irq_unmap
request_irq() and free_irq() can sleep at runtime.
To fix these bugs, request_irq() and free_irq() are called without
holding the spinlock.
These bugs are found by a static analysis tool STCheck written by myself.
Signed-off-by: Jia-Ju Bai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpio/gpio-grgpio.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/gpio/gpio-grgpio.c b/drivers/gpio/gpio-grgpio.c
index 60a1556c570a4..c1be299e5567b 100644
--- a/drivers/gpio/gpio-grgpio.c
+++ b/drivers/gpio/gpio-grgpio.c
@@ -258,17 +258,16 @@ static int grgpio_irq_map(struct irq_domain *d, unsigned int irq,
lirq->irq = irq;
uirq = &priv->uirqs[lirq->index];
if (uirq->refcnt == 0) {
+ spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
ret = request_irq(uirq->uirq, grgpio_irq_handler, 0,
dev_name(priv->dev), priv);
if (ret) {
dev_err(priv->dev,
"Could not request underlying irq %d\n",
uirq->uirq);
-
- spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
-
return ret;
}
+ spin_lock_irqsave(&priv->gc.bgpio_lock, flags);
}
uirq->refcnt++;
@@ -314,8 +313,11 @@ static void grgpio_irq_unmap(struct irq_domain *d, unsigned int irq)
if (index >= 0) {
uirq = &priv->uirqs[lirq->index];
uirq->refcnt--;
- if (uirq->refcnt == 0)
+ if (uirq->refcnt == 0) {
+ spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
free_irq(uirq->uirq, priv);
+ return;
+ }
}
spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
--
2.20.1
This reverts commit 57211b7366cc2abf784c35e537b256e7fcddc91e.
This patch isn't needed on 4.19 and older.
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 8033 ----------------------------------------
1 file changed, 8033 deletions(-)
delete mode 100644 arch/x86/kvm/vmx/vmx.c
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
deleted file mode 100644
index 3791ce8d269e0..0000000000000
--- a/arch/x86/kvm/vmx/vmx.c
+++ /dev/null
@@ -1,8033 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Kernel-based Virtual Machine driver for Linux
- *
- * This module enables machines with Intel VT-x extensions to run virtual
- * machines without emulation or binary translation.
- *
- * Copyright (C) 2006 Qumranet, Inc.
- * Copyright 2010 Red Hat, Inc. and/or its affiliates.
- *
- * Authors:
- * Avi Kivity <[email protected]>
- * Yaniv Kamay <[email protected]>
- */
-
-#include <linux/frame.h>
-#include <linux/highmem.h>
-#include <linux/hrtimer.h>
-#include <linux/kernel.h>
-#include <linux/kvm_host.h>
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-#include <linux/mod_devicetable.h>
-#include <linux/mm.h>
-#include <linux/sched.h>
-#include <linux/sched/smt.h>
-#include <linux/slab.h>
-#include <linux/tboot.h>
-#include <linux/trace_events.h>
-
-#include <asm/apic.h>
-#include <asm/asm.h>
-#include <asm/cpu.h>
-#include <asm/debugreg.h>
-#include <asm/desc.h>
-#include <asm/fpu/internal.h>
-#include <asm/io.h>
-#include <asm/irq_remapping.h>
-#include <asm/kexec.h>
-#include <asm/perf_event.h>
-#include <asm/mce.h>
-#include <asm/mmu_context.h>
-#include <asm/mshyperv.h>
-#include <asm/spec-ctrl.h>
-#include <asm/virtext.h>
-#include <asm/vmx.h>
-
-#include "capabilities.h"
-#include "cpuid.h"
-#include "evmcs.h"
-#include "irq.h"
-#include "kvm_cache_regs.h"
-#include "lapic.h"
-#include "mmu.h"
-#include "nested.h"
-#include "ops.h"
-#include "pmu.h"
-#include "trace.h"
-#include "vmcs.h"
-#include "vmcs12.h"
-#include "vmx.h"
-#include "x86.h"
-
-MODULE_AUTHOR("Qumranet");
-MODULE_LICENSE("GPL");
-
-static const struct x86_cpu_id vmx_cpu_id[] = {
- X86_FEATURE_MATCH(X86_FEATURE_VMX),
- {}
-};
-MODULE_DEVICE_TABLE(x86cpu, vmx_cpu_id);
-
-bool __read_mostly enable_vpid = 1;
-module_param_named(vpid, enable_vpid, bool, 0444);
-
-static bool __read_mostly enable_vnmi = 1;
-module_param_named(vnmi, enable_vnmi, bool, S_IRUGO);
-
-bool __read_mostly flexpriority_enabled = 1;
-module_param_named(flexpriority, flexpriority_enabled, bool, S_IRUGO);
-
-bool __read_mostly enable_ept = 1;
-module_param_named(ept, enable_ept, bool, S_IRUGO);
-
-bool __read_mostly enable_unrestricted_guest = 1;
-module_param_named(unrestricted_guest,
- enable_unrestricted_guest, bool, S_IRUGO);
-
-bool __read_mostly enable_ept_ad_bits = 1;
-module_param_named(eptad, enable_ept_ad_bits, bool, S_IRUGO);
-
-static bool __read_mostly emulate_invalid_guest_state = true;
-module_param(emulate_invalid_guest_state, bool, S_IRUGO);
-
-static bool __read_mostly fasteoi = 1;
-module_param(fasteoi, bool, S_IRUGO);
-
-static bool __read_mostly enable_apicv = 1;
-module_param(enable_apicv, bool, S_IRUGO);
-
-/*
- * If nested=1, nested virtualization is supported, i.e., guests may use
- * VMX and be a hypervisor for its own guests. If nested=0, guests may not
- * use VMX instructions.
- */
-static bool __read_mostly nested = 1;
-module_param(nested, bool, S_IRUGO);
-
-bool __read_mostly enable_pml = 1;
-module_param_named(pml, enable_pml, bool, S_IRUGO);
-
-static bool __read_mostly dump_invalid_vmcs = 0;
-module_param(dump_invalid_vmcs, bool, 0644);
-
-#define MSR_BITMAP_MODE_X2APIC 1
-#define MSR_BITMAP_MODE_X2APIC_APICV 2
-
-#define KVM_VMX_TSC_MULTIPLIER_MAX 0xffffffffffffffffULL
-
-/* Guest_tsc -> host_tsc conversion requires 64-bit division. */
-static int __read_mostly cpu_preemption_timer_multi;
-static bool __read_mostly enable_preemption_timer = 1;
-#ifdef CONFIG_X86_64
-module_param_named(preemption_timer, enable_preemption_timer, bool, S_IRUGO);
-#endif
-
-#define KVM_VM_CR0_ALWAYS_OFF (X86_CR0_NW | X86_CR0_CD)
-#define KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST X86_CR0_NE
-#define KVM_VM_CR0_ALWAYS_ON \
- (KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST | \
- X86_CR0_WP | X86_CR0_PG | X86_CR0_PE)
-#define KVM_CR4_GUEST_OWNED_BITS \
- (X86_CR4_PVI | X86_CR4_DE | X86_CR4_PCE | X86_CR4_OSFXSR \
- | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_TSD)
-
-#define KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST X86_CR4_VMXE
-#define KVM_PMODE_VM_CR4_ALWAYS_ON (X86_CR4_PAE | X86_CR4_VMXE)
-#define KVM_RMODE_VM_CR4_ALWAYS_ON (X86_CR4_VME | X86_CR4_PAE | X86_CR4_VMXE)
-
-#define RMODE_GUEST_OWNED_EFLAGS_BITS (~(X86_EFLAGS_IOPL | X86_EFLAGS_VM))
-
-#define MSR_IA32_RTIT_STATUS_MASK (~(RTIT_STATUS_FILTEREN | \
- RTIT_STATUS_CONTEXTEN | RTIT_STATUS_TRIGGEREN | \
- RTIT_STATUS_ERROR | RTIT_STATUS_STOPPED | \
- RTIT_STATUS_BYTECNT))
-
-#define MSR_IA32_RTIT_OUTPUT_BASE_MASK \
- (~((1UL << cpuid_query_maxphyaddr(vcpu)) - 1) | 0x7f)
-
-/*
- * These 2 parameters are used to config the controls for Pause-Loop Exiting:
- * ple_gap: upper bound on the amount of time between two successive
- * executions of PAUSE in a loop. Also indicate if ple enabled.
- * According to test, this time is usually smaller than 128 cycles.
- * ple_window: upper bound on the amount of time a guest is allowed to execute
- * in a PAUSE loop. Tests indicate that most spinlocks are held for
- * less than 2^12 cycles
- * Time is measured based on a counter that runs at the same rate as the TSC,
- * refer SDM volume 3b section 21.6.13 & 22.1.3.
- */
-static unsigned int ple_gap = KVM_DEFAULT_PLE_GAP;
-module_param(ple_gap, uint, 0444);
-
-static unsigned int ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
-module_param(ple_window, uint, 0444);
-
-/* Default doubles per-vcpu window every exit. */
-static unsigned int ple_window_grow = KVM_DEFAULT_PLE_WINDOW_GROW;
-module_param(ple_window_grow, uint, 0444);
-
-/* Default resets per-vcpu window every exit to ple_window. */
-static unsigned int ple_window_shrink = KVM_DEFAULT_PLE_WINDOW_SHRINK;
-module_param(ple_window_shrink, uint, 0444);
-
-/* Default is to compute the maximum so we can never overflow. */
-static unsigned int ple_window_max = KVM_VMX_DEFAULT_PLE_WINDOW_MAX;
-module_param(ple_window_max, uint, 0444);
-
-/* Default is SYSTEM mode, 1 for host-guest mode */
-int __read_mostly pt_mode = PT_MODE_SYSTEM;
-module_param(pt_mode, int, S_IRUGO);
-
-static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
-static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
-static DEFINE_MUTEX(vmx_l1d_flush_mutex);
-
-/* Storage for pre module init parameter parsing */
-static enum vmx_l1d_flush_state __read_mostly vmentry_l1d_flush_param = VMENTER_L1D_FLUSH_AUTO;
-
-static const struct {
- const char *option;
- bool for_parse;
-} vmentry_l1d_param[] = {
- [VMENTER_L1D_FLUSH_AUTO] = {"auto", true},
- [VMENTER_L1D_FLUSH_NEVER] = {"never", true},
- [VMENTER_L1D_FLUSH_COND] = {"cond", true},
- [VMENTER_L1D_FLUSH_ALWAYS] = {"always", true},
- [VMENTER_L1D_FLUSH_EPT_DISABLED] = {"EPT disabled", false},
- [VMENTER_L1D_FLUSH_NOT_REQUIRED] = {"not required", false},
-};
-
-#define L1D_CACHE_ORDER 4
-static void *vmx_l1d_flush_pages;
-
-static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
-{
- struct page *page;
- unsigned int i;
-
- if (!boot_cpu_has_bug(X86_BUG_L1TF)) {
- l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
- return 0;
- }
-
- if (!enable_ept) {
- l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_EPT_DISABLED;
- return 0;
- }
-
- if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES)) {
- u64 msr;
-
- rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
- if (msr & ARCH_CAP_SKIP_VMENTRY_L1DFLUSH) {
- l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NOT_REQUIRED;
- return 0;
- }
- }
-
- /* If set to auto use the default l1tf mitigation method */
- if (l1tf == VMENTER_L1D_FLUSH_AUTO) {
- switch (l1tf_mitigation) {
- case L1TF_MITIGATION_OFF:
- l1tf = VMENTER_L1D_FLUSH_NEVER;
- break;
- case L1TF_MITIGATION_FLUSH_NOWARN:
- case L1TF_MITIGATION_FLUSH:
- case L1TF_MITIGATION_FLUSH_NOSMT:
- l1tf = VMENTER_L1D_FLUSH_COND;
- break;
- case L1TF_MITIGATION_FULL:
- case L1TF_MITIGATION_FULL_FORCE:
- l1tf = VMENTER_L1D_FLUSH_ALWAYS;
- break;
- }
- } else if (l1tf_mitigation == L1TF_MITIGATION_FULL_FORCE) {
- l1tf = VMENTER_L1D_FLUSH_ALWAYS;
- }
-
- if (l1tf != VMENTER_L1D_FLUSH_NEVER && !vmx_l1d_flush_pages &&
- !boot_cpu_has(X86_FEATURE_FLUSH_L1D)) {
- /*
- * This allocation for vmx_l1d_flush_pages is not tied to a VM
- * lifetime and so should not be charged to a memcg.
- */
- page = alloc_pages(GFP_KERNEL, L1D_CACHE_ORDER);
- if (!page)
- return -ENOMEM;
- vmx_l1d_flush_pages = page_address(page);
-
- /*
- * Initialize each page with a different pattern in
- * order to protect against KSM in the nested
- * virtualization case.
- */
- for (i = 0; i < 1u << L1D_CACHE_ORDER; ++i) {
- memset(vmx_l1d_flush_pages + i * PAGE_SIZE, i + 1,
- PAGE_SIZE);
- }
- }
-
- l1tf_vmx_mitigation = l1tf;
-
- if (l1tf != VMENTER_L1D_FLUSH_NEVER)
- static_branch_enable(&vmx_l1d_should_flush);
- else
- static_branch_disable(&vmx_l1d_should_flush);
-
- if (l1tf == VMENTER_L1D_FLUSH_COND)
- static_branch_enable(&vmx_l1d_flush_cond);
- else
- static_branch_disable(&vmx_l1d_flush_cond);
- return 0;
-}
-
-static int vmentry_l1d_flush_parse(const char *s)
-{
- unsigned int i;
-
- if (s) {
- for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) {
- if (vmentry_l1d_param[i].for_parse &&
- sysfs_streq(s, vmentry_l1d_param[i].option))
- return i;
- }
- }
- return -EINVAL;
-}
-
-static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
-{
- int l1tf, ret;
-
- l1tf = vmentry_l1d_flush_parse(s);
- if (l1tf < 0)
- return l1tf;
-
- if (!boot_cpu_has(X86_BUG_L1TF))
- return 0;
-
- /*
- * Has vmx_init() run already? If not then this is the pre init
- * parameter parsing. In that case just store the value and let
- * vmx_init() do the proper setup after enable_ept has been
- * established.
- */
- if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_AUTO) {
- vmentry_l1d_flush_param = l1tf;
- return 0;
- }
-
- mutex_lock(&vmx_l1d_flush_mutex);
- ret = vmx_setup_l1d_flush(l1tf);
- mutex_unlock(&vmx_l1d_flush_mutex);
- return ret;
-}
-
-static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
-{
- if (WARN_ON_ONCE(l1tf_vmx_mitigation >= ARRAY_SIZE(vmentry_l1d_param)))
- return sprintf(s, "???\n");
-
- return sprintf(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option);
-}
-
-static const struct kernel_param_ops vmentry_l1d_flush_ops = {
- .set = vmentry_l1d_flush_set,
- .get = vmentry_l1d_flush_get,
-};
-module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
-
-static bool guest_state_valid(struct kvm_vcpu *vcpu);
-static u32 vmx_segment_access_rights(struct kvm_segment *var);
-static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
- u32 msr, int type);
-
-void vmx_vmexit(void);
-
-#define vmx_insn_failed(fmt...) \
-do { \
- WARN_ONCE(1, fmt); \
- pr_warn_ratelimited(fmt); \
-} while (0)
-
-asmlinkage void vmread_error(unsigned long field, bool fault)
-{
- if (fault)
- kvm_spurious_fault();
- else
- vmx_insn_failed("kvm: vmread failed: field=%lx\n", field);
-}
-
-noinline void vmwrite_error(unsigned long field, unsigned long value)
-{
- vmx_insn_failed("kvm: vmwrite failed: field=%lx val=%lx err=%d\n",
- field, value, vmcs_read32(VM_INSTRUCTION_ERROR));
-}
-
-noinline void vmclear_error(struct vmcs *vmcs, u64 phys_addr)
-{
- vmx_insn_failed("kvm: vmclear failed: %p/%llx\n", vmcs, phys_addr);
-}
-
-noinline void vmptrld_error(struct vmcs *vmcs, u64 phys_addr)
-{
- vmx_insn_failed("kvm: vmptrld failed: %p/%llx\n", vmcs, phys_addr);
-}
-
-noinline void invvpid_error(unsigned long ext, u16 vpid, gva_t gva)
-{
- vmx_insn_failed("kvm: invvpid failed: ext=0x%lx vpid=%u gva=0x%lx\n",
- ext, vpid, gva);
-}
-
-noinline void invept_error(unsigned long ext, u64 eptp, gpa_t gpa)
-{
- vmx_insn_failed("kvm: invept failed: ext=0x%lx eptp=%llx gpa=0x%llx\n",
- ext, eptp, gpa);
-}
-
-static DEFINE_PER_CPU(struct vmcs *, vmxarea);
-DEFINE_PER_CPU(struct vmcs *, current_vmcs);
-/*
- * We maintain a per-CPU linked-list of VMCS loaded on that CPU. This is needed
- * when a CPU is brought down, and we need to VMCLEAR all VMCSs loaded on it.
- */
-static DEFINE_PER_CPU(struct list_head, loaded_vmcss_on_cpu);
-
-/*
- * We maintian a per-CPU linked-list of vCPU, so in wakeup_handler() we
- * can find which vCPU should be waken up.
- */
-static DEFINE_PER_CPU(struct list_head, blocked_vcpu_on_cpu);
-static DEFINE_PER_CPU(spinlock_t, blocked_vcpu_on_cpu_lock);
-
-static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);
-static DEFINE_SPINLOCK(vmx_vpid_lock);
-
-struct vmcs_config vmcs_config;
-struct vmx_capability vmx_capability;
-
-#define VMX_SEGMENT_FIELD(seg) \
- [VCPU_SREG_##seg] = { \
- .selector = GUEST_##seg##_SELECTOR, \
- .base = GUEST_##seg##_BASE, \
- .limit = GUEST_##seg##_LIMIT, \
- .ar_bytes = GUEST_##seg##_AR_BYTES, \
- }
-
-static const struct kvm_vmx_segment_field {
- unsigned selector;
- unsigned base;
- unsigned limit;
- unsigned ar_bytes;
-} kvm_vmx_segment_fields[] = {
- VMX_SEGMENT_FIELD(CS),
- VMX_SEGMENT_FIELD(DS),
- VMX_SEGMENT_FIELD(ES),
- VMX_SEGMENT_FIELD(FS),
- VMX_SEGMENT_FIELD(GS),
- VMX_SEGMENT_FIELD(SS),
- VMX_SEGMENT_FIELD(TR),
- VMX_SEGMENT_FIELD(LDTR),
-};
-
-u64 host_efer;
-static unsigned long host_idt_base;
-
-/*
- * Though SYSCALL is only supported in 64-bit mode on Intel CPUs, kvm
- * will emulate SYSCALL in legacy mode if the vendor string in guest
- * CPUID.0:{EBX,ECX,EDX} is "AuthenticAMD" or "AMDisbetter!" To
- * support this emulation, IA32_STAR must always be included in
- * vmx_msr_index[], even in i386 builds.
- */
-const u32 vmx_msr_index[] = {
-#ifdef CONFIG_X86_64
- MSR_SYSCALL_MASK, MSR_LSTAR, MSR_CSTAR,
-#endif
- MSR_EFER, MSR_TSC_AUX, MSR_STAR,
- MSR_IA32_TSX_CTRL,
-};
-
-#if IS_ENABLED(CONFIG_HYPERV)
-static bool __read_mostly enlightened_vmcs = true;
-module_param(enlightened_vmcs, bool, 0444);
-
-/* check_ept_pointer() should be under protection of ept_pointer_lock. */
-static void check_ept_pointer_match(struct kvm *kvm)
-{
- struct kvm_vcpu *vcpu;
- u64 tmp_eptp = INVALID_PAGE;
- int i;
-
- kvm_for_each_vcpu(i, vcpu, kvm) {
- if (!VALID_PAGE(tmp_eptp)) {
- tmp_eptp = to_vmx(vcpu)->ept_pointer;
- } else if (tmp_eptp != to_vmx(vcpu)->ept_pointer) {
- to_kvm_vmx(kvm)->ept_pointers_match
- = EPT_POINTERS_MISMATCH;
- return;
- }
- }
-
- to_kvm_vmx(kvm)->ept_pointers_match = EPT_POINTERS_MATCH;
-}
-
-static int kvm_fill_hv_flush_list_func(struct hv_guest_mapping_flush_list *flush,
- void *data)
-{
- struct kvm_tlb_range *range = data;
-
- return hyperv_fill_flush_guest_mapping_list(flush, range->start_gfn,
- range->pages);
-}
-
-static inline int __hv_remote_flush_tlb_with_range(struct kvm *kvm,
- struct kvm_vcpu *vcpu, struct kvm_tlb_range *range)
-{
- u64 ept_pointer = to_vmx(vcpu)->ept_pointer;
-
- /*
- * FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE hypercall needs address
- * of the base of EPT PML4 table, strip off EPT configuration
- * information.
- */
- if (range)
- return hyperv_flush_guest_mapping_range(ept_pointer & PAGE_MASK,
- kvm_fill_hv_flush_list_func, (void *)range);
- else
- return hyperv_flush_guest_mapping(ept_pointer & PAGE_MASK);
-}
-
-static int hv_remote_flush_tlb_with_range(struct kvm *kvm,
- struct kvm_tlb_range *range)
-{
- struct kvm_vcpu *vcpu;
- int ret = 0, i;
-
- spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
-
- if (to_kvm_vmx(kvm)->ept_pointers_match == EPT_POINTERS_CHECK)
- check_ept_pointer_match(kvm);
-
- if (to_kvm_vmx(kvm)->ept_pointers_match != EPT_POINTERS_MATCH) {
- kvm_for_each_vcpu(i, vcpu, kvm) {
- /* If ept_pointer is invalid pointer, bypass flush request. */
- if (VALID_PAGE(to_vmx(vcpu)->ept_pointer))
- ret |= __hv_remote_flush_tlb_with_range(
- kvm, vcpu, range);
- }
- } else {
- ret = __hv_remote_flush_tlb_with_range(kvm,
- kvm_get_vcpu(kvm, 0), range);
- }
-
- spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- return ret;
-}
-static int hv_remote_flush_tlb(struct kvm *kvm)
-{
- return hv_remote_flush_tlb_with_range(kvm, NULL);
-}
-
-static int hv_enable_direct_tlbflush(struct kvm_vcpu *vcpu)
-{
- struct hv_enlightened_vmcs *evmcs;
- struct hv_partition_assist_pg **p_hv_pa_pg =
- &vcpu->kvm->arch.hyperv.hv_pa_pg;
- /*
- * Synthetic VM-Exit is not enabled in current code and so All
- * evmcs in singe VM shares same assist page.
- */
- if (!*p_hv_pa_pg)
- *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
-
- if (!*p_hv_pa_pg)
- return -ENOMEM;
-
- evmcs = (struct hv_enlightened_vmcs *)to_vmx(vcpu)->loaded_vmcs->vmcs;
-
- evmcs->partition_assist_page =
- __pa(*p_hv_pa_pg);
- evmcs->hv_vm_id = (unsigned long)vcpu->kvm;
- evmcs->hv_enlightenments_control.nested_flush_hypercall = 1;
-
- return 0;
-}
-
-#endif /* IS_ENABLED(CONFIG_HYPERV) */
-
-/*
- * Comment's format: document - errata name - stepping - processor name.
- * Refer from
- * https://www.virtualbox.org/svn/vbox/trunk/src/VBox/VMM/VMMR0/HMR0.cpp
- */
-static u32 vmx_preemption_cpu_tfms[] = {
-/* 323344.pdf - BA86 - D0 - Xeon 7500 Series */
-0x000206E6,
-/* 323056.pdf - AAX65 - C2 - Xeon L3406 */
-/* 322814.pdf - AAT59 - C2 - i7-600, i5-500, i5-400 and i3-300 Mobile */
-/* 322911.pdf - AAU65 - C2 - i5-600, i3-500 Desktop and Pentium G6950 */
-0x00020652,
-/* 322911.pdf - AAU65 - K0 - i5-600, i3-500 Desktop and Pentium G6950 */
-0x00020655,
-/* 322373.pdf - AAO95 - B1 - Xeon 3400 Series */
-/* 322166.pdf - AAN92 - B1 - i7-800 and i5-700 Desktop */
-/*
- * 320767.pdf - AAP86 - B1 -
- * i7-900 Mobile Extreme, i7-800 and i7-700 Mobile
- */
-0x000106E5,
-/* 321333.pdf - AAM126 - C0 - Xeon 3500 */
-0x000106A0,
-/* 321333.pdf - AAM126 - C1 - Xeon 3500 */
-0x000106A1,
-/* 320836.pdf - AAJ124 - C0 - i7-900 Desktop Extreme and i7-900 Desktop */
-0x000106A4,
- /* 321333.pdf - AAM126 - D0 - Xeon 3500 */
- /* 321324.pdf - AAK139 - D0 - Xeon 5500 */
- /* 320836.pdf - AAJ124 - D0 - i7-900 Extreme and i7-900 Desktop */
-0x000106A5,
- /* Xeon E3-1220 V2 */
-0x000306A8,
-};
-
-static inline bool cpu_has_broken_vmx_preemption_timer(void)
-{
- u32 eax = cpuid_eax(0x00000001), i;
-
- /* Clear the reserved bits */
- eax &= ~(0x3U << 14 | 0xfU << 28);
- for (i = 0; i < ARRAY_SIZE(vmx_preemption_cpu_tfms); i++)
- if (eax == vmx_preemption_cpu_tfms[i])
- return true;
-
- return false;
-}
-
-static inline bool cpu_need_virtualize_apic_accesses(struct kvm_vcpu *vcpu)
-{
- return flexpriority_enabled && lapic_in_kernel(vcpu);
-}
-
-static inline bool report_flexpriority(void)
-{
- return flexpriority_enabled;
-}
-
-static inline int __find_msr_index(struct vcpu_vmx *vmx, u32 msr)
-{
- int i;
-
- for (i = 0; i < vmx->nmsrs; ++i)
- if (vmx_msr_index[vmx->guest_msrs[i].index] == msr)
- return i;
- return -1;
-}
-
-struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr)
-{
- int i;
-
- i = __find_msr_index(vmx, msr);
- if (i >= 0)
- return &vmx->guest_msrs[i];
- return NULL;
-}
-
-static int vmx_set_guest_msr(struct vcpu_vmx *vmx, struct shared_msr_entry *msr, u64 data)
-{
- int ret = 0;
-
- u64 old_msr_data = msr->data;
- msr->data = data;
- if (msr - vmx->guest_msrs < vmx->save_nmsrs) {
- preempt_disable();
- ret = kvm_set_shared_msr(msr->index, msr->data,
- msr->mask);
- preempt_enable();
- if (ret)
- msr->data = old_msr_data;
- }
- return ret;
-}
-
-void loaded_vmcs_init(struct loaded_vmcs *loaded_vmcs)
-{
- vmcs_clear(loaded_vmcs->vmcs);
- if (loaded_vmcs->shadow_vmcs && loaded_vmcs->launched)
- vmcs_clear(loaded_vmcs->shadow_vmcs);
- loaded_vmcs->cpu = -1;
- loaded_vmcs->launched = 0;
-}
-
-#ifdef CONFIG_KEXEC_CORE
-/*
- * This bitmap is used to indicate whether the vmclear
- * operation is enabled on all cpus. All disabled by
- * default.
- */
-static cpumask_t crash_vmclear_enabled_bitmap = CPU_MASK_NONE;
-
-static inline void crash_enable_local_vmclear(int cpu)
-{
- cpumask_set_cpu(cpu, &crash_vmclear_enabled_bitmap);
-}
-
-static inline void crash_disable_local_vmclear(int cpu)
-{
- cpumask_clear_cpu(cpu, &crash_vmclear_enabled_bitmap);
-}
-
-static inline int crash_local_vmclear_enabled(int cpu)
-{
- return cpumask_test_cpu(cpu, &crash_vmclear_enabled_bitmap);
-}
-
-static void crash_vmclear_local_loaded_vmcss(void)
-{
- int cpu = raw_smp_processor_id();
- struct loaded_vmcs *v;
-
- if (!crash_local_vmclear_enabled(cpu))
- return;
-
- list_for_each_entry(v, &per_cpu(loaded_vmcss_on_cpu, cpu),
- loaded_vmcss_on_cpu_link)
- vmcs_clear(v->vmcs);
-}
-#else
-static inline void crash_enable_local_vmclear(int cpu) { }
-static inline void crash_disable_local_vmclear(int cpu) { }
-#endif /* CONFIG_KEXEC_CORE */
-
-static void __loaded_vmcs_clear(void *arg)
-{
- struct loaded_vmcs *loaded_vmcs = arg;
- int cpu = raw_smp_processor_id();
-
- if (loaded_vmcs->cpu != cpu)
- return; /* vcpu migration can race with cpu offline */
- if (per_cpu(current_vmcs, cpu) == loaded_vmcs->vmcs)
- per_cpu(current_vmcs, cpu) = NULL;
- crash_disable_local_vmclear(cpu);
- list_del(&loaded_vmcs->loaded_vmcss_on_cpu_link);
-
- /*
- * we should ensure updating loaded_vmcs->loaded_vmcss_on_cpu_link
- * is before setting loaded_vmcs->vcpu to -1 which is done in
- * loaded_vmcs_init. Otherwise, other cpu can see vcpu = -1 fist
- * then adds the vmcs into percpu list before it is deleted.
- */
- smp_wmb();
-
- loaded_vmcs_init(loaded_vmcs);
- crash_enable_local_vmclear(cpu);
-}
-
-void loaded_vmcs_clear(struct loaded_vmcs *loaded_vmcs)
-{
- int cpu = loaded_vmcs->cpu;
-
- if (cpu != -1)
- smp_call_function_single(cpu,
- __loaded_vmcs_clear, loaded_vmcs, 1);
-}
-
-static bool vmx_segment_cache_test_set(struct vcpu_vmx *vmx, unsigned seg,
- unsigned field)
-{
- bool ret;
- u32 mask = 1 << (seg * SEG_FIELD_NR + field);
-
- if (!kvm_register_is_available(&vmx->vcpu, VCPU_EXREG_SEGMENTS)) {
- kvm_register_mark_available(&vmx->vcpu, VCPU_EXREG_SEGMENTS);
- vmx->segment_cache.bitmask = 0;
- }
- ret = vmx->segment_cache.bitmask & mask;
- vmx->segment_cache.bitmask |= mask;
- return ret;
-}
-
-static u16 vmx_read_guest_seg_selector(struct vcpu_vmx *vmx, unsigned seg)
-{
- u16 *p = &vmx->segment_cache.seg[seg].selector;
-
- if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_SEL))
- *p = vmcs_read16(kvm_vmx_segment_fields[seg].selector);
- return *p;
-}
-
-static ulong vmx_read_guest_seg_base(struct vcpu_vmx *vmx, unsigned seg)
-{
- ulong *p = &vmx->segment_cache.seg[seg].base;
-
- if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_BASE))
- *p = vmcs_readl(kvm_vmx_segment_fields[seg].base);
- return *p;
-}
-
-static u32 vmx_read_guest_seg_limit(struct vcpu_vmx *vmx, unsigned seg)
-{
- u32 *p = &vmx->segment_cache.seg[seg].limit;
-
- if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_LIMIT))
- *p = vmcs_read32(kvm_vmx_segment_fields[seg].limit);
- return *p;
-}
-
-static u32 vmx_read_guest_seg_ar(struct vcpu_vmx *vmx, unsigned seg)
-{
- u32 *p = &vmx->segment_cache.seg[seg].ar;
-
- if (!vmx_segment_cache_test_set(vmx, seg, SEG_FIELD_AR))
- *p = vmcs_read32(kvm_vmx_segment_fields[seg].ar_bytes);
- return *p;
-}
-
-void update_exception_bitmap(struct kvm_vcpu *vcpu)
-{
- u32 eb;
-
- eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) |
- (1u << DB_VECTOR) | (1u << AC_VECTOR);
- /*
- * Guest access to VMware backdoor ports could legitimately
- * trigger #GP because of TSS I/O permission bitmap.
- * We intercept those #GP and allow access to them anyway
- * as VMware does.
- */
- if (enable_vmware_backdoor)
- eb |= (1u << GP_VECTOR);
- if ((vcpu->guest_debug &
- (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP)) ==
- (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP))
- eb |= 1u << BP_VECTOR;
- if (to_vmx(vcpu)->rmode.vm86_active)
- eb = ~0;
- if (enable_ept)
- eb &= ~(1u << PF_VECTOR); /* bypass_guest_pf = 0 */
-
- /* When we are running a nested L2 guest and L1 specified for it a
- * certain exception bitmap, we must trap the same exceptions and pass
- * them to L1. When running L2, we will only handle the exceptions
- * specified above if L1 did not want them.
- */
- if (is_guest_mode(vcpu))
- eb |= get_vmcs12(vcpu)->exception_bitmap;
-
- vmcs_write32(EXCEPTION_BITMAP, eb);
-}
-
-/*
- * Check if MSR is intercepted for currently loaded MSR bitmap.
- */
-static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
-{
- unsigned long *msr_bitmap;
- int f = sizeof(unsigned long);
-
- if (!cpu_has_vmx_msr_bitmap())
- return true;
-
- msr_bitmap = to_vmx(vcpu)->loaded_vmcs->msr_bitmap;
-
- if (msr <= 0x1fff) {
- return !!test_bit(msr, msr_bitmap + 0x800 / f);
- } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
- msr &= 0x1fff;
- return !!test_bit(msr, msr_bitmap + 0xc00 / f);
- }
-
- return true;
-}
-
-static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
- unsigned long entry, unsigned long exit)
-{
- vm_entry_controls_clearbit(vmx, entry);
- vm_exit_controls_clearbit(vmx, exit);
-}
-
-int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)
-{
- unsigned int i;
-
- for (i = 0; i < m->nr; ++i) {
- if (m->val[i].index == msr)
- return i;
- }
- return -ENOENT;
-}
-
-static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
-{
- int i;
- struct msr_autoload *m = &vmx->msr_autoload;
-
- switch (msr) {
- case MSR_EFER:
- if (cpu_has_load_ia32_efer()) {
- clear_atomic_switch_msr_special(vmx,
- VM_ENTRY_LOAD_IA32_EFER,
- VM_EXIT_LOAD_IA32_EFER);
- return;
- }
- break;
- case MSR_CORE_PERF_GLOBAL_CTRL:
- if (cpu_has_load_perf_global_ctrl()) {
- clear_atomic_switch_msr_special(vmx,
- VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
- VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
- return;
- }
- break;
- }
- i = vmx_find_msr_index(&m->guest, msr);
- if (i < 0)
- goto skip_guest;
- --m->guest.nr;
- m->guest.val[i] = m->guest.val[m->guest.nr];
- vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
-
-skip_guest:
- i = vmx_find_msr_index(&m->host, msr);
- if (i < 0)
- return;
-
- --m->host.nr;
- m->host.val[i] = m->host.val[m->host.nr];
- vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
-}
-
-static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
- unsigned long entry, unsigned long exit,
- unsigned long guest_val_vmcs, unsigned long host_val_vmcs,
- u64 guest_val, u64 host_val)
-{
- vmcs_write64(guest_val_vmcs, guest_val);
- if (host_val_vmcs != HOST_IA32_EFER)
- vmcs_write64(host_val_vmcs, host_val);
- vm_entry_controls_setbit(vmx, entry);
- vm_exit_controls_setbit(vmx, exit);
-}
-
-static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
- u64 guest_val, u64 host_val, bool entry_only)
-{
- int i, j = 0;
- struct msr_autoload *m = &vmx->msr_autoload;
-
- switch (msr) {
- case MSR_EFER:
- if (cpu_has_load_ia32_efer()) {
- add_atomic_switch_msr_special(vmx,
- VM_ENTRY_LOAD_IA32_EFER,
- VM_EXIT_LOAD_IA32_EFER,
- GUEST_IA32_EFER,
- HOST_IA32_EFER,
- guest_val, host_val);
- return;
- }
- break;
- case MSR_CORE_PERF_GLOBAL_CTRL:
- if (cpu_has_load_perf_global_ctrl()) {
- add_atomic_switch_msr_special(vmx,
- VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL,
- VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL,
- GUEST_IA32_PERF_GLOBAL_CTRL,
- HOST_IA32_PERF_GLOBAL_CTRL,
- guest_val, host_val);
- return;
- }
- break;
- case MSR_IA32_PEBS_ENABLE:
- /* PEBS needs a quiescent period after being disabled (to write
- * a record). Disabling PEBS through VMX MSR swapping doesn't
- * provide that period, so a CPU could write host's record into
- * guest's memory.
- */
- wrmsrl(MSR_IA32_PEBS_ENABLE, 0);
- }
-
- i = vmx_find_msr_index(&m->guest, msr);
- if (!entry_only)
- j = vmx_find_msr_index(&m->host, msr);
-
- if ((i < 0 && m->guest.nr == NR_LOADSTORE_MSRS) ||
- (j < 0 && m->host.nr == NR_LOADSTORE_MSRS)) {
- printk_once(KERN_WARNING "Not enough msr switch entries. "
- "Can't add msr %x\n", msr);
- return;
- }
- if (i < 0) {
- i = m->guest.nr++;
- vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, m->guest.nr);
- }
- m->guest.val[i].index = msr;
- m->guest.val[i].value = guest_val;
-
- if (entry_only)
- return;
-
- if (j < 0) {
- j = m->host.nr++;
- vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, m->host.nr);
- }
- m->host.val[j].index = msr;
- m->host.val[j].value = host_val;
-}
-
-static bool update_transition_efer(struct vcpu_vmx *vmx, int efer_offset)
-{
- u64 guest_efer = vmx->vcpu.arch.efer;
- u64 ignore_bits = 0;
-
- /* Shadow paging assumes NX to be available. */
- if (!enable_ept)
- guest_efer |= EFER_NX;
-
- /*
- * LMA and LME handled by hardware; SCE meaningless outside long mode.
- */
- ignore_bits |= EFER_SCE;
-#ifdef CONFIG_X86_64
- ignore_bits |= EFER_LMA | EFER_LME;
- /* SCE is meaningful only in long mode on Intel */
- if (guest_efer & EFER_LMA)
- ignore_bits &= ~(u64)EFER_SCE;
-#endif
-
- /*
- * On EPT, we can't emulate NX, so we must switch EFER atomically.
- * On CPUs that support "load IA32_EFER", always switch EFER
- * atomically, since it's faster than switching it manually.
- */
- if (cpu_has_load_ia32_efer() ||
- (enable_ept && ((vmx->vcpu.arch.efer ^ host_efer) & EFER_NX))) {
- if (!(guest_efer & EFER_LMA))
- guest_efer &= ~EFER_LME;
- if (guest_efer != host_efer)
- add_atomic_switch_msr(vmx, MSR_EFER,
- guest_efer, host_efer, false);
- else
- clear_atomic_switch_msr(vmx, MSR_EFER);
- return false;
- } else {
- clear_atomic_switch_msr(vmx, MSR_EFER);
-
- guest_efer &= ~ignore_bits;
- guest_efer |= host_efer & ignore_bits;
-
- vmx->guest_msrs[efer_offset].data = guest_efer;
- vmx->guest_msrs[efer_offset].mask = ~ignore_bits;
-
- return true;
- }
-}
-
-#ifdef CONFIG_X86_32
-/*
- * On 32-bit kernels, VM exits still load the FS and GS bases from the
- * VMCS rather than the segment table. KVM uses this helper to figure
- * out the current bases to poke them into the VMCS before entry.
- */
-static unsigned long segment_base(u16 selector)
-{
- struct desc_struct *table;
- unsigned long v;
-
- if (!(selector & ~SEGMENT_RPL_MASK))
- return 0;
-
- table = get_current_gdt_ro();
-
- if ((selector & SEGMENT_TI_MASK) == SEGMENT_LDT) {
- u16 ldt_selector = kvm_read_ldt();
-
- if (!(ldt_selector & ~SEGMENT_RPL_MASK))
- return 0;
-
- table = (struct desc_struct *)segment_base(ldt_selector);
- }
- v = get_desc_base(&table[selector >> 3]);
- return v;
-}
-#endif
-
-static inline void pt_load_msr(struct pt_ctx *ctx, u32 addr_range)
-{
- u32 i;
-
- wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
- wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
- wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
- wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
- for (i = 0; i < addr_range; i++) {
- wrmsrl(MSR_IA32_RTIT_ADDR0_A + i * 2, ctx->addr_a[i]);
- wrmsrl(MSR_IA32_RTIT_ADDR0_B + i * 2, ctx->addr_b[i]);
- }
-}
-
-static inline void pt_save_msr(struct pt_ctx *ctx, u32 addr_range)
-{
- u32 i;
-
- rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
- rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
- rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
- rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
- for (i = 0; i < addr_range; i++) {
- rdmsrl(MSR_IA32_RTIT_ADDR0_A + i * 2, ctx->addr_a[i]);
- rdmsrl(MSR_IA32_RTIT_ADDR0_B + i * 2, ctx->addr_b[i]);
- }
-}
-
-static void pt_guest_enter(struct vcpu_vmx *vmx)
-{
- if (pt_mode == PT_MODE_SYSTEM)
- return;
-
- /*
- * GUEST_IA32_RTIT_CTL is already set in the VMCS.
- * Save host state before VM entry.
- */
- rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
- if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
- wrmsrl(MSR_IA32_RTIT_CTL, 0);
- pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.addr_range);
- pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.addr_range);
- }
-}
-
-static void pt_guest_exit(struct vcpu_vmx *vmx)
-{
- if (pt_mode == PT_MODE_SYSTEM)
- return;
-
- if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
- pt_save_msr(&vmx->pt_desc.guest, vmx->pt_desc.addr_range);
- pt_load_msr(&vmx->pt_desc.host, vmx->pt_desc.addr_range);
- }
-
- /* Reload host state (IA32_RTIT_CTL will be cleared on VM exit). */
- wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
-}
-
-void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
- unsigned long fs_base, unsigned long gs_base)
-{
- if (unlikely(fs_sel != host->fs_sel)) {
- if (!(fs_sel & 7))
- vmcs_write16(HOST_FS_SELECTOR, fs_sel);
- else
- vmcs_write16(HOST_FS_SELECTOR, 0);
- host->fs_sel = fs_sel;
- }
- if (unlikely(gs_sel != host->gs_sel)) {
- if (!(gs_sel & 7))
- vmcs_write16(HOST_GS_SELECTOR, gs_sel);
- else
- vmcs_write16(HOST_GS_SELECTOR, 0);
- host->gs_sel = gs_sel;
- }
- if (unlikely(fs_base != host->fs_base)) {
- vmcs_writel(HOST_FS_BASE, fs_base);
- host->fs_base = fs_base;
- }
- if (unlikely(gs_base != host->gs_base)) {
- vmcs_writel(HOST_GS_BASE, gs_base);
- host->gs_base = gs_base;
- }
-}
-
-void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct vmcs_host_state *host_state;
-#ifdef CONFIG_X86_64
- int cpu = raw_smp_processor_id();
-#endif
- unsigned long fs_base, gs_base;
- u16 fs_sel, gs_sel;
- int i;
-
- vmx->req_immediate_exit = false;
-
- /*
- * Note that guest MSRs to be saved/restored can also be changed
- * when guest state is loaded. This happens when guest transitions
- * to/from long-mode by setting MSR_EFER.LMA.
- */
- if (!vmx->guest_msrs_ready) {
- vmx->guest_msrs_ready = true;
- for (i = 0; i < vmx->save_nmsrs; ++i)
- kvm_set_shared_msr(vmx->guest_msrs[i].index,
- vmx->guest_msrs[i].data,
- vmx->guest_msrs[i].mask);
-
- }
- if (vmx->guest_state_loaded)
- return;
-
- host_state = &vmx->loaded_vmcs->host_state;
-
- /*
- * Set host fs and gs selectors. Unfortunately, 22.2.3 does not
- * allow segment selectors with cpl > 0 or ti == 1.
- */
- host_state->ldt_sel = kvm_read_ldt();
-
-#ifdef CONFIG_X86_64
- savesegment(ds, host_state->ds_sel);
- savesegment(es, host_state->es_sel);
-
- gs_base = cpu_kernelmode_gs_base(cpu);
- if (likely(is_64bit_mm(current->mm))) {
- save_fsgs_for_kvm();
- fs_sel = current->thread.fsindex;
- gs_sel = current->thread.gsindex;
- fs_base = current->thread.fsbase;
- vmx->msr_host_kernel_gs_base = current->thread.gsbase;
- } else {
- savesegment(fs, fs_sel);
- savesegment(gs, gs_sel);
- fs_base = read_msr(MSR_FS_BASE);
- vmx->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
- }
-
- wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
-#else
- savesegment(fs, fs_sel);
- savesegment(gs, gs_sel);
- fs_base = segment_base(fs_sel);
- gs_base = segment_base(gs_sel);
-#endif
-
- vmx_set_host_fs_gs(host_state, fs_sel, gs_sel, fs_base, gs_base);
- vmx->guest_state_loaded = true;
-}
-
-static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
-{
- struct vmcs_host_state *host_state;
-
- if (!vmx->guest_state_loaded)
- return;
-
- host_state = &vmx->loaded_vmcs->host_state;
-
- ++vmx->vcpu.stat.host_state_reload;
-
-#ifdef CONFIG_X86_64
- rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
-#endif
- if (host_state->ldt_sel || (host_state->gs_sel & 7)) {
- kvm_load_ldt(host_state->ldt_sel);
-#ifdef CONFIG_X86_64
- load_gs_index(host_state->gs_sel);
-#else
- loadsegment(gs, host_state->gs_sel);
-#endif
- }
- if (host_state->fs_sel & 7)
- loadsegment(fs, host_state->fs_sel);
-#ifdef CONFIG_X86_64
- if (unlikely(host_state->ds_sel | host_state->es_sel)) {
- loadsegment(ds, host_state->ds_sel);
- loadsegment(es, host_state->es_sel);
- }
-#endif
- invalidate_tss_limit();
-#ifdef CONFIG_X86_64
- wrmsrl(MSR_KERNEL_GS_BASE, vmx->msr_host_kernel_gs_base);
-#endif
- load_fixmap_gdt(raw_smp_processor_id());
- vmx->guest_state_loaded = false;
- vmx->guest_msrs_ready = false;
-}
-
-#ifdef CONFIG_X86_64
-static u64 vmx_read_guest_kernel_gs_base(struct vcpu_vmx *vmx)
-{
- preempt_disable();
- if (vmx->guest_state_loaded)
- rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
- preempt_enable();
- return vmx->msr_guest_kernel_gs_base;
-}
-
-static void vmx_write_guest_kernel_gs_base(struct vcpu_vmx *vmx, u64 data)
-{
- preempt_disable();
- if (vmx->guest_state_loaded)
- wrmsrl(MSR_KERNEL_GS_BASE, data);
- preempt_enable();
- vmx->msr_guest_kernel_gs_base = data;
-}
-#endif
-
-static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
-{
- struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
- struct pi_desc old, new;
- unsigned int dest;
-
- /*
- * In case of hot-plug or hot-unplug, we may have to undo
- * vmx_vcpu_pi_put even if there is no assigned device. And we
- * always keep PI.NDST up to date for simplicity: it makes the
- * code easier, and CPU migration is not a fast path.
- */
- if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
- return;
-
- /*
- * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
- * PI.NDST: pi_post_block is the one expected to change PID.NDST and the
- * wakeup handler expects the vCPU to be on the blocked_vcpu_list that
- * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up
- * correctly.
- */
- if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) {
- pi_clear_sn(pi_desc);
- goto after_clear_sn;
- }
-
- /* The full case. */
- do {
- old.control = new.control = pi_desc->control;
-
- dest = cpu_physical_id(cpu);
-
- if (x2apic_enabled())
- new.ndst = dest;
- else
- new.ndst = (dest << 8) & 0xFF00;
-
- new.sn = 0;
- } while (cmpxchg64(&pi_desc->control, old.control,
- new.control) != old.control);
-
-after_clear_sn:
-
- /*
- * Clear SN before reading the bitmap. The VT-d firmware
- * writes the bitmap and reads SN atomically (5.2.3 in the
- * spec), so it doesn't really have a memory barrier that
- * pairs with this, but we cannot do that and we need one.
- */
- smp_mb__after_atomic();
-
- if (!pi_is_pir_empty(pi_desc))
- pi_set_on(pi_desc);
-}
-
-void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- bool already_loaded = vmx->loaded_vmcs->cpu == cpu;
-
- if (!already_loaded) {
- loaded_vmcs_clear(vmx->loaded_vmcs);
- local_irq_disable();
- crash_disable_local_vmclear(cpu);
-
- /*
- * Read loaded_vmcs->cpu should be before fetching
- * loaded_vmcs->loaded_vmcss_on_cpu_link.
- * See the comments in __loaded_vmcs_clear().
- */
- smp_rmb();
-
- list_add(&vmx->loaded_vmcs->loaded_vmcss_on_cpu_link,
- &per_cpu(loaded_vmcss_on_cpu, cpu));
- crash_enable_local_vmclear(cpu);
- local_irq_enable();
- }
-
- if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
- per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
- vmcs_load(vmx->loaded_vmcs->vmcs);
- indirect_branch_prediction_barrier();
- }
-
- if (!already_loaded) {
- void *gdt = get_current_gdt_ro();
- unsigned long sysenter_esp;
-
- kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
-
- /*
- * Linux uses per-cpu TSS and GDT, so set these when switching
- * processors. See 22.2.4.
- */
- vmcs_writel(HOST_TR_BASE,
- (unsigned long)&get_cpu_entry_area(cpu)->tss.x86_tss);
- vmcs_writel(HOST_GDTR_BASE, (unsigned long)gdt); /* 22.2.4 */
-
- rdmsrl(MSR_IA32_SYSENTER_ESP, sysenter_esp);
- vmcs_writel(HOST_IA32_SYSENTER_ESP, sysenter_esp); /* 22.2.3 */
-
- vmx->loaded_vmcs->cpu = cpu;
- }
-
- /* Setup TSC multiplier */
- if (kvm_has_tsc_control &&
- vmx->current_tsc_ratio != vcpu->arch.tsc_scaling_ratio)
- decache_tsc_multiplier(vmx);
-}
-
-/*
- * Switches to specified vcpu, until a matching vcpu_put(), but assumes
- * vcpu mutex is already taken.
- */
-void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- vmx_vcpu_load_vmcs(vcpu, cpu);
-
- vmx_vcpu_pi_load(vcpu, cpu);
-
- vmx->host_pkru = read_pkru();
- vmx->host_debugctlmsr = get_debugctlmsr();
-}
-
-static void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu)
-{
- struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
-
- if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
- !irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(vcpu))
- return;
-
- /* Set SN when the vCPU is preempted */
- if (vcpu->preempted)
- pi_set_sn(pi_desc);
-}
-
-static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
-{
- vmx_vcpu_pi_put(vcpu);
-
- vmx_prepare_switch_to_host(to_vmx(vcpu));
-}
-
-static bool emulation_required(struct kvm_vcpu *vcpu)
-{
- return emulate_invalid_guest_state && !guest_state_valid(vcpu);
-}
-
-static void vmx_decache_cr0_guest_bits(struct kvm_vcpu *vcpu);
-
-unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long rflags, save_rflags;
-
- if (!kvm_register_is_available(vcpu, VCPU_EXREG_RFLAGS)) {
- kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS);
- rflags = vmcs_readl(GUEST_RFLAGS);
- if (vmx->rmode.vm86_active) {
- rflags &= RMODE_GUEST_OWNED_EFLAGS_BITS;
- save_rflags = vmx->rmode.save_rflags;
- rflags |= save_rflags & ~RMODE_GUEST_OWNED_EFLAGS_BITS;
- }
- vmx->rflags = rflags;
- }
- return vmx->rflags;
-}
-
-void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long old_rflags;
-
- if (enable_unrestricted_guest) {
- kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS);
- vmx->rflags = rflags;
- vmcs_writel(GUEST_RFLAGS, rflags);
- return;
- }
-
- old_rflags = vmx_get_rflags(vcpu);
- vmx->rflags = rflags;
- if (vmx->rmode.vm86_active) {
- vmx->rmode.save_rflags = rflags;
- rflags |= X86_EFLAGS_IOPL | X86_EFLAGS_VM;
- }
- vmcs_writel(GUEST_RFLAGS, rflags);
-
- if ((old_rflags ^ vmx->rflags) & X86_EFLAGS_VM)
- vmx->emulation_required = emulation_required(vcpu);
-}
-
-u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu)
-{
- u32 interruptibility = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
- int ret = 0;
-
- if (interruptibility & GUEST_INTR_STATE_STI)
- ret |= KVM_X86_SHADOW_INT_STI;
- if (interruptibility & GUEST_INTR_STATE_MOV_SS)
- ret |= KVM_X86_SHADOW_INT_MOV_SS;
-
- return ret;
-}
-
-void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
-{
- u32 interruptibility_old = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
- u32 interruptibility = interruptibility_old;
-
- interruptibility &= ~(GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS);
-
- if (mask & KVM_X86_SHADOW_INT_MOV_SS)
- interruptibility |= GUEST_INTR_STATE_MOV_SS;
- else if (mask & KVM_X86_SHADOW_INT_STI)
- interruptibility |= GUEST_INTR_STATE_STI;
-
- if ((interruptibility != interruptibility_old))
- vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, interruptibility);
-}
-
-static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long value;
-
- /*
- * Any MSR write that attempts to change bits marked reserved will
- * case a #GP fault.
- */
- if (data & vmx->pt_desc.ctl_bitmask)
- return 1;
-
- /*
- * Any attempt to modify IA32_RTIT_CTL while TraceEn is set will
- * result in a #GP unless the same write also clears TraceEn.
- */
- if ((vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) &&
- ((vmx->pt_desc.guest.ctl ^ data) & ~RTIT_CTL_TRACEEN))
- return 1;
-
- /*
- * WRMSR to IA32_RTIT_CTL that sets TraceEn but clears this bit
- * and FabricEn would cause #GP, if
- * CPUID.(EAX=14H, ECX=0):ECX.SNGLRGNOUT[bit 2] = 0
- */
- if ((data & RTIT_CTL_TRACEEN) && !(data & RTIT_CTL_TOPA) &&
- !(data & RTIT_CTL_FABRIC_EN) &&
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_single_range_output))
- return 1;
-
- /*
- * MTCFreq, CycThresh and PSBFreq encodings check, any MSR write that
- * utilize encodings marked reserved will casue a #GP fault.
- */
- value = intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_mtc_periods);
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_mtc) &&
- !test_bit((data & RTIT_CTL_MTC_RANGE) >>
- RTIT_CTL_MTC_RANGE_OFFSET, &value))
- return 1;
- value = intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_cycle_thresholds);
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_psb_cyc) &&
- !test_bit((data & RTIT_CTL_CYC_THRESH) >>
- RTIT_CTL_CYC_THRESH_OFFSET, &value))
- return 1;
- value = intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_psb_periods);
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_psb_cyc) &&
- !test_bit((data & RTIT_CTL_PSB_FREQ) >>
- RTIT_CTL_PSB_FREQ_OFFSET, &value))
- return 1;
-
- /*
- * If ADDRx_CFG is reserved or the encodings is >2 will
- * cause a #GP fault.
- */
- value = (data & RTIT_CTL_ADDR0) >> RTIT_CTL_ADDR0_OFFSET;
- if ((value && (vmx->pt_desc.addr_range < 1)) || (value > 2))
- return 1;
- value = (data & RTIT_CTL_ADDR1) >> RTIT_CTL_ADDR1_OFFSET;
- if ((value && (vmx->pt_desc.addr_range < 2)) || (value > 2))
- return 1;
- value = (data & RTIT_CTL_ADDR2) >> RTIT_CTL_ADDR2_OFFSET;
- if ((value && (vmx->pt_desc.addr_range < 3)) || (value > 2))
- return 1;
- value = (data & RTIT_CTL_ADDR3) >> RTIT_CTL_ADDR3_OFFSET;
- if ((value && (vmx->pt_desc.addr_range < 4)) || (value > 2))
- return 1;
-
- return 0;
-}
-
-static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
-{
- unsigned long rip;
-
- /*
- * Using VMCS.VM_EXIT_INSTRUCTION_LEN on EPT misconfig depends on
- * undefined behavior: Intel's SDM doesn't mandate the VMCS field be
- * set when EPT misconfig occurs. In practice, real hardware updates
- * VM_EXIT_INSTRUCTION_LEN on EPT misconfig, but other hypervisors
- * (namely Hyper-V) don't set it due to it being undefined behavior,
- * i.e. we end up advancing IP with some random value.
- */
- if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
- to_vmx(vcpu)->exit_reason != EXIT_REASON_EPT_MISCONFIG) {
- rip = kvm_rip_read(vcpu);
- rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
- kvm_rip_write(vcpu, rip);
- } else {
- if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP))
- return 0;
- }
-
- /* skipping an emulated instruction also counts */
- vmx_set_interrupt_shadow(vcpu, 0);
-
- return 1;
-}
-
-static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
-{
- /*
- * Ensure that we clear the HLT state in the VMCS. We don't need to
- * explicitly skip the instruction because if the HLT state is set,
- * then the instruction is already executing and RIP has already been
- * advanced.
- */
- if (kvm_hlt_in_guest(vcpu->kvm) &&
- vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
- vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
-}
-
-static void vmx_queue_exception(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned nr = vcpu->arch.exception.nr;
- bool has_error_code = vcpu->arch.exception.has_error_code;
- u32 error_code = vcpu->arch.exception.error_code;
- u32 intr_info = nr | INTR_INFO_VALID_MASK;
-
- kvm_deliver_exception_payload(vcpu);
-
- if (has_error_code) {
- vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
- intr_info |= INTR_INFO_DELIVER_CODE_MASK;
- }
-
- if (vmx->rmode.vm86_active) {
- int inc_eip = 0;
- if (kvm_exception_is_soft(nr))
- inc_eip = vcpu->arch.event_exit_inst_len;
- kvm_inject_realmode_interrupt(vcpu, nr, inc_eip);
- return;
- }
-
- WARN_ON_ONCE(vmx->emulation_required);
-
- if (kvm_exception_is_soft(nr)) {
- vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
- vmx->vcpu.arch.event_exit_inst_len);
- intr_info |= INTR_TYPE_SOFT_EXCEPTION;
- } else
- intr_info |= INTR_TYPE_HARD_EXCEPTION;
-
- vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
-
- vmx_clear_hlt(vcpu);
-}
-
-static bool vmx_rdtscp_supported(void)
-{
- return cpu_has_vmx_rdtscp();
-}
-
-static bool vmx_invpcid_supported(void)
-{
- return cpu_has_vmx_invpcid();
-}
-
-/*
- * Swap MSR entry in host/guest MSR entry array.
- */
-static void move_msr_up(struct vcpu_vmx *vmx, int from, int to)
-{
- struct shared_msr_entry tmp;
-
- tmp = vmx->guest_msrs[to];
- vmx->guest_msrs[to] = vmx->guest_msrs[from];
- vmx->guest_msrs[from] = tmp;
-}
-
-/*
- * Set up the vmcs to automatically save and restore system
- * msrs. Don't touch the 64-bit msrs if the guest is in legacy
- * mode, as fiddling with msrs is very expensive.
- */
-static void setup_msrs(struct vcpu_vmx *vmx)
-{
- int save_nmsrs, index;
-
- save_nmsrs = 0;
-#ifdef CONFIG_X86_64
- /*
- * The SYSCALL MSRs are only needed on long mode guests, and only
- * when EFER.SCE is set.
- */
- if (is_long_mode(&vmx->vcpu) && (vmx->vcpu.arch.efer & EFER_SCE)) {
- index = __find_msr_index(vmx, MSR_STAR);
- if (index >= 0)
- move_msr_up(vmx, index, save_nmsrs++);
- index = __find_msr_index(vmx, MSR_LSTAR);
- if (index >= 0)
- move_msr_up(vmx, index, save_nmsrs++);
- index = __find_msr_index(vmx, MSR_SYSCALL_MASK);
- if (index >= 0)
- move_msr_up(vmx, index, save_nmsrs++);
- }
-#endif
- index = __find_msr_index(vmx, MSR_EFER);
- if (index >= 0 && update_transition_efer(vmx, index))
- move_msr_up(vmx, index, save_nmsrs++);
- index = __find_msr_index(vmx, MSR_TSC_AUX);
- if (index >= 0 && guest_cpuid_has(&vmx->vcpu, X86_FEATURE_RDTSCP))
- move_msr_up(vmx, index, save_nmsrs++);
- index = __find_msr_index(vmx, MSR_IA32_TSX_CTRL);
- if (index >= 0)
- move_msr_up(vmx, index, save_nmsrs++);
-
- vmx->save_nmsrs = save_nmsrs;
- vmx->guest_msrs_ready = false;
-
- if (cpu_has_vmx_msr_bitmap())
- vmx_update_msr_bitmap(&vmx->vcpu);
-}
-
-static u64 vmx_read_l1_tsc_offset(struct kvm_vcpu *vcpu)
-{
- struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
-
- if (is_guest_mode(vcpu) &&
- (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETTING))
- return vcpu->arch.tsc_offset - vmcs12->tsc_offset;
-
- return vcpu->arch.tsc_offset;
-}
-
-static u64 vmx_write_l1_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
-{
- struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
- u64 g_tsc_offset = 0;
-
- /*
- * We're here if L1 chose not to trap WRMSR to TSC. According
- * to the spec, this should set L1's TSC; The offset that L1
- * set for L2 remains unchanged, and still needs to be added
- * to the newly set TSC to get L2's TSC.
- */
- if (is_guest_mode(vcpu) &&
- (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETTING))
- g_tsc_offset = vmcs12->tsc_offset;
-
- trace_kvm_write_tsc_offset(vcpu->vcpu_id,
- vcpu->arch.tsc_offset - g_tsc_offset,
- offset);
- vmcs_write64(TSC_OFFSET, offset + g_tsc_offset);
- return offset + g_tsc_offset;
-}
-
-/*
- * nested_vmx_allowed() checks whether a guest should be allowed to use VMX
- * instructions and MSRs (i.e., nested VMX). Nested VMX is disabled for
- * all guests if the "nested" module option is off, and can also be disabled
- * for a single guest by disabling its VMX cpuid bit.
- */
-bool nested_vmx_allowed(struct kvm_vcpu *vcpu)
-{
- return nested && guest_cpuid_has(vcpu, X86_FEATURE_VMX);
-}
-
-static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
- uint64_t val)
-{
- uint64_t valid_bits = to_vmx(vcpu)->msr_ia32_feature_control_valid_bits;
-
- return !(val & ~valid_bits);
-}
-
-static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
-{
- switch (msr->index) {
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
- if (!nested)
- return 1;
- return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
- default:
- return 1;
- }
-}
-
-/*
- * Reads an msr value (of 'msr_index') into 'pdata'.
- * Returns 0 on success, non-0 otherwise.
- * Assumes vcpu_load() was already called.
- */
-static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct shared_msr_entry *msr;
- u32 index;
-
- switch (msr_info->index) {
-#ifdef CONFIG_X86_64
- case MSR_FS_BASE:
- msr_info->data = vmcs_readl(GUEST_FS_BASE);
- break;
- case MSR_GS_BASE:
- msr_info->data = vmcs_readl(GUEST_GS_BASE);
- break;
- case MSR_KERNEL_GS_BASE:
- msr_info->data = vmx_read_guest_kernel_gs_base(vmx);
- break;
-#endif
- case MSR_EFER:
- return kvm_get_msr_common(vcpu, msr_info);
- case MSR_IA32_TSX_CTRL:
- if (!msr_info->host_initiated &&
- !(vcpu->arch.arch_capabilities & ARCH_CAP_TSX_CTRL_MSR))
- return 1;
- goto find_shared_msr;
- case MSR_IA32_UMWAIT_CONTROL:
- if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
- return 1;
-
- msr_info->data = vmx->msr_ia32_umwait_control;
- break;
- case MSR_IA32_SPEC_CTRL:
- if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
- return 1;
-
- msr_info->data = to_vmx(vcpu)->spec_ctrl;
- break;
- case MSR_IA32_SYSENTER_CS:
- msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
- break;
- case MSR_IA32_SYSENTER_EIP:
- msr_info->data = vmcs_readl(GUEST_SYSENTER_EIP);
- break;
- case MSR_IA32_SYSENTER_ESP:
- msr_info->data = vmcs_readl(GUEST_SYSENTER_ESP);
- break;
- case MSR_IA32_BNDCFGS:
- if (!kvm_mpx_supported() ||
- (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
- return 1;
- msr_info->data = vmcs_read64(GUEST_BNDCFGS);
- break;
- case MSR_IA32_MCG_EXT_CTL:
- if (!msr_info->host_initiated &&
- !(vmx->msr_ia32_feature_control &
- FEATURE_CONTROL_LMCE))
- return 1;
- msr_info->data = vcpu->arch.mcg_ext_ctl;
- break;
- case MSR_IA32_FEATURE_CONTROL:
- msr_info->data = vmx->msr_ia32_feature_control;
- break;
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
- if (!nested_vmx_allowed(vcpu))
- return 1;
- return vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
- &msr_info->data);
- case MSR_IA32_RTIT_CTL:
- if (pt_mode != PT_MODE_HOST_GUEST)
- return 1;
- msr_info->data = vmx->pt_desc.guest.ctl;
- break;
- case MSR_IA32_RTIT_STATUS:
- if (pt_mode != PT_MODE_HOST_GUEST)
- return 1;
- msr_info->data = vmx->pt_desc.guest.status;
- break;
- case MSR_IA32_RTIT_CR3_MATCH:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_cr3_filtering))
- return 1;
- msr_info->data = vmx->pt_desc.guest.cr3_match;
- break;
- case MSR_IA32_RTIT_OUTPUT_BASE:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (!intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_topa_output) &&
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_single_range_output)))
- return 1;
- msr_info->data = vmx->pt_desc.guest.output_base;
- break;
- case MSR_IA32_RTIT_OUTPUT_MASK:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (!intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_topa_output) &&
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_single_range_output)))
- return 1;
- msr_info->data = vmx->pt_desc.guest.output_mask;
- break;
- case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
- index = msr_info->index - MSR_IA32_RTIT_ADDR0_A;
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (index >= 2 * intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_num_address_ranges)))
- return 1;
- if (is_noncanonical_address(data, vcpu))
- return 1;
- if (index % 2)
- msr_info->data = vmx->pt_desc.guest.addr_b[index / 2];
- else
- msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
- break;
- case MSR_TSC_AUX:
- if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
- return 1;
- goto find_shared_msr;
- default:
- find_shared_msr:
- msr = find_msr_entry(vmx, msr_info->index);
- if (msr) {
- msr_info->data = msr->data;
- break;
- }
- return kvm_get_msr_common(vcpu, msr_info);
- }
-
- return 0;
-}
-
-/*
- * Writes msr value into the appropriate "register".
- * Returns 0 on success, non-0 otherwise.
- * Assumes vcpu_load() was already called.
- */
-static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct shared_msr_entry *msr;
- int ret = 0;
- u32 msr_index = msr_info->index;
- u64 data = msr_info->data;
- u32 index;
-
- switch (msr_index) {
- case MSR_EFER:
- ret = kvm_set_msr_common(vcpu, msr_info);
- break;
-#ifdef CONFIG_X86_64
- case MSR_FS_BASE:
- vmx_segment_cache_clear(vmx);
- vmcs_writel(GUEST_FS_BASE, data);
- break;
- case MSR_GS_BASE:
- vmx_segment_cache_clear(vmx);
- vmcs_writel(GUEST_GS_BASE, data);
- break;
- case MSR_KERNEL_GS_BASE:
- vmx_write_guest_kernel_gs_base(vmx, data);
- break;
-#endif
- case MSR_IA32_SYSENTER_CS:
- if (is_guest_mode(vcpu))
- get_vmcs12(vcpu)->guest_sysenter_cs = data;
- vmcs_write32(GUEST_SYSENTER_CS, data);
- break;
- case MSR_IA32_SYSENTER_EIP:
- if (is_guest_mode(vcpu))
- get_vmcs12(vcpu)->guest_sysenter_eip = data;
- vmcs_writel(GUEST_SYSENTER_EIP, data);
- break;
- case MSR_IA32_SYSENTER_ESP:
- if (is_guest_mode(vcpu))
- get_vmcs12(vcpu)->guest_sysenter_esp = data;
- vmcs_writel(GUEST_SYSENTER_ESP, data);
- break;
- case MSR_IA32_DEBUGCTLMSR:
- if (is_guest_mode(vcpu) && get_vmcs12(vcpu)->vm_exit_controls &
- VM_EXIT_SAVE_DEBUG_CONTROLS)
- get_vmcs12(vcpu)->guest_ia32_debugctl = data;
-
- ret = kvm_set_msr_common(vcpu, msr_info);
- break;
-
- case MSR_IA32_BNDCFGS:
- if (!kvm_mpx_supported() ||
- (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_MPX)))
- return 1;
- if (is_noncanonical_address(data & PAGE_MASK, vcpu) ||
- (data & MSR_IA32_BNDCFGS_RSVD))
- return 1;
- vmcs_write64(GUEST_BNDCFGS, data);
- break;
- case MSR_IA32_UMWAIT_CONTROL:
- if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
- return 1;
-
- /* The reserved bit 1 and non-32 bit [63:32] should be zero */
- if (data & (BIT_ULL(1) | GENMASK_ULL(63, 32)))
- return 1;
-
- vmx->msr_ia32_umwait_control = data;
- break;
- case MSR_IA32_SPEC_CTRL:
- if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
- return 1;
-
- /* The STIBP bit doesn't fault even if it's not advertised */
- if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD))
- return 1;
-
- vmx->spec_ctrl = data;
-
- if (!data)
- break;
-
- /*
- * For non-nested:
- * When it's written (to non-zero) for the first time, pass
- * it through.
- *
- * For nested:
- * The handling of the MSR bitmap for L2 guests is done in
- * nested_vmx_prepare_msr_bitmap. We should not touch the
- * vmcs02.msr_bitmap here since it gets completely overwritten
- * in the merging. We update the vmcs01 here for L1 as well
- * since it will end up touching the MSR anyway now.
- */
- vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
- MSR_IA32_SPEC_CTRL,
- MSR_TYPE_RW);
- break;
- case MSR_IA32_TSX_CTRL:
- if (!msr_info->host_initiated &&
- !(vcpu->arch.arch_capabilities & ARCH_CAP_TSX_CTRL_MSR))
- return 1;
- if (data & ~(TSX_CTRL_RTM_DISABLE | TSX_CTRL_CPUID_CLEAR))
- return 1;
- goto find_shared_msr;
- case MSR_IA32_PRED_CMD:
- if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
- return 1;
-
- if (data & ~PRED_CMD_IBPB)
- return 1;
-
- if (!data)
- break;
-
- wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
-
- /*
- * For non-nested:
- * When it's written (to non-zero) for the first time, pass
- * it through.
- *
- * For nested:
- * The handling of the MSR bitmap for L2 guests is done in
- * nested_vmx_prepare_msr_bitmap. We should not touch the
- * vmcs02.msr_bitmap here since it gets completely overwritten
- * in the merging.
- */
- vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
- MSR_TYPE_W);
- break;
- case MSR_IA32_CR_PAT:
- if (!kvm_pat_valid(data))
- return 1;
-
- if (is_guest_mode(vcpu) &&
- get_vmcs12(vcpu)->vm_exit_controls & VM_EXIT_SAVE_IA32_PAT)
- get_vmcs12(vcpu)->guest_ia32_pat = data;
-
- if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
- vmcs_write64(GUEST_IA32_PAT, data);
- vcpu->arch.pat = data;
- break;
- }
- ret = kvm_set_msr_common(vcpu, msr_info);
- break;
- case MSR_IA32_TSC_ADJUST:
- ret = kvm_set_msr_common(vcpu, msr_info);
- break;
- case MSR_IA32_MCG_EXT_CTL:
- if ((!msr_info->host_initiated &&
- !(to_vmx(vcpu)->msr_ia32_feature_control &
- FEATURE_CONTROL_LMCE)) ||
- (data & ~MCG_EXT_CTL_LMCE_EN))
- return 1;
- vcpu->arch.mcg_ext_ctl = data;
- break;
- case MSR_IA32_FEATURE_CONTROL:
- if (!vmx_feature_control_msr_valid(vcpu, data) ||
- (to_vmx(vcpu)->msr_ia32_feature_control &
- FEATURE_CONTROL_LOCKED && !msr_info->host_initiated))
- return 1;
- vmx->msr_ia32_feature_control = data;
- if (msr_info->host_initiated && data == 0)
- vmx_leave_nested(vcpu);
- break;
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
- if (!msr_info->host_initiated)
- return 1; /* they are read-only */
- if (!nested_vmx_allowed(vcpu))
- return 1;
- return vmx_set_vmx_msr(vcpu, msr_index, data);
- case MSR_IA32_RTIT_CTL:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- vmx_rtit_ctl_check(vcpu, data) ||
- vmx->nested.vmxon)
- return 1;
- vmcs_write64(GUEST_IA32_RTIT_CTL, data);
- vmx->pt_desc.guest.ctl = data;
- pt_update_intercept_for_msr(vmx);
- break;
- case MSR_IA32_RTIT_STATUS:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
- (data & MSR_IA32_RTIT_STATUS_MASK))
- return 1;
- vmx->pt_desc.guest.status = data;
- break;
- case MSR_IA32_RTIT_CR3_MATCH:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_cr3_filtering))
- return 1;
- vmx->pt_desc.guest.cr3_match = data;
- break;
- case MSR_IA32_RTIT_OUTPUT_BASE:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
- (!intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_topa_output) &&
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_single_range_output)) ||
- (data & MSR_IA32_RTIT_OUTPUT_BASE_MASK))
- return 1;
- vmx->pt_desc.guest.output_base = data;
- break;
- case MSR_IA32_RTIT_OUTPUT_MASK:
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
- (!intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_topa_output) &&
- !intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_single_range_output)))
- return 1;
- vmx->pt_desc.guest.output_mask = data;
- break;
- case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B:
- index = msr_info->index - MSR_IA32_RTIT_ADDR0_A;
- if ((pt_mode != PT_MODE_HOST_GUEST) ||
- (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) ||
- (index >= 2 * intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_num_address_ranges)))
- return 1;
- if (is_noncanonical_address(data, vcpu))
- return 1;
- if (index % 2)
- vmx->pt_desc.guest.addr_b[index / 2] = data;
- else
- vmx->pt_desc.guest.addr_a[index / 2] = data;
- break;
- case MSR_TSC_AUX:
- if (!msr_info->host_initiated &&
- !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
- return 1;
- /* Check reserved bit, higher 32 bits should be zero */
- if ((data >> 32) != 0)
- return 1;
- goto find_shared_msr;
-
- default:
- find_shared_msr:
- msr = find_msr_entry(vmx, msr_index);
- if (msr)
- ret = vmx_set_guest_msr(vmx, msr, data);
- else
- ret = kvm_set_msr_common(vcpu, msr_info);
- }
-
- return ret;
-}
-
-static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
-{
- kvm_register_mark_available(vcpu, reg);
-
- switch (reg) {
- case VCPU_REGS_RSP:
- vcpu->arch.regs[VCPU_REGS_RSP] = vmcs_readl(GUEST_RSP);
- break;
- case VCPU_REGS_RIP:
- vcpu->arch.regs[VCPU_REGS_RIP] = vmcs_readl(GUEST_RIP);
- break;
- case VCPU_EXREG_PDPTR:
- if (enable_ept)
- ept_save_pdptrs(vcpu);
- break;
- case VCPU_EXREG_CR3:
- if (enable_unrestricted_guest || (enable_ept && is_paging(vcpu)))
- vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
- break;
- default:
- WARN_ON_ONCE(1);
- break;
- }
-}
-
-static __init int cpu_has_kvm_support(void)
-{
- return cpu_has_vmx();
-}
-
-static __init int vmx_disabled_by_bios(void)
-{
- u64 msr;
-
- rdmsrl(MSR_IA32_FEATURE_CONTROL, msr);
- if (msr & FEATURE_CONTROL_LOCKED) {
- /* launched w/ TXT and VMX disabled */
- if (!(msr & FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX)
- && tboot_enabled())
- return 1;
- /* launched w/o TXT and VMX only enabled w/ TXT */
- if (!(msr & FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
- && (msr & FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX)
- && !tboot_enabled()) {
- printk(KERN_WARNING "kvm: disable TXT in the BIOS or "
- "activate TXT before enabling KVM\n");
- return 1;
- }
- /* launched w/o TXT and VMX disabled */
- if (!(msr & FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX)
- && !tboot_enabled())
- return 1;
- }
-
- return 0;
-}
-
-static void kvm_cpu_vmxon(u64 addr)
-{
- cr4_set_bits(X86_CR4_VMXE);
- intel_pt_handle_vmx(1);
-
- asm volatile ("vmxon %0" : : "m"(addr));
-}
-
-static int hardware_enable(void)
-{
- int cpu = raw_smp_processor_id();
- u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
- u64 old, test_bits;
-
- if (cr4_read_shadow() & X86_CR4_VMXE)
- return -EBUSY;
-
- /*
- * This can happen if we hot-added a CPU but failed to allocate
- * VP assist page for it.
- */
- if (static_branch_unlikely(&enable_evmcs) &&
- !hv_get_vp_assist_page(cpu))
- return -EFAULT;
-
- INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
- INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
- spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
-
- /*
- * Now we can enable the vmclear operation in kdump
- * since the loaded_vmcss_on_cpu list on this cpu
- * has been initialized.
- *
- * Though the cpu is not in VMX operation now, there
- * is no problem to enable the vmclear operation
- * for the loaded_vmcss_on_cpu list is empty!
- */
- crash_enable_local_vmclear(cpu);
-
- rdmsrl(MSR_IA32_FEATURE_CONTROL, old);
-
- test_bits = FEATURE_CONTROL_LOCKED;
- test_bits |= FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
- if (tboot_enabled())
- test_bits |= FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX;
-
- if ((old & test_bits) != test_bits) {
- /* enable and lock */
- wrmsrl(MSR_IA32_FEATURE_CONTROL, old | test_bits);
- }
- kvm_cpu_vmxon(phys_addr);
- if (enable_ept)
- ept_sync_global();
-
- return 0;
-}
-
-static void vmclear_local_loaded_vmcss(void)
-{
- int cpu = raw_smp_processor_id();
- struct loaded_vmcs *v, *n;
-
- list_for_each_entry_safe(v, n, &per_cpu(loaded_vmcss_on_cpu, cpu),
- loaded_vmcss_on_cpu_link)
- __loaded_vmcs_clear(v);
-}
-
-
-/* Just like cpu_vmxoff(), but with the __kvm_handle_fault_on_reboot()
- * tricks.
- */
-static void kvm_cpu_vmxoff(void)
-{
- asm volatile (__ex("vmxoff"));
-
- intel_pt_handle_vmx(0);
- cr4_clear_bits(X86_CR4_VMXE);
-}
-
-static void hardware_disable(void)
-{
- vmclear_local_loaded_vmcss();
- kvm_cpu_vmxoff();
-}
-
-static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
- u32 msr, u32 *result)
-{
- u32 vmx_msr_low, vmx_msr_high;
- u32 ctl = ctl_min | ctl_opt;
-
- rdmsr(msr, vmx_msr_low, vmx_msr_high);
-
- ctl &= vmx_msr_high; /* bit == 0 in high word ==> must be zero */
- ctl |= vmx_msr_low; /* bit == 1 in low word ==> must be one */
-
- /* Ensure minimum (required) set of control bits are supported. */
- if (ctl_min & ~ctl)
- return -EIO;
-
- *result = ctl;
- return 0;
-}
-
-static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
- struct vmx_capability *vmx_cap)
-{
- u32 vmx_msr_low, vmx_msr_high;
- u32 min, opt, min2, opt2;
- u32 _pin_based_exec_control = 0;
- u32 _cpu_based_exec_control = 0;
- u32 _cpu_based_2nd_exec_control = 0;
- u32 _vmexit_control = 0;
- u32 _vmentry_control = 0;
-
- memset(vmcs_conf, 0, sizeof(*vmcs_conf));
- min = CPU_BASED_HLT_EXITING |
-#ifdef CONFIG_X86_64
- CPU_BASED_CR8_LOAD_EXITING |
- CPU_BASED_CR8_STORE_EXITING |
-#endif
- CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_UNCOND_IO_EXITING |
- CPU_BASED_MOV_DR_EXITING |
- CPU_BASED_USE_TSC_OFFSETTING |
- CPU_BASED_MWAIT_EXITING |
- CPU_BASED_MONITOR_EXITING |
- CPU_BASED_INVLPG_EXITING |
- CPU_BASED_RDPMC_EXITING;
-
- opt = CPU_BASED_TPR_SHADOW |
- CPU_BASED_USE_MSR_BITMAPS |
- CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
- &_cpu_based_exec_control) < 0)
- return -EIO;
-#ifdef CONFIG_X86_64
- if ((_cpu_based_exec_control & CPU_BASED_TPR_SHADOW))
- _cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
- ~CPU_BASED_CR8_STORE_EXITING;
-#endif
- if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
- min2 = 0;
- opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
- SECONDARY_EXEC_WBINVD_EXITING |
- SECONDARY_EXEC_ENABLE_VPID |
- SECONDARY_EXEC_ENABLE_EPT |
- SECONDARY_EXEC_UNRESTRICTED_GUEST |
- SECONDARY_EXEC_PAUSE_LOOP_EXITING |
- SECONDARY_EXEC_DESC |
- SECONDARY_EXEC_RDTSCP |
- SECONDARY_EXEC_ENABLE_INVPCID |
- SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
- SECONDARY_EXEC_SHADOW_VMCS |
- SECONDARY_EXEC_XSAVES |
- SECONDARY_EXEC_RDSEED_EXITING |
- SECONDARY_EXEC_RDRAND_EXITING |
- SECONDARY_EXEC_ENABLE_PML |
- SECONDARY_EXEC_TSC_SCALING |
- SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
- SECONDARY_EXEC_PT_USE_GPA |
- SECONDARY_EXEC_PT_CONCEAL_VMX |
- SECONDARY_EXEC_ENABLE_VMFUNC |
- SECONDARY_EXEC_ENCLS_EXITING;
- if (adjust_vmx_controls(min2, opt2,
- MSR_IA32_VMX_PROCBASED_CTLS2,
- &_cpu_based_2nd_exec_control) < 0)
- return -EIO;
- }
-#ifndef CONFIG_X86_64
- if (!(_cpu_based_2nd_exec_control &
- SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
- _cpu_based_exec_control &= ~CPU_BASED_TPR_SHADOW;
-#endif
-
- if (!(_cpu_based_exec_control & CPU_BASED_TPR_SHADOW))
- _cpu_based_2nd_exec_control &= ~(
- SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
-
- rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
- &vmx_cap->ept, &vmx_cap->vpid);
-
- if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
- /* CR3 accesses and invlpg don't need to cause VM Exits when EPT
- enabled */
- _cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_INVLPG_EXITING);
- } else if (vmx_cap->ept) {
- vmx_cap->ept = 0;
- pr_warn_once("EPT CAP should not exist if not support "
- "1-setting enable EPT VM-execution control\n");
- }
- if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_VPID) &&
- vmx_cap->vpid) {
- vmx_cap->vpid = 0;
- pr_warn_once("VPID CAP should not exist if not support "
- "1-setting enable VPID VM-execution control\n");
- }
-
- min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
-#ifdef CONFIG_X86_64
- min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
-#endif
- opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
- VM_EXIT_LOAD_IA32_PAT |
- VM_EXIT_LOAD_IA32_EFER |
- VM_EXIT_CLEAR_BNDCFGS |
- VM_EXIT_PT_CONCEAL_PIP |
- VM_EXIT_CLEAR_IA32_RTIT_CTL;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
- &_vmexit_control) < 0)
- return -EIO;
-
- min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
- opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
- PIN_BASED_VMX_PREEMPTION_TIMER;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
- &_pin_based_exec_control) < 0)
- return -EIO;
-
- if (cpu_has_broken_vmx_preemption_timer())
- _pin_based_exec_control &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
- if (!(_cpu_based_2nd_exec_control &
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
- _pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
-
- min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
- opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
- VM_ENTRY_LOAD_IA32_PAT |
- VM_ENTRY_LOAD_IA32_EFER |
- VM_ENTRY_LOAD_BNDCFGS |
- VM_ENTRY_PT_CONCEAL_PIP |
- VM_ENTRY_LOAD_IA32_RTIT_CTL;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
- &_vmentry_control) < 0)
- return -EIO;
-
- /*
- * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
- * can't be used due to an errata where VM Exit may incorrectly clear
- * IA32_PERF_GLOBAL_CTRL[34:32]. Workaround the errata by using the
- * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
- */
- if (boot_cpu_data.x86 == 0x6) {
- switch (boot_cpu_data.x86_model) {
- case 26: /* AAK155 */
- case 30: /* AAP115 */
- case 37: /* AAT100 */
- case 44: /* BC86,AAY89,BD102 */
- case 46: /* BA97 */
- _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
- _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
- pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
- "does not work properly. Using workaround\n");
- break;
- default:
- break;
- }
- }
-
-
- rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
-
- /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
- if ((vmx_msr_high & 0x1fff) > PAGE_SIZE)
- return -EIO;
-
-#ifdef CONFIG_X86_64
- /* IA-32 SDM Vol 3B: 64-bit CPUs always have VMX_BASIC_MSR[48]==0. */
- if (vmx_msr_high & (1u<<16))
- return -EIO;
-#endif
-
- /* Require Write-Back (WB) memory type for VMCS accesses. */
- if (((vmx_msr_high >> 18) & 15) != 6)
- return -EIO;
-
- vmcs_conf->size = vmx_msr_high & 0x1fff;
- vmcs_conf->order = get_order(vmcs_conf->size);
- vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
-
- vmcs_conf->revision_id = vmx_msr_low;
-
- vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
- vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
- vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
- vmcs_conf->vmexit_ctrl = _vmexit_control;
- vmcs_conf->vmentry_ctrl = _vmentry_control;
-
- if (static_branch_unlikely(&enable_evmcs))
- evmcs_sanitize_exec_ctrls(vmcs_conf);
-
- return 0;
-}
-
-struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
-{
- int node = cpu_to_node(cpu);
- struct page *pages;
- struct vmcs *vmcs;
-
- pages = __alloc_pages_node(node, flags, vmcs_config.order);
- if (!pages)
- return NULL;
- vmcs = page_address(pages);
- memset(vmcs, 0, vmcs_config.size);
-
- /* KVM supports Enlightened VMCS v1 only */
- if (static_branch_unlikely(&enable_evmcs))
- vmcs->hdr.revision_id = KVM_EVMCS_VERSION;
- else
- vmcs->hdr.revision_id = vmcs_config.revision_id;
-
- if (shadow)
- vmcs->hdr.shadow_vmcs = 1;
- return vmcs;
-}
-
-void free_vmcs(struct vmcs *vmcs)
-{
- free_pages((unsigned long)vmcs, vmcs_config.order);
-}
-
-/*
- * Free a VMCS, but before that VMCLEAR it on the CPU where it was last loaded
- */
-void free_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
-{
- if (!loaded_vmcs->vmcs)
- return;
- loaded_vmcs_clear(loaded_vmcs);
- free_vmcs(loaded_vmcs->vmcs);
- loaded_vmcs->vmcs = NULL;
- if (loaded_vmcs->msr_bitmap)
- free_page((unsigned long)loaded_vmcs->msr_bitmap);
- WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
-}
-
-int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
-{
- loaded_vmcs->vmcs = alloc_vmcs(false);
- if (!loaded_vmcs->vmcs)
- return -ENOMEM;
-
- loaded_vmcs->shadow_vmcs = NULL;
- loaded_vmcs->hv_timer_soft_disabled = false;
- loaded_vmcs_init(loaded_vmcs);
-
- if (cpu_has_vmx_msr_bitmap()) {
- loaded_vmcs->msr_bitmap = (unsigned long *)
- __get_free_page(GFP_KERNEL_ACCOUNT);
- if (!loaded_vmcs->msr_bitmap)
- goto out_vmcs;
- memset(loaded_vmcs->msr_bitmap, 0xff, PAGE_SIZE);
-
- if (IS_ENABLED(CONFIG_HYPERV) &&
- static_branch_unlikely(&enable_evmcs) &&
- (ms_hyperv.nested_features & HV_X64_NESTED_MSR_BITMAP)) {
- struct hv_enlightened_vmcs *evmcs =
- (struct hv_enlightened_vmcs *)loaded_vmcs->vmcs;
-
- evmcs->hv_enlightenments_control.msr_bitmap = 1;
- }
- }
-
- memset(&loaded_vmcs->host_state, 0, sizeof(struct vmcs_host_state));
- memset(&loaded_vmcs->controls_shadow, 0,
- sizeof(struct vmcs_controls_shadow));
-
- return 0;
-
-out_vmcs:
- free_loaded_vmcs(loaded_vmcs);
- return -ENOMEM;
-}
-
-static void free_kvm_area(void)
-{
- int cpu;
-
- for_each_possible_cpu(cpu) {
- free_vmcs(per_cpu(vmxarea, cpu));
- per_cpu(vmxarea, cpu) = NULL;
- }
-}
-
-static __init int alloc_kvm_area(void)
-{
- int cpu;
-
- for_each_possible_cpu(cpu) {
- struct vmcs *vmcs;
-
- vmcs = alloc_vmcs_cpu(false, cpu, GFP_KERNEL);
- if (!vmcs) {
- free_kvm_area();
- return -ENOMEM;
- }
-
- /*
- * When eVMCS is enabled, alloc_vmcs_cpu() sets
- * vmcs->revision_id to KVM_EVMCS_VERSION instead of
- * revision_id reported by MSR_IA32_VMX_BASIC.
- *
- * However, even though not explicitly documented by
- * TLFS, VMXArea passed as VMXON argument should
- * still be marked with revision_id reported by
- * physical CPU.
- */
- if (static_branch_unlikely(&enable_evmcs))
- vmcs->hdr.revision_id = vmcs_config.revision_id;
-
- per_cpu(vmxarea, cpu) = vmcs;
- }
- return 0;
-}
-
-static void fix_pmode_seg(struct kvm_vcpu *vcpu, int seg,
- struct kvm_segment *save)
-{
- if (!emulate_invalid_guest_state) {
- /*
- * CS and SS RPL should be equal during guest entry according
- * to VMX spec, but in reality it is not always so. Since vcpu
- * is in the middle of the transition from real mode to
- * protected mode it is safe to assume that RPL 0 is a good
- * default value.
- */
- if (seg == VCPU_SREG_CS || seg == VCPU_SREG_SS)
- save->selector &= ~SEGMENT_RPL_MASK;
- save->dpl = save->selector & SEGMENT_RPL_MASK;
- save->s = 1;
- }
- vmx_set_segment(vcpu, save, seg);
-}
-
-static void enter_pmode(struct kvm_vcpu *vcpu)
-{
- unsigned long flags;
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- /*
- * Update real mode segment cache. It may be not up-to-date if sement
- * register was written while vcpu was in a guest mode.
- */
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_ES], VCPU_SREG_ES);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_DS], VCPU_SREG_DS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_FS], VCPU_SREG_FS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_GS], VCPU_SREG_GS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_SS], VCPU_SREG_SS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_CS], VCPU_SREG_CS);
-
- vmx->rmode.vm86_active = 0;
-
- vmx_segment_cache_clear(vmx);
-
- vmx_set_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_TR], VCPU_SREG_TR);
-
- flags = vmcs_readl(GUEST_RFLAGS);
- flags &= RMODE_GUEST_OWNED_EFLAGS_BITS;
- flags |= vmx->rmode.save_rflags & ~RMODE_GUEST_OWNED_EFLAGS_BITS;
- vmcs_writel(GUEST_RFLAGS, flags);
-
- vmcs_writel(GUEST_CR4, (vmcs_readl(GUEST_CR4) & ~X86_CR4_VME) |
- (vmcs_readl(CR4_READ_SHADOW) & X86_CR4_VME));
-
- update_exception_bitmap(vcpu);
-
- fix_pmode_seg(vcpu, VCPU_SREG_CS, &vmx->rmode.segs[VCPU_SREG_CS]);
- fix_pmode_seg(vcpu, VCPU_SREG_SS, &vmx->rmode.segs[VCPU_SREG_SS]);
- fix_pmode_seg(vcpu, VCPU_SREG_ES, &vmx->rmode.segs[VCPU_SREG_ES]);
- fix_pmode_seg(vcpu, VCPU_SREG_DS, &vmx->rmode.segs[VCPU_SREG_DS]);
- fix_pmode_seg(vcpu, VCPU_SREG_FS, &vmx->rmode.segs[VCPU_SREG_FS]);
- fix_pmode_seg(vcpu, VCPU_SREG_GS, &vmx->rmode.segs[VCPU_SREG_GS]);
-}
-
-static void fix_rmode_seg(int seg, struct kvm_segment *save)
-{
- const struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
- struct kvm_segment var = *save;
-
- var.dpl = 0x3;
- if (seg == VCPU_SREG_CS)
- var.type = 0x3;
-
- if (!emulate_invalid_guest_state) {
- var.selector = var.base >> 4;
- var.base = var.base & 0xffff0;
- var.limit = 0xffff;
- var.g = 0;
- var.db = 0;
- var.present = 1;
- var.s = 1;
- var.l = 0;
- var.unusable = 0;
- var.type = 0x3;
- var.avl = 0;
- if (save->base & 0xf)
- printk_once(KERN_WARNING "kvm: segment base is not "
- "paragraph aligned when entering "
- "protected mode (seg=%d)", seg);
- }
-
- vmcs_write16(sf->selector, var.selector);
- vmcs_writel(sf->base, var.base);
- vmcs_write32(sf->limit, var.limit);
- vmcs_write32(sf->ar_bytes, vmx_segment_access_rights(&var));
-}
-
-static void enter_rmode(struct kvm_vcpu *vcpu)
-{
- unsigned long flags;
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct kvm_vmx *kvm_vmx = to_kvm_vmx(vcpu->kvm);
-
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_TR], VCPU_SREG_TR);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_ES], VCPU_SREG_ES);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_DS], VCPU_SREG_DS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_FS], VCPU_SREG_FS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_GS], VCPU_SREG_GS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_SS], VCPU_SREG_SS);
- vmx_get_segment(vcpu, &vmx->rmode.segs[VCPU_SREG_CS], VCPU_SREG_CS);
-
- vmx->rmode.vm86_active = 1;
-
- /*
- * Very old userspace does not call KVM_SET_TSS_ADDR before entering
- * vcpu. Warn the user that an update is overdue.
- */
- if (!kvm_vmx->tss_addr)
- printk_once(KERN_WARNING "kvm: KVM_SET_TSS_ADDR need to be "
- "called before entering vcpu\n");
-
- vmx_segment_cache_clear(vmx);
-
- vmcs_writel(GUEST_TR_BASE, kvm_vmx->tss_addr);
- vmcs_write32(GUEST_TR_LIMIT, RMODE_TSS_SIZE - 1);
- vmcs_write32(GUEST_TR_AR_BYTES, 0x008b);
-
- flags = vmcs_readl(GUEST_RFLAGS);
- vmx->rmode.save_rflags = flags;
-
- flags |= X86_EFLAGS_IOPL | X86_EFLAGS_VM;
-
- vmcs_writel(GUEST_RFLAGS, flags);
- vmcs_writel(GUEST_CR4, vmcs_readl(GUEST_CR4) | X86_CR4_VME);
- update_exception_bitmap(vcpu);
-
- fix_rmode_seg(VCPU_SREG_SS, &vmx->rmode.segs[VCPU_SREG_SS]);
- fix_rmode_seg(VCPU_SREG_CS, &vmx->rmode.segs[VCPU_SREG_CS]);
- fix_rmode_seg(VCPU_SREG_ES, &vmx->rmode.segs[VCPU_SREG_ES]);
- fix_rmode_seg(VCPU_SREG_DS, &vmx->rmode.segs[VCPU_SREG_DS]);
- fix_rmode_seg(VCPU_SREG_GS, &vmx->rmode.segs[VCPU_SREG_GS]);
- fix_rmode_seg(VCPU_SREG_FS, &vmx->rmode.segs[VCPU_SREG_FS]);
-
- kvm_mmu_reset_context(vcpu);
-}
-
-void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct shared_msr_entry *msr = find_msr_entry(vmx, MSR_EFER);
-
- if (!msr)
- return;
-
- vcpu->arch.efer = efer;
- if (efer & EFER_LMA) {
- vm_entry_controls_setbit(to_vmx(vcpu), VM_ENTRY_IA32E_MODE);
- msr->data = efer;
- } else {
- vm_entry_controls_clearbit(to_vmx(vcpu), VM_ENTRY_IA32E_MODE);
-
- msr->data = efer & ~EFER_LME;
- }
- setup_msrs(vmx);
-}
-
-#ifdef CONFIG_X86_64
-
-static void enter_lmode(struct kvm_vcpu *vcpu)
-{
- u32 guest_tr_ar;
-
- vmx_segment_cache_clear(to_vmx(vcpu));
-
- guest_tr_ar = vmcs_read32(GUEST_TR_AR_BYTES);
- if ((guest_tr_ar & VMX_AR_TYPE_MASK) != VMX_AR_TYPE_BUSY_64_TSS) {
- pr_debug_ratelimited("%s: tss fixup for long mode. \n",
- __func__);
- vmcs_write32(GUEST_TR_AR_BYTES,
- (guest_tr_ar & ~VMX_AR_TYPE_MASK)
- | VMX_AR_TYPE_BUSY_64_TSS);
- }
- vmx_set_efer(vcpu, vcpu->arch.efer | EFER_LMA);
-}
-
-static void exit_lmode(struct kvm_vcpu *vcpu)
-{
- vm_entry_controls_clearbit(to_vmx(vcpu), VM_ENTRY_IA32E_MODE);
- vmx_set_efer(vcpu, vcpu->arch.efer & ~EFER_LMA);
-}
-
-#endif
-
-static void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr)
-{
- int vpid = to_vmx(vcpu)->vpid;
-
- if (!vpid_sync_vcpu_addr(vpid, addr))
- vpid_sync_context(vpid);
-
- /*
- * If VPIDs are not supported or enabled, then the above is a no-op.
- * But we don't really need a TLB flush in that case anyway, because
- * each VM entry/exit includes an implicit flush when VPID is 0.
- */
-}
-
-static void vmx_decache_cr0_guest_bits(struct kvm_vcpu *vcpu)
-{
- ulong cr0_guest_owned_bits = vcpu->arch.cr0_guest_owned_bits;
-
- vcpu->arch.cr0 &= ~cr0_guest_owned_bits;
- vcpu->arch.cr0 |= vmcs_readl(GUEST_CR0) & cr0_guest_owned_bits;
-}
-
-static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
-{
- ulong cr4_guest_owned_bits = vcpu->arch.cr4_guest_owned_bits;
-
- vcpu->arch.cr4 &= ~cr4_guest_owned_bits;
- vcpu->arch.cr4 |= vmcs_readl(GUEST_CR4) & cr4_guest_owned_bits;
-}
-
-static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
-{
- struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
-
- if (!kvm_register_is_dirty(vcpu, VCPU_EXREG_PDPTR))
- return;
-
- if (is_pae_paging(vcpu)) {
- vmcs_write64(GUEST_PDPTR0, mmu->pdptrs[0]);
- vmcs_write64(GUEST_PDPTR1, mmu->pdptrs[1]);
- vmcs_write64(GUEST_PDPTR2, mmu->pdptrs[2]);
- vmcs_write64(GUEST_PDPTR3, mmu->pdptrs[3]);
- }
-}
-
-void ept_save_pdptrs(struct kvm_vcpu *vcpu)
-{
- struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
-
- if (is_pae_paging(vcpu)) {
- mmu->pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
- mmu->pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
- mmu->pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
- mmu->pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
- }
-
- kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
-}
-
-static void ept_update_paging_mode_cr0(unsigned long *hw_cr0,
- unsigned long cr0,
- struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (!kvm_register_is_available(vcpu, VCPU_EXREG_CR3))
- vmx_cache_reg(vcpu, VCPU_EXREG_CR3);
- if (!(cr0 & X86_CR0_PG)) {
- /* From paging/starting to nonpaging */
- exec_controls_setbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING);
- vcpu->arch.cr0 = cr0;
- vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
- } else if (!is_paging(vcpu)) {
- /* From nonpaging to paging */
- exec_controls_clearbit(vmx, CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING);
- vcpu->arch.cr0 = cr0;
- vmx_set_cr4(vcpu, kvm_read_cr4(vcpu));
- }
-
- if (!(cr0 & X86_CR0_WP))
- *hw_cr0 &= ~X86_CR0_WP;
-}
-
-void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long hw_cr0;
-
- hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF);
- if (enable_unrestricted_guest)
- hw_cr0 |= KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST;
- else {
- hw_cr0 |= KVM_VM_CR0_ALWAYS_ON;
-
- if (vmx->rmode.vm86_active && (cr0 & X86_CR0_PE))
- enter_pmode(vcpu);
-
- if (!vmx->rmode.vm86_active && !(cr0 & X86_CR0_PE))
- enter_rmode(vcpu);
- }
-
-#ifdef CONFIG_X86_64
- if (vcpu->arch.efer & EFER_LME) {
- if (!is_paging(vcpu) && (cr0 & X86_CR0_PG))
- enter_lmode(vcpu);
- if (is_paging(vcpu) && !(cr0 & X86_CR0_PG))
- exit_lmode(vcpu);
- }
-#endif
-
- if (enable_ept && !enable_unrestricted_guest)
- ept_update_paging_mode_cr0(&hw_cr0, cr0, vcpu);
-
- vmcs_writel(CR0_READ_SHADOW, cr0);
- vmcs_writel(GUEST_CR0, hw_cr0);
- vcpu->arch.cr0 = cr0;
-
- /* depends on vcpu->arch.cr0 to be set to a new value */
- vmx->emulation_required = emulation_required(vcpu);
-}
-
-static int get_ept_level(struct kvm_vcpu *vcpu)
-{
- if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
- return 5;
- return 4;
-}
-
-u64 construct_eptp(struct kvm_vcpu *vcpu, unsigned long root_hpa)
-{
- u64 eptp = VMX_EPTP_MT_WB;
-
- eptp |= (get_ept_level(vcpu) == 5) ? VMX_EPTP_PWL_5 : VMX_EPTP_PWL_4;
-
- if (enable_ept_ad_bits &&
- (!is_guest_mode(vcpu) || nested_ept_ad_enabled(vcpu)))
- eptp |= VMX_EPTP_AD_ENABLE_BIT;
- eptp |= (root_hpa & PAGE_MASK);
-
- return eptp;
-}
-
-void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
-{
- struct kvm *kvm = vcpu->kvm;
- bool update_guest_cr3 = true;
- unsigned long guest_cr3;
- u64 eptp;
-
- guest_cr3 = cr3;
- if (enable_ept) {
- eptp = construct_eptp(vcpu, cr3);
- vmcs_write64(EPT_POINTER, eptp);
-
- if (kvm_x86_ops->tlb_remote_flush) {
- spin_lock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- to_vmx(vcpu)->ept_pointer = eptp;
- to_kvm_vmx(kvm)->ept_pointers_match
- = EPT_POINTERS_CHECK;
- spin_unlock(&to_kvm_vmx(kvm)->ept_pointer_lock);
- }
-
- /* Loading vmcs02.GUEST_CR3 is handled by nested VM-Enter. */
- if (is_guest_mode(vcpu))
- update_guest_cr3 = false;
- else if (!enable_unrestricted_guest && !is_paging(vcpu))
- guest_cr3 = to_kvm_vmx(kvm)->ept_identity_map_addr;
- else if (test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
- guest_cr3 = vcpu->arch.cr3;
- else /* vmcs01.GUEST_CR3 is already up-to-date. */
- update_guest_cr3 = false;
- ept_load_pdptrs(vcpu);
- }
-
- if (update_guest_cr3)
- vmcs_writel(GUEST_CR3, guest_cr3);
-}
-
-int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- /*
- * Pass through host's Machine Check Enable value to hw_cr4, which
- * is in force while we are in guest mode. Do not let guests control
- * this bit, even if host CR4.MCE == 0.
- */
- unsigned long hw_cr4;
-
- hw_cr4 = (cr4_read_shadow() & X86_CR4_MCE) | (cr4 & ~X86_CR4_MCE);
- if (enable_unrestricted_guest)
- hw_cr4 |= KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST;
- else if (vmx->rmode.vm86_active)
- hw_cr4 |= KVM_RMODE_VM_CR4_ALWAYS_ON;
- else
- hw_cr4 |= KVM_PMODE_VM_CR4_ALWAYS_ON;
-
- if (!boot_cpu_has(X86_FEATURE_UMIP) && vmx_umip_emulated()) {
- if (cr4 & X86_CR4_UMIP) {
- secondary_exec_controls_setbit(vmx, SECONDARY_EXEC_DESC);
- hw_cr4 &= ~X86_CR4_UMIP;
- } else if (!is_guest_mode(vcpu) ||
- !nested_cpu_has2(get_vmcs12(vcpu), SECONDARY_EXEC_DESC)) {
- secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_DESC);
- }
- }
-
- if (cr4 & X86_CR4_VMXE) {
- /*
- * To use VMXON (and later other VMX instructions), a guest
- * must first be able to turn on cr4.VMXE (see handle_vmon()).
- * So basically the check on whether to allow nested VMX
- * is here. We operate under the default treatment of SMM,
- * so VMX cannot be enabled under SMM.
- */
- if (!nested_vmx_allowed(vcpu) || is_smm(vcpu))
- return 1;
- }
-
- if (vmx->nested.vmxon && !nested_cr4_valid(vcpu, cr4))
- return 1;
-
- vcpu->arch.cr4 = cr4;
-
- if (!enable_unrestricted_guest) {
- if (enable_ept) {
- if (!is_paging(vcpu)) {
- hw_cr4 &= ~X86_CR4_PAE;
- hw_cr4 |= X86_CR4_PSE;
- } else if (!(cr4 & X86_CR4_PAE)) {
- hw_cr4 &= ~X86_CR4_PAE;
- }
- }
-
- /*
- * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
- * hardware. To emulate this behavior, SMEP/SMAP/PKU needs
- * to be manually disabled when guest switches to non-paging
- * mode.
- *
- * If !enable_unrestricted_guest, the CPU is always running
- * with CR0.PG=1 and CR4 needs to be modified.
- * If enable_unrestricted_guest, the CPU automatically
- * disables SMEP/SMAP/PKU when the guest sets CR0.PG=0.
- */
- if (!is_paging(vcpu))
- hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
- }
-
- vmcs_writel(CR4_READ_SHADOW, cr4);
- vmcs_writel(GUEST_CR4, hw_cr4);
- return 0;
-}
-
-void vmx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- u32 ar;
-
- if (vmx->rmode.vm86_active && seg != VCPU_SREG_LDTR) {
- *var = vmx->rmode.segs[seg];
- if (seg == VCPU_SREG_TR
- || var->selector == vmx_read_guest_seg_selector(vmx, seg))
- return;
- var->base = vmx_read_guest_seg_base(vmx, seg);
- var->selector = vmx_read_guest_seg_selector(vmx, seg);
- return;
- }
- var->base = vmx_read_guest_seg_base(vmx, seg);
- var->limit = vmx_read_guest_seg_limit(vmx, seg);
- var->selector = vmx_read_guest_seg_selector(vmx, seg);
- ar = vmx_read_guest_seg_ar(vmx, seg);
- var->unusable = (ar >> 16) & 1;
- var->type = ar & 15;
- var->s = (ar >> 4) & 1;
- var->dpl = (ar >> 5) & 3;
- /*
- * Some userspaces do not preserve unusable property. Since usable
- * segment has to be present according to VMX spec we can use present
- * property to amend userspace bug by making unusable segment always
- * nonpresent. vmx_segment_access_rights() already marks nonpresent
- * segment as unusable.
- */
- var->present = !var->unusable;
- var->avl = (ar >> 12) & 1;
- var->l = (ar >> 13) & 1;
- var->db = (ar >> 14) & 1;
- var->g = (ar >> 15) & 1;
-}
-
-static u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg)
-{
- struct kvm_segment s;
-
- if (to_vmx(vcpu)->rmode.vm86_active) {
- vmx_get_segment(vcpu, &s, seg);
- return s.base;
- }
- return vmx_read_guest_seg_base(to_vmx(vcpu), seg);
-}
-
-int vmx_get_cpl(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (unlikely(vmx->rmode.vm86_active))
- return 0;
- else {
- int ar = vmx_read_guest_seg_ar(vmx, VCPU_SREG_SS);
- return VMX_AR_DPL(ar);
- }
-}
-
-static u32 vmx_segment_access_rights(struct kvm_segment *var)
-{
- u32 ar;
-
- if (var->unusable || !var->present)
- ar = 1 << 16;
- else {
- ar = var->type & 15;
- ar |= (var->s & 1) << 4;
- ar |= (var->dpl & 3) << 5;
- ar |= (var->present & 1) << 7;
- ar |= (var->avl & 1) << 12;
- ar |= (var->l & 1) << 13;
- ar |= (var->db & 1) << 14;
- ar |= (var->g & 1) << 15;
- }
-
- return ar;
-}
-
-void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- const struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
-
- vmx_segment_cache_clear(vmx);
-
- if (vmx->rmode.vm86_active && seg != VCPU_SREG_LDTR) {
- vmx->rmode.segs[seg] = *var;
- if (seg == VCPU_SREG_TR)
- vmcs_write16(sf->selector, var->selector);
- else if (var->s)
- fix_rmode_seg(seg, &vmx->rmode.segs[seg]);
- goto out;
- }
-
- vmcs_writel(sf->base, var->base);
- vmcs_write32(sf->limit, var->limit);
- vmcs_write16(sf->selector, var->selector);
-
- /*
- * Fix the "Accessed" bit in AR field of segment registers for older
- * qemu binaries.
- * IA32 arch specifies that at the time of processor reset the
- * "Accessed" bit in the AR field of segment registers is 1. And qemu
- * is setting it to 0 in the userland code. This causes invalid guest
- * state vmexit when "unrestricted guest" mode is turned on.
- * Fix for this setup issue in cpu_reset is being pushed in the qemu
- * tree. Newer qemu binaries with that qemu fix would not need this
- * kvm hack.
- */
- if (enable_unrestricted_guest && (seg != VCPU_SREG_LDTR))
- var->type |= 0x1; /* Accessed */
-
- vmcs_write32(sf->ar_bytes, vmx_segment_access_rights(var));
-
-out:
- vmx->emulation_required = emulation_required(vcpu);
-}
-
-static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
-{
- u32 ar = vmx_read_guest_seg_ar(to_vmx(vcpu), VCPU_SREG_CS);
-
- *db = (ar >> 14) & 1;
- *l = (ar >> 13) & 1;
-}
-
-static void vmx_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
-{
- dt->size = vmcs_read32(GUEST_IDTR_LIMIT);
- dt->address = vmcs_readl(GUEST_IDTR_BASE);
-}
-
-static void vmx_set_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
-{
- vmcs_write32(GUEST_IDTR_LIMIT, dt->size);
- vmcs_writel(GUEST_IDTR_BASE, dt->address);
-}
-
-static void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
-{
- dt->size = vmcs_read32(GUEST_GDTR_LIMIT);
- dt->address = vmcs_readl(GUEST_GDTR_BASE);
-}
-
-static void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
-{
- vmcs_write32(GUEST_GDTR_LIMIT, dt->size);
- vmcs_writel(GUEST_GDTR_BASE, dt->address);
-}
-
-static bool rmode_segment_valid(struct kvm_vcpu *vcpu, int seg)
-{
- struct kvm_segment var;
- u32 ar;
-
- vmx_get_segment(vcpu, &var, seg);
- var.dpl = 0x3;
- if (seg == VCPU_SREG_CS)
- var.type = 0x3;
- ar = vmx_segment_access_rights(&var);
-
- if (var.base != (var.selector << 4))
- return false;
- if (var.limit != 0xffff)
- return false;
- if (ar != 0xf3)
- return false;
-
- return true;
-}
-
-static bool code_segment_valid(struct kvm_vcpu *vcpu)
-{
- struct kvm_segment cs;
- unsigned int cs_rpl;
-
- vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
- cs_rpl = cs.selector & SEGMENT_RPL_MASK;
-
- if (cs.unusable)
- return false;
- if (~cs.type & (VMX_AR_TYPE_CODE_MASK|VMX_AR_TYPE_ACCESSES_MASK))
- return false;
- if (!cs.s)
- return false;
- if (cs.type & VMX_AR_TYPE_WRITEABLE_MASK) {
- if (cs.dpl > cs_rpl)
- return false;
- } else {
- if (cs.dpl != cs_rpl)
- return false;
- }
- if (!cs.present)
- return false;
-
- /* TODO: Add Reserved field check, this'll require a new member in the kvm_segment_field structure */
- return true;
-}
-
-static bool stack_segment_valid(struct kvm_vcpu *vcpu)
-{
- struct kvm_segment ss;
- unsigned int ss_rpl;
-
- vmx_get_segment(vcpu, &ss, VCPU_SREG_SS);
- ss_rpl = ss.selector & SEGMENT_RPL_MASK;
-
- if (ss.unusable)
- return true;
- if (ss.type != 3 && ss.type != 7)
- return false;
- if (!ss.s)
- return false;
- if (ss.dpl != ss_rpl) /* DPL != RPL */
- return false;
- if (!ss.present)
- return false;
-
- return true;
-}
-
-static bool data_segment_valid(struct kvm_vcpu *vcpu, int seg)
-{
- struct kvm_segment var;
- unsigned int rpl;
-
- vmx_get_segment(vcpu, &var, seg);
- rpl = var.selector & SEGMENT_RPL_MASK;
-
- if (var.unusable)
- return true;
- if (!var.s)
- return false;
- if (!var.present)
- return false;
- if (~var.type & (VMX_AR_TYPE_CODE_MASK|VMX_AR_TYPE_WRITEABLE_MASK)) {
- if (var.dpl < rpl) /* DPL < RPL */
- return false;
- }
-
- /* TODO: Add other members to kvm_segment_field to allow checking for other access
- * rights flags
- */
- return true;
-}
-
-static bool tr_valid(struct kvm_vcpu *vcpu)
-{
- struct kvm_segment tr;
-
- vmx_get_segment(vcpu, &tr, VCPU_SREG_TR);
-
- if (tr.unusable)
- return false;
- if (tr.selector & SEGMENT_TI_MASK) /* TI = 1 */
- return false;
- if (tr.type != 3 && tr.type != 11) /* TODO: Check if guest is in IA32e mode */
- return false;
- if (!tr.present)
- return false;
-
- return true;
-}
-
-static bool ldtr_valid(struct kvm_vcpu *vcpu)
-{
- struct kvm_segment ldtr;
-
- vmx_get_segment(vcpu, &ldtr, VCPU_SREG_LDTR);
-
- if (ldtr.unusable)
- return true;
- if (ldtr.selector & SEGMENT_TI_MASK) /* TI = 1 */
- return false;
- if (ldtr.type != 2)
- return false;
- if (!ldtr.present)
- return false;
-
- return true;
-}
-
-static bool cs_ss_rpl_check(struct kvm_vcpu *vcpu)
-{
- struct kvm_segment cs, ss;
-
- vmx_get_segment(vcpu, &cs, VCPU_SREG_CS);
- vmx_get_segment(vcpu, &ss, VCPU_SREG_SS);
-
- return ((cs.selector & SEGMENT_RPL_MASK) ==
- (ss.selector & SEGMENT_RPL_MASK));
-}
-
-/*
- * Check if guest state is valid. Returns true if valid, false if
- * not.
- * We assume that registers are always usable
- */
-static bool guest_state_valid(struct kvm_vcpu *vcpu)
-{
- if (enable_unrestricted_guest)
- return true;
-
- /* real mode guest state checks */
- if (!is_protmode(vcpu) || (vmx_get_rflags(vcpu) & X86_EFLAGS_VM)) {
- if (!rmode_segment_valid(vcpu, VCPU_SREG_CS))
- return false;
- if (!rmode_segment_valid(vcpu, VCPU_SREG_SS))
- return false;
- if (!rmode_segment_valid(vcpu, VCPU_SREG_DS))
- return false;
- if (!rmode_segment_valid(vcpu, VCPU_SREG_ES))
- return false;
- if (!rmode_segment_valid(vcpu, VCPU_SREG_FS))
- return false;
- if (!rmode_segment_valid(vcpu, VCPU_SREG_GS))
- return false;
- } else {
- /* protected mode guest state checks */
- if (!cs_ss_rpl_check(vcpu))
- return false;
- if (!code_segment_valid(vcpu))
- return false;
- if (!stack_segment_valid(vcpu))
- return false;
- if (!data_segment_valid(vcpu, VCPU_SREG_DS))
- return false;
- if (!data_segment_valid(vcpu, VCPU_SREG_ES))
- return false;
- if (!data_segment_valid(vcpu, VCPU_SREG_FS))
- return false;
- if (!data_segment_valid(vcpu, VCPU_SREG_GS))
- return false;
- if (!tr_valid(vcpu))
- return false;
- if (!ldtr_valid(vcpu))
- return false;
- }
- /* TODO:
- * - Add checks on RIP
- * - Add checks on RFLAGS
- */
-
- return true;
-}
-
-static int init_rmode_tss(struct kvm *kvm)
-{
- gfn_t fn;
- u16 data = 0;
- int idx, r;
-
- idx = srcu_read_lock(&kvm->srcu);
- fn = to_kvm_vmx(kvm)->tss_addr >> PAGE_SHIFT;
- r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
- if (r < 0)
- goto out;
- data = TSS_BASE_SIZE + TSS_REDIRECTION_SIZE;
- r = kvm_write_guest_page(kvm, fn++, &data,
- TSS_IOPB_BASE_OFFSET, sizeof(u16));
- if (r < 0)
- goto out;
- r = kvm_clear_guest_page(kvm, fn++, 0, PAGE_SIZE);
- if (r < 0)
- goto out;
- r = kvm_clear_guest_page(kvm, fn, 0, PAGE_SIZE);
- if (r < 0)
- goto out;
- data = ~0;
- r = kvm_write_guest_page(kvm, fn, &data,
- RMODE_TSS_SIZE - 2 * PAGE_SIZE - 1,
- sizeof(u8));
-out:
- srcu_read_unlock(&kvm->srcu, idx);
- return r;
-}
-
-static int init_rmode_identity_map(struct kvm *kvm)
-{
- struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm);
- int i, idx, r = 0;
- kvm_pfn_t identity_map_pfn;
- u32 tmp;
-
- /* Protect kvm_vmx->ept_identity_pagetable_done. */
- mutex_lock(&kvm->slots_lock);
-
- if (likely(kvm_vmx->ept_identity_pagetable_done))
- goto out2;
-
- if (!kvm_vmx->ept_identity_map_addr)
- kvm_vmx->ept_identity_map_addr = VMX_EPT_IDENTITY_PAGETABLE_ADDR;
- identity_map_pfn = kvm_vmx->ept_identity_map_addr >> PAGE_SHIFT;
-
- r = __x86_set_memory_region(kvm, IDENTITY_PAGETABLE_PRIVATE_MEMSLOT,
- kvm_vmx->ept_identity_map_addr, PAGE_SIZE);
- if (r < 0)
- goto out2;
-
- idx = srcu_read_lock(&kvm->srcu);
- r = kvm_clear_guest_page(kvm, identity_map_pfn, 0, PAGE_SIZE);
- if (r < 0)
- goto out;
- /* Set up identity-mapping pagetable for EPT in real mode */
- for (i = 0; i < PT32_ENT_PER_PAGE; i++) {
- tmp = (i << 22) + (_PAGE_PRESENT | _PAGE_RW | _PAGE_USER |
- _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
- r = kvm_write_guest_page(kvm, identity_map_pfn,
- &tmp, i * sizeof(tmp), sizeof(tmp));
- if (r < 0)
- goto out;
- }
- kvm_vmx->ept_identity_pagetable_done = true;
-
-out:
- srcu_read_unlock(&kvm->srcu, idx);
-
-out2:
- mutex_unlock(&kvm->slots_lock);
- return r;
-}
-
-static void seg_setup(int seg)
-{
- const struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
- unsigned int ar;
-
- vmcs_write16(sf->selector, 0);
- vmcs_writel(sf->base, 0);
- vmcs_write32(sf->limit, 0xffff);
- ar = 0x93;
- if (seg == VCPU_SREG_CS)
- ar |= 0x08; /* code segment */
-
- vmcs_write32(sf->ar_bytes, ar);
-}
-
-static int alloc_apic_access_page(struct kvm *kvm)
-{
- struct page *page;
- int r = 0;
-
- mutex_lock(&kvm->slots_lock);
- if (kvm->arch.apic_access_page_done)
- goto out;
- r = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
- APIC_DEFAULT_PHYS_BASE, PAGE_SIZE);
- if (r)
- goto out;
-
- page = gfn_to_page(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
- if (is_error_page(page)) {
- r = -EFAULT;
- goto out;
- }
-
- /*
- * Do not pin the page in memory, so that memory hot-unplug
- * is able to migrate it.
- */
- put_page(page);
- kvm->arch.apic_access_page_done = true;
-out:
- mutex_unlock(&kvm->slots_lock);
- return r;
-}
-
-int allocate_vpid(void)
-{
- int vpid;
-
- if (!enable_vpid)
- return 0;
- spin_lock(&vmx_vpid_lock);
- vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS);
- if (vpid < VMX_NR_VPIDS)
- __set_bit(vpid, vmx_vpid_bitmap);
- else
- vpid = 0;
- spin_unlock(&vmx_vpid_lock);
- return vpid;
-}
-
-void free_vpid(int vpid)
-{
- if (!enable_vpid || vpid == 0)
- return;
- spin_lock(&vmx_vpid_lock);
- __clear_bit(vpid, vmx_vpid_bitmap);
- spin_unlock(&vmx_vpid_lock);
-}
-
-static __always_inline void vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
- u32 msr, int type)
-{
- int f = sizeof(unsigned long);
-
- if (!cpu_has_vmx_msr_bitmap())
- return;
-
- if (static_branch_unlikely(&enable_evmcs))
- evmcs_touch_msr_bitmap();
-
- /*
- * See Intel PRM Vol. 3, 20.6.9 (MSR-Bitmap Address). Early manuals
- * have the write-low and read-high bitmap offsets the wrong way round.
- * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff.
- */
- if (msr <= 0x1fff) {
- if (type & MSR_TYPE_R)
- /* read-low */
- __clear_bit(msr, msr_bitmap + 0x000 / f);
-
- if (type & MSR_TYPE_W)
- /* write-low */
- __clear_bit(msr, msr_bitmap + 0x800 / f);
-
- } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
- msr &= 0x1fff;
- if (type & MSR_TYPE_R)
- /* read-high */
- __clear_bit(msr, msr_bitmap + 0x400 / f);
-
- if (type & MSR_TYPE_W)
- /* write-high */
- __clear_bit(msr, msr_bitmap + 0xc00 / f);
-
- }
-}
-
-static __always_inline void vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
- u32 msr, int type)
-{
- int f = sizeof(unsigned long);
-
- if (!cpu_has_vmx_msr_bitmap())
- return;
-
- if (static_branch_unlikely(&enable_evmcs))
- evmcs_touch_msr_bitmap();
-
- /*
- * See Intel PRM Vol. 3, 20.6.9 (MSR-Bitmap Address). Early manuals
- * have the write-low and read-high bitmap offsets the wrong way round.
- * We can control MSRs 0x00000000-0x00001fff and 0xc0000000-0xc0001fff.
- */
- if (msr <= 0x1fff) {
- if (type & MSR_TYPE_R)
- /* read-low */
- __set_bit(msr, msr_bitmap + 0x000 / f);
-
- if (type & MSR_TYPE_W)
- /* write-low */
- __set_bit(msr, msr_bitmap + 0x800 / f);
-
- } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
- msr &= 0x1fff;
- if (type & MSR_TYPE_R)
- /* read-high */
- __set_bit(msr, msr_bitmap + 0x400 / f);
-
- if (type & MSR_TYPE_W)
- /* write-high */
- __set_bit(msr, msr_bitmap + 0xc00 / f);
-
- }
-}
-
-static __always_inline void vmx_set_intercept_for_msr(unsigned long *msr_bitmap,
- u32 msr, int type, bool value)
-{
- if (value)
- vmx_enable_intercept_for_msr(msr_bitmap, msr, type);
- else
- vmx_disable_intercept_for_msr(msr_bitmap, msr, type);
-}
-
-static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
-{
- u8 mode = 0;
-
- if (cpu_has_secondary_exec_ctrls() &&
- (secondary_exec_controls_get(to_vmx(vcpu)) &
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
- mode |= MSR_BITMAP_MODE_X2APIC;
- if (enable_apicv && kvm_vcpu_apicv_active(vcpu))
- mode |= MSR_BITMAP_MODE_X2APIC_APICV;
- }
-
- return mode;
-}
-
-static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap,
- u8 mode)
-{
- int msr;
-
- for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
- unsigned word = msr / BITS_PER_LONG;
- msr_bitmap[word] = (mode & MSR_BITMAP_MODE_X2APIC_APICV) ? 0 : ~0;
- msr_bitmap[word + (0x800 / sizeof(long))] = ~0;
- }
-
- if (mode & MSR_BITMAP_MODE_X2APIC) {
- /*
- * TPR reads and writes can be virtualized even if virtual interrupt
- * delivery is not in use.
- */
- vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW);
- if (mode & MSR_BITMAP_MODE_X2APIC_APICV) {
- vmx_enable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
- vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
- vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
- }
- }
-}
-
-void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
- u8 mode = vmx_msr_bitmap_mode(vcpu);
- u8 changed = mode ^ vmx->msr_bitmap_mode;
-
- if (!changed)
- return;
-
- if (changed & (MSR_BITMAP_MODE_X2APIC | MSR_BITMAP_MODE_X2APIC_APICV))
- vmx_update_msr_bitmap_x2apic(msr_bitmap, mode);
-
- vmx->msr_bitmap_mode = mode;
-}
-
-void pt_update_intercept_for_msr(struct vcpu_vmx *vmx)
-{
- unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
- bool flag = !(vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN);
- u32 i;
-
- vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_STATUS,
- MSR_TYPE_RW, flag);
- vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_BASE,
- MSR_TYPE_RW, flag);
- vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_OUTPUT_MASK,
- MSR_TYPE_RW, flag);
- vmx_set_intercept_for_msr(msr_bitmap, MSR_IA32_RTIT_CR3_MATCH,
- MSR_TYPE_RW, flag);
- for (i = 0; i < vmx->pt_desc.addr_range; i++) {
- vmx_set_intercept_for_msr(msr_bitmap,
- MSR_IA32_RTIT_ADDR0_A + i * 2, MSR_TYPE_RW, flag);
- vmx_set_intercept_for_msr(msr_bitmap,
- MSR_IA32_RTIT_ADDR0_B + i * 2, MSR_TYPE_RW, flag);
- }
-}
-
-static bool vmx_get_enable_apicv(struct kvm *kvm)
-{
- return enable_apicv;
-}
-
-static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- void *vapic_page;
- u32 vppr;
- int rvi;
-
- if (WARN_ON_ONCE(!is_guest_mode(vcpu)) ||
- !nested_cpu_has_vid(get_vmcs12(vcpu)) ||
- WARN_ON_ONCE(!vmx->nested.virtual_apic_map.gfn))
- return false;
-
- rvi = vmx_get_rvi();
-
- vapic_page = vmx->nested.virtual_apic_map.hva;
- vppr = *((u32 *)(vapic_page + APIC_PROCPRI));
-
- return ((rvi & 0xf0) > (vppr & 0xf0));
-}
-
-static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu,
- bool nested)
-{
-#ifdef CONFIG_SMP
- int pi_vec = nested ? POSTED_INTR_NESTED_VECTOR : POSTED_INTR_VECTOR;
-
- if (vcpu->mode == IN_GUEST_MODE) {
- /*
- * The vector of interrupt to be delivered to vcpu had
- * been set in PIR before this function.
- *
- * Following cases will be reached in this block, and
- * we always send a notification event in all cases as
- * explained below.
- *
- * Case 1: vcpu keeps in non-root mode. Sending a
- * notification event posts the interrupt to vcpu.
- *
- * Case 2: vcpu exits to root mode and is still
- * runnable. PIR will be synced to vIRR before the
- * next vcpu entry. Sending a notification event in
- * this case has no effect, as vcpu is not in root
- * mode.
- *
- * Case 3: vcpu exits to root mode and is blocked.
- * vcpu_block() has already synced PIR to vIRR and
- * never blocks vcpu if vIRR is not cleared. Therefore,
- * a blocked vcpu here does not wait for any requested
- * interrupts in PIR, and sending a notification event
- * which has no effect is safe here.
- */
-
- apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec);
- return true;
- }
-#endif
- return false;
-}
-
-static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
- int vector)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (is_guest_mode(vcpu) &&
- vector == vmx->nested.posted_intr_nv) {
- /*
- * If a posted intr is not recognized by hardware,
- * we will accomplish it in the next vmentry.
- */
- vmx->nested.pi_pending = true;
- kvm_make_request(KVM_REQ_EVENT, vcpu);
- /* the PIR and ON have been set by L1. */
- if (!kvm_vcpu_trigger_posted_interrupt(vcpu, true))
- kvm_vcpu_kick(vcpu);
- return 0;
- }
- return -1;
-}
-/*
- * Send interrupt to vcpu via posted interrupt way.
- * 1. If target vcpu is running(non-root mode), send posted interrupt
- * notification to vcpu and hardware will sync PIR to vIRR atomically.
- * 2. If target vcpu isn't running(root mode), kick it to pick up the
- * interrupt from PIR in next vmentry.
- */
-static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- int r;
-
- r = vmx_deliver_nested_posted_interrupt(vcpu, vector);
- if (!r)
- return;
-
- if (pi_test_and_set_pir(vector, &vmx->pi_desc))
- return;
-
- /* If a previous notification has sent the IPI, nothing to do. */
- if (pi_test_and_set_on(&vmx->pi_desc))
- return;
-
- if (!kvm_vcpu_trigger_posted_interrupt(vcpu, false))
- kvm_vcpu_kick(vcpu);
-}
-
-/*
- * Set up the vmcs's constant host-state fields, i.e., host-state fields that
- * will not change in the lifetime of the guest.
- * Note that host-state that does change is set elsewhere. E.g., host-state
- * that is set differently for each CPU is set in vmx_vcpu_load(), not here.
- */
-void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
-{
- u32 low32, high32;
- unsigned long tmpl;
- unsigned long cr0, cr3, cr4;
-
- cr0 = read_cr0();
- WARN_ON(cr0 & X86_CR0_TS);
- vmcs_writel(HOST_CR0, cr0); /* 22.2.3 */
-
- /*
- * Save the most likely value for this task's CR3 in the VMCS.
- * We can't use __get_current_cr3_fast() because we're not atomic.
- */
- cr3 = __read_cr3();
- vmcs_writel(HOST_CR3, cr3); /* 22.2.3 FIXME: shadow tables */
- vmx->loaded_vmcs->host_state.cr3 = cr3;
-
- /* Save the most likely value for this task's CR4 in the VMCS. */
- cr4 = cr4_read_shadow();
- vmcs_writel(HOST_CR4, cr4); /* 22.2.3, 22.2.5 */
- vmx->loaded_vmcs->host_state.cr4 = cr4;
-
- vmcs_write16(HOST_CS_SELECTOR, __KERNEL_CS); /* 22.2.4 */
-#ifdef CONFIG_X86_64
- /*
- * Load null selectors, so we can avoid reloading them in
- * vmx_prepare_switch_to_host(), in case userspace uses
- * the null selectors too (the expected case).
- */
- vmcs_write16(HOST_DS_SELECTOR, 0);
- vmcs_write16(HOST_ES_SELECTOR, 0);
-#else
- vmcs_write16(HOST_DS_SELECTOR, __KERNEL_DS); /* 22.2.4 */
- vmcs_write16(HOST_ES_SELECTOR, __KERNEL_DS); /* 22.2.4 */
-#endif
- vmcs_write16(HOST_SS_SELECTOR, __KERNEL_DS); /* 22.2.4 */
- vmcs_write16(HOST_TR_SELECTOR, GDT_ENTRY_TSS*8); /* 22.2.4 */
-
- vmcs_writel(HOST_IDTR_BASE, host_idt_base); /* 22.2.4 */
-
- vmcs_writel(HOST_RIP, (unsigned long)vmx_vmexit); /* 22.2.5 */
-
- rdmsr(MSR_IA32_SYSENTER_CS, low32, high32);
- vmcs_write32(HOST_IA32_SYSENTER_CS, low32);
- rdmsrl(MSR_IA32_SYSENTER_EIP, tmpl);
- vmcs_writel(HOST_IA32_SYSENTER_EIP, tmpl); /* 22.2.3 */
-
- if (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PAT) {
- rdmsr(MSR_IA32_CR_PAT, low32, high32);
- vmcs_write64(HOST_IA32_PAT, low32 | ((u64) high32 << 32));
- }
-
- if (cpu_has_load_ia32_efer())
- vmcs_write64(HOST_IA32_EFER, host_efer);
-}
-
-void set_cr4_guest_host_mask(struct vcpu_vmx *vmx)
-{
- vmx->vcpu.arch.cr4_guest_owned_bits = KVM_CR4_GUEST_OWNED_BITS;
- if (enable_ept)
- vmx->vcpu.arch.cr4_guest_owned_bits |= X86_CR4_PGE;
- if (is_guest_mode(&vmx->vcpu))
- vmx->vcpu.arch.cr4_guest_owned_bits &=
- ~get_vmcs12(&vmx->vcpu)->cr4_guest_host_mask;
- vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);
-}
-
-u32 vmx_pin_based_exec_ctrl(struct vcpu_vmx *vmx)
-{
- u32 pin_based_exec_ctrl = vmcs_config.pin_based_exec_ctrl;
-
- if (!kvm_vcpu_apicv_active(&vmx->vcpu))
- pin_based_exec_ctrl &= ~PIN_BASED_POSTED_INTR;
-
- if (!enable_vnmi)
- pin_based_exec_ctrl &= ~PIN_BASED_VIRTUAL_NMIS;
-
- if (!enable_preemption_timer)
- pin_based_exec_ctrl &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
-
- return pin_based_exec_ctrl;
-}
-
-static void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- pin_controls_set(vmx, vmx_pin_based_exec_ctrl(vmx));
- if (cpu_has_secondary_exec_ctrls()) {
- if (kvm_vcpu_apicv_active(vcpu))
- secondary_exec_controls_setbit(vmx,
- SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
- else
- secondary_exec_controls_clearbit(vmx,
- SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
- }
-
- if (cpu_has_vmx_msr_bitmap())
- vmx_update_msr_bitmap(vcpu);
-}
-
-u32 vmx_exec_control(struct vcpu_vmx *vmx)
-{
- u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
-
- if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
- exec_control &= ~CPU_BASED_MOV_DR_EXITING;
-
- if (!cpu_need_tpr_shadow(&vmx->vcpu)) {
- exec_control &= ~CPU_BASED_TPR_SHADOW;
-#ifdef CONFIG_X86_64
- exec_control |= CPU_BASED_CR8_STORE_EXITING |
- CPU_BASED_CR8_LOAD_EXITING;
-#endif
- }
- if (!enable_ept)
- exec_control |= CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_INVLPG_EXITING;
- if (kvm_mwait_in_guest(vmx->vcpu.kvm))
- exec_control &= ~(CPU_BASED_MWAIT_EXITING |
- CPU_BASED_MONITOR_EXITING);
- if (kvm_hlt_in_guest(vmx->vcpu.kvm))
- exec_control &= ~CPU_BASED_HLT_EXITING;
- return exec_control;
-}
-
-
-static void vmx_compute_secondary_exec_control(struct vcpu_vmx *vmx)
-{
- struct kvm_vcpu *vcpu = &vmx->vcpu;
-
- u32 exec_control = vmcs_config.cpu_based_2nd_exec_ctrl;
-
- if (pt_mode == PT_MODE_SYSTEM)
- exec_control &= ~(SECONDARY_EXEC_PT_USE_GPA | SECONDARY_EXEC_PT_CONCEAL_VMX);
- if (!cpu_need_virtualize_apic_accesses(vcpu))
- exec_control &= ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
- if (vmx->vpid == 0)
- exec_control &= ~SECONDARY_EXEC_ENABLE_VPID;
- if (!enable_ept) {
- exec_control &= ~SECONDARY_EXEC_ENABLE_EPT;
- enable_unrestricted_guest = 0;
- }
- if (!enable_unrestricted_guest)
- exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
- if (kvm_pause_in_guest(vmx->vcpu.kvm))
- exec_control &= ~SECONDARY_EXEC_PAUSE_LOOP_EXITING;
- if (!kvm_vcpu_apicv_active(vcpu))
- exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
- exec_control &= ~SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
-
- /* SECONDARY_EXEC_DESC is enabled/disabled on writes to CR4.UMIP,
- * in vmx_set_cr4. */
- exec_control &= ~SECONDARY_EXEC_DESC;
-
- /* SECONDARY_EXEC_SHADOW_VMCS is enabled when L1 executes VMPTRLD
- (handle_vmptrld).
- We can NOT enable shadow_vmcs here because we don't have yet
- a current VMCS12
- */
- exec_control &= ~SECONDARY_EXEC_SHADOW_VMCS;
-
- if (!enable_pml)
- exec_control &= ~SECONDARY_EXEC_ENABLE_PML;
-
- if (vmx_xsaves_supported()) {
- /* Exposing XSAVES only when XSAVE is exposed */
- bool xsaves_enabled =
- guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
- guest_cpuid_has(vcpu, X86_FEATURE_XSAVES);
-
- vcpu->arch.xsaves_enabled = xsaves_enabled;
-
- if (!xsaves_enabled)
- exec_control &= ~SECONDARY_EXEC_XSAVES;
-
- if (nested) {
- if (xsaves_enabled)
- vmx->nested.msrs.secondary_ctls_high |=
- SECONDARY_EXEC_XSAVES;
- else
- vmx->nested.msrs.secondary_ctls_high &=
- ~SECONDARY_EXEC_XSAVES;
- }
- }
-
- if (vmx_rdtscp_supported()) {
- bool rdtscp_enabled = guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP);
- if (!rdtscp_enabled)
- exec_control &= ~SECONDARY_EXEC_RDTSCP;
-
- if (nested) {
- if (rdtscp_enabled)
- vmx->nested.msrs.secondary_ctls_high |=
- SECONDARY_EXEC_RDTSCP;
- else
- vmx->nested.msrs.secondary_ctls_high &=
- ~SECONDARY_EXEC_RDTSCP;
- }
- }
-
- if (vmx_invpcid_supported()) {
- /* Exposing INVPCID only when PCID is exposed */
- bool invpcid_enabled =
- guest_cpuid_has(vcpu, X86_FEATURE_INVPCID) &&
- guest_cpuid_has(vcpu, X86_FEATURE_PCID);
-
- if (!invpcid_enabled) {
- exec_control &= ~SECONDARY_EXEC_ENABLE_INVPCID;
- guest_cpuid_clear(vcpu, X86_FEATURE_INVPCID);
- }
-
- if (nested) {
- if (invpcid_enabled)
- vmx->nested.msrs.secondary_ctls_high |=
- SECONDARY_EXEC_ENABLE_INVPCID;
- else
- vmx->nested.msrs.secondary_ctls_high &=
- ~SECONDARY_EXEC_ENABLE_INVPCID;
- }
- }
-
- if (vmx_rdrand_supported()) {
- bool rdrand_enabled = guest_cpuid_has(vcpu, X86_FEATURE_RDRAND);
- if (rdrand_enabled)
- exec_control &= ~SECONDARY_EXEC_RDRAND_EXITING;
-
- if (nested) {
- if (rdrand_enabled)
- vmx->nested.msrs.secondary_ctls_high |=
- SECONDARY_EXEC_RDRAND_EXITING;
- else
- vmx->nested.msrs.secondary_ctls_high &=
- ~SECONDARY_EXEC_RDRAND_EXITING;
- }
- }
-
- if (vmx_rdseed_supported()) {
- bool rdseed_enabled = guest_cpuid_has(vcpu, X86_FEATURE_RDSEED);
- if (rdseed_enabled)
- exec_control &= ~SECONDARY_EXEC_RDSEED_EXITING;
-
- if (nested) {
- if (rdseed_enabled)
- vmx->nested.msrs.secondary_ctls_high |=
- SECONDARY_EXEC_RDSEED_EXITING;
- else
- vmx->nested.msrs.secondary_ctls_high &=
- ~SECONDARY_EXEC_RDSEED_EXITING;
- }
- }
-
- if (vmx_waitpkg_supported()) {
- bool waitpkg_enabled =
- guest_cpuid_has(vcpu, X86_FEATURE_WAITPKG);
-
- if (!waitpkg_enabled)
- exec_control &= ~SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
-
- if (nested) {
- if (waitpkg_enabled)
- vmx->nested.msrs.secondary_ctls_high |=
- SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
- else
- vmx->nested.msrs.secondary_ctls_high &=
- ~SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE;
- }
- }
-
- vmx->secondary_exec_control = exec_control;
-}
-
-static void ept_set_mmio_spte_mask(void)
-{
- /*
- * EPT Misconfigurations can be generated if the value of bits 2:0
- * of an EPT paging-structure entry is 110b (write/execute).
- */
- kvm_mmu_set_mmio_spte_mask(VMX_EPT_RWX_MASK,
- VMX_EPT_MISCONFIG_WX_VALUE, 0);
-}
-
-#define VMX_XSS_EXIT_BITMAP 0
-
-/*
- * Noting that the initialization of Guest-state Area of VMCS is in
- * vmx_vcpu_reset().
- */
-static void init_vmcs(struct vcpu_vmx *vmx)
-{
- if (nested)
- nested_vmx_set_vmcs_shadowing_bitmap();
-
- if (cpu_has_vmx_msr_bitmap())
- vmcs_write64(MSR_BITMAP, __pa(vmx->vmcs01.msr_bitmap));
-
- vmcs_write64(VMCS_LINK_POINTER, -1ull); /* 22.3.1.5 */
-
- /* Control */
- pin_controls_set(vmx, vmx_pin_based_exec_ctrl(vmx));
-
- exec_controls_set(vmx, vmx_exec_control(vmx));
-
- if (cpu_has_secondary_exec_ctrls()) {
- vmx_compute_secondary_exec_control(vmx);
- secondary_exec_controls_set(vmx, vmx->secondary_exec_control);
- }
-
- if (kvm_vcpu_apicv_active(&vmx->vcpu)) {
- vmcs_write64(EOI_EXIT_BITMAP0, 0);
- vmcs_write64(EOI_EXIT_BITMAP1, 0);
- vmcs_write64(EOI_EXIT_BITMAP2, 0);
- vmcs_write64(EOI_EXIT_BITMAP3, 0);
-
- vmcs_write16(GUEST_INTR_STATUS, 0);
-
- vmcs_write16(POSTED_INTR_NV, POSTED_INTR_VECTOR);
- vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
- }
-
- if (!kvm_pause_in_guest(vmx->vcpu.kvm)) {
- vmcs_write32(PLE_GAP, ple_gap);
- vmx->ple_window = ple_window;
- vmx->ple_window_dirty = true;
- }
-
- vmcs_write32(PAGE_FAULT_ERROR_CODE_MASK, 0);
- vmcs_write32(PAGE_FAULT_ERROR_CODE_MATCH, 0);
- vmcs_write32(CR3_TARGET_COUNT, 0); /* 22.2.1 */
-
- vmcs_write16(HOST_FS_SELECTOR, 0); /* 22.2.4 */
- vmcs_write16(HOST_GS_SELECTOR, 0); /* 22.2.4 */
- vmx_set_constant_host_state(vmx);
- vmcs_writel(HOST_FS_BASE, 0); /* 22.2.4 */
- vmcs_writel(HOST_GS_BASE, 0); /* 22.2.4 */
-
- if (cpu_has_vmx_vmfunc())
- vmcs_write64(VM_FUNCTION_CONTROL, 0);
-
- vmcs_write32(VM_EXIT_MSR_STORE_COUNT, 0);
- vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, 0);
- vmcs_write64(VM_EXIT_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.host.val));
- vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, 0);
- vmcs_write64(VM_ENTRY_MSR_LOAD_ADDR, __pa(vmx->msr_autoload.guest.val));
-
- if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT)
- vmcs_write64(GUEST_IA32_PAT, vmx->vcpu.arch.pat);
-
- vm_exit_controls_set(vmx, vmx_vmexit_ctrl());
-
- /* 22.2.1, 20.8.1 */
- vm_entry_controls_set(vmx, vmx_vmentry_ctrl());
-
- vmx->vcpu.arch.cr0_guest_owned_bits = X86_CR0_TS;
- vmcs_writel(CR0_GUEST_HOST_MASK, ~X86_CR0_TS);
-
- set_cr4_guest_host_mask(vmx);
-
- if (vmx->vpid != 0)
- vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
-
- if (vmx_xsaves_supported())
- vmcs_write64(XSS_EXIT_BITMAP, VMX_XSS_EXIT_BITMAP);
-
- if (enable_pml) {
- vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
- vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
- }
-
- if (cpu_has_vmx_encls_vmexit())
- vmcs_write64(ENCLS_EXITING_BITMAP, -1ull);
-
- if (pt_mode == PT_MODE_HOST_GUEST) {
- memset(&vmx->pt_desc, 0, sizeof(vmx->pt_desc));
- /* Bit[6~0] are forced to 1, writes are ignored. */
- vmx->pt_desc.guest.output_mask = 0x7F;
- vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
- }
-}
-
-static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct msr_data apic_base_msr;
- u64 cr0;
-
- vmx->rmode.vm86_active = 0;
- vmx->spec_ctrl = 0;
-
- vmx->msr_ia32_umwait_control = 0;
-
- vcpu->arch.microcode_version = 0x100000000ULL;
- vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
- vmx->hv_deadline_tsc = -1;
- kvm_set_cr8(vcpu, 0);
-
- if (!init_event) {
- apic_base_msr.data = APIC_DEFAULT_PHYS_BASE |
- MSR_IA32_APICBASE_ENABLE;
- if (kvm_vcpu_is_reset_bsp(vcpu))
- apic_base_msr.data |= MSR_IA32_APICBASE_BSP;
- apic_base_msr.host_initiated = true;
- kvm_set_apic_base(vcpu, &apic_base_msr);
- }
-
- vmx_segment_cache_clear(vmx);
-
- seg_setup(VCPU_SREG_CS);
- vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
- vmcs_writel(GUEST_CS_BASE, 0xffff0000ul);
-
- seg_setup(VCPU_SREG_DS);
- seg_setup(VCPU_SREG_ES);
- seg_setup(VCPU_SREG_FS);
- seg_setup(VCPU_SREG_GS);
- seg_setup(VCPU_SREG_SS);
-
- vmcs_write16(GUEST_TR_SELECTOR, 0);
- vmcs_writel(GUEST_TR_BASE, 0);
- vmcs_write32(GUEST_TR_LIMIT, 0xffff);
- vmcs_write32(GUEST_TR_AR_BYTES, 0x008b);
-
- vmcs_write16(GUEST_LDTR_SELECTOR, 0);
- vmcs_writel(GUEST_LDTR_BASE, 0);
- vmcs_write32(GUEST_LDTR_LIMIT, 0xffff);
- vmcs_write32(GUEST_LDTR_AR_BYTES, 0x00082);
-
- if (!init_event) {
- vmcs_write32(GUEST_SYSENTER_CS, 0);
- vmcs_writel(GUEST_SYSENTER_ESP, 0);
- vmcs_writel(GUEST_SYSENTER_EIP, 0);
- vmcs_write64(GUEST_IA32_DEBUGCTL, 0);
- }
-
- kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
- kvm_rip_write(vcpu, 0xfff0);
-
- vmcs_writel(GUEST_GDTR_BASE, 0);
- vmcs_write32(GUEST_GDTR_LIMIT, 0xffff);
-
- vmcs_writel(GUEST_IDTR_BASE, 0);
- vmcs_write32(GUEST_IDTR_LIMIT, 0xffff);
-
- vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
- vmcs_write32(GUEST_INTERRUPTIBILITY_INFO, 0);
- vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, 0);
- if (kvm_mpx_supported())
- vmcs_write64(GUEST_BNDCFGS, 0);
-
- setup_msrs(vmx);
-
- vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0); /* 22.2.1 */
-
- if (cpu_has_vmx_tpr_shadow() && !init_event) {
- vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, 0);
- if (cpu_need_tpr_shadow(vcpu))
- vmcs_write64(VIRTUAL_APIC_PAGE_ADDR,
- __pa(vcpu->arch.apic->regs));
- vmcs_write32(TPR_THRESHOLD, 0);
- }
-
- kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
-
- cr0 = X86_CR0_NW | X86_CR0_CD | X86_CR0_ET;
- vmx->vcpu.arch.cr0 = cr0;
- vmx_set_cr0(vcpu, cr0); /* enter rmode */
- vmx_set_cr4(vcpu, 0);
- vmx_set_efer(vcpu, 0);
-
- update_exception_bitmap(vcpu);
-
- vpid_sync_context(vmx->vpid);
- if (init_event)
- vmx_clear_hlt(vcpu);
-}
-
-static void enable_irq_window(struct kvm_vcpu *vcpu)
-{
- exec_controls_setbit(to_vmx(vcpu), CPU_BASED_INTR_WINDOW_EXITING);
-}
-
-static void enable_nmi_window(struct kvm_vcpu *vcpu)
-{
- if (!enable_vnmi ||
- vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_STI) {
- enable_irq_window(vcpu);
- return;
- }
-
- exec_controls_setbit(to_vmx(vcpu), CPU_BASED_NMI_WINDOW_EXITING);
-}
-
-static void vmx_inject_irq(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- uint32_t intr;
- int irq = vcpu->arch.interrupt.nr;
-
- trace_kvm_inj_virq(irq);
-
- ++vcpu->stat.irq_injections;
- if (vmx->rmode.vm86_active) {
- int inc_eip = 0;
- if (vcpu->arch.interrupt.soft)
- inc_eip = vcpu->arch.event_exit_inst_len;
- kvm_inject_realmode_interrupt(vcpu, irq, inc_eip);
- return;
- }
- intr = irq | INTR_INFO_VALID_MASK;
- if (vcpu->arch.interrupt.soft) {
- intr |= INTR_TYPE_SOFT_INTR;
- vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
- vmx->vcpu.arch.event_exit_inst_len);
- } else
- intr |= INTR_TYPE_EXT_INTR;
- vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr);
-
- vmx_clear_hlt(vcpu);
-}
-
-static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (!enable_vnmi) {
- /*
- * Tracking the NMI-blocked state in software is built upon
- * finding the next open IRQ window. This, in turn, depends on
- * well-behaving guests: They have to keep IRQs disabled at
- * least as long as the NMI handler runs. Otherwise we may
- * cause NMI nesting, maybe breaking the guest. But as this is
- * highly unlikely, we can live with the residual risk.
- */
- vmx->loaded_vmcs->soft_vnmi_blocked = 1;
- vmx->loaded_vmcs->vnmi_blocked_time = 0;
- }
-
- ++vcpu->stat.nmi_injections;
- vmx->loaded_vmcs->nmi_known_unmasked = false;
-
- if (vmx->rmode.vm86_active) {
- kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR, 0);
- return;
- }
-
- vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
- INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
-
- vmx_clear_hlt(vcpu);
-}
-
-bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- bool masked;
-
- if (!enable_vnmi)
- return vmx->loaded_vmcs->soft_vnmi_blocked;
- if (vmx->loaded_vmcs->nmi_known_unmasked)
- return false;
- masked = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_NMI;
- vmx->loaded_vmcs->nmi_known_unmasked = !masked;
- return masked;
-}
-
-void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (!enable_vnmi) {
- if (vmx->loaded_vmcs->soft_vnmi_blocked != masked) {
- vmx->loaded_vmcs->soft_vnmi_blocked = masked;
- vmx->loaded_vmcs->vnmi_blocked_time = 0;
- }
- } else {
- vmx->loaded_vmcs->nmi_known_unmasked = !masked;
- if (masked)
- vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
- else
- vmcs_clear_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
- }
-}
-
-static int vmx_nmi_allowed(struct kvm_vcpu *vcpu)
-{
- if (to_vmx(vcpu)->nested.nested_run_pending)
- return 0;
-
- if (!enable_vnmi &&
- to_vmx(vcpu)->loaded_vmcs->soft_vnmi_blocked)
- return 0;
-
- return !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
- (GUEST_INTR_STATE_MOV_SS | GUEST_INTR_STATE_STI
- | GUEST_INTR_STATE_NMI));
-}
-
-static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu)
-{
- return (!to_vmx(vcpu)->nested.nested_run_pending &&
- vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF) &&
- !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
- (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
-}
-
-static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr)
-{
- int ret;
-
- if (enable_unrestricted_guest)
- return 0;
-
- ret = x86_set_memory_region(kvm, TSS_PRIVATE_MEMSLOT, addr,
- PAGE_SIZE * 3);
- if (ret)
- return ret;
- to_kvm_vmx(kvm)->tss_addr = addr;
- return init_rmode_tss(kvm);
-}
-
-static int vmx_set_identity_map_addr(struct kvm *kvm, u64 ident_addr)
-{
- to_kvm_vmx(kvm)->ept_identity_map_addr = ident_addr;
- return 0;
-}
-
-static bool rmode_exception(struct kvm_vcpu *vcpu, int vec)
-{
- switch (vec) {
- case BP_VECTOR:
- /*
- * Update instruction length as we may reinject the exception
- * from user space while in guest debugging mode.
- */
- to_vmx(vcpu)->vcpu.arch.event_exit_inst_len =
- vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
- if (vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP)
- return false;
- /* fall through */
- case DB_VECTOR:
- if (vcpu->guest_debug &
- (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))
- return false;
- /* fall through */
- case DE_VECTOR:
- case OF_VECTOR:
- case BR_VECTOR:
- case UD_VECTOR:
- case DF_VECTOR:
- case SS_VECTOR:
- case GP_VECTOR:
- case MF_VECTOR:
- return true;
- break;
- }
- return false;
-}
-
-static int handle_rmode_exception(struct kvm_vcpu *vcpu,
- int vec, u32 err_code)
-{
- /*
- * Instruction with address size override prefix opcode 0x67
- * Cause the #SS fault with 0 error code in VM86 mode.
- */
- if (((vec == GP_VECTOR) || (vec == SS_VECTOR)) && err_code == 0) {
- if (kvm_emulate_instruction(vcpu, 0)) {
- if (vcpu->arch.halt_request) {
- vcpu->arch.halt_request = 0;
- return kvm_vcpu_halt(vcpu);
- }
- return 1;
- }
- return 0;
- }
-
- /*
- * Forward all other exceptions that are valid in real mode.
- * FIXME: Breaks guest debugging in real mode, needs to be fixed with
- * the required debugging infrastructure rework.
- */
- kvm_queue_exception(vcpu, vec);
- return 1;
-}
-
-/*
- * Trigger machine check on the host. We assume all the MSRs are already set up
- * by the CPU and that we still run on the same CPU as the MCE occurred on.
- * We pass a fake environment to the machine check handler because we want
- * the guest to be always treated like user space, no matter what context
- * it used internally.
- */
-static void kvm_machine_check(void)
-{
-#if defined(CONFIG_X86_MCE) && defined(CONFIG_X86_64)
- struct pt_regs regs = {
- .cs = 3, /* Fake ring 3 no matter what the guest ran on */
- .flags = X86_EFLAGS_IF,
- };
-
- do_machine_check(®s, 0);
-#endif
-}
-
-static int handle_machine_check(struct kvm_vcpu *vcpu)
-{
- /* handled by vmx_vcpu_run() */
- return 1;
-}
-
-static int handle_exception_nmi(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct kvm_run *kvm_run = vcpu->run;
- u32 intr_info, ex_no, error_code;
- unsigned long cr2, rip, dr6;
- u32 vect_info;
-
- vect_info = vmx->idt_vectoring_info;
- intr_info = vmx->exit_intr_info;
-
- if (is_machine_check(intr_info) || is_nmi(intr_info))
- return 1; /* handled by handle_exception_nmi_irqoff() */
-
- if (is_invalid_opcode(intr_info))
- return handle_ud(vcpu);
-
- error_code = 0;
- if (intr_info & INTR_INFO_DELIVER_CODE_MASK)
- error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE);
-
- if (!vmx->rmode.vm86_active && is_gp_fault(intr_info)) {
- WARN_ON_ONCE(!enable_vmware_backdoor);
-
- /*
- * VMware backdoor emulation on #GP interception only handles
- * IN{S}, OUT{S}, and RDPMC, none of which generate a non-zero
- * error code on #GP.
- */
- if (error_code) {
- kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
- return 1;
- }
- return kvm_emulate_instruction(vcpu, EMULTYPE_VMWARE_GP);
- }
-
- /*
- * The #PF with PFEC.RSVD = 1 indicates the guest is accessing
- * MMIO, it is better to report an internal error.
- * See the comments in vmx_handle_exit.
- */
- if ((vect_info & VECTORING_INFO_VALID_MASK) &&
- !(is_page_fault(intr_info) && !(error_code & PFERR_RSVD_MASK))) {
- vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
- vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_SIMUL_EX;
- vcpu->run->internal.ndata = 3;
- vcpu->run->internal.data[0] = vect_info;
- vcpu->run->internal.data[1] = intr_info;
- vcpu->run->internal.data[2] = error_code;
- return 0;
- }
-
- if (is_page_fault(intr_info)) {
- cr2 = vmcs_readl(EXIT_QUALIFICATION);
- /* EPT won't cause page fault directly */
- WARN_ON_ONCE(!vcpu->arch.apf.host_apf_reason && enable_ept);
- return kvm_handle_page_fault(vcpu, error_code, cr2, NULL, 0);
- }
-
- ex_no = intr_info & INTR_INFO_VECTOR_MASK;
-
- if (vmx->rmode.vm86_active && rmode_exception(vcpu, ex_no))
- return handle_rmode_exception(vcpu, ex_no, error_code);
-
- switch (ex_no) {
- case AC_VECTOR:
- kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
- return 1;
- case DB_VECTOR:
- dr6 = vmcs_readl(EXIT_QUALIFICATION);
- if (!(vcpu->guest_debug &
- (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))) {
- vcpu->arch.dr6 &= ~DR_TRAP_BITS;
- vcpu->arch.dr6 |= dr6 | DR6_RTM;
- if (is_icebp(intr_info))
- WARN_ON(!skip_emulated_instruction(vcpu));
-
- kvm_queue_exception(vcpu, DB_VECTOR);
- return 1;
- }
- kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1;
- kvm_run->debug.arch.dr7 = vmcs_readl(GUEST_DR7);
- /* fall through */
- case BP_VECTOR:
- /*
- * Update instruction length as we may reinject #BP from
- * user space while in guest debugging mode. Reading it for
- * #DB as well causes no harm, it is not used in that case.
- */
- vmx->vcpu.arch.event_exit_inst_len =
- vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
- kvm_run->exit_reason = KVM_EXIT_DEBUG;
- rip = kvm_rip_read(vcpu);
- kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
- kvm_run->debug.arch.exception = ex_no;
- break;
- default:
- kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
- kvm_run->ex.exception = ex_no;
- kvm_run->ex.error_code = error_code;
- break;
- }
- return 0;
-}
-
-static __always_inline int handle_external_interrupt(struct kvm_vcpu *vcpu)
-{
- ++vcpu->stat.irq_exits;
- return 1;
-}
-
-static int handle_triple_fault(struct kvm_vcpu *vcpu)
-{
- vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
- vcpu->mmio_needed = 0;
- return 0;
-}
-
-static int handle_io(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification;
- int size, in, string;
- unsigned port;
-
- exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
- string = (exit_qualification & 16) != 0;
-
- ++vcpu->stat.io_exits;
-
- if (string)
- return kvm_emulate_instruction(vcpu, 0);
-
- port = exit_qualification >> 16;
- size = (exit_qualification & 7) + 1;
- in = (exit_qualification & 8) != 0;
-
- return kvm_fast_pio(vcpu, size, port, in);
-}
-
-static void
-vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall)
-{
- /*
- * Patch in the VMCALL instruction:
- */
- hypercall[0] = 0x0f;
- hypercall[1] = 0x01;
- hypercall[2] = 0xc1;
-}
-
-/* called to set cr0 as appropriate for a mov-to-cr0 exit. */
-static int handle_set_cr0(struct kvm_vcpu *vcpu, unsigned long val)
-{
- if (is_guest_mode(vcpu)) {
- struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
- unsigned long orig_val = val;
-
- /*
- * We get here when L2 changed cr0 in a way that did not change
- * any of L1's shadowed bits (see nested_vmx_exit_handled_cr),
- * but did change L0 shadowed bits. So we first calculate the
- * effective cr0 value that L1 would like to write into the
- * hardware. It consists of the L2-owned bits from the new
- * value combined with the L1-owned bits from L1's guest_cr0.
- */
- val = (val & ~vmcs12->cr0_guest_host_mask) |
- (vmcs12->guest_cr0 & vmcs12->cr0_guest_host_mask);
-
- if (!nested_guest_cr0_valid(vcpu, val))
- return 1;
-
- if (kvm_set_cr0(vcpu, val))
- return 1;
- vmcs_writel(CR0_READ_SHADOW, orig_val);
- return 0;
- } else {
- if (to_vmx(vcpu)->nested.vmxon &&
- !nested_host_cr0_valid(vcpu, val))
- return 1;
-
- return kvm_set_cr0(vcpu, val);
- }
-}
-
-static int handle_set_cr4(struct kvm_vcpu *vcpu, unsigned long val)
-{
- if (is_guest_mode(vcpu)) {
- struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
- unsigned long orig_val = val;
-
- /* analogously to handle_set_cr0 */
- val = (val & ~vmcs12->cr4_guest_host_mask) |
- (vmcs12->guest_cr4 & vmcs12->cr4_guest_host_mask);
- if (kvm_set_cr4(vcpu, val))
- return 1;
- vmcs_writel(CR4_READ_SHADOW, orig_val);
- return 0;
- } else
- return kvm_set_cr4(vcpu, val);
-}
-
-static int handle_desc(struct kvm_vcpu *vcpu)
-{
- WARN_ON(!(vcpu->arch.cr4 & X86_CR4_UMIP));
- return kvm_emulate_instruction(vcpu, 0);
-}
-
-static int handle_cr(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification, val;
- int cr;
- int reg;
- int err;
- int ret;
-
- exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
- cr = exit_qualification & 15;
- reg = (exit_qualification >> 8) & 15;
- switch ((exit_qualification >> 4) & 3) {
- case 0: /* mov to cr */
- val = kvm_register_readl(vcpu, reg);
- trace_kvm_cr_write(cr, val);
- switch (cr) {
- case 0:
- err = handle_set_cr0(vcpu, val);
- return kvm_complete_insn_gp(vcpu, err);
- case 3:
- WARN_ON_ONCE(enable_unrestricted_guest);
- err = kvm_set_cr3(vcpu, val);
- return kvm_complete_insn_gp(vcpu, err);
- case 4:
- err = handle_set_cr4(vcpu, val);
- return kvm_complete_insn_gp(vcpu, err);
- case 8: {
- u8 cr8_prev = kvm_get_cr8(vcpu);
- u8 cr8 = (u8)val;
- err = kvm_set_cr8(vcpu, cr8);
- ret = kvm_complete_insn_gp(vcpu, err);
- if (lapic_in_kernel(vcpu))
- return ret;
- if (cr8_prev <= cr8)
- return ret;
- /*
- * TODO: we might be squashing a
- * KVM_GUESTDBG_SINGLESTEP-triggered
- * KVM_EXIT_DEBUG here.
- */
- vcpu->run->exit_reason = KVM_EXIT_SET_TPR;
- return 0;
- }
- }
- break;
- case 2: /* clts */
- WARN_ONCE(1, "Guest should always own CR0.TS");
- vmx_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~X86_CR0_TS));
- trace_kvm_cr_write(0, kvm_read_cr0(vcpu));
- return kvm_skip_emulated_instruction(vcpu);
- case 1: /*mov from cr*/
- switch (cr) {
- case 3:
- WARN_ON_ONCE(enable_unrestricted_guest);
- val = kvm_read_cr3(vcpu);
- kvm_register_write(vcpu, reg, val);
- trace_kvm_cr_read(cr, val);
- return kvm_skip_emulated_instruction(vcpu);
- case 8:
- val = kvm_get_cr8(vcpu);
- kvm_register_write(vcpu, reg, val);
- trace_kvm_cr_read(cr, val);
- return kvm_skip_emulated_instruction(vcpu);
- }
- break;
- case 3: /* lmsw */
- val = (exit_qualification >> LMSW_SOURCE_DATA_SHIFT) & 0x0f;
- trace_kvm_cr_write(0, (kvm_read_cr0(vcpu) & ~0xful) | val);
- kvm_lmsw(vcpu, val);
-
- return kvm_skip_emulated_instruction(vcpu);
- default:
- break;
- }
- vcpu->run->exit_reason = 0;
- vcpu_unimpl(vcpu, "unhandled control register: op %d cr %d\n",
- (int)(exit_qualification >> 4) & 3, cr);
- return 0;
-}
-
-static int handle_dr(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification;
- int dr, dr7, reg;
-
- exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
- dr = exit_qualification & DEBUG_REG_ACCESS_NUM;
-
- /* First, if DR does not exist, trigger UD */
- if (!kvm_require_dr(vcpu, dr))
- return 1;
-
- /* Do not handle if the CPL > 0, will trigger GP on re-entry */
- if (!kvm_require_cpl(vcpu, 0))
- return 1;
- dr7 = vmcs_readl(GUEST_DR7);
- if (dr7 & DR7_GD) {
- /*
- * As the vm-exit takes precedence over the debug trap, we
- * need to emulate the latter, either for the host or the
- * guest debugging itself.
- */
- if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
- vcpu->run->debug.arch.dr6 = vcpu->arch.dr6;
- vcpu->run->debug.arch.dr7 = dr7;
- vcpu->run->debug.arch.pc = kvm_get_linear_rip(vcpu);
- vcpu->run->debug.arch.exception = DB_VECTOR;
- vcpu->run->exit_reason = KVM_EXIT_DEBUG;
- return 0;
- } else {
- vcpu->arch.dr6 &= ~DR_TRAP_BITS;
- vcpu->arch.dr6 |= DR6_BD | DR6_RTM;
- kvm_queue_exception(vcpu, DB_VECTOR);
- return 1;
- }
- }
-
- if (vcpu->guest_debug == 0) {
- exec_controls_clearbit(to_vmx(vcpu), CPU_BASED_MOV_DR_EXITING);
-
- /*
- * No more DR vmexits; force a reload of the debug registers
- * and reenter on this instruction. The next vmexit will
- * retrieve the full state of the debug registers.
- */
- vcpu->arch.switch_db_regs |= KVM_DEBUGREG_WONT_EXIT;
- return 1;
- }
-
- reg = DEBUG_REG_ACCESS_REG(exit_qualification);
- if (exit_qualification & TYPE_MOV_FROM_DR) {
- unsigned long val;
-
- if (kvm_get_dr(vcpu, dr, &val))
- return 1;
- kvm_register_write(vcpu, reg, val);
- } else
- if (kvm_set_dr(vcpu, dr, kvm_register_readl(vcpu, reg)))
- return 1;
-
- return kvm_skip_emulated_instruction(vcpu);
-}
-
-static u64 vmx_get_dr6(struct kvm_vcpu *vcpu)
-{
- return vcpu->arch.dr6;
-}
-
-static void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val)
-{
-}
-
-static void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
-{
- get_debugreg(vcpu->arch.db[0], 0);
- get_debugreg(vcpu->arch.db[1], 1);
- get_debugreg(vcpu->arch.db[2], 2);
- get_debugreg(vcpu->arch.db[3], 3);
- get_debugreg(vcpu->arch.dr6, 6);
- vcpu->arch.dr7 = vmcs_readl(GUEST_DR7);
-
- vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_WONT_EXIT;
- exec_controls_setbit(to_vmx(vcpu), CPU_BASED_MOV_DR_EXITING);
-}
-
-static void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
-{
- vmcs_writel(GUEST_DR7, val);
-}
-
-static int handle_tpr_below_threshold(struct kvm_vcpu *vcpu)
-{
- kvm_apic_update_ppr(vcpu);
- return 1;
-}
-
-static int handle_interrupt_window(struct kvm_vcpu *vcpu)
-{
- exec_controls_clearbit(to_vmx(vcpu), CPU_BASED_INTR_WINDOW_EXITING);
-
- kvm_make_request(KVM_REQ_EVENT, vcpu);
-
- ++vcpu->stat.irq_window_exits;
- return 1;
-}
-
-static int handle_vmcall(struct kvm_vcpu *vcpu)
-{
- return kvm_emulate_hypercall(vcpu);
-}
-
-static int handle_invd(struct kvm_vcpu *vcpu)
-{
- return kvm_emulate_instruction(vcpu, 0);
-}
-
-static int handle_invlpg(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
- kvm_mmu_invlpg(vcpu, exit_qualification);
- return kvm_skip_emulated_instruction(vcpu);
-}
-
-static int handle_rdpmc(struct kvm_vcpu *vcpu)
-{
- int err;
-
- err = kvm_rdpmc(vcpu);
- return kvm_complete_insn_gp(vcpu, err);
-}
-
-static int handle_wbinvd(struct kvm_vcpu *vcpu)
-{
- return kvm_emulate_wbinvd(vcpu);
-}
-
-static int handle_xsetbv(struct kvm_vcpu *vcpu)
-{
- u64 new_bv = kvm_read_edx_eax(vcpu);
- u32 index = kvm_rcx_read(vcpu);
-
- if (kvm_set_xcr(vcpu, index, new_bv) == 0)
- return kvm_skip_emulated_instruction(vcpu);
- return 1;
-}
-
-static int handle_apic_access(struct kvm_vcpu *vcpu)
-{
- if (likely(fasteoi)) {
- unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
- int access_type, offset;
-
- access_type = exit_qualification & APIC_ACCESS_TYPE;
- offset = exit_qualification & APIC_ACCESS_OFFSET;
- /*
- * Sane guest uses MOV to write EOI, with written value
- * not cared. So make a short-circuit here by avoiding
- * heavy instruction emulation.
- */
- if ((access_type == TYPE_LINEAR_APIC_INST_WRITE) &&
- (offset == APIC_EOI)) {
- kvm_lapic_set_eoi(vcpu);
- return kvm_skip_emulated_instruction(vcpu);
- }
- }
- return kvm_emulate_instruction(vcpu, 0);
-}
-
-static int handle_apic_eoi_induced(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
- int vector = exit_qualification & 0xff;
-
- /* EOI-induced VM exit is trap-like and thus no need to adjust IP */
- kvm_apic_set_eoi_accelerated(vcpu, vector);
- return 1;
-}
-
-static int handle_apic_write(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
- u32 offset = exit_qualification & 0xfff;
-
- /* APIC-write VM exit is trap-like and thus no need to adjust IP */
- kvm_apic_write_nodecode(vcpu, offset);
- return 1;
-}
-
-static int handle_task_switch(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long exit_qualification;
- bool has_error_code = false;
- u32 error_code = 0;
- u16 tss_selector;
- int reason, type, idt_v, idt_index;
-
- idt_v = (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK);
- idt_index = (vmx->idt_vectoring_info & VECTORING_INFO_VECTOR_MASK);
- type = (vmx->idt_vectoring_info & VECTORING_INFO_TYPE_MASK);
-
- exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
- reason = (u32)exit_qualification >> 30;
- if (reason == TASK_SWITCH_GATE && idt_v) {
- switch (type) {
- case INTR_TYPE_NMI_INTR:
- vcpu->arch.nmi_injected = false;
- vmx_set_nmi_mask(vcpu, true);
- break;
- case INTR_TYPE_EXT_INTR:
- case INTR_TYPE_SOFT_INTR:
- kvm_clear_interrupt_queue(vcpu);
- break;
- case INTR_TYPE_HARD_EXCEPTION:
- if (vmx->idt_vectoring_info &
- VECTORING_INFO_DELIVER_CODE_MASK) {
- has_error_code = true;
- error_code =
- vmcs_read32(IDT_VECTORING_ERROR_CODE);
- }
- /* fall through */
- case INTR_TYPE_SOFT_EXCEPTION:
- kvm_clear_exception_queue(vcpu);
- break;
- default:
- break;
- }
- }
- tss_selector = exit_qualification;
-
- if (!idt_v || (type != INTR_TYPE_HARD_EXCEPTION &&
- type != INTR_TYPE_EXT_INTR &&
- type != INTR_TYPE_NMI_INTR))
- WARN_ON(!skip_emulated_instruction(vcpu));
-
- /*
- * TODO: What about debug traps on tss switch?
- * Are we supposed to inject them and update dr6?
- */
- return kvm_task_switch(vcpu, tss_selector,
- type == INTR_TYPE_SOFT_INTR ? idt_index : -1,
- reason, has_error_code, error_code);
-}
-
-static int handle_ept_violation(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification;
- gpa_t gpa;
- u64 error_code;
-
- exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
- /*
- * EPT violation happened while executing iret from NMI,
- * "blocked by NMI" bit has to be set before next VM entry.
- * There are errata that may cause this bit to not be set:
- * AAK134, BY25.
- */
- if (!(to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
- enable_vnmi &&
- (exit_qualification & INTR_INFO_UNBLOCK_NMI))
- vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO, GUEST_INTR_STATE_NMI);
-
- gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
- trace_kvm_page_fault(gpa, exit_qualification);
-
- /* Is it a read fault? */
- error_code = (exit_qualification & EPT_VIOLATION_ACC_READ)
- ? PFERR_USER_MASK : 0;
- /* Is it a write fault? */
- error_code |= (exit_qualification & EPT_VIOLATION_ACC_WRITE)
- ? PFERR_WRITE_MASK : 0;
- /* Is it a fetch fault? */
- error_code |= (exit_qualification & EPT_VIOLATION_ACC_INSTR)
- ? PFERR_FETCH_MASK : 0;
- /* ept page table entry is present? */
- error_code |= (exit_qualification &
- (EPT_VIOLATION_READABLE | EPT_VIOLATION_WRITABLE |
- EPT_VIOLATION_EXECUTABLE))
- ? PFERR_PRESENT_MASK : 0;
-
- error_code |= (exit_qualification & 0x100) != 0 ?
- PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK;
-
- vcpu->arch.exit_qualification = exit_qualification;
- return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
-}
-
-static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
-{
- gpa_t gpa;
-
- /*
- * A nested guest cannot optimize MMIO vmexits, because we have an
- * nGPA here instead of the required GPA.
- */
- gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
- if (!is_guest_mode(vcpu) &&
- !kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
- trace_kvm_fast_mmio(gpa);
- return kvm_skip_emulated_instruction(vcpu);
- }
-
- return kvm_mmu_page_fault(vcpu, gpa, PFERR_RSVD_MASK, NULL, 0);
-}
-
-static int handle_nmi_window(struct kvm_vcpu *vcpu)
-{
- WARN_ON_ONCE(!enable_vnmi);
- exec_controls_clearbit(to_vmx(vcpu), CPU_BASED_NMI_WINDOW_EXITING);
- ++vcpu->stat.nmi_window_exits;
- kvm_make_request(KVM_REQ_EVENT, vcpu);
-
- return 1;
-}
-
-static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- bool intr_window_requested;
- unsigned count = 130;
-
- /*
- * We should never reach the point where we are emulating L2
- * due to invalid guest state as that means we incorrectly
- * allowed a nested VMEntry with an invalid vmcs12.
- */
- WARN_ON_ONCE(vmx->emulation_required && vmx->nested.nested_run_pending);
-
- intr_window_requested = exec_controls_get(vmx) &
- CPU_BASED_INTR_WINDOW_EXITING;
-
- while (vmx->emulation_required && count-- != 0) {
- if (intr_window_requested && vmx_interrupt_allowed(vcpu))
- return handle_interrupt_window(&vmx->vcpu);
-
- if (kvm_test_request(KVM_REQ_EVENT, vcpu))
- return 1;
-
- if (!kvm_emulate_instruction(vcpu, 0))
- return 0;
-
- if (vmx->emulation_required && !vmx->rmode.vm86_active &&
- vcpu->arch.exception.pending) {
- vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
- vcpu->run->internal.suberror =
- KVM_INTERNAL_ERROR_EMULATION;
- vcpu->run->internal.ndata = 0;
- return 0;
- }
-
- if (vcpu->arch.halt_request) {
- vcpu->arch.halt_request = 0;
- return kvm_vcpu_halt(vcpu);
- }
-
- /*
- * Note, return 1 and not 0, vcpu_run() is responsible for
- * morphing the pending signal into the proper return code.
- */
- if (signal_pending(current))
- return 1;
-
- if (need_resched())
- schedule();
- }
-
- return 1;
-}
-
-static void grow_ple_window(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned int old = vmx->ple_window;
-
- vmx->ple_window = __grow_ple_window(old, ple_window,
- ple_window_grow,
- ple_window_max);
-
- if (vmx->ple_window != old) {
- vmx->ple_window_dirty = true;
- trace_kvm_ple_window_update(vcpu->vcpu_id,
- vmx->ple_window, old);
- }
-}
-
-static void shrink_ple_window(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned int old = vmx->ple_window;
-
- vmx->ple_window = __shrink_ple_window(old, ple_window,
- ple_window_shrink,
- ple_window);
-
- if (vmx->ple_window != old) {
- vmx->ple_window_dirty = true;
- trace_kvm_ple_window_update(vcpu->vcpu_id,
- vmx->ple_window, old);
- }
-}
-
-/*
- * Handler for POSTED_INTERRUPT_WAKEUP_VECTOR.
- */
-static void wakeup_handler(void)
-{
- struct kvm_vcpu *vcpu;
- int cpu = smp_processor_id();
-
- spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
- list_for_each_entry(vcpu, &per_cpu(blocked_vcpu_on_cpu, cpu),
- blocked_vcpu_list) {
- struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
-
- if (pi_test_on(pi_desc) == 1)
- kvm_vcpu_kick(vcpu);
- }
- spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
-}
-
-static void vmx_enable_tdp(void)
-{
- kvm_mmu_set_mask_ptes(VMX_EPT_READABLE_MASK,
- enable_ept_ad_bits ? VMX_EPT_ACCESS_BIT : 0ull,
- enable_ept_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull,
- 0ull, VMX_EPT_EXECUTABLE_MASK,
- cpu_has_vmx_ept_execute_only() ? 0ull : VMX_EPT_READABLE_MASK,
- VMX_EPT_RWX_MASK, 0ull);
-
- ept_set_mmio_spte_mask();
- kvm_enable_tdp();
-}
-
-/*
- * Indicate a busy-waiting vcpu in spinlock. We do not enable the PAUSE
- * exiting, so only get here on cpu with PAUSE-Loop-Exiting.
- */
-static int handle_pause(struct kvm_vcpu *vcpu)
-{
- if (!kvm_pause_in_guest(vcpu->kvm))
- grow_ple_window(vcpu);
-
- /*
- * Intel sdm vol3 ch-25.1.3 says: The "PAUSE-loop exiting"
- * VM-execution control is ignored if CPL > 0. OTOH, KVM
- * never set PAUSE_EXITING and just set PLE if supported,
- * so the vcpu must be CPL=0 if it gets a PAUSE exit.
- */
- kvm_vcpu_on_spin(vcpu, true);
- return kvm_skip_emulated_instruction(vcpu);
-}
-
-static int handle_nop(struct kvm_vcpu *vcpu)
-{
- return kvm_skip_emulated_instruction(vcpu);
-}
-
-static int handle_mwait(struct kvm_vcpu *vcpu)
-{
- printk_once(KERN_WARNING "kvm: MWAIT instruction emulated as NOP!\n");
- return handle_nop(vcpu);
-}
-
-static int handle_invalid_op(struct kvm_vcpu *vcpu)
-{
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
-}
-
-static int handle_monitor_trap(struct kvm_vcpu *vcpu)
-{
- return 1;
-}
-
-static int handle_monitor(struct kvm_vcpu *vcpu)
-{
- printk_once(KERN_WARNING "kvm: MONITOR instruction emulated as NOP!\n");
- return handle_nop(vcpu);
-}
-
-static int handle_invpcid(struct kvm_vcpu *vcpu)
-{
- u32 vmx_instruction_info;
- unsigned long type;
- bool pcid_enabled;
- gva_t gva;
- struct x86_exception e;
- unsigned i;
- unsigned long roots_to_free = 0;
- struct {
- u64 pcid;
- u64 gla;
- } operand;
-
- if (!guest_cpuid_has(vcpu, X86_FEATURE_INVPCID)) {
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
- }
-
- vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
- type = kvm_register_readl(vcpu, (vmx_instruction_info >> 28) & 0xf);
-
- if (type > 3) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- }
-
- /* According to the Intel instruction reference, the memory operand
- * is read even if it isn't needed (e.g., for type==all)
- */
- if (get_vmx_mem_address(vcpu, vmcs_readl(EXIT_QUALIFICATION),
- vmx_instruction_info, false,
- sizeof(operand), &gva))
- return 1;
-
- if (kvm_read_guest_virt(vcpu, gva, &operand, sizeof(operand), &e)) {
- kvm_inject_page_fault(vcpu, &e);
- return 1;
- }
-
- if (operand.pcid >> 12 != 0) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- }
-
- pcid_enabled = kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE);
-
- switch (type) {
- case INVPCID_TYPE_INDIV_ADDR:
- if ((!pcid_enabled && (operand.pcid != 0)) ||
- is_noncanonical_address(operand.gla, vcpu)) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- }
- kvm_mmu_invpcid_gva(vcpu, operand.gla, operand.pcid);
- return kvm_skip_emulated_instruction(vcpu);
-
- case INVPCID_TYPE_SINGLE_CTXT:
- if (!pcid_enabled && (operand.pcid != 0)) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- }
-
- if (kvm_get_active_pcid(vcpu) == operand.pcid) {
- kvm_mmu_sync_roots(vcpu);
- kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
- }
-
- for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
- if (kvm_get_pcid(vcpu, vcpu->arch.mmu->prev_roots[i].cr3)
- == operand.pcid)
- roots_to_free |= KVM_MMU_ROOT_PREVIOUS(i);
-
- kvm_mmu_free_roots(vcpu, vcpu->arch.mmu, roots_to_free);
- /*
- * If neither the current cr3 nor any of the prev_roots use the
- * given PCID, then nothing needs to be done here because a
- * resync will happen anyway before switching to any other CR3.
- */
-
- return kvm_skip_emulated_instruction(vcpu);
-
- case INVPCID_TYPE_ALL_NON_GLOBAL:
- /*
- * Currently, KVM doesn't mark global entries in the shadow
- * page tables, so a non-global flush just degenerates to a
- * global flush. If needed, we could optimize this later by
- * keeping track of global entries in shadow page tables.
- */
-
- /* fall-through */
- case INVPCID_TYPE_ALL_INCL_GLOBAL:
- kvm_mmu_unload(vcpu);
- return kvm_skip_emulated_instruction(vcpu);
-
- default:
- BUG(); /* We have already checked above that type <= 3 */
- }
-}
-
-static int handle_pml_full(struct kvm_vcpu *vcpu)
-{
- unsigned long exit_qualification;
-
- trace_kvm_pml_full(vcpu->vcpu_id);
-
- exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
- /*
- * PML buffer FULL happened while executing iret from NMI,
- * "blocked by NMI" bit has to be set before next VM entry.
- */
- if (!(to_vmx(vcpu)->idt_vectoring_info & VECTORING_INFO_VALID_MASK) &&
- enable_vnmi &&
- (exit_qualification & INTR_INFO_UNBLOCK_NMI))
- vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
-
- /*
- * PML buffer already flushed at beginning of VMEXIT. Nothing to do
- * here.., and there's no userspace involvement needed for PML.
- */
- return 1;
-}
-
-static int handle_preemption_timer(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (!vmx->req_immediate_exit &&
- !unlikely(vmx->loaded_vmcs->hv_timer_soft_disabled))
- kvm_lapic_expired_hv_timer(vcpu);
-
- return 1;
-}
-
-/*
- * When nested=0, all VMX instruction VM Exits filter here. The handlers
- * are overwritten by nested_vmx_setup() when nested=1.
- */
-static int handle_vmx_instruction(struct kvm_vcpu *vcpu)
-{
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
-}
-
-static int handle_encls(struct kvm_vcpu *vcpu)
-{
- /*
- * SGX virtualization is not yet supported. There is no software
- * enable bit for SGX, so we have to trap ENCLS and inject a #UD
- * to prevent the guest from executing ENCLS.
- */
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
-}
-
-/*
- * The exit handlers return 1 if the exit was handled fully and guest execution
- * may resume. Otherwise they set the kvm_run parameter to indicate what needs
- * to be done to userspace and return 0.
- */
-static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = {
- [EXIT_REASON_EXCEPTION_NMI] = handle_exception_nmi,
- [EXIT_REASON_EXTERNAL_INTERRUPT] = handle_external_interrupt,
- [EXIT_REASON_TRIPLE_FAULT] = handle_triple_fault,
- [EXIT_REASON_NMI_WINDOW] = handle_nmi_window,
- [EXIT_REASON_IO_INSTRUCTION] = handle_io,
- [EXIT_REASON_CR_ACCESS] = handle_cr,
- [EXIT_REASON_DR_ACCESS] = handle_dr,
- [EXIT_REASON_CPUID] = kvm_emulate_cpuid,
- [EXIT_REASON_MSR_READ] = kvm_emulate_rdmsr,
- [EXIT_REASON_MSR_WRITE] = kvm_emulate_wrmsr,
- [EXIT_REASON_INTERRUPT_WINDOW] = handle_interrupt_window,
- [EXIT_REASON_HLT] = kvm_emulate_halt,
- [EXIT_REASON_INVD] = handle_invd,
- [EXIT_REASON_INVLPG] = handle_invlpg,
- [EXIT_REASON_RDPMC] = handle_rdpmc,
- [EXIT_REASON_VMCALL] = handle_vmcall,
- [EXIT_REASON_VMCLEAR] = handle_vmx_instruction,
- [EXIT_REASON_VMLAUNCH] = handle_vmx_instruction,
- [EXIT_REASON_VMPTRLD] = handle_vmx_instruction,
- [EXIT_REASON_VMPTRST] = handle_vmx_instruction,
- [EXIT_REASON_VMREAD] = handle_vmx_instruction,
- [EXIT_REASON_VMRESUME] = handle_vmx_instruction,
- [EXIT_REASON_VMWRITE] = handle_vmx_instruction,
- [EXIT_REASON_VMOFF] = handle_vmx_instruction,
- [EXIT_REASON_VMON] = handle_vmx_instruction,
- [EXIT_REASON_TPR_BELOW_THRESHOLD] = handle_tpr_below_threshold,
- [EXIT_REASON_APIC_ACCESS] = handle_apic_access,
- [EXIT_REASON_APIC_WRITE] = handle_apic_write,
- [EXIT_REASON_EOI_INDUCED] = handle_apic_eoi_induced,
- [EXIT_REASON_WBINVD] = handle_wbinvd,
- [EXIT_REASON_XSETBV] = handle_xsetbv,
- [EXIT_REASON_TASK_SWITCH] = handle_task_switch,
- [EXIT_REASON_MCE_DURING_VMENTRY] = handle_machine_check,
- [EXIT_REASON_GDTR_IDTR] = handle_desc,
- [EXIT_REASON_LDTR_TR] = handle_desc,
- [EXIT_REASON_EPT_VIOLATION] = handle_ept_violation,
- [EXIT_REASON_EPT_MISCONFIG] = handle_ept_misconfig,
- [EXIT_REASON_PAUSE_INSTRUCTION] = handle_pause,
- [EXIT_REASON_MWAIT_INSTRUCTION] = handle_mwait,
- [EXIT_REASON_MONITOR_TRAP_FLAG] = handle_monitor_trap,
- [EXIT_REASON_MONITOR_INSTRUCTION] = handle_monitor,
- [EXIT_REASON_INVEPT] = handle_vmx_instruction,
- [EXIT_REASON_INVVPID] = handle_vmx_instruction,
- [EXIT_REASON_RDRAND] = handle_invalid_op,
- [EXIT_REASON_RDSEED] = handle_invalid_op,
- [EXIT_REASON_PML_FULL] = handle_pml_full,
- [EXIT_REASON_INVPCID] = handle_invpcid,
- [EXIT_REASON_VMFUNC] = handle_vmx_instruction,
- [EXIT_REASON_PREEMPTION_TIMER] = handle_preemption_timer,
- [EXIT_REASON_ENCLS] = handle_encls,
-};
-
-static const int kvm_vmx_max_exit_handlers =
- ARRAY_SIZE(kvm_vmx_exit_handlers);
-
-static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2)
-{
- *info1 = vmcs_readl(EXIT_QUALIFICATION);
- *info2 = vmcs_read32(VM_EXIT_INTR_INFO);
-}
-
-static void vmx_destroy_pml_buffer(struct vcpu_vmx *vmx)
-{
- if (vmx->pml_pg) {
- __free_page(vmx->pml_pg);
- vmx->pml_pg = NULL;
- }
-}
-
-static void vmx_flush_pml_buffer(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- u64 *pml_buf;
- u16 pml_idx;
-
- pml_idx = vmcs_read16(GUEST_PML_INDEX);
-
- /* Do nothing if PML buffer is empty */
- if (pml_idx == (PML_ENTITY_NUM - 1))
- return;
-
- /* PML index always points to next available PML buffer entity */
- if (pml_idx >= PML_ENTITY_NUM)
- pml_idx = 0;
- else
- pml_idx++;
-
- pml_buf = page_address(vmx->pml_pg);
- for (; pml_idx < PML_ENTITY_NUM; pml_idx++) {
- u64 gpa;
-
- gpa = pml_buf[pml_idx];
- WARN_ON(gpa & (PAGE_SIZE - 1));
- kvm_vcpu_mark_page_dirty(vcpu, gpa >> PAGE_SHIFT);
- }
-
- /* reset PML index */
- vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
-}
-
-/*
- * Flush all vcpus' PML buffer and update logged GPAs to dirty_bitmap.
- * Called before reporting dirty_bitmap to userspace.
- */
-static void kvm_flush_pml_buffers(struct kvm *kvm)
-{
- int i;
- struct kvm_vcpu *vcpu;
- /*
- * We only need to kick vcpu out of guest mode here, as PML buffer
- * is flushed at beginning of all VMEXITs, and it's obvious that only
- * vcpus running in guest are possible to have unflushed GPAs in PML
- * buffer.
- */
- kvm_for_each_vcpu(i, vcpu, kvm)
- kvm_vcpu_kick(vcpu);
-}
-
-static void vmx_dump_sel(char *name, uint32_t sel)
-{
- pr_err("%s sel=0x%04x, attr=0x%05x, limit=0x%08x, base=0x%016lx\n",
- name, vmcs_read16(sel),
- vmcs_read32(sel + GUEST_ES_AR_BYTES - GUEST_ES_SELECTOR),
- vmcs_read32(sel + GUEST_ES_LIMIT - GUEST_ES_SELECTOR),
- vmcs_readl(sel + GUEST_ES_BASE - GUEST_ES_SELECTOR));
-}
-
-static void vmx_dump_dtsel(char *name, uint32_t limit)
-{
- pr_err("%s limit=0x%08x, base=0x%016lx\n",
- name, vmcs_read32(limit),
- vmcs_readl(limit + GUEST_GDTR_BASE - GUEST_GDTR_LIMIT));
-}
-
-void dump_vmcs(void)
-{
- u32 vmentry_ctl, vmexit_ctl;
- u32 cpu_based_exec_ctrl, pin_based_exec_ctrl, secondary_exec_control;
- unsigned long cr4;
- u64 efer;
- int i, n;
-
- if (!dump_invalid_vmcs) {
- pr_warn_ratelimited("set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.\n");
- return;
- }
-
- vmentry_ctl = vmcs_read32(VM_ENTRY_CONTROLS);
- vmexit_ctl = vmcs_read32(VM_EXIT_CONTROLS);
- cpu_based_exec_ctrl = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
- pin_based_exec_ctrl = vmcs_read32(PIN_BASED_VM_EXEC_CONTROL);
- cr4 = vmcs_readl(GUEST_CR4);
- efer = vmcs_read64(GUEST_IA32_EFER);
- secondary_exec_control = 0;
- if (cpu_has_secondary_exec_ctrls())
- secondary_exec_control = vmcs_read32(SECONDARY_VM_EXEC_CONTROL);
-
- pr_err("*** Guest State ***\n");
- pr_err("CR0: actual=0x%016lx, shadow=0x%016lx, gh_mask=%016lx\n",
- vmcs_readl(GUEST_CR0), vmcs_readl(CR0_READ_SHADOW),
- vmcs_readl(CR0_GUEST_HOST_MASK));
- pr_err("CR4: actual=0x%016lx, shadow=0x%016lx, gh_mask=%016lx\n",
- cr4, vmcs_readl(CR4_READ_SHADOW), vmcs_readl(CR4_GUEST_HOST_MASK));
- pr_err("CR3 = 0x%016lx\n", vmcs_readl(GUEST_CR3));
- if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
- (cr4 & X86_CR4_PAE) && !(efer & EFER_LMA))
- {
- pr_err("PDPTR0 = 0x%016llx PDPTR1 = 0x%016llx\n",
- vmcs_read64(GUEST_PDPTR0), vmcs_read64(GUEST_PDPTR1));
- pr_err("PDPTR2 = 0x%016llx PDPTR3 = 0x%016llx\n",
- vmcs_read64(GUEST_PDPTR2), vmcs_read64(GUEST_PDPTR3));
- }
- pr_err("RSP = 0x%016lx RIP = 0x%016lx\n",
- vmcs_readl(GUEST_RSP), vmcs_readl(GUEST_RIP));
- pr_err("RFLAGS=0x%08lx DR7 = 0x%016lx\n",
- vmcs_readl(GUEST_RFLAGS), vmcs_readl(GUEST_DR7));
- pr_err("Sysenter RSP=%016lx CS:RIP=%04x:%016lx\n",
- vmcs_readl(GUEST_SYSENTER_ESP),
- vmcs_read32(GUEST_SYSENTER_CS), vmcs_readl(GUEST_SYSENTER_EIP));
- vmx_dump_sel("CS: ", GUEST_CS_SELECTOR);
- vmx_dump_sel("DS: ", GUEST_DS_SELECTOR);
- vmx_dump_sel("SS: ", GUEST_SS_SELECTOR);
- vmx_dump_sel("ES: ", GUEST_ES_SELECTOR);
- vmx_dump_sel("FS: ", GUEST_FS_SELECTOR);
- vmx_dump_sel("GS: ", GUEST_GS_SELECTOR);
- vmx_dump_dtsel("GDTR:", GUEST_GDTR_LIMIT);
- vmx_dump_sel("LDTR:", GUEST_LDTR_SELECTOR);
- vmx_dump_dtsel("IDTR:", GUEST_IDTR_LIMIT);
- vmx_dump_sel("TR: ", GUEST_TR_SELECTOR);
- if ((vmexit_ctl & (VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER)) ||
- (vmentry_ctl & (VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_IA32_EFER)))
- pr_err("EFER = 0x%016llx PAT = 0x%016llx\n",
- efer, vmcs_read64(GUEST_IA32_PAT));
- pr_err("DebugCtl = 0x%016llx DebugExceptions = 0x%016lx\n",
- vmcs_read64(GUEST_IA32_DEBUGCTL),
- vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS));
- if (cpu_has_load_perf_global_ctrl() &&
- vmentry_ctl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
- pr_err("PerfGlobCtl = 0x%016llx\n",
- vmcs_read64(GUEST_IA32_PERF_GLOBAL_CTRL));
- if (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS)
- pr_err("BndCfgS = 0x%016llx\n", vmcs_read64(GUEST_BNDCFGS));
- pr_err("Interruptibility = %08x ActivityState = %08x\n",
- vmcs_read32(GUEST_INTERRUPTIBILITY_INFO),
- vmcs_read32(GUEST_ACTIVITY_STATE));
- if (secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
- pr_err("InterruptStatus = %04x\n",
- vmcs_read16(GUEST_INTR_STATUS));
-
- pr_err("*** Host State ***\n");
- pr_err("RIP = 0x%016lx RSP = 0x%016lx\n",
- vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP));
- pr_err("CS=%04x SS=%04x DS=%04x ES=%04x FS=%04x GS=%04x TR=%04x\n",
- vmcs_read16(HOST_CS_SELECTOR), vmcs_read16(HOST_SS_SELECTOR),
- vmcs_read16(HOST_DS_SELECTOR), vmcs_read16(HOST_ES_SELECTOR),
- vmcs_read16(HOST_FS_SELECTOR), vmcs_read16(HOST_GS_SELECTOR),
- vmcs_read16(HOST_TR_SELECTOR));
- pr_err("FSBase=%016lx GSBase=%016lx TRBase=%016lx\n",
- vmcs_readl(HOST_FS_BASE), vmcs_readl(HOST_GS_BASE),
- vmcs_readl(HOST_TR_BASE));
- pr_err("GDTBase=%016lx IDTBase=%016lx\n",
- vmcs_readl(HOST_GDTR_BASE), vmcs_readl(HOST_IDTR_BASE));
- pr_err("CR0=%016lx CR3=%016lx CR4=%016lx\n",
- vmcs_readl(HOST_CR0), vmcs_readl(HOST_CR3),
- vmcs_readl(HOST_CR4));
- pr_err("Sysenter RSP=%016lx CS:RIP=%04x:%016lx\n",
- vmcs_readl(HOST_IA32_SYSENTER_ESP),
- vmcs_read32(HOST_IA32_SYSENTER_CS),
- vmcs_readl(HOST_IA32_SYSENTER_EIP));
- if (vmexit_ctl & (VM_EXIT_LOAD_IA32_PAT | VM_EXIT_LOAD_IA32_EFER))
- pr_err("EFER = 0x%016llx PAT = 0x%016llx\n",
- vmcs_read64(HOST_IA32_EFER),
- vmcs_read64(HOST_IA32_PAT));
- if (cpu_has_load_perf_global_ctrl() &&
- vmexit_ctl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL)
- pr_err("PerfGlobCtl = 0x%016llx\n",
- vmcs_read64(HOST_IA32_PERF_GLOBAL_CTRL));
-
- pr_err("*** Control State ***\n");
- pr_err("PinBased=%08x CPUBased=%08x SecondaryExec=%08x\n",
- pin_based_exec_ctrl, cpu_based_exec_ctrl, secondary_exec_control);
- pr_err("EntryControls=%08x ExitControls=%08x\n", vmentry_ctl, vmexit_ctl);
- pr_err("ExceptionBitmap=%08x PFECmask=%08x PFECmatch=%08x\n",
- vmcs_read32(EXCEPTION_BITMAP),
- vmcs_read32(PAGE_FAULT_ERROR_CODE_MASK),
- vmcs_read32(PAGE_FAULT_ERROR_CODE_MATCH));
- pr_err("VMEntry: intr_info=%08x errcode=%08x ilen=%08x\n",
- vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
- vmcs_read32(VM_ENTRY_EXCEPTION_ERROR_CODE),
- vmcs_read32(VM_ENTRY_INSTRUCTION_LEN));
- pr_err("VMExit: intr_info=%08x errcode=%08x ilen=%08x\n",
- vmcs_read32(VM_EXIT_INTR_INFO),
- vmcs_read32(VM_EXIT_INTR_ERROR_CODE),
- vmcs_read32(VM_EXIT_INSTRUCTION_LEN));
- pr_err(" reason=%08x qualification=%016lx\n",
- vmcs_read32(VM_EXIT_REASON), vmcs_readl(EXIT_QUALIFICATION));
- pr_err("IDTVectoring: info=%08x errcode=%08x\n",
- vmcs_read32(IDT_VECTORING_INFO_FIELD),
- vmcs_read32(IDT_VECTORING_ERROR_CODE));
- pr_err("TSC Offset = 0x%016llx\n", vmcs_read64(TSC_OFFSET));
- if (secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
- pr_err("TSC Multiplier = 0x%016llx\n",
- vmcs_read64(TSC_MULTIPLIER));
- if (cpu_based_exec_ctrl & CPU_BASED_TPR_SHADOW) {
- if (secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) {
- u16 status = vmcs_read16(GUEST_INTR_STATUS);
- pr_err("SVI|RVI = %02x|%02x ", status >> 8, status & 0xff);
- }
- pr_cont("TPR Threshold = 0x%02x\n", vmcs_read32(TPR_THRESHOLD));
- if (secondary_exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)
- pr_err("APIC-access addr = 0x%016llx ", vmcs_read64(APIC_ACCESS_ADDR));
- pr_cont("virt-APIC addr = 0x%016llx\n", vmcs_read64(VIRTUAL_APIC_PAGE_ADDR));
- }
- if (pin_based_exec_ctrl & PIN_BASED_POSTED_INTR)
- pr_err("PostedIntrVec = 0x%02x\n", vmcs_read16(POSTED_INTR_NV));
- if ((secondary_exec_control & SECONDARY_EXEC_ENABLE_EPT))
- pr_err("EPT pointer = 0x%016llx\n", vmcs_read64(EPT_POINTER));
- n = vmcs_read32(CR3_TARGET_COUNT);
- for (i = 0; i + 1 < n; i += 4)
- pr_err("CR3 target%u=%016lx target%u=%016lx\n",
- i, vmcs_readl(CR3_TARGET_VALUE0 + i * 2),
- i + 1, vmcs_readl(CR3_TARGET_VALUE0 + i * 2 + 2));
- if (i < n)
- pr_err("CR3 target%u=%016lx\n",
- i, vmcs_readl(CR3_TARGET_VALUE0 + i * 2));
- if (secondary_exec_control & SECONDARY_EXEC_PAUSE_LOOP_EXITING)
- pr_err("PLE Gap=%08x Window=%08x\n",
- vmcs_read32(PLE_GAP), vmcs_read32(PLE_WINDOW));
- if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID)
- pr_err("Virtual processor ID = 0x%04x\n",
- vmcs_read16(VIRTUAL_PROCESSOR_ID));
-}
-
-/*
- * The guest has exited. See if we can fix it or if we need userspace
- * assistance.
- */
-static int vmx_handle_exit(struct kvm_vcpu *vcpu,
- enum exit_fastpath_completion exit_fastpath)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- u32 exit_reason = vmx->exit_reason;
- u32 vectoring_info = vmx->idt_vectoring_info;
-
- trace_kvm_exit(exit_reason, vcpu, KVM_ISA_VMX);
-
- /*
- * Flush logged GPAs PML buffer, this will make dirty_bitmap more
- * updated. Another good is, in kvm_vm_ioctl_get_dirty_log, before
- * querying dirty_bitmap, we only need to kick all vcpus out of guest
- * mode as if vcpus is in root mode, the PML buffer must has been
- * flushed already.
- */
- if (enable_pml)
- vmx_flush_pml_buffer(vcpu);
-
- /* If guest state is invalid, start emulating */
- if (vmx->emulation_required)
- return handle_invalid_guest_state(vcpu);
-
- if (is_guest_mode(vcpu) && nested_vmx_exit_reflected(vcpu, exit_reason))
- return nested_vmx_reflect_vmexit(vcpu, exit_reason);
-
- if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
- dump_vmcs();
- vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
- vcpu->run->fail_entry.hardware_entry_failure_reason
- = exit_reason;
- return 0;
- }
-
- if (unlikely(vmx->fail)) {
- dump_vmcs();
- vcpu->run->exit_reason = KVM_EXIT_FAIL_ENTRY;
- vcpu->run->fail_entry.hardware_entry_failure_reason
- = vmcs_read32(VM_INSTRUCTION_ERROR);
- return 0;
- }
-
- /*
- * Note:
- * Do not try to fix EXIT_REASON_EPT_MISCONFIG if it caused by
- * delivery event since it indicates guest is accessing MMIO.
- * The vm-exit can be triggered again after return to guest that
- * will cause infinite loop.
- */
- if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
- (exit_reason != EXIT_REASON_EXCEPTION_NMI &&
- exit_reason != EXIT_REASON_EPT_VIOLATION &&
- exit_reason != EXIT_REASON_PML_FULL &&
- exit_reason != EXIT_REASON_TASK_SWITCH)) {
- vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
- vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_DELIVERY_EV;
- vcpu->run->internal.ndata = 3;
- vcpu->run->internal.data[0] = vectoring_info;
- vcpu->run->internal.data[1] = exit_reason;
- vcpu->run->internal.data[2] = vcpu->arch.exit_qualification;
- if (exit_reason == EXIT_REASON_EPT_MISCONFIG) {
- vcpu->run->internal.ndata++;
- vcpu->run->internal.data[3] =
- vmcs_read64(GUEST_PHYSICAL_ADDRESS);
- }
- return 0;
- }
-
- if (unlikely(!enable_vnmi &&
- vmx->loaded_vmcs->soft_vnmi_blocked)) {
- if (vmx_interrupt_allowed(vcpu)) {
- vmx->loaded_vmcs->soft_vnmi_blocked = 0;
- } else if (vmx->loaded_vmcs->vnmi_blocked_time > 1000000000LL &&
- vcpu->arch.nmi_pending) {
- /*
- * This CPU don't support us in finding the end of an
- * NMI-blocked window if the guest runs with IRQs
- * disabled. So we pull the trigger after 1 s of
- * futile waiting, but inform the user about this.
- */
- printk(KERN_WARNING "%s: Breaking out of NMI-blocked "
- "state on VCPU %d after 1 s timeout\n",
- __func__, vcpu->vcpu_id);
- vmx->loaded_vmcs->soft_vnmi_blocked = 0;
- }
- }
-
- if (exit_fastpath == EXIT_FASTPATH_SKIP_EMUL_INS) {
- kvm_skip_emulated_instruction(vcpu);
- return 1;
- } else if (exit_reason < kvm_vmx_max_exit_handlers
- && kvm_vmx_exit_handlers[exit_reason]) {
-#ifdef CONFIG_RETPOLINE
- if (exit_reason == EXIT_REASON_MSR_WRITE)
- return kvm_emulate_wrmsr(vcpu);
- else if (exit_reason == EXIT_REASON_PREEMPTION_TIMER)
- return handle_preemption_timer(vcpu);
- else if (exit_reason == EXIT_REASON_INTERRUPT_WINDOW)
- return handle_interrupt_window(vcpu);
- else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
- return handle_external_interrupt(vcpu);
- else if (exit_reason == EXIT_REASON_HLT)
- return kvm_emulate_halt(vcpu);
- else if (exit_reason == EXIT_REASON_EPT_MISCONFIG)
- return handle_ept_misconfig(vcpu);
-#endif
- return kvm_vmx_exit_handlers[exit_reason](vcpu);
- } else {
- vcpu_unimpl(vcpu, "vmx: unexpected exit reason 0x%x\n",
- exit_reason);
- dump_vmcs();
- vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
- vcpu->run->internal.suberror =
- KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
- vcpu->run->internal.ndata = 1;
- vcpu->run->internal.data[0] = exit_reason;
- return 0;
- }
-}
-
-/*
- * Software based L1D cache flush which is used when microcode providing
- * the cache control MSR is not loaded.
- *
- * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
- * flush it is required to read in 64 KiB because the replacement algorithm
- * is not exactly LRU. This could be sized at runtime via topology
- * information but as all relevant affected CPUs have 32KiB L1D cache size
- * there is no point in doing so.
- */
-static void vmx_l1d_flush(struct kvm_vcpu *vcpu)
-{
- int size = PAGE_SIZE << L1D_CACHE_ORDER;
-
- /*
- * This code is only executed when the the flush mode is 'cond' or
- * 'always'
- */
- if (static_branch_likely(&vmx_l1d_flush_cond)) {
- bool flush_l1d;
-
- /*
- * Clear the per-vcpu flush bit, it gets set again
- * either from vcpu_run() or from one of the unsafe
- * VMEXIT handlers.
- */
- flush_l1d = vcpu->arch.l1tf_flush_l1d;
- vcpu->arch.l1tf_flush_l1d = false;
-
- /*
- * Clear the per-cpu flush bit, it gets set again from
- * the interrupt handlers.
- */
- flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
- kvm_clear_cpu_l1tf_flush_l1d();
-
- if (!flush_l1d)
- return;
- }
-
- vcpu->stat.l1d_flush++;
-
- if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
- wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
- return;
- }
-
- asm volatile(
- /* First ensure the pages are in the TLB */
- "xorl %%eax, %%eax\n"
- ".Lpopulate_tlb:\n\t"
- "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
- "addl $4096, %%eax\n\t"
- "cmpl %%eax, %[size]\n\t"
- "jne .Lpopulate_tlb\n\t"
- "xorl %%eax, %%eax\n\t"
- "cpuid\n\t"
- /* Now fill the cache */
- "xorl %%eax, %%eax\n"
- ".Lfill_cache:\n"
- "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
- "addl $64, %%eax\n\t"
- "cmpl %%eax, %[size]\n\t"
- "jne .Lfill_cache\n\t"
- "lfence\n"
- :: [flush_pages] "r" (vmx_l1d_flush_pages),
- [size] "r" (size)
- : "eax", "ebx", "ecx", "edx");
-}
-
-static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
-{
- struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
- int tpr_threshold;
-
- if (is_guest_mode(vcpu) &&
- nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW))
- return;
-
- tpr_threshold = (irr == -1 || tpr < irr) ? 0 : irr;
- if (is_guest_mode(vcpu))
- to_vmx(vcpu)->nested.l1_tpr_threshold = tpr_threshold;
- else
- vmcs_write32(TPR_THRESHOLD, tpr_threshold);
-}
-
-void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- u32 sec_exec_control;
-
- if (!lapic_in_kernel(vcpu))
- return;
-
- if (!flexpriority_enabled &&
- !cpu_has_vmx_virtualize_x2apic_mode())
- return;
-
- /* Postpone execution until vmcs01 is the current VMCS. */
- if (is_guest_mode(vcpu)) {
- vmx->nested.change_vmcs01_virtual_apic_mode = true;
- return;
- }
-
- sec_exec_control = secondary_exec_controls_get(vmx);
- sec_exec_control &= ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
-
- switch (kvm_get_apic_mode(vcpu)) {
- case LAPIC_MODE_INVALID:
- WARN_ONCE(true, "Invalid local APIC state");
- case LAPIC_MODE_DISABLED:
- break;
- case LAPIC_MODE_XAPIC:
- if (flexpriority_enabled) {
- sec_exec_control |=
- SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
- vmx_flush_tlb(vcpu, true);
- }
- break;
- case LAPIC_MODE_X2APIC:
- if (cpu_has_vmx_virtualize_x2apic_mode())
- sec_exec_control |=
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
- break;
- }
- secondary_exec_controls_set(vmx, sec_exec_control);
-
- vmx_update_msr_bitmap(vcpu);
-}
-
-static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu, hpa_t hpa)
-{
- if (!is_guest_mode(vcpu)) {
- vmcs_write64(APIC_ACCESS_ADDR, hpa);
- vmx_flush_tlb(vcpu, true);
- }
-}
-
-static void vmx_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
-{
- u16 status;
- u8 old;
-
- if (max_isr == -1)
- max_isr = 0;
-
- status = vmcs_read16(GUEST_INTR_STATUS);
- old = status >> 8;
- if (max_isr != old) {
- status &= 0xff;
- status |= max_isr << 8;
- vmcs_write16(GUEST_INTR_STATUS, status);
- }
-}
-
-static void vmx_set_rvi(int vector)
-{
- u16 status;
- u8 old;
-
- if (vector == -1)
- vector = 0;
-
- status = vmcs_read16(GUEST_INTR_STATUS);
- old = (u8)status & 0xff;
- if ((u8)vector != old) {
- status &= ~0xff;
- status |= (u8)vector;
- vmcs_write16(GUEST_INTR_STATUS, status);
- }
-}
-
-static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
-{
- /*
- * When running L2, updating RVI is only relevant when
- * vmcs12 virtual-interrupt-delivery enabled.
- * However, it can be enabled only when L1 also
- * intercepts external-interrupts and in that case
- * we should not update vmcs02 RVI but instead intercept
- * interrupt. Therefore, do nothing when running L2.
- */
- if (!is_guest_mode(vcpu))
- vmx_set_rvi(max_irr);
-}
-
-static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- int max_irr;
- bool max_irr_updated;
-
- WARN_ON(!vcpu->arch.apicv_active);
- if (pi_test_on(&vmx->pi_desc)) {
- pi_clear_on(&vmx->pi_desc);
- /*
- * IOMMU can write to PID.ON, so the barrier matters even on UP.
- * But on x86 this is just a compiler barrier anyway.
- */
- smp_mb__after_atomic();
- max_irr_updated =
- kvm_apic_update_irr(vcpu, vmx->pi_desc.pir, &max_irr);
-
- /*
- * If we are running L2 and L1 has a new pending interrupt
- * which can be injected, we should re-evaluate
- * what should be done with this new L1 interrupt.
- * If L1 intercepts external-interrupts, we should
- * exit from L2 to L1. Otherwise, interrupt should be
- * delivered directly to L2.
- */
- if (is_guest_mode(vcpu) && max_irr_updated) {
- if (nested_exit_on_intr(vcpu))
- kvm_vcpu_exiting_guest_mode(vcpu);
- else
- kvm_make_request(KVM_REQ_EVENT, vcpu);
- }
- } else {
- max_irr = kvm_lapic_find_highest_irr(vcpu);
- }
- vmx_hwapic_irr_update(vcpu, max_irr);
- return max_irr;
-}
-
-static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
-{
- struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
-
- return pi_test_on(pi_desc) ||
- (pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
-}
-
-static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
-{
- if (!kvm_vcpu_apicv_active(vcpu))
- return;
-
- vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
- vmcs_write64(EOI_EXIT_BITMAP1, eoi_exit_bitmap[1]);
- vmcs_write64(EOI_EXIT_BITMAP2, eoi_exit_bitmap[2]);
- vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
-}
-
-static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- pi_clear_on(&vmx->pi_desc);
- memset(vmx->pi_desc.pir, 0, sizeof(vmx->pi_desc.pir));
-}
-
-static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx)
-{
- vmx->exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
-
- /* if exit due to PF check for async PF */
- if (is_page_fault(vmx->exit_intr_info))
- vmx->vcpu.arch.apf.host_apf_reason = kvm_read_and_reset_pf_reason();
-
- /* Handle machine checks before interrupts are enabled */
- if (is_machine_check(vmx->exit_intr_info))
- kvm_machine_check();
-
- /* We need to handle NMIs before interrupts are enabled */
- if (is_nmi(vmx->exit_intr_info)) {
- kvm_before_interrupt(&vmx->vcpu);
- asm("int $2");
- kvm_after_interrupt(&vmx->vcpu);
- }
-}
-
-static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
-{
- unsigned int vector;
- unsigned long entry;
-#ifdef CONFIG_X86_64
- unsigned long tmp;
-#endif
- gate_desc *desc;
- u32 intr_info;
-
- intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
- if (WARN_ONCE(!is_external_intr(intr_info),
- "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info))
- return;
-
- vector = intr_info & INTR_INFO_VECTOR_MASK;
- desc = (gate_desc *)host_idt_base + vector;
- entry = gate_offset(desc);
-
- kvm_before_interrupt(vcpu);
-
- asm volatile(
-#ifdef CONFIG_X86_64
- "mov %%" _ASM_SP ", %[sp]\n\t"
- "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
- "push $%c[ss]\n\t"
- "push %[sp]\n\t"
-#endif
- "pushf\n\t"
- __ASM_SIZE(push) " $%c[cs]\n\t"
- CALL_NOSPEC
- :
-#ifdef CONFIG_X86_64
- [sp]"=&r"(tmp),
-#endif
- ASM_CALL_CONSTRAINT
- :
- THUNK_TARGET(entry),
- [ss]"i"(__KERNEL_DS),
- [cs]"i"(__KERNEL_CS)
- );
-
- kvm_after_interrupt(vcpu);
-}
-STACK_FRAME_NON_STANDARD(handle_external_interrupt_irqoff);
-
-static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu,
- enum exit_fastpath_completion *exit_fastpath)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (vmx->exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
- handle_external_interrupt_irqoff(vcpu);
- else if (vmx->exit_reason == EXIT_REASON_EXCEPTION_NMI)
- handle_exception_nmi_irqoff(vmx);
- else if (!is_guest_mode(vcpu) &&
- vmx->exit_reason == EXIT_REASON_MSR_WRITE)
- *exit_fastpath = handle_fastpath_set_msr_irqoff(vcpu);
-}
-
-static bool vmx_has_emulated_msr(int index)
-{
- switch (index) {
- case MSR_IA32_SMBASE:
- /*
- * We cannot do SMM unless we can run the guest in big
- * real mode.
- */
- return enable_unrestricted_guest || emulate_invalid_guest_state;
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
- return nested;
- case MSR_AMD64_VIRT_SPEC_CTRL:
- /* This is AMD only. */
- return false;
- default:
- return true;
- }
-}
-
-static bool vmx_pt_supported(void)
-{
- return pt_mode == PT_MODE_HOST_GUEST;
-}
-
-static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
-{
- u32 exit_intr_info;
- bool unblock_nmi;
- u8 vector;
- bool idtv_info_valid;
-
- idtv_info_valid = vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK;
-
- if (enable_vnmi) {
- if (vmx->loaded_vmcs->nmi_known_unmasked)
- return;
- /*
- * Can't use vmx->exit_intr_info since we're not sure what
- * the exit reason is.
- */
- exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
- unblock_nmi = (exit_intr_info & INTR_INFO_UNBLOCK_NMI) != 0;
- vector = exit_intr_info & INTR_INFO_VECTOR_MASK;
- /*
- * SDM 3: 27.7.1.2 (September 2008)
- * Re-set bit "block by NMI" before VM entry if vmexit caused by
- * a guest IRET fault.
- * SDM 3: 23.2.2 (September 2008)
- * Bit 12 is undefined in any of the following cases:
- * If the VM exit sets the valid bit in the IDT-vectoring
- * information field.
- * If the VM exit is due to a double fault.
- */
- if ((exit_intr_info & INTR_INFO_VALID_MASK) && unblock_nmi &&
- vector != DF_VECTOR && !idtv_info_valid)
- vmcs_set_bits(GUEST_INTERRUPTIBILITY_INFO,
- GUEST_INTR_STATE_NMI);
- else
- vmx->loaded_vmcs->nmi_known_unmasked =
- !(vmcs_read32(GUEST_INTERRUPTIBILITY_INFO)
- & GUEST_INTR_STATE_NMI);
- } else if (unlikely(vmx->loaded_vmcs->soft_vnmi_blocked))
- vmx->loaded_vmcs->vnmi_blocked_time +=
- ktime_to_ns(ktime_sub(ktime_get(),
- vmx->loaded_vmcs->entry_time));
-}
-
-static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
- u32 idt_vectoring_info,
- int instr_len_field,
- int error_code_field)
-{
- u8 vector;
- int type;
- bool idtv_info_valid;
-
- idtv_info_valid = idt_vectoring_info & VECTORING_INFO_VALID_MASK;
-
- vcpu->arch.nmi_injected = false;
- kvm_clear_exception_queue(vcpu);
- kvm_clear_interrupt_queue(vcpu);
-
- if (!idtv_info_valid)
- return;
-
- kvm_make_request(KVM_REQ_EVENT, vcpu);
-
- vector = idt_vectoring_info & VECTORING_INFO_VECTOR_MASK;
- type = idt_vectoring_info & VECTORING_INFO_TYPE_MASK;
-
- switch (type) {
- case INTR_TYPE_NMI_INTR:
- vcpu->arch.nmi_injected = true;
- /*
- * SDM 3: 27.7.1.2 (September 2008)
- * Clear bit "block by NMI" before VM entry if a NMI
- * delivery faulted.
- */
- vmx_set_nmi_mask(vcpu, false);
- break;
- case INTR_TYPE_SOFT_EXCEPTION:
- vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
- /* fall through */
- case INTR_TYPE_HARD_EXCEPTION:
- if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK) {
- u32 err = vmcs_read32(error_code_field);
- kvm_requeue_exception_e(vcpu, vector, err);
- } else
- kvm_requeue_exception(vcpu, vector);
- break;
- case INTR_TYPE_SOFT_INTR:
- vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
- /* fall through */
- case INTR_TYPE_EXT_INTR:
- kvm_queue_interrupt(vcpu, vector, type == INTR_TYPE_SOFT_INTR);
- break;
- default:
- break;
- }
-}
-
-static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
-{
- __vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
- VM_EXIT_INSTRUCTION_LEN,
- IDT_VECTORING_ERROR_CODE);
-}
-
-static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
-{
- __vmx_complete_interrupts(vcpu,
- vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
- VM_ENTRY_INSTRUCTION_LEN,
- VM_ENTRY_EXCEPTION_ERROR_CODE);
-
- vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
-}
-
-static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
-{
- int i, nr_msrs;
- struct perf_guest_switch_msr *msrs;
-
- msrs = perf_guest_get_msrs(&nr_msrs);
-
- if (!msrs)
- return;
-
- for (i = 0; i < nr_msrs; i++)
- if (msrs[i].host == msrs[i].guest)
- clear_atomic_switch_msr(vmx, msrs[i].msr);
- else
- add_atomic_switch_msr(vmx, msrs[i].msr, msrs[i].guest,
- msrs[i].host, false);
-}
-
-static void atomic_switch_umwait_control_msr(struct vcpu_vmx *vmx)
-{
- u32 host_umwait_control;
-
- if (!vmx_has_waitpkg(vmx))
- return;
-
- host_umwait_control = get_umwait_control_msr();
-
- if (vmx->msr_ia32_umwait_control != host_umwait_control)
- add_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL,
- vmx->msr_ia32_umwait_control,
- host_umwait_control, false);
- else
- clear_atomic_switch_msr(vmx, MSR_IA32_UMWAIT_CONTROL);
-}
-
-static void vmx_update_hv_timer(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- u64 tscl;
- u32 delta_tsc;
-
- if (vmx->req_immediate_exit) {
- vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, 0);
- vmx->loaded_vmcs->hv_timer_soft_disabled = false;
- } else if (vmx->hv_deadline_tsc != -1) {
- tscl = rdtsc();
- if (vmx->hv_deadline_tsc > tscl)
- /* set_hv_timer ensures the delta fits in 32-bits */
- delta_tsc = (u32)((vmx->hv_deadline_tsc - tscl) >>
- cpu_preemption_timer_multi);
- else
- delta_tsc = 0;
-
- vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, delta_tsc);
- vmx->loaded_vmcs->hv_timer_soft_disabled = false;
- } else if (!vmx->loaded_vmcs->hv_timer_soft_disabled) {
- vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, -1);
- vmx->loaded_vmcs->hv_timer_soft_disabled = true;
- }
-}
-
-void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp)
-{
- if (unlikely(host_rsp != vmx->loaded_vmcs->host_state.rsp)) {
- vmx->loaded_vmcs->host_state.rsp = host_rsp;
- vmcs_writel(HOST_RSP, host_rsp);
- }
-}
-
-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs, bool launched);
-
-static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- unsigned long cr3, cr4;
-
- /* Record the guest's net vcpu time for enforced NMI injections. */
- if (unlikely(!enable_vnmi &&
- vmx->loaded_vmcs->soft_vnmi_blocked))
- vmx->loaded_vmcs->entry_time = ktime_get();
-
- /* Don't enter VMX if guest state is invalid, let the exit handler
- start emulation until we arrive back to a valid state */
- if (vmx->emulation_required)
- return;
-
- if (vmx->ple_window_dirty) {
- vmx->ple_window_dirty = false;
- vmcs_write32(PLE_WINDOW, vmx->ple_window);
- }
-
- if (vmx->nested.need_vmcs12_to_shadow_sync)
- nested_sync_vmcs12_to_shadow(vcpu);
-
- if (kvm_register_is_dirty(vcpu, VCPU_REGS_RSP))
- vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);
- if (kvm_register_is_dirty(vcpu, VCPU_REGS_RIP))
- vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);
-
- cr3 = __get_current_cr3_fast();
- if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
- vmcs_writel(HOST_CR3, cr3);
- vmx->loaded_vmcs->host_state.cr3 = cr3;
- }
-
- cr4 = cr4_read_shadow();
- if (unlikely(cr4 != vmx->loaded_vmcs->host_state.cr4)) {
- vmcs_writel(HOST_CR4, cr4);
- vmx->loaded_vmcs->host_state.cr4 = cr4;
- }
-
- /* When single-stepping over STI and MOV SS, we must clear the
- * corresponding interruptibility bits in the guest state. Otherwise
- * vmentry fails as it then expects bit 14 (BS) in pending debug
- * exceptions being set, but that's not correct for the guest debugging
- * case. */
- if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
- vmx_set_interrupt_shadow(vcpu, 0);
-
- kvm_load_guest_xsave_state(vcpu);
-
- if (static_cpu_has(X86_FEATURE_PKU) &&
- kvm_read_cr4_bits(vcpu, X86_CR4_PKE) &&
- vcpu->arch.pkru != vmx->host_pkru)
- __write_pkru(vcpu->arch.pkru);
-
- pt_guest_enter(vmx);
-
- atomic_switch_perf_msrs(vmx);
- atomic_switch_umwait_control_msr(vmx);
-
- if (enable_preemption_timer)
- vmx_update_hv_timer(vcpu);
-
- if (lapic_in_kernel(vcpu) &&
- vcpu->arch.apic->lapic_timer.timer_advance_ns)
- kvm_wait_lapic_expire(vcpu);
-
- /*
- * If this vCPU has touched SPEC_CTRL, restore the guest's value if
- * it's non-zero. Since vmentry is serialising on affected CPUs, there
- * is no need to worry about the conditional branch over the wrmsr
- * being speculatively taken.
- */
- x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
-
- /* L1D Flush includes CPU buffer clear to mitigate MDS */
- if (static_branch_unlikely(&vmx_l1d_should_flush))
- vmx_l1d_flush(vcpu);
- else if (static_branch_unlikely(&mds_user_clear))
- mds_clear_cpu_buffers();
-
- if (vcpu->arch.cr2 != read_cr2())
- write_cr2(vcpu->arch.cr2);
-
- vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
- vmx->loaded_vmcs->launched);
-
- vcpu->arch.cr2 = read_cr2();
-
- /*
- * We do not use IBRS in the kernel. If this vCPU has used the
- * SPEC_CTRL MSR it may have left it on; save the value and
- * turn it off. This is much more efficient than blindly adding
- * it to the atomic save/restore list. Especially as the former
- * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
- *
- * For non-nested case:
- * If the L01 MSR bitmap does not intercept the MSR, then we need to
- * save it.
- *
- * For nested case:
- * If the L02 MSR bitmap does not intercept the MSR, then we need to
- * save it.
- */
- if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
- vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
-
- x86_spec_ctrl_restore_host(vmx->spec_ctrl, 0);
-
- /* All fields are clean at this point */
- if (static_branch_unlikely(&enable_evmcs))
- current_evmcs->hv_clean_fields |=
- HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL;
-
- if (static_branch_unlikely(&enable_evmcs))
- current_evmcs->hv_vp_id = vcpu->arch.hyperv.vp_index;
-
- /* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
- if (vmx->host_debugctlmsr)
- update_debugctlmsr(vmx->host_debugctlmsr);
-
-#ifndef CONFIG_X86_64
- /*
- * The sysexit path does not restore ds/es, so we must set them to
- * a reasonable value ourselves.
- *
- * We can't defer this to vmx_prepare_switch_to_host() since that
- * function may be executed in interrupt context, which saves and
- * restore segments around it, nullifying its effect.
- */
- loadsegment(ds, __USER_DS);
- loadsegment(es, __USER_DS);
-#endif
-
- vcpu->arch.regs_avail = ~((1 << VCPU_REGS_RIP) | (1 << VCPU_REGS_RSP)
- | (1 << VCPU_EXREG_RFLAGS)
- | (1 << VCPU_EXREG_PDPTR)
- | (1 << VCPU_EXREG_SEGMENTS)
- | (1 << VCPU_EXREG_CR3));
- vcpu->arch.regs_dirty = 0;
-
- pt_guest_exit(vmx);
-
- /*
- * eager fpu is enabled if PKEY is supported and CR4 is switched
- * back on host, so it is safe to read guest PKRU from current
- * XSAVE.
- */
- if (static_cpu_has(X86_FEATURE_PKU) &&
- kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
- vcpu->arch.pkru = rdpkru();
- if (vcpu->arch.pkru != vmx->host_pkru)
- __write_pkru(vmx->host_pkru);
- }
-
- kvm_load_host_xsave_state(vcpu);
-
- vmx->nested.nested_run_pending = 0;
- vmx->idt_vectoring_info = 0;
-
- vmx->exit_reason = vmx->fail ? 0xdead : vmcs_read32(VM_EXIT_REASON);
- if ((u16)vmx->exit_reason == EXIT_REASON_MCE_DURING_VMENTRY)
- kvm_machine_check();
-
- if (vmx->fail || (vmx->exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY))
- return;
-
- vmx->loaded_vmcs->launched = 1;
- vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
-
- vmx_recover_nmi_blocking(vmx);
- vmx_complete_interrupts(vmx);
-}
-
-static struct kvm *vmx_vm_alloc(void)
-{
- struct kvm_vmx *kvm_vmx = __vmalloc(sizeof(struct kvm_vmx),
- GFP_KERNEL_ACCOUNT | __GFP_ZERO,
- PAGE_KERNEL);
- return &kvm_vmx->kvm;
-}
-
-static void vmx_vm_free(struct kvm *kvm)
-{
- kfree(kvm->arch.hyperv.hv_pa_pg);
- vfree(to_kvm_vmx(kvm));
-}
-
-static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (enable_pml)
- vmx_destroy_pml_buffer(vmx);
- free_vpid(vmx->vpid);
- nested_vmx_free_vcpu(vcpu);
- free_loaded_vmcs(vmx->loaded_vmcs);
- kvm_vcpu_uninit(vcpu);
- kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.user_fpu);
- kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
- kmem_cache_free(kvm_vcpu_cache, vmx);
-}
-
-static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id)
-{
- int err;
- struct vcpu_vmx *vmx;
- unsigned long *msr_bitmap;
- int i, cpu;
-
- BUILD_BUG_ON_MSG(offsetof(struct vcpu_vmx, vcpu) != 0,
- "struct kvm_vcpu must be at offset 0 for arch usercopy region");
-
- vmx = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL_ACCOUNT);
- if (!vmx)
- return ERR_PTR(-ENOMEM);
-
- vmx->vcpu.arch.user_fpu = kmem_cache_zalloc(x86_fpu_cache,
- GFP_KERNEL_ACCOUNT);
- if (!vmx->vcpu.arch.user_fpu) {
- printk(KERN_ERR "kvm: failed to allocate kvm userspace's fpu\n");
- err = -ENOMEM;
- goto free_partial_vcpu;
- }
-
- vmx->vcpu.arch.guest_fpu = kmem_cache_zalloc(x86_fpu_cache,
- GFP_KERNEL_ACCOUNT);
- if (!vmx->vcpu.arch.guest_fpu) {
- printk(KERN_ERR "kvm: failed to allocate vcpu's fpu\n");
- err = -ENOMEM;
- goto free_user_fpu;
- }
-
- vmx->vpid = allocate_vpid();
-
- err = kvm_vcpu_init(&vmx->vcpu, kvm, id);
- if (err)
- goto free_vcpu;
-
- err = -ENOMEM;
-
- /*
- * If PML is turned on, failure on enabling PML just results in failure
- * of creating the vcpu, therefore we can simplify PML logic (by
- * avoiding dealing with cases, such as enabling PML partially on vcpus
- * for the guest), etc.
- */
- if (enable_pml) {
- vmx->pml_pg = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
- if (!vmx->pml_pg)
- goto uninit_vcpu;
- }
-
- BUILD_BUG_ON(ARRAY_SIZE(vmx_msr_index) != NR_SHARED_MSRS);
-
- for (i = 0; i < ARRAY_SIZE(vmx_msr_index); ++i) {
- u32 index = vmx_msr_index[i];
- u32 data_low, data_high;
- int j = vmx->nmsrs;
-
- if (rdmsr_safe(index, &data_low, &data_high) < 0)
- continue;
- if (wrmsr_safe(index, data_low, data_high) < 0)
- continue;
-
- vmx->guest_msrs[j].index = i;
- vmx->guest_msrs[j].data = 0;
- switch (index) {
- case MSR_IA32_TSX_CTRL:
- /*
- * No need to pass TSX_CTRL_CPUID_CLEAR through, so
- * let's avoid changing CPUID bits under the host
- * kernel's feet.
- */
- vmx->guest_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
- break;
- default:
- vmx->guest_msrs[j].mask = -1ull;
- break;
- }
- ++vmx->nmsrs;
- }
-
- err = alloc_loaded_vmcs(&vmx->vmcs01);
- if (err < 0)
- goto free_pml;
-
- msr_bitmap = vmx->vmcs01.msr_bitmap;
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_TSC, MSR_TYPE_R);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
- if (kvm_cstate_in_guest(kvm)) {
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R);
- vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R);
- }
- vmx->msr_bitmap_mode = 0;
-
- vmx->loaded_vmcs = &vmx->vmcs01;
- cpu = get_cpu();
- vmx_vcpu_load(&vmx->vcpu, cpu);
- vmx->vcpu.cpu = cpu;
- init_vmcs(vmx);
- vmx_vcpu_put(&vmx->vcpu);
- put_cpu();
- if (cpu_need_virtualize_apic_accesses(&vmx->vcpu)) {
- err = alloc_apic_access_page(kvm);
- if (err)
- goto free_vmcs;
- }
-
- if (enable_ept && !enable_unrestricted_guest) {
- err = init_rmode_identity_map(kvm);
- if (err)
- goto free_vmcs;
- }
-
- if (nested)
- nested_vmx_setup_ctls_msrs(&vmx->nested.msrs,
- vmx_capability.ept,
- kvm_vcpu_apicv_active(&vmx->vcpu));
- else
- memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
-
- vmx->nested.posted_intr_nv = -1;
- vmx->nested.current_vmptr = -1ull;
-
- vmx->msr_ia32_feature_control_valid_bits = FEATURE_CONTROL_LOCKED;
-
- /*
- * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
- * or POSTED_INTR_WAKEUP_VECTOR.
- */
- vmx->pi_desc.nv = POSTED_INTR_VECTOR;
- vmx->pi_desc.sn = 1;
-
- vmx->ept_pointer = INVALID_PAGE;
-
- return &vmx->vcpu;
-
-free_vmcs:
- free_loaded_vmcs(vmx->loaded_vmcs);
-free_pml:
- vmx_destroy_pml_buffer(vmx);
-uninit_vcpu:
- kvm_vcpu_uninit(&vmx->vcpu);
-free_vcpu:
- free_vpid(vmx->vpid);
- kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.guest_fpu);
-free_user_fpu:
- kmem_cache_free(x86_fpu_cache, vmx->vcpu.arch.user_fpu);
-free_partial_vcpu:
- kmem_cache_free(kvm_vcpu_cache, vmx);
- return ERR_PTR(err);
-}
-
-#define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
-#define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
-
-static int vmx_vm_init(struct kvm *kvm)
-{
- spin_lock_init(&to_kvm_vmx(kvm)->ept_pointer_lock);
-
- if (!ple_gap)
- kvm->arch.pause_in_guest = true;
-
- if (boot_cpu_has(X86_BUG_L1TF) && enable_ept) {
- switch (l1tf_mitigation) {
- case L1TF_MITIGATION_OFF:
- case L1TF_MITIGATION_FLUSH_NOWARN:
- /* 'I explicitly don't care' is set */
- break;
- case L1TF_MITIGATION_FLUSH:
- case L1TF_MITIGATION_FLUSH_NOSMT:
- case L1TF_MITIGATION_FULL:
- /*
- * Warn upon starting the first VM in a potentially
- * insecure environment.
- */
- if (sched_smt_active())
- pr_warn_once(L1TF_MSG_SMT);
- if (l1tf_vmx_mitigation == VMENTER_L1D_FLUSH_NEVER)
- pr_warn_once(L1TF_MSG_L1D);
- break;
- case L1TF_MITIGATION_FULL_FORCE:
- /* Flush is enforced */
- break;
- }
- }
- return 0;
-}
-
-static int __init vmx_check_processor_compat(void)
-{
- struct vmcs_config vmcs_conf;
- struct vmx_capability vmx_cap;
-
- if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
- return -EIO;
- if (nested)
- nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept,
- enable_apicv);
- if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
- printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
- smp_processor_id());
- return -EIO;
- }
- return 0;
-}
-
-static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
-{
- u8 cache;
- u64 ipat = 0;
-
- /* For VT-d and EPT combination
- * 1. MMIO: always map as UC
- * 2. EPT with VT-d:
- * a. VT-d without snooping control feature: can't guarantee the
- * result, try to trust guest.
- * b. VT-d with snooping control feature: snooping control feature of
- * VT-d engine can guarantee the cache correctness. Just set it
- * to WB to keep consistent with host. So the same as item 3.
- * 3. EPT without VT-d: always map as WB and set IPAT=1 to keep
- * consistent with host MTRR
- */
- if (is_mmio) {
- cache = MTRR_TYPE_UNCACHABLE;
- goto exit;
- }
-
- if (!kvm_arch_has_noncoherent_dma(vcpu->kvm)) {
- ipat = VMX_EPT_IPAT_BIT;
- cache = MTRR_TYPE_WRBACK;
- goto exit;
- }
-
- if (kvm_read_cr0(vcpu) & X86_CR0_CD) {
- ipat = VMX_EPT_IPAT_BIT;
- if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
- cache = MTRR_TYPE_WRBACK;
- else
- cache = MTRR_TYPE_UNCACHABLE;
- goto exit;
- }
-
- cache = kvm_mtrr_get_guest_memory_type(vcpu, gfn);
-
-exit:
- return (cache << VMX_EPT_MT_EPTE_SHIFT) | ipat;
-}
-
-static int vmx_get_lpage_level(void)
-{
- if (enable_ept && !cpu_has_vmx_ept_1g_page())
- return PT_DIRECTORY_LEVEL;
- else
- /* For shadow and EPT supported 1GB page */
- return PT_PDPE_LEVEL;
-}
-
-static void vmcs_set_secondary_exec_control(struct vcpu_vmx *vmx)
-{
- /*
- * These bits in the secondary execution controls field
- * are dynamic, the others are mostly based on the hypervisor
- * architecture and the guest's CPUID. Do not touch the
- * dynamic bits.
- */
- u32 mask =
- SECONDARY_EXEC_SHADOW_VMCS |
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
- SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
- SECONDARY_EXEC_DESC;
-
- u32 new_ctl = vmx->secondary_exec_control;
- u32 cur_ctl = secondary_exec_controls_get(vmx);
-
- secondary_exec_controls_set(vmx, (new_ctl & ~mask) | (cur_ctl & mask));
-}
-
-/*
- * Generate MSR_IA32_VMX_CR{0,4}_FIXED1 according to CPUID. Only set bits
- * (indicating "allowed-1") if they are supported in the guest's CPUID.
- */
-static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct kvm_cpuid_entry2 *entry;
-
- vmx->nested.msrs.cr0_fixed1 = 0xffffffff;
- vmx->nested.msrs.cr4_fixed1 = X86_CR4_PCE;
-
-#define cr4_fixed1_update(_cr4_mask, _reg, _cpuid_mask) do { \
- if (entry && (entry->_reg & (_cpuid_mask))) \
- vmx->nested.msrs.cr4_fixed1 |= (_cr4_mask); \
-} while (0)
-
- entry = kvm_find_cpuid_entry(vcpu, 0x1, 0);
- cr4_fixed1_update(X86_CR4_VME, edx, bit(X86_FEATURE_VME));
- cr4_fixed1_update(X86_CR4_PVI, edx, bit(X86_FEATURE_VME));
- cr4_fixed1_update(X86_CR4_TSD, edx, bit(X86_FEATURE_TSC));
- cr4_fixed1_update(X86_CR4_DE, edx, bit(X86_FEATURE_DE));
- cr4_fixed1_update(X86_CR4_PSE, edx, bit(X86_FEATURE_PSE));
- cr4_fixed1_update(X86_CR4_PAE, edx, bit(X86_FEATURE_PAE));
- cr4_fixed1_update(X86_CR4_MCE, edx, bit(X86_FEATURE_MCE));
- cr4_fixed1_update(X86_CR4_PGE, edx, bit(X86_FEATURE_PGE));
- cr4_fixed1_update(X86_CR4_OSFXSR, edx, bit(X86_FEATURE_FXSR));
- cr4_fixed1_update(X86_CR4_OSXMMEXCPT, edx, bit(X86_FEATURE_XMM));
- cr4_fixed1_update(X86_CR4_VMXE, ecx, bit(X86_FEATURE_VMX));
- cr4_fixed1_update(X86_CR4_SMXE, ecx, bit(X86_FEATURE_SMX));
- cr4_fixed1_update(X86_CR4_PCIDE, ecx, bit(X86_FEATURE_PCID));
- cr4_fixed1_update(X86_CR4_OSXSAVE, ecx, bit(X86_FEATURE_XSAVE));
-
- entry = kvm_find_cpuid_entry(vcpu, 0x7, 0);
- cr4_fixed1_update(X86_CR4_FSGSBASE, ebx, bit(X86_FEATURE_FSGSBASE));
- cr4_fixed1_update(X86_CR4_SMEP, ebx, bit(X86_FEATURE_SMEP));
- cr4_fixed1_update(X86_CR4_SMAP, ebx, bit(X86_FEATURE_SMAP));
- cr4_fixed1_update(X86_CR4_PKE, ecx, bit(X86_FEATURE_PKU));
- cr4_fixed1_update(X86_CR4_UMIP, ecx, bit(X86_FEATURE_UMIP));
- cr4_fixed1_update(X86_CR4_LA57, ecx, bit(X86_FEATURE_LA57));
-
-#undef cr4_fixed1_update
-}
-
-static void nested_vmx_entry_exit_ctls_update(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- if (kvm_mpx_supported()) {
- bool mpx_enabled = guest_cpuid_has(vcpu, X86_FEATURE_MPX);
-
- if (mpx_enabled) {
- vmx->nested.msrs.entry_ctls_high |= VM_ENTRY_LOAD_BNDCFGS;
- vmx->nested.msrs.exit_ctls_high |= VM_EXIT_CLEAR_BNDCFGS;
- } else {
- vmx->nested.msrs.entry_ctls_high &= ~VM_ENTRY_LOAD_BNDCFGS;
- vmx->nested.msrs.exit_ctls_high &= ~VM_EXIT_CLEAR_BNDCFGS;
- }
- }
-}
-
-static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- struct kvm_cpuid_entry2 *best = NULL;
- int i;
-
- for (i = 0; i < PT_CPUID_LEAVES; i++) {
- best = kvm_find_cpuid_entry(vcpu, 0x14, i);
- if (!best)
- return;
- vmx->pt_desc.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM] = best->eax;
- vmx->pt_desc.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM] = best->ebx;
- vmx->pt_desc.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM] = best->ecx;
- vmx->pt_desc.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM] = best->edx;
- }
-
- /* Get the number of configurable Address Ranges for filtering */
- vmx->pt_desc.addr_range = intel_pt_validate_cap(vmx->pt_desc.caps,
- PT_CAP_num_address_ranges);
-
- /* Initialize and clear the no dependency bits */
- vmx->pt_desc.ctl_bitmask = ~(RTIT_CTL_TRACEEN | RTIT_CTL_OS |
- RTIT_CTL_USR | RTIT_CTL_TSC_EN | RTIT_CTL_DISRETC);
-
- /*
- * If CPUID.(EAX=14H,ECX=0):EBX[0]=1 CR3Filter can be set otherwise
- * will inject an #GP
- */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_cr3_filtering))
- vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_CR3EN;
-
- /*
- * If CPUID.(EAX=14H,ECX=0):EBX[1]=1 CYCEn, CycThresh and
- * PSBFreq can be set
- */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_psb_cyc))
- vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_CYCLEACC |
- RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ);
-
- /*
- * If CPUID.(EAX=14H,ECX=0):EBX[3]=1 MTCEn BranchEn and
- * MTCFreq can be set
- */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_mtc))
- vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_MTC_EN |
- RTIT_CTL_BRANCH_EN | RTIT_CTL_MTC_RANGE);
-
- /* If CPUID.(EAX=14H,ECX=0):EBX[4]=1 FUPonPTW and PTWEn can be set */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_ptwrite))
- vmx->pt_desc.ctl_bitmask &= ~(RTIT_CTL_FUP_ON_PTW |
- RTIT_CTL_PTW_EN);
-
- /* If CPUID.(EAX=14H,ECX=0):EBX[5]=1 PwrEvEn can be set */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_power_event_trace))
- vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_PWR_EVT_EN;
-
- /* If CPUID.(EAX=14H,ECX=0):ECX[0]=1 ToPA can be set */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_topa_output))
- vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_TOPA;
-
- /* If CPUID.(EAX=14H,ECX=0):ECX[3]=1 FabircEn can be set */
- if (intel_pt_validate_cap(vmx->pt_desc.caps, PT_CAP_output_subsys))
- vmx->pt_desc.ctl_bitmask &= ~RTIT_CTL_FABRIC_EN;
-
- /* unmask address range configure area */
- for (i = 0; i < vmx->pt_desc.addr_range; i++)
- vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
-}
-
-static void vmx_cpuid_update(struct kvm_vcpu *vcpu)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- /* xsaves_enabled is recomputed in vmx_compute_secondary_exec_control(). */
- vcpu->arch.xsaves_enabled = false;
-
- if (cpu_has_secondary_exec_ctrls()) {
- vmx_compute_secondary_exec_control(vmx);
- vmcs_set_secondary_exec_control(vmx);
- }
-
- if (nested_vmx_allowed(vcpu))
- to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |=
- FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX |
- FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
- else
- to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &=
- ~(FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX |
- FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX);
-
- if (nested_vmx_allowed(vcpu)) {
- nested_vmx_cr_fixed1_bits_update(vcpu);
- nested_vmx_entry_exit_ctls_update(vcpu);
- }
-
- if (boot_cpu_has(X86_FEATURE_INTEL_PT) &&
- guest_cpuid_has(vcpu, X86_FEATURE_INTEL_PT))
- update_intel_pt_cfg(vcpu);
-
- if (boot_cpu_has(X86_FEATURE_RTM)) {
- struct shared_msr_entry *msr;
- msr = find_msr_entry(vmx, MSR_IA32_TSX_CTRL);
- if (msr) {
- bool enabled = guest_cpuid_has(vcpu, X86_FEATURE_RTM);
- vmx_set_guest_msr(vmx, msr, enabled ? 0 : TSX_CTRL_RTM_DISABLE);
- }
- }
-}
-
-static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
-{
- if (func == 1 && nested)
- entry->ecx |= bit(X86_FEATURE_VMX);
-}
-
-static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
-{
- to_vmx(vcpu)->req_immediate_exit = true;
-}
-
-static int vmx_check_intercept(struct kvm_vcpu *vcpu,
- struct x86_instruction_info *info,
- enum x86_intercept_stage stage)
-{
- struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
- struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
-
- /*
- * RDPID causes #UD if disabled through secondary execution controls.
- * Because it is marked as EmulateOnUD, we need to intercept it here.
- */
- if (info->intercept == x86_intercept_rdtscp &&
- !nested_cpu_has2(vmcs12, SECONDARY_EXEC_RDTSCP)) {
- ctxt->exception.vector = UD_VECTOR;
- ctxt->exception.error_code_valid = false;
- return X86EMUL_PROPAGATE_FAULT;
- }
-
- /* TODO: check more intercepts... */
- return X86EMUL_CONTINUE;
-}
-
-#ifdef CONFIG_X86_64
-/* (a << shift) / divisor, return 1 if overflow otherwise 0 */
-static inline int u64_shl_div_u64(u64 a, unsigned int shift,
- u64 divisor, u64 *result)
-{
- u64 low = a << shift, high = a >> (64 - shift);
-
- /* To avoid the overflow on divq */
- if (high >= divisor)
- return 1;
-
- /* Low hold the result, high hold rem which is discarded */
- asm("divq %2\n\t" : "=a" (low), "=d" (high) :
- "rm" (divisor), "0" (low), "1" (high));
- *result = low;
-
- return 0;
-}
-
-static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc,
- bool *expired)
-{
- struct vcpu_vmx *vmx;
- u64 tscl, guest_tscl, delta_tsc, lapic_timer_advance_cycles;
- struct kvm_timer *ktimer = &vcpu->arch.apic->lapic_timer;
-
- if (kvm_mwait_in_guest(vcpu->kvm) ||
- kvm_can_post_timer_interrupt(vcpu))
- return -EOPNOTSUPP;
-
- vmx = to_vmx(vcpu);
- tscl = rdtsc();
- guest_tscl = kvm_read_l1_tsc(vcpu, tscl);
- delta_tsc = max(guest_deadline_tsc, guest_tscl) - guest_tscl;
- lapic_timer_advance_cycles = nsec_to_cycles(vcpu,
- ktimer->timer_advance_ns);
-
- if (delta_tsc > lapic_timer_advance_cycles)
- delta_tsc -= lapic_timer_advance_cycles;
- else
- delta_tsc = 0;
-
- /* Convert to host delta tsc if tsc scaling is enabled */
- if (vcpu->arch.tsc_scaling_ratio != kvm_default_tsc_scaling_ratio &&
- delta_tsc && u64_shl_div_u64(delta_tsc,
- kvm_tsc_scaling_ratio_frac_bits,
- vcpu->arch.tsc_scaling_ratio, &delta_tsc))
- return -ERANGE;
-
- /*
- * If the delta tsc can't fit in the 32 bit after the multi shift,
- * we can't use the preemption timer.
- * It's possible that it fits on later vmentries, but checking
- * on every vmentry is costly so we just use an hrtimer.
- */
- if (delta_tsc >> (cpu_preemption_timer_multi + 32))
- return -ERANGE;
-
- vmx->hv_deadline_tsc = tscl + delta_tsc;
- *expired = !delta_tsc;
- return 0;
-}
-
-static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
-{
- to_vmx(vcpu)->hv_deadline_tsc = -1;
-}
-#endif
-
-static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
-{
- if (!kvm_pause_in_guest(vcpu->kvm))
- shrink_ple_window(vcpu);
-}
-
-static void vmx_slot_enable_log_dirty(struct kvm *kvm,
- struct kvm_memory_slot *slot)
-{
- kvm_mmu_slot_leaf_clear_dirty(kvm, slot);
- kvm_mmu_slot_largepage_remove_write_access(kvm, slot);
-}
-
-static void vmx_slot_disable_log_dirty(struct kvm *kvm,
- struct kvm_memory_slot *slot)
-{
- kvm_mmu_slot_set_dirty(kvm, slot);
-}
-
-static void vmx_flush_log_dirty(struct kvm *kvm)
-{
- kvm_flush_pml_buffers(kvm);
-}
-
-static int vmx_write_pml_buffer(struct kvm_vcpu *vcpu)
-{
- struct vmcs12 *vmcs12;
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- gpa_t gpa, dst;
-
- if (is_guest_mode(vcpu)) {
- WARN_ON_ONCE(vmx->nested.pml_full);
-
- /*
- * Check if PML is enabled for the nested guest.
- * Whether eptp bit 6 is set is already checked
- * as part of A/D emulation.
- */
- vmcs12 = get_vmcs12(vcpu);
- if (!nested_cpu_has_pml(vmcs12))
- return 0;
-
- if (vmcs12->guest_pml_index >= PML_ENTITY_NUM) {
- vmx->nested.pml_full = true;
- return 1;
- }
-
- gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS) & ~0xFFFull;
- dst = vmcs12->pml_address + sizeof(u64) * vmcs12->guest_pml_index;
-
- if (kvm_write_guest_page(vcpu->kvm, gpa_to_gfn(dst), &gpa,
- offset_in_page(dst), sizeof(gpa)))
- return 0;
-
- vmcs12->guest_pml_index--;
- }
-
- return 0;
-}
-
-static void vmx_enable_log_dirty_pt_masked(struct kvm *kvm,
- struct kvm_memory_slot *memslot,
- gfn_t offset, unsigned long mask)
-{
- kvm_mmu_clear_dirty_pt_masked(kvm, memslot, offset, mask);
-}
-
-static void __pi_post_block(struct kvm_vcpu *vcpu)
-{
- struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
- struct pi_desc old, new;
- unsigned int dest;
-
- do {
- old.control = new.control = pi_desc->control;
- WARN(old.nv != POSTED_INTR_WAKEUP_VECTOR,
- "Wakeup handler not enabled while the VCPU is blocked\n");
-
- dest = cpu_physical_id(vcpu->cpu);
-
- if (x2apic_enabled())
- new.ndst = dest;
- else
- new.ndst = (dest << 8) & 0xFF00;
-
- /* set 'NV' to 'notification vector' */
- new.nv = POSTED_INTR_VECTOR;
- } while (cmpxchg64(&pi_desc->control, old.control,
- new.control) != old.control);
-
- if (!WARN_ON_ONCE(vcpu->pre_pcpu == -1)) {
- spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
- list_del(&vcpu->blocked_vcpu_list);
- spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
- vcpu->pre_pcpu = -1;
- }
-}
-
-/*
- * This routine does the following things for vCPU which is going
- * to be blocked if VT-d PI is enabled.
- * - Store the vCPU to the wakeup list, so when interrupts happen
- * we can find the right vCPU to wake up.
- * - Change the Posted-interrupt descriptor as below:
- * 'NDST' <-- vcpu->pre_pcpu
- * 'NV' <-- POSTED_INTR_WAKEUP_VECTOR
- * - If 'ON' is set during this process, which means at least one
- * interrupt is posted for this vCPU, we cannot block it, in
- * this case, return 1, otherwise, return 0.
- *
- */
-static int pi_pre_block(struct kvm_vcpu *vcpu)
-{
- unsigned int dest;
- struct pi_desc old, new;
- struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
-
- if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
- !irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(vcpu))
- return 0;
-
- WARN_ON(irqs_disabled());
- local_irq_disable();
- if (!WARN_ON_ONCE(vcpu->pre_pcpu != -1)) {
- vcpu->pre_pcpu = vcpu->cpu;
- spin_lock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
- list_add_tail(&vcpu->blocked_vcpu_list,
- &per_cpu(blocked_vcpu_on_cpu,
- vcpu->pre_pcpu));
- spin_unlock(&per_cpu(blocked_vcpu_on_cpu_lock, vcpu->pre_pcpu));
- }
-
- do {
- old.control = new.control = pi_desc->control;
-
- WARN((pi_desc->sn == 1),
- "Warning: SN field of posted-interrupts "
- "is set before blocking\n");
-
- /*
- * Since vCPU can be preempted during this process,
- * vcpu->cpu could be different with pre_pcpu, we
- * need to set pre_pcpu as the destination of wakeup
- * notification event, then we can find the right vCPU
- * to wakeup in wakeup handler if interrupts happen
- * when the vCPU is in blocked state.
- */
- dest = cpu_physical_id(vcpu->pre_pcpu);
-
- if (x2apic_enabled())
- new.ndst = dest;
- else
- new.ndst = (dest << 8) & 0xFF00;
-
- /* set 'NV' to 'wakeup vector' */
- new.nv = POSTED_INTR_WAKEUP_VECTOR;
- } while (cmpxchg64(&pi_desc->control, old.control,
- new.control) != old.control);
-
- /* We should not block the vCPU if an interrupt is posted for it. */
- if (pi_test_on(pi_desc) == 1)
- __pi_post_block(vcpu);
-
- local_irq_enable();
- return (vcpu->pre_pcpu == -1);
-}
-
-static int vmx_pre_block(struct kvm_vcpu *vcpu)
-{
- if (pi_pre_block(vcpu))
- return 1;
-
- if (kvm_lapic_hv_timer_in_use(vcpu))
- kvm_lapic_switch_to_sw_timer(vcpu);
-
- return 0;
-}
-
-static void pi_post_block(struct kvm_vcpu *vcpu)
-{
- if (vcpu->pre_pcpu == -1)
- return;
-
- WARN_ON(irqs_disabled());
- local_irq_disable();
- __pi_post_block(vcpu);
- local_irq_enable();
-}
-
-static void vmx_post_block(struct kvm_vcpu *vcpu)
-{
- if (kvm_x86_ops->set_hv_timer)
- kvm_lapic_switch_to_hv_timer(vcpu);
-
- pi_post_block(vcpu);
-}
-
-/*
- * vmx_update_pi_irte - set IRTE for Posted-Interrupts
- *
- * @kvm: kvm
- * @host_irq: host irq of the interrupt
- * @guest_irq: gsi of the interrupt
- * @set: set or unset PI
- * returns 0 on success, < 0 on failure
- */
-static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
- uint32_t guest_irq, bool set)
-{
- struct kvm_kernel_irq_routing_entry *e;
- struct kvm_irq_routing_table *irq_rt;
- struct kvm_lapic_irq irq;
- struct kvm_vcpu *vcpu;
- struct vcpu_data vcpu_info;
- int idx, ret = 0;
-
- if (!kvm_arch_has_assigned_device(kvm) ||
- !irq_remapping_cap(IRQ_POSTING_CAP) ||
- !kvm_vcpu_apicv_active(kvm->vcpus[0]))
- return 0;
-
- idx = srcu_read_lock(&kvm->irq_srcu);
- irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
- if (guest_irq >= irq_rt->nr_rt_entries ||
- hlist_empty(&irq_rt->map[guest_irq])) {
- pr_warn_once("no route for guest_irq %u/%u (broken user space?)\n",
- guest_irq, irq_rt->nr_rt_entries);
- goto out;
- }
-
- hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
- if (e->type != KVM_IRQ_ROUTING_MSI)
- continue;
- /*
- * VT-d PI cannot support posting multicast/broadcast
- * interrupts to a vCPU, we still use interrupt remapping
- * for these kind of interrupts.
- *
- * For lowest-priority interrupts, we only support
- * those with single CPU as the destination, e.g. user
- * configures the interrupts via /proc/irq or uses
- * irqbalance to make the interrupts single-CPU.
- *
- * We will support full lowest-priority interrupt later.
- *
- * In addition, we can only inject generic interrupts using
- * the PI mechanism, refuse to route others through it.
- */
-
- kvm_set_msi_irq(kvm, e, &irq);
- if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu) ||
- !kvm_irq_is_postable(&irq)) {
- /*
- * Make sure the IRTE is in remapped mode if
- * we don't handle it in posted mode.
- */
- ret = irq_set_vcpu_affinity(host_irq, NULL);
- if (ret < 0) {
- printk(KERN_INFO
- "failed to back to remapped mode, irq: %u\n",
- host_irq);
- goto out;
- }
-
- continue;
- }
-
- vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu));
- vcpu_info.vector = irq.vector;
-
- trace_kvm_pi_irte_update(host_irq, vcpu->vcpu_id, e->gsi,
- vcpu_info.vector, vcpu_info.pi_desc_addr, set);
-
- if (set)
- ret = irq_set_vcpu_affinity(host_irq, &vcpu_info);
- else
- ret = irq_set_vcpu_affinity(host_irq, NULL);
-
- if (ret < 0) {
- printk(KERN_INFO "%s: failed to update PI IRTE\n",
- __func__);
- goto out;
- }
- }
-
- ret = 0;
-out:
- srcu_read_unlock(&kvm->irq_srcu, idx);
- return ret;
-}
-
-static void vmx_setup_mce(struct kvm_vcpu *vcpu)
-{
- if (vcpu->arch.mcg_cap & MCG_LMCE_P)
- to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |=
- FEATURE_CONTROL_LMCE;
- else
- to_vmx(vcpu)->msr_ia32_feature_control_valid_bits &=
- ~FEATURE_CONTROL_LMCE;
-}
-
-static int vmx_smi_allowed(struct kvm_vcpu *vcpu)
-{
- /* we need a nested vmexit to enter SMM, postpone if run is pending */
- if (to_vmx(vcpu)->nested.nested_run_pending)
- return 0;
- return 1;
-}
-
-static int vmx_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
-
- vmx->nested.smm.guest_mode = is_guest_mode(vcpu);
- if (vmx->nested.smm.guest_mode)
- nested_vmx_vmexit(vcpu, -1, 0, 0);
-
- vmx->nested.smm.vmxon = vmx->nested.vmxon;
- vmx->nested.vmxon = false;
- vmx_clear_hlt(vcpu);
- return 0;
-}
-
-static int vmx_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
-{
- struct vcpu_vmx *vmx = to_vmx(vcpu);
- int ret;
-
- if (vmx->nested.smm.vmxon) {
- vmx->nested.vmxon = true;
- vmx->nested.smm.vmxon = false;
- }
-
- if (vmx->nested.smm.guest_mode) {
- ret = nested_vmx_enter_non_root_mode(vcpu, false);
- if (ret)
- return ret;
-
- vmx->nested.smm.guest_mode = false;
- }
- return 0;
-}
-
-static int enable_smi_window(struct kvm_vcpu *vcpu)
-{
- return 0;
-}
-
-static bool vmx_need_emulation_on_page_fault(struct kvm_vcpu *vcpu)
-{
- return false;
-}
-
-static bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
-{
- return to_vmx(vcpu)->nested.vmxon;
-}
-
-static __init int hardware_setup(void)
-{
- unsigned long host_bndcfgs;
- struct desc_ptr dt;
- int r, i;
-
- rdmsrl_safe(MSR_EFER, &host_efer);
-
- store_idt(&dt);
- host_idt_base = dt.address;
-
- for (i = 0; i < ARRAY_SIZE(vmx_msr_index); ++i)
- kvm_define_shared_msr(i, vmx_msr_index[i]);
-
- if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
- return -EIO;
-
- if (boot_cpu_has(X86_FEATURE_NX))
- kvm_enable_efer_bits(EFER_NX);
-
- if (boot_cpu_has(X86_FEATURE_MPX)) {
- rdmsrl(MSR_IA32_BNDCFGS, host_bndcfgs);
- WARN_ONCE(host_bndcfgs, "KVM: BNDCFGS in host will be lost");
- }
-
- if (!cpu_has_vmx_vpid() || !cpu_has_vmx_invvpid() ||
- !(cpu_has_vmx_invvpid_single() || cpu_has_vmx_invvpid_global()))
- enable_vpid = 0;
-
- if (!cpu_has_vmx_ept() ||
- !cpu_has_vmx_ept_4levels() ||
- !cpu_has_vmx_ept_mt_wb() ||
- !cpu_has_vmx_invept_global())
- enable_ept = 0;
-
- if (!cpu_has_vmx_ept_ad_bits() || !enable_ept)
- enable_ept_ad_bits = 0;
-
- if (!cpu_has_vmx_unrestricted_guest() || !enable_ept)
- enable_unrestricted_guest = 0;
-
- if (!cpu_has_vmx_flexpriority())
- flexpriority_enabled = 0;
-
- if (!cpu_has_virtual_nmis())
- enable_vnmi = 0;
-
- /*
- * set_apic_access_page_addr() is used to reload apic access
- * page upon invalidation. No need to do anything if not
- * using the APIC_ACCESS_ADDR VMCS field.
- */
- if (!flexpriority_enabled)
- kvm_x86_ops->set_apic_access_page_addr = NULL;
-
- if (!cpu_has_vmx_tpr_shadow())
- kvm_x86_ops->update_cr8_intercept = NULL;
-
- if (enable_ept && !cpu_has_vmx_ept_2m_page())
- kvm_disable_largepages();
-
-#if IS_ENABLED(CONFIG_HYPERV)
- if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH
- && enable_ept) {
- kvm_x86_ops->tlb_remote_flush = hv_remote_flush_tlb;
- kvm_x86_ops->tlb_remote_flush_with_range =
- hv_remote_flush_tlb_with_range;
- }
-#endif
-
- if (!cpu_has_vmx_ple()) {
- ple_gap = 0;
- ple_window = 0;
- ple_window_grow = 0;
- ple_window_max = 0;
- ple_window_shrink = 0;
- }
-
- if (!cpu_has_vmx_apicv()) {
- enable_apicv = 0;
- kvm_x86_ops->sync_pir_to_irr = NULL;
- }
-
- if (cpu_has_vmx_tsc_scaling()) {
- kvm_has_tsc_control = true;
- kvm_max_tsc_scaling_ratio = KVM_VMX_TSC_MULTIPLIER_MAX;
- kvm_tsc_scaling_ratio_frac_bits = 48;
- }
-
- set_bit(0, vmx_vpid_bitmap); /* 0 is reserved for host */
-
- if (enable_ept)
- vmx_enable_tdp();
- else
- kvm_disable_tdp();
-
- /*
- * Only enable PML when hardware supports PML feature, and both EPT
- * and EPT A/D bit features are enabled -- PML depends on them to work.
- */
- if (!enable_ept || !enable_ept_ad_bits || !cpu_has_vmx_pml())
- enable_pml = 0;
-
- if (!enable_pml) {
- kvm_x86_ops->slot_enable_log_dirty = NULL;
- kvm_x86_ops->slot_disable_log_dirty = NULL;
- kvm_x86_ops->flush_log_dirty = NULL;
- kvm_x86_ops->enable_log_dirty_pt_masked = NULL;
- }
-
- if (!cpu_has_vmx_preemption_timer())
- enable_preemption_timer = false;
-
- if (enable_preemption_timer) {
- u64 use_timer_freq = 5000ULL * 1000 * 1000;
- u64 vmx_msr;
-
- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
- cpu_preemption_timer_multi =
- vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
-
- if (tsc_khz)
- use_timer_freq = (u64)tsc_khz * 1000;
- use_timer_freq >>= cpu_preemption_timer_multi;
-
- /*
- * KVM "disables" the preemption timer by setting it to its max
- * value. Don't use the timer if it might cause spurious exits
- * at a rate faster than 0.1 Hz (of uninterrupted guest time).
- */
- if (use_timer_freq > 0xffffffffu / 10)
- enable_preemption_timer = false;
- }
-
- if (!enable_preemption_timer) {
- kvm_x86_ops->set_hv_timer = NULL;
- kvm_x86_ops->cancel_hv_timer = NULL;
- kvm_x86_ops->request_immediate_exit = __kvm_request_immediate_exit;
- }
-
- kvm_set_posted_intr_wakeup_handler(wakeup_handler);
-
- kvm_mce_cap_supported |= MCG_LMCE_P;
-
- if (pt_mode != PT_MODE_SYSTEM && pt_mode != PT_MODE_HOST_GUEST)
- return -EINVAL;
- if (!enable_ept || !cpu_has_vmx_intel_pt())
- pt_mode = PT_MODE_SYSTEM;
-
- if (nested) {
- nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
- vmx_capability.ept, enable_apicv);
-
- r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
- if (r)
- return r;
- }
-
- r = alloc_kvm_area();
- if (r)
- nested_vmx_hardware_unsetup();
- return r;
-}
-
-static __exit void hardware_unsetup(void)
-{
- if (nested)
- nested_vmx_hardware_unsetup();
-
- free_kvm_area();
-}
-
-static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
- .cpu_has_kvm_support = cpu_has_kvm_support,
- .disabled_by_bios = vmx_disabled_by_bios,
- .hardware_setup = hardware_setup,
- .hardware_unsetup = hardware_unsetup,
- .check_processor_compatibility = vmx_check_processor_compat,
- .hardware_enable = hardware_enable,
- .hardware_disable = hardware_disable,
- .cpu_has_accelerated_tpr = report_flexpriority,
- .has_emulated_msr = vmx_has_emulated_msr,
-
- .vm_init = vmx_vm_init,
- .vm_alloc = vmx_vm_alloc,
- .vm_free = vmx_vm_free,
-
- .vcpu_create = vmx_create_vcpu,
- .vcpu_free = vmx_free_vcpu,
- .vcpu_reset = vmx_vcpu_reset,
-
- .prepare_guest_switch = vmx_prepare_switch_to_guest,
- .vcpu_load = vmx_vcpu_load,
- .vcpu_put = vmx_vcpu_put,
-
- .update_bp_intercept = update_exception_bitmap,
- .get_msr_feature = vmx_get_msr_feature,
- .get_msr = vmx_get_msr,
- .set_msr = vmx_set_msr,
- .get_segment_base = vmx_get_segment_base,
- .get_segment = vmx_get_segment,
- .set_segment = vmx_set_segment,
- .get_cpl = vmx_get_cpl,
- .get_cs_db_l_bits = vmx_get_cs_db_l_bits,
- .decache_cr0_guest_bits = vmx_decache_cr0_guest_bits,
- .decache_cr4_guest_bits = vmx_decache_cr4_guest_bits,
- .set_cr0 = vmx_set_cr0,
- .set_cr3 = vmx_set_cr3,
- .set_cr4 = vmx_set_cr4,
- .set_efer = vmx_set_efer,
- .get_idt = vmx_get_idt,
- .set_idt = vmx_set_idt,
- .get_gdt = vmx_get_gdt,
- .set_gdt = vmx_set_gdt,
- .get_dr6 = vmx_get_dr6,
- .set_dr6 = vmx_set_dr6,
- .set_dr7 = vmx_set_dr7,
- .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs,
- .cache_reg = vmx_cache_reg,
- .get_rflags = vmx_get_rflags,
- .set_rflags = vmx_set_rflags,
-
- .tlb_flush = vmx_flush_tlb,
- .tlb_flush_gva = vmx_flush_tlb_gva,
-
- .run = vmx_vcpu_run,
- .handle_exit = vmx_handle_exit,
- .skip_emulated_instruction = skip_emulated_instruction,
- .set_interrupt_shadow = vmx_set_interrupt_shadow,
- .get_interrupt_shadow = vmx_get_interrupt_shadow,
- .patch_hypercall = vmx_patch_hypercall,
- .set_irq = vmx_inject_irq,
- .set_nmi = vmx_inject_nmi,
- .queue_exception = vmx_queue_exception,
- .cancel_injection = vmx_cancel_injection,
- .interrupt_allowed = vmx_interrupt_allowed,
- .nmi_allowed = vmx_nmi_allowed,
- .get_nmi_mask = vmx_get_nmi_mask,
- .set_nmi_mask = vmx_set_nmi_mask,
- .enable_nmi_window = enable_nmi_window,
- .enable_irq_window = enable_irq_window,
- .update_cr8_intercept = update_cr8_intercept,
- .set_virtual_apic_mode = vmx_set_virtual_apic_mode,
- .set_apic_access_page_addr = vmx_set_apic_access_page_addr,
- .get_enable_apicv = vmx_get_enable_apicv,
- .refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
- .load_eoi_exitmap = vmx_load_eoi_exitmap,
- .apicv_post_state_restore = vmx_apicv_post_state_restore,
- .hwapic_irr_update = vmx_hwapic_irr_update,
- .hwapic_isr_update = vmx_hwapic_isr_update,
- .guest_apic_has_interrupt = vmx_guest_apic_has_interrupt,
- .sync_pir_to_irr = vmx_sync_pir_to_irr,
- .deliver_posted_interrupt = vmx_deliver_posted_interrupt,
- .dy_apicv_has_pending_interrupt = vmx_dy_apicv_has_pending_interrupt,
-
- .set_tss_addr = vmx_set_tss_addr,
- .set_identity_map_addr = vmx_set_identity_map_addr,
- .get_tdp_level = get_ept_level,
- .get_mt_mask = vmx_get_mt_mask,
-
- .get_exit_info = vmx_get_exit_info,
-
- .get_lpage_level = vmx_get_lpage_level,
-
- .cpuid_update = vmx_cpuid_update,
-
- .rdtscp_supported = vmx_rdtscp_supported,
- .invpcid_supported = vmx_invpcid_supported,
-
- .set_supported_cpuid = vmx_set_supported_cpuid,
-
- .has_wbinvd_exit = cpu_has_vmx_wbinvd_exit,
-
- .read_l1_tsc_offset = vmx_read_l1_tsc_offset,
- .write_l1_tsc_offset = vmx_write_l1_tsc_offset,
-
- .set_tdp_cr3 = vmx_set_cr3,
-
- .check_intercept = vmx_check_intercept,
- .handle_exit_irqoff = vmx_handle_exit_irqoff,
- .mpx_supported = vmx_mpx_supported,
- .xsaves_supported = vmx_xsaves_supported,
- .umip_emulated = vmx_umip_emulated,
- .pt_supported = vmx_pt_supported,
-
- .request_immediate_exit = vmx_request_immediate_exit,
-
- .sched_in = vmx_sched_in,
-
- .slot_enable_log_dirty = vmx_slot_enable_log_dirty,
- .slot_disable_log_dirty = vmx_slot_disable_log_dirty,
- .flush_log_dirty = vmx_flush_log_dirty,
- .enable_log_dirty_pt_masked = vmx_enable_log_dirty_pt_masked,
- .write_log_dirty = vmx_write_pml_buffer,
-
- .pre_block = vmx_pre_block,
- .post_block = vmx_post_block,
-
- .pmu_ops = &intel_pmu_ops,
-
- .update_pi_irte = vmx_update_pi_irte,
-
-#ifdef CONFIG_X86_64
- .set_hv_timer = vmx_set_hv_timer,
- .cancel_hv_timer = vmx_cancel_hv_timer,
-#endif
-
- .setup_mce = vmx_setup_mce,
-
- .smi_allowed = vmx_smi_allowed,
- .pre_enter_smm = vmx_pre_enter_smm,
- .pre_leave_smm = vmx_pre_leave_smm,
- .enable_smi_window = enable_smi_window,
-
- .check_nested_events = NULL,
- .get_nested_state = NULL,
- .set_nested_state = NULL,
- .get_vmcs12_pages = NULL,
- .nested_enable_evmcs = NULL,
- .nested_get_evmcs_version = NULL,
- .need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
- .apic_init_signal_blocked = vmx_apic_init_signal_blocked,
-};
-
-static void vmx_cleanup_l1d_flush(void)
-{
- if (vmx_l1d_flush_pages) {
- free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
- vmx_l1d_flush_pages = NULL;
- }
- /* Restore state so sysfs ignores VMX */
- l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
-}
-
-static void vmx_exit(void)
-{
-#ifdef CONFIG_KEXEC_CORE
- RCU_INIT_POINTER(crash_vmclear_loaded_vmcss, NULL);
- synchronize_rcu();
-#endif
-
- kvm_exit();
-
-#if IS_ENABLED(CONFIG_HYPERV)
- if (static_branch_unlikely(&enable_evmcs)) {
- int cpu;
- struct hv_vp_assist_page *vp_ap;
- /*
- * Reset everything to support using non-enlightened VMCS
- * access later (e.g. when we reload the module with
- * enlightened_vmcs=0)
- */
- for_each_online_cpu(cpu) {
- vp_ap = hv_get_vp_assist_page(cpu);
-
- if (!vp_ap)
- continue;
-
- vp_ap->nested_control.features.directhypercall = 0;
- vp_ap->current_nested_vmcs = 0;
- vp_ap->enlighten_vmentry = 0;
- }
-
- static_branch_disable(&enable_evmcs);
- }
-#endif
- vmx_cleanup_l1d_flush();
-}
-module_exit(vmx_exit);
-
-static int __init vmx_init(void)
-{
- int r;
-
-#if IS_ENABLED(CONFIG_HYPERV)
- /*
- * Enlightened VMCS usage should be recommended and the host needs
- * to support eVMCS v1 or above. We can also disable eVMCS support
- * with module parameter.
- */
- if (enlightened_vmcs &&
- ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
- (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
- KVM_EVMCS_VERSION) {
- int cpu;
-
- /* Check that we have assist pages on all online CPUs */
- for_each_online_cpu(cpu) {
- if (!hv_get_vp_assist_page(cpu)) {
- enlightened_vmcs = false;
- break;
- }
- }
-
- if (enlightened_vmcs) {
- pr_info("KVM: vmx: using Hyper-V Enlightened VMCS\n");
- static_branch_enable(&enable_evmcs);
- }
-
- if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH)
- vmx_x86_ops.enable_direct_tlbflush
- = hv_enable_direct_tlbflush;
-
- } else {
- enlightened_vmcs = false;
- }
-#endif
-
- r = kvm_init(&vmx_x86_ops, sizeof(struct vcpu_vmx),
- __alignof__(struct vcpu_vmx), THIS_MODULE);
- if (r)
- return r;
-
- /*
- * Must be called after kvm_init() so enable_ept is properly set
- * up. Hand the parameter mitigation value in which was stored in
- * the pre module init parser. If no parameter was given, it will
- * contain 'auto' which will be turned into the default 'cond'
- * mitigation mode.
- */
- r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
- if (r) {
- vmx_exit();
- return r;
- }
-
-#ifdef CONFIG_KEXEC_CORE
- rcu_assign_pointer(crash_vmclear_loaded_vmcss,
- crash_vmclear_local_loaded_vmcss);
-#endif
- vmx_check_vmcs12_offsets();
-
- return 0;
-}
-module_init(vmx_init);
--
2.20.1
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit 55b1cb1f03ad5eea39897d0c74035e02deddcff2 ]
pinmux_func_gpios[] contains a hole due to the missing function GPIO
definition for the "CTX0&CTX1" signal, which is the logical "AND" of the
two CAN outputs.
Fix this by:
- Renaming CRX0_CRX1_MARK to CTX0_CTX1_MARK, as PJ2MD[2:0]=010
configures the combined "CTX0&CTX1" output signal,
- Renaming CRX0X1_MARK to CRX0_CRX1_MARK, as PJ3MD[1:0]=10 configures
the shared "CRX0/CRX1" input signal, which is fed to both CAN
inputs,
- Adding the missing function GPIO definition for "CTX0&CTX1" to
pinmux_func_gpios[],
- Moving all CAN enums next to each other.
See SH7262 Group, SH7264 Group User's Manual: Hardware, Rev. 4.00:
[1] Figure 1.2 (3) (Pin Assignment for the SH7264 Group (1-Mbyte
Version),
[2] Figure 1.2 (4) Pin Assignment for the SH7264 Group (640-Kbyte
Version,
[3] Table 1.4 List of Pins,
[4] Figure 20.29 Connection Example when Using This Module as 1-Channel
Module (64 Mailboxes x 1 Channel),
[5] Table 32.10 Multiplexed Pins (Port J),
[6] Section 32.2.30 (3) Port J Control Register 0 (PJCR0).
Note that the last 2 disagree about PJ2MD[2:0], which is probably the
root cause of this bug. But considering [4], "CTx0&CTx1" in [5] must
be correct, and "CRx0&CRx1" in [6] must be wrong.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pinctrl/sh-pfc/pfc-sh7264.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/pinctrl/sh-pfc/pfc-sh7264.c b/drivers/pinctrl/sh-pfc/pfc-sh7264.c
index e1c34e19222ee..3ddb9565ed804 100644
--- a/drivers/pinctrl/sh-pfc/pfc-sh7264.c
+++ b/drivers/pinctrl/sh-pfc/pfc-sh7264.c
@@ -500,17 +500,15 @@ enum {
SD_WP_MARK, SD_CLK_MARK, SD_CMD_MARK,
CRX0_MARK, CRX1_MARK,
CTX0_MARK, CTX1_MARK,
+ CRX0_CRX1_MARK, CTX0_CTX1_MARK,
PWM1A_MARK, PWM1B_MARK, PWM1C_MARK, PWM1D_MARK,
PWM1E_MARK, PWM1F_MARK, PWM1G_MARK, PWM1H_MARK,
PWM2A_MARK, PWM2B_MARK, PWM2C_MARK, PWM2D_MARK,
PWM2E_MARK, PWM2F_MARK, PWM2G_MARK, PWM2H_MARK,
IERXD_MARK, IETXD_MARK,
- CRX0_CRX1_MARK,
WDTOVF_MARK,
- CRX0X1_MARK,
-
/* DMAC */
TEND0_MARK, DACK0_MARK, DREQ0_MARK,
TEND1_MARK, DACK1_MARK, DREQ1_MARK,
@@ -998,12 +996,12 @@ static const u16 pinmux_data[] = {
PINMUX_DATA(PJ3_DATA, PJ3MD_00),
PINMUX_DATA(CRX1_MARK, PJ3MD_01),
- PINMUX_DATA(CRX0X1_MARK, PJ3MD_10),
+ PINMUX_DATA(CRX0_CRX1_MARK, PJ3MD_10),
PINMUX_DATA(IRQ1_PJ_MARK, PJ3MD_11),
PINMUX_DATA(PJ2_DATA, PJ2MD_000),
PINMUX_DATA(CTX1_MARK, PJ2MD_001),
- PINMUX_DATA(CRX0_CRX1_MARK, PJ2MD_010),
+ PINMUX_DATA(CTX0_CTX1_MARK, PJ2MD_010),
PINMUX_DATA(CS2_MARK, PJ2MD_011),
PINMUX_DATA(SCK0_MARK, PJ2MD_100),
PINMUX_DATA(LCD_M_DISP_MARK, PJ2MD_101),
@@ -1248,6 +1246,7 @@ static const struct pinmux_func pinmux_func_gpios[] = {
GPIO_FN(CTX1),
GPIO_FN(CRX1),
GPIO_FN(CTX0),
+ GPIO_FN(CTX0_CTX1),
GPIO_FN(CRX0),
GPIO_FN(CRX0_CRX1),
--
2.20.1
From: Takashi Sakamoto <[email protected]>
[ Upstream commit d61fe22c2ae42d9fd76c34ef4224064cca4b04b0 ]
A design of ALSA control core allows applications to execute three
operations for TLV feature; read, write and command. Furthermore, it
allows driver developers to process the operations by two ways; allocated
array or callback function. In the former, read operation is just allowed,
thus developers uses the latter when device driver supports variety of
models or the target model is expected to dynamically change information
stored in TLV container.
The core also allows applications to lock any element so that the other
applications can't perform write operation to the element for element
value and TLV information. When the element is locked, write and command
operation for TLV information are prohibited as well as element value.
Any read operation should be allowed in the case.
At present, when an element has callback function for TLV information,
TLV read operation returns EPERM if the element is locked. On the
other hand, the read operation is success when an element has allocated
array for TLV information. In both cases, read operation is success for
element value expectedly.
This commit fixes the bug. This change can be backported to v4.14
kernel or later.
Signed-off-by: Takashi Sakamoto <[email protected]>
Reviewed-by: Jaroslav Kysela <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/core/control.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/core/control.c b/sound/core/control.c
index 649d3217590ed..d1312f14d78fb 100644
--- a/sound/core/control.c
+++ b/sound/core/control.c
@@ -1468,8 +1468,9 @@ static int call_tlv_handler(struct snd_ctl_file *file, int op_flag,
if (kctl->tlv.c == NULL)
return -ENXIO;
- /* When locked, this is unavailable. */
- if (vd->owner != NULL && vd->owner != file)
+ /* Write and command operations are not allowed for locked element. */
+ if (op_flag != SNDRV_CTL_TLV_OP_READ &&
+ vd->owner != NULL && vd->owner != file)
return -EPERM;
return kctl->tlv.c(kctl, op_flag, size, buf);
--
2.20.1
From: Rasmus Villemoes <[email protected]>
[ Upstream commit 800cd6fb76f0ec7711deb72a86c924db1ae42648 ]
There are a number of problems with cpm_muram_alloc() and its
callers. Most callers assign the return value to some variable and
then use IS_ERR_VALUE to check for allocation failure. However, when
that variable is not sizeof(long), this leads to warnings - and it is
indeed broken to do e.g.
u32 foo = cpm_muram_alloc();
if (IS_ERR_VALUE(foo))
on a 64-bit platform, since the condition
foo >= (unsigned long)-ENOMEM
is tautologically false. There are also callers that ignore the
possibility of error, and then there are those that check for error by
comparing the return value to 0...
One could fix that by changing all callers to store the return value
temporarily in an "unsigned long" and test that. However, use of
IS_ERR_VALUE() is error-prone and should be restricted to things which
are inherently long-sized (stuff in pt_regs etc.). Instead, let's aim
for changing to the standard kernel style
int foo = cpm_muram_alloc();
if (foo < 0)
deal_with_it()
some->where = foo;
Changing the return type from unsigned long to s32 (aka signed int)
doesn't change the value that gets stored into any of the callers'
variables except if the caller was storing the result in a u64 _and_
the allocation failed, so in itself this patch should be a no-op.
Another problem with cpm_muram_alloc() is that it can certainly
validly return 0 - and except if some cpm_muram_alloc_fixed() call
interferes, the very first cpm_muram_alloc() call will return just
that. But that shows that both ucc_slow_free() and ucc_fast_free() are
buggy, since they assume that a value of 0 means "that field was never
allocated". We'll later change cpm_muram_free() to accept (and ignore)
a negative offset, so callers can use a sentinel of -1 instead of 0
and just unconditionally call cpm_muram_free().
Reviewed-by: Timur Tabi <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
Signed-off-by: Li Yang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/soc/fsl/qe/qe_common.c | 29 ++++++++++++++++-------------
include/soc/fsl/qe/qe.h | 16 ++++++++--------
2 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/drivers/soc/fsl/qe/qe_common.c b/drivers/soc/fsl/qe/qe_common.c
index 104e68d9b84f2..4f60724b06b7c 100644
--- a/drivers/soc/fsl/qe/qe_common.c
+++ b/drivers/soc/fsl/qe/qe_common.c
@@ -35,7 +35,7 @@ static phys_addr_t muram_pbase;
struct muram_block {
struct list_head head;
- unsigned long start;
+ s32 start;
int size;
};
@@ -113,13 +113,14 @@ out_muram:
* @algo: algorithm for alloc.
* @data: data for genalloc's algorithm.
*
- * This function returns an offset into the muram area.
+ * This function returns a non-negative offset into the muram area, or
+ * a negative errno on failure.
*/
-static unsigned long cpm_muram_alloc_common(unsigned long size,
- genpool_algo_t algo, void *data)
+static s32 cpm_muram_alloc_common(unsigned long size,
+ genpool_algo_t algo, void *data)
{
struct muram_block *entry;
- unsigned long start;
+ s32 start;
if (!muram_pool && cpm_muram_init())
goto out2;
@@ -140,7 +141,7 @@ static unsigned long cpm_muram_alloc_common(unsigned long size,
out1:
gen_pool_free(muram_pool, start, size);
out2:
- return (unsigned long)-ENOMEM;
+ return -ENOMEM;
}
/*
@@ -148,13 +149,14 @@ out2:
* @size: number of bytes to allocate
* @align: requested alignment, in bytes
*
- * This function returns an offset into the muram area.
+ * This function returns a non-negative offset into the muram area, or
+ * a negative errno on failure.
* Use cpm_dpram_addr() to get the virtual address of the area.
* Use cpm_muram_free() to free the allocation.
*/
-unsigned long cpm_muram_alloc(unsigned long size, unsigned long align)
+s32 cpm_muram_alloc(unsigned long size, unsigned long align)
{
- unsigned long start;
+ s32 start;
unsigned long flags;
struct genpool_data_align muram_pool_data;
@@ -171,7 +173,7 @@ EXPORT_SYMBOL(cpm_muram_alloc);
* cpm_muram_free - free a chunk of multi-user ram
* @offset: The beginning of the chunk as returned by cpm_muram_alloc().
*/
-int cpm_muram_free(unsigned long offset)
+int cpm_muram_free(s32 offset)
{
unsigned long flags;
int size;
@@ -197,13 +199,14 @@ EXPORT_SYMBOL(cpm_muram_free);
* cpm_muram_alloc_fixed - reserve a specific region of multi-user ram
* @offset: offset of allocation start address
* @size: number of bytes to allocate
- * This function returns an offset into the muram area
+ * This function returns @offset if the area was available, a negative
+ * errno otherwise.
* Use cpm_dpram_addr() to get the virtual address of the area.
* Use cpm_muram_free() to free the allocation.
*/
-unsigned long cpm_muram_alloc_fixed(unsigned long offset, unsigned long size)
+s32 cpm_muram_alloc_fixed(unsigned long offset, unsigned long size)
{
- unsigned long start;
+ s32 start;
unsigned long flags;
struct genpool_data_fixed muram_pool_data_fixed;
diff --git a/include/soc/fsl/qe/qe.h b/include/soc/fsl/qe/qe.h
index b3d1aff5e8ad5..deb6238416947 100644
--- a/include/soc/fsl/qe/qe.h
+++ b/include/soc/fsl/qe/qe.h
@@ -102,26 +102,26 @@ static inline void qe_reset(void) {}
int cpm_muram_init(void);
#if defined(CONFIG_CPM) || defined(CONFIG_QUICC_ENGINE)
-unsigned long cpm_muram_alloc(unsigned long size, unsigned long align);
-int cpm_muram_free(unsigned long offset);
-unsigned long cpm_muram_alloc_fixed(unsigned long offset, unsigned long size);
+s32 cpm_muram_alloc(unsigned long size, unsigned long align);
+int cpm_muram_free(s32 offset);
+s32 cpm_muram_alloc_fixed(unsigned long offset, unsigned long size);
void __iomem *cpm_muram_addr(unsigned long offset);
unsigned long cpm_muram_offset(void __iomem *addr);
dma_addr_t cpm_muram_dma(void __iomem *addr);
#else
-static inline unsigned long cpm_muram_alloc(unsigned long size,
- unsigned long align)
+static inline s32 cpm_muram_alloc(unsigned long size,
+ unsigned long align)
{
return -ENOSYS;
}
-static inline int cpm_muram_free(unsigned long offset)
+static inline int cpm_muram_free(s32 offset)
{
return -ENOSYS;
}
-static inline unsigned long cpm_muram_alloc_fixed(unsigned long offset,
- unsigned long size)
+static inline s32 cpm_muram_alloc_fixed(unsigned long offset,
+ unsigned long size)
{
return -ENOSYS;
}
--
2.20.1
From: Davide Caratti <[email protected]>
[ Upstream commit 1afa3cc90f8fb745c777884d79eaa1001d6927a6 ]
unlike other classifiers that can be offloaded (i.e. users can set flags
like 'skip_hw' and 'skip_sw'), 'cls_matchall' doesn't validate the size
of netlink attribute 'TCA_MATCHALL_FLAGS' provided by user: add a proper
entry to mall_policy.
Fixes: b87f7936a932 ("net/sched: Add match-all classifier hw offloading.")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/cls_matchall.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -137,6 +137,7 @@ static void *mall_get(struct tcf_proto *
static const struct nla_policy mall_policy[TCA_MATCHALL_MAX + 1] = {
[TCA_MATCHALL_UNSPEC] = { .type = NLA_UNSPEC },
[TCA_MATCHALL_CLASSID] = { .type = NLA_U32 },
+ [TCA_MATCHALL_FLAGS] = { .type = NLA_U32 },
};
static int mall_set_parms(struct net *net, struct tcf_proto *tp,
From: "Toke H?iland-J?rgensen" <[email protected]>
[ Upstream commit ad1e03b2b3d4430baaa109b77bc308dc73050de3 ]
The current generic XDP handler skips execution of XDP programs entirely if
an SKB is marked as cloned. This leads to some surprising behaviour, as
packets can end up being cloned in various ways, which will make an XDP
program not see all the traffic on an interface.
This was discovered by a simple test case where an XDP program that always
returns XDP_DROP is installed on a veth device. When combining this with
the Scapy packet sniffer (which uses an AF_PACKET) socket on the sending
side, SKBs reliably end up in the cloned state, causing them to be passed
through to the receiving interface instead of being dropped. A minimal
reproducer script for this is included below.
This patch fixed the issue by simply triggering the existing linearisation
code for cloned SKBs instead of skipping the XDP program execution. This
behaviour is in line with the behaviour of the native XDP implementation
for the veth driver, which will reallocate and copy the SKB data if the SKB
is marked as shared.
Reproducer Python script (requires BCC and Scapy):
from scapy.all import TCP, IP, Ether, sendp, sniff, AsyncSniffer, Raw, UDP
from bcc import BPF
import time, sys, subprocess, shlex
SKB_MODE = (1 << 1)
DRV_MODE = (1 << 2)
PYTHON=sys.executable
def client():
time.sleep(2)
# Sniffing on the sender causes skb_cloned() to be set
s = AsyncSniffer()
s.start()
for p in range(10):
sendp(Ether(dst="aa:aa:aa:aa:aa:aa", src="cc:cc:cc:cc:cc:cc")/IP()/UDP()/Raw("Test"),
verbose=False)
time.sleep(0.1)
s.stop()
return 0
def server(mode):
prog = BPF(text="int dummy_drop(struct xdp_md *ctx) {return XDP_DROP;}")
func = prog.load_func("dummy_drop", BPF.XDP)
prog.attach_xdp("a_to_b", func, mode)
time.sleep(1)
s = sniff(iface="a_to_b", count=10, timeout=15)
if len(s):
print(f"Got {len(s)} packets - should have gotten 0")
return 1
else:
print("Got no packets - as expected")
return 0
if len(sys.argv) < 2:
print(f"Usage: {sys.argv[0]} <skb|drv>")
sys.exit(1)
if sys.argv[1] == "client":
sys.exit(client())
elif sys.argv[1] == "server":
mode = SKB_MODE if sys.argv[2] == 'skb' else DRV_MODE
sys.exit(server(mode))
else:
try:
mode = sys.argv[1]
if mode not in ('skb', 'drv'):
print(f"Usage: {sys.argv[0]} <skb|drv>")
sys.exit(1)
print(f"Running in {mode} mode")
for cmd in [
'ip netns add netns_a',
'ip netns add netns_b',
'ip -n netns_a link add a_to_b type veth peer name b_to_a netns netns_b',
# Disable ipv6 to make sure there's no address autoconf traffic
'ip netns exec netns_a sysctl -qw net.ipv6.conf.a_to_b.disable_ipv6=1',
'ip netns exec netns_b sysctl -qw net.ipv6.conf.b_to_a.disable_ipv6=1',
'ip -n netns_a link set dev a_to_b address aa:aa:aa:aa:aa:aa',
'ip -n netns_b link set dev b_to_a address cc:cc:cc:cc:cc:cc',
'ip -n netns_a link set dev a_to_b up',
'ip -n netns_b link set dev b_to_a up']:
subprocess.check_call(shlex.split(cmd))
server = subprocess.Popen(shlex.split(f"ip netns exec netns_a {PYTHON} {sys.argv[0]} server {mode}"))
client = subprocess.Popen(shlex.split(f"ip netns exec netns_b {PYTHON} {sys.argv[0]} client"))
client.wait()
server.wait()
sys.exit(server.returncode)
finally:
subprocess.run(shlex.split("ip netns delete netns_a"))
subprocess.run(shlex.split("ip netns delete netns_b"))
Fixes: d445516966dc ("net: xdp: support xdp generic on virtual devices")
Reported-by: Stepan Horacek <[email protected]>
Suggested-by: Paolo Abeni <[email protected]>
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/dev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4306,14 +4306,14 @@ static u32 netif_receive_generic_xdp(str
/* Reinjected packets coming from act_mirred or similar should
* not get XDP generic processing.
*/
- if (skb_cloned(skb) || skb_is_tc_redirected(skb))
+ if (skb_is_tc_redirected(skb))
return XDP_PASS;
/* XDP packets must be linear and must have sufficient headroom
* of XDP_PACKET_HEADROOM bytes. This is the guarantee that also
* native XDP provides, thus we need to do it here as well.
*/
- if (skb_is_nonlinear(skb) ||
+ if (skb_cloned(skb) || skb_is_nonlinear(skb) ||
skb_headroom(skb) < XDP_PACKET_HEADROOM) {
int hroom = XDP_PACKET_HEADROOM - skb_headroom(skb);
int troom = skb->tail + skb->data_len - skb->end;
From: Arvind Sankar <[email protected]>
[ Upstream commit dacc9092336be20b01642afe1a51720b31f60369 ]
When checking whether the reported lfb_size makes sense, the height
* stride result is page-aligned before seeing whether it exceeds the
reported size.
This doesn't work if height * stride is not an exact number of pages.
For example, as reported in the kernel bugzilla below, an 800x600x32 EFI
framebuffer gets skipped because of this.
Move the PAGE_ALIGN to after the check vs size.
Reported-by: Christopher Head <[email protected]>
Tested-by: Christopher Head <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206051
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kernel/sysfb_simplefb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/sysfb_simplefb.c b/arch/x86/kernel/sysfb_simplefb.c
index 85195d447a922..f3215346e47fd 100644
--- a/arch/x86/kernel/sysfb_simplefb.c
+++ b/arch/x86/kernel/sysfb_simplefb.c
@@ -94,11 +94,11 @@ __init int create_simplefb(const struct screen_info *si,
if (si->orig_video_isVGA == VIDEO_TYPE_VLFB)
size <<= 16;
length = mode->height * mode->stride;
- length = PAGE_ALIGN(length);
if (length > size) {
printk(KERN_WARNING "sysfb: VRAM smaller than advertised\n");
return -EINVAL;
}
+ length = PAGE_ALIGN(length);
/* setup IORESOURCE_MEM as framebuffer memory */
memset(&res, 0, sizeof(res));
--
2.20.1
From: Colin Ian King <[email protected]>
[ Upstream commit 2052d032c06761330bca4944bb7858b00960e868 ]
Currently when setup_irq fails the error exit path will leak the
recently allocated timer structure. Originally the code would
throw a panic but a later commit changed the behaviour to return
via the err_iounmap path and hence we now have a memory leak. Fix
this by adding a err_timer_free error path that kfree's timer.
Addresses-Coverity: ("Resource Leak")
Fixes: 524a7f08983d ("clocksource/drivers/bcm2835_timer: Convert init function to return error")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clocksource/bcm2835_timer.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/clocksource/bcm2835_timer.c b/drivers/clocksource/bcm2835_timer.c
index 60da2537bef93..1082dcef17d17 100644
--- a/drivers/clocksource/bcm2835_timer.c
+++ b/drivers/clocksource/bcm2835_timer.c
@@ -134,7 +134,7 @@ static int __init bcm2835_timer_init(struct device_node *node)
ret = setup_irq(irq, &timer->act);
if (ret) {
pr_err("Can't set up timer IRQ\n");
- goto err_iounmap;
+ goto err_timer_free;
}
clockevents_config_and_register(&timer->evt, freq, 0xf, 0xffffffff);
@@ -143,6 +143,9 @@ static int __init bcm2835_timer_init(struct device_node *node)
return 0;
+err_timer_free:
+ kfree(timer);
+
err_iounmap:
iounmap(base);
return ret;
--
2.20.1
This reverts commit 740d876bd9565857a695ce7c05efda4eba5bc585.
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 997926a9121c0..3791ce8d269e0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2968,9 +2968,6 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
static int get_ept_level(struct kvm_vcpu *vcpu)
{
- /* Nested EPT currently only supports 4-level walks. */
- if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
- return 4;
if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
return 5;
return 4;
--
2.20.1
On Fri, Feb 21, 2020 at 08:41:50AM +0100, Greg Kroah-Hartman wrote:
> From: Hanjun Guo <[email protected]>
>
> [ Upstream commit 3c23b83a88d00383e1d498cfa515249aa2fe0238 ]
Please drop this patch from the stable queue, thanks.
Lorenzo
> The IORT specification [0] (Section 3, table 4, page 9) defines the
> 'Number of IDs' as 'The number of IDs in the range minus one'.
>
> However, the IORT ID mapping function iort_id_map() treats the 'Number
> of IDs' field as if it were the full IDs mapping count, with the
> following check in place to detect out of boundary input IDs:
>
> InputID >= Input base + Number of IDs
>
> This check is flawed in that it considers the 'Number of IDs' field as
> the full number of IDs mapping and disregards the 'minus one' from
> the IDs count.
>
> The correct check in iort_id_map() should be implemented as:
>
> InputID > Input base + Number of IDs
>
> this implements the specification correctly but unfortunately it breaks
> existing firmwares that erroneously set the 'Number of IDs' as the full
> IDs mapping count rather than IDs mapping count minus one.
>
> e.g.
>
> PCI hostbridge mapping entry 1:
> Input base: 0x1000
> ID Count: 0x100
> Output base: 0x1000
> Output reference: 0xC4 //ITS reference
>
> PCI hostbridge mapping entry 2:
> Input base: 0x1100
> ID Count: 0x100
> Output base: 0x2000
> Output reference: 0xD4 //ITS reference
>
> Two mapping entries which the second entry's Input base = the first
> entry's Input base + ID count, so for InputID 0x1100 and with the
> correct InputID check in place in iort_id_map() the kernel would map
> the InputID to ITS 0xC4 not 0xD4 as it would be expected.
>
> Therefore, to keep supporting existing flawed firmwares, introduce a
> workaround that instructs the kernel to use the old InputID range check
> logic in iort_id_map(), so that we can support both firmwares written
> with the flawed 'Number of IDs' logic and the correct one as defined in
> the specifications.
>
> [0]: http://infocenter.arm.com/help/topic/com.arm.doc.den0049d/DEN0049D_IO_Remapping_Table.pdf
>
> Reported-by: Pankaj Bansal <[email protected]>
> Link: https://lore.kernel.org/linux-acpi/[email protected]/
> Signed-off-by: Hanjun Guo <[email protected]>
> Signed-off-by: Lorenzo Pieralisi <[email protected]>
> Cc: Pankaj Bansal <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Sudeep Holla <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Robin Murphy <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> drivers/acpi/arm64/iort.c | 57 +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index e11b5da6f828f..7d86468300b78 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -306,6 +306,59 @@ out:
> return status;
> }
>
> +struct iort_workaround_oem_info {
> + char oem_id[ACPI_OEM_ID_SIZE + 1];
> + char oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
> + u32 oem_revision;
> +};
> +
> +static bool apply_id_count_workaround;
> +
> +static struct iort_workaround_oem_info wa_info[] __initdata = {
> + {
> + .oem_id = "HISI ",
> + .oem_table_id = "HIP07 ",
> + .oem_revision = 0,
> + }, {
> + .oem_id = "HISI ",
> + .oem_table_id = "HIP08 ",
> + .oem_revision = 0,
> + }
> +};
> +
> +static void __init
> +iort_check_id_count_workaround(struct acpi_table_header *tbl)
> +{
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(wa_info); i++) {
> + if (!memcmp(wa_info[i].oem_id, tbl->oem_id, ACPI_OEM_ID_SIZE) &&
> + !memcmp(wa_info[i].oem_table_id, tbl->oem_table_id, ACPI_OEM_TABLE_ID_SIZE) &&
> + wa_info[i].oem_revision == tbl->oem_revision) {
> + apply_id_count_workaround = true;
> + pr_warn(FW_BUG "ID count for ID mapping entry is wrong, applying workaround\n");
> + break;
> + }
> + }
> +}
> +
> +static inline u32 iort_get_map_max(struct acpi_iort_id_mapping *map)
> +{
> + u32 map_max = map->input_base + map->id_count;
> +
> + /*
> + * The IORT specification revision D (Section 3, table 4, page 9) says
> + * Number of IDs = The number of IDs in the range minus one, but the
> + * IORT code ignored the "minus one", and some firmware did that too,
> + * so apply a workaround here to keep compatible with both the spec
> + * compliant and non-spec compliant firmwares.
> + */
> + if (apply_id_count_workaround)
> + map_max--;
> +
> + return map_max;
> +}
> +
> static int iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in,
> u32 *rid_out)
> {
> @@ -322,8 +375,7 @@ static int iort_id_map(struct acpi_iort_id_mapping *map, u8 type, u32 rid_in,
> return -ENXIO;
> }
>
> - if (rid_in < map->input_base ||
> - (rid_in >= map->input_base + map->id_count))
> + if (rid_in < map->input_base || rid_in > iort_get_map_max(map))
> return -ENXIO;
>
> *rid_out = map->output_base + (rid_in - map->input_base);
> @@ -1542,5 +1594,6 @@ void __init acpi_iort_init(void)
> return;
> }
>
> + iort_check_id_count_workaround(iort_table);
> iort_init_platform_devices();
> }
> --
> 2.20.1
>
>
>
On 21/02/2020 07:39, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.19:
11 builds: 11 pass, 0 fail
22 boots: 22 pass, 0 fail
32 tests: 32 pass, 0 fail
Linux version: 4.19.106-rc1-g27ac98449017
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
tegra194-p2972-0000, tegra20-ventana,
tegra210-p2371-2180, tegra30-cardhu-a04
Cheers
Jon
--
nvpublic
Hello Greg,
> From: [email protected] <[email protected]> On
> Behalf Of Greg Kroah-Hartman
> Sent: 21 February 2020 07:40
>
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
No issues seen for CIP configs for Linux 4.19.106-rc1 (27ac98449017).
Build/test logs: https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/pipelines/119856527
Pipeline: https://gitlab.com/cip-project/cip-testing/linux-cip-pipelines/-/blob/ba32334b/trees/linux-4.19.y.yml
Kind regards, Chris
Hi!
> Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU
> currently also hardcodes the page walk level for nested EPT to be 4
> levels. The L2 guest is all but guaranteed to soft hang on its first
> instruction when L1 is using EPT, as KVM will construct 4-level page
> tables and then tell hardware to use 5-level page tables.
I don't get it. 7/191 reverts the patch, then 9/191 reverts the
revert. Can we simply drop both 7 and 9, for exactly the same result?
(Patch 8 is a unused file, so it does not change the picture).
Best regards,
Pavel
> +++ b/arch/x86/kvm/vmx.c
> @@ -5302,6 +5302,9 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
>
> static int get_ept_level(struct kvm_vcpu *vcpu)
> {
> + /* Nested EPT currently only supports 4-level walks. */
> + if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
> + return 4;
> if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
> return 5;
> return 4;
> --
> 2.20.1
>
>
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Hi!
> With cross-server COPY we've introduced the possibility that the current
> or saved filehandle might not have fh_dentry/fh_export filled in, but we
> missed a place that assumed it was. I think this could be triggered by
> a compound like:
>
> PUTFH(foreign filehandle)
> GETATTR
> SAVEFH
> COPY
>
> First, check_if_stalefh_allowed sets no_verify on the first (PUTFH) op.
> Then op_func = nfsd4_putfh runs and leaves current_fh->fh_export NULL.
> need_wrongsec_check returns true, since this PUTFH has OP_IS_PUTFH_LIKE
> set and GETATTR does not have OP_HANDLES_WRONGSEC set.
>
> We should probably also consider tightening the checks in
> check_if_stalefh_allowed and double-checking that we don't assume the
> filehandle is verified elsewhere in the compound. But I think this
> fixes the immediate issue.
>
> Reported-by: Dan Carpenter <[email protected]>
> Fixes: 4e48f1cccab3 "NFSD: allow inter server COPY to have... "
AFAICT 4e48f1cccab3 "NFSD: allow inter server COPY to have... " is not
part of 4.19 series, so this should not be needed in 4.19.
Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Fri, Feb 21, 2020 at 11:29:49AM +0100, Pavel Machek wrote:
> Hi!
>
> > Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU
> > currently also hardcodes the page walk level for nested EPT to be 4
> > levels. The L2 guest is all but guaranteed to soft hang on its first
> > instruction when L1 is using EPT, as KVM will construct 4-level page
> > tables and then tell hardware to use 5-level page tables.
>
> I don't get it. 7/191 reverts the patch, then 9/191 reverts the
> revert. Can we simply drop both 7 and 9, for exactly the same result?
>
> (Patch 8 is a unused file, so it does not change the picture).
Patch 07 is reverting this patch from the same unused file,
arch/x86/kvm/vmx/vmx.c[*]. The reason patch 07 looks like a normal diff is
that a prior patch in 4.19.105 created the unused file (which is what's
reverted by patch 08 here).
Patch 09 reintroduces the fix for the correct file, arch/x86/kvm/vmx.c.
[*] In upstream, vmx.c now lives in arch/x86/kvm/vmx/, but in 4.19 and
earlier it lives in arch/x86/kvm/.
>
> Best regards,
> Pavel
>
> > +++ b/arch/x86/kvm/vmx.c
> > @@ -5302,6 +5302,9 @@ static void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
> >
> > static int get_ept_level(struct kvm_vcpu *vcpu)
> > {
> > + /* Nested EPT currently only supports 4-level walks. */
> > + if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
> > + return 4;
> > if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
> > return 5;
> > return 4;
> > --
> > 2.20.1
> >
> >
>
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On 2/21/20 12:39 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on test system. No dmesg regressions.
thanks,
-- Shuah
Hi!
On Fri 2020-02-21 08:41:25, Greg Kroah-Hartman wrote:
> From: Changbin Du <[email protected]>
>
> [ Upstream commit 248ed51048c40d36728e70914e38bffd7821da57 ]
>
> First, printk() is NMI-context safe now since the safe printk() has been
> implemented and it already has an irq_work to make NMI-context safe.
>
> Second, this NMI irq_work actually does not work if a NMI handler causes
> panic by watchdog timeout. It has no chance to run in such case, while
> the safe printk() will flush its per-cpu buffers before panicking.
>
> While at it, repurpose the irq_work callback into a function which
> concentrates the NMI duration checking and makes the code easier to
> follow.
I know there were printk() changes recently, but are they all in 4.19?
Does this actually fix any bug in 4.19?
Best regards,
Pavel
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index 086cf1d1d71d8..0f8b9b900b0e7 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -102,18 +102,22 @@ static int __init nmi_warning_debugfs(void)
> }
> fs_initcall(nmi_warning_debugfs);
>
> -static void nmi_max_handler(struct irq_work *w)
> +static void nmi_check_duration(struct nmiaction *action, u64 duration)
> {
> - struct nmiaction *a = container_of(w, struct nmiaction, irq_work);
> + u64 whole_msecs = READ_ONCE(action->max_duration);
> int remainder_ns, decimal_msecs;
> - u64 whole_msecs = READ_ONCE(a->max_duration);
> +
> + if (duration < nmi_longest_ns || duration < action->max_duration)
> + return;
> +
> + action->max_duration = duration;
>
> remainder_ns = do_div(whole_msecs, (1000 * 1000));
> decimal_msecs = remainder_ns / 1000;
>
> printk_ratelimited(KERN_INFO
> "INFO: NMI handler (%ps) took too long to run: %lld.%03d msecs\n",
> - a->handler, whole_msecs, decimal_msecs);
> + action->handler, whole_msecs, decimal_msecs);
> }
>
> static int nmi_handle(unsigned int type, struct pt_regs *regs)
> @@ -140,11 +144,7 @@ static int nmi_handle(unsigned int type, struct pt_regs *regs)
> delta = sched_clock() - delta;
> trace_nmi_handler(a->handler, (int)delta, thishandled);
>
> - if (delta < nmi_longest_ns || delta < a->max_duration)
> - continue;
> -
> - a->max_duration = delta;
> - irq_work_queue(&a->irq_work);
> + nmi_check_duration(a, delta);
> }
>
> rcu_read_unlock();
> @@ -162,8 +162,6 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
> if (!action->handler)
> return -EINVAL;
>
> - init_irq_work(&action->irq_work, nmi_max_handler);
> -
> raw_spin_lock_irqsave(&desc->lock, flags);
>
> /*
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Hi!
> [ Upstream commit 8fea78029f5e6ed734ae1957bef23cfda1af4354 ]
>
> If CONFIG_SND_ATMEL_SOC_DMA=m, build error:
>
> sound/soc/atmel/atmel_ssc_dai.o: In function `atmel_ssc_set_audio':
> (.text+0x7cd): undefined reference to `atmel_pcm_dma_platform_register'
>
> Function atmel_pcm_dma_platform_register is defined under
> CONFIG SND_ATMEL_SOC_DMA, so select SND_ATMEL_SOC_DMA in
> CONFIG SND_ATMEL_SOC_SSC, same to CONFIG_SND_ATMEL_SOC_PDC.
4.19 code has significant differences from mainline in this area.
4.19:
config SND_ATMEL_SOC_DMA
tristate
select SND_SOC_GENERIC_DMAENGINE_PCM
default m if SND_ATMEL_SOC_SSC_DMA=m && SND_ATMEL_SOC_SSC=m
...
mainline:
config SND_ATMEL_SOC_DMA
tristate
select SND_SOC_GENERIC_DMAENGINE_PCM
...
Extra 'default m if' line in 4.19 should already prevent this bug,
additional select statements are not neccessary.
Best regards,
Pavel
> +++ b/sound/soc/atmel/Kconfig
> @@ -25,6 +25,8 @@ config SND_ATMEL_SOC_DMA
>
> config SND_ATMEL_SOC_SSC_DMA
> tristate
> + select SND_ATMEL_SOC_DMA
> + select SND_ATMEL_SOC_PDC
>
> config SND_ATMEL_SOC_SSC
> tristate
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Fri, Feb 21, 2020 at 08:40:34AM +0100, Greg Kroah-Hartman wrote:
> From: Daniel Jordan <[email protected]>
>
> [ Upstream commit 38228e8848cd7dd86ccb90406af32de0cad24be3 ]
>
> lockdep complains when padata's paths to update cpumasks via CPU hotplug
> and sysfs are both taken:
>
> # echo 0 > /sys/devices/system/cpu/cpu1/online
> # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted
> ------------------------------------------------------
> bash/205 is trying to acquire lock:
> ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120
>
> but task is already holding lock:
> ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120
>
> which lock already depends on the new lock.
I think this patch should be dropped from all stable queues (4.4, 4.9, 4.14,
4.19, 5.4, and 5.5).
The main benefit is to un-break lockdep for testing with future padata changes,
and an actual deadlock seems unlikely.
These stable versions don't fix the ordering in padata_remove_cpu() either
(nothing calls it though).
I tried the other stable padata patch in this cycle ("padata: validate cpumask
without removed CPU during offline"), it passed my tests and should stay in.
thanks,
Daniel
Hi!
> > > Hardcode the EPT page-walk level for L2 to be 4 levels, as KVM's MMU
> > > currently also hardcodes the page walk level for nested EPT to be 4
> > > levels. The L2 guest is all but guaranteed to soft hang on its first
> > > instruction when L1 is using EPT, as KVM will construct 4-level page
> > > tables and then tell hardware to use 5-level page tables.
> >
> > I don't get it. 7/191 reverts the patch, then 9/191 reverts the
> > revert. Can we simply drop both 7 and 9, for exactly the same result?
> >
> > (Patch 8 is a unused file, so it does not change the picture).
>
> Patch 07 is reverting this patch from the same unused file,
> arch/x86/kvm/vmx/vmx.c[*]. The reason patch 07 looks like a normal diff is
> that a prior patch in 4.19.105 created the unused file (which is what's
> reverted by patch 08 here).
>
> Patch 09 reintroduces the fix for the correct file, arch/x86/kvm/vmx.c.
Aha, thanks, I checked content few times but missed difference in
filename. Now it makes sense.
Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
On Fri 2020-02-21 08:42:41, Greg Kroah-Hartman wrote:
> From: Steve French <[email protected]>
>
> [ Upstream commit d6fd41905ec577851734623fb905b1763801f5ef ]
>
> We ran into a confusing problem where an application wasn't checking
> return code on close and so user didn't realize that the application
> ran out of disk space. log a warning message (once) in these
> cases. For example:
>
> [ 8407.391909] Out of space writing to \\oleg-server\small-share
Out of space can happen on any filesystem, and yes, it can be
confusing. But why is cifs so special that we warn here (and not
elsewhere) and why was this marked for stable?
Best regards,
Pavel
> +++ b/fs/cifs/smb2pdu.c
> @@ -3425,6 +3425,9 @@ smb2_writev_callback(struct mid_q_entry *mid)
> wdata->cfile->fid.persistent_fid,
> tcon->tid, tcon->ses->Suid, wdata->offset,
> wdata->bytes, wdata->result);
> + if (wdata->result == -ENOSPC)
> + printk_once(KERN_WARNING "Out of space writing to %s\n",
> + tcon->treeName);
> } else
> trace_smb3_write_done(0 /* no xid */,
> wdata->cfile->fid.persistent_fid,
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On 2/20/20 11:39 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
>
I didn't send final summaries this time around because all release candidates
are broken. I expected to get updated images last night, but that didn't
happen. FTR, the test results for v4.19.105-192-g27ac98449017 are:
Build results:
total: 156 pass: 155 fail: 1
Failed builds:
x86_64:allnoconfig
Qemu test results:
total: 403 pass: 403 fail: 0
with:
Building x86_64:allnoconfig ... failed
--------------
Error log:
arch/x86/kernel/unwind_orc.c: In function 'unwind_init':
arch/x86/kernel/unwind_orc.c:267:56: error: 'orc_sort_cmp' undeclared (first use in this function)
267 | sort(__start_orc_unwind_ip, num_entries, sizeof(int), orc_sort_cmp,
| ^~~~~~~~~~~~
arch/x86/kernel/unwind_orc.c:267:56: note: each undeclared identifier is reported only once for each function it appears in
arch/x86/kernel/unwind_orc.c:268:7: error: 'orc_sort_swap' undeclared (first use in this function)
268 | orc_sort_swap);
| ^~~~~~~~~~~~~
I don't plan to send final summaries for the other release candidates;
all failures have already been reported.
Guenter
On Sat, Feb 22, 2020 at 01:59:31PM +0100, Pavel Machek wrote:
>On Fri 2020-02-21 08:42:41, Greg Kroah-Hartman wrote:
>> From: Steve French <[email protected]>
>>
>> [ Upstream commit d6fd41905ec577851734623fb905b1763801f5ef ]
>>
>> We ran into a confusing problem where an application wasn't checking
>> return code on close and so user didn't realize that the application
>> ran out of disk space. log a warning message (once) in these
>> cases. For example:
>>
>> [ 8407.391909] Out of space writing to \\oleg-server\small-share
>
>Out of space can happen on any filesystem, and yes, it can be
>confusing. But why is cifs so special that we warn here (and not
cifs isn't special, we tend to take this type of patches that address
usability issues. Here's an example of a similar patch for btrfs from
the previous release (3 days ago):
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.5.y&id=eb7a7968c9ee183def1d727d4bb209c701fe402a
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-5.5.y&id=f7447ff1d58a590e4b04479d1209fcee253a96d7
>elsewhere) and why was this marked for stable?
Reading the patch description, it describes a bug that happened because
of lacking kernel feedback.
--
Thanks,
Sasha
On Fri, Feb 21, 2020 at 11:51:04AM +0100, Pavel Machek wrote:
>Hi!
>
>> With cross-server COPY we've introduced the possibility that the current
>> or saved filehandle might not have fh_dentry/fh_export filled in, but we
>> missed a place that assumed it was. I think this could be triggered by
>> a compound like:
>>
>> PUTFH(foreign filehandle)
>> GETATTR
>> SAVEFH
>> COPY
>>
>> First, check_if_stalefh_allowed sets no_verify on the first (PUTFH) op.
>> Then op_func = nfsd4_putfh runs and leaves current_fh->fh_export NULL.
>> need_wrongsec_check returns true, since this PUTFH has OP_IS_PUTFH_LIKE
>> set and GETATTR does not have OP_HANDLES_WRONGSEC set.
>>
>> We should probably also consider tightening the checks in
>> check_if_stalefh_allowed and double-checking that we don't assume the
>> filehandle is verified elsewhere in the compound. But I think this
>> fixes the immediate issue.
>>
>> Reported-by: Dan Carpenter <[email protected]>
>> Fixes: 4e48f1cccab3 "NFSD: allow inter server COPY to have... "
>
>AFAICT 4e48f1cccab3 "NFSD: allow inter server COPY to have... " is not
>part of 4.19 series, so this should not be needed in 4.19.
Not only 4e48f1cccab3 isn't in 4.19, it isn't in any tree! :)
Looks like an error in the patch, I'll drop this commit from everywhere.
--
Thanks,
Sasha
On Fri, Feb 21, 2020 at 08:39:33AM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------
kernel: 4.19.106-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 27ac98449017eb9c569bcc95c65f29ca3948148f
git describe: v4.19.105-192-g27ac98449017
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.105-192-g27ac98449017
No regressions (compared to build v4.19.105)
No fixes (compared to build v4.19.105)
Ran 29020 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64
Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-fs-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* ltp-cap_bounds-64k-page_size-tests
* ltp-cap_bounds-kasan-tests
* ltp-commands-64k-page_size-tests
* ltp-commands-kasan-tests
* ltp-containers-64k-page_size-tests
* ltp-containers-kasan-tests
* ltp-cpuhotplug-64k-page_size-tests
* ltp-cpuhotplug-kasan-tests
* ltp-cve-64k-page_size-tests
* ltp-cve-kasan-tests
* ltp-dio-64k-page_size-tests
* ltp-dio-kasan-tests
* ltp-fcntl-locktests-64k-page_size-tests
* ltp-fcntl-locktests-kasan-tests
* ltp-filecaps-64k-page_size-tests
* ltp-filecaps-kasan-tests
* ltp-fs-64k-page_size-tests
* ltp-fs-kasan-tests
* ltp-fs_bind-64k-page_size-tests
* ltp-fs_bind-kasan-tests
* ltp-fs_perms_simple-64k-page_size-tests
* ltp-fs_perms_simple-kasan-tests
* ltp-fsx-64k-page_size-tests
* ltp-fsx-kasan-tests
* ltp-hugetlb-64k-page_size-tests
* ltp-hugetlb-kasan-tests
* ltp-io-64k-page_size-tests
* ltp-io-kasan-tests
* ltp-ipc-64k-page_size-tests
* ltp-ipc-kasan-tests
* ltp-math-64k-page_size-tests
* ltp-math-kasan-tests
* ltp-mm-64k-page_size-tests
* ltp-mm-kasan-tests
* ltp-nptl-64k-page_size-tests
* ltp-nptl-kasan-tests
* ltp-pty-64k-page_size-tests
* ltp-pty-kasan-tests
* ltp-sched-64k-page_size-tests
* ltp-sched-kasan-tests
* ltp-securebits-64k-page_size-tests
* ltp-securebits-kasan-tests
* ltp-syscalls-64k-page_size-tests
* ltp-syscalls-compat-tests
* ltp-syscalls-kasan-tests
* ssuite
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
--
Linaro LKFT
https://lkft.linaro.org
Hi!
> From: Jia-Ju Bai <[email protected]>
>
> [ Upstream commit 9c1ed62ae0690dfe5d5e31d8f70e70a95cb48e52 ]
>
> The driver may sleep while holding a spinlock.
True, but you can't just fix that by removing the locking.
> +++ b/drivers/usb/gadget/udc/gr_udc.c
> @@ -2180,8 +2180,6 @@ static int gr_probe(struct platform_device *pdev)
> return -ENOMEM;
> }
>
> - spin_lock(&dev->lock);
> -
> /* Inside lock so that no gadget can use this udc until probe is done */
> retval = usb_add_gadget_udc(dev->dev, &dev->gadget);
> if (retval) {
As this comment tries to explain. It is possible that the comment can
just be removed, but it looks like the code needs to be rearranged so
that rest of system does not see partly-initialized device.
Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Hi!
> From: Jia-Ju Bai <[email protected]>
>
> [ Upstream commit e36eaf94be8f7bc4e686246eed3cf92d845e2ef8 ]
>
> The driver may sleep while holding a spinlock.
True.
But you can't fix the bug by simply removing the locking, as now
nothing prevents grgpio_irq_unmap() from running while
grgpio_irq_map() is proceeding.
grgpio_irq_map()
lirq->irq = irq;
(drops the lock)
grgpio_irq_unmap()
(gets the lock)
if (lirq->irq == irq) {
...
(proceeds to work with half-initialized structure)
Best regards,
Pavel
> index 60a1556c570a4..c1be299e5567b 100644
> --- a/drivers/gpio/gpio-grgpio.c
> +++ b/drivers/gpio/gpio-grgpio.c
> @@ -258,17 +258,16 @@ static int grgpio_irq_map(struct irq_domain *d, unsigned int irq,
> lirq->irq = irq;
> uirq = &priv->uirqs[lirq->index];
> if (uirq->refcnt == 0) {
> + spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
> ret = request_irq(uirq->uirq, grgpio_irq_handler, 0,
> dev_name(priv->dev), priv);
> if (ret) {
> dev_err(priv->dev,
> "Could not request underlying irq %d\n",
> uirq->uirq);
> -
> - spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
> -
> return ret;
> }
> + spin_lock_irqsave(&priv->gc.bgpio_lock, flags);
> }
> uirq->refcnt++;
>
> @@ -314,8 +313,11 @@ static void grgpio_irq_unmap(struct irq_domain *d, unsigned int irq)
> if (index >= 0) {
> uirq = &priv->uirqs[lirq->index];
> uirq->refcnt--;
> - if (uirq->refcnt == 0)
> + if (uirq->refcnt == 0) {
> + spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
> free_irq(uirq->uirq, priv);
> + return;
> + }
> }
>
> spin_unlock_irqrestore(&priv->gc.bgpio_lock, flags);
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Fri, Feb 21, 2020 at 08:39:33AM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
-rc2 is out to hopefully resolve the reported problems with -rc1:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc2.gz
thanks,
greg k-h
On Sun, Feb 23, 2020 at 04:41:21PM +0100, Greg Kroah-Hartman wrote:
> On Fri, Feb 21, 2020 at 08:39:33AM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.19.106 release.
> > There are 191 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> > and the diffstat can be found below.
>
> -rc2 is out to hopefully resolve the reported problems with -rc1:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc2.gz
Ok, -rc3 is now out, hopefully now things will work:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc3.gz
On Sat, Feb 22, 2020 at 07:46:57AM -0800, Guenter Roeck wrote:
> On 2/20/20 11:39 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.19.106 release.
> > There are 191 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> > Anything received after that time might be too late.
> >
>
> I didn't send final summaries this time around because all release candidates
> are broken. I expected to get updated images last night, but that didn't
> happen. FTR, the test results for v4.19.105-192-g27ac98449017 are:
Sorry, I was traveling for much longer than expected, hopefully this
should be resolved.
greg k-h
On 2/20/20 11:39 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.106 release.
> There are 191 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 23 Feb 2020 07:19:49 +0000.
> Anything received after that time might be too late.
>
For v4.19.105-185-g119e922a87ef:
Build results:
total: 156 pass: 156 fail: 0
Qemu test results:
total: 403 pass: 403 fail: 0
Guenter
On Sun, 23 Feb 2020 at 23:01, Greg Kroah-Hartman
<[email protected]> wrote:
>
> Ok, -rc3 is now out, hopefully now things will work:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.106-rc3.gz
Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------
kernel: 4.19.106-rc3
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.19.y
git commit: 119e922a87ef462ac31618f71713a252895e3f11
git describe: v4.19.105-185-g119e922a87ef
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.19-oe/build/v4.19.105-185-g119e922a87ef
No regressions (compared to build v4.19.105)
No fixes (compared to build v4.19.105)
Ran 26767 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64
Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* perf
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-fs-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests
* ltp-cap_bounds-64k-page_size-tests
* ltp-cap_bounds-kasan-tests
* ltp-commands-64k-page_size-tests
* ltp-commands-kasan-tests
* ltp-containers-64k-page_size-tests
* ltp-containers-kasan-tests
* ltp-cpuhotplug-64k-page_size-tests
* ltp-cpuhotplug-kasan-tests
* ltp-cve-64k-page_size-tests
* ltp-cve-kasan-tests
* ltp-dio-64k-page_size-tests
* ltp-dio-kasan-tests
* ltp-fcntl-locktests-64k-page_size-tests
* ltp-fcntl-locktests-kasan-tests
* ltp-filecaps-64k-page_size-tests
* ltp-filecaps-kasan-tests
* ltp-fs-64k-page_size-tests
* ltp-fs-kasan-tests
* ltp-fs_bind-64k-page_size-tests
* ltp-fs_bind-kasan-tests
* ltp-fs_perms_simple-64k-page_size-tests
* ltp-fs_perms_simple-kasan-tests
* ltp-fsx-64k-page_size-tests
* ltp-fsx-kasan-tests
* ltp-hugetlb-64k-page_size-tests
* ltp-hugetlb-kasan-tests
* ltp-io-64k-page_size-tests
* ltp-io-kasan-tests
* ltp-ipc-64k-page_size-tests
* ltp-ipc-kasan-tests
* ltp-math-64k-page_size-tests
* ltp-math-kasan-tests
* ltp-mm-64k-page_size-tests
* ltp-mm-kasan-tests
* ltp-nptl-64k-page_size-tests
* ltp-nptl-kasan-tests
* ltp-pty-64k-page_size-tests
* ltp-pty-kasan-tests
* ltp-sched-64k-page_size-tests
* ltp-sched-kasan-tests
* ltp-securebits-64k-page_size-tests
* ltp-securebits-kasan-tests
* ltp-syscalls-64k-page_size-tests
* ltp-syscalls-compat-tests
* ltp-syscalls-kasan-tests
* ssuite
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none
--
Linaro LKFT
https://lkft.linaro.org
Hi!
> From: Aditya Pakki <[email protected]>
>
> [ Upstream commit c705f9fc6a1736dcf6ec01f8206707c108dca824 ]
>
> In ezusb_init, if upriv is NULL, the code crashes. However, the caller
> in ezusb_probe can handle the error and print the failure message.
> The patch replaces the BUG_ON call to error return.
The caller already checked that upriv is not NULL, AFAICT.
priv = alloc_orinocodev(sizeof(*upriv), &udev->dev,
ezusb_hard_reset, NULL);
if (!priv) {
err("Couldn't allocate orinocodev");
retval = -ENOMEM;
goto exit;
}
I don't see this as an improvement.
Best regards,
Pavel
> +++ b/drivers/net/wireless/intersil/orinoco/orinoco_usb.c
> @@ -1364,7 +1364,8 @@ static int ezusb_init(struct hermes *hw)
> int retval;
>
> BUG_ON(in_interrupt());
> - BUG_ON(!upriv);
> + if (!upriv)
> + return -EINVAL;
>
> upriv->reply_count = 0;
> /* Write the MAGIC number on the simulated registers to keep
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Hi!
> From: Alex Deucher <[email protected]>
>
> [ Upstream commit 4d0a72b66065dd7e274bad6aa450196d42fd8f84 ]
>
> Only send non-0 clocks to DC for validation. This mirrors
> what the windows driver does.
> diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> index 1546bc49004f8..3fa6e8123b8eb 100644
> --- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> +++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu10_hwmgr.c
> @@ -994,12 +994,15 @@ static int smu10_get_clock_by_type_with_latency(struct pp_hwmgr *hwmgr,
>
> clocks->num_levels = 0;
> for (i = 0; i < pclk_vol_table->count; i++) {
> - clocks->data[i].clocks_in_khz = pclk_vol_table->entries[i].clk * 10;
> - clocks->data[i].latency_in_us = latency_required ?
> - smu10_get_mem_latency(hwmgr,
> - pclk_vol_table->entries[i].clk) :
> - 0;
> - clocks->num_levels++;
> + if (pclk_vol_table->entries[i].clk) {
> + clocks->data[clocks->num_levels].clocks_in_khz =
> + pclk_vol_table->entries[i].clk * 10;
> + clocks->data[clocks->num_levels].latency_in_us = latency_required ?
> + smu10_get_mem_latency(hwmgr,
> + pclk_vol_table->entries[i].clk) :
> + 0;
> + clocks->num_levels++;
> + }
> }
This results in very ugly wrapping. Better way would be to do
"if !(pclk_vol_table->entries[i].clk) continue;"...
"drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage" could be
improved in similar way.
Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany