2020-03-03 17:49:03

by Greg KH

Subject: [PATCH 5.5 000/176] 5.5.8-stable review

This is the start of the stable review cycle for the 5.5.8 release.
There are 176 patches in this series, all of which will be posted as
responses to this one. If anyone has any issues with these being
applied, please let me know.

Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 5.5.8-rc1

Jim Mattson <[email protected]>
kvm: nVMX: VMWRITE checks unsupported field before read-only field

Jim Mattson <[email protected]>
kvm: nVMX: VMWRITE checks VMCS-link pointer before VMCS field

David Rientjes <[email protected]>
mm, thp: fix defrag setting if newline is not used

Wei Yang <[email protected]>
mm/huge_memory.c: use head to check huge zero page

John Hubbard <[email protected]>
mm/gup: allow FOLL_FORCE for get_user_pages_fast()

Vlastimil Babka <[email protected]>
mm/debug.c: always print flags in dump_page()

Waiman Long <[email protected]>
locking/lockdep: Fix lockdep_stats indentation problem

Daniel Jordan <[email protected]>
padata: always acquire cpu_hotplug_lock before pinst->lock

Christoph Hellwig <[email protected]>
xfs: clear kernel only flags in XFS_IOC_ATTRMULTI_BY_HANDLE

Bjorn Andersson <[email protected]>
clk: qcom: rpmh: Sort OF match table

Sameer Pujar <[email protected]>
bus: tegra-aconnect: Remove PM_CLK dependency

Matteo Croce <[email protected]>
netfilter: nf_flowtable: fix documentation

Xin Long <[email protected]>
netfilter: nft_tunnel: no need to call htons() when dumping ports

Florian Fainelli <[email protected]>
thermal: brcmstb_thermal: Do not use DT coefficients

Linus Walleij <[email protected]>
thermal: db8500: Demote debug print

Geert Uytterhoeven <[email protected]>
ubifs: Fix ino_t format warnings in orphan_delete()

Neeraj Upadhyay <[email protected]>
rcu: Allow only one expedited GP to run concurrently with wakeups

Sean Christopherson <[email protected]>
KVM: x86: Remove spurious clearing of async #PF MSR

Sean Christopherson <[email protected]>
KVM: x86: Remove spurious kvm_mmu_unload() from vcpu destruction path

Peter Xu <[email protected]>
KVM: X86: Fix kvm_bitmap_or_dest_vcpus() to use irq shorthand

Xiaochen Shen <[email protected]>
x86/resctrl: Check monitoring static key in the MBM overflow handler

Cengiz Can <[email protected]>
perf maps: Add missing unlock to maps__insert() error case

Jiri Olsa <[email protected]>
perf ui gtk: Add missing zalloc object

Arnaldo Carvalho de Melo <[email protected]>
perf hists browser: Restore ESC as "Zoom out" of DSO/thread/etc

Uwe Kleine-König <[email protected]>
pwm: omap-dmtimer: put_device() after of_find_device_by_node()

Thomas Gleixner <[email protected]>
lib/vdso: Update coarse timekeeper unconditionally

Thomas Gleixner <[email protected]>
lib/vdso: Make __arch_update_vdso_data() logic understandable

Masami Hiramatsu <[email protected]>
kprobes: Set unoptimized flag after unoptimizing code

Janne Karhunen <[email protected]>
ima: ima/lsm policy rule loading logic bug fixes

Christophe JAILLET <[email protected]>
drivers: net: xgene: Fix the order of the arguments of 'alloc_etherdev_mqs()'

Lijun Ou <[email protected]>
RDMA/hns: Bugfix for posting a wqe with sge

Yixian Liu <[email protected]>
RDMA/hns: Simplify the calculation and usage of wqe idx for post verbs

Chao Yu <[email protected]>
f2fs: fix to add swap extent correctly

Cheng Jian <[email protected]>
sched/fair: Optimize select_idle_cpu

Sean Christopherson <[email protected]>
KVM: Check for a bad hva before dropping into the ghc slow path

Tom Lendacky <[email protected]>
KVM: SVM: Override default MMIO mask if memory encryption is enabled

Jin Yao <[email protected]>
perf report: Fix no libunwind compiled warning break s390 issue

Brian Norris <[email protected]>
mwifiex: delete unused mwifiex_get_intf_num()

Brian Norris <[email protected]>
mwifiex: drop most magic numbers from mwifiex_process_tdls_action_frame()

Aleksa Sarai <[email protected]>
namei: only return -ECHILD from follow_dotdot_rcu()

Tuong Lien <[email protected]>
tipc: fix successful connect() but timed out

Arthur Kiyanovski <[email protected]>
net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE

Ursula Braun <[email protected]>
net/smc: no peer ID in CLC decline for SMCD

Michael Ellerman <[email protected]>
selftests: Install settings files to fix TIMEOUT failures

Dmitry Bogdanov <[email protected]>
net: atlantic: fix out of range usage of active_vlans array

Pavel Belous <[email protected]>
net: atlantic: possible fault in transition to hibernation

Pavel Belous <[email protected]>
net: atlantic: fix potential error handling

Pavel Belous <[email protected]>
net: atlantic: fix use after free kasan warn

Nikita Danilov <[email protected]>
net: atlantic: better loopback mode handling

Dmitry Bezrukov <[email protected]>
net: atlantic: checksum compat issue

Nikolay Aleksandrov <[email protected]>
net: netlink: cap max groups which will be considered in netlink_bind()

Julian Wiedmann <[email protected]>
s390/qeth: fix off-by-one in RX copybreak check

Alexandra Winter <[email protected]>
s390/qeth: vnicc Fix EOPNOTSUPP precedence

Bijan Mottahedeh <[email protected]>
nvme-pci: Hold cq_poll_lock while completing CQEs

Peter Chen <[email protected]>
usb: charger: assign specific number for enum value

Haiyang Zhang <[email protected]>
hv_netvsc: Fix unwanted wakeup in netvsc_attach()

Masahiro Yamada <[email protected]>
kbuild: fix DT binding schema rule to detect command line changes

Andrei Otcheretianski <[email protected]>
mac80211: Remove a redundant mutex unlock

Johannes Berg <[email protected]>
nl80211: fix potential leak in AP start

Tina Zhang <[email protected]>
drm/i915/gvt: Separate display reset from ALL_ENGINES reset

Chris Wilson <[email protected]>
drm/i915: Avoid recursing onto active vma from the shrinker

Tina Zhang <[email protected]>
drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime

Mark Tomlinson <[email protected]>
MIPS: cavium_octeon: Fix syncw generation.

Wolfram Sang <[email protected]>
i2c: jz4780: silence log flood on txabrt

Gustavo A. R. Silva <[email protected]>
i2c: altera: Fix potential integer overflow

Oliver Upton <[email protected]>
KVM: nVMX: Emulate MTF when performing instruction emulation

Christophe JAILLET <[email protected]>
MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'

Anup Patel <[email protected]>
RISC-V: Don't enable all interrupts in trap_init()

[email protected] <[email protected]>
HID: hiddev: Fix race in hiddev_disconnect()

Christophe JAILLET <[email protected]>
HID: alps: Fix an error handling path in 'alps_input_configured()'

Cong Wang <[email protected]>
netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()

Jozsef Kadlecsik <[email protected]>
netfilter: ipset: Fix forceadd evaluation path

Eugenio Pérez <[email protected]>
vhost: Check socket sk_family instead of calling getname

Ursula Braun <[email protected]>
net/smc: transfer fasync_list in case of fallback

Jozsef Kadlecsik <[email protected]>
netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports

Jens Axboe <[email protected]>
io_uring: fix 32-bit compatibility with sendmsg/recvmsg

Rafael J. Wysocki <[email protected]>
cpufreq: Fix policy initialization for internal governor drivers

Shirish S <[email protected]>
amdgpu/gmc_v9: save/restore sdpif regs during S3

Orson Zhai <[email protected]>
Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"

Steven Rostedt (VMware) <[email protected]>
tracing: Disable trace_printk() on postponed tests

Jan Kara <[email protected]>
blktrace: Protect q->blk_trace with RCU

Wolfram Sang <[email protected]>
macintosh: therm_windtunnel: fix regression when instantiating devices

Daniel Vetter <[email protected]>
drm/radeon: Inline drm_get_pci_dev

Daniel Vetter <[email protected]>
drm/amdgpu: Drop DRIVER_USE_AGP

Johan Korsnes <[email protected]>
HID: core: increase HID report buffer size to 8KiB

Johan Korsnes <[email protected]>
HID: core: fix off-by-one memset in hid_report_raw_event()

Hans de Goede <[email protected]>
HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock

Oliver Upton <[email protected]>
KVM: VMX: check descriptor table exits on instruction emulation

Mika Westerberg <[email protected]>
ACPI: watchdog: Fix gas->access_width usage

Mika Westerberg <[email protected]>
ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro

Paul Moore <[email protected]>
audit: always check the netlink payload length in audit_receive_msg()

Paul Moore <[email protected]>
audit: fix error handling in audit_data_to_entry()

Dan Carpenter <[email protected]>
ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()

Kees Cook <[email protected]>
docs: Fix empty parallelism argument

Benjamin Block <[email protected]>
scsi: zfcp: fix wrong data and display format of SFP+ temperature

Damien Le Moal <[email protected]>
scsi: sd_zbc: Fix sd_zbc_report_zones()

Keith Busch <[email protected]>
nvme/pci: move cqe check after device shutdown

Nigel Kirkland <[email protected]>
nvme: prevent warning triggered by nvme_stop_keep_alive

Anton Eidelman <[email protected]>
nvme/tcp: fix bug on double requeue when send fails

Guangbin Huang <[email protected]>
net: hns3: fix a copying IPv6 address error in hclge_fd_get_flow_tuples()

Yonglong Liu <[email protected]>
net: hns3: fix VF bandwidth does not take effect in some cases

Yufeng Mo <[email protected]>
net: hns3: add management table after IMP reset

Shay Bar <[email protected]>
mac80211: fix wrong 160/80+80 MHz setting

Sergey Matyukevich <[email protected]>
cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE

Coly Li <[email protected]>
bcache: ignore pending signals when creating gc and allocator thread

Frank Sorenson <[email protected]>
cifs: Fix mode output in debugging statements

Jens Axboe <[email protected]>
io-wq: don't call kXalloc_node() with non-online node

Ben Shelton <[email protected]>
ice: Use correct netif error function

Anirudh Venkataramanan <[email protected]>
ice: Use ice_pf_to_dev

Bruce Allan <[email protected]>
ice: update Unit Load Status bitmask to check after reset

Bruce Allan <[email protected]>
ice: fix and consolidate logging of NVM/firmware version information

Brett Creeley <[email protected]>
ice: Don't allow same value for Rx tail to be written twice

Dave Ertman <[email protected]>
ice: Fix switch between FW and SW LLDP

Arthur Kiyanovski <[email protected]>
net: ena: ena-com.c: prevent NULL pointer dereference

Sameeh Jubran <[email protected]>
net: ena: ethtool: use correct value for crc32 hash

Arthur Kiyanovski <[email protected]>
net: ena: fix corruption of dev_idx_to_host_tbl

Arthur Kiyanovski <[email protected]>
net: ena: fix incorrectly saving queue numbers when setting RSS indirection table

Arthur Kiyanovski <[email protected]>
net: ena: rss: store hash function as values and not bits

Sameeh Jubran <[email protected]>
net: ena: rss: fix failure to get indirection table

Sameeh Jubran <[email protected]>
net: ena: rss: do not allocate key when not supported

Arthur Kiyanovski <[email protected]>
net: ena: fix incorrect default RSS key

Arthur Kiyanovski <[email protected]>
net: ena: add missing ethtool TX timestamping indication

Arthur Kiyanovski <[email protected]>
net: ena: fix uses of round_jiffies()

Arthur Kiyanovski <[email protected]>
net: ena: fix potential crash when rxfh key is NULL

Brett Creeley <[email protected]>
i40e: Fix the conditional for i40e_vc_validate_vqs_bitmaps

Thierry Reding <[email protected]>
soc/tegra: fuse: Fix build with Tegra194 configuration

Daniel Kolesa <[email protected]>
amdgpu: Prevent build errors regarding soft/hard-float FP ABI tags

Isabel Zhang <[email protected]>
drm/amd/display: Add initializations for PLL2 clock source

Yongqiang Sun <[email protected]>
drm/amd/display: Limit minimum DPPCLK to 100MHz.

Aric Cyr <[email protected]>
drm/amd/display: Check engine is not NULL before acquiring

Krishnamraju Eraparaju <[email protected]>
RDMA/siw: Remove unwanted WARN_ON in siw_cm_llp_data_ready()

Sung Lee <[email protected]>
drm/amd/display: Do not set optimized_require to false after plane disable

Kuninori Morimoto <[email protected]>
ARM: dts: sti: fixup sound frame-inversion for stihxxx-b2120.dtsi

Xiubo Li <[email protected]>
ceph: do not execute direct write in parallel if O_APPEND is specified

Kan Liang <[email protected]>
perf/x86/msr: Add Tremont support

Kan Liang <[email protected]>
perf/x86/cstate: Add Tremont support

Kan Liang <[email protected]>
perf/x86/intel: Add Elkhart Lake support

Peter Zijlstra <[email protected]>
arm/ftrace: Fix BE text poking

John Garry <[email protected]>
perf/smmuv3: Use platform_get_irq_optional() for wired interrupt

Trond Myklebust <[email protected]>
NFSv4: Fix races between open and dentry revalidation

Bjørn Mork <[email protected]>
qmi_wwan: unconditionally reject 2 ep interfaces

Bjørn Mork <[email protected]>
qmi_wwan: re-add DW5821e pre-production variant

Harald Freudenberger <[email protected]>
s390/zcrypt: fix card and queue total counter wrap

Stefano Garzarella <[email protected]>
io_uring: flush overflowed CQ events in the io_uring_poll()

Sergey Matyukevich <[email protected]>
cfg80211: check wiphy driver existence for drvinfo report

Johannes Berg <[email protected]>
mac80211: consider more elements in parsing CRC

Jeff Moyer <[email protected]>
dax: pass NOWAIT flag to iomap_apply

Vincent Guittot <[email protected]>
sched/fair: Prevent unlimited runtime on throttled group

Peter Zijlstra (Intel) <[email protected]>
timers/nohz: Update NOHZ load in remote tick

Scott Wood <[email protected]>
sched/core: Don't skip remote tick for idle CPUs

Sean Paul <[email protected]>
drm/msm: Set dma maximum segment size for mdss

Corey Minyard <[email protected]>
ipmi:ssif: Handle a possible NULL pointer reference

Eric Dumazet <[email protected]>
net: rtnetlink: fix bugs in rtnl_alt_ifname()

Alexandre Belloni <[email protected]>
net: macb: Properly handle phylink on at91rm9200

Eric Dumazet <[email protected]>
net: add strict checks in netdev_name_node_alt_destroy()

Shannon Nelson <[email protected]>
ionic: fix fw_status read

Benjamin Poirier <[email protected]>
ipv6: Fix nlmsg_flags when splitting a multipath route

Benjamin Poirier <[email protected]>
ipv6: Fix route replacement with dev-only route

Taehee Yoo <[email protected]>
bonding: fix lockdep warning in bond_get_stats()

Taehee Yoo <[email protected]>
net: export netdev_next_lower_dev_rcu()

Taehee Yoo <[email protected]>
bonding: add missing netdev_update_lockdep_key()

Vasundhara Volam <[email protected]>
bnxt_en: Issue PCIe FLR in kdump kernel to cleanup pending DMAs.

Vasundhara Volam <[email protected]>
bnxt_en: Improve device shutdown method.

Xin Long <[email protected]>
sctp: move the format error check out of __sctp_sf_do_9_1_abort

Willem de Bruijn <[email protected]>
udp: rehash on disconnect

Paolo Abeni <[email protected]>
Revert "net: dev: introduce support for sch BYPASS for lockless qdisc"

Michal Kalderon <[email protected]>
qede: Fix race between rdma destroy workqueue and link change event

Dmitry Osipenko <[email protected]>
nfc: pn544: Fix occasional HW initialization failure

Rohit Maheshwari <[email protected]>
net/tls: Fix to avoid getting invalid tls record

Jason Baron <[email protected]>
net: sched: correct flower port blocking

Arun Parameswaran <[email protected]>
net: phy: restore mdio regs in the iproc mdio driver

Horatiu Vultur <[email protected]>
net: mscc: fix in frame extraction

Alexandre Belloni <[email protected]>
net: macb: ensure interface is not suspended on at91rm9200

Jethro Beekman <[email protected]>
net: fib_rules: Correctly set table field when table number exceeds 8 bits

Florian Fainelli <[email protected]>
net: dsa: b53: Ensure the default VID is untagged

Aristeu Rozanski <[email protected]>
EDAC: skx_common: downgrade message importance on missing PCI device


-------------

Diffstat:

Documentation/networking/nf_flowtable.txt | 2 +-
Documentation/sphinx/parallel-wrapper.sh | 2 +-
Makefile | 4 +-
arch/arm/boot/dts/stihxxx-b2120.dtsi | 2 +-
arch/arm/include/asm/vdso/vsyscall.h | 4 +-
arch/arm/kernel/ftrace.c | 7 +-
arch/mips/include/asm/sync.h | 4 +-
arch/mips/kernel/vpe.c | 2 +-
arch/riscv/kernel/traps.c | 4 +-
arch/x86/events/intel/core.c | 1 +
arch/x86/events/intel/cstate.c | 22 +-
arch/x86/events/msr.c | 3 +-
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/monitor.c | 4 +-
arch/x86/kvm/lapic.c | 2 +-
arch/x86/kvm/svm.c | 44 ++
arch/x86/kvm/vmx/nested.c | 105 ++--
arch/x86/kvm/vmx/nested.h | 5 +
arch/x86/kvm/vmx/vmx.c | 52 +-
arch/x86/kvm/vmx/vmx.h | 3 +
arch/x86/kvm/x86.c | 8 +-
drivers/acpi/acpi_watchdog.c | 3 +-
drivers/bus/Kconfig | 1 -
drivers/char/ipmi/ipmi_ssif.c | 10 +-
drivers/clk/qcom/clk-rpmh.c | 2 +-
drivers/cpufreq/cpufreq.c | 12 +-
drivers/devfreq/devfreq.c | 4 +-
drivers/edac/skx_common.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1 +
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 37 +-
drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 +
.../drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c | 6 +
drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 2 +-
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 1 -
.../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 6 +
.../drm/amd/include/asic_reg/dce/dce_12_0_offset.h | 2 +
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 4 +-
drivers/gpu/drm/i915/gvt/dmabuf.c | 2 +-
drivers/gpu/drm/i915/gvt/vgpu.c | 2 +-
drivers/gpu/drm/msm/msm_drv.c | 8 +
drivers/gpu/drm/radeon/radeon_drv.c | 43 +-
drivers/gpu/drm/radeon/radeon_kms.c | 6 +
drivers/hid/hid-alps.c | 2 +-
drivers/hid/hid-core.c | 4 +-
drivers/hid/hid-ite.c | 5 +-
drivers/hid/usbhid/hiddev.c | 2 +-
drivers/i2c/busses/i2c-altera.c | 2 +-
drivers/i2c/busses/i2c-jz4780.c | 36 +-
drivers/infiniband/hw/hns/hns_roce_device.h | 3 +-
drivers/infiniband/hw/hns/hns_roce_hw_v1.c | 37 +-
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 80 +--
drivers/infiniband/sw/siw/siw_cm.c | 5 +-
drivers/macintosh/therm_windtunnel.c | 52 +-
drivers/md/bcache/alloc.c | 18 +-
drivers/md/bcache/btree.c | 13 +
drivers/net/bonding/bond_main.c | 55 +-
drivers/net/bonding/bond_options.c | 2 +
drivers/net/dsa/b53/b53_common.c | 3 +
drivers/net/ethernet/amazon/ena/ena_com.c | 96 ++--
drivers/net/ethernet/amazon/ena/ena_com.h | 9 +
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 46 +-
drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +-
drivers/net/ethernet/amazon/ena/ena_netdev.h | 2 +
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 2 +-
.../net/ethernet/aquantia/atlantic/aq_ethtool.c | 5 +
.../net/ethernet/aquantia/atlantic/aq_filters.c | 2 +-
drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 8 +-
.../net/ethernet/aquantia/atlantic/aq_pci_func.c | 13 +-
drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 10 +-
drivers/net/ethernet/aquantia/atlantic/aq_ring.h | 3 +-
.../ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c | 18 +-
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 12 +-
drivers/net/ethernet/cadence/macb.h | 1 +
drivers/net/ethernet/cadence/macb_main.c | 66 ++-
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 22 +-
.../net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 2 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 +-
drivers/net/ethernet/intel/ice/ice_base.c | 12 +-
drivers/net/ethernet/intel/ice/ice_common.c | 17 +-
drivers/net/ethernet/intel/ice/ice_dcb_nl.c | 12 +-
drivers/net/ethernet/intel/ice/ice_ethtool.c | 17 +-
drivers/net/ethernet/intel/ice/ice_hw_autogen.h | 6 +
drivers/net/ethernet/intel/ice/ice_lib.c | 33 +-
drivers/net/ethernet/intel/ice/ice_lib.h | 2 -
drivers/net/ethernet/intel/ice/ice_main.c | 23 +-
drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +-
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 8 +-
drivers/net/ethernet/mscc/ocelot_board.c | 8 +
drivers/net/ethernet/pensando/ionic/ionic_dev.c | 11 +-
drivers/net/ethernet/pensando/ionic/ionic_if.h | 1 +
drivers/net/ethernet/qlogic/qede/qede.h | 2 +
drivers/net/ethernet/qlogic/qede/qede_rdma.c | 29 +-
drivers/net/hyperv/netvsc.c | 2 +-
drivers/net/hyperv/netvsc_drv.c | 3 +
drivers/net/phy/mdio-bcm-iproc.c | 20 +
drivers/net/usb/qmi_wwan.c | 43 +-
drivers/net/wireless/marvell/mwifiex/main.h | 13 -
drivers/net/wireless/marvell/mwifiex/tdls.c | 75 +--
drivers/nfc/pn544/i2c.c | 1 +
drivers/nvme/host/core.c | 10 +-
drivers/nvme/host/pci.c | 25 +-
drivers/nvme/host/rdma.c | 2 +-
drivers/nvme/host/tcp.c | 9 +-
drivers/perf/arm_smmuv3_pmu.c | 2 +-
drivers/pwm/pwm-omap-dmtimer.c | 21 +-
drivers/s390/crypto/ap_bus.h | 4 +-
drivers/s390/crypto/ap_card.c | 8 +-
drivers/s390/crypto/ap_queue.c | 6 +-
drivers/s390/crypto/zcrypt_api.c | 16 +-
drivers/s390/net/qeth_core_main.c | 2 +-
drivers/s390/net/qeth_l2_main.c | 29 +-
drivers/s390/scsi/zfcp_fsf.h | 2 +-
drivers/s390/scsi/zfcp_sysfs.c | 2 +-
drivers/scsi/sd_zbc.c | 7 +-
drivers/soc/tegra/fuse/fuse-tegra30.c | 3 +-
drivers/thermal/broadcom/brcmstb_thermal.c | 31 +-
drivers/thermal/db8500_thermal.c | 4 +-
drivers/vhost/net.c | 10 +-
drivers/watchdog/wdat_wdt.c | 2 +-
fs/ceph/file.c | 17 +-
fs/cifs/cifsacl.c | 4 +-
fs/cifs/connect.c | 2 +-
fs/cifs/inode.c | 2 +-
fs/dax.c | 3 +
fs/ext4/super.c | 6 +-
fs/f2fs/data.c | 32 +-
fs/io-wq.c | 22 +-
fs/io_uring.c | 12 +-
fs/namei.c | 2 +-
fs/nfs/nfs4file.c | 1 -
fs/nfs/nfs4proc.c | 18 +-
fs/ubifs/orphan.c | 4 +-
fs/xfs/libxfs/xfs_attr.h | 7 +-
fs/xfs/xfs_ioctl.c | 2 +
fs/xfs/xfs_ioctl32.c | 2 +
include/acpi/actypes.h | 3 +-
include/asm-generic/vdso/vsyscall.h | 4 +-
include/linux/blkdev.h | 2 +-
include/linux/blktrace_api.h | 18 +-
include/linux/hid.h | 2 +-
include/linux/netdevice.h | 7 +-
include/linux/netfilter/ipset/ip_set.h | 11 +-
include/linux/sched/nohz.h | 2 +
include/net/flow_dissector.h | 9 +
include/uapi/linux/usb/charger.h | 16 +-
kernel/audit.c | 40 +-
kernel/auditfilter.c | 71 +--
kernel/kprobes.c | 4 +-
kernel/locking/lockdep_proc.c | 4 +-
kernel/padata.c | 4 +-
kernel/rcu/tree_exp.h | 11 +-
kernel/sched/core.c | 31 +-
kernel/sched/fair.c | 7 +-
kernel/sched/loadavg.c | 33 +-
kernel/time/vsyscall.c | 37 +-
kernel/trace/blktrace.c | 114 +++-
kernel/trace/trace.c | 2 +
mm/debug.c | 8 +-
mm/gup.c | 3 +-
mm/huge_memory.c | 26 +-
net/core/dev.c | 34 +-
net/core/fib_rules.c | 2 +-
net/core/rtnetlink.c | 26 +-
net/ipv4/udp.c | 6 +-
net/ipv6/ip6_fib.c | 7 +-
net/ipv6/route.c | 1 +
net/mac80211/mlme.c | 6 +-
net/mac80211/util.c | 34 +-
net/netfilter/ipset/ip_set_core.c | 34 +-
net/netfilter/ipset/ip_set_hash_gen.h | 635 ++++++++++++++-------
net/netfilter/nft_tunnel.c | 4 +-
net/netfilter/xt_hashlimit.c | 12 +-
net/netlink/af_netlink.c | 5 +-
net/sched/cls_flower.c | 1 +
net/sctp/sm_statefuns.c | 29 +-
net/smc/af_smc.c | 2 +
net/smc/smc_clc.c | 4 +-
net/tipc/socket.c | 2 +
net/tls/tls_device.c | 20 +-
net/wireless/ethtool.c | 8 +-
net/wireless/nl80211.c | 5 +-
scripts/Makefile.lib | 4 +-
security/integrity/ima/ima_policy.c | 44 +-
tools/perf/builtin-report.c | 6 +-
tools/perf/ui/browsers/hists.c | 1 +
tools/perf/ui/gtk/Build | 5 +
tools/perf/util/map.c | 1 +
tools/testing/selftests/ftrace/Makefile | 2 +-
tools/testing/selftests/livepatch/Makefile | 2 +
tools/testing/selftests/net/fib_tests.sh | 6 +
tools/testing/selftests/rseq/Makefile | 2 +
tools/testing/selftests/rtc/Makefile | 2 +
virt/kvm/kvm_main.c | 12 +-
196 files changed, 2077 insertions(+), 1118 deletions(-)



2020-03-03 17:49:04

by Greg KH

Subject: [PATCH 5.5 049/176] drm/amd/display: Add initializations for PLL2 clock source

From: Isabel Zhang <[email protected]>

[ Upstream commit c134c3cabae46a56ab2e1f5e5fa49405e1758838 ]

[Why]
Starting from 14nm, the PLL is built into the PHY and the PLL is mapped
to the PHY on a 1:1 basis. In the code, the DP port is mapped to a PLL
that was not initialized. This causes a DP-to-HDMI dongle to fail to
light up the display.

[How]
Add initializations for PLL2 when creating resources.

Signed-off-by: Isabel Zhang <[email protected]>
Reviewed-by: Eric Yang <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
index 83cda43a1b6b3..77741b18c85b0 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
@@ -57,6 +57,7 @@
#include "dcn20/dcn20_dccg.h"
#include "dcn21_hubbub.h"
#include "dcn10/dcn10_resource.h"
+#include "dce110/dce110_resource.h"

#include "dcn20/dcn20_dwb.h"
#include "dcn20/dcn20_mmhubbub.h"
@@ -867,6 +868,7 @@ static const struct dc_debug_options debug_defaults_diags = {
enum dcn20_clk_src_array_id {
DCN20_CLK_SRC_PLL0,
DCN20_CLK_SRC_PLL1,
+ DCN20_CLK_SRC_PLL2,
DCN20_CLK_SRC_TOTAL_DCN21
};

@@ -1730,6 +1732,10 @@ static bool construct(
dcn21_clock_source_create(ctx, ctx->dc_bios,
CLOCK_SOURCE_COMBO_PHY_PLL1,
&clk_src_regs[1], false);
+ pool->base.clock_sources[DCN20_CLK_SRC_PLL2] =
+ dcn21_clock_source_create(ctx, ctx->dc_bios,
+ CLOCK_SOURCE_COMBO_PHY_PLL2,
+ &clk_src_regs[2], false);

pool->base.clk_src_count = DCN20_CLK_SRC_TOTAL_DCN21;

--
2.20.1



2020-03-03 17:49:07

by Greg KH

Subject: [PATCH 5.5 051/176] soc/tegra: fuse: Fix build with Tegra194 configuration

From: Thierry Reding <[email protected]>

[ Upstream commit 6f4ecbe284df5f22e386a640d9a4b32cede62030 ]

If only Tegra194 support is enabled, the tegra30_fuse_read() and
tegra30_fuse_init() functions are not declared, causing a build failure.
Add Tegra194 to the preprocessor guard to make sure these functions are
available for Tegra194-only builds as well.

Link: https://lore.kernel.org/r/[email protected]
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/soc/tegra/fuse/fuse-tegra30.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/soc/tegra/fuse/fuse-tegra30.c b/drivers/soc/tegra/fuse/fuse-tegra30.c
index b8daaf5b7291b..efd158b4607cb 100644
--- a/drivers/soc/tegra/fuse/fuse-tegra30.c
+++ b/drivers/soc/tegra/fuse/fuse-tegra30.c
@@ -36,7 +36,8 @@
defined(CONFIG_ARCH_TEGRA_124_SOC) || \
defined(CONFIG_ARCH_TEGRA_132_SOC) || \
defined(CONFIG_ARCH_TEGRA_210_SOC) || \
- defined(CONFIG_ARCH_TEGRA_186_SOC)
+ defined(CONFIG_ARCH_TEGRA_186_SOC) || \
+ defined(CONFIG_ARCH_TEGRA_194_SOC)
static u32 tegra30_fuse_read_early(struct tegra_fuse *fuse, unsigned int offset)
{
if (WARN_ON(!fuse->base))
--
2.20.1



2020-03-03 17:49:10

by Greg KH

Subject: [PATCH 5.5 052/176] i40e: Fix the conditional for i40e_vc_validate_vqs_bitmaps

From: Brett Creeley <[email protected]>

[ Upstream commit f27f37a04a69890ac85d9155f03ee2d23b678d8f ]

Commit d9d6a9aed3f6 ("i40e: Fix virtchnl_queue_select bitmap
validation") introduced a necessary change for verifying how queue
bitmaps from the iavf driver get validated. Unfortunately, the
conditional was reversed. Fix this.

Fixes: d9d6a9aed3f6 ("i40e: Fix virtchnl_queue_select bitmap validation")
Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 69523ac85639e..56b9e445732ba 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2362,7 +2362,7 @@ static int i40e_vc_enable_queues_msg(struct i40e_vf *vf, u8 *msg)
goto error_param;
}

- if (i40e_vc_validate_vqs_bitmaps(vqs)) {
+ if (!i40e_vc_validate_vqs_bitmaps(vqs)) {
aq_ret = I40E_ERR_PARAM;
goto error_param;
}
@@ -2424,7 +2424,7 @@ static int i40e_vc_disable_queues_msg(struct i40e_vf *vf, u8 *msg)
goto error_param;
}

- if (i40e_vc_validate_vqs_bitmaps(vqs)) {
+ if (!i40e_vc_validate_vqs_bitmaps(vqs)) {
aq_ret = I40E_ERR_PARAM;
goto error_param;
}
--
2.20.1



2020-03-03 17:49:10

by Greg KH

Subject: [PATCH 5.5 053/176] net: ena: fix potential crash when rxfh key is NULL

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit 91a65b7d3ed8450f31ab717a65dcb5f9ceb5ab02 ]

When ethtool -X is called without an hkey, ena_com_fill_hash_function()
is called with key=NULL, which is passed to memcpy(), causing a crash.

This commit fixes the issue by checking that key is not NULL before
copying from it.
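
For illustration only (not part of the patch), a minimal userspace sketch
of the guard pattern; KEY_SIZE and fill_hash_key() are made-up stand-ins
for sizeof(hash_key->key) and the driver function:

#include <string.h>

#define KEY_SIZE 40	/* hypothetical key size, for the sketch only */

static int fill_hash_key(unsigned char *dst, const unsigned char *key,
			 size_t key_len)
{
	if (key) {			/* only touch the key if one was supplied */
		if (key_len != KEY_SIZE)
			return -1;	/* reject wrong-sized keys */
		memcpy(dst, key, key_len);
	}
	return 0;			/* key == NULL: keep the current key */
}

int main(void)
{
	unsigned char dst[KEY_SIZE];

	/* models "ethtool -X" without hkey: key is NULL, memcpy() must not run */
	return fill_hash_key(dst, NULL, 0);
}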

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index ea62604fdf8ca..e54c44fdcaa73 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -2297,15 +2297,16 @@ int ena_com_fill_hash_function(struct ena_com_dev *ena_dev,

switch (func) {
case ENA_ADMIN_TOEPLITZ:
- if (key_len > sizeof(hash_key->key)) {
- pr_err("key len (%hu) is bigger than the max supported (%zu)\n",
- key_len, sizeof(hash_key->key));
- return -EINVAL;
+ if (key) {
+ if (key_len != sizeof(hash_key->key)) {
+ pr_err("key len (%hu) doesn't equal the supported size (%zu)\n",
+ key_len, sizeof(hash_key->key));
+ return -EINVAL;
+ }
+ memcpy(hash_key->key, key, key_len);
+ rss->hash_init_val = init_val;
+ hash_key->keys_num = key_len >> 2;
}
-
- memcpy(hash_key->key, key, key_len);
- rss->hash_init_val = init_val;
- hash_key->keys_num = key_len >> 2;
break;
case ENA_ADMIN_CRC32:
rss->hash_init_val = init_val;
--
2.20.1



2020-03-03 17:49:13

by Greg KH

Subject: [PATCH 5.5 055/176] net: ena: add missing ethtool TX timestamping indication

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit cf6d17fde93bdda23c9b02dd5906a12bf8c55209 ]

The current implementation of the driver calls skb_tx_timestamp() to add a
software tx timestamp to the skb; however, the software-transmit capability
is not reported in ethtool -T.

This commit updates the ethtool structure to report the software-transmit
capability in ethtool -T using the standard ethtool_op_get_ts_info().
This function reports all software timestamping capabilities (tx and rx),
as well as setting phc_index = -1. phc_index is the index of the PTP
hardware clock device that will be used for hardware timestamps. Since we
don't have such a device in ENA, using the default -1 value is the correct
setting.
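
A small userspace sketch of the values involved, from memory (verify
against the kernel sources); the SOF_TIMESTAMPING_* flags come from the
uapi header:

#include <stdio.h>
#include <linux/net_tstamp.h>

int main(void)
{
	/* roughly what ethtool_op_get_ts_info() reports: software
	 * timestamping in both directions, and no PTP hardware clock
	 */
	unsigned int so_timestamping = SOF_TIMESTAMPING_TX_SOFTWARE |
				       SOF_TIMESTAMPING_RX_SOFTWARE |
				       SOF_TIMESTAMPING_SOFTWARE;
	int phc_index = -1;

	printf("so_timestamping=0x%x phc_index=%d\n", so_timestamping, phc_index);
	return 0;
}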

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Ezequiel Lara Gomez <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index fc96c66b44cb5..8b56383b64aea 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -812,6 +812,7 @@ static const struct ethtool_ops ena_ethtool_ops = {
.set_channels = ena_set_channels,
.get_tunable = ena_get_tunable,
.set_tunable = ena_set_tunable,
+ .get_ts_info = ethtool_op_get_ts_info,
};

void ena_set_ethtool_ops(struct net_device *netdev)
--
2.20.1



2020-03-03 17:49:14

by Greg KH

Subject: [PATCH 5.5 056/176] net: ena: fix incorrect default RSS key

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit 0d1c3de7b8c78a5e44b74b62ede4a63629f5d811 ]

Bug description:
When running "ethtool -x <if_name>" the key shows up as all zeros.

When we use "ethtool -X <if_name> hfunc toeplitz hkey <some:random:key>" to
set the key and then try to retrieve it using "ethtool -x <if_name>" then
we return the correct key because we return the one we saved.

Bug cause:
We don't fetch the key from the device but instead return the key
that we have saved internally, which is by default set to zero upon
allocation.

Fix:
This commit fixes the issue by initializing the key to a random value
using netdev_rss_key_fill().
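
For illustration only, a userspace stand-in for the new initialization;
rss_key_fill() is a made-up model of netdev_rss_key_fill(), which in the
kernel copies from a per-boot random key rather than using rand():

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void rss_key_fill(unsigned char *key, size_t len)
{
	size_t i;

	srand((unsigned int)time(NULL));
	for (i = 0; i < len; i++)
		key[i] = (unsigned char)rand();	/* never leave the key all-zero */
}

int main(void)
{
	unsigned char key[40];	/* hypothetical key size */

	rss_key_fill(key, sizeof(key));
	printf("first key byte: 0x%02x\n", key[0]);
	return 0;
}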

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 15 +++++++++++++++
drivers/net/ethernet/amazon/ena/ena_com.h | 1 +
2 files changed, 16 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index e54c44fdcaa73..d6b894b06fa30 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -1041,6 +1041,19 @@ static int ena_com_get_feature(struct ena_com_dev *ena_dev,
feature_ver);
}

+static void ena_com_hash_key_fill_default_key(struct ena_com_dev *ena_dev)
+{
+ struct ena_admin_feature_rss_flow_hash_control *hash_key =
+ (ena_dev->rss).hash_key;
+
+ netdev_rss_key_fill(&hash_key->key, sizeof(hash_key->key));
+ /* The key is stored in the device in u32 array
+ * as well as the API requires the key to be passed in this
+ * format. Thus the size of our array should be divided by 4
+ */
+ hash_key->keys_num = sizeof(hash_key->key) / sizeof(u32);
+}
+
static int ena_com_hash_key_allocate(struct ena_com_dev *ena_dev)
{
struct ena_rss *rss = &ena_dev->rss;
@@ -2631,6 +2644,8 @@ int ena_com_rss_init(struct ena_com_dev *ena_dev, u16 indr_tbl_log_size)
if (unlikely(rc))
goto err_hash_key;

+ ena_com_hash_key_fill_default_key(ena_dev);
+
rc = ena_com_hash_ctrl_init(ena_dev);
if (unlikely(rc))
goto err_hash_ctrl;
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index 0ce37d54ed108..9b5bd28ed0ac6 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -44,6 +44,7 @@
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/wait.h>
+#include <linux/netdevice.h>

#include "ena_common_defs.h"
#include "ena_admin_defs.h"
--
2.20.1



2020-03-03 17:49:19

by Greg KH

Subject: [PATCH 5.5 058/176] net: ena: rss: fix failure to get indirection table

From: Sameeh Jubran <[email protected]>

[ Upstream commit 0c8923c0a64fb5d14bebb9a9065d2dc25ac5e600 ]

On old hardware, getting/setting the hash function is not supported while
getting/setting the indirection table is.

This commit enables us to still show the indirection table on older
hardware by setting the hash function and key to NULL.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 8b56383b64aea..8be9df885bf4f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -648,7 +648,21 @@ static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
if (rc)
return rc;

+ /* We call this function in order to check if the device
+ * supports getting/setting the hash function.
+ */
rc = ena_com_get_hash_function(adapter->ena_dev, &ena_func, key);
+
+ if (rc) {
+ if (rc == -EOPNOTSUPP) {
+ key = NULL;
+ hfunc = NULL;
+ rc = 0;
+ }
+
+ return rc;
+ }
+
if (rc)
return rc;

--
2.20.1



2020-03-03 17:49:22

by Greg KH

Subject: [PATCH 5.5 059/176] net: ena: rss: store hash function as values and not bits

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit 4844470d472d660c26149ad764da2406adb13423 ]

The device receives, stores and retrieves the hash function value as bits
and not as their enum value.

The bug:
* In ena_com_set_hash_function() we set
cmd.u.flow_hash_func.selected_func to the bit value of rss->hash_func.
(1 << rss->hash_func)
* In ena_com_get_hash_function() we retrieve the hash function and store
its bit value in rss->hash_func. (Now the bit value of the hash function
is stored in rss->hash_func instead of its enum value.)

The fix:
This commit fixes the issue by converting the retrieved hash function
values from the device to the matching enum value of the set bit using
ffs(). ffs() finds the first set bit's index in a word. Since the function
returns 1 for the LSB's index, we need to subtract 1 from the returned
value (note that BIT(0) is 1).
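
A tiny userspace demonstration of the conversion; the TOEPLITZ/CRC32
values here are sketch values, not the real ena_admin enum:

#include <stdio.h>
#include <strings.h>	/* ffs() */

enum hash_func { TOEPLITZ, CRC32 };	/* sketch values only */

int main(void)
{
	unsigned int selected = 1u << CRC32;	/* device reports the bit */
	int func = ffs((int)selected);		/* 1-based: bit 0 maps to 1 */

	if (func)
		func--;				/* back to the enum value */

	printf("device bit 0x%x -> enum value %d\n", selected, func);
	return 0;
}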

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 6f758ece86f60..8ab192cb26b74 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -2370,7 +2370,11 @@ int ena_com_get_hash_function(struct ena_com_dev *ena_dev,
if (unlikely(rc))
return rc;

- rss->hash_func = get_resp.u.flow_hash_func.selected_func;
+ /* ffs() returns 1 in case the lsb is set */
+ rss->hash_func = ffs(get_resp.u.flow_hash_func.selected_func);
+ if (rss->hash_func)
+ rss->hash_func--;
+
if (func)
*func = rss->hash_func;

--
2.20.1



2020-03-03 17:49:24

by Greg KH

Subject: [PATCH 5.5 061/176] net: ena: fix corruption of dev_idx_to_host_tbl

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit e3f89f91e98ce07dc0f121a3b70d21aca749ba39 ]

The function ena_com_ind_tbl_convert_from_device() has an overflow
bug as explained below. Either way, this function is not needed at
all, since we don't retrieve the indirection table from the device
at any point, which means that this conversion is not needed.

The bug:
The for loop iterates over all io_sq_queues; once i passes the actual
number of used queues, io_sq_queues[i].idx equals 0 since those entries
are uninitialized, which results in the following code being executed
until the end of the loop:

dev_idx_to_host_tbl[0] = i;

This results in dev_idx_to_host_tbl[0] being equal to
ENA_TOTAL_NUM_QUEUES - 1.
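
A small userspace model of the overflow; the queue count and the idx
array contents are invented for the sketch:

#include <stdio.h>

#define ENA_TOTAL_NUM_QUEUES 8	/* sketch value */

int main(void)
{
	/* note: this initializer only sets element 0 to -1, the rest to 0 */
	unsigned short dev_idx_to_host_tbl[ENA_TOTAL_NUM_QUEUES] = { (unsigned short)-1 };
	/* 3 queues in use; the unused entries keep their "uninitialized" 0 */
	unsigned short io_sq_idx[ENA_TOTAL_NUM_QUEUES] = { 0, 1, 2 };
	int i;

	/* every unused queue has idx == 0, so entry 0 is overwritten each pass */
	for (i = 0; i < ENA_TOTAL_NUM_QUEUES; i++)
		dev_idx_to_host_tbl[io_sq_idx[i]] = (unsigned short)i;

	/* prints 7, i.e. ENA_TOTAL_NUM_QUEUES - 1 */
	printf("%u\n", (unsigned int)dev_idx_to_host_tbl[0]);
	return 0;
}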

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 28 -----------------------
1 file changed, 28 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 8ab192cb26b74..74743fd8a1e0a 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -1281,30 +1281,6 @@ static int ena_com_ind_tbl_convert_to_device(struct ena_com_dev *ena_dev)
return 0;
}

-static int ena_com_ind_tbl_convert_from_device(struct ena_com_dev *ena_dev)
-{
- u16 dev_idx_to_host_tbl[ENA_TOTAL_NUM_QUEUES] = { (u16)-1 };
- struct ena_rss *rss = &ena_dev->rss;
- u8 idx;
- u16 i;
-
- for (i = 0; i < ENA_TOTAL_NUM_QUEUES; i++)
- dev_idx_to_host_tbl[ena_dev->io_sq_queues[i].idx] = i;
-
- for (i = 0; i < 1 << rss->tbl_log_size; i++) {
- if (rss->rss_ind_tbl[i].cq_idx > ENA_TOTAL_NUM_QUEUES)
- return -EINVAL;
- idx = (u8)rss->rss_ind_tbl[i].cq_idx;
-
- if (dev_idx_to_host_tbl[idx] > ENA_TOTAL_NUM_QUEUES)
- return -EINVAL;
-
- rss->host_rss_ind_tbl[i] = dev_idx_to_host_tbl[idx];
- }
-
- return 0;
-}
-
static void ena_com_update_intr_delay_resolution(struct ena_com_dev *ena_dev,
u16 intr_delay_resolution)
{
@@ -2638,10 +2614,6 @@ int ena_com_indirect_table_get(struct ena_com_dev *ena_dev, u32 *ind_tbl)
if (!ind_tbl)
return 0;

- rc = ena_com_ind_tbl_convert_from_device(ena_dev);
- if (unlikely(rc))
- return rc;
-
for (i = 0; i < (1 << rss->tbl_log_size); i++)
ind_tbl[i] = rss->host_rss_ind_tbl[i];

--
2.20.1



2020-03-03 17:49:25

by Greg KH

Subject: [PATCH 5.5 062/176] net: ena: ethtool: use correct value for crc32 hash

From: Sameeh Jubran <[email protected]>

[ Upstream commit 886d2089276e40d460731765083a741c5c762461 ]

Up until kernel 4.11 there was no enum defined for the crc32 hash in
ethtool, so the xor enum was used to support crc32.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 610a7c63e1742..4ad69066e7846 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -693,7 +693,7 @@ static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
func = ETH_RSS_HASH_TOP;
break;
case ENA_ADMIN_CRC32:
- func = ETH_RSS_HASH_XOR;
+ func = ETH_RSS_HASH_CRC32;
break;
default:
netif_err(adapter, drv, netdev,
@@ -739,7 +739,7 @@ static int ena_set_rxfh(struct net_device *netdev, const u32 *indir,
case ETH_RSS_HASH_TOP:
func = ENA_ADMIN_TOEPLITZ;
break;
- case ETH_RSS_HASH_XOR:
+ case ETH_RSS_HASH_CRC32:
func = ENA_ADMIN_CRC32;
break;
default:
--
2.20.1



2020-03-03 17:49:28

by Greg KH

Subject: [PATCH 5.5 065/176] ice: Don't allow same value for Rx tail to be written twice

From: Brett Creeley <[email protected]>

[ Upstream commit 168983a8e19b89efd175661e53faa6246be363a0 ]

Currently we compare the value we are about to write to the Rx tail
register with the previous value of next_to_use. The problem with this
is that we only write tail on 8-descriptor boundaries, but next_to_use is
updated whenever we clean Rx descriptors. Fix this by comparing the
value we are about to write to tail with the previously written tail
value. This will prevent duplicate Rx tail bumps.
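
A one-line illustration of the new comparison value (sketch only):

#include <stdio.h>

int main(void)
{
	/* tail is only written on 8-descriptor boundaries, so compare against
	 * next_to_use rounded down to a multiple of 8: the tail value that
	 * was actually written last
	 */
	unsigned int next_to_use = 13;
	unsigned int prev_written_tail = next_to_use & ~0x7u;

	printf("next_to_use=%u, last written tail=%u\n",
	       next_to_use, prev_written_tail);	/* 13 -> 8 */
	return 0;
}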

Signed-off-by: Brett Creeley <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 35bbc4ff603cd..6da048a6ca7c1 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -10,7 +10,7 @@
*/
void ice_release_rx_desc(struct ice_ring *rx_ring, u32 val)
{
- u16 prev_ntu = rx_ring->next_to_use;
+ u16 prev_ntu = rx_ring->next_to_use & ~0x7;

rx_ring->next_to_use = val;

--
2.20.1



2020-03-03 17:49:30

by Greg KH

Subject: [PATCH 5.5 008/176] net/tls: Fix to avoid getting invalid tls record

From: Rohit Maheshwari <[email protected]>

[ Upstream commit 06f5201c6392f998a49ca9c9173e2930c8eb51d8 ]

Current code doesn't check if the TCP sequence number starts at (or after)
the 1st record's start sequence number. It only checks if the sequence
number is before the 1st record's end sequence number. This problem will
always be a possibility in the retransmit case. If a record which belongs
to a requested sequence number has already been deleted, tls_get_record()
will start looking into the list, and as per the check it will test whether
the sequence number is before the end seq of the 1st record, which will
always be true, so it will always return the 1st record when it should in
fact return NULL.
As part of the fix, start looking at each record only if the sequence
number lies in the list, else return NULL.
There is one more check added: the driver looks for the start marker record
to handle TCP packets which are before the TLS offload start sequence
number, hence it returns the 1st record if that record is the TLS start
marker and the sequence number is before the 1st record's starting sequence
number.
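
For reference, a userspace sketch of the range check; between() is
modeled on the kernel's wraparound-safe helper and the sequence numbers
are invented:

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

/* wraparound-safe "start <= seq <= end" */
static bool between(uint32_t seq, uint32_t start, uint32_t end)
{
	return end - start >= seq - start;
}

int main(void)
{
	uint32_t first_start = 1000, last_end = 5000;

	/* a retransmit for an already-freed record must not match the list */
	printf("%d\n", between(500, first_start, last_end));	/* 0: return NULL */
	printf("%d\n", between(1500, first_start, last_end));	/* 1: walk the list */
	return 0;
}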

Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Rohit Maheshwari <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/tls/tls_device.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -592,7 +592,7 @@ struct tls_record_info *tls_get_record(s
u32 seq, u64 *p_record_sn)
{
u64 record_sn = context->hint_record_sn;
- struct tls_record_info *info;
+ struct tls_record_info *info, *last;

info = context->retransmit_hint;
if (!info ||
@@ -604,6 +604,24 @@ struct tls_record_info *tls_get_record(s
struct tls_record_info, list);
if (!info)
return NULL;
+ /* send the start_marker record if seq number is before the
+ * tls offload start marker sequence number. This record is
+ * required to handle TCP packets which are before TLS offload
+ * started.
+ * And if it's not start marker, look if this seq number
+ * belongs to the list.
+ */
+ if (likely(!tls_record_is_start_marker(info))) {
+ /* we have the first record, get the last record to see
+ * if this seq number belongs to the list.
+ */
+ last = list_last_entry(&context->records_list,
+ struct tls_record_info, list);
+
+ if (!between(seq, tls_record_start_seq(info),
+ last->end_seq))
+ return NULL;
+ }
record_sn = context->unacked_record_sn;
}



2020-03-03 17:49:30

by Greg KH

Subject: [PATCH 5.5 005/176] net: mscc: fix in frame extraction

From: Horatiu Vultur <[email protected]>

[ Upstream commit a81541041ceb55bcec9a8bb8ad3482263f0a205a ]

Each extracted frame on Ocelot has an IFH. The frame and IFH are extracted
by reading chunks of 4 bytes from a register.

In case the IFH and frame were read correctly, it tries to read the next
frame. In case there are no more frames in the queue, it checks if there
were any previous errors and in that case clears the queue. But this check
will always succeed, even when there are no errors, because when extracting
the IFH the error is checked against 4 (the number of bytes read) and the
error is then only set if the extraction of the frame failed. So in the
happy case where there are no errors, the err variable is still 4. So there
could be a case where, after the check that there are no more frames in the
queue, a frame arrives in the queue, but because the error was not reset,
the driver would try to flush the queue and the frame would be lost.

The fix consists in resetting the error after reading the IFH.
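
A skeleton of the loop in question (userspace sketch; read4() stands in
for the register reads):

#include <stdio.h>

/* stand-in for reading 4 bytes of IFH from the extraction register */
static int read4(unsigned int *val)
{
	*val = 0;
	return 4;	/* bytes read */
}

int main(void)
{
	unsigned int ifh;
	int frames = 2, err = 0;

	while (frames--) {
		err = read4(&ifh);	/* success leaves err == 4 */
		if (err != 4)
			break;
		err = 0;	/* the fix: reset err so a leftover 4 cannot
				 * be taken for an error once the queue drains */
		/* ... extract the frame; a real failure sets err ... */
	}

	if (err)
		printf("flush queue\n");	/* must only run on real errors */
	return 0;
}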

Signed-off-by: Horatiu Vultur <[email protected]>
Acked-by: Alexandre Belloni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mscc/ocelot_board.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/drivers/net/ethernet/mscc/ocelot_board.c
+++ b/drivers/net/ethernet/mscc/ocelot_board.c
@@ -114,6 +114,14 @@ static irqreturn_t ocelot_xtr_irq_handle
if (err != 4)
break;

+ /* At this point the IFH was read correctly, so it is safe to
+ * presume that there is no error. The err needs to be reset
+ * otherwise a frame could come in CPU queue between the while
+ * condition and the check for error later on. And in that case
+ * the new frame is just removed and not processed.
+ */
+ err = 0;
+
ocelot_parse_ifh(ifh, &info);

ocelot_port = ocelot->ports[info.port];


2020-03-03 17:49:30

by Greg KH

Subject: [PATCH 5.5 066/176] ice: fix and consolidate logging of NVM/firmware version information

From: Bruce Allan <[email protected]>

[ Upstream commit fbf1e1f6988e70287b1bfcad4f655ca96b681929 ]

Logging the firmware/NVM information during driver load is redundant since
that information is also available via ethtool. Move the functionality
found in ice_nvm_version_str() directly into ice_get_drvinfo(), and remove
the call to the former and the logging of that info during driver probe.
This also gets rid of a bug in ice_nvm_version_str() where it returns a
pointer to a buffer which is freed when that function exits.
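
The replacement pattern as a standalone sketch (the version fields are
placeholders): format into storage the caller owns, so no pointer to
function-local storage ever escapes:

#include <stdio.h>

static void get_fw_version(char *buf, size_t len)
{
	/* same format string as the driver, with placeholder values */
	snprintf(buf, len, "%x.%02x 0x%x %d.%d.%d", 1, 0x02, 0x1234, 1, 0, 0);
}

int main(void)
{
	char fw_version[32];

	get_fw_version(fw_version, sizeof(fw_version));
	printf("%s\n", fw_version);
	return 0;
}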

Signed-off-by: Bruce Allan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_ethtool.c | 15 +++++++++++++--
drivers/net/ethernet/intel/ice/ice_lib.c | 19 -------------------
drivers/net/ethernet/intel/ice/ice_lib.h | 2 --
drivers/net/ethernet/intel/ice/ice_main.c | 5 -----
4 files changed, 13 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 9ebd93e79aeb6..f956f7bb4ef2d 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -165,13 +165,24 @@ static void
ice_get_drvinfo(struct net_device *netdev, struct ethtool_drvinfo *drvinfo)
{
struct ice_netdev_priv *np = netdev_priv(netdev);
+ u8 oem_ver, oem_patch, nvm_ver_hi, nvm_ver_lo;
struct ice_vsi *vsi = np->vsi;
struct ice_pf *pf = vsi->back;
+ struct ice_hw *hw = &pf->hw;
+ u16 oem_build;

strlcpy(drvinfo->driver, KBUILD_MODNAME, sizeof(drvinfo->driver));
strlcpy(drvinfo->version, ice_drv_ver, sizeof(drvinfo->version));
- strlcpy(drvinfo->fw_version, ice_nvm_version_str(&pf->hw),
- sizeof(drvinfo->fw_version));
+
+ /* Display NVM version (from which the firmware version can be
+ * determined) which contains more pertinent information.
+ */
+ ice_get_nvm_version(hw, &oem_ver, &oem_build, &oem_patch,
+ &nvm_ver_hi, &nvm_ver_lo);
+ snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+ "%x.%02x 0x%x %d.%d.%d", nvm_ver_hi, nvm_ver_lo,
+ hw->nvm.eetrack, oem_ver, oem_build, oem_patch);
+
strlcpy(drvinfo->bus_info, pci_name(pf->pdev),
sizeof(drvinfo->bus_info));
drvinfo->n_priv_flags = ICE_PRIV_FLAG_ARRAY_SIZE;
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index e7449248fab4c..e0e3c6400e4b9 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2647,25 +2647,6 @@ int ice_vsi_cfg_tc(struct ice_vsi *vsi, u8 ena_tc)
}
#endif /* CONFIG_DCB */

-/**
- * ice_nvm_version_str - format the NVM version strings
- * @hw: ptr to the hardware info
- */
-char *ice_nvm_version_str(struct ice_hw *hw)
-{
- u8 oem_ver, oem_patch, ver_hi, ver_lo;
- static char buf[ICE_NVM_VER_LEN];
- u16 oem_build;
-
- ice_get_nvm_version(hw, &oem_ver, &oem_build, &oem_patch, &ver_hi,
- &ver_lo);
-
- snprintf(buf, sizeof(buf), "%x.%02x 0x%x %d.%d.%d", ver_hi, ver_lo,
- hw->nvm.eetrack, oem_ver, oem_build, oem_patch);
-
- return buf;
-}
-
/**
* ice_update_ring_stats - Update ring statistics
* @ring: ring to update
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h
index 6e31e30aba394..0d2b1119c0e38 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_lib.h
@@ -97,8 +97,6 @@ void ice_vsi_cfg_frame_size(struct ice_vsi *vsi);

u32 ice_intrl_usec_to_reg(u8 intrl, u8 gran);

-char *ice_nvm_version_str(struct ice_hw *hw);
-
enum ice_status
ice_vsi_cfg_mac_fltr(struct ice_vsi *vsi, const u8 *macaddr, bool set);

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 69bff085acf75..b4cbeb4f3177f 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3241,11 +3241,6 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
goto err_exit_unroll;
}

- dev_info(dev, "firmware %d.%d.%d api %d.%d.%d nvm %s build 0x%08x\n",
- hw->fw_maj_ver, hw->fw_min_ver, hw->fw_patch,
- hw->api_maj_ver, hw->api_min_ver, hw->api_patch,
- ice_nvm_version_str(hw), hw->fw_build);
-
ice_request_fw(pf);

/* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
--
2.20.1



2020-03-03 17:49:31

by Greg KH

Subject: [PATCH 5.5 024/176] net: rtnetlink: fix bugs in rtnl_alt_ifname()

From: Eric Dumazet <[email protected]>

[ Upstream commit 44bfa9c5e5f06c72540273813e4c66beb5a8c213 ]

Since IFLA_ALT_IFNAME is an NLA_STRING, we have no
guarantee it is nul terminated.

We should use nla_strdup() instead of kstrdup(), since this
helper makes sure we do not access out-of-bounds data.
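
A userspace model of the difference; nla_strdup_model() is a made-up
stand-in for the kernel helper:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* an NLA_STRING payload is length-delimited, not NUL-terminated */
static char *nla_strdup_model(const char *data, size_t attrlen)
{
	char *s = malloc(attrlen + 1);

	if (s) {
		memcpy(s, data, attrlen);
		s[attrlen] = '\0';	/* terminate using the attribute length */
	}
	return s;
}

int main(void)
{
	char payload[4] = { 'e', 't', 'h', '0' };	/* no trailing '\0' */

	/* strdup(payload) would run strlen() past the 4 bytes (see the
	 * KMSAN report below); the bounded copy stops at attrlen
	 */
	char *name = nla_strdup_model(payload, sizeof(payload));

	printf("%s\n", name ? name : "(alloc failed)");
	free(name);
	return 0;
}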

BUG: KMSAN: uninit-value in strlen+0x5e/0xa0 lib/string.c:535
CPU: 1 PID: 19157 Comm: syz-executor.5 Not tainted 5.5.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
__msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
strlen+0x5e/0xa0 lib/string.c:535
kstrdup+0x7f/0x1a0 mm/util.c:59
rtnl_alt_ifname net/core/rtnetlink.c:3495 [inline]
rtnl_linkprop+0x85d/0xc00 net/core/rtnetlink.c:3553
rtnl_newlinkprop+0x9d/0xb0 net/core/rtnetlink.c:3568
rtnetlink_rcv_msg+0x1153/0x1570 net/core/rtnetlink.c:5424
netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:639 [inline]
sock_sendmsg net/socket.c:659 [inline]
____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
___sys_sendmsg net/socket.c:2384 [inline]
__sys_sendmsg+0x451/0x5f0 net/socket.c:2417
__do_sys_sendmsg net/socket.c:2426 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45b3b9
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ff1c7b1ac78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ff1c7b1b6d4 RCX: 000000000045b3b9
RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000003
RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000000009cb R14: 00000000004cb3dd R15: 000000000075bf2c

Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
slab_alloc_node mm/slub.c:2774 [inline]
__kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382
__kmalloc_reserve net/core/skbuff.c:141 [inline]
__alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
alloc_skb include/linux/skbuff.h:1049 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:639 [inline]
sock_sendmsg net/socket.c:659 [inline]
____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
___sys_sendmsg net/socket.c:2384 [inline]
__sys_sendmsg+0x451/0x5f0 net/socket.c:2417
__do_sys_sendmsg net/socket.c:2426 [inline]
__se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
__x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jiri Pirko <[email protected]>
Reported-by: syzbot <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/rtnetlink.c | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)

--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3499,27 +3499,25 @@ static int rtnl_alt_ifname(int cmd, stru
if (err)
return err;

- alt_ifname = nla_data(attr);
+ alt_ifname = nla_strdup(attr, GFP_KERNEL);
+ if (!alt_ifname)
+ return -ENOMEM;
+
if (cmd == RTM_NEWLINKPROP) {
- alt_ifname = kstrdup(alt_ifname, GFP_KERNEL);
- if (!alt_ifname)
- return -ENOMEM;
err = netdev_name_node_alt_create(dev, alt_ifname);
- if (err) {
- kfree(alt_ifname);
- return err;
- }
+ if (!err)
+ alt_ifname = NULL;
} else if (cmd == RTM_DELLINKPROP) {
err = netdev_name_node_alt_destroy(dev, alt_ifname);
- if (err)
- return err;
} else {
- WARN_ON(1);
- return 0;
+ WARN_ON_ONCE(1);
+ err = -EINVAL;
}

- *changed = true;
- return 0;
+ kfree(alt_ifname);
+ if (!err)
+ *changed = true;
+ return err;
}

static int rtnl_linkprop(int cmd, struct sk_buff *skb, struct nlmsghdr *nlh,


2020-03-03 17:49:33

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 040/176] perf/x86/intel: Add Elkhart Lake support

From: Kan Liang <[email protected]>

[ Upstream commit eda23b387f6c4bb2971ac7e874a09913f533b22c ]

Elkhart Lake also uses the Tremont CPU. From the perspective of the
Intel PMU, nothing has changed compared with Jacobsville, so share
the perf code with Jacobsville.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/events/intel/core.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3be51aa06e67e..dff6623804c28 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4765,6 +4765,7 @@ __init int intel_pmu_init(void)
break;

case INTEL_FAM6_ATOM_TREMONT_D:
+ case INTEL_FAM6_ATOM_TREMONT:
x86_pmu.late_ack = true;
memcpy(hw_cache_event_ids, glp_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
--
2.20.1



2020-03-03 17:49:36

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 067/176] ice: update Unit Load Status bitmask to check after reset

From: Bruce Allan <[email protected]>

[ Upstream commit cf8fc2a0863f9ff27ebd2efcdb1f7d378b9fb8a6 ]

After a reset, the Unit Load Status bits in the GLNVM_ULD register
that are checked for completion should read 0x7FF before continuing.
Update the mask that is checked accordingly (minus the three reserved
bits, which are always set).

Signed-off-by: Bruce Allan <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_common.c | 17 ++++++++++++-----
drivers/net/ethernet/intel/ice/ice_hw_autogen.h | 6 ++++++
2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c
index fb1d930470c71..cb437a448305e 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -937,7 +937,7 @@ void ice_deinit_hw(struct ice_hw *hw)
*/
enum ice_status ice_check_reset(struct ice_hw *hw)
{
- u32 cnt, reg = 0, grst_delay;
+ u32 cnt, reg = 0, grst_delay, uld_mask;

/* Poll for Device Active state in case a recent CORER, GLOBR,
* or EMPR has occurred. The grst delay value is in 100ms units.
@@ -959,13 +959,20 @@ enum ice_status ice_check_reset(struct ice_hw *hw)
return ICE_ERR_RESET_FAILED;
}

-#define ICE_RESET_DONE_MASK (GLNVM_ULD_CORER_DONE_M | \
- GLNVM_ULD_GLOBR_DONE_M)
+#define ICE_RESET_DONE_MASK (GLNVM_ULD_PCIER_DONE_M |\
+ GLNVM_ULD_PCIER_DONE_1_M |\
+ GLNVM_ULD_CORER_DONE_M |\
+ GLNVM_ULD_GLOBR_DONE_M |\
+ GLNVM_ULD_POR_DONE_M |\
+ GLNVM_ULD_POR_DONE_1_M |\
+ GLNVM_ULD_PCIER_DONE_2_M)
+
+ uld_mask = ICE_RESET_DONE_MASK;

/* Device is Active; check Global Reset processes are done */
for (cnt = 0; cnt < ICE_PF_RESET_WAIT_COUNT; cnt++) {
- reg = rd32(hw, GLNVM_ULD) & ICE_RESET_DONE_MASK;
- if (reg == ICE_RESET_DONE_MASK) {
+ reg = rd32(hw, GLNVM_ULD) & uld_mask;
+ if (reg == uld_mask) {
ice_debug(hw, ICE_DBG_INIT,
"Global reset processes done. %d\n", cnt);
break;
diff --git a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
index e8f32350fed29..6f4a70fa39037 100644
--- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
+++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
@@ -276,8 +276,14 @@
#define GLNVM_GENS_SR_SIZE_S 5
#define GLNVM_GENS_SR_SIZE_M ICE_M(0x7, 5)
#define GLNVM_ULD 0x000B6008
+#define GLNVM_ULD_PCIER_DONE_M BIT(0)
+#define GLNVM_ULD_PCIER_DONE_1_M BIT(1)
#define GLNVM_ULD_CORER_DONE_M BIT(3)
#define GLNVM_ULD_GLOBR_DONE_M BIT(4)
+#define GLNVM_ULD_POR_DONE_M BIT(5)
+#define GLNVM_ULD_POR_DONE_1_M BIT(8)
+#define GLNVM_ULD_PCIER_DONE_2_M BIT(9)
+#define GLNVM_ULD_PE_DONE_M BIT(10)
#define GLPCI_CNF2 0x000BE004
#define GLPCI_CNF2_CACHELINE_SIZE_M BIT(1)
#define PF_FUNC_RID 0x0009E880
--
2.20.1



2020-03-03 17:49:36

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 068/176] ice: Use ice_pf_to_dev

From: Anirudh Venkataramanan <[email protected]>

[ Upstream commit 9a946843ba5c173e259fef7a035feac994a65b59 ]

Use ice_pf_to_dev(pf) instead of &pf->pdev->dev.
Use ice_pf_to_dev(vsi->back) instead of &vsi->back->pdev->dev.
When a pointer to the pf instance is available, use ice_pf_to_dev
instead of ice_hw_to_dev.
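
For reference, the helper is a thin accessor; a sketch of its shape
(the authoritative definition lives in the driver's ice.h):

  static inline struct device *ice_pf_to_dev(struct ice_pf *pf)
  {
          return &pf->pdev->dev;
  }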

Signed-off-by: Anirudh Venkataramanan <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_base.c | 12 ++++++------
drivers/net/ethernet/intel/ice/ice_dcb_nl.c | 2 +-
drivers/net/ethernet/intel/ice/ice_ethtool.c | 2 +-
drivers/net/ethernet/intel/ice/ice_lib.c | 14 +++++++-------
drivers/net/ethernet/intel/ice/ice_main.c | 16 ++++++++--------
drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 8 ++++----
6 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index 77d6a0291e975..6939c14858b20 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -320,7 +320,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
if (err)
return err;

- dev_info(&vsi->back->pdev->dev, "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n",
+ dev_info(ice_pf_to_dev(vsi->back), "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n",
ring->q_index);
} else {
if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))
@@ -399,7 +399,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
/* Absolute queue number out of 2K needs to be passed */
err = ice_write_rxq_ctx(hw, &rlan_ctx, pf_q);
if (err) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Failed to set LAN Rx queue context for absolute Rx queue %d error: %d\n",
pf_q, err);
return -EIO;
@@ -422,7 +422,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
ice_alloc_rx_bufs_slow_zc(ring, ICE_DESC_UNUSED(ring)) :
ice_alloc_rx_bufs(ring, ICE_DESC_UNUSED(ring));
if (err)
- dev_info(&vsi->back->pdev->dev,
+ dev_info(ice_pf_to_dev(vsi->back),
"Failed allocate some buffers on %sRx ring %d (pf_q %d)\n",
ring->xsk_umem ? "UMEM enabled " : "",
ring->q_index, pf_q);
@@ -817,13 +817,13 @@ ice_vsi_stop_tx_ring(struct ice_vsi *vsi, enum ice_disq_rst_src rst_src,
* queues at the hardware level anyway.
*/
if (status == ICE_ERR_RESET_ONGOING) {
- dev_dbg(&vsi->back->pdev->dev,
+ dev_dbg(ice_pf_to_dev(vsi->back),
"Reset in progress. LAN Tx queues already disabled\n");
} else if (status == ICE_ERR_DOES_NOT_EXIST) {
- dev_dbg(&vsi->back->pdev->dev,
+ dev_dbg(ice_pf_to_dev(vsi->back),
"LAN Tx queues do not exist, nothing to disable\n");
} else if (status) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Failed to disable LAN Tx queues, error: %d\n", status);
return -ENODEV;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
index 926c9772f0860..265cf69b321bf 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
@@ -882,7 +882,7 @@ ice_dcbnl_vsi_del_app(struct ice_vsi *vsi,
sapp.protocol = app->prot_id;
sapp.priority = app->priority;
err = ice_dcbnl_delapp(vsi->netdev, &sapp);
- dev_dbg(&vsi->back->pdev->dev,
+ dev_dbg(ice_pf_to_dev(vsi->back),
"Deleting app for VSI idx=%d err=%d sel=%d proto=0x%x, prio=%d\n",
vsi->idx, err, app->selector, app->prot_id, app->priority);
}
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index f956f7bb4ef2d..9bd166e3dff3d 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -1054,7 +1054,7 @@ ice_set_fecparam(struct net_device *netdev, struct ethtool_fecparam *fecparam)
fec = ICE_FEC_NONE;
break;
default:
- dev_warn(&vsi->back->pdev->dev, "Unsupported FEC mode: %d\n",
+ dev_warn(ice_pf_to_dev(vsi->back), "Unsupported FEC mode: %d\n",
fecparam->fec);
return -EINVAL;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index e0e3c6400e4b9..b43bb51f6067a 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -116,7 +116,7 @@ static void ice_vsi_set_num_desc(struct ice_vsi *vsi)
vsi->num_tx_desc = ICE_DFLT_NUM_TX_DESC;
break;
default:
- dev_dbg(&vsi->back->pdev->dev,
+ dev_dbg(ice_pf_to_dev(vsi->back),
"Not setting number of Tx/Rx descriptors for VSI type %d\n",
vsi->type);
break;
@@ -697,7 +697,7 @@ static void ice_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
vsi->num_txq = tx_count;

if (vsi->type == ICE_VSI_VF && vsi->num_txq != vsi->num_rxq) {
- dev_dbg(&vsi->back->pdev->dev, "VF VSI should have same number of Tx and Rx queues. Hence making them equal\n");
+ dev_dbg(ice_pf_to_dev(vsi->back), "VF VSI should have same number of Tx and Rx queues. Hence making them equal\n");
/* since there is a chance that num_rxq could have been changed
* in the above for loop, make num_txq equal to num_rxq.
*/
@@ -1306,7 +1306,7 @@ int ice_vsi_cfg_rxqs(struct ice_vsi *vsi)

err = ice_setup_rx_ctx(vsi->rx_rings[i]);
if (err) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"ice_setup_rx_ctx failed for RxQ %d, err %d\n",
i, err);
return err;
@@ -1476,7 +1476,7 @@ int ice_vsi_manage_vlan_insertion(struct ice_vsi *vsi)

status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
if (status) {
- dev_err(&vsi->back->pdev->dev, "update VSI for VLAN insert failed, err %d aq_err %d\n",
+ dev_err(ice_pf_to_dev(vsi->back), "update VSI for VLAN insert failed, err %d aq_err %d\n",
status, hw->adminq.sq_last_status);
ret = -EIO;
goto out;
@@ -1522,7 +1522,7 @@ int ice_vsi_manage_vlan_stripping(struct ice_vsi *vsi, bool ena)

status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
if (status) {
- dev_err(&vsi->back->pdev->dev, "update VSI for VLAN strip failed, ena = %d err %d aq_err %d\n",
+ dev_err(ice_pf_to_dev(vsi->back), "update VSI for VLAN strip failed, ena = %d err %d aq_err %d\n",
ena, status, hw->adminq.sq_last_status);
ret = -EIO;
goto out;
@@ -1696,7 +1696,7 @@ ice_vsi_set_q_vectors_reg_idx(struct ice_vsi *vsi)
struct ice_q_vector *q_vector = vsi->q_vectors[i];

if (!q_vector) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Failed to set reg_idx on q_vector %d VSI %d\n",
i, vsi->vsi_num);
goto clear_reg_idx;
@@ -2718,6 +2718,6 @@ ice_vsi_cfg_mac_fltr(struct ice_vsi *vsi, const u8 *macaddr, bool set)
status = ice_remove_mac(&vsi->back->hw, &tmp_add_list);

cfg_mac_fltr_exit:
- ice_free_fltr_list(&vsi->back->pdev->dev, &tmp_add_list);
+ ice_free_fltr_list(ice_pf_to_dev(vsi->back), &tmp_add_list);
return status;
}
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index b4cbeb4f3177f..c9b35b202639d 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -269,7 +269,7 @@ static int ice_cfg_promisc(struct ice_vsi *vsi, u8 promisc_m, bool set_promisc)
*/
static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
{
- struct device *dev = &vsi->back->pdev->dev;
+ struct device *dev = ice_pf_to_dev(vsi->back);
struct net_device *netdev = vsi->netdev;
bool promisc_forced_on = false;
struct ice_pf *pf = vsi->back;
@@ -1364,7 +1364,7 @@ static int ice_force_phys_link_state(struct ice_vsi *vsi, bool link_up)
if (vsi->type != ICE_VSI_PF)
return 0;

- dev = &vsi->back->pdev->dev;
+ dev = ice_pf_to_dev(vsi->back);

pi = vsi->port_info;

@@ -1682,7 +1682,7 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
*/
static int ice_xdp_alloc_setup_rings(struct ice_vsi *vsi)
{
- struct device *dev = &vsi->back->pdev->dev;
+ struct device *dev = ice_pf_to_dev(vsi->back);
int i;

for (i = 0; i < vsi->num_xdp_txq; i++) {
@@ -3858,14 +3858,14 @@ ice_set_features(struct net_device *netdev, netdev_features_t features)

/* Don't set any netdev advanced features with device in Safe Mode */
if (ice_is_safe_mode(vsi->back)) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Device is in Safe Mode - not enabling advanced netdev features\n");
return ret;
}

/* Do not change setting during reset */
if (ice_is_reset_in_progress(pf->state)) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Device is resetting, changing advanced netdev features temporarily unavailable.\n");
return -EBUSY;
}
@@ -4408,7 +4408,7 @@ int ice_vsi_setup_tx_rings(struct ice_vsi *vsi)
int i, err = 0;

if (!vsi->num_txq) {
- dev_err(&vsi->back->pdev->dev, "VSI %d has 0 Tx queues\n",
+ dev_err(ice_pf_to_dev(vsi->back), "VSI %d has 0 Tx queues\n",
vsi->vsi_num);
return -EINVAL;
}
@@ -4439,7 +4439,7 @@ int ice_vsi_setup_rx_rings(struct ice_vsi *vsi)
int i, err = 0;

if (!vsi->num_rxq) {
- dev_err(&vsi->back->pdev->dev, "VSI %d has 0 Rx queues\n",
+ dev_err(ice_pf_to_dev(vsi->back), "VSI %d has 0 Rx queues\n",
vsi->vsi_num);
return -EINVAL;
}
@@ -4968,7 +4968,7 @@ static int ice_vsi_update_bridge_mode(struct ice_vsi *vsi, u16 bmode)

status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
if (status) {
- dev_err(&vsi->back->pdev->dev, "update VSI for bridge mode failed, bmode = %d err %d aq_err %d\n",
+ dev_err(ice_pf_to_dev(vsi->back), "update VSI for bridge mode failed, bmode = %d err %d aq_err %d\n",
bmode, status, hw->adminq.sq_last_status);
ret = -EIO;
goto out;
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index edb374296d1f3..e2114f24a19e9 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -508,7 +508,7 @@ static int ice_vsi_manage_pvid(struct ice_vsi *vsi, u16 vid, bool enable)

status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
if (status) {
- dev_info(&vsi->back->pdev->dev, "update VSI for port VLAN failed, err %d aq_err %d\n",
+ dev_info(ice_pf_to_dev(vsi->back), "update VSI for port VLAN failed, err %d aq_err %d\n",
status, hw->adminq.sq_last_status);
ret = -EIO;
goto out;
@@ -2019,7 +2019,7 @@ static int ice_vc_ena_qs_msg(struct ice_vf *vf, u8 *msg)
continue;

if (ice_vsi_ctrl_rx_ring(vsi, true, vf_q_id)) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Failed to enable Rx ring %d on VSI %d\n",
vf_q_id, vsi->vsi_num);
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
@@ -2122,7 +2122,7 @@ static int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg)

if (ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, vf->vf_id,
ring, &txq_meta)) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Failed to stop Tx ring %d on VSI %d\n",
vf_q_id, vsi->vsi_num);
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
@@ -2149,7 +2149,7 @@ static int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg)
continue;

if (ice_vsi_ctrl_rx_ring(vsi, false, vf_q_id)) {
- dev_err(&vsi->back->pdev->dev,
+ dev_err(ice_pf_to_dev(vsi->back),
"Failed to stop Rx ring %d on VSI %d\n",
vf_q_id, vsi->vsi_num);
v_ret = VIRTCHNL_STATUS_ERR_PARAM;
--
2.20.1



2020-03-03 17:49:38

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 070/176] io-wq: don't call kXalloc_node() with non-online node

From: Jens Axboe <[email protected]>

[ Upstream commit 7563439adfae153b20331f1567c8b5d0e5cbd8a7 ]

Glauber reports a crash on init on a box he has:

RIP: 0010:__alloc_pages_nodemask+0x132/0x340
Code: 18 01 75 04 41 80 ce 80 89 e8 48 8b 54 24 08 8b 74 24 1c c1 e8 0c 48 8b 3c 24 83 e0 01 88 44 24 20 48 85 d2 0f 85 74 01 00 00 <3b> 77 08 0f 82 6b 01 00 00 48 89 7c 24 10 89 ea 48 8b 07 b9 00 02
RSP: 0018:ffffb8be4d0b7c28 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000e8e8
RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000002080
RBP: 0000000000012cc0 R08: 0000000000000000 R09: 0000000000000002
R10: 0000000000000dc0 R11: ffff995c60400100 R12: 0000000000000000
R13: 0000000000012cc0 R14: 0000000000000001 R15: ffff995c60db00f0
FS: 00007f4d115ca900(0000) GS:ffff995c60d80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000002088 CR3: 00000017cca66002 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
alloc_slab_page+0x46/0x320
new_slab+0x9d/0x4e0
___slab_alloc+0x507/0x6a0
? io_wq_create+0xb4/0x2a0
__slab_alloc+0x1c/0x30
kmem_cache_alloc_node_trace+0xa6/0x260
io_wq_create+0xb4/0x2a0
io_uring_setup+0x97f/0xaa0
? io_remove_personalities+0x30/0x30
? io_poll_trigger_evfd+0x30/0x30
do_syscall_64+0x5b/0x1c0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f4d116cb1ed

which is due to the 'wqe' and 'worker' allocations being node affine.
But it isn't valid to call the node-affine allocator if the node isn't
online.

Set up structures even for offline nodes, as usual, but skip them for
thread setup so as not to waste resources. If the node isn't online,
just allocate the memory with NUMA_NO_NODE.
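
In other words, a minimal sketch of the fallback (names follow the
patch below):

  int alloc_node = node_online(node) ? node : NUMA_NO_NODE;
  struct io_wqe *wqe;

  /* NUMA_NO_NODE lets the allocator pick any online node instead of
   * dereferencing state for a node that was never brought up. */
  wqe = kzalloc_node(sizeof(*wqe), GFP_KERNEL, alloc_node);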

Reported-by: Glauber Costa <[email protected]>
Tested-by: Glauber Costa <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/io-wq.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 0dc4bb6de6566..25ffb6685baea 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -666,11 +666,16 @@ static int io_wq_manager(void *data)
/* create fixed workers */
refcount_set(&wq->refs, workers_to_create);
for_each_node(node) {
+ if (!node_online(node))
+ continue;
if (!create_io_worker(wq, wq->wqes[node], IO_WQ_ACCT_BOUND))
goto err;
workers_to_create--;
}

+ while (workers_to_create--)
+ refcount_dec(&wq->refs);
+
complete(&wq->done);

while (!kthread_should_stop()) {
@@ -678,6 +683,9 @@ static int io_wq_manager(void *data)
struct io_wqe *wqe = wq->wqes[node];
bool fork_worker[2] = { false, false };

+ if (!node_online(node))
+ continue;
+
spin_lock_irq(&wqe->lock);
if (io_wqe_need_worker(wqe, IO_WQ_ACCT_BOUND))
fork_worker[IO_WQ_ACCT_BOUND] = true;
@@ -793,7 +801,9 @@ static bool io_wq_for_each_worker(struct io_wqe *wqe,

list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
if (io_worker_get(worker)) {
- ret = func(worker, data);
+ /* no task if node is/was offline */
+ if (worker->task)
+ ret = func(worker, data);
io_worker_release(worker);
if (ret)
break;
@@ -1006,6 +1016,8 @@ void io_wq_flush(struct io_wq *wq)
for_each_node(node) {
struct io_wqe *wqe = wq->wqes[node];

+ if (!node_online(node))
+ continue;
init_completion(&data.done);
INIT_IO_WORK(&data.work, io_wq_flush_func);
data.work.flags |= IO_WQ_WORK_INTERNAL;
@@ -1038,12 +1050,15 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)

for_each_node(node) {
struct io_wqe *wqe;
+ int alloc_node = node;

- wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, node);
+ if (!node_online(alloc_node))
+ alloc_node = NUMA_NO_NODE;
+ wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
if (!wqe)
goto err;
wq->wqes[node] = wqe;
- wqe->node = node;
+ wqe->node = alloc_node;
wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
atomic_set(&wqe->acct[IO_WQ_ACCT_BOUND].nr_running, 0);
if (wq->user) {
@@ -1051,7 +1066,6 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
task_rlimit(current, RLIMIT_NPROC);
}
atomic_set(&wqe->acct[IO_WQ_ACCT_UNBOUND].nr_running, 0);
- wqe->node = node;
wqe->wq = wq;
spin_lock_init(&wqe->lock);
INIT_WQ_LIST(&wqe->work_list);
--
2.20.1



2020-03-03 17:49:38

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 071/176] cifs: Fix mode output in debugging statements

From: Frank Sorenson <[email protected]>

[ Upstream commit f52aa79df43c4509146140de0241bc21a4a3b4c7 ]

A number of the debug statements output file or directory mode
in hex. Change these to print using octal.
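
A small illustration of why hex is misleading for mode bits (sketch;
the mode value is hypothetical):

  umode_t mode = 0755;

  cifs_dbg(FYI, "mode = 0x%x\n", mode);  /* prints 0x1ed - unreadable */
  cifs_dbg(FYI, "mode = %04o\n", mode);  /* prints 0755 - matches chmod */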

Signed-off-by: Frank Sorenson <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/cifs/cifsacl.c | 4 ++--
fs/cifs/connect.c | 2 +-
fs/cifs/inode.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index fb41e51dd5743..25704beb9d4ca 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -601,7 +601,7 @@ static void access_flags_to_mode(__le32 ace_flags, int type, umode_t *pmode,
((flags & FILE_EXEC_RIGHTS) == FILE_EXEC_RIGHTS))
*pmode |= (S_IXUGO & (*pbits_to_set));

- cifs_dbg(NOISY, "access flags 0x%x mode now 0x%x\n", flags, *pmode);
+ cifs_dbg(NOISY, "access flags 0x%x mode now %04o\n", flags, *pmode);
return;
}

@@ -630,7 +630,7 @@ static void mode_to_access_flags(umode_t mode, umode_t bits_to_use,
if (mode & S_IXUGO)
*pace_flags |= SET_FILE_EXEC_RIGHTS;

- cifs_dbg(NOISY, "mode: 0x%x, access flags now 0x%x\n",
+ cifs_dbg(NOISY, "mode: %04o, access flags now 0x%x\n",
mode, *pace_flags);
return;
}
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 0aa3623ae0e16..641825cfa7670 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4151,7 +4151,7 @@ int cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
cifs_sb->mnt_gid = pvolume_info->linux_gid;
cifs_sb->mnt_file_mode = pvolume_info->file_mode;
cifs_sb->mnt_dir_mode = pvolume_info->dir_mode;
- cifs_dbg(FYI, "file mode: 0x%hx dir mode: 0x%hx\n",
+ cifs_dbg(FYI, "file mode: %04ho dir mode: %04ho\n",
cifs_sb->mnt_file_mode, cifs_sb->mnt_dir_mode);

cifs_sb->actimeo = pvolume_info->actimeo;
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index ca76a9287456f..b3f3675e18788 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1649,7 +1649,7 @@ int cifs_mkdir(struct inode *inode, struct dentry *direntry, umode_t mode)
struct TCP_Server_Info *server;
char *full_path;

- cifs_dbg(FYI, "In cifs_mkdir, mode = 0x%hx inode = 0x%p\n",
+ cifs_dbg(FYI, "In cifs_mkdir, mode = %04ho inode = 0x%p\n",
mode, inode);

cifs_sb = CIFS_SB(inode->i_sb);
--
2.20.1



2020-03-03 17:49:43

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 041/176] perf/x86/cstate: Add Tremont support

From: Kan Liang <[email protected]>

[ Upstream commit ecf71fbccb9ac5cb964eb7de59bb9da3755b7885 ]

Tremont is Intel's successor to Goldmont Plus. From the perspective
of the Intel cstate residency counters, nothing has changed compared
with Goldmont Plus and Goldmont.

Share glm_cstates with Goldmont Plus and Goldmont.
Update the comments for Tremont.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/events/intel/cstate.c | 22 +++++++++++++---------
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index e1daf4151e116..4814c964692cb 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,17 +40,18 @@
* Model specific counters:
* MSR_CORE_C1_RES: CORE C1 Residency Counter
* perf code: 0x00
- * Available model: SLM,AMT,GLM,CNL
+ * Available model: SLM,AMT,GLM,CNL,TNT
* Scope: Core (each processor core has a MSR)
* MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
* perf code: 0x01
* Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,GLM,
- * CNL,KBL,CML
+ * CNL,KBL,CML,TNT
* Scope: Core
* MSR_CORE_C6_RESIDENCY: CORE C6 Residency Counter
* perf code: 0x02
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
- * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL
+ * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL,
+ * TNT
* Scope: Core
* MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
* perf code: 0x03
@@ -60,17 +61,18 @@
* MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter.
* perf code: 0x00
* Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
- * KBL,CML,ICL,TGL
+ * KBL,CML,ICL,TGL,TNT
* Scope: Package (physical package)
* MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter.
* perf code: 0x01
* Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
- * GLM,CNL,KBL,CML,ICL,TGL
+ * GLM,CNL,KBL,CML,ICL,TGL,TNT
* Scope: Package (physical package)
* MSR_PKG_C6_RESIDENCY: Package C6 Residency Counter.
* perf code: 0x02
- * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW
- * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL
+ * Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
+ * SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL,
+ * TNT
* Scope: Package (physical package)
* MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter.
* perf code: 0x03
@@ -87,7 +89,8 @@
* Scope: Package (physical package)
* MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
* perf code: 0x06
- * Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL
+ * Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL,
+ * TNT
* Scope: Package (physical package)
*
*/
@@ -640,8 +643,9 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {

X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT, glm_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_D, glm_cstates),
-
X86_CSTATES_MODEL(INTEL_FAM6_ATOM_GOLDMONT_PLUS, glm_cstates),
+ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_TREMONT_D, glm_cstates),
+ X86_CSTATES_MODEL(INTEL_FAM6_ATOM_TREMONT, glm_cstates),

X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE_L, icl_cstates),
X86_CSTATES_MODEL(INTEL_FAM6_ICELAKE, icl_cstates),
--
2.20.1



2020-03-03 17:49:44

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 044/176] ARM: dts: sti: fixup sound frame-inversion for stihxxx-b2120.dtsi

From: Kuninori Morimoto <[email protected]>

[ Upstream commit f24667779b5348279e5e4328312a141a730a1fc7 ]

frame-inversion is "flag" not "uint32".
This patch fixup it.

Signed-off-by: Kuninori Morimoto <[email protected]>
Reviewed-by: Patrice Chotard <[email protected]>
Signed-off-by: Patrice Chotard <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/boot/dts/stihxxx-b2120.dtsi | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/stihxxx-b2120.dtsi b/arch/arm/boot/dts/stihxxx-b2120.dtsi
index 60e11045ad762..d051f080e52ec 100644
--- a/arch/arm/boot/dts/stihxxx-b2120.dtsi
+++ b/arch/arm/boot/dts/stihxxx-b2120.dtsi
@@ -46,7 +46,7 @@
/* DAC */
format = "i2s";
mclk-fs = <256>;
- frame-inversion = <1>;
+ frame-inversion;
cpu {
sound-dai = <&sti_uni_player2>;
};
--
2.20.1



2020-03-03 17:49:50

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 073/176] cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE

From: Sergey Matyukevich <[email protected]>

[ Upstream commit ea75080110a4c1fa011b0a73cb8f42227143ee3e ]

The nl80211_policy entry is missing for the NL80211_ATTR_STATUS_CODE
attribute. As a result, for strictly validated commands, the attribute
is assumed not to be supported.

Signed-off-by: Sergey Matyukevich <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/wireless/nl80211.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 1e97ac5435b23..118a98de516cd 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -437,6 +437,7 @@ const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = {
[NL80211_ATTR_CONTROL_PORT_NO_ENCRYPT] = { .type = NLA_FLAG },
[NL80211_ATTR_CONTROL_PORT_OVER_NL80211] = { .type = NLA_FLAG },
[NL80211_ATTR_PRIVACY] = { .type = NLA_FLAG },
+ [NL80211_ATTR_STATUS_CODE] = { .type = NLA_U16 },
[NL80211_ATTR_CIPHER_SUITE_GROUP] = { .type = NLA_U32 },
[NL80211_ATTR_WPA_VERSIONS] = { .type = NLA_U32 },
[NL80211_ATTR_PID] = { .type = NLA_U32 },
--
2.20.1



2020-03-03 17:49:51

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 082/176] scsi: zfcp: fix wrong data and display format of SFP+ temperature

From: Benjamin Block <[email protected]>

commit a3fd4bfe85fbb67cf4ec1232d0af625ece3c508b upstream.

When implementing support for retrieval of local diagnostic data from the
FCP channel, the wrong data format was assumed for the temperature of the
local SFP+ connector. The Fibre Channel Link Services (FC-LS-3)
specification is not clear on the format of the stored integer, and only
after consulting the SNIA specification SFF-8472 did we realize it is
stored as two's complement. Thus, the used data and display format is
wrong, and highly misleading for users when the temperature should drop
below 0°C (however unlikely that may be).

To fix this, change the data format in `struct fsf_qtcb_bottom_port` from
unsigned to signed, and change the printf format string used to generate
`zfcp_sysfs_adapter_diag_sfp_temperature_show()` from `%hu` to `%hd`.
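
A short sketch of the difference (the raw value is hypothetical):

  u16 raw = 0xfff6;            /* as stored by the SFP+ module */

  pr_info("%hu\n", raw);       /* prints 65526 - nonsense */
  pr_info("%hd\n", (s16)raw);  /* prints -10: two's complement */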

Link: https://lore.kernel.org/r/d6e3be5428da5c9490cfff4df7cae868bc9f1a7e.1582039501.git.bblock@linux.ibm.com
Fixes: a10a61e807b0 ("scsi: zfcp: support retrieval of SFP Data via Exchange Port Data")
Fixes: 6028f7c4cd87 ("scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver")
Cc: <[email protected]> # 5.5+
Reviewed-by: Jens Remus <[email protected]>
Reviewed-by: Fedor Loshakov <[email protected]>
Reviewed-by: Steffen Maier <[email protected]>
Signed-off-by: Benjamin Block <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/s390/scsi/zfcp_fsf.h | 2 +-
drivers/s390/scsi/zfcp_sysfs.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -410,7 +410,7 @@ struct fsf_qtcb_bottom_port {
u8 cb_util;
u8 a_util;
u8 res2;
- u16 temperature;
+ s16 temperature;
u16 vcc;
u16 tx_bias;
u16 tx_power;
--- a/drivers/s390/scsi/zfcp_sysfs.c
+++ b/drivers/s390/scsi/zfcp_sysfs.c
@@ -800,7 +800,7 @@ static ZFCP_DEV_ATTR(adapter_diag, b2b_c
static ZFCP_DEV_ATTR(adapter_diag_sfp, _name, 0400, \
zfcp_sysfs_adapter_diag_sfp_##_name##_show, NULL)

-ZFCP_DEFINE_DIAG_SFP_ATTR(temperature, temperature, 5, "%hu");
+ZFCP_DEFINE_DIAG_SFP_ATTR(temperature, temperature, 6, "%hd");
ZFCP_DEFINE_DIAG_SFP_ATTR(vcc, vcc, 5, "%hu");
ZFCP_DEFINE_DIAG_SFP_ATTR(tx_bias, tx_bias, 5, "%hu");
ZFCP_DEFINE_DIAG_SFP_ATTR(tx_power, tx_power, 5, "%hu");


2020-03-03 17:49:54

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 085/176] audit: fix error handling in audit_data_to_entry()

From: Paul Moore <[email protected]>

commit 2ad3e17ebf94b7b7f3f64c050ff168f9915345eb upstream.

Commit 219ca39427bf ("audit: use union for audit_field values since
they are mutually exclusive") combined a number of separate fields in
the audit_field struct into a single union. This generally worked
just fine because the fields are mutually exclusive.
Unfortunately, in audit_data_to_entry() the overlap can be a problem
when a specific error case is triggered that causes the error path
code to attempt to clean up an audit_field struct, and the cleanup
involves attempting to free a stored LSM string (the lsm_str field).
Currently the code always has a non-NULL value in the
audit_field.lsm_str field because the top of the for-loop transfers a
value into audit_field.val (both .lsm_str and .val are part of the
same union); if audit_data_to_entry() fails and the audit_field
struct is specified to contain an LSM string, but audit_field.lsm_str
has not yet been properly set, the error handling code will attempt
to free the bogus audit_field.lsm_str value that was set with
audit_field.val at the top of the for-loop.

This patch corrects this by ensuring that the audit_field.val is only
set when needed (it is cleared when the audit_field struct is
allocated with kcalloc()). It also corrects a few other issues to
ensure that in case of error the proper error code is returned.

Cc: [email protected]
Fixes: 219ca39427bf ("audit: use union for audit_field values since they are mutually exclusive")
Reported-by: [email protected]
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/auditfilter.c | 71 ++++++++++++++++++++++++++++-----------------------
1 file changed, 39 insertions(+), 32 deletions(-)

--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -456,6 +456,7 @@ static struct audit_entry *audit_data_to
bufp = data->buf;
for (i = 0; i < data->field_count; i++) {
struct audit_field *f = &entry->rule.fields[i];
+ u32 f_val;

err = -EINVAL;

@@ -464,12 +465,12 @@ static struct audit_entry *audit_data_to
goto exit_free;

f->type = data->fields[i];
- f->val = data->values[i];
+ f_val = data->values[i];

/* Support legacy tests for a valid loginuid */
- if ((f->type == AUDIT_LOGINUID) && (f->val == AUDIT_UID_UNSET)) {
+ if ((f->type == AUDIT_LOGINUID) && (f_val == AUDIT_UID_UNSET)) {
f->type = AUDIT_LOGINUID_SET;
- f->val = 0;
+ f_val = 0;
entry->rule.pflags |= AUDIT_LOGINUID_LEGACY;
}

@@ -485,7 +486,7 @@ static struct audit_entry *audit_data_to
case AUDIT_SUID:
case AUDIT_FSUID:
case AUDIT_OBJ_UID:
- f->uid = make_kuid(current_user_ns(), f->val);
+ f->uid = make_kuid(current_user_ns(), f_val);
if (!uid_valid(f->uid))
goto exit_free;
break;
@@ -494,11 +495,12 @@ static struct audit_entry *audit_data_to
case AUDIT_SGID:
case AUDIT_FSGID:
case AUDIT_OBJ_GID:
- f->gid = make_kgid(current_user_ns(), f->val);
+ f->gid = make_kgid(current_user_ns(), f_val);
if (!gid_valid(f->gid))
goto exit_free;
break;
case AUDIT_ARCH:
+ f->val = f_val;
entry->rule.arch_f = f;
break;
case AUDIT_SUBJ_USER:
@@ -511,11 +513,13 @@ static struct audit_entry *audit_data_to
case AUDIT_OBJ_TYPE:
case AUDIT_OBJ_LEV_LOW:
case AUDIT_OBJ_LEV_HIGH:
- str = audit_unpack_string(&bufp, &remain, f->val);
- if (IS_ERR(str))
+ str = audit_unpack_string(&bufp, &remain, f_val);
+ if (IS_ERR(str)) {
+ err = PTR_ERR(str);
goto exit_free;
- entry->rule.buflen += f->val;
-
+ }
+ entry->rule.buflen += f_val;
+ f->lsm_str = str;
err = security_audit_rule_init(f->type, f->op, str,
(void **)&f->lsm_rule);
/* Keep currently invalid fields around in case they
@@ -524,68 +528,71 @@ static struct audit_entry *audit_data_to
pr_warn("audit rule for LSM \'%s\' is invalid\n",
str);
err = 0;
- }
- if (err) {
- kfree(str);
+ } else if (err)
goto exit_free;
- } else
- f->lsm_str = str;
break;
case AUDIT_WATCH:
- str = audit_unpack_string(&bufp, &remain, f->val);
- if (IS_ERR(str))
+ str = audit_unpack_string(&bufp, &remain, f_val);
+ if (IS_ERR(str)) {
+ err = PTR_ERR(str);
goto exit_free;
- entry->rule.buflen += f->val;
-
- err = audit_to_watch(&entry->rule, str, f->val, f->op);
+ }
+ err = audit_to_watch(&entry->rule, str, f_val, f->op);
if (err) {
kfree(str);
goto exit_free;
}
+ entry->rule.buflen += f_val;
break;
case AUDIT_DIR:
- str = audit_unpack_string(&bufp, &remain, f->val);
- if (IS_ERR(str))
+ str = audit_unpack_string(&bufp, &remain, f_val);
+ if (IS_ERR(str)) {
+ err = PTR_ERR(str);
goto exit_free;
- entry->rule.buflen += f->val;
-
+ }
err = audit_make_tree(&entry->rule, str, f->op);
kfree(str);
if (err)
goto exit_free;
+ entry->rule.buflen += f_val;
break;
case AUDIT_INODE:
+ f->val = f_val;
err = audit_to_inode(&entry->rule, f);
if (err)
goto exit_free;
break;
case AUDIT_FILTERKEY:
- if (entry->rule.filterkey || f->val > AUDIT_MAX_KEY_LEN)
+ if (entry->rule.filterkey || f_val > AUDIT_MAX_KEY_LEN)
goto exit_free;
- str = audit_unpack_string(&bufp, &remain, f->val);
- if (IS_ERR(str))
+ str = audit_unpack_string(&bufp, &remain, f_val);
+ if (IS_ERR(str)) {
+ err = PTR_ERR(str);
goto exit_free;
- entry->rule.buflen += f->val;
+ }
+ entry->rule.buflen += f_val;
entry->rule.filterkey = str;
break;
case AUDIT_EXE:
- if (entry->rule.exe || f->val > PATH_MAX)
+ if (entry->rule.exe || f_val > PATH_MAX)
goto exit_free;
- str = audit_unpack_string(&bufp, &remain, f->val);
+ str = audit_unpack_string(&bufp, &remain, f_val);
if (IS_ERR(str)) {
err = PTR_ERR(str);
goto exit_free;
}
- entry->rule.buflen += f->val;
-
- audit_mark = audit_alloc_mark(&entry->rule, str, f->val);
+ audit_mark = audit_alloc_mark(&entry->rule, str, f_val);
if (IS_ERR(audit_mark)) {
kfree(str);
err = PTR_ERR(audit_mark);
goto exit_free;
}
+ entry->rule.buflen += f_val;
entry->rule.exe = audit_mark;
break;
+ default:
+ f->val = f_val;
+ break;
}
}



2020-03-03 17:49:55

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 087/176] ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro

From: Mika Westerberg <[email protected]>

commit 1dade3a7048ccfc675650cd2cf13d578b095e5fb upstream.

Sometimes it is useful to find the access_width field value in bytes
and not in bits, so add a helper that can be used for this purpose.
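
The arithmetic, for illustration (access_width encodes 1 as byte
access, 2 as word, 3 as dword, 4 as qword):

  ACPI_ACCESS_BIT_WIDTH(2)   /* 1 << 4 == 16 bits */
  ACPI_ACCESS_BYTE_WIDTH(2)  /* 1 << 1 == 2 bytes */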

Suggested-by: Jean Delvare <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Cc: 4.16+ <[email protected]> # 4.16+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/acpi/actypes.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/include/acpi/actypes.h
+++ b/include/acpi/actypes.h
@@ -532,11 +532,12 @@ typedef u64 acpi_integer;
strnlen (a, ACPI_NAMESEG_SIZE) == ACPI_NAMESEG_SIZE)

/*
- * Algorithm to obtain access bit width.
+ * Algorithm to obtain access bit or byte width.
* Can be used with access_width of struct acpi_generic_address and access_size of
* struct acpi_resource_generic_register.
*/
#define ACPI_ACCESS_BIT_WIDTH(size) (1 << ((size) + 2))
+#define ACPI_ACCESS_BYTE_WIDTH(size) (1 << ((size) - 1))

/*******************************************************************************
*


2020-03-03 17:49:59

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 091/176] HID: core: fix off-by-one memset in hid_report_raw_event()

From: Johan Korsnes <[email protected]>

commit 5ebdffd25098898aff1249ae2f7dbfddd76d8f8f upstream.

In case a report is greater than HID_MAX_BUFFER_SIZE, it is truncated,
but the report-number byte is not correctly handled. This results in
an off-by-one in the following memset, causing a kernel Oops and an
ensuing
system crash.

Note: With commit 8ec321e96e05 ("HID: Fix slab-out-of-bounds read in
hid_field_extract") I no longer hit the kernel Oops as we instead fail
"controlled" at probe if there is a report too long in the HID
report-descriptor. hid_report_raw_event() is an exported symbol, so
presumably we cannot always rely on this being the case.
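
The buffer math, sketched: for a numbered report, the first byte of
the raw buffer holds the report ID, leaving HID_MAX_BUFFER_SIZE - 1
bytes for the payload. Capping rsize at HID_MAX_BUFFER_SIZE therefore
lets the zero-fill run one byte past the end of the buffer:

  /* csize bytes received, rsize expected; zero the shortfall */
  memset(data + csize, 0, rsize - csize);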

Fixes: 966922f26c7f ("HID: fix a crash in hid_report_raw_event()
function.")
Signed-off-by: Johan Korsnes <[email protected]>
Cc: Armando Visconti <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Alan Stern <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/hid-core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1741,7 +1741,9 @@ int hid_report_raw_event(struct hid_devi

rsize = ((report->size - 1) >> 3) + 1;

- if (rsize > HID_MAX_BUFFER_SIZE)
+ if (report_enum->numbered && rsize >= HID_MAX_BUFFER_SIZE)
+ rsize = HID_MAX_BUFFER_SIZE - 1;
+ else if (rsize > HID_MAX_BUFFER_SIZE)
rsize = HID_MAX_BUFFER_SIZE;

if (csize < rsize) {


2020-03-03 17:50:02

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 050/176] amdgpu: Prevent build errors regarding soft/hard-float FP ABI tags

From: Daniel Kolesa <[email protected]>

[ Upstream commit 416611d9b6eebaeae58ed26cc7d23131c69126b1 ]

On PowerPC, the compiler will tag object files with whether they
use hard or soft float FP ABI and whether they use 64 or 128-bit
long double ABI. On systems with 64-bit long double ABI, a tag
will get emitted whenever a double is used, as on those systems
a long double is the same as a double. This will prevent linkage
as other files are being compiled with hard-float.

On ppc64, this code will never actually get used for the time
being, as the only currently existing hardware using it are the
Renoir APUs. Therefore, until this is testable and can be fixed
properly, at least make sure the build will not fail.

Signed-off-by: Daniel Kolesa <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
index b864869cc7e3e..6fa7422c51da5 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -91,6 +91,12 @@ ifdef CONFIG_DRM_AMD_DC_DCN2_1
###############################################################################
CLK_MGR_DCN21 = rn_clk_mgr.o rn_clk_mgr_vbios_smu.o

+# prevent build errors regarding soft-float vs hard-float FP ABI tags
+# this code is currently unused on ppc64, as it applies to Renoir APUs only
+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
+endif
+
AMD_DAL_CLK_MGR_DCN21 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21))

AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
--
2.20.1



2020-03-03 17:50:04

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 097/176] tracing: Disable trace_printk() on postponed tests

From: Steven Rostedt (VMware) <[email protected]>

commit 78041c0c9e935d9ce4086feeff6c569ed88ddfd4 upstream.

The tracing selftests check various aspects of the tracing
infrastructure, and one is filtering. If trace_printk() is active
during a self test, it can cause the filtering to fail, which will
disable that part of the trace.

To keep the selftests from failing because of trace_printk() calls,
trace_printk() checks the variable tracing_selftest_running, and if set, it
does not write to the tracing buffer.

As some tracers were registered earlier in boot, the selftests they
triggered would fail because not all the infrastructure was set up for
the full selftest. Thus, some of the tests were postponed to when
their infrastructure was ready (namely the file system code). The
postpone code did not set the tracing_selftest_running variable, and
could fail if a trace_printk() was added and executed during their run.
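
For reference, a simplified sketch of the guard that trace_printk()
takes in kernel/trace/trace.c:

  if (unlikely(tracing_selftest_running || tracing_disabled))
          return 0;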

Cc: [email protected]
Fixes: 9afecfbb95198 ("tracing: Postpone tracer start-up tests till the system is more robust")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/trace.c | 2 ++
1 file changed, 2 insertions(+)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1827,6 +1827,7 @@ static __init int init_trace_selftests(v

pr_info("Running postponed tracer tests:\n");

+ tracing_selftest_running = true;
list_for_each_entry_safe(p, n, &postponed_selftests, list) {
/* This loop can take minutes when sanitizers are enabled, so
* lets make sure we allow RCU processing.
@@ -1849,6 +1850,7 @@ static __init int init_trace_selftests(v
list_del(&p->list);
kfree(p);
}
+ tracing_selftest_running = false;

out:
mutex_unlock(&trace_types_lock);


2020-03-03 17:50:05

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 098/176] Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"

From: Orson Zhai <[email protected]>

commit 66d0e797bf095d407479c89952d42b1d96ef0a7f upstream.

This reverts commit 4585fbcb5331fc910b7e553ad3efd0dd7b320d14.

The name change to devfreq(X) breaks some user space applications,
such as the Android HAL from Unisoc and Hikey [1].
The device name changes unexpectedly after every boot, depending on
the module init sequence, which makes it troublesome to set up system
configuration such as SELinux for Android.

So we'd like to revert it back to old naming rule before any better
way being found.

[1] https://lkml.org/lkml/2018/5/8/1042

Cc: John Stultz <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Signed-off-by: Orson Zhai <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/devfreq/devfreq.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -738,7 +738,6 @@ struct devfreq *devfreq_add_device(struc
{
struct devfreq *devfreq;
struct devfreq_governor *governor;
- static atomic_t devfreq_no = ATOMIC_INIT(-1);
int err = 0;

if (!dev || !profile || !governor_name) {
@@ -800,8 +799,7 @@ struct devfreq *devfreq_add_device(struc
devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
atomic_set(&devfreq->suspend_count, 0);

- dev_set_name(&devfreq->dev, "devfreq%d",
- atomic_inc_return(&devfreq_no));
+ dev_set_name(&devfreq->dev, "%s", dev_name(dev));
err = device_register(&devfreq->dev);
if (err) {
mutex_unlock(&devfreq->lock);


2020-03-03 17:50:09

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 075/176] net: hns3: add management table after IMP reset

From: Yufeng Mo <[email protected]>

[ Upstream commit d0db7ed397517c8b2be24a0d1abfa15df776908e ]

In the current process, the management table is missing after the
IMP reset. This patch adds the management table to the reset process.

Fixes: f5aac71c0327 ("net: hns3: add manager table initialization for hardware")
Signed-off-by: Yufeng Mo <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 13dbd249f35fa..bfdb08572f0cc 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -9821,6 +9821,13 @@ static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev)
return ret;
}

+ ret = init_mgr_tbl(hdev);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "failed to reinit manager table, ret = %d\n", ret);
+ return ret;
+ }
+
ret = hclge_init_fd_config(hdev);
if (ret) {
dev_err(&pdev->dev, "fd table init fail, ret=%d\n", ret);
--
2.20.1



2020-03-03 17:50:10

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 076/176] net: hns3: fix VF bandwidth does not take effect in some case

From: Yonglong Liu <[email protected]>

[ Upstream commit 19eb1123b4e9337fe20b1763fec528f837ec6568 ]

When enabling 4 TCs after setting the bandwidth of a VF, the
bandwidth of the VF reverts to the default value, because the qset
resources change in this case.

This patch fixes it by using a fixed VF's qset resources according to
HNAE3_MAX_TC macro.

Fixes: ee9e44248f52 ("net: hns3: add support for configuring bandwidth of VF on the host")
Signed-off-by: Yonglong Liu <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
index 180224eab1ca4..28db13253a5e7 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
@@ -566,7 +566,7 @@ static void hclge_tm_vport_tc_info_update(struct hclge_vport *vport)
*/
kinfo->num_tc = vport->vport_id ? 1 :
min_t(u16, vport->alloc_tqps, hdev->tm_info.num_tc);
- vport->qs_offset = (vport->vport_id ? hdev->tm_info.num_tc : 0) +
+ vport->qs_offset = (vport->vport_id ? HNAE3_MAX_TC : 0) +
(vport->vport_id ? (vport->vport_id - 1) : 0);

max_rss_size = min_t(u16, hdev->rss_size_max,
--
2.20.1



2020-03-03 17:50:10

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 054/176] net: ena: fix uses of round_jiffies()

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit 2a6e5fa2f4c25b66c763428a3e65363214946931 ]

From the documentation of round_jiffies():
"Rounds a time delta in the future (in jiffies) up or down to
(approximately) full seconds. This is useful for timers for which
the exact time they fire does not matter too much, as long as
they fire approximately every X seconds.
By rounding these timers to whole seconds, all such timers will fire
at the same time, rather than at various times spread out. The goal
of this is to have the CPU wake up less, which saves power."

There are 2 parts to this patch:
================================
Part 1:
-------
In our case we need timer_service to be called approximately every
X=1 seconds, and the exact time does not matter, so using round_jiffies()
is the right way to go.

Therefore we add round_jiffies() to the mod_timer() in ena_timer_service().

Part 2:
-------
round_jiffies() is used in check_for_missing_keep_alive() when
getting the jiffies of the expiration of the keep_alive timeout. Here it
is actually a mistake to use round_jiffies() because we want the exact
time when keep_alive should expire and not an approximate rounded time,
which can cause early, false positive, timeouts.

Therefore we remove round_jiffies() in the calculation of
keep_alive_expired() in check_for_missing_keep_alive().
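
The distinction, sketched (names follow the patch below):

  /* Housekeeping timer: ~1 s granularity is fine, and rounding to a
   * whole second saves wakeups. */
  mod_timer(&adapter->timer_service, round_jiffies(jiffies + HZ));

  /* Deadline: round_jiffies() may round *down*, declaring the
   * keep-alive expired early, so no rounding here. */
  keep_alive_expired = adapter->last_keep_alive_jiffies +
                       adapter->keep_alive_timeout;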

Fixes: 82ef30f13be0 ("net: ena: add hardware hints capability to the driver")
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 948583fdcc286..1c1a41bd11daa 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3049,8 +3049,8 @@ static void check_for_missing_keep_alive(struct ena_adapter *adapter)
if (adapter->keep_alive_timeout == ENA_HW_HINTS_NO_TIMEOUT)
return;

- keep_alive_expired = round_jiffies(adapter->last_keep_alive_jiffies +
- adapter->keep_alive_timeout);
+ keep_alive_expired = adapter->last_keep_alive_jiffies +
+ adapter->keep_alive_timeout;
if (unlikely(time_is_before_jiffies(keep_alive_expired))) {
netif_err(adapter, drv, adapter->netdev,
"Keep alive watchdog timeout.\n");
@@ -3152,7 +3152,7 @@ static void ena_timer_service(struct timer_list *t)
}

/* Reset the timer */
- mod_timer(&adapter->timer_service, jiffies + HZ);
+ mod_timer(&adapter->timer_service, round_jiffies(jiffies + HZ));
}

static int ena_calc_max_io_queue_num(struct pci_dev *pdev,
--
2.20.1



2020-03-03 17:50:11

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 064/176] ice: Fix switch between FW and SW LLDP

From: Dave Ertman <[email protected]>

[ Upstream commit 53977ee47410885e7d4eee87d2c811a48a275150 ]

When switching between FW and SW LLDP mode, the
number of configured TLV apps in the driver's
DCB configuration gets out of sync with
what lldpad thinks is configured. This causes
a problem when shutting down lldpad: the cleanup
tries to delete TLV apps that are not defined
in the kernel.

Since the driver keeps an accurate account
of the apps defined, use the driver's number of
apps to determine if there is an app to delete.
If the number of apps is <= 1, then do not
attempt to delete.

Signed-off-by: Dave Ertman <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_dcb_nl.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
index d870c1aedc170..926c9772f0860 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
@@ -713,13 +713,13 @@ static int ice_dcbnl_delapp(struct net_device *netdev, struct dcb_app *app)
return -EINVAL;

mutex_lock(&pf->tc_mutex);
- ret = dcb_ieee_delapp(netdev, app);
- if (ret)
- goto delapp_out;
-
old_cfg = &pf->hw.port_info->local_dcbx_cfg;

- if (old_cfg->numapps == 1)
+ if (old_cfg->numapps <= 1)
+ goto delapp_out;
+
+ ret = dcb_ieee_delapp(netdev, app);
+ if (ret)
goto delapp_out;

new_cfg = &pf->hw.port_info->desired_dcbx_cfg;
--
2.20.1



2020-03-03 17:50:14

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 042/176] perf/x86/msr: Add Tremont support

From: Kan Liang <[email protected]>

[ Upstream commit 0aa0e0d6b34b89649e6b5882a7e025a0eb9bd832 ]

Tremont is Intel's successor to Goldmont Plus. The SMI_COUNT MSR is
also supported.

Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/events/msr.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index 6f86650b3f77d..a949f6f55991d 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -75,8 +75,9 @@ static bool test_intel(int idx, void *data)

case INTEL_FAM6_ATOM_GOLDMONT:
case INTEL_FAM6_ATOM_GOLDMONT_D:
-
case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
+ case INTEL_FAM6_ATOM_TREMONT_D:
+ case INTEL_FAM6_ATOM_TREMONT:

case INTEL_FAM6_XEON_PHI_KNL:
case INTEL_FAM6_XEON_PHI_KNM:
--
2.20.1



2020-03-03 17:50:15

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 079/176] nvme: prevent warning triggered by nvme_stop_keep_alive

From: Nigel Kirkland <[email protected]>

[ Upstream commit 97b2512ad000a409b4073dd1a71e4157d76675cb ]

Delayed keep-alive work is queued on the system workqueue and may be
cancelled via nvme_stop_keep_alive() from nvme_reset_wq, nvme_fc_wq or
nvme_wq.

check_flush_dependency() detects mismatched attributes between the
workqueue context used to cancel the keep-alive work and system-wq.
Specifically, system-wq does not have the WQ_MEM_RECLAIM flag, whereas
the contexts used to cancel keep-alive work do.

Example warning:

workqueue: WQ_MEM_RECLAIM nvme-reset-wq:nvme_fc_reset_ctrl_work [nvme_fc]
is flushing !WQ_MEM_RECLAIM events:nvme_keep_alive_work [nvme_core]

To avoid the flags mismatch, delayed keep alive work is queued on nvme_wq.
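
For context, nvme_wq itself is created with WQ_MEM_RECLAIM, which is
what satisfies the dependency check; a sketch with hypothetical names:

  /* A WQ_MEM_RECLAIM context may only flush work queued on another
   * WQ_MEM_RECLAIM workqueue; otherwise check_flush_dependency()
   * complains. */
  wq = alloc_workqueue("example_wq", WQ_MEM_RECLAIM, 0);
  queue_delayed_work(wq, &ka_work, kato * HZ);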

However this creates a secondary concern where work and a request to cancel
that work may be in the same work queue - namely err_work in the rdma and
tcp transports, which will want to flush/cancel the keep alive work which
will now be on nvme_wq.

After reviewing the transports, it looks like err_work can be moved to
nvme_reset_wq. In fact that aligns them better with transition into
RESETTING and performing related reset work in nvme_reset_wq.

Change nvme-rdma and nvme-tcp to perform err_work in nvme_reset_wq.

Signed-off-by: Nigel Kirkland <[email protected]>
Signed-off-by: James Smart <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvme/host/core.c | 10 +++++-----
drivers/nvme/host/rdma.c | 2 +-
drivers/nvme/host/tcp.c | 2 +-
3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 641c07347e8d8..ada59df642d29 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -66,8 +66,8 @@ MODULE_PARM_DESC(streams, "turn on support for Streams write directives");
* nvme_reset_wq - hosts nvme reset works
* nvme_delete_wq - hosts nvme delete works
*
- * nvme_wq will host works such are scan, aen handling, fw activation,
- * keep-alive error recovery, periodic reconnects etc. nvme_reset_wq
+ * nvme_wq will host works such as scan, aen handling, fw activation,
+ * keep-alive, periodic reconnects etc. nvme_reset_wq
* runs reset works which also flush works hosted on nvme_wq for
* serialization purposes. nvme_delete_wq host controller deletion
* works which flush reset works for serialization.
@@ -976,7 +976,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
startka = true;
spin_unlock_irqrestore(&ctrl->lock, flags);
if (startka)
- schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+ queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
}

static int nvme_keep_alive(struct nvme_ctrl *ctrl)
@@ -1006,7 +1006,7 @@ static void nvme_keep_alive_work(struct work_struct *work)
dev_dbg(ctrl->device,
"reschedule traffic based keep-alive timer\n");
ctrl->comp_seen = false;
- schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+ queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
return;
}

@@ -1023,7 +1023,7 @@ static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
if (unlikely(ctrl->kato == 0))
return;

- schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+ queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
}

void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 2a47c6c5007e1..3e85c5cacefd2 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1088,7 +1088,7 @@ static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING))
return;

- queue_work(nvme_wq, &ctrl->err_work);
+ queue_work(nvme_reset_wq, &ctrl->err_work);
}

static void nvme_rdma_wr_error(struct ib_cq *cq, struct ib_wc *wc,
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index f8fa5c5b79f17..49d4373b84eb3 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -422,7 +422,7 @@ static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
return;

- queue_work(nvme_wq, &to_tcp_ctrl(ctrl)->err_work);
+ queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
}

static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
--
2.20.1



2020-03-03 17:50:17

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 057/176] net: ena: rss: do not allocate key when not supported

From: Sameeh Jubran <[email protected]>

[ Upstream commit 6a4f7dc82d1e3abd3feb0c60b5041056fcd9880c ]

Currently we allocate the key whether the device supports setting the
key or not. This commit adds a check to the allocation function and
handles the error accordingly.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 24 ++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index d6b894b06fa30..6f758ece86f60 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -1057,6 +1057,20 @@ static void ena_com_hash_key_fill_default_key(struct ena_com_dev *ena_dev)
static int ena_com_hash_key_allocate(struct ena_com_dev *ena_dev)
{
struct ena_rss *rss = &ena_dev->rss;
+ struct ena_admin_feature_rss_flow_hash_control *hash_key;
+ struct ena_admin_get_feat_resp get_resp;
+ int rc;
+
+ hash_key = (ena_dev->rss).hash_key;
+
+ rc = ena_com_get_feature_ex(ena_dev, &get_resp,
+ ENA_ADMIN_RSS_HASH_FUNCTION,
+ ena_dev->rss.hash_key_dma_addr,
+ sizeof(ena_dev->rss.hash_key), 0);
+ if (unlikely(rc)) {
+ hash_key = NULL;
+ return -EOPNOTSUPP;
+ }

rss->hash_key =
dma_alloc_coherent(ena_dev->dmadev, sizeof(*rss->hash_key),
@@ -2640,11 +2654,15 @@ int ena_com_rss_init(struct ena_com_dev *ena_dev, u16 indr_tbl_log_size)
if (unlikely(rc))
goto err_indr_tbl;

+ /* The following function might return unsupported in case the
+ * device doesn't support setting the key / hash function. We can safely
+ * ignore this error and have indirection table support only.
+ */
rc = ena_com_hash_key_allocate(ena_dev);
- if (unlikely(rc))
+ if (unlikely(rc) && rc != -EOPNOTSUPP)
goto err_hash_key;
-
- ena_com_hash_key_fill_default_key(ena_dev);
+ else if (rc != -EOPNOTSUPP)
+ ena_com_hash_key_fill_default_key(ena_dev);

rc = ena_com_hash_ctrl_init(ena_dev);
if (unlikely(rc))
--
2.20.1



2020-03-03 17:50:20

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 081/176] scsi: sd_zbc: Fix sd_zbc_report_zones()

From: Damien Le Moal <[email protected]>

commit 51fdaa0490241e8cd41b40cbf43a336d1a014460 upstream.

The block layer generic blk_revalidate_disk_zones() checks the validity of
zone descriptors reported by a disk using the blk_revalidate_zone_cb()
callback function executed for each zone descriptor. If a ZBC disk reports
invalid zone descriptors, blk_revalidate_disk_zones() returns an error and
sd_zbc_read_zones() changes the disk capacity to 0, which in turn causes
the gendisk structure capacity to be set to 0. This all works well for
the first revalidate pass on a disk, and the block layer detects the
capacity change.

On the second revalidate pass, blk_revalidate_disk_zones() is called again
and sd_zbc_report_zones() executed to check the zones a second time.
However, for this second pass, the gendisk capacity is now 0, which causes
sd_zbc_report_zones() to do nothing and to report success with no
zones. blk_revalidate_disk_zones() in turn returns success and sets the
disk queue chunk_sectors limit to zero as no zones were checked, causing
an oops to trigger on the BUG_ON(!is_power_of_2(chunk_sectors)) in
blk_queue_chunk_sectors().

Fix this by using the sdkp capacity field rather than the gendisk capacity
for the report zones loop in sd_zbc_report_zones(). Also add a check to
immediately return an error if the sdkp capacity is 0. With this fix,
scanning an invalid/buggy ZBC disk does not trigger an oops and such disks
are exposed with a 0 capacity. This change also preserves the chance for
the disk to be correctly revalidated on the second revalidate pass, as the
scsi disk structure capacity field is always set to the disk-reported
value when sd_zbc_report_zones() is called.

Link: https://lore.kernel.org/r/[email protected]
Fixes: d41003513e61 ("block: rework zone reporting")
Cc: <[email protected]> # v5.5
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/sd_zbc.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -161,6 +161,7 @@ int sd_zbc_report_zones(struct gendisk *
unsigned int nr_zones, report_zones_cb cb, void *data)
{
struct scsi_disk *sdkp = scsi_disk(disk);
+ sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
unsigned int nr, i;
unsigned char *buf;
size_t offset, buflen = 0;
@@ -171,11 +172,15 @@ int sd_zbc_report_zones(struct gendisk *
/* Not a zoned device */
return -EOPNOTSUPP;

+ if (!capacity)
+ /* Device gone or invalid */
+ return -ENODEV;
+
buf = sd_zbc_alloc_report_buffer(sdkp, nr_zones, &buflen);
if (!buf)
return -ENOMEM;

- while (zone_idx < nr_zones && sector < get_capacity(disk)) {
+ while (zone_idx < nr_zones && sector < capacity) {
ret = sd_zbc_do_report_zones(sdkp, buf, buflen,
sectors_to_logical(sdkp->device, sector), true);
if (ret)


2020-03-03 17:50:23

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 101/176] io_uring: fix 32-bit compatibility with sendmsg/recvmsg

From: Jens Axboe <[email protected]>

commit d876836204897b6d7d911f942084f69a1e9d5c4d upstream.

We must set MSG_CMSG_COMPAT if we're in compatibility mode, otherwise
the iovec import for these commands will not do the right thing and will
fail the command with -EINVAL.

Found by running the test suite compiled as 32-bit.
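
(For context: a 32-bit task passes struct compat_msghdr with 32-bit iovec
pointers; MSG_CMSG_COMPAT is what makes the msghdr import take the compat
parsing path instead of treating the buffer as a native 64-bit structure.)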

Cc: [email protected]
Fixes: aa1fa28fc73e ("io_uring: add support for recvmsg()")
Fixes: 0fa03c624d8f ("io_uring: add support for sendmsg()")
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/io_uring.c | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2166,6 +2166,11 @@ static int io_sendmsg_prep(struct io_kio
sr->msg_flags = READ_ONCE(sqe->msg_flags);
sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr));

+#ifdef CONFIG_COMPAT
+ if (req->ctx->compat)
+ sr->msg_flags |= MSG_CMSG_COMPAT;
+#endif
+
if (!io)
return 0;

@@ -2258,6 +2263,11 @@ static int io_recvmsg_prep(struct io_kio
sr->msg_flags = READ_ONCE(sqe->msg_flags);
sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr));

+#ifdef CONFIG_COMPAT
+ if (req->ctx->compat)
+ sr->msg_flags |= MSG_CMSG_COMPAT;
+#endif
+
if (!io)
return 0;



2020-03-03 17:50:24

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 110/176] MIPS: VPE: Fix a double free and a memory leak in release_vpe()

From: Christophe JAILLET <[email protected]>

commit bef8e2dfceed6daeb6ca3e8d33f9c9d43b926580 upstream.

The pointer to the memory allocated by 'alloc_progmem()' is stored in
'v->load_addr'. So it is this memory that should be freed by
'release_progmem()'.

'release_progmem()' is only a call to 'kfree()'.

With the current code, there is both a double free and a memory leak.
Fix it by passing the correct pointer to 'release_progmem()'.
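
A compressed sketch of the broken flow (recall release_progmem() is just
kfree()):

  release_progmem(v);              /* kfree(v): frees the vpe itself...   */
  kfree(v);                        /* ...so this is a double free of 'v', */
                                   /* and v->load_addr is never freed     */

  /* After the fix: */
  release_progmem(v->load_addr);   /* frees the program memory */
  kfree(v);                        /* frees the vpe structure  */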

Fixes: e01402b115ccc ("More AP / SP bits for the 34K, the Malta bits and things. Still wants")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/mips/kernel/vpe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/mips/kernel/vpe.c
+++ b/arch/mips/kernel/vpe.c
@@ -134,7 +134,7 @@ void release_vpe(struct vpe *v)
{
list_del(&v->list);
if (v->load_addr)
- release_progmem(v);
+ release_progmem(v->load_addr);
kfree(v);
}



2020-03-03 17:50:26

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 077/176] net: hns3: fix a copying IPv6 address error in hclge_fd_get_flow_tuples()

From: Guangbin Huang <[email protected]>

[ Upstream commit 47327c9315b2f3ae4ab659457977a26669631f20 ]

The IPv6 address defined in struct in6_addr is specified as big endian,
but the corresponding field in struct hclge_fd_rule_tuples is little
endian, with no endianness annotation, so copying an IPv6 address between
these two structures directly with memcpy() produces wrong values.

This patch fixes the problem by converting the IPv6 address of struct
in6_addr with be32_to_cpu() before copying.
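
For example, the first word of 2001:db8:: sits in in6_addr as the byte
sequence 20 01 0d b8; memcpy() into the little-endian tuples field makes it
read back as 0xb80d0120, while be32_to_cpu() yields the intended 0x20010db8.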

Fixes: d93ed94fbeaf ("net: hns3: add aRFS support for PF")
Signed-off-by: Guangbin Huang <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index bfdb08572f0cc..5d74f5a60102a 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -6106,6 +6106,9 @@ static int hclge_get_all_rules(struct hnae3_handle *handle,
static void hclge_fd_get_flow_tuples(const struct flow_keys *fkeys,
struct hclge_fd_rule_tuples *tuples)
{
+#define flow_ip6_src fkeys->addrs.v6addrs.src.in6_u.u6_addr32
+#define flow_ip6_dst fkeys->addrs.v6addrs.dst.in6_u.u6_addr32
+
tuples->ether_proto = be16_to_cpu(fkeys->basic.n_proto);
tuples->ip_proto = fkeys->basic.ip_proto;
tuples->dst_port = be16_to_cpu(fkeys->ports.dst);
@@ -6114,12 +6117,12 @@ static void hclge_fd_get_flow_tuples(const struct flow_keys *fkeys,
tuples->src_ip[3] = be32_to_cpu(fkeys->addrs.v4addrs.src);
tuples->dst_ip[3] = be32_to_cpu(fkeys->addrs.v4addrs.dst);
} else {
- memcpy(tuples->src_ip,
- fkeys->addrs.v6addrs.src.in6_u.u6_addr32,
- sizeof(tuples->src_ip));
- memcpy(tuples->dst_ip,
- fkeys->addrs.v6addrs.dst.in6_u.u6_addr32,
- sizeof(tuples->dst_ip));
+ int i;
+
+ for (i = 0; i < IPV6_SIZE; i++) {
+ tuples->src_ip[i] = be32_to_cpu(flow_ip6_src[i]);
+ tuples->dst_ip[i] = be32_to_cpu(flow_ip6_dst[i]);
+ }
}
}

--
2.20.1



2020-03-03 17:50:27

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 112/176] i2c: altera: Fix potential integer overflow

From: Gustavo A. R. Silva <[email protected]>

commit 54498e8070e19e74498a72c7331348143e7e1f8c upstream.

Factor out 100 from the equation and do 32-bit arithmetic (3 * clk_mhz / 10)
instead of 64-bit.

Notice that clk_mhz is in MHz, so the multiplication will never wrap 32 bits
and there is no need for div_u64().
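
For example, a 100 MHz input clock gives 300 * 100 / 1000 = 30 cycles either
way, and the refactored form computes 3 * 100 / 10 = 30 in plain 32-bit
arithmetic; even an unrealistic clk_mhz of 4000 only reaches 3 * 4000 = 12000
before the division, far below the 32-bit limit.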

Addresses-Coverity: 1458369 ("Unintentional integer overflow")
Fixes: 0560ad576268 ("i2c: altera: Add Altera I2C Controller driver")
Suggested-by: David Laight <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Thor Thayer <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/i2c/busses/i2c-altera.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/i2c/busses/i2c-altera.c
+++ b/drivers/i2c/busses/i2c-altera.c
@@ -171,7 +171,7 @@ static void altr_i2c_init(struct altr_i2
/* SCL Low Time */
writel(t_low, idev->base + ALTR_I2C_SCL_LOW);
/* SDA Hold Time, 300ns */
- writel(div_u64(300 * clk_mhz, 1000), idev->base + ALTR_I2C_SDA_HOLD);
+ writel(3 * clk_mhz / 10, idev->base + ALTR_I2C_SDA_HOLD);

/* Mask all master interrupt bits */
altr_i2c_int_enable(idev, ALTR_I2C_ALL_IRQ, false);


2020-03-03 17:50:29

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

From: Oliver Upton <[email protected]>

commit 5ef8acbdd687c9d72582e2c05c0b9756efb37863 upstream.

Since commit 5f3d45e7f282 ("kvm/x86: add support for
MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
flag processor-based execution control for its L2 guest. KVM simply
forwards any MTF VM-exits to the L1 guest, which works for normal
instruction execution.

However, when KVM needs to emulate an instruction on behalf of an L2
guest, the monitor trap flag is not emulated. Add the necessary logic to
kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
instruction emulation for L2.

Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/svm.c | 1 +
arch/x86/kvm/vmx/nested.c | 35 ++++++++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/nested.h | 5 +++++
arch/x86/kvm/vmx/vmx.c | 37 ++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/vmx.h | 3 +++
arch/x86/kvm/x86.c | 2 ++
8 files changed, 83 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1092,6 +1092,7 @@ struct kvm_x86_ops {
void (*run)(struct kvm_vcpu *vcpu);
int (*handle_exit)(struct kvm_vcpu *vcpu);
int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+ void (*update_emulated_instruction)(struct kvm_vcpu *vcpu);
void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
void (*patch_hypercall)(struct kvm_vcpu *vcpu,
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -390,6 +390,7 @@ struct kvm_sync_regs {
#define KVM_STATE_NESTED_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_RUN_PENDING 0x00000002
#define KVM_STATE_NESTED_EVMCS 0x00000004
+#define KVM_STATE_NESTED_MTF_PENDING 0x00000008

#define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_SMM_VMXON 0x00000002
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -7311,6 +7311,7 @@ static struct kvm_x86_ops svm_x86_ops __
.run = svm_vcpu_run,
.handle_exit = handle_exit,
.skip_emulated_instruction = skip_emulated_instruction,
+ .update_emulated_instruction = NULL,
.set_interrupt_shadow = svm_set_interrupt_shadow,
.get_interrupt_shadow = svm_get_interrupt_shadow,
.patch_hypercall = svm_patch_hypercall,
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3616,8 +3616,15 @@ static int vmx_check_nested_events(struc
unsigned long exit_qual;
bool block_nested_events =
vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
+ bool mtf_pending = vmx->nested.mtf_pending;
struct kvm_lapic *apic = vcpu->arch.apic;

+ /*
+ * Clear the MTF state. If a higher priority VM-exit is delivered first,
+ * this state is discarded.
+ */
+ vmx->nested.mtf_pending = false;
+
if (lapic_in_kernel(vcpu) &&
test_bit(KVM_APIC_INIT, &apic->pending_events)) {
if (block_nested_events)
@@ -3628,8 +3635,28 @@ static int vmx_check_nested_events(struc
return 0;
}

+ /*
+ * Process any exceptions that are not debug traps before MTF.
+ */
+ if (vcpu->arch.exception.pending &&
+ !vmx_pending_dbg_trap(vcpu) &&
+ nested_vmx_check_exception(vcpu, &exit_qual)) {
+ if (block_nested_events)
+ return -EBUSY;
+ nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+ return 0;
+ }
+
+ if (mtf_pending) {
+ if (block_nested_events)
+ return -EBUSY;
+ nested_vmx_update_pending_dbg(vcpu);
+ nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
+ return 0;
+ }
+
if (vcpu->arch.exception.pending &&
- nested_vmx_check_exception(vcpu, &exit_qual)) {
+ nested_vmx_check_exception(vcpu, &exit_qual)) {
if (block_nested_events)
return -EBUSY;
nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
@@ -5742,6 +5769,9 @@ static int vmx_get_nested_state(struct k

if (vmx->nested.nested_run_pending)
kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
+
+ if (vmx->nested.mtf_pending)
+ kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
}
}

@@ -5922,6 +5952,9 @@ static int vmx_set_nested_state(struct k
vmx->nested.nested_run_pending =
!!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);

+ vmx->nested.mtf_pending =
+ !!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
+
ret = -EINVAL;
if (nested_cpu_has_shadow_vmcs(vmcs12) &&
vmcs12->vmcs_link_pointer != -1ull) {
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -176,6 +176,11 @@ static inline bool nested_cpu_has_virtua
return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
}

+static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
+{
+ return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
+}
+
static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
{
return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1595,6 +1595,40 @@ static int skip_emulated_instruction(str
return 1;
}

+
+/*
+ * Recognizes a pending MTF VM-exit and records the nested state for later
+ * delivery.
+ */
+static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+ struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+ if (!is_guest_mode(vcpu))
+ return;
+
+ /*
+ * Per the SDM, MTF takes priority over debug-trap exceptions besides
+ * T-bit traps. As instruction emulation is completed (i.e. at the
+ * instruction boundary), any #DB exception pending delivery must be a
+ * debug-trap. Record the pending MTF state to be delivered in
+ * vmx_check_nested_events().
+ */
+ if (nested_cpu_has_mtf(vmcs12) &&
+ (!vcpu->arch.exception.pending ||
+ vcpu->arch.exception.nr == DB_VECTOR))
+ vmx->nested.mtf_pending = true;
+ else
+ vmx->nested.mtf_pending = false;
+}
+
+static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+ vmx_update_emulated_instruction(vcpu);
+ return skip_emulated_instruction(vcpu);
+}
+
static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
{
/*
@@ -7886,7 +7920,8 @@ static struct kvm_x86_ops vmx_x86_ops __

.run = vmx_vcpu_run,
.handle_exit = vmx_handle_exit,
- .skip_emulated_instruction = skip_emulated_instruction,
+ .skip_emulated_instruction = vmx_skip_emulated_instruction,
+ .update_emulated_instruction = vmx_update_emulated_instruction,
.set_interrupt_shadow = vmx_set_interrupt_shadow,
.get_interrupt_shadow = vmx_get_interrupt_shadow,
.patch_hypercall = vmx_patch_hypercall,
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -150,6 +150,9 @@ struct nested_vmx {
/* L2 must run next, and mustn't decide to exit to L1. */
bool nested_run_pending;

+ /* Pending MTF VM-exit into L1. */
+ bool mtf_pending;
+
struct loaded_vmcs vmcs02;

/*
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6838,6 +6838,8 @@ restart:
kvm_rip_write(vcpu, ctxt->eip);
if (r && ctxt->tf)
r = kvm_vcpu_do_singlestep(vcpu);
+ if (kvm_x86_ops->update_emulated_instruction)
+ kvm_x86_ops->update_emulated_instruction(vcpu);
__kvm_set_rflags(vcpu, ctxt->eflags);
}



2020-03-03 17:50:30

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 083/176] docs: Fix empty parallelism argument

From: Kees Cook <[email protected]>

commit adc10f5b0a03606e30c704cff1f0283a696d0260 upstream.

When there was no parallelism (no top-level -j arg and a pre-1.7
sphinx-build), the argument passed would be empty ("") instead of just
being missing, which would (understandably) badly confuse sphinx-build.
Fix this by removing the quotes.
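
(With parallel="" the quoted form still expands to a literal empty-string
argument, so sphinx-build sees '' as its first option; unquoted, an empty
$parallel expands to no argument at all.)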

Reported-by: Rafael J. Wysocki <[email protected]>
Fixes: 51e46c7a4007 ("docs, parallelism: Rearrange how jobserver reservations are made")
Cc: [email protected] # v5.5 only
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Jonathan Corbet <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
Documentation/sphinx/parallel-wrapper.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/sphinx/parallel-wrapper.sh
+++ b/Documentation/sphinx/parallel-wrapper.sh
@@ -30,4 +30,4 @@ if [ -n "$parallel" ] ; then
parallel="-j$parallel"
fi

-exec "$sphinx" "$parallel" "$@"
+exec "$sphinx" $parallel "$@"


2020-03-03 17:50:32

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 037/176] NFSv4: Fix races between open and dentry revalidation

From: Trond Myklebust <[email protected]>

[ Upstream commit cf5b4059ba7197d6cef9c0e024979d178ed8c8ec ]

We want to make sure that we revalidate the dentry if and only if
we've done an OPEN by filename.
In order to avoid races with remote changes to the directory on the
server, we want to save the verifier before calling OPEN. The exception
is if the server returned a delegation with our OPEN, as we then
know that the filename can't have changed on the server.

Signed-off-by: Trond Myklebust <[email protected]>
Reviewed-by: Benjamin Coddington <[email protected]>
Tested-by: Benjamin Coddington <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/nfs4file.c | 1 -
fs/nfs/nfs4proc.c | 18 ++++++++++++++++--
2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 620de905cba97..3f892035c1413 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -86,7 +86,6 @@ nfs4_file_open(struct inode *inode, struct file *filp)
if (inode != d_inode(dentry))
goto out_drop;

- nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
nfs_file_set_open_context(filp, ctx);
nfs_fscache_open_file(inode, filp);
err = 0;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6ddb4f517d373..13c2de527718a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2962,10 +2962,13 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
struct dentry *dentry;
struct nfs4_state *state;
fmode_t acc_mode = _nfs4_ctx_to_accessmode(ctx);
+ struct inode *dir = d_inode(opendata->dir);
+ unsigned long dir_verifier;
unsigned int seq;
int ret;

seq = raw_seqcount_begin(&sp->so_reclaim_seqcount);
+ dir_verifier = nfs_save_change_attribute(dir);

ret = _nfs4_proc_open(opendata, ctx);
if (ret != 0)
@@ -2993,8 +2996,19 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
dput(ctx->dentry);
ctx->dentry = dentry = alias;
}
- nfs_set_verifier(dentry,
- nfs_save_change_attribute(d_inode(opendata->dir)));
+ }
+
+ switch(opendata->o_arg.claim) {
+ default:
+ break;
+ case NFS4_OPEN_CLAIM_NULL:
+ case NFS4_OPEN_CLAIM_DELEGATE_CUR:
+ case NFS4_OPEN_CLAIM_DELEGATE_PREV:
+ if (!opendata->rpc_done)
+ break;
+ if (opendata->o_res.delegation_type != 0)
+ dir_verifier = nfs_save_change_attribute(dir);
+ nfs_set_verifier(dentry, dir_verifier);
}

/* Parse layoutget results before we check for access */
--
2.20.1



2020-03-03 17:50:36

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 117/176] drm/i915/gvt: Separate display reset from ALL_ENGINES reset

From: Tina Zhang <[email protected]>

commit 3eb55e6f753a379e293395de8d5f3be28351a7f8 upstream.

ALL_ENGINES reset doesn't clobber display with the current gvt-g
supported platforms. Thus ALL_ENGINES reset shouldn't reset the
display engine registers emulated by gvt-g.

This fixes guest warnings like:

[ 14.622026] [drm] Initialized i915 1.6.0 20200114 for 0000:00:03.0 on minor 0
[ 14.967917] fbcon: i915drmfb (fb0) is primary device
[ 25.100188] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] ERROR [CRTC:51:pipe A] flip_done timed out
[ 25.100860] ------------[ cut here ]------------
[ 25.100861] pll on state mismatch (expected 0, found 1)
[ 25.101024] WARNING: CPU: 1 PID: 30 at drivers/gpu/drm/i915/display/intel_display.c:14382 verify_single_dpll_state.isra.115+0x28f/0x320 [i915]
[ 25.101025] Modules linked in: intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 aesni_intel crypto_simd cryptd glue_helper cec rc_core video drm_kms_helper joydev drm input_leds i2c_algo_bit serio_raw fb_sys_fops syscopyarea sysfillrect sysimgblt mac_hid qemu_fw_cfg sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 e1000 psmouse i2c_piix4 pata_acpi floppy
[ 25.101052] CPU: 1 PID: 30 Comm: kworker/u4:1 Not tainted 5.5.0+ #1
[ 25.101053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58 04/01/2014
[ 25.101055] Workqueue: events_unbound async_run_entry_fn
[ 25.101092] RIP: 0010:verify_single_dpll_state.isra.115+0x28f/0x320 [i915]
[ 25.101093] Code: e0 d9 ff e9 a3 fe ff ff 80 3d e9 c2 11 00 00 44 89 f6 48 c7 c7 c0 9d 88 c0 75 3b e8 eb df d9 ff e9 c7 fe ff ff e8 d1 e0 ae c4 <0f> 0b e9 7a fe ff ff 80 3d c0 c2 11 00 00 8d 71 41 89 c2 48 c7 c7
[ 25.101093] RSP: 0018:ffffb1de80107878 EFLAGS: 00010286
[ 25.101094] RAX: 0000000000000000 RBX: ffffb1de80107884 RCX: 0000000000000007
[ 25.101095] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff94fdfdd19740
[ 25.101095] RBP: ffffb1de80107938 R08: 0000000d6bfdc7b4 R09: 000000000000002b
[ 25.101096] R10: ffff94fdf82dc000 R11: 0000000000000225 R12: 00000000000001f8
[ 25.101096] R13: ffff94fdb3ca6a90 R14: ffff94fdb3ca0000 R15: 0000000000000000
[ 25.101097] FS: 0000000000000000(0000) GS:ffff94fdfdd00000(0000) knlGS:0000000000000000
[ 25.101098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 25.101098] CR2: 00007fbc3e2be9c8 CR3: 000000003339a003 CR4: 0000000000360ee0
[ 25.101101] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 25.101101] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 25.101102] Call Trace:
[ 25.101139] intel_atomic_commit_tail+0xde4/0x1520 [i915]
[ 25.101141] ? flush_workqueue_prep_pwqs+0xfa/0x130
[ 25.101142] ? flush_workqueue+0x198/0x3c0
[ 25.101174] intel_atomic_commit+0x2ad/0x320 [i915]
[ 25.101209] drm_atomic_commit+0x4a/0x50 [drm]
[ 25.101220] drm_client_modeset_commit_atomic+0x1c4/0x200 [drm]
[ 25.101231] drm_client_modeset_commit_force+0x47/0x170 [drm]
[ 25.101250] drm_fb_helper_restore_fbdev_mode_unlocked+0x4e/0xa0 [drm_kms_helper]
[ 25.101255] drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper]
[ 25.101287] intel_fbdev_set_par+0x1a/0x40 [i915]
[ 25.101289] ? con_is_visible+0x2e/0x60
[ 25.101290] fbcon_init+0x378/0x600
[ 25.101292] visual_init+0xd5/0x130
[ 25.101296] do_bind_con_driver+0x217/0x430
[ 25.101297] do_take_over_console+0x7d/0x1b0
[ 25.101298] do_fbcon_takeover+0x5c/0xb0
[ 25.101299] fbcon_fb_registered+0x199/0x1a0
[ 25.101301] register_framebuffer+0x22c/0x330
[ 25.101306] __drm_fb_helper_initial_config_and_unlock+0x31a/0x520 [drm_kms_helper]
[ 25.101311] drm_fb_helper_initial_config+0x35/0x40 [drm_kms_helper]
[ 25.101341] intel_fbdev_initial_config+0x18/0x30 [i915]
[ 25.101342] async_run_entry_fn+0x3c/0x150
[ 25.101343] process_one_work+0x1fd/0x3f0
[ 25.101344] worker_thread+0x34/0x410
[ 25.101346] kthread+0x121/0x140
[ 25.101346] ? process_one_work+0x3f0/0x3f0
[ 25.101347] ? kthread_park+0x90/0x90
[ 25.101350] ret_from_fork+0x35/0x40
[ 25.101351] ---[ end trace b5b47d44cd998ba1 ]---

Fixes: 6294b61ba769 ("drm/i915/gvt: add missing display part reset for vGPU reset")
Signed-off-by: Tina Zhang <[email protected]>
Reviewed-by: Zhenyu Wang <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/i915/gvt/vgpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -560,9 +560,9 @@ void intel_gvt_reset_vgpu_locked(struct

intel_vgpu_reset_mmio(vgpu, dmlr);
populate_pvinfo_page(vgpu);
- intel_vgpu_reset_display(vgpu);

if (dmlr) {
+ intel_vgpu_reset_display(vgpu);
intel_vgpu_reset_cfg_space(vgpu);
/* only reset the failsafe mode when dmlr reset */
vgpu->failsafe = false;


2020-03-03 17:50:41

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 118/176] nl80211: fix potential leak in AP start

From: Johannes Berg <[email protected]>

commit 9951ebfcdf2b97dbb28a5d930458424341e61aa2 upstream.

If nl80211_parse_he_obss_pd() fails, we leak the previously
allocated ACL memory. Free it in this case.

Fixes: 796e90f42b7e ("cfg80211: add support for parsing OBBS_PD attributes")
Signed-off-by: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/r/20200221104142.835aba4cdd14.I1923b55ba9989c57e13978f91f40bfdc45e60cbd@changeid
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/wireless/nl80211.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -4800,8 +4800,8 @@ static int nl80211_start_ap(struct sk_bu
err = nl80211_parse_he_obss_pd(
info->attrs[NL80211_ATTR_HE_OBSS_PD],
&params.he_obss_pd);
if (err)
- return err;
+ goto out;
}

nl80211_calculate_ap_params(&params);
@@ -4823,6 +4822,7 @@ static int nl80211_start_ap(struct sk_bu
}
wdev_unlock(wdev);

+out:
kfree(params.acl);

return err;


2020-03-03 17:50:43

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 045/176] drm/amd/display: Do not set optimized_required to false after plane disable

From: Sung Lee <[email protected]>

[ Upstream commit df36f6cf23ada812930afa8ee76681d4ad307c61 ]

[WHY]
The optimized_required flag is needed to set watermarks and clocks lower
under certain conditions. This flag is set to true and then back to false
while programming the front end in dcn20.

[HOW]
Do not set the flag to false while disabling a plane.

Signed-off-by: Sung Lee <[email protected]>
Reviewed-by: Tony Cheng <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index ac8c18fadefce..448bc9b39942f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -493,7 +493,6 @@ static void dcn20_plane_atomic_disable(struct dc *dc, struct pipe_ctx *pipe_ctx)
dpp->funcs->dpp_dppclk_control(dpp, false, false);

hubp->power_gated = true;
- dc->optimized_required = false; /* We're powering off, no need to optimize */

dc->hwss.plane_atomic_power_down(dc,
pipe_ctx->plane_res.dpp,
--
2.20.1



2020-03-03 17:50:43

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 119/176] mac80211: Remove a redundant mutex unlock

From: Andrei Otcheretianski <[email protected]>

commit 0daa63ed4c6c4302790ce67b7a90c0997ceb7514 upstream.

The below-mentioned commit changed the code to unlock *inside*
the function, but previously the unlock was *outside*. It failed
to remove the outer unlock, however, leading to a double unlock.

Fix this.

Fixes: 33483a6b88e4 ("mac80211: fix missing unlock on error in ieee80211_mark_sta_auth()")
Signed-off-by: Andrei Otcheretianski <[email protected]>
Link: https://lore.kernel.org/r/20200221104719.cce4741cf6eb.I671567b185c8a4c2409377e483fd149ce590f56d@changeid
[rewrite commit message to better explain what happened]
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/mac80211/mlme.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2959,7 +2959,7 @@ static void ieee80211_rx_mgmt_auth(struc
(auth_transaction == 2 &&
ifmgd->auth_data->expected_transaction == 2)) {
if (!ieee80211_mark_sta_auth(sdata, bssid))
- goto out_err;
+ return; /* ignore frame -- wait for timeout */
} else if (ifmgd->auth_data->algorithm == WLAN_AUTH_SAE &&
auth_transaction == 2) {
sdata_info(sdata, "SAE peer confirmed\n");
@@ -2967,10 +2967,6 @@ static void ieee80211_rx_mgmt_auth(struc
}

cfg80211_rx_mlme_mgmt(sdata->dev, (u8 *)mgmt, len);
- return;
- out_err:
- mutex_unlock(&sdata->local->sta_mtx);
- /* ignore frame -- wait for timeout */
}

#define case_WLAN(type) \


2020-03-03 17:50:47

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 102/176] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports

From: Jozsef Kadlecsik <[email protected]>

commit f66ee0410b1c3481ee75e5db9b34547b4d582465 upstream.

In the case of huge hash:* type sets, a set has just a single spinlock, so
processing the whole set under spinlock protection could take too long.

There were four places where the whole hash table of the set was processed
bucket by bucket while holding the spinlock (see the sketch after this list):

- During resizing a set, the original set was locked to exclude kernel side
add/del element operations (userspace add/del is excluded by the
nfnetlink mutex). The original set is actually just read during the
resize, so the spinlocking is replaced with rcu locking of regions.
This, however, allows parallel kernel side add/del of entries.
In order not to lose those operations a backlog is added and replayed
after the successful resize.
- Garbage collection of timed out entries was also protected by the spinlock.
In order not to lock too long, region locking is introduced and a single
region is processed in one gc pass. Also, the simple timer based gc is
replaced with a workqueue based solution. The internal book-keeping
(number of elements, size of extensions) is moved to region level due to
the region locking.
- Adding elements: when the max number of the elements is reached, the gc
was called to evict the timed out entries. The new approach is that the gc
is called just for the matching region, assuming that if the region
(proportionally) seems to be full, then the whole set is too. We could scan
the other regions to check every entry under rcu locking, but for huge
sets that would mean a slowdown when adding elements.
- Listing the set header data: when the set was defined with timeout
support, the garbage collector was called to clean up timed out entries
to get the correct element numbers and set size values. Now the set is
scanned to check non-timed out entries, without actually calling the gc
for the whole set.
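
A rough sketch of the per-region gc walk described above (expire_bucket()
is a hypothetical stand-in for the real expiry logic; the region helpers
are the ones introduced by the patch below):

  /* One gc run locks and walks only a single region r, so no spinlock
   * is held across the whole hash table any more.
   */
  spin_lock_bh(&t->hregion[r].lock);
  for (i = ahash_bucket_start(r, t->htable_bits);
       i < ahash_bucket_end(r, t->htable_bits); i++)
          expire_bucket(set, t, i);      /* hypothetical helper */
  spin_unlock_bh(&t->hregion[r].lock);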

Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe ->
SOFTIRQ-unsafe lock order issues while working on the patch.

Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/netfilter/ipset/ip_set.h | 11
net/netfilter/ipset/ip_set_core.c | 34 +
net/netfilter/ipset/ip_set_hash_gen.h | 633 ++++++++++++++++++++++-----------
3 files changed, 472 insertions(+), 206 deletions(-)

--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -121,6 +121,7 @@ struct ip_set_ext {
u32 timeout;
u8 packets_op;
u8 bytes_op;
+ bool target;
};

struct ip_set;
@@ -187,6 +188,14 @@ struct ip_set_type_variant {
/* Return true if "b" set is the same as "a"
* according to the create set parameters */
bool (*same_set)(const struct ip_set *a, const struct ip_set *b);
+ /* Region-locking is used */
+ bool region_lock;
+};
+
+struct ip_set_region {
+ spinlock_t lock; /* Region lock */
+ size_t ext_size; /* Size of the dynamic extensions */
+ u32 elements; /* Number of elements vs timeout */
};

/* The core set type structure */
@@ -501,7 +510,7 @@ ip_set_init_skbinfo(struct ip_set_skbinf
}

#define IP_SET_INIT_KEXT(skb, opt, set) \
- { .bytes = (skb)->len, .packets = 1, \
+ { .bytes = (skb)->len, .packets = 1, .target = true,\
.timeout = ip_set_adt_opt_timeout(opt, set) }

#define IP_SET_INIT_UEXT(set) \
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -723,6 +723,20 @@ ip_set_rcu_get(struct net *net, ip_set_i
return set;
}

+static inline void
+ip_set_lock(struct ip_set *set)
+{
+ if (!set->variant->region_lock)
+ spin_lock_bh(&set->lock);
+}
+
+static inline void
+ip_set_unlock(struct ip_set *set)
+{
+ if (!set->variant->region_lock)
+ spin_unlock_bh(&set->lock);
+}
+
int
ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
const struct xt_action_param *par, struct ip_set_adt_opt *opt)
@@ -744,9 +758,9 @@ ip_set_test(ip_set_id_t index, const str
if (ret == -EAGAIN) {
/* Type requests element to be completed */
pr_debug("element must be completed, ADD is triggered\n");
- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
set->variant->kadt(set, skb, par, IPSET_ADD, opt);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);
ret = 1;
} else {
/* --return-nomatch: invert matched element */
@@ -775,9 +789,9 @@ ip_set_add(ip_set_id_t index, const stru
!(opt->family == set->family || set->family == NFPROTO_UNSPEC))
return -IPSET_ERR_TYPE_MISMATCH;

- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
ret = set->variant->kadt(set, skb, par, IPSET_ADD, opt);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);

return ret;
}
@@ -797,9 +811,9 @@ ip_set_del(ip_set_id_t index, const stru
!(opt->family == set->family || set->family == NFPROTO_UNSPEC))
return -IPSET_ERR_TYPE_MISMATCH;

- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
ret = set->variant->kadt(set, skb, par, IPSET_DEL, opt);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);

return ret;
}
@@ -1264,9 +1278,9 @@ ip_set_flush_set(struct ip_set *set)
{
pr_debug("set: %s\n", set->name);

- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
set->variant->flush(set);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);
}

static int ip_set_flush(struct net *net, struct sock *ctnl, struct sk_buff *skb,
@@ -1713,9 +1727,9 @@ call_ad(struct sock *ctnl, struct sk_buf
bool eexist = flags & IPSET_FLAG_EXIST, retried = false;

do {
- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
ret = set->variant->uadt(set, tb, adt, &lineno, flags, retried);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);
retried = true;
} while (ret == -EAGAIN &&
set->variant->resize &&
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -7,13 +7,21 @@
#include <linux/rcupdate.h>
#include <linux/jhash.h>
#include <linux/types.h>
+#include <linux/netfilter/nfnetlink.h>
#include <linux/netfilter/ipset/ip_set.h>

-#define __ipset_dereference_protected(p, c) rcu_dereference_protected(p, c)
-#define ipset_dereference_protected(p, set) \
- __ipset_dereference_protected(p, lockdep_is_held(&(set)->lock))
-
-#define rcu_dereference_bh_nfnl(p) rcu_dereference_bh_check(p, 1)
+#define __ipset_dereference(p) \
+ rcu_dereference_protected(p, 1)
+#define ipset_dereference_nfnl(p) \
+ rcu_dereference_protected(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
+#define ipset_dereference_set(p, set) \
+ rcu_dereference_protected(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \
+ lockdep_is_held(&(set)->lock))
+#define ipset_dereference_bh_nfnl(p) \
+ rcu_dereference_bh_check(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))

/* Hashing which uses arrays to resolve clashing. The hash table is resized
* (doubled) when searching becomes too long.
@@ -72,11 +80,35 @@ struct hbucket {
__aligned(__alignof__(u64));
};

+/* Region size for locking == 2^HTABLE_REGION_BITS */
+#define HTABLE_REGION_BITS 10
+#define ahash_numof_locks(htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? 1 \
+ : jhash_size((htable_bits) - HTABLE_REGION_BITS))
+#define ahash_sizeof_regions(htable_bits) \
+ (ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region))
+#define ahash_region(n, htable_bits) \
+ ((n) % ahash_numof_locks(htable_bits))
+#define ahash_bucket_start(h, htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? 0 \
+ : (h) * jhash_size(HTABLE_REGION_BITS))
+#define ahash_bucket_end(h, htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? jhash_size(htable_bits) \
+ : ((h) + 1) * jhash_size(HTABLE_REGION_BITS))
+
+struct htable_gc {
+ struct delayed_work dwork;
+ struct ip_set *set; /* Set the gc belongs to */
+ u32 region; /* Last gc run position */
+};
+
/* The hash table: the table size stored here in order to make resizing easy */
struct htable {
atomic_t ref; /* References for resizing */
- atomic_t uref; /* References for dumping */
+ atomic_t uref; /* References for dumping and gc */
u8 htable_bits; /* size of hash table == 2^htable_bits */
+ u32 maxelem; /* Maxelem per region */
+ struct ip_set_region *hregion; /* Region locks and ext sizes */
struct hbucket __rcu *bucket[0]; /* hashtable buckets */
};

@@ -162,6 +194,10 @@ htable_bits(u32 hashsize)
#define NLEN 0
#endif /* IP_SET_HASH_WITH_NETS */

+#define SET_ELEM_EXPIRED(set, d) \
+ (SET_WITH_TIMEOUT(set) && \
+ ip_set_timeout_expired(ext_timeout(d, set)))
+
#endif /* _IP_SET_HASH_GEN_H */

#ifndef MTYPE
@@ -205,10 +241,12 @@ htable_bits(u32 hashsize)
#undef mtype_test_cidrs
#undef mtype_test
#undef mtype_uref
-#undef mtype_expire
#undef mtype_resize
+#undef mtype_ext_size
+#undef mtype_resize_ad
#undef mtype_head
#undef mtype_list
+#undef mtype_gc_do
#undef mtype_gc
#undef mtype_gc_init
#undef mtype_variant
@@ -247,10 +285,12 @@ htable_bits(u32 hashsize)
#define mtype_test_cidrs IPSET_TOKEN(MTYPE, _test_cidrs)
#define mtype_test IPSET_TOKEN(MTYPE, _test)
#define mtype_uref IPSET_TOKEN(MTYPE, _uref)
-#define mtype_expire IPSET_TOKEN(MTYPE, _expire)
#define mtype_resize IPSET_TOKEN(MTYPE, _resize)
+#define mtype_ext_size IPSET_TOKEN(MTYPE, _ext_size)
+#define mtype_resize_ad IPSET_TOKEN(MTYPE, _resize_ad)
#define mtype_head IPSET_TOKEN(MTYPE, _head)
#define mtype_list IPSET_TOKEN(MTYPE, _list)
+#define mtype_gc_do IPSET_TOKEN(MTYPE, _gc_do)
#define mtype_gc IPSET_TOKEN(MTYPE, _gc)
#define mtype_gc_init IPSET_TOKEN(MTYPE, _gc_init)
#define mtype_variant IPSET_TOKEN(MTYPE, _variant)
@@ -275,8 +315,7 @@ htable_bits(u32 hashsize)
/* The generic hash structure */
struct htype {
struct htable __rcu *table; /* the hash table */
- struct timer_list gc; /* garbage collection when timeout enabled */
- struct ip_set *set; /* attached to this ip_set */
+ struct htable_gc gc; /* gc workqueue */
u32 maxelem; /* max elements in the hash */
u32 initval; /* random jhash init value */
#ifdef IP_SET_HASH_WITH_MARKMASK
@@ -288,21 +327,33 @@ struct htype {
#ifdef IP_SET_HASH_WITH_NETMASK
u8 netmask; /* netmask value for subnets to store */
#endif
+ struct list_head ad; /* Resize add|del backlist */
struct mtype_elem next; /* temporary storage for uadd */
#ifdef IP_SET_HASH_WITH_NETS
struct net_prefixes nets[NLEN]; /* book-keeping of prefixes */
#endif
};

+/* ADD|DEL entries saved during resize */
+struct mtype_resize_ad {
+ struct list_head list;
+ enum ipset_adt ad; /* ADD|DEL element */
+ struct mtype_elem d; /* Element value */
+ struct ip_set_ext ext; /* Extensions for ADD */
+ struct ip_set_ext mext; /* Target extensions for ADD */
+ u32 flags; /* Flags for ADD */
+};
+
#ifdef IP_SET_HASH_WITH_NETS
/* Network cidr size book keeping when the hash stores different
* sized networks. cidr == real cidr + 1 to support /0.
*/
static void
-mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_add_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
{
int i, j;

+ spin_lock_bh(&set->lock);
/* Add in increasing prefix order, so larger cidr first */
for (i = 0, j = -1; i < NLEN && h->nets[i].cidr[n]; i++) {
if (j != -1) {
@@ -311,7 +362,7 @@ mtype_add_cidr(struct htype *h, u8 cidr,
j = i;
} else if (h->nets[i].cidr[n] == cidr) {
h->nets[CIDR_POS(cidr)].nets[n]++;
- return;
+ goto unlock;
}
}
if (j != -1) {
@@ -320,24 +371,29 @@ mtype_add_cidr(struct htype *h, u8 cidr,
}
h->nets[i].cidr[n] = cidr;
h->nets[CIDR_POS(cidr)].nets[n] = 1;
+unlock:
+ spin_unlock_bh(&set->lock);
}

static void
-mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_del_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
{
u8 i, j, net_end = NLEN - 1;

+ spin_lock_bh(&set->lock);
for (i = 0; i < NLEN; i++) {
if (h->nets[i].cidr[n] != cidr)
continue;
h->nets[CIDR_POS(cidr)].nets[n]--;
if (h->nets[CIDR_POS(cidr)].nets[n] > 0)
- return;
+ goto unlock;
for (j = i; j < net_end && h->nets[j].cidr[n]; j++)
h->nets[j].cidr[n] = h->nets[j + 1].cidr[n];
h->nets[j].cidr[n] = 0;
- return;
+ goto unlock;
}
+unlock:
+ spin_unlock_bh(&set->lock);
}
#endif

@@ -345,7 +401,7 @@ mtype_del_cidr(struct htype *h, u8 cidr,
static size_t
mtype_ahash_memsize(const struct htype *h, const struct htable *t)
{
- return sizeof(*h) + sizeof(*t);
+ return sizeof(*h) + sizeof(*t) + ahash_sizeof_regions(t->htable_bits);
}

/* Get the ith element from the array block n */
@@ -369,24 +425,29 @@ mtype_flush(struct ip_set *set)
struct htype *h = set->data;
struct htable *t;
struct hbucket *n;
- u32 i;
+ u32 r, i;

- t = ipset_dereference_protected(h->table, set);
- for (i = 0; i < jhash_size(t->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(t, i), 1);
- if (!n)
- continue;
- if (set->extensions & IPSET_EXT_DESTROY)
- mtype_ext_cleanup(set, n);
- /* FIXME: use slab cache */
- rcu_assign_pointer(hbucket(t, i), NULL);
- kfree_rcu(n, rcu);
+ t = ipset_dereference_nfnl(h->table);
+ for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+ spin_lock_bh(&t->hregion[r].lock);
+ for (i = ahash_bucket_start(r, t->htable_bits);
+ i < ahash_bucket_end(r, t->htable_bits); i++) {
+ n = __ipset_dereference(hbucket(t, i));
+ if (!n)
+ continue;
+ if (set->extensions & IPSET_EXT_DESTROY)
+ mtype_ext_cleanup(set, n);
+ /* FIXME: use slab cache */
+ rcu_assign_pointer(hbucket(t, i), NULL);
+ kfree_rcu(n, rcu);
+ }
+ t->hregion[r].ext_size = 0;
+ t->hregion[r].elements = 0;
+ spin_unlock_bh(&t->hregion[r].lock);
}
#ifdef IP_SET_HASH_WITH_NETS
memset(h->nets, 0, sizeof(h->nets));
#endif
- set->elements = 0;
- set->ext_size = 0;
}

/* Destroy the hashtable part of the set */
@@ -397,7 +458,7 @@ mtype_ahash_destroy(struct ip_set *set,
u32 i;

for (i = 0; i < jhash_size(t->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(t, i), 1);
+ n = __ipset_dereference(hbucket(t, i));
if (!n)
continue;
if (set->extensions & IPSET_EXT_DESTROY && ext_destroy)
@@ -406,6 +467,7 @@ mtype_ahash_destroy(struct ip_set *set,
kfree(n);
}

+ ip_set_free(t->hregion);
ip_set_free(t);
}

@@ -414,28 +476,21 @@ static void
mtype_destroy(struct ip_set *set)
{
struct htype *h = set->data;
+ struct list_head *l, *lt;

if (SET_WITH_TIMEOUT(set))
- del_timer_sync(&h->gc);
+ cancel_delayed_work_sync(&h->gc.dwork);

- mtype_ahash_destroy(set,
- __ipset_dereference_protected(h->table, 1), true);
+ mtype_ahash_destroy(set, ipset_dereference_nfnl(h->table), true);
+ list_for_each_safe(l, lt, &h->ad) {
+ list_del(l);
+ kfree(l);
+ }
kfree(h);

set->data = NULL;
}

-static void
-mtype_gc_init(struct ip_set *set, void (*gc)(struct timer_list *t))
-{
- struct htype *h = set->data;
-
- timer_setup(&h->gc, gc, 0);
- mod_timer(&h->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
- pr_debug("gc initialized, run in every %u\n",
- IPSET_GC_PERIOD(set->timeout));
-}
-
static bool
mtype_same_set(const struct ip_set *a, const struct ip_set *b)
{
@@ -454,11 +509,9 @@ mtype_same_set(const struct ip_set *a, c
a->extensions == b->extensions;
}

-/* Delete expired elements from the hashtable */
static void
-mtype_expire(struct ip_set *set, struct htype *h)
+mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r)
{
- struct htable *t;
struct hbucket *n, *tmp;
struct mtype_elem *data;
u32 i, j, d;
@@ -466,10 +519,12 @@ mtype_expire(struct ip_set *set, struct
#ifdef IP_SET_HASH_WITH_NETS
u8 k;
#endif
+ u8 htable_bits = t->htable_bits;

- t = ipset_dereference_protected(h->table, set);
- for (i = 0; i < jhash_size(t->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(t, i), 1);
+ spin_lock_bh(&t->hregion[r].lock);
+ for (i = ahash_bucket_start(r, htable_bits);
+ i < ahash_bucket_end(r, htable_bits); i++) {
+ n = __ipset_dereference(hbucket(t, i));
if (!n)
continue;
for (j = 0, d = 0; j < n->pos; j++) {
@@ -485,58 +540,100 @@ mtype_expire(struct ip_set *set, struct
smp_mb__after_atomic();
#ifdef IP_SET_HASH_WITH_NETS
for (k = 0; k < IPSET_NET_COUNT; k++)
- mtype_del_cidr(h,
+ mtype_del_cidr(set, h,
NCIDR_PUT(DCIDR_GET(data->cidr, k)),
k);
#endif
+ t->hregion[r].elements--;
ip_set_ext_destroy(set, data);
- set->elements--;
d++;
}
if (d >= AHASH_INIT_SIZE) {
if (d >= n->size) {
+ t->hregion[r].ext_size -=
+ ext_size(n->size, dsize);
rcu_assign_pointer(hbucket(t, i), NULL);
kfree_rcu(n, rcu);
continue;
}
tmp = kzalloc(sizeof(*tmp) +
- (n->size - AHASH_INIT_SIZE) * dsize,
- GFP_ATOMIC);
+ (n->size - AHASH_INIT_SIZE) * dsize,
+ GFP_ATOMIC);
if (!tmp)
- /* Still try to delete expired elements */
+ /* Still try to delete expired elements. */
continue;
tmp->size = n->size - AHASH_INIT_SIZE;
for (j = 0, d = 0; j < n->pos; j++) {
if (!test_bit(j, n->used))
continue;
data = ahash_data(n, j, dsize);
- memcpy(tmp->value + d * dsize, data, dsize);
+ memcpy(tmp->value + d * dsize,
+ data, dsize);
set_bit(d, tmp->used);
d++;
}
tmp->pos = d;
- set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+ t->hregion[r].ext_size -=
+ ext_size(AHASH_INIT_SIZE, dsize);
rcu_assign_pointer(hbucket(t, i), tmp);
kfree_rcu(n, rcu);
}
}
+ spin_unlock_bh(&t->hregion[r].lock);
}

static void
-mtype_gc(struct timer_list *t)
+mtype_gc(struct work_struct *work)
{
- struct htype *h = from_timer(h, t, gc);
- struct ip_set *set = h->set;
+ struct htable_gc *gc;
+ struct ip_set *set;
+ struct htype *h;
+ struct htable *t;
+ u32 r, numof_locks;
+ unsigned int next_run;
+
+ gc = container_of(work, struct htable_gc, dwork.work);
+ set = gc->set;
+ h = set->data;

- pr_debug("called\n");
spin_lock_bh(&set->lock);
- mtype_expire(set, h);
+ t = ipset_dereference_set(h->table, set);
+ atomic_inc(&t->uref);
+ numof_locks = ahash_numof_locks(t->htable_bits);
+ r = gc->region++;
+ if (r >= numof_locks) {
+ r = gc->region = 0;
+ }
+ next_run = (IPSET_GC_PERIOD(set->timeout) * HZ) / numof_locks;
+ if (next_run < HZ/10)
+ next_run = HZ/10;
spin_unlock_bh(&set->lock);

- h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
- add_timer(&h->gc);
+ mtype_gc_do(set, h, t, r);
+
+ if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+ pr_debug("Table destroy after resize by expire: %p\n", t);
+ mtype_ahash_destroy(set, t, false);
+ }
+
+ queue_delayed_work(system_power_efficient_wq, &gc->dwork, next_run);
+
}

+static void
+mtype_gc_init(struct htable_gc *gc)
+{
+ INIT_DEFERRABLE_WORK(&gc->dwork, mtype_gc);
+ queue_delayed_work(system_power_efficient_wq, &gc->dwork, HZ);
+}
+
+static int
+mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+ struct ip_set_ext *mext, u32 flags);
+static int
+mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+ struct ip_set_ext *mext, u32 flags);
+
/* Resize a hash: create a new hash table with doubling the hashsize
* and inserting the elements to it. Repeat until we succeed or
* fail due to memory pressures.
@@ -547,7 +644,7 @@ mtype_resize(struct ip_set *set, bool re
struct htype *h = set->data;
struct htable *t, *orig;
u8 htable_bits;
- size_t extsize, dsize = set->dsize;
+ size_t dsize = set->dsize;
#ifdef IP_SET_HASH_WITH_NETS
u8 flags;
struct mtype_elem *tmp;
@@ -555,7 +652,9 @@ mtype_resize(struct ip_set *set, bool re
struct mtype_elem *data;
struct mtype_elem *d;
struct hbucket *n, *m;
- u32 i, j, key;
+ struct list_head *l, *lt;
+ struct mtype_resize_ad *x;
+ u32 i, j, r, nr, key;
int ret;

#ifdef IP_SET_HASH_WITH_NETS
@@ -563,10 +662,8 @@ mtype_resize(struct ip_set *set, bool re
if (!tmp)
return -ENOMEM;
#endif
- rcu_read_lock_bh();
- orig = rcu_dereference_bh_nfnl(h->table);
+ orig = ipset_dereference_bh_nfnl(h->table);
htable_bits = orig->htable_bits;
- rcu_read_unlock_bh();

retry:
ret = 0;
@@ -583,88 +680,124 @@ retry:
ret = -ENOMEM;
goto out;
}
+ t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits));
+ if (!t->hregion) {
+ kfree(t);
+ ret = -ENOMEM;
+ goto out;
+ }
t->htable_bits = htable_bits;
+ t->maxelem = h->maxelem / ahash_numof_locks(htable_bits);
+ for (i = 0; i < ahash_numof_locks(htable_bits); i++)
+ spin_lock_init(&t->hregion[i].lock);

- spin_lock_bh(&set->lock);
- orig = __ipset_dereference_protected(h->table, 1);
- /* There can't be another parallel resizing, but dumping is possible */
+ /* There can't be another parallel resizing,
+ * but dumping, gc, kernel side add/del are possible
+ */
+ orig = ipset_dereference_bh_nfnl(h->table);
atomic_set(&orig->ref, 1);
atomic_inc(&orig->uref);
- extsize = 0;
pr_debug("attempt to resize set %s from %u to %u, t %p\n",
set->name, orig->htable_bits, htable_bits, orig);
- for (i = 0; i < jhash_size(orig->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(orig, i), 1);
- if (!n)
- continue;
- for (j = 0; j < n->pos; j++) {
- if (!test_bit(j, n->used))
+ for (r = 0; r < ahash_numof_locks(orig->htable_bits); r++) {
+ /* Expire may replace a hbucket with another one */
+ rcu_read_lock_bh();
+ for (i = ahash_bucket_start(r, orig->htable_bits);
+ i < ahash_bucket_end(r, orig->htable_bits); i++) {
+ n = __ipset_dereference(hbucket(orig, i));
+ if (!n)
continue;
- data = ahash_data(n, j, dsize);
+ for (j = 0; j < n->pos; j++) {
+ if (!test_bit(j, n->used))
+ continue;
+ data = ahash_data(n, j, dsize);
+ if (SET_ELEM_EXPIRED(set, data))
+ continue;
#ifdef IP_SET_HASH_WITH_NETS
- /* We have readers running parallel with us,
- * so the live data cannot be modified.
- */
- flags = 0;
- memcpy(tmp, data, dsize);
- data = tmp;
- mtype_data_reset_flags(data, &flags);
-#endif
- key = HKEY(data, h->initval, htable_bits);
- m = __ipset_dereference_protected(hbucket(t, key), 1);
- if (!m) {
- m = kzalloc(sizeof(*m) +
+ /* We have readers running parallel with us,
+ * so the live data cannot be modified.
+ */
+ flags = 0;
+ memcpy(tmp, data, dsize);
+ data = tmp;
+ mtype_data_reset_flags(data, &flags);
+#endif
+ key = HKEY(data, h->initval, htable_bits);
+ m = __ipset_dereference(hbucket(t, key));
+ nr = ahash_region(key, htable_bits);
+ if (!m) {
+ m = kzalloc(sizeof(*m) +
AHASH_INIT_SIZE * dsize,
GFP_ATOMIC);
- if (!m) {
- ret = -ENOMEM;
- goto cleanup;
- }
- m->size = AHASH_INIT_SIZE;
- extsize += ext_size(AHASH_INIT_SIZE, dsize);
- RCU_INIT_POINTER(hbucket(t, key), m);
- } else if (m->pos >= m->size) {
- struct hbucket *ht;
-
- if (m->size >= AHASH_MAX(h)) {
- ret = -EAGAIN;
- } else {
- ht = kzalloc(sizeof(*ht) +
+ if (!m) {
+ ret = -ENOMEM;
+ goto cleanup;
+ }
+ m->size = AHASH_INIT_SIZE;
+ t->hregion[nr].ext_size +=
+ ext_size(AHASH_INIT_SIZE,
+ dsize);
+ RCU_INIT_POINTER(hbucket(t, key), m);
+ } else if (m->pos >= m->size) {
+ struct hbucket *ht;
+
+ if (m->size >= AHASH_MAX(h)) {
+ ret = -EAGAIN;
+ } else {
+ ht = kzalloc(sizeof(*ht) +
(m->size + AHASH_INIT_SIZE)
* dsize,
GFP_ATOMIC);
- if (!ht)
- ret = -ENOMEM;
+ if (!ht)
+ ret = -ENOMEM;
+ }
+ if (ret < 0)
+ goto cleanup;
+ memcpy(ht, m, sizeof(struct hbucket) +
+ m->size * dsize);
+ ht->size = m->size + AHASH_INIT_SIZE;
+ t->hregion[nr].ext_size +=
+ ext_size(AHASH_INIT_SIZE,
+ dsize);
+ kfree(m);
+ m = ht;
+ RCU_INIT_POINTER(hbucket(t, key), ht);
}
- if (ret < 0)
- goto cleanup;
- memcpy(ht, m, sizeof(struct hbucket) +
- m->size * dsize);
- ht->size = m->size + AHASH_INIT_SIZE;
- extsize += ext_size(AHASH_INIT_SIZE, dsize);
- kfree(m);
- m = ht;
- RCU_INIT_POINTER(hbucket(t, key), ht);
- }
- d = ahash_data(m, m->pos, dsize);
- memcpy(d, data, dsize);
- set_bit(m->pos++, m->used);
+ d = ahash_data(m, m->pos, dsize);
+ memcpy(d, data, dsize);
+ set_bit(m->pos++, m->used);
+ t->hregion[nr].elements++;
#ifdef IP_SET_HASH_WITH_NETS
- mtype_data_reset_flags(d, &flags);
+ mtype_data_reset_flags(d, &flags);
#endif
+ }
}
+ rcu_read_unlock_bh();
}
- rcu_assign_pointer(h->table, t);
- set->ext_size = extsize;

- spin_unlock_bh(&set->lock);
+ /* There can't be any other writer. */
+ rcu_assign_pointer(h->table, t);

/* Give time to other readers of the set */
synchronize_rcu();

pr_debug("set %s resized from %u (%p) to %u (%p)\n", set->name,
orig->htable_bits, orig, t->htable_bits, t);
- /* If there's nobody else dumping the table, destroy it */
+ /* Add/delete elements processed by the SET target during resize.
+ * Kernel-side add cannot trigger a resize and userspace actions
+ * are serialized by the mutex.
+ */
+ list_for_each_safe(l, lt, &h->ad) {
+ x = list_entry(l, struct mtype_resize_ad, list);
+ if (x->ad == IPSET_ADD) {
+ mtype_add(set, &x->d, &x->ext, &x->mext, x->flags);
+ } else {
+ mtype_del(set, &x->d, NULL, NULL, 0);
+ }
+ list_del(l);
+ kfree(l);
+ }
+ /* If there's nobody else using the table, destroy it */
if (atomic_dec_and_test(&orig->uref)) {
pr_debug("Table destroy by resize %p\n", orig);
mtype_ahash_destroy(set, orig, false);
@@ -677,15 +810,44 @@ out:
return ret;

cleanup:
+ rcu_read_unlock_bh();
atomic_set(&orig->ref, 0);
atomic_dec(&orig->uref);
- spin_unlock_bh(&set->lock);
mtype_ahash_destroy(set, t, false);
if (ret == -EAGAIN)
goto retry;
goto out;
}

+/* Get the current number of elements and ext_size in the set */
+static void
+mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size)
+{
+ struct htype *h = set->data;
+ const struct htable *t;
+ u32 i, j, r;
+ struct hbucket *n;
+ struct mtype_elem *data;
+
+ t = rcu_dereference_bh(h->table);
+ for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+ for (i = ahash_bucket_start(r, t->htable_bits);
+ i < ahash_bucket_end(r, t->htable_bits); i++) {
+ n = rcu_dereference_bh(hbucket(t, i));
+ if (!n)
+ continue;
+ for (j = 0; j < n->pos; j++) {
+ if (!test_bit(j, n->used))
+ continue;
+ data = ahash_data(n, j, set->dsize);
+ if (!SET_ELEM_EXPIRED(set, data))
+ (*elements)++;
+ }
+ }
+ *ext_size += t->hregion[r].ext_size;
+ }
+}
+
/* Add an element to a hash and update the internal counters when succeeded,
* otherwise report the proper error code.
*/
@@ -698,32 +860,49 @@ mtype_add(struct ip_set *set, void *valu
const struct mtype_elem *d = value;
struct mtype_elem *data;
struct hbucket *n, *old = ERR_PTR(-ENOENT);
- int i, j = -1;
+ int i, j = -1, ret;
bool flag_exist = flags & IPSET_FLAG_EXIST;
bool deleted = false, forceadd = false, reuse = false;
- u32 key, multi = 0;
+ u32 r, key, multi = 0, elements, maxelem;

- if (set->elements >= h->maxelem) {
- if (SET_WITH_TIMEOUT(set))
- /* FIXME: when set is full, we slow down here */
- mtype_expire(set, h);
- if (set->elements >= h->maxelem && SET_WITH_FORCEADD(set))
+ rcu_read_lock_bh();
+ t = rcu_dereference_bh(h->table);
+ key = HKEY(value, h->initval, t->htable_bits);
+ r = ahash_region(key, t->htable_bits);
+ atomic_inc(&t->uref);
+ elements = t->hregion[r].elements;
+ maxelem = t->maxelem;
+ if (elements >= maxelem) {
+ u32 e;
+ if (SET_WITH_TIMEOUT(set)) {
+ rcu_read_unlock_bh();
+ mtype_gc_do(set, h, t, r);
+ rcu_read_lock_bh();
+ }
+ maxelem = h->maxelem;
+ elements = 0;
+ for (e = 0; e < ahash_numof_locks(t->htable_bits); e++)
+ elements += t->hregion[e].elements;
+ if (elements >= maxelem && SET_WITH_FORCEADD(set))
forceadd = true;
}
+ rcu_read_unlock_bh();

- t = ipset_dereference_protected(h->table, set);
- key = HKEY(value, h->initval, t->htable_bits);
- n = __ipset_dereference_protected(hbucket(t, key), 1);
+ spin_lock_bh(&t->hregion[r].lock);
+ n = rcu_dereference_bh(hbucket(t, key));
if (!n) {
- if (forceadd || set->elements >= h->maxelem)
+ if (forceadd || elements >= maxelem)
goto set_full;
old = NULL;
n = kzalloc(sizeof(*n) + AHASH_INIT_SIZE * set->dsize,
GFP_ATOMIC);
- if (!n)
- return -ENOMEM;
+ if (!n) {
+ ret = -ENOMEM;
+ goto unlock;
+ }
n->size = AHASH_INIT_SIZE;
- set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+ t->hregion[r].ext_size +=
+ ext_size(AHASH_INIT_SIZE, set->dsize);
goto copy_elem;
}
for (i = 0; i < n->pos; i++) {
@@ -737,19 +916,16 @@ mtype_add(struct ip_set *set, void *valu
}
data = ahash_data(n, i, set->dsize);
if (mtype_data_equal(data, d, &multi)) {
- if (flag_exist ||
- (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(data, set)))) {
+ if (flag_exist || SET_ELEM_EXPIRED(set, data)) {
/* Just the extensions could be overwritten */
j = i;
goto overwrite_extensions;
}
- return -IPSET_ERR_EXIST;
+ ret = -IPSET_ERR_EXIST;
+ goto unlock;
}
/* Reuse first timed out entry */
- if (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(data, set)) &&
- j == -1) {
+ if (SET_ELEM_EXPIRED(set, data) && j == -1) {
j = i;
reuse = true;
}
@@ -759,16 +935,16 @@ mtype_add(struct ip_set *set, void *valu
if (!deleted) {
#ifdef IP_SET_HASH_WITH_NETS
for (i = 0; i < IPSET_NET_COUNT; i++)
- mtype_del_cidr(h,
+ mtype_del_cidr(set, h,
NCIDR_PUT(DCIDR_GET(data->cidr, i)),
i);
#endif
ip_set_ext_destroy(set, data);
- set->elements--;
+ t->hregion[r].elements--;
}
goto copy_data;
}
- if (set->elements >= h->maxelem)
+ if (elements >= maxelem)
goto set_full;
/* Create a new slot */
if (n->pos >= n->size) {
@@ -776,28 +952,32 @@ mtype_add(struct ip_set *set, void *valu
if (n->size >= AHASH_MAX(h)) {
/* Trigger rehashing */
mtype_data_next(&h->next, d);
- return -EAGAIN;
+ ret = -EAGAIN;
+ goto resize;
}
old = n;
n = kzalloc(sizeof(*n) +
(old->size + AHASH_INIT_SIZE) * set->dsize,
GFP_ATOMIC);
- if (!n)
- return -ENOMEM;
+ if (!n) {
+ ret = -ENOMEM;
+ goto unlock;
+ }
memcpy(n, old, sizeof(struct hbucket) +
old->size * set->dsize);
n->size = old->size + AHASH_INIT_SIZE;
- set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+ t->hregion[r].ext_size +=
+ ext_size(AHASH_INIT_SIZE, set->dsize);
}

copy_elem:
j = n->pos++;
data = ahash_data(n, j, set->dsize);
copy_data:
- set->elements++;
+ t->hregion[r].elements++;
#ifdef IP_SET_HASH_WITH_NETS
for (i = 0; i < IPSET_NET_COUNT; i++)
- mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
+ mtype_add_cidr(set, h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
#endif
memcpy(data, d, sizeof(struct mtype_elem));
overwrite_extensions:
@@ -820,13 +1000,41 @@ overwrite_extensions:
if (old)
kfree_rcu(old, rcu);
}
+ ret = 0;
+resize:
+ spin_unlock_bh(&t->hregion[r].lock);
+ if (atomic_read(&t->ref) && ext->target) {
+ /* Resize is in process and kernel side add, save values */
+ struct mtype_resize_ad *x;
+
+ x = kzalloc(sizeof(struct mtype_resize_ad), GFP_ATOMIC);
+ if (!x)
+ /* Don't bother */
+ goto out;
+ x->ad = IPSET_ADD;
+ memcpy(&x->d, value, sizeof(struct mtype_elem));
+ memcpy(&x->ext, ext, sizeof(struct ip_set_ext));
+ memcpy(&x->mext, mext, sizeof(struct ip_set_ext));
+ x->flags = flags;
+ spin_lock_bh(&set->lock);
+ list_add_tail(&x->list, &h->ad);
+ spin_unlock_bh(&set->lock);
+ }
+ goto out;

- return 0;
set_full:
if (net_ratelimit())
pr_warn("Set %s is full, maxelem %u reached\n",
- set->name, h->maxelem);
- return -IPSET_ERR_HASH_FULL;
+ set->name, maxelem);
+ ret = -IPSET_ERR_HASH_FULL;
+unlock:
+ spin_unlock_bh(&t->hregion[r].lock);
+out:
+ if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+ pr_debug("Table destroy after resize by add: %p\n", t);
+ mtype_ahash_destroy(set, t, false);
+ }
+ return ret;
}

/* Delete an element from the hash and free up space if possible.
@@ -840,13 +1048,23 @@ mtype_del(struct ip_set *set, void *valu
const struct mtype_elem *d = value;
struct mtype_elem *data;
struct hbucket *n;
- int i, j, k, ret = -IPSET_ERR_EXIST;
+ struct mtype_resize_ad *x = NULL;
+ int i, j, k, r, ret = -IPSET_ERR_EXIST;
u32 key, multi = 0;
size_t dsize = set->dsize;

- t = ipset_dereference_protected(h->table, set);
+ /* Userspace add and resize is excluded by the mutex.
+ * Kernelspace add does not trigger resize.
+ */
+ rcu_read_lock_bh();
+ t = rcu_dereference_bh(h->table);
key = HKEY(value, h->initval, t->htable_bits);
- n = __ipset_dereference_protected(hbucket(t, key), 1);
+ r = ahash_region(key, t->htable_bits);
+ atomic_inc(&t->uref);
+ rcu_read_unlock_bh();
+
+ spin_lock_bh(&t->hregion[r].lock);
+ n = rcu_dereference_bh(hbucket(t, key));
if (!n)
goto out;
for (i = 0, k = 0; i < n->pos; i++) {
@@ -857,8 +1075,7 @@ mtype_del(struct ip_set *set, void *valu
data = ahash_data(n, i, dsize);
if (!mtype_data_equal(data, d, &multi))
continue;
- if (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(data, set)))
+ if (SET_ELEM_EXPIRED(set, data))
goto out;

ret = 0;
@@ -866,20 +1083,33 @@ mtype_del(struct ip_set *set, void *valu
smp_mb__after_atomic();
if (i + 1 == n->pos)
n->pos--;
- set->elements--;
+ t->hregion[r].elements--;
#ifdef IP_SET_HASH_WITH_NETS
for (j = 0; j < IPSET_NET_COUNT; j++)
- mtype_del_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, j)),
- j);
+ mtype_del_cidr(set, h,
+ NCIDR_PUT(DCIDR_GET(d->cidr, j)), j);
#endif
ip_set_ext_destroy(set, data);

+ if (atomic_read(&t->ref) && ext->target) {
+ /* Resize is in process and kernel side del,
+ * save values
+ */
+ x = kzalloc(sizeof(struct mtype_resize_ad),
+ GFP_ATOMIC);
+ if (x) {
+ x->ad = IPSET_DEL;
+ memcpy(&x->d, value,
+ sizeof(struct mtype_elem));
+ x->flags = flags;
+ }
+ }
for (; i < n->pos; i++) {
if (!test_bit(i, n->used))
k++;
}
if (n->pos == 0 && k == 0) {
- set->ext_size -= ext_size(n->size, dsize);
+ t->hregion[r].ext_size -= ext_size(n->size, dsize);
rcu_assign_pointer(hbucket(t, key), NULL);
kfree_rcu(n, rcu);
} else if (k >= AHASH_INIT_SIZE) {
@@ -898,7 +1128,8 @@ mtype_del(struct ip_set *set, void *valu
k++;
}
tmp->pos = k;
- set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+ t->hregion[r].ext_size -=
+ ext_size(AHASH_INIT_SIZE, dsize);
rcu_assign_pointer(hbucket(t, key), tmp);
kfree_rcu(n, rcu);
}
@@ -906,6 +1137,16 @@ mtype_del(struct ip_set *set, void *valu
}

out:
+ spin_unlock_bh(&t->hregion[r].lock);
+ if (x) {
+ spin_lock_bh(&set->lock);
+ list_add(&x->list, &h->ad);
+ spin_unlock_bh(&set->lock);
+ }
+ if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+ pr_debug("Table destroy after resize by del: %p\n", t);
+ mtype_ahash_destroy(set, t, false);
+ }
return ret;
}

@@ -991,6 +1232,7 @@ mtype_test(struct ip_set *set, void *val
int i, ret = 0;
u32 key, multi = 0;

+ rcu_read_lock_bh();
t = rcu_dereference_bh(h->table);
#ifdef IP_SET_HASH_WITH_NETS
/* If we test an IP address and not a network address,
@@ -1022,6 +1264,7 @@ mtype_test(struct ip_set *set, void *val
goto out;
}
out:
+ rcu_read_unlock_bh();
return ret;
}

@@ -1033,23 +1276,14 @@ mtype_head(struct ip_set *set, struct sk
const struct htable *t;
struct nlattr *nested;
size_t memsize;
+ u32 elements = 0;
+ size_t ext_size = 0;
u8 htable_bits;

- /* If any members have expired, set->elements will be wrong
- * mytype_expire function will update it with the right count.
- * we do not hold set->lock here, so grab it first.
- * set->elements can still be incorrect in the case of a huge set,
- * because elements might time out during the listing.
- */
- if (SET_WITH_TIMEOUT(set)) {
- spin_lock_bh(&set->lock);
- mtype_expire(set, h);
- spin_unlock_bh(&set->lock);
- }
-
rcu_read_lock_bh();
- t = rcu_dereference_bh_nfnl(h->table);
- memsize = mtype_ahash_memsize(h, t) + set->ext_size;
+ t = rcu_dereference_bh(h->table);
+ mtype_ext_size(set, &elements, &ext_size);
+ memsize = mtype_ahash_memsize(h, t) + ext_size + set->ext_size;
htable_bits = t->htable_bits;
rcu_read_unlock_bh();

@@ -1071,7 +1305,7 @@ mtype_head(struct ip_set *set, struct sk
#endif
if (nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) ||
nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)) ||
- nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(set->elements)))
+ nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(elements)))
goto nla_put_failure;
if (unlikely(ip_set_put_flags(skb, set)))
goto nla_put_failure;
@@ -1091,15 +1325,15 @@ mtype_uref(struct ip_set *set, struct ne

if (start) {
rcu_read_lock_bh();
- t = rcu_dereference_bh_nfnl(h->table);
+ t = ipset_dereference_bh_nfnl(h->table);
atomic_inc(&t->uref);
cb->args[IPSET_CB_PRIVATE] = (unsigned long)t;
rcu_read_unlock_bh();
} else if (cb->args[IPSET_CB_PRIVATE]) {
t = (struct htable *)cb->args[IPSET_CB_PRIVATE];
if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
- /* Resizing didn't destroy the hash table */
- pr_debug("Table destroy by dump: %p\n", t);
+ pr_debug("Table destroy after resize "
+ " by dump: %p\n", t);
mtype_ahash_destroy(set, t, false);
}
cb->args[IPSET_CB_PRIVATE] = 0;
@@ -1141,8 +1375,7 @@ mtype_list(const struct ip_set *set,
if (!test_bit(i, n->used))
continue;
e = ahash_data(n, i, set->dsize);
- if (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(e, set)))
+ if (SET_ELEM_EXPIRED(set, e))
continue;
pr_debug("list hash %lu hbucket %p i %u, data %p\n",
cb->args[IPSET_CB_ARG0], n, i, e);
@@ -1208,6 +1441,7 @@ static const struct ip_set_type_variant
.uref = mtype_uref,
.resize = mtype_resize,
.same_set = mtype_same_set,
+ .region_lock = true,
};

#ifdef IP_SET_EMIT_CREATE
@@ -1226,6 +1460,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *
size_t hsize;
struct htype *h;
struct htable *t;
+ u32 i;

pr_debug("Create set %s with family %s\n",
set->name, set->family == NFPROTO_IPV4 ? "inet" : "inet6");
@@ -1294,6 +1529,15 @@ IPSET_TOKEN(HTYPE, _create)(struct net *
kfree(h);
return -ENOMEM;
}
+ t->hregion = ip_set_alloc(ahash_sizeof_regions(hbits));
+ if (!t->hregion) {
+ kfree(t);
+ kfree(h);
+ return -ENOMEM;
+ }
+ h->gc.set = set;
+ for (i = 0; i < ahash_numof_locks(hbits); i++)
+ spin_lock_init(&t->hregion[i].lock);
h->maxelem = maxelem;
#ifdef IP_SET_HASH_WITH_NETMASK
h->netmask = netmask;
@@ -1304,9 +1548,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *
get_random_bytes(&h->initval, sizeof(h->initval));

t->htable_bits = hbits;
+ t->maxelem = h->maxelem / ahash_numof_locks(hbits);
RCU_INIT_POINTER(h->table, t);

- h->set = set;
+ INIT_LIST_HEAD(&h->ad);
set->data = h;
#ifndef IP_SET_PROTO_UNDEF
if (set->family == NFPROTO_IPV4) {
@@ -1329,12 +1574,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *
#ifndef IP_SET_PROTO_UNDEF
if (set->family == NFPROTO_IPV4)
#endif
- IPSET_TOKEN(HTYPE, 4_gc_init)(set,
- IPSET_TOKEN(HTYPE, 4_gc));
+ IPSET_TOKEN(HTYPE, 4_gc_init)(&h->gc);
#ifndef IP_SET_PROTO_UNDEF
else
- IPSET_TOKEN(HTYPE, 6_gc_init)(set,
- IPSET_TOKEN(HTYPE, 6_gc));
+ IPSET_TOKEN(HTYPE, 6_gc_init)(&h->gc);
#endif
}
pr_debug("create %s hashsize %u (%u) maxelem %u: %p(%p)\n",


2020-03-03 17:50:55

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 123/176] nvme-pci: Hold cq_poll_lock while completing CQEs

From: Bijan Mottahedeh <[email protected]>

commit 9515743bfb39c61aaf3d4f3219a645c8d1fe9a0e upstream.

Completions need to be consumed in the same order the controller submitted
them; otherwise, future completion entries may overwrite ones we haven't
handled yet. Hold the nvme queue's poll lock while completing new CQEs to
prevent another thread from freeing command tags for reuse out-of-order.
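
As an illustration (a hypothetical interleaving consistent with the
description above, not taken from the patch):

	thread A: lock; nvme_process_cq() -> [start, end); unlock
	thread B: lock; nvme_process_cq(); unlock; nvme_complete_cqes()
	thread A: nvme_complete_cqes() for the now-stale [start, end)

Holding cq_poll_lock across both the process and complete steps, as the
diff below does, forces each thread to finish completing the entries it
consumed before another thread can consume more.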

Fixes: dabcefab45d3 ("nvme: provide optimized poll function for separate poll queues")
Signed-off-by: Bijan Mottahedeh <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/nvme/host/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1078,9 +1078,9 @@ static int nvme_poll(struct blk_mq_hw_ct

spin_lock(&nvmeq->cq_poll_lock);
found = nvme_process_cq(nvmeq, &start, &end, -1);
+ nvme_complete_cqes(nvmeq, start, end);
spin_unlock(&nvmeq->cq_poll_lock);

- nvme_complete_cqes(nvmeq, start, end);
return found;
}



2020-03-03 17:50:58

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 124/176] s390/qeth: vnicc Fix EOPNOTSUPP precedence

From: Alexandra Winter <[email protected]>

commit 6f3846f0955308b6d1b219419da42b8de2c08845 upstream.

When getting or setting VNICC parameters, the error code EOPNOTSUPP
should have precedence over EBUSY.

EBUSY is used because the vnicc feature and the bridgeport feature are
mutually exclusive, which is a temporary condition. EOPNOTSUPP, by
contrast, indicates that the HW does not support all or part of the
vnicc feature.
This issue causes the vnicc sysfs params to show 'blocked by bridgeport'
for HW that does not support VNICC at all.
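
Distilled, the check ordering the patch establishes in all four helpers
looks like this (a sketch only, mirroring the hunks below):

	/* permanent condition first: the HW simply lacks the capability */
	if (!(card->options.vnicc.sup_chars & vnicc))
		return -EOPNOTSUPP;

	/* temporary condition second: feature is mutually exclusive
	 * with an active bridgeport
	 */
	if (qeth_bridgeport_is_in_use(card))
		return -EBUSY;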

Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <[email protected]>
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/s390/net/qeth_l2_main.c | 29 +++++++++++++----------------
1 file changed, 13 insertions(+), 16 deletions(-)

--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -1815,15 +1815,14 @@ int qeth_l2_vnicc_set_state(struct qeth_

QETH_CARD_TEXT(card, 2, "vniccsch");

- /* do not change anything if BridgePort is enabled */
- if (qeth_bridgeport_is_in_use(card))
- return -EBUSY;
-
/* check if characteristic and enable/disable are supported */
if (!(card->options.vnicc.sup_chars & vnicc) ||
!(card->options.vnicc.set_char_sup & vnicc))
return -EOPNOTSUPP;

+ if (qeth_bridgeport_is_in_use(card))
+ return -EBUSY;
+
/* set enable/disable command and store wanted characteristic */
if (state) {
cmd = IPA_VNICC_ENABLE;
@@ -1869,14 +1868,13 @@ int qeth_l2_vnicc_get_state(struct qeth_

QETH_CARD_TEXT(card, 2, "vniccgch");

- /* do not get anything if BridgePort is enabled */
- if (qeth_bridgeport_is_in_use(card))
- return -EBUSY;
-
/* check if characteristic is supported */
if (!(card->options.vnicc.sup_chars & vnicc))
return -EOPNOTSUPP;

+ if (qeth_bridgeport_is_in_use(card))
+ return -EBUSY;
+
/* if card is ready, query current VNICC state */
if (qeth_card_hw_is_reachable(card))
rc = qeth_l2_vnicc_query_chars(card);
@@ -1894,15 +1892,14 @@ int qeth_l2_vnicc_set_timeout(struct qet

QETH_CARD_TEXT(card, 2, "vniccsto");

- /* do not change anything if BridgePort is enabled */
- if (qeth_bridgeport_is_in_use(card))
- return -EBUSY;
-
/* check if characteristic and set_timeout are supported */
if (!(card->options.vnicc.sup_chars & QETH_VNICC_LEARNING) ||
!(card->options.vnicc.getset_timeout_sup & QETH_VNICC_LEARNING))
return -EOPNOTSUPP;

+ if (qeth_bridgeport_is_in_use(card))
+ return -EBUSY;
+
/* do we need to do anything? */
if (card->options.vnicc.learning_timeout == timeout)
return rc;
@@ -1931,14 +1928,14 @@ int qeth_l2_vnicc_get_timeout(struct qet

QETH_CARD_TEXT(card, 2, "vniccgto");

- /* do not get anything if BridgePort is enabled */
- if (qeth_bridgeport_is_in_use(card))
- return -EBUSY;
-
/* check if characteristic and get_timeout are supported */
if (!(card->options.vnicc.sup_chars & QETH_VNICC_LEARNING) ||
!(card->options.vnicc.getset_timeout_sup & QETH_VNICC_LEARNING))
return -EOPNOTSUPP;
+
+ if (qeth_bridgeport_is_in_use(card))
+ return -EBUSY;
+
/* if card is ready, get timeout. Otherwise, just return stored value */
*timeout = card->options.vnicc.learning_timeout;
if (qeth_card_hw_is_reachable(card))


2020-03-03 17:50:58

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 126/176] net: netlink: cap max groups which will be considered in netlink_bind()

From: Nikolay Aleksandrov <[email protected]>

commit 3a20773beeeeadec41477a5ba872175b778ff752 upstream.

Since nl_groups is a u32, we can't bind more than 32 groups via the
->bind (netlink_bind) call, but netlink has supported more groups via
setsockopt() for a long time and thus nlk->ngroups could be over 32.
Recently I added support for per-vlan notifications and increased the
groups to 33 for NETLINK_ROUTE which exposed an old bug in the
netlink_bind() code causing out-of-bounds access on archs where unsigned
long is 32 bits via test_bit() on a local variable. Fix this by capping the
maximum groups in netlink_bind() to BITS_PER_TYPE(u32), effectively
capping them at 32 which is the minimum of allocated groups and the
maximum groups which can be bound via netlink_bind().
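
A minimal userspace sketch of the capped loop (illustrative only:
test_bit(), nlk->ngroups and the netlink structures are kernel
constructs, and BITS_PER_TYPE is redefined here for user space):

	#include <stdio.h>
	#include <limits.h>

	#define BITS_PER_TYPE(type)	(sizeof(type) * CHAR_BIT)

	int main(void)
	{
		/* nl_groups is a u32: bit N set requests multicast group N + 1 */
		unsigned long groups = 0x80000001UL;	/* groups 1 and 32 */
		unsigned int group;

		/*
		 * On a 32-bit arch 'unsigned long' is a single 32-bit word, so
		 * testing bits >= 32 (e.g. up to an ngroups of 33) would read
		 * past 'groups'; capping at BITS_PER_TYPE(u32) == 32 stays in
		 * bounds.
		 */
		for (group = 0; group < BITS_PER_TYPE(unsigned int); group++)
			if (groups & (1UL << group))
				printf("would bind group %u\n", group + 1);
		return 0;
	}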

CC: Christophe Leroy <[email protected]>
CC: Richard Guy Briggs <[email protected]>
Fixes: 4f520900522f ("netlink: have netlink per-protocol bind function return an error code.")
Reported-by: Erhard F. <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netlink/af_netlink.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1014,7 +1014,8 @@ static int netlink_bind(struct socket *s
if (nlk->netlink_bind && groups) {
int group;

- for (group = 0; group < nlk->ngroups; group++) {
+ /* nl_groups is a u32, so cap the maximum groups we can bind */
+ for (group = 0; group < BITS_PER_TYPE(u32); group++) {
if (!test_bit(group, &groups))
continue;
err = nlk->netlink_bind(net, group + 1);
@@ -1033,7 +1034,7 @@ static int netlink_bind(struct socket *s
netlink_insert(sk, nladdr->nl_pid) :
netlink_autobind(sock);
if (err) {
- netlink_undo_bind(nlk->ngroups, groups, sk);
+ netlink_undo_bind(BITS_PER_TYPE(u32), groups, sk);
goto unlock;
}
}


2020-03-03 17:51:00

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 074/176] mac80211: fix wrong 160/80+80 MHz setting

From: Shay Bar <[email protected]>

[ Upstream commit 33181ea7f5a62a17fbe55f0f73428ecb5e686be8 ]

Before this patch, STAs would set a new width of 160/80+80 MHz based on the AP capability only.
This is wrong because the STA may not support > 80 MHz BW.
The fix is to verify that the STA has 160/80+80 MHz capability before increasing its width to > 80 MHz.

The "support_80_80" and "support_160" settings are based on:
"Table 9-272 — Setting of the Supported Channel Width Set subfield and Extended NSS BW
Support subfield at a STA transmitting the VHT Capabilities Information field"
from "Draft P802.11REVmd_D3.0.pdf".

Signed-off-by: Aviad Brikman <[email protected]>
Signed-off-by: Shay Bar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/mac80211/util.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 739e90555d8b9..decd46b383938 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -2993,10 +2993,22 @@ bool ieee80211_chandef_vht_oper(struct ieee80211_hw *hw,
int cf0, cf1;
int ccfs0, ccfs1, ccfs2;
int ccf0, ccf1;
+ u32 vht_cap;
+ bool support_80_80 = false;
+ bool support_160 = false;

if (!oper || !htop)
return false;

+ vht_cap = hw->wiphy->bands[chandef->chan->band]->vht_cap.cap;
+ support_160 = (vht_cap & (IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_MASK |
+ IEEE80211_VHT_CAP_EXT_NSS_BW_MASK));
+ support_80_80 = ((vht_cap &
+ IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160_80PLUS80MHZ) ||
+ (vht_cap & IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160MHZ &&
+ vht_cap & IEEE80211_VHT_CAP_EXT_NSS_BW_MASK) ||
+ ((vht_cap & IEEE80211_VHT_CAP_EXT_NSS_BW_MASK) >>
+ IEEE80211_VHT_CAP_EXT_NSS_BW_SHIFT > 1));
ccfs0 = oper->center_freq_seg0_idx;
ccfs1 = oper->center_freq_seg1_idx;
ccfs2 = (le16_to_cpu(htop->operation_mode) &
@@ -3024,10 +3036,10 @@ bool ieee80211_chandef_vht_oper(struct ieee80211_hw *hw,
unsigned int diff;

diff = abs(ccf1 - ccf0);
- if (diff == 8) {
+ if ((diff == 8) && support_160) {
new.width = NL80211_CHAN_WIDTH_160;
new.center_freq1 = cf1;
- } else if (diff > 8) {
+ } else if ((diff > 8) && support_80_80) {
new.width = NL80211_CHAN_WIDTH_80P80;
new.center_freq2 = cf1;
}
--
2.20.1



2020-03-03 17:51:02

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 122/176] usb: charger: assign specific number for enum value

From: Peter Chen <[email protected]>

commit ca4b43c14cd88d28cfc6467d2fa075aad6818f1d upstream.

To work properly across all architectures and compilers, the enum values
need to be assigned specific numbers.
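
A hypothetical example of the hazard with implicit values (not from the
patch; the enum names here are made up): inserting a new enumerator
silently renumbers everything after it, changing a UAPI ABI:

	enum charger_v1 { UNKNOWN, SDP, DCP };		/* SDP == 1 */
	enum charger_v2 { UNKNOWN2, NEW, SDP2, DCP2 };	/* SDP2 == 2 */

Pinning each enumerator to an explicit number, as the diff below does,
makes such a shift impossible to introduce unnoticed.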

Suggested-by: Greg KH <[email protected]>
Signed-off-by: Peter Chen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/uapi/linux/usb/charger.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

--- a/include/uapi/linux/usb/charger.h
+++ b/include/uapi/linux/usb/charger.h
@@ -14,18 +14,18 @@
* ACA (Accessory Charger Adapters)
*/
enum usb_charger_type {
- UNKNOWN_TYPE,
- SDP_TYPE,
- DCP_TYPE,
- CDP_TYPE,
- ACA_TYPE,
+ UNKNOWN_TYPE = 0,
+ SDP_TYPE = 1,
+ DCP_TYPE = 2,
+ CDP_TYPE = 3,
+ ACA_TYPE = 4,
};

/* USB charger state */
enum usb_charger_state {
- USB_CHARGER_DEFAULT,
- USB_CHARGER_PRESENT,
- USB_CHARGER_ABSENT,
+ USB_CHARGER_DEFAULT = 0,
+ USB_CHARGER_PRESENT = 1,
+ USB_CHARGER_ABSENT = 2,
};

#endif /* _UAPI__LINUX_USB_CHARGER_H */


2020-03-03 17:51:03

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 127/176] net: atlantic: checksum compat issue

From: Dmitry Bezrukov <[email protected]>

commit 15beab0a9d797be1b7c67458da007a62269be29a upstream.

Yet another checksum offload compatibility issue was found.

The known issue is that AQC HW marks TCP packets with a 0xFFFF checksum
as invalid (1). This is worked around in the driver by passing all such
suspicious packets up to the stack for further csum validation.

Another HW problem (2) is that it hides the invalid csum of LRO-aggregated
packets inside the individual descriptors. That was worked around
by a forced scan of all LRO descriptors for checksum errors.

However, the scan logic was shared between LRO and multi-descriptor
packets (jumbos), and this causes the issue.

We have to drop LRO packets with a detected bad checksum
because of (2), but we have to pass jumbo packets up to the stack because of (1).

When talking to a Windows TCP peer with jumbo frames but LSO disabled,
the driver discards such frames as badly checksummed. But only LRO frames
should be dropped, not jumbos.

On such configurations, TCP streams have a chance of drops and stalls.

(1) 76f254d4afe2 ("net: aquantia: tcp checksum 0xffff being handled incorrectly")
(2) d08b9a0a3ebd ("net: aquantia: do not pass lro session with invalid tcp checksum")
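
The resulting drop policy, in sketch form (mirroring the aq_ring.c hunk
below):

	if (buff->is_error ||			/* always drop on error */
	    (buff->is_lro && buff->is_cso_err))	/* drop bad csum only for LRO (2) */
		goto drop;
	/* jumbo frames with a suspicious csum still go to the stack (1) */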

Fixes: d08b9a0a3ebd ("net: aquantia: do not pass lro session with invalid tcp checksum")
Signed-off-by: Dmitry Bezrukov <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 3 ++-
drivers/net/ethernet/aquantia/atlantic/aq_ring.h | 3 ++-
drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c | 5 +++--
3 files changed, 7 insertions(+), 4 deletions(-)

--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -351,7 +351,8 @@ int aq_ring_rx_clean(struct aq_ring_s *s
err = 0;
goto err_exit;
}
- if (buff->is_error || buff->is_cso_err) {
+ if (buff->is_error ||
+ (buff->is_lro && buff->is_cso_err)) {
buff_ = buff;
do {
next_ = buff_->next,
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
@@ -78,7 +78,8 @@ struct __packed aq_ring_buff_s {
u32 is_cleaned:1;
u32 is_error:1;
u32 is_vlan:1;
- u32 rsvd3:4;
+ u32 is_lro:1;
+ u32 rsvd3:3;
u16 eop_index;
u16 rsvd4;
};
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -823,6 +823,8 @@ static int hw_atl_b0_hw_ring_rx_receive(
}
}

+ buff->is_lro = !!(HW_ATL_B0_RXD_WB_STAT2_RSCCNT &
+ rxd_wb->status);
if (HW_ATL_B0_RXD_WB_STAT2_EOP & rxd_wb->status) {
buff->len = rxd_wb->pkt_len %
AQ_CFG_RX_FRAME_MAX;
@@ -835,8 +837,7 @@ static int hw_atl_b0_hw_ring_rx_receive(
rxd_wb->pkt_len > AQ_CFG_RX_FRAME_MAX ?
AQ_CFG_RX_FRAME_MAX : rxd_wb->pkt_len;

- if (HW_ATL_B0_RXD_WB_STAT2_RSCCNT &
- rxd_wb->status) {
+ if (buff->is_lro) {
/* LRO */
buff->next = rxd_wb->next_desc_ptr;
++ring->stats.rx.lro_packets;


2020-03-03 17:51:08

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 129/176] net: atlantic: fix use after free kasan warn

From: Pavel Belous <[email protected]>

commit a4980919ad6a7be548d499bc5338015e1a9191c6 upstream.

skb->len is used to calculate statistics after the xmit invocation.

Under a stress load it may happen that the skb is transmitted,
the rx interrupt comes in and the skb is freed, all before the xmit
function has even returned.

Eventually, the skb->len access then touches an unallocated area.

Move the stats calculation into the tx_clean routine.
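
In sketch form, the race being removed (illustrative, not the literal
driver code):

	err = self->aq_hw_ops->hw_ring_tx_xmit(self->aq_hw, ring, frags);
	/* the rx irq may free the skb right here */
	ring->stats.tx.bytes += skb->len;	/* use-after-free */

After the patch the accounting happens in aq_ring_tx_clean(), where
buff->skb is still valid because that routine frees it itself.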

Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")
Reported-by: Christophe Vu-Brugier <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Pavel Belous <[email protected]>
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 4 ----
drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 7 +++++--
2 files changed, 5 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -655,10 +655,6 @@ int aq_nic_xmit(struct aq_nic_s *self, s
if (likely(frags)) {
err = self->aq_hw_ops->hw_ring_tx_xmit(self->aq_hw,
ring, frags);
- if (err >= 0) {
- ++ring->stats.tx.packets;
- ring->stats.tx.bytes += skb->len;
- }
} else {
err = NETDEV_TX_BUSY;
}
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -272,9 +272,12 @@ bool aq_ring_tx_clean(struct aq_ring_s *
}
}

- if (unlikely(buff->is_eop))
- dev_kfree_skb_any(buff->skb);
+ if (unlikely(buff->is_eop)) {
+ ++self->stats.rx.packets;
+ self->stats.tx.bytes += buff->skb->len;

+ dev_kfree_skb_any(buff->skb);
+ }
buff->pa = 0U;
buff->eop_index = 0xffffU;
self->sw_head = aq_ring_next_dx(self, self->sw_head);


2020-03-03 17:51:10

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 043/176] ceph: do not execute direct write in parallel if O_APPEND is specified

From: Xiubo Li <[email protected]>

[ Upstream commit 8e4473bb50a1796c9c32b244e5dbc5ee24ead937 ]

In O_APPEND & O_DIRECT mode, the data from different writers can
possibly overlap each other, since they take the shared lock.

For example, both Writer1 and Writer2 are in O_APPEND and O_DIRECT
mode:

Writer1 Writer2

shared_lock() shared_lock()
getattr(CAP_SIZE) getattr(CAP_SIZE)
iocb->ki_pos = EOF iocb->ki_pos = EOF
write(data1)
write(data2)
shared_unlock() shared_unlock()

The data2 will overlap the data1 from the same file offset, the
old EOF.

Switch to exclusive lock instead when O_APPEND is specified.
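
Distilled, the lock-selection predicate the patch introduces:

	/* take the shared direct-IO path only for O_DIRECT without O_APPEND */
	bool direct_lock =
		(iocb->ki_flags & (IOCB_DIRECT | IOCB_APPEND)) == IOCB_DIRECT;

so O_APPEND | O_DIRECT writers fall back to the exclusive
ceph_start_io_write()/ceph_end_io_write() pair.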

Signed-off-by: Xiubo Li <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ceph/file.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 11929d2bb594c..cd09e63d682b7 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1418,6 +1418,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
struct ceph_cap_flush *prealloc_cf;
ssize_t count, written = 0;
int err, want, got;
+ bool direct_lock = false;
loff_t pos;
loff_t limit = max(i_size_read(inode), fsc->max_file_size);

@@ -1428,8 +1429,11 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (!prealloc_cf)
return -ENOMEM;

+ if ((iocb->ki_flags & (IOCB_DIRECT | IOCB_APPEND)) == IOCB_DIRECT)
+ direct_lock = true;
+
retry_snap:
- if (iocb->ki_flags & IOCB_DIRECT)
+ if (direct_lock)
ceph_start_io_direct(inode);
else
ceph_start_io_write(inode);
@@ -1519,14 +1523,15 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)

/* we might need to revert back to that point */
data = *from;
- if (iocb->ki_flags & IOCB_DIRECT) {
+ if (iocb->ki_flags & IOCB_DIRECT)
written = ceph_direct_read_write(iocb, &data, snapc,
&prealloc_cf);
- ceph_end_io_direct(inode);
- } else {
+ else
written = ceph_sync_write(iocb, &data, pos, snapc);
+ if (direct_lock)
+ ceph_end_io_direct(inode);
+ else
ceph_end_io_write(inode);
- }
if (written > 0)
iov_iter_advance(from, written);
ceph_put_snap_context(snapc);
@@ -1577,7 +1582,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)

goto out_unlocked;
out:
- if (iocb->ki_flags & IOCB_DIRECT)
+ if (direct_lock)
ceph_end_io_direct(inode);
else
ceph_end_io_write(inode);
--
2.20.1



2020-03-03 17:51:10

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 130/176] net: atlantic: fix potential error handling

From: Pavel Belous <[email protected]>

commit 380ec5b9af7f0d57dbf6ac067fd9f33cff2fef71 upstream.

Code inspection found that in case of a mapping error we return the
current 'ret' value. But besides signaling errors, that value is used to
count the number of descriptors allocated for the packet, so in that case
the map_skb function could return '1'.

Change it to return zero (the number of mapped descriptors for the skb).

Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: Pavel Belous <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -533,8 +533,10 @@ unsigned int aq_nic_map_skb(struct aq_ni
dx_buff->len,
DMA_TO_DEVICE);

- if (unlikely(dma_mapping_error(aq_nic_get_dev(self), dx_buff->pa)))
+ if (unlikely(dma_mapping_error(aq_nic_get_dev(self), dx_buff->pa))) {
+ ret = 0;
goto exit;
+ }

first = dx_buff;
dx_buff->len_pkt = skb->len;


2020-03-03 17:51:11

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 089/176] KVM: VMX: check descriptor table exits on instruction emulation

From: Oliver Upton <[email protected]>

commit 86f7e90ce840aa1db407d3ea6e9b3a52b2ce923c upstream.

KVM emulates UMIP on hardware that doesn't support it by setting the
'descriptor table exiting' VM-execution control and performing
instruction emulation. When running nested, this emulation is broken as
KVM refuses to emulate L2 instructions by default.

Correct this regression by allowing the emulation of descriptor table
instructions if L1 hasn't requested 'descriptor table exiting'.

Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode")
Reported-by: Jan Kiszka <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Cc: Jim Mattson <[email protected]>
Signed-off-by: Oliver Upton <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7179,6 +7179,7 @@ static int vmx_check_intercept_io(struct
else
intercept = nested_vmx_check_io_bitmaps(vcpu, port, size);

+ /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED. */
return intercept ? X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE;
}

@@ -7208,6 +7209,20 @@ static int vmx_check_intercept(struct kv
case x86_intercept_outs:
return vmx_check_intercept_io(vcpu, info);

+ case x86_intercept_lgdt:
+ case x86_intercept_lidt:
+ case x86_intercept_lldt:
+ case x86_intercept_ltr:
+ case x86_intercept_sgdt:
+ case x86_intercept_sidt:
+ case x86_intercept_sldt:
+ case x86_intercept_str:
+ if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_DESC))
+ return X86EMUL_CONTINUE;
+
+ /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED. */
+ break;
+
/* TODO: check more intercepts... */
default:
break;


2020-03-03 17:51:19

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 105/176] netfilter: ipset: Fix forceadd evaluation path

From: Jozsef Kadlecsik <[email protected]>

commit 8af1c6fbd9239877998c7f5a591cb2c88d41fb66 upstream.

When the forceadd option is enabled, the hash:* types should find and replace
the first entry in the bucket with the new one if there are no reusable
(deleted or timed out) entries. However, the position index was just not set
to zero and remained the invalid -1 if there were no reusable entries.

Reported-by: [email protected]
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/ipset/ip_set_hash_gen.h | 2 ++
1 file changed, 2 insertions(+)

--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -931,6 +931,8 @@ mtype_add(struct ip_set *set, void *valu
}
}
if (reuse || forceadd) {
+ if (j == -1)
+ j = 0;
data = ahash_data(n, j, set->dsize);
if (!deleted) {
#ifdef IP_SET_HASH_WITH_NETS


2020-03-03 17:51:21

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 072/176] bcache: ignore pending signals when creating gc and allocator thread

From: Coly Li <[email protected]>

[ Upstream commit 0b96da639a4874311e9b5156405f69ef9fc3bef8 ]

When running a cache set, all the bcache btree nodes of this cache set
will be checked by bch_btree_check(). If the bcache btree is very large,
iterating over all the btree nodes will occupy too much system memory and
the bcache registering process might be selected and killed by the system
OOM killer. kthread_run() will fail if the current process has a pending
signal, therefore the kthread creation in run_cache_set() for the gc and
allocator kernel threads will very probably fail for a very large
bcache btree.

Indeed such an OOM is safe and the registering process will exit after
the registration is done. Therefore this patch flushes pending signals
during the cache set start up, specifically in bch_cache_allocator_start()
and bch_gc_thread_start(), to make sure run_cache_set() won't fail for
large cached data sets.
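
Distilled, the pattern both call sites now use (a sketch; worker_fn,
data and "worker" are placeholders, not driver symbols):

	/* The OOM killer may have queued SIGKILL for the registering
	 * process; kthread_run() refuses to spawn while a signal is
	 * pending, so drop it first. This is safe because registration
	 * exits once it is done anyway.
	 */
	if (signal_pending(current))
		flush_signals(current);

	k = kthread_run(worker_fn, data, "worker");
	if (IS_ERR(k))
		return PTR_ERR(k);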

Signed-off-by: Coly Li <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/md/bcache/alloc.c | 18 ++++++++++++++++--
drivers/md/bcache/btree.c | 13 +++++++++++++
2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index a1df0d95151c6..8bc1faf71ff2f 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -67,6 +67,7 @@
#include <linux/blkdev.h>
#include <linux/kthread.h>
#include <linux/random.h>
+#include <linux/sched/signal.h>
#include <trace/events/bcache.h>

#define MAX_OPEN_BUCKETS 128
@@ -733,8 +734,21 @@ int bch_open_buckets_alloc(struct cache_set *c)

int bch_cache_allocator_start(struct cache *ca)
{
- struct task_struct *k = kthread_run(bch_allocator_thread,
- ca, "bcache_allocator");
+ struct task_struct *k;
+
+ /*
+ * In case previous btree check operation occupies too many
+ * system memory for bcache btree node cache, and the
+ * registering process is selected by OOM killer. Here just
+ * ignore the SIGKILL sent by OOM killer if there is, to
+ * avoid kthread_run() being failed by pending signals. The
+ * bcache registering process will exit after the registration
+ * done.
+ */
+ if (signal_pending(current))
+ flush_signals(current);
+
+ k = kthread_run(bch_allocator_thread, ca, "bcache_allocator");
if (IS_ERR(k))
return PTR_ERR(k);

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 14d6c33b0957e..78f0711a25849 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -34,6 +34,7 @@
#include <linux/random.h>
#include <linux/rcupdate.h>
#include <linux/sched/clock.h>
+#include <linux/sched/signal.h>
#include <linux/rculist.h>
#include <linux/delay.h>
#include <trace/events/bcache.h>
@@ -1917,6 +1918,18 @@ static int bch_gc_thread(void *arg)

int bch_gc_thread_start(struct cache_set *c)
{
+ /*
+ * In case previous btree check operation occupies too many
+ * system memory for bcache btree node cache, and the
+ * registering process is selected by OOM killer. Here just
+ * ignore the SIGKILL sent by OOM killer if there is, to
+ * avoid kthread_run() being failed by pending signals. The
+ * bcache registering process will exit after the registration
+ * done.
+ */
+ if (signal_pending(current))
+ flush_signals(current);
+
c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
return PTR_ERR_OR_ZERO(c->gc_thread);
}
--
2.20.1



2020-03-03 17:51:24

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 107/176] HID: alps: Fix an error handling path in alps_input_configured()

From: Christophe JAILLET <[email protected]>

commit 8d2e77b39b8fecb794e19cd006a12f90b14dd077 upstream.

There are two issues:
- if 'input_allocate_device()' fails and returns NULL, there is no need
to free anything and the 'input_free_device()' call is a no-op. It can
be axed.
- 'ret' is known to be 0 at this point, so we must set it to a
meaningful value before returning.

Fixes: 2562756dde55 ("HID: add Alps I2C HID Touchpad-Stick support")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/hid-alps.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hid/hid-alps.c
+++ b/drivers/hid/hid-alps.c
@@ -730,7 +730,7 @@ static int alps_input_configured(struct
if (data->has_sp) {
input2 = input_allocate_device();
if (!input2) {
- input_free_device(input2);
+ ret = -ENOMEM;
goto exit;
}



2020-03-03 17:51:27

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 109/176] RISC-V: Dont enable all interrupts in trap_init()

From: Anup Patel <[email protected]>

commit 6a1ce99dc4bde564e4a072936f9d41f4a439140e upstream.

Historically, we have been enabling all interrupts for each
HART in trap_init(). Ideally, we should only enable M-mode
interrupts for an M-mode kernel and S-mode interrupts for an S-mode
kernel in trap_init().

Currently, we get spurious S-mode interrupts on the Kendryte K210
board running an M-mode NO-MMU kernel because we are enabling all
interrupts in trap_init(). To fix this, we only enable the software
and external interrupts in trap_init(). In the future, trap_init()
will only enable the software interrupt and the PLIC driver will
enable the external interrupt using CPU notifiers.

Fixes: a4c3733d32a7 ("riscv: abstract out CSR names for supervisor vs machine mode")
Signed-off-by: Anup Patel <[email protected]>
Reviewed-by: Atish Patra <[email protected]>
Tested-by: Palmer Dabbelt <[email protected]> [QEMU virt machine with SMP]
[Palmer: Move the Fixes up to a newer commit]
Reviewed-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/riscv/kernel/traps.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -156,6 +156,6 @@ void __init trap_init(void)
csr_write(CSR_SCRATCH, 0);
/* Set the exception vector address */
csr_write(CSR_TVEC, &handle_exception);
- /* Enable all interrupts */
- csr_write(CSR_IE, -1);
+ /* Enable interrupts */
+ csr_write(CSR_IE, IE_SIE | IE_EIE);
}


2020-03-03 17:51:34

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 115/176] drm/i915/gvt: Fix orphan vgpu dmabuf_objs lifetime

From: Tina Zhang <[email protected]>

commit b549c252b1292aea959cd9b83537fcb9384a6112 upstream.

Deleting the dmabuf item's list head after releasing its container can lead
to a KASAN-reported issue:

BUG: KASAN: use-after-free in __list_del_entry_valid+0x15/0xf0
Read of size 8 at addr ffff88818a4598a8 by task kworker/u8:3/13119

So fix this issue by putting the deletion of the dmabuf_obj's list head
ahead of releasing its container.
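
In sketch form, the problematic ordering ('pos' points into the very
container being freed):

	dmabuf_obj = container_of(pos, struct intel_vgpu_dmabuf_obj, list);
	...
	kfree(dmabuf_obj);	/* frees the memory 'pos' points into */
	list_del(pos);		/* use-after-free: reads pos->prev/pos->next */

Moving list_del(pos) before the kfree(), as the diff below does, removes
the stale access.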

Fixes: dfb6ae4e14bd6 ("drm/i915/gvt: Handle orphan dmabuf_objs")
Signed-off-by: Tina Zhang <[email protected]>
Reviewed-by: Zhenyu Wang <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/i915/gvt/dmabuf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/gvt/dmabuf.c
+++ b/drivers/gpu/drm/i915/gvt/dmabuf.c
@@ -151,12 +151,12 @@ static void dmabuf_gem_object_free(struc
dmabuf_obj = container_of(pos,
struct intel_vgpu_dmabuf_obj, list);
if (dmabuf_obj == obj) {
+ list_del(pos);
intel_gvt_hypervisor_put_vfio_device(vgpu);
idr_remove(&vgpu->object_idr,
dmabuf_obj->dmabuf_id);
kfree(dmabuf_obj->info);
kfree(dmabuf_obj);
- list_del(pos);
break;
}
}


2020-03-03 17:51:36

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 144/176] f2fs: fix to add swap extent correctly

From: Chao Yu <[email protected]>

commit 3e5e479a39ce9ed60cd63f7565cc1d9da77c2a4e upstream.

As Youling reported in mailing list:

https://www.linuxquestions.org/questions/linux-newbie-8/the-file-system-f2fs-is-broken-4175666043/

https://www.linux.org/threads/the-file-system-f2fs-is-broken.26490/

There is a test case that can corrupt an f2fs image:
- dd if=/dev/zero of=/swapfile bs=1M count=4096
- chmod 600 /swapfile
- mkswap /swapfile
- swapon --discard /swapfile

The root cause is that f2fs_swap_activate() intends to return a zero value
to setup_swap_extents() to enable SWP_FS mode (the swap file goes through
the fs); in this flow, setup_swap_extents() sets up a swap extent with the
wrong block address range, resulting in discard_swap() erasing an incorrect
address range.

Because f2fs_swap_activate() has pinned the swapfile, its data block
addresses will not change, so it is safe to let swap handle IO through
the raw device. We can therefore get rid of SWP_FS mode and initialize
the swap extents inside f2fs_swap_activate(); this way, the later
discard_swap() can trim the right address range.
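
For context, a sketch of the setup_swap_extents() contract as described
above (paraphrased, not a quote from mm code):

	ret == 0   /* no extents added: mm builds them itself and sets
		    * SWP_FS, routing swap IO through the filesystem */
	ret > 0    /* 'ret' extents already added via add_swap_extent():
		    * swap IO and discard go straight to the block device */

Returning the extent count from check_swap_activate() is what keeps
discard_swap() operating on the correct on-disk range.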

Fixes: 4969c06a0d83 ("f2fs: support swap file w/ DIO")
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/f2fs/data.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)

--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3132,7 +3132,8 @@ int f2fs_migrate_page(struct address_spa

#ifdef CONFIG_SWAP
/* Copied from generic_swapfile_activate() to check any holes */
-static int check_swap_activate(struct file *swap_file, unsigned int max)
+static int check_swap_activate(struct swap_info_struct *sis,
+ struct file *swap_file, sector_t *span)
{
struct address_space *mapping = swap_file->f_mapping;
struct inode *inode = mapping->host;
@@ -3143,6 +3144,8 @@ static int check_swap_activate(struct fi
sector_t last_block;
sector_t lowest_block = -1;
sector_t highest_block = 0;
+ int nr_extents = 0;
+ int ret;

blkbits = inode->i_blkbits;
blocks_per_page = PAGE_SIZE >> blkbits;
@@ -3154,7 +3157,8 @@ static int check_swap_activate(struct fi
probe_block = 0;
page_no = 0;
last_block = i_size_read(inode) >> blkbits;
- while ((probe_block + blocks_per_page) <= last_block && page_no < max) {
+ while ((probe_block + blocks_per_page) <= last_block &&
+ page_no < sis->max) {
unsigned block_in_page;
sector_t first_block;

@@ -3194,13 +3198,27 @@ static int check_swap_activate(struct fi
highest_block = first_block;
}

+ /*
+ * We found a PAGE_SIZE-length, PAGE_SIZE-aligned run of blocks
+ */
+ ret = add_swap_extent(sis, page_no, 1, first_block);
+ if (ret < 0)
+ goto out;
+ nr_extents += ret;
page_no++;
probe_block += blocks_per_page;
reprobe:
continue;
}
- return 0;
-
+ ret = nr_extents;
+ *span = 1 + highest_block - lowest_block;
+ if (page_no == 0)
+ page_no = 1; /* force Empty message */
+ sis->max = page_no;
+ sis->pages = page_no - 1;
+ sis->highest_bit = page_no - 1;
+out:
+ return ret;
bad_bmap:
pr_err("swapon: swapfile has holes\n");
return -EINVAL;
@@ -3222,14 +3240,14 @@ static int f2fs_swap_activate(struct swa
if (ret)
return ret;

- ret = check_swap_activate(file, sis->max);
- if (ret)
+ ret = check_swap_activate(sis, file, span);
+ if (ret < 0)
return ret;

set_inode_flag(inode, FI_PIN_FILE);
f2fs_precache_extents(inode);
f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
- return 0;
+ return ret;
}

static void f2fs_swap_deactivate(struct file *file)


2020-03-03 17:51:46

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 147/176] drivers: net: xgene: Fix the order of the arguments of alloc_etherdev_mqs()

From: Christophe JAILLET <[email protected]>

commit 5a44c71ccda60a50073c5d7fe3f694cdfa3ab0c2 upstream.

'alloc_etherdev_mqs()' expects first 'tx', then 'rx'. The semantic here
looks reversed.

Reorder the arguments passed to 'alloc_etherdev_mqs()' in order to keep
the correct semantic.

In fact, this is a no-op because both XGENE_NUM_[RT]X_RING are 8.
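
For reference, the prototype in <linux/etherdevice.h> takes the queue
counts in tx-then-rx order:

	struct net_device *alloc_etherdev_mqs(int sizeof_priv,
					      unsigned int txqs,
					      unsigned int rxqs);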

Fixes: 107dec2749fe ("drivers: net: xgene: Add support for multiple queues")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -2020,7 +2020,7 @@ static int xgene_enet_probe(struct platf
int ret;

ndev = alloc_etherdev_mqs(sizeof(struct xgene_enet_pdata),
- XGENE_NUM_RX_RING, XGENE_NUM_TX_RING);
+ XGENE_NUM_TX_RING, XGENE_NUM_RX_RING);
if (!ndev)
return -ENOMEM;



2020-03-03 17:51:46

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 114/176] MIPS: cavium_octeon: Fix syncw generation.

From: Mark Tomlinson <[email protected]>

commit 97e914b7de3c943011779b979b8093fdc0d85722 upstream.

The Cavium Octeon CPU uses a special sync instruction for implementing
wmb, and due to a CPU bug, the instruction must appear twice. A macro
had been defined to hide this:

#define __SYNC_rpt(type) (1 + (type == __SYNC_wmb))

which was intended to evaluate to 2 for __SYNC_wmb, and 1 for any other
type of sync. However, this expression is evaluated by the assembler,
and not the compiler, and the result of '==' in the assembler is 0 or
-1, not 0 or 1 as it is in C. The net result was wmb() producing no code
at all. The simple fix in this patch is to change the '+' to '-'.
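
A quick worked check under GAS semantics, where a true comparison
evaluates to -1:

	type == __SYNC_wmb:   old: 1 + (-1) = 0  -> sync emitted 0 times
	                      new: 1 - (-1) = 2  -> sync emitted twice
	any other type:       old: 1 + 0 = 1,  new: 1 - 0 = 1  (unchanged)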

Fixes: bf92927251b3 ("MIPS: barrier: Add __SYNC() infrastructure")
Signed-off-by: Mark Tomlinson <[email protected]>
Tested-by: Chris Packham <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/mips/include/asm/sync.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/mips/include/asm/sync.h
+++ b/arch/mips/include/asm/sync.h
@@ -155,9 +155,11 @@
* effective barrier as noted by commit 6b07d38aaa52 ("MIPS: Octeon: Use
* optimized memory barrier primitives."). Here we specify that the affected
* sync instructions should be emitted twice.
+ * Note that this expression is evaluated by the assembler (not the compiler),
+ * and that the assembler evaluates '==' as 0 or -1, not 0 or 1.
*/
#ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define __SYNC_rpt(type) (1 + (type == __SYNC_wmb))
+# define __SYNC_rpt(type) (1 - (type == __SYNC_wmb))
#else
# define __SYNC_rpt(type) 1
#endif


2020-03-03 17:51:47

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 104/176] vhost: Check docket sk_family instead of call getname

From: Eugenio Pérez <[email protected]>

commit 42d84c8490f9f0931786f1623191fcab397c3d64 upstream.

Doing so, we save one call to get data we already have in the struct.

Also, since there is no guarantee that getname won't write beyond the
size of the sockaddr_ll parameter, we add a little bit of security here.
It should not write beyond MAX_ADDR_LEN, but syzbot found that
ax25_getname writes more (72 bytes, the size of full_sockaddr_ax25,
versus the 20 + 32 bytes of sockaddr_ll + MAX_ADDR_LEN in the syzbot repro).
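
The arithmetic from the syzbot reproducer, spelled out:

	sizeof(struct sockaddr_ll) + MAX_ADDR_LEN = 20 + 32 = 52 bytes
	sizeof(struct full_sockaddr_ax25)         = 72 bytes

so ax25_getname() could write 20 bytes past the on-stack buffer; reading
sock->sk->sk_family avoids calling getname() entirely.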

Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Reported-by: [email protected]
Signed-off-by: Eugenio Pérez <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/vhost/net.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1414,10 +1414,6 @@ static int vhost_net_release(struct inod

static struct socket *get_raw_socket(int fd)
{
- struct {
- struct sockaddr_ll sa;
- char buf[MAX_ADDR_LEN];
- } uaddr;
int r;
struct socket *sock = sockfd_lookup(fd, &r);

@@ -1430,11 +1426,7 @@ static struct socket *get_raw_socket(int
goto err;
}

- r = sock->ops->getname(sock, (struct sockaddr *)&uaddr.sa, 0);
- if (r < 0)
- goto err;
-
- if (uaddr.sa.sll_family != AF_PACKET) {
+ if (sock->sk->sk_family != AF_PACKET) {
r = -EPFNOSUPPORT;
goto err;
}


2020-03-03 17:51:48

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 145/176] RDMA/hns: Simplify the calculation and usage of wqe idx for post verbs

From: Yixian Liu <[email protected]>

commit 4768820243d71d49f1044b3f911ac3d52bdb79af upstream.

Currently, the wqe idx is calculated repeatedly everywhere it is used. This
patch defines wqe_idx, calculates it only once, and then just uses it as
needed.

Fixes: 2d40788825ac ("RDMA/hns: Add support for processing send wr and receive wr")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Yixian Liu <[email protected]>
Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/infiniband/hw/hns/hns_roce_device.h | 3 -
drivers/infiniband/hw/hns/hns_roce_hw_v1.c | 37 ++++++++++--------------
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 43 +++++++++++-----------------
3 files changed, 35 insertions(+), 48 deletions(-)

--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -423,7 +423,7 @@ struct hns_roce_mr_table {
struct hns_roce_wq {
u64 *wrid; /* Work request ID */
spinlock_t lock;
- int wqe_cnt; /* WQE num */
+ u32 wqe_cnt; /* WQE num */
int max_gs;
int offset;
int wqe_shift; /* WQE size */
@@ -647,7 +647,6 @@ struct hns_roce_qp {
u8 sdb_en;
u32 doorbell_qpn;
u32 sq_signal_bits;
- u32 sq_next_wqe;
struct hns_roce_wq sq;

struct ib_umem *umem;
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -74,8 +74,8 @@ static int hns_roce_v1_post_send(struct
unsigned long flags = 0;
void *wqe = NULL;
__le32 doorbell[2];
+ u32 wqe_idx = 0;
int nreq = 0;
- u32 ind = 0;
int ret = 0;
u8 *smac;
int loopback;
@@ -88,7 +88,7 @@ static int hns_roce_v1_post_send(struct
}

spin_lock_irqsave(&qp->sq.lock, flags);
- ind = qp->sq_next_wqe;
+
for (nreq = 0; wr; ++nreq, wr = wr->next) {
if (hns_roce_wq_overflow(&qp->sq, nreq, qp->ibqp.send_cq)) {
ret = -ENOMEM;
@@ -96,6 +96,8 @@ static int hns_roce_v1_post_send(struct
goto out;
}

+ wqe_idx = (qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1);
+
if (unlikely(wr->num_sge > qp->sq.max_gs)) {
dev_err(dev, "num_sge=%d > qp->sq.max_gs=%d\n",
wr->num_sge, qp->sq.max_gs);
@@ -104,9 +106,8 @@ static int hns_roce_v1_post_send(struct
goto out;
}

- wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
- qp->sq.wrid[(qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1)] =
- wr->wr_id;
+ wqe = get_send_wqe(qp, wqe_idx);
+ qp->sq.wrid[wqe_idx] = wr->wr_id;

/* Corresponding to the RC and RD type wqe process separately */
if (ibqp->qp_type == IB_QPT_GSI) {
@@ -210,7 +211,6 @@ static int hns_roce_v1_post_send(struct
cpu_to_le32((wr->sg_list[1].addr) >> 32);
ud_sq_wqe->l_key1 =
cpu_to_le32(wr->sg_list[1].lkey);
- ind++;
} else if (ibqp->qp_type == IB_QPT_RC) {
u32 tmp_len = 0;

@@ -308,7 +308,6 @@ static int hns_roce_v1_post_send(struct
ctrl->flag |= cpu_to_le32(wr->num_sge <<
HNS_ROCE_WQE_SGE_NUM_BIT);
}
- ind++;
}
}

@@ -336,7 +335,6 @@ out:
doorbell[1] = sq_db.u32_8;

hns_roce_write64_k(doorbell, qp->sq.db_reg_l);
- qp->sq_next_wqe = ind;
}

spin_unlock_irqrestore(&qp->sq.lock, flags);
@@ -348,12 +346,6 @@ static int hns_roce_v1_post_recv(struct
const struct ib_recv_wr *wr,
const struct ib_recv_wr **bad_wr)
{
- int ret = 0;
- int nreq = 0;
- int ind = 0;
- int i = 0;
- u32 reg_val;
- unsigned long flags = 0;
struct hns_roce_rq_wqe_ctrl *ctrl = NULL;
struct hns_roce_wqe_data_seg *scat = NULL;
struct hns_roce_qp *hr_qp = to_hr_qp(ibqp);
@@ -361,9 +353,14 @@ static int hns_roce_v1_post_recv(struct
struct device *dev = &hr_dev->pdev->dev;
struct hns_roce_rq_db rq_db;
__le32 doorbell[2] = {0};
+ unsigned long flags = 0;
+ unsigned int wqe_idx;
+ int ret = 0;
+ int nreq = 0;
+ int i = 0;
+ u32 reg_val;

spin_lock_irqsave(&hr_qp->rq.lock, flags);
- ind = hr_qp->rq.head & (hr_qp->rq.wqe_cnt - 1);

for (nreq = 0; wr; ++nreq, wr = wr->next) {
if (hns_roce_wq_overflow(&hr_qp->rq, nreq,
@@ -373,6 +370,8 @@ static int hns_roce_v1_post_recv(struct
goto out;
}

+ wqe_idx = (hr_qp->rq.head + nreq) & (hr_qp->rq.wqe_cnt - 1);
+
if (unlikely(wr->num_sge > hr_qp->rq.max_gs)) {
dev_err(dev, "rq:num_sge=%d > qp->sq.max_gs=%d\n",
wr->num_sge, hr_qp->rq.max_gs);
@@ -381,7 +380,7 @@ static int hns_roce_v1_post_recv(struct
goto out;
}

- ctrl = get_recv_wqe(hr_qp, ind);
+ ctrl = get_recv_wqe(hr_qp, wqe_idx);

roce_set_field(ctrl->rwqe_byte_12,
RQ_WQE_CTRL_RWQE_BYTE_12_RWQE_SGE_NUM_M,
@@ -393,9 +392,7 @@ static int hns_roce_v1_post_recv(struct
for (i = 0; i < wr->num_sge; i++)
set_data_seg(scat + i, wr->sg_list + i);

- hr_qp->rq.wrid[ind] = wr->wr_id;
-
- ind = (ind + 1) & (hr_qp->rq.wqe_cnt - 1);
+ hr_qp->rq.wrid[wqe_idx] = wr->wr_id;
}

out:
@@ -2701,7 +2698,6 @@ static int hns_roce_v1_m_sqp(struct ib_q
hr_qp->rq.tail = 0;
hr_qp->sq.head = 0;
hr_qp->sq.tail = 0;
- hr_qp->sq_next_wqe = 0;
}

kfree(context);
@@ -3315,7 +3311,6 @@ static int hns_roce_v1_m_qp(struct ib_qp
hr_qp->rq.tail = 0;
hr_qp->sq.head = 0;
hr_qp->sq.tail = 0;
- hr_qp->sq_next_wqe = 0;
}
out:
kfree(context);
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -239,10 +239,10 @@ static int hns_roce_v2_post_send(struct
struct device *dev = hr_dev->dev;
struct hns_roce_v2_db sq_db;
struct ib_qp_attr attr;
- unsigned int sge_ind;
unsigned int owner_bit;
+ unsigned int sge_idx;
+ unsigned int wqe_idx;
unsigned long flags;
- unsigned int ind;
void *wqe = NULL;
bool loopback;
int attr_mask;
@@ -269,8 +269,7 @@ static int hns_roce_v2_post_send(struct
}

spin_lock_irqsave(&qp->sq.lock, flags);
- ind = qp->sq_next_wqe;
- sge_ind = qp->next_sge;
+ sge_idx = qp->next_sge;

for (nreq = 0; wr; ++nreq, wr = wr->next) {
if (hns_roce_wq_overflow(&qp->sq, nreq, qp->ibqp.send_cq)) {
@@ -279,6 +278,8 @@ static int hns_roce_v2_post_send(struct
goto out;
}

+ wqe_idx = (qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1);
+
if (unlikely(wr->num_sge > qp->sq.max_gs)) {
dev_err(dev, "num_sge=%d > qp->sq.max_gs=%d\n",
wr->num_sge, qp->sq.max_gs);
@@ -287,10 +288,8 @@ static int hns_roce_v2_post_send(struct
goto out;
}

- wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1));
- qp->sq.wrid[(qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1)] =
- wr->wr_id;
-
+ wqe = get_send_wqe(qp, wqe_idx);
+ qp->sq.wrid[wqe_idx] = wr->wr_id;
owner_bit =
~(((qp->sq.head + nreq) >> ilog2(qp->sq.wqe_cnt)) & 0x1);
tmp_len = 0;
@@ -373,7 +372,7 @@ static int hns_roce_v2_post_send(struct
roce_set_field(ud_sq_wqe->byte_20,
V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M,
V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S,
- sge_ind & (qp->sge.sge_cnt - 1));
+ sge_idx & (qp->sge.sge_cnt - 1));

roce_set_field(ud_sq_wqe->byte_24,
V2_UD_SEND_WQE_BYTE_24_UDPSPN_M,
@@ -423,8 +422,7 @@ static int hns_roce_v2_post_send(struct
memcpy(&ud_sq_wqe->dgid[0], &ah->av.dgid[0],
GID_LEN_V2);

- set_extend_sge(qp, wr, &sge_ind);
- ind++;
+ set_extend_sge(qp, wr, &sge_idx);
} else if (ibqp->qp_type == IB_QPT_RC) {
rc_sq_wqe = wqe;
memset(rc_sq_wqe, 0, sizeof(*rc_sq_wqe));
@@ -553,12 +551,10 @@ static int hns_roce_v2_post_send(struct
wr->num_sge);
} else if (wr->opcode != IB_WR_REG_MR) {
ret = set_rwqe_data_seg(ibqp, wr, rc_sq_wqe,
- wqe, &sge_ind, bad_wr);
+ wqe, &sge_idx, bad_wr);
if (ret)
goto out;
}
-
- ind++;
} else {
dev_err(dev, "Illegal qp_type(0x%x)\n", ibqp->qp_type);
spin_unlock_irqrestore(&qp->sq.lock, flags);
@@ -588,8 +584,7 @@ out:

hns_roce_write64(hr_dev, (__le32 *)&sq_db, qp->sq.db_reg_l);

- qp->sq_next_wqe = ind;
- qp->next_sge = sge_ind;
+ qp->next_sge = sge_idx;

if (qp->state == IB_QPS_ERR) {
attr_mask = IB_QP_STATE;
@@ -623,13 +618,12 @@ static int hns_roce_v2_post_recv(struct
unsigned long flags;
void *wqe = NULL;
int attr_mask;
+ u32 wqe_idx;
int ret = 0;
int nreq;
- int ind;
int i;

spin_lock_irqsave(&hr_qp->rq.lock, flags);
- ind = hr_qp->rq.head & (hr_qp->rq.wqe_cnt - 1);

if (hr_qp->state == IB_QPS_RESET) {
spin_unlock_irqrestore(&hr_qp->rq.lock, flags);
@@ -645,6 +639,8 @@ static int hns_roce_v2_post_recv(struct
goto out;
}

+ wqe_idx = (hr_qp->rq.head + nreq) & (hr_qp->rq.wqe_cnt - 1);
+
if (unlikely(wr->num_sge > hr_qp->rq.max_gs)) {
dev_err(dev, "rq:num_sge=%d > qp->sq.max_gs=%d\n",
wr->num_sge, hr_qp->rq.max_gs);
@@ -653,7 +649,7 @@ static int hns_roce_v2_post_recv(struct
goto out;
}

- wqe = get_recv_wqe(hr_qp, ind);
+ wqe = get_recv_wqe(hr_qp, wqe_idx);
dseg = (struct hns_roce_v2_wqe_data_seg *)wqe;
for (i = 0; i < wr->num_sge; i++) {
if (!wr->sg_list[i].length)
@@ -669,8 +665,8 @@ static int hns_roce_v2_post_recv(struct

/* rq support inline data */
if (hr_dev->caps.flags & HNS_ROCE_CAP_FLAG_RQ_INLINE) {
- sge_list = hr_qp->rq_inl_buf.wqe_list[ind].sg_list;
- hr_qp->rq_inl_buf.wqe_list[ind].sge_cnt =
+ sge_list = hr_qp->rq_inl_buf.wqe_list[wqe_idx].sg_list;
+ hr_qp->rq_inl_buf.wqe_list[wqe_idx].sge_cnt =
(u32)wr->num_sge;
for (i = 0; i < wr->num_sge; i++) {
sge_list[i].addr =
@@ -679,9 +675,7 @@ static int hns_roce_v2_post_recv(struct
}
}

- hr_qp->rq.wrid[ind] = wr->wr_id;
-
- ind = (ind + 1) & (hr_qp->rq.wqe_cnt - 1);
+ hr_qp->rq.wrid[wqe_idx] = wr->wr_id;
}

out:
@@ -4464,7 +4458,6 @@ static int hns_roce_v2_modify_qp(struct
hr_qp->rq.tail = 0;
hr_qp->sq.head = 0;
hr_qp->sq.tail = 0;
- hr_qp->sq_next_wqe = 0;
hr_qp->next_sge = 0;
if (hr_qp->rq.wqe_cnt)
*hr_qp->rdb.db_record = 0;


2020-03-03 17:51:51

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 149/176] kprobes: Set unoptimized flag after unoptimizing code

From: Masami Hiramatsu <[email protected]>

commit f66c0447cca1281116224d474cdb37d6a18e4b5b upstream.

Set the unoptimized flag after confirming the code is completely
unoptimized. Without this fix, when a kprobe hits the intermediate
modified instruction (the first byte is replaced by an INT3, but
later bytes can still be a jump address operand) while unoptimizing,
it can return to the middle byte of the modified code, which causes
an invalid instruction exception in the kernel.

Usually, this is a rare case, but if we put a probe on the function
call while text patching, it always causes a kernel panic as below:

# echo p text_poke+5 > kprobe_events
# echo 1 > events/kprobes/enable
# echo 0 > events/kprobes/enable

invalid opcode: 0000 [#1] PREEMPT SMP PTI
RIP: 0010:text_poke+0x9/0x50
Call Trace:
arch_unoptimize_kprobe+0x22/0x28
arch_unoptimize_kprobes+0x39/0x87
kprobe_optimizer+0x6e/0x290
process_one_work+0x2a0/0x610
worker_thread+0x28/0x3d0
? process_one_work+0x610/0x610
kthread+0x10d/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x3a/0x50

text_poke() is used for patching the code in optprobes.

This can happen even if we blacklist text_poke() and other functions,
because there is a small time window during which we show the intermediate
code to other CPUs.
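
A minimal user-space sketch of the ordering this patch enforces;
kprobe_stub and FLAG_OPTIMIZED are illustrative stand-ins, not the
kernel's types:

#include <stdio.h>

#define FLAG_OPTIMIZED 0x1

struct kprobe_stub { unsigned int flags; };   /* illustrative only */

static void arch_unoptimize(struct kprobe_stub *kp)
{
	/* the text patching happens here; a concurrent hit can still land
	 * in the detour, so the probe must still look OPTIMIZED */
	printf("unpatching while flags=%#x\n", kp->flags);
}

static void unoptimize(struct kprobe_stub *kp)
{
	arch_unoptimize(kp);            /* first: fully revert the code */
	kp->flags &= ~FLAG_OPTIMIZED;   /* then: clear the flag */
}

int main(void)
{
	struct kprobe_stub kp = { .flags = FLAG_OPTIMIZED };

	unoptimize(&kp);
	return 0;
}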

[ mingo: Edited the changelog. ]

Tested-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: 6274de4984a6 ("kprobes: Support delayed unoptimizing")
Link: https://lkml.kernel.org/r/157483422375.25881.13508326028469515760.stgit@devnote2
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/kprobes.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -510,6 +510,8 @@ static void do_unoptimize_kprobes(void)
arch_unoptimize_kprobes(&unoptimizing_list, &freeing_list);
/* Loop free_list for disarming */
list_for_each_entry_safe(op, tmp, &freeing_list, list) {
+ /* Switching from detour code to origin */
+ op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
/* Disarm probes if marked disabled */
if (kprobe_disabled(&op->kp))
arch_disarm_kprobe(&op->kp);
@@ -665,6 +667,7 @@ static void force_unoptimize_kprobe(stru
{
lockdep_assert_cpus_held();
arch_unoptimize_kprobe(op);
+ op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
if (kprobe_disabled(&op->kp))
arch_disarm_kprobe(&op->kp);
}
@@ -681,7 +684,6 @@ static void unoptimize_kprobe(struct kpr
if (!kprobe_optimized(p))
return;

- op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
if (!list_empty(&op->list)) {
if (optprobe_queued_unopt(op)) {
/* Queued in unoptimizing queue */


2020-03-03 17:51:51

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 150/176] lib/vdso: Make __arch_update_vdso_data() logic understandable

From: Thomas Gleixner <[email protected]>

commit 9a6b55ac4a44060bcb782baf002859b2a2c63267 upstream.

The function name suggests that this is a boolean checking whether the
architecture asks for an update of the VDSO data, but it works the other
way round. To spare further confusion, invert the logic.
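
A tiny stand-alone sketch of the inverted predicate; cntvct_ok here is
an illustrative flag standing in for the ARM-specific variable:

#include <stdbool.h>
#include <stdio.h>

static bool cntvct_ok = true;   /* illustrative stand-in flag */

/* new semantics: true means "do update the vdso data" */
static bool arch_update_vdso_data(void)
{
	return cntvct_ok;           /* the old version returned !cntvct_ok */
}

int main(void)
{
	if (!arch_update_vdso_data())
		printf("skip vdso data update\n");
	else
		printf("update vdso data\n");
	return 0;
}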

Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/include/asm/vdso/vsyscall.h | 4 ++--
include/asm-generic/vdso/vsyscall.h | 4 ++--
kernel/time/vsyscall.c | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)

--- a/arch/arm/include/asm/vdso/vsyscall.h
+++ b/arch/arm/include/asm/vdso/vsyscall.h
@@ -34,9 +34,9 @@ struct vdso_data *__arm_get_k_vdso_data(
#define __arch_get_k_vdso_data __arm_get_k_vdso_data

static __always_inline
-int __arm_update_vdso_data(void)
+bool __arm_update_vdso_data(void)
{
- return !cntvct_ok;
+ return cntvct_ok;
}
#define __arch_update_vdso_data __arm_update_vdso_data

--- a/include/asm-generic/vdso/vsyscall.h
+++ b/include/asm-generic/vdso/vsyscall.h
@@ -12,9 +12,9 @@ static __always_inline struct vdso_data
#endif /* __arch_get_k_vdso_data */

#ifndef __arch_update_vdso_data
-static __always_inline int __arch_update_vdso_data(void)
+static __always_inline bool __arch_update_vdso_data(void)
{
- return 0;
+ return true;
}
#endif /* __arch_update_vdso_data */

--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -84,7 +84,7 @@ void update_vsyscall(struct timekeeper *
struct vdso_timestamp *vdso_ts;
u64 nsec;

- if (__arch_update_vdso_data()) {
+ if (!__arch_update_vdso_data()) {
/*
* Some architectures might want to skip the update of the
* data page.


2020-03-03 17:51:53

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 151/176] lib/vdso: Update coarse timekeeper unconditionally

From: Thomas Gleixner <[email protected]>

commit 9f24c540f7f8eb3a981528da9a9a636a5bdf5987 upstream.

The low resolution parts of the VDSO, i.e.:

clock_gettime(CLOCK_*_COARSE), clock_getres(), time()

can be used even if there is no VDSO capable clocksource.

But if an architecture opts out of the VDSO data update, then this
information becomes stale. This affects ARM when there is no architected
timer available. The lack of update causes userspace to use stale data
forever.

Make the update of the low resolution parts unconditional and only skip
the update of the high resolution parts if the architecture requests it.
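
For context, the affected paths are the ordinary coarse
clock_gettime()/clock_getres() calls; a plain user-space demo that
exercises them (nothing kernel-side here):

#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>

int main(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) == 0)
		printf("coarse monotonic: %lld.%09ld\n",
		       (long long)ts.tv_sec, ts.tv_nsec);
	if (clock_getres(CLOCK_MONOTONIC_COARSE, &ts) == 0)
		printf("coarse resolution: %ld ns\n", ts.tv_nsec);
	return 0;
}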

Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation")
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/time/vsyscall.c | 37 +++++++++++++++++--------------------
1 file changed, 17 insertions(+), 20 deletions(-)

--- a/kernel/time/vsyscall.c
+++ b/kernel/time/vsyscall.c
@@ -28,11 +28,6 @@ static inline void update_vdso_data(stru
vdata[CS_RAW].mult = tk->tkr_raw.mult;
vdata[CS_RAW].shift = tk->tkr_raw.shift;

- /* CLOCK_REALTIME */
- vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_REALTIME];
- vdso_ts->sec = tk->xtime_sec;
- vdso_ts->nsec = tk->tkr_mono.xtime_nsec;
-
/* CLOCK_MONOTONIC */
vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_MONOTONIC];
vdso_ts->sec = tk->xtime_sec + tk->wall_to_monotonic.tv_sec;
@@ -70,12 +65,6 @@ static inline void update_vdso_data(stru
vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_TAI];
vdso_ts->sec = tk->xtime_sec + (s64)tk->tai_offset;
vdso_ts->nsec = tk->tkr_mono.xtime_nsec;
-
- /*
- * Read without the seqlock held by clock_getres().
- * Note: No need to have a second copy.
- */
- WRITE_ONCE(vdata[CS_HRES_COARSE].hrtimer_res, hrtimer_resolution);
}

void update_vsyscall(struct timekeeper *tk)
@@ -84,20 +73,17 @@ void update_vsyscall(struct timekeeper *
struct vdso_timestamp *vdso_ts;
u64 nsec;

- if (!__arch_update_vdso_data()) {
- /*
- * Some architectures might want to skip the update of the
- * data page.
- */
- return;
- }
-
/* copy vsyscall data */
vdso_write_begin(vdata);

vdata[CS_HRES_COARSE].clock_mode = __arch_get_clock_mode(tk);
vdata[CS_RAW].clock_mode = __arch_get_clock_mode(tk);

+ /* CLOCK_REALTIME also required for time() */
+ vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_REALTIME];
+ vdso_ts->sec = tk->xtime_sec;
+ vdso_ts->nsec = tk->tkr_mono.xtime_nsec;
+
/* CLOCK_REALTIME_COARSE */
vdso_ts = &vdata[CS_HRES_COARSE].basetime[CLOCK_REALTIME_COARSE];
vdso_ts->sec = tk->xtime_sec;
@@ -110,7 +96,18 @@ void update_vsyscall(struct timekeeper *
nsec = nsec + tk->wall_to_monotonic.tv_nsec;
vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec);

- update_vdso_data(vdata, tk);
+ /*
+ * Read without the seqlock held by clock_getres().
+ * Note: No need to have a second copy.
+ */
+ WRITE_ONCE(vdata[CS_HRES_COARSE].hrtimer_res, hrtimer_resolution);
+
+ /*
+ * Architectures can opt out of updating the high resolution part
+ * of the VDSO.
+ */
+ if (__arch_update_vdso_data())
+ update_vdso_data(vdata, tk);

__arch_update_vsyscall(vdata, tk);



2020-03-03 17:52:01

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 121/176] hv_netvsc: Fix unwanted wakeup in netvsc_attach()

From: Haiyang Zhang <[email protected]>

commit f6f13c125e05603f68f5bf31f045b95e6d493598 upstream.

When netvsc_attach() is called by operations like changing MTU, etc.,
an extra wakeup may happen while netvsc_attach() is calling
rndis_filter_device_add(), which sends rndis messages while the queue
is stopped by netvsc_detach(). The completion message will wake up
queue 0.

We can reproduce the issue by changing the MTU, etc.; the wake_queue
counter from "ethtool -S" will then increase beyond the stop_queue
counter:
stop_queue: 0
wake_queue: 1
The issue causes a queue wakeup and a counter increment, but no other
ill effects in the current code, so we haven't seen any network
problems so far.

To fix this, initialize tx_disable to true, and set it to false when
the NIC is ready to be attached or registered.
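
A minimal sketch of the initialization order after the fix; netvsc_stub
is an illustrative stand-in, not the driver's structures:

#include <stdbool.h>
#include <stdio.h>

struct netvsc_stub { bool tx_disable; };

static void alloc_net_device(struct netvsc_stub *d)
{
	d->tx_disable = true;   /* was false before the fix */
}

static void netvsc_attach_done(struct netvsc_stub *d)
{
	d->tx_disable = false;  /* enable TX only once really ready */
}

int main(void)
{
	struct netvsc_stub dev;

	alloc_net_device(&dev);
	printf("after alloc:  tx_disable=%d\n", dev.tx_disable);  /* 1 */
	netvsc_attach_done(&dev);
	printf("after attach: tx_disable=%d\n", dev.tx_disable);  /* 0 */
	return 0;
}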

Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
Signed-off-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/hyperv/netvsc.c | 2 +-
drivers/net/hyperv/netvsc_drv.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -99,7 +99,7 @@ static struct netvsc_device *alloc_net_d

init_waitqueue_head(&net_device->wait_drain);
net_device->destroy = false;
- net_device->tx_disable = false;
+ net_device->tx_disable = true;

net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT;
net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT;
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -977,6 +977,7 @@ static int netvsc_attach(struct net_devi
}

/* In any case device is now ready */
+ nvdev->tx_disable = false;
netif_device_attach(ndev);

/* Note: enable and attach happen when sub-channels setup */
@@ -2354,6 +2355,8 @@ static int netvsc_probe(struct hv_device
else
net->max_mtu = ETH_DATA_LEN;

+ nvdev->tx_disable = false;
+
ret = register_netdevice(net);
if (ret != 0) {
pr_err("Unable to register netdev.\n");


2020-03-03 17:52:04

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 155/176] perf maps: Add missing unlock to maps__insert() error case

From: Cengiz Can <[email protected]>

commit 85fc95d75970ee7dd8e01904e7fb1197c275ba6b upstream.

`tools/perf/util/map.c` has a function named `maps__insert` that
acquires a write lock if it's in a multithreaded context.

Even though this lock is released when the function completes
successfully, there's a branch that is executed when `maps_by_name ==
NULL` that returns from this function without releasing the write lock.

Added an `up_write` to release the lock when this happens.
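
The bug class is easy to show in user space; a minimal pthread sketch
of the same pattern with the unlock restored on the error path (names
are illustrative):

#include <pthread.h>
#include <stdlib.h>

static pthread_rwlock_t maps_lock = PTHREAD_RWLOCK_INITIALIZER;

static int insert(void **table)
{
	pthread_rwlock_wrlock(&maps_lock);

	void *maps_by_name = malloc(64);
	if (maps_by_name == NULL) {
		pthread_rwlock_unlock(&maps_lock);  /* the unlock this patch adds */
		return -1;
	}
	*table = maps_by_name;

	pthread_rwlock_unlock(&maps_lock);
	return 0;
}

int main(void)
{
	void *table = NULL;
	int ret = insert(&table);

	free(table);
	return ret ? 1 : 0;
}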

Fixes: a7c2b572e217 ("perf map_groups: Auto sort maps by name, if needed")
Signed-off-by: Cengiz Can <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/perf/util/map.c | 1 +
1 file changed, 1 insertion(+)

--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -549,6 +549,7 @@ void maps__insert(struct maps *maps, str

if (maps_by_name == NULL) {
__maps__free_maps_by_name(maps);
+ up_write(&maps->lock);
return;
}



2020-03-03 17:52:12

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 158/176] KVM: x86: Remove spurious kvm_mmu_unload() from vcpu destruction path

From: Sean Christopherson <[email protected]>

commit 9d979c7e6ff43ca3200ffcb74f57415fd633a2da upstream.

x86 does not load its MMU until KVM_RUN, which cannot be invoked until
after vCPU creation succeeds. Given that kvm_arch_vcpu_destroy() is
called if and only if vCPU creation fails, it is impossible for the MMU
to be loaded.

Note, the bogus kvm_mmu_unload() call was added during an unrelated
refactoring of vCPU allocation, i.e. was presumably added as an
opportunistic "fix" for a perceived leak.

Fixes: fb3f0f51d92d1 ("KVM: Dynamically allocate vcpus")
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/x86.c | 4 ----
1 file changed, 4 deletions(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9229,10 +9229,6 @@ void kvm_arch_vcpu_destroy(struct kvm_vc
{
vcpu->arch.apf.msr_val = 0;

- vcpu_load(vcpu);
- kvm_mmu_unload(vcpu);
- vcpu_put(vcpu);
-
kvm_arch_vcpu_free(vcpu);
}



2020-03-03 17:52:14

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 159/176] KVM: x86: Remove spurious clearing of async #PF MSR

From: Sean Christopherson <[email protected]>

commit 208050dac5ef4de5cb83ffcafa78499c94d0b5ad upstream.

Remove a bogus clearing of apf.msr_val from kvm_arch_vcpu_destroy().

apf.msr_val is only set to a non-zero value by kvm_pv_enable_async_pf(),
which is only reachable by kvm_set_msr_common(), i.e. by writing
MSR_KVM_ASYNC_PF_EN. KVM does not autonomously write said MSR, i.e. it
can only be written via KVM_SET_MSRS or KVM_RUN. Since KVM_SET_MSRS and
KVM_RUN are vcpu ioctls, they require a valid vcpu file descriptor.
kvm_arch_vcpu_destroy() is only called if KVM_CREATE_VCPU fails, and KVM
declares KVM_CREATE_VCPU successful once the vcpu fd is installed and
thus visible to userspace. Ergo, apf.msr_val cannot be non-zero when
kvm_arch_vcpu_destroy() is called.

Fixes: 344d9588a9df0 ("KVM: Add PV MSR to enable asynchronous page faults delivery.")
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/x86.c | 2 --
1 file changed, 2 deletions(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9227,8 +9227,6 @@ void kvm_arch_vcpu_postcreate(struct kvm

void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
{
- vcpu->arch.apf.msr_val = 0;
-
kvm_arch_vcpu_free(vcpu);
}



2020-03-03 17:52:17

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 157/176] KVM: X86: Fix kvm_bitmap_or_dest_vcpus() to use irq shorthand

From: Peter Xu <[email protected]>

commit b4b2963616bbd91ebb33148522552e1135de56ae upstream.

The 3rd parameter of kvm_apic_match_dest() is the irq shorthand,
rather than the irq delivery mode.

Fixes: 7ee30bc132c6 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/lapic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1150,7 +1150,7 @@ void kvm_bitmap_or_dest_vcpus(struct kvm
if (!kvm_apic_present(vcpu))
continue;
if (!kvm_apic_match_dest(vcpu, NULL,
- irq->delivery_mode,
+ irq->shorthand,
irq->dest_id,
irq->dest_mode))
continue;


2020-03-03 17:52:20

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 160/176] rcu: Allow only one expedited GP to run concurrently with wakeups

From: Neeraj Upadhyay <[email protected]>

commit 4bc6b745e5cbefed92c48071e28a5f41246d0470 upstream.

The current expedited RCU grace-period code expects that a task
requesting an expedited grace period cannot awaken until that grace
period has reached the wakeup phase. However, it is possible for a long
preemption to result in the waiting task never sleeping. For example,
consider the following sequence of events:

1. Task A starts an expedited grace period by invoking
synchronize_rcu_expedited(). It proceeds normally up to the
wait_event() near the end of that function, and is then preempted
(or interrupted or whatever).

2. The expedited grace period completes, and a kworker task starts
the awaken phase, having incremented the counter and acquired
the rcu_state structure's .exp_wake_mutex. This kworker task
is then preempted or interrupted or whatever.

3. Task A resumes and enters wait_event(), which notes that the
expedited grace period has completed, and thus doesn't sleep.

4. Task B starts an expedited grace period exactly as did Task A,
complete with the preemption (or whatever delay) just before
the call to wait_event().

5. The expedited grace period completes, and another kworker
task starts the awaken phase, having incremented the counter.
However, it blocks when attempting to acquire the rcu_state
structure's .exp_wake_mutex because step 2's kworker task has
not yet released it.

6. Steps 4 and 5 repeat, resulting in overflow of the rcu_node
structure's ->exp_wq[] array.

In theory, this is harmless. Tasks waiting on the various ->exp_wq[]
arrays will just be spuriously awakened, but they will sleep again
on noting that the rcu_state structure's ->expedited_sequence value has
not advanced far enough.

In practice, this wastes CPU time and is an accident waiting to happen.
This commit therefore moves the rcu_exp_gp_seq_end() call that officially
ends the expedited grace period (along with associated tracing) until
after the ->exp_wake_mutex has been acquired. This prevents Task A from
awakening prematurely, thus preventing more than one expedited grace
period from being in flight during a previous expedited grace period's
wakeup phase.
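
A minimal user-space sketch of the ordering the patch enforces, with a
pthread mutex standing in for the rcu_state structure's .exp_wake_mutex
(names are illustrative, not RCU internals):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t exp_wake_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long expedited_sequence;

static void exp_wait_wake(void)
{
	/* Before the fix the sequence was bumped here, before the mutex,
	 * so a waiter that never slept could observe the GP as done and
	 * start the next one while this wakeup phase was still pending. */
	pthread_mutex_lock(&exp_wake_mutex);
	__atomic_fetch_add(&expedited_sequence, 1, __ATOMIC_RELEASE);
	/* ... wake up waiters for this GP ... */
	pthread_mutex_unlock(&exp_wake_mutex);
}

int main(void)
{
	exp_wait_wake();
	printf("sequence now %lu\n", expedited_sequence);
	return 0;
}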

Fixes: 3b5f668e715b ("rcu: Overlap wakeups with next expedited grace period")
Signed-off-by: Neeraj Upadhyay <[email protected]>
[ paulmck: Added updated comment. ]
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/rcu/tree_exp.h | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -540,14 +540,13 @@ static void rcu_exp_wait_wake(unsigned l
struct rcu_node *rnp;

synchronize_sched_expedited_wait();
- rcu_exp_gp_seq_end();
- trace_rcu_exp_grace_period(rcu_state.name, s, TPS("end"));

- /*
- * Switch over to wakeup mode, allowing the next GP, but -only- the
- * next GP, to proceed.
- */
+ // Switch over to wakeup mode, allowing the next GP to proceed.
+ // End the previous grace period only after acquiring the mutex
+ // to ensure that only one GP runs concurrently with wakeups.
mutex_lock(&rcu_state.exp_wake_mutex);
+ rcu_exp_gp_seq_end();
+ trace_rcu_exp_grace_period(rcu_state.name, s, TPS("end"));

rcu_for_each_node_breadth_first(rnp) {
if (ULONG_CMP_LT(READ_ONCE(rnp->exp_seq_rq), s)) {


2020-03-03 17:52:26

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 137/176] namei: only return -ECHILD from follow_dotdot_rcu()

From: Aleksa Sarai <[email protected]>

commit 2b98149c2377bff12be5dd3ce02ae0506e2dd613 upstream.

It's over-zealous to return hard errors under RCU-walk here, given that
a REF-walk will be triggered for all other cases handling ".." under
RCU.

The original purpose of this check was to ensure that if a rename occurs
such that a directory is moved outside of the bind-mount which the
resolution started in, it would be detected and blocked to avoid being
able to mess with paths outside of the bind-mount. However, triggering a
new REF-walk is just as effective a solution.

Cc: "Eric W. Biederman" <[email protected]>
Fixes: 397d425dc26d ("vfs: Test for and handle paths that are unreachable from their mnt_root")
Suggested-by: Al Viro <[email protected]>
Signed-off-by: Aleksa Sarai <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1367,7 +1367,7 @@ static int follow_dotdot_rcu(struct name
nd->path.dentry = parent;
nd->seq = seq;
if (unlikely(!path_connected(&nd->path)))
- return -ENOENT;
+ return -ECHILD;
break;
} else {
struct mount *mnt = real_mount(nd->path.mnt);


2020-03-03 17:52:47

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 173/176] mm/huge_memory.c: use head to check huge zero page

From: Wei Yang <[email protected]>

commit cb829624867b5ab10bc6a7036d183b1b82bfe9f8 upstream.

The page could be a tail page; if this is the case, this BUG_ON will
never be triggered.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")

Signed-off-by: Wei Yang <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2712,7 +2712,7 @@ int split_huge_page_to_list(struct page
unsigned long flags;
pgoff_t end;

- VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
+ VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
VM_BUG_ON_PAGE(!PageLocked(page), page);
VM_BUG_ON_PAGE(!PageCompound(page), page);



2020-03-03 17:52:49

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 156/176] x86/resctrl: Check monitoring static key in the MBM overflow handler

From: Xiaochen Shen <[email protected]>

commit 536a0d8e79fb928f2735db37dda95682b6754f9a upstream.

Currently, there are three static keys in the resctrl file system:
rdt_mon_enable_key and rdt_alloc_enable_key indicate if the monitoring
feature and the allocation feature are enabled, respectively. The
rdt_enable_key is enabled when either the monitoring feature or the
allocation feature is enabled.

If no monitoring feature is present (either hardware doesn't support a
monitoring feature or the feature is disabled by the kernel command line
option "rdt="), rdt_enable_key is still enabled but rdt_mon_enable_key
is disabled.

MBM is a monitoring feature. The MBM overflow handler intends to
check whether the monitoring feature is enabled so that it can return
quickly when it is not.

So check the rdt_mon_enable_key in it instead of the rdt_enable_key, as
the former is the more accurate check.

[ bp: Massage commit message. ]

Fixes: e33026831bdb ("x86/intel_rdt/mbm: Handle counter overflow")
Signed-off-by: Xiaochen Shen <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/monitor.c | 4 ++--
2 files changed, 3 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -57,6 +57,7 @@ static inline struct rdt_fs_context *rdt
}

DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
+DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);

/**
* struct mon_evt - Entry in the event list of a resource
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -514,7 +514,7 @@ void mbm_handle_overflow(struct work_str

mutex_lock(&rdtgroup_mutex);

- if (!static_branch_likely(&rdt_enable_key))
+ if (!static_branch_likely(&rdt_mon_enable_key))
goto out_unlock;

d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3]);
@@ -543,7 +543,7 @@ void mbm_setup_overflow_handler(struct r
unsigned long delay = msecs_to_jiffies(delay_ms);
int cpu;

- if (!static_branch_likely(&rdt_enable_key))
+ if (!static_branch_likely(&rdt_mon_enable_key))
return;
cpu = cpumask_any(&dom->cpu_mask);
dom->mbm_work_cpu = cpu;


2020-03-03 17:52:53

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 146/176] RDMA/hns: Bugfix for posting a wqe with sge

From: Lijun Ou <[email protected]>

commit 468d020e2f02867b8ec561461a1689cd4365e493 upstream.

The driver should first check whether the sge is valid, then fill the
valid sge and the calculated total into hardware; otherwise invalid
sges will cause an error.
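
A stand-alone sketch of the validity filter being added; sge_stub is
illustrative, not the verbs structures:

#include <stdio.h>

struct sge_stub { unsigned int length; };

int main(void)
{
	struct sge_stub sg_list[] = { { 64 }, { 0 }, { 128 } };
	int num_sge = 3, valid_num_sge = 0;
	unsigned int tmp_len = 0;

	for (int i = 0; i < num_sge; i++) {
		if (sg_list[i].length) {    /* skip zero-length entries */
			tmp_len += sg_list[i].length;
			valid_num_sge++;
		}
	}
	/* program valid_num_sge and tmp_len, not num_sge, into the WQE */
	printf("valid=%d total=%u\n", valid_num_sge, tmp_len); /* valid=2 total=192 */
	return 0;
}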

Fixes: 52e3b42a2f58 ("RDMA/hns: Filter for zero length of sge in hip08 kernel mode")
Fixes: 7bdee4158b37 ("RDMA/hns: Fill sq wqe context of ud type in hip08")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lijun Ou <[email protected]>
Signed-off-by: Weihang Li <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 41 +++++++++++++++++------------
1 file changed, 25 insertions(+), 16 deletions(-)

--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -110,7 +110,7 @@ static void set_atomic_seg(struct hns_ro
}

static void set_extend_sge(struct hns_roce_qp *qp, const struct ib_send_wr *wr,
- unsigned int *sge_ind)
+ unsigned int *sge_ind, int valid_num_sge)
{
struct hns_roce_v2_wqe_data_seg *dseg;
struct ib_sge *sg;
@@ -123,7 +123,7 @@ static void set_extend_sge(struct hns_ro

if (qp->ibqp.qp_type == IB_QPT_RC || qp->ibqp.qp_type == IB_QPT_UC)
num_in_wqe = HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE;
- extend_sge_num = wr->num_sge - num_in_wqe;
+ extend_sge_num = valid_num_sge - num_in_wqe;
sg = wr->sg_list + num_in_wqe;
shift = qp->hr_buf.page_shift;

@@ -159,14 +159,16 @@ static void set_extend_sge(struct hns_ro
static int set_rwqe_data_seg(struct ib_qp *ibqp, const struct ib_send_wr *wr,
struct hns_roce_v2_rc_send_wqe *rc_sq_wqe,
void *wqe, unsigned int *sge_ind,
+ int valid_num_sge,
const struct ib_send_wr **bad_wr)
{
struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device);
struct hns_roce_v2_wqe_data_seg *dseg = wqe;
struct hns_roce_qp *qp = to_hr_qp(ibqp);
+ int j = 0;
int i;

- if (wr->send_flags & IB_SEND_INLINE && wr->num_sge) {
+ if (wr->send_flags & IB_SEND_INLINE && valid_num_sge) {
if (le32_to_cpu(rc_sq_wqe->msg_len) >
hr_dev->caps.max_sq_inline) {
*bad_wr = wr;
@@ -190,7 +192,7 @@ static int set_rwqe_data_seg(struct ib_q
roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_INLINE_S,
1);
} else {
- if (wr->num_sge <= HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE) {
+ if (valid_num_sge <= HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE) {
for (i = 0; i < wr->num_sge; i++) {
if (likely(wr->sg_list[i].length)) {
set_data_seg_v2(dseg, wr->sg_list + i);
@@ -203,19 +205,21 @@ static int set_rwqe_data_seg(struct ib_q
V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S,
(*sge_ind) & (qp->sge.sge_cnt - 1));

- for (i = 0; i < HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE; i++) {
+ for (i = 0; i < wr->num_sge &&
+ j < HNS_ROCE_V2_UC_RC_SGE_NUM_IN_WQE; i++) {
if (likely(wr->sg_list[i].length)) {
set_data_seg_v2(dseg, wr->sg_list + i);
dseg++;
+ j++;
}
}

- set_extend_sge(qp, wr, sge_ind);
+ set_extend_sge(qp, wr, sge_ind, valid_num_sge);
}

roce_set_field(rc_sq_wqe->byte_16,
V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M,
- V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, wr->num_sge);
+ V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge);
}

return 0;
@@ -243,6 +247,7 @@ static int hns_roce_v2_post_send(struct
unsigned int sge_idx;
unsigned int wqe_idx;
unsigned long flags;
+ int valid_num_sge;
void *wqe = NULL;
bool loopback;
int attr_mask;
@@ -292,8 +297,16 @@ static int hns_roce_v2_post_send(struct
qp->sq.wrid[wqe_idx] = wr->wr_id;
owner_bit =
~(((qp->sq.head + nreq) >> ilog2(qp->sq.wqe_cnt)) & 0x1);
+ valid_num_sge = 0;
tmp_len = 0;

+ for (i = 0; i < wr->num_sge; i++) {
+ if (likely(wr->sg_list[i].length)) {
+ tmp_len += wr->sg_list[i].length;
+ valid_num_sge++;
+ }
+ }
+
/* Corresponding to the QP type, wqe process separately */
if (ibqp->qp_type == IB_QPT_GSI) {
ud_sq_wqe = wqe;
@@ -329,9 +342,6 @@ static int hns_roce_v2_post_send(struct
V2_UD_SEND_WQE_BYTE_4_OPCODE_S,
HNS_ROCE_V2_WQE_OP_SEND);

- for (i = 0; i < wr->num_sge; i++)
- tmp_len += wr->sg_list[i].length;
-
ud_sq_wqe->msg_len =
cpu_to_le32(le32_to_cpu(ud_sq_wqe->msg_len) + tmp_len);

@@ -367,7 +377,7 @@ static int hns_roce_v2_post_send(struct
roce_set_field(ud_sq_wqe->byte_16,
V2_UD_SEND_WQE_BYTE_16_SGE_NUM_M,
V2_UD_SEND_WQE_BYTE_16_SGE_NUM_S,
- wr->num_sge);
+ valid_num_sge);

roce_set_field(ud_sq_wqe->byte_20,
V2_UD_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M,
@@ -422,12 +432,10 @@ static int hns_roce_v2_post_send(struct
memcpy(&ud_sq_wqe->dgid[0], &ah->av.dgid[0],
GID_LEN_V2);

- set_extend_sge(qp, wr, &sge_idx);
+ set_extend_sge(qp, wr, &sge_idx, valid_num_sge);
} else if (ibqp->qp_type == IB_QPT_RC) {
rc_sq_wqe = wqe;
memset(rc_sq_wqe, 0, sizeof(*rc_sq_wqe));
- for (i = 0; i < wr->num_sge; i++)
- tmp_len += wr->sg_list[i].length;

rc_sq_wqe->msg_len =
cpu_to_le32(le32_to_cpu(rc_sq_wqe->msg_len) + tmp_len);
@@ -548,10 +556,11 @@ static int hns_roce_v2_post_send(struct
roce_set_field(rc_sq_wqe->byte_16,
V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M,
V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S,
- wr->num_sge);
+ valid_num_sge);
} else if (wr->opcode != IB_WR_REG_MR) {
ret = set_rwqe_data_seg(ibqp, wr, rc_sq_wqe,
- wqe, &sge_idx, bad_wr);
+ wqe, &sge_idx,
+ valid_num_sge, bad_wr);
if (ret)
goto out;
}


2020-03-03 17:52:53

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 174/176] mm, thp: fix defrag setting if newline is not used

From: David Rientjes <[email protected]>

commit f42f25526502d851d0e3ca1e46297da8aafce8a7 upstream.

If thp defrag setting "defer" is used and a newline is *not* used when
writing to the sysfs file, this is interpreted as the "defer+madvise"
option.

This is because we do prefix matching and if five characters are written
without a newline, the current code ends up comparing to the first five
bytes of the "defer+madvise" option and using that instead.

Use the more appropriate sysfs_streq() that handles the trailing newline
for us. Since this doubles as a nice cleanup, do it in enabled_store()
as well.

The current implementation relies on prefix matching: the number of
bytes compared is either the number of bytes written or the length of
the option being compared. With a newline, "defer\n" does not match
"defer+madvise"; without a newline, however, "defer" is considered to
match "defer+madvise" (prefix matching only compares the first five
bytes). The end result is that writing "defer" is broken unless it has
an additional trailing character.

This means that writing "madv" in the past would match and set
"madvise". With strict checking, that no longer is the case but it is
unlikely anybody is currently doing this.
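
A small user-space demo of the pitfall and the fix follows;
prefix_match() mimics the old memcmp()-based comparison and
sysfs_streq_demo() approximates the kernel's sysfs_streq(), both
written here purely for illustration:

#include <stdio.h>
#include <string.h>

static int prefix_match(const char *buf, size_t count, const char *opt)
{
	size_t n = strlen(opt) < count ? strlen(opt) : count;

	return memcmp(opt, buf, n) == 0;
}

static int sysfs_streq_demo(const char *buf, const char *opt)
{
	size_t n = strlen(opt);

	return strncmp(buf, opt, n) == 0 &&
	       (buf[n] == '\0' || (buf[n] == '\n' && buf[n + 1] == '\0'));
}

int main(void)
{
	printf("%d\n", prefix_match("defer", 5, "defer+madvise"));    /* 1: wrong */
	printf("%d\n", sysfs_streq_demo("defer\n", "defer+madvise")); /* 0: right */
	printf("%d\n", sysfs_streq_demo("defer\n", "defer"));         /* 1: right */
	return 0;
}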

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option")
Signed-off-by: David Rientjes <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/huge_memory.c | 24 ++++++++----------------
1 file changed, 8 insertions(+), 16 deletions(-)

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -177,16 +177,13 @@ static ssize_t enabled_store(struct kobj
{
ssize_t ret = count;

- if (!memcmp("always", buf,
- min(sizeof("always")-1, count))) {
+ if (sysfs_streq(buf, "always")) {
clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
- } else if (!memcmp("madvise", buf,
- min(sizeof("madvise")-1, count))) {
+ } else if (sysfs_streq(buf, "madvise")) {
clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
- } else if (!memcmp("never", buf,
- min(sizeof("never")-1, count))) {
+ } else if (sysfs_streq(buf, "never")) {
clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
} else
@@ -250,32 +247,27 @@ static ssize_t defrag_store(struct kobje
struct kobj_attribute *attr,
const char *buf, size_t count)
{
- if (!memcmp("always", buf,
- min(sizeof("always")-1, count))) {
+ if (sysfs_streq(buf, "always")) {
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
- } else if (!memcmp("defer+madvise", buf,
- min(sizeof("defer+madvise")-1, count))) {
+ } else if (sysfs_streq(buf, "defer+madvise")) {
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
- } else if (!memcmp("defer", buf,
- min(sizeof("defer")-1, count))) {
+ } else if (sysfs_streq(buf, "defer")) {
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
- } else if (!memcmp("madvise", buf,
- min(sizeof("madvise")-1, count))) {
+ } else if (sysfs_streq(buf, "madvise")) {
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);
set_bit(TRANSPARENT_HUGEPAGE_DEFRAG_REQ_MADV_FLAG, &transparent_hugepage_flags);
- } else if (!memcmp("never", buf,
- min(sizeof("never")-1, count))) {
+ } else if (sysfs_streq(buf, "never")) {
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_DIRECT_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_FLAG, &transparent_hugepage_flags);
clear_bit(TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG, &transparent_hugepage_flags);


2020-03-03 17:52:54

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 154/176] perf ui gtk: Add missing zalloc object

From: Jiri Olsa <[email protected]>

commit 604e2139a1026793b8c2172bd92c7e9d039a5cf0 upstream.

When we moved zalloc.o to the library we missed gtk library which needs
it compiled in, otherwise the missing __zfree symbol will cause the
library to fail to load.

Adding the zalloc object to the gtk library build.

Fixes: 7f7c536f23e6 ("tools lib: Adopt zalloc()/zfree() from tools/perf")
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jelle van der Waa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/perf/ui/gtk/Build | 5 +++++
1 file changed, 5 insertions(+)

--- a/tools/perf/ui/gtk/Build
+++ b/tools/perf/ui/gtk/Build
@@ -7,3 +7,8 @@ gtk-y += util.o
gtk-y += helpline.o
gtk-y += progress.o
gtk-y += annotate.o
+gtk-y += zalloc.o
+
+$(OUTPUT)ui/gtk/zalloc.o: ../lib/zalloc.c FORCE
+ $(call rule_mkdir)
+ $(call if_changed_dep,cc_o_c)


2020-03-03 17:53:01

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 169/176] padata: always acquire cpu_hotplug_lock before pinst->lock

From: Daniel Jordan <[email protected]>

commit 38228e8848cd7dd86ccb90406af32de0cad24be3 upstream.

lockdep complains when padata's paths to update cpumasks via CPU hotplug
and sysfs are both taken:

# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask

======================================================
WARNING: possible circular locking dependency detected
5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted
------------------------------------------------------
bash/205 is trying to acquire lock:
ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120

but task is already holding lock:
ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120

which lock already depends on the new lock.

padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent
order. Which should be first? CPU hotplug calls into padata with
cpu_hotplug_lock already held, so it should have priority.
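
A minimal pthread sketch of the resulting lock-ordering rule, with the
two mutexes standing in for cpu_hotplug_lock and pinst->lock:

#include <pthread.h>

static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* outer */
static pthread_mutex_t pinst_lock   = PTHREAD_MUTEX_INITIALIZER; /* inner */

static void set_cpumask(void)
{
	pthread_mutex_lock(&hotplug_lock);  /* was taken second before the fix */
	pthread_mutex_lock(&pinst_lock);

	/* ... update the cpumasks ... */

	pthread_mutex_unlock(&pinst_lock);  /* release in reverse order */
	pthread_mutex_unlock(&hotplug_lock);
}

int main(void)
{
	set_cpumask();
	return 0;
}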

Fixes: 6751fb3c0e0c ("padata: Use get_online_cpus/put_online_cpus")
Signed-off-by: Daniel Jordan <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Steffen Klassert <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/padata.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -643,8 +643,8 @@ int padata_set_cpumask(struct padata_ins
struct cpumask *serial_mask, *parallel_mask;
int err = -EINVAL;

- mutex_lock(&pinst->lock);
get_online_cpus();
+ mutex_lock(&pinst->lock);

switch (cpumask_type) {
case PADATA_CPU_PARALLEL:
@@ -662,8 +662,8 @@ int padata_set_cpumask(struct padata_ins
err = __padata_set_cpumasks(pinst, parallel_mask, serial_mask);

out:
- put_online_cpus();
mutex_unlock(&pinst->lock);
+ put_online_cpus();

return err;
}


2020-03-03 17:53:03

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 139/176] mwifiex: delete unused mwifiex_get_intf_num()

From: Brian Norris <[email protected]>

commit 1c9f329b084b7b8ea6d60d91a202e884cdcf6aae upstream.

Commit 7afb94da3cd8 ("mwifiex: update set_mac_address logic") fixed the
only user of this function, partly because the author seems to have
noticed that, as written, it's on the borderline between highly
misleading and buggy.

Anyway, no sense in keeping dead code around: let's drop it.

Fixes: 7afb94da3cd8 ("mwifiex: update set_mac_address logic")
Signed-off-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/marvell/mwifiex/main.h | 13 -------------
1 file changed, 13 deletions(-)

--- a/drivers/net/wireless/marvell/mwifiex/main.h
+++ b/drivers/net/wireless/marvell/mwifiex/main.h
@@ -1295,19 +1295,6 @@ mwifiex_copy_rates(u8 *dest, u32 pos, u8
return pos;
}

-/* This function return interface number with the same bss_type.
- */
-static inline u8
-mwifiex_get_intf_num(struct mwifiex_adapter *adapter, u8 bss_type)
-{
- u8 i, num = 0;
-
- for (i = 0; i < adapter->priv_num; i++)
- if (adapter->priv[i] && adapter->priv[i]->bss_type == bss_type)
- num++;
- return num;
-}
-
/*
* This function returns the correct private structure pointer based
* upon the BSS type and BSS number.


2020-03-03 17:53:07

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 153/176] perf hists browser: Restore ESC as "Zoom out" of DSO/thread/etc

From: Arnaldo Carvalho de Melo <[email protected]>

commit 3f7774033e6820d25beee5cf7aefa11d4968b951 upstream.

We need to set actions->ms.map since 599a2f38a989 ("perf hists browser:
Check sort keys before hot key actions"), as in that patch we bail out
if map is NULL.

Reviewed-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Namhyung Kim <[email protected]>
Fixes: 599a2f38a989 ("perf hists browser: Check sort keys before hot key actions")
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/perf/ui/browsers/hists.c | 1 +
1 file changed, 1 insertion(+)

--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -3062,6 +3062,7 @@ static int perf_evsel__hists_browse(stru

continue;
}
+ actions->ms.map = map;
top = pstack__peek(browser->pstack);
if (top == &browser->hists->dso_filter) {
/*


2020-03-03 17:53:09

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 135/176] net: ena: make ena rxfh support ETH_RSS_HASH_NO_CHANGE

From: Arthur Kiyanovski <[email protected]>

commit 470793a78ce344bd53d31e0c2d537f71ba957547 upstream.

As the name suggests, ETH_RSS_HASH_NO_CHANGE is received upon changing
the key or indirection table using ethtool while keeping the same hash
function.

Also add a function for retrieving the current hash function from
the ena-com layer.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Saeed Bshara <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/amazon/ena/ena_com.c | 5 +++++
drivers/net/ethernet/amazon/ena/ena_com.h | 8 ++++++++
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 3 +++
3 files changed, 16 insertions(+)

--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -1046,6 +1046,11 @@ static int ena_com_get_feature(struct en
feature_ver);
}

+int ena_com_get_current_hash_function(struct ena_com_dev *ena_dev)
+{
+ return ena_dev->rss.hash_func;
+}
+
static void ena_com_hash_key_fill_default_key(struct ena_com_dev *ena_dev)
{
struct ena_admin_feature_rss_flow_hash_control *hash_key =
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -656,6 +656,14 @@ int ena_com_rss_init(struct ena_com_dev
*/
void ena_com_rss_destroy(struct ena_com_dev *ena_dev);

+/* ena_com_get_current_hash_function - Get RSS hash function
+ * @ena_dev: ENA communication layer struct
+ *
+ * Return the current hash function.
+ * @return: 0 or one of the ena_admin_hash_functions values.
+ */
+int ena_com_get_current_hash_function(struct ena_com_dev *ena_dev);
+
/* ena_com_fill_hash_function - Fill RSS hash function
* @ena_dev: ENA communication layer struct
* @func: The hash function (Toeplitz or crc)
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -736,6 +736,9 @@ static int ena_set_rxfh(struct net_devic
}

switch (hfunc) {
+ case ETH_RSS_HASH_NO_CHANGE:
+ func = ena_com_get_current_hash_function(ena_dev);
+ break;
case ETH_RSS_HASH_TOP:
func = ENA_ADMIN_TOEPLITZ;
break;


2020-03-03 17:53:13

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 172/176] mm/gup: allow FOLL_FORCE for get_user_pages_fast()

From: John Hubbard <[email protected]>

commit f4000fdf435b8301a11cf85237c561047f8c4c72 upstream.

Commit 817be129e6f2 ("mm: validate get_user_pages_fast flags") allowed
only FOLL_WRITE and FOLL_LONGTERM to be passed to get_user_pages_fast().
This, combined with the fact that get_user_pages_fast() falls back to
"slow gup", which *does* accept FOLL_FORCE, leads to an odd situation:
if you need FOLL_FORCE, you cannot call get_user_pages_fast().

There does not appear to be any reason for filtering out FOLL_FORCE.
There is nothing in the _fast() implementation that requires that we
avoid writing to the pages. So it appears to have been an oversight.

Fix by allowing FOLL_FORCE to be set for get_user_pages_fast().

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 817be129e6f2 ("mm: validate get_user_pages_fast flags")
Signed-off-by: John Hubbard <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Alex Williamson <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Björn Töpel <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Hans Verkuil <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Mike Rapoport <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/gup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2415,7 +2415,8 @@ int get_user_pages_fast(unsigned long st
unsigned long addr, len, end;
int nr = 0, ret = 0;

- if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
+ if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM |
+ FOLL_FORCE)))
return -EINVAL;

start = untagged_addr(start) & PAGE_MASK;


2020-03-03 17:53:14

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 161/176] ubifs: Fix ino_t format warnings in orphan_delete()

From: Geert Uytterhoeven <[email protected]>

commit 155fc6ba488a8bdfd1d3be3d7ba98c9cec2b2429 upstream.

On alpha and s390x:

fs/ubifs/debug.h:158:11: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘ino_t {aka unsigned int}’ [-Wformat=]
...
fs/ubifs/orphan.c:132:3: note: in expansion of macro ‘dbg_gen’
dbg_gen("deleted twice ino %lu", orph->inum);
...
fs/ubifs/orphan.c:140:3: note: in expansion of macro ‘dbg_gen’
dbg_gen("delete later ino %lu", orph->inum);

__kernel_ino_t is "unsigned long" on most architectures, but not on
alpha and s390x, where it is "unsigned int". Hence, when printing an
ino_t, it should always be cast to "unsigned long" first.

Fix this by re-adding the recently removed casts.
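
A minimal user-space example of the portable cast (plain C, nothing
kernel-specific):

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
	struct stat st;

	if (stat("/", &st) != 0)
		return 1;
	/* st_ino is ino_t: cast before printing with %lu */
	printf("ino %lu\n", (unsigned long)st.st_ino);
	return 0;
}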

Fixes: 8009ce956c3d2802 ("ubifs: Don't leak orphans on memory during commit")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ubifs/orphan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/fs/ubifs/orphan.c
+++ b/fs/ubifs/orphan.c
@@ -129,7 +129,7 @@ static void __orphan_drop(struct ubifs_i
static void orphan_delete(struct ubifs_info *c, struct ubifs_orphan *orph)
{
if (orph->del) {
- dbg_gen("deleted twice ino %lu", orph->inum);
+ dbg_gen("deleted twice ino %lu", (unsigned long)orph->inum);
return;
}

@@ -137,7 +137,7 @@ static void orphan_delete(struct ubifs_i
orph->del = 1;
orph->dnext = c->orph_dnext;
c->orph_dnext = orph;
- dbg_gen("delete later ino %lu", orph->inum);
+ dbg_gen("delete later ino %lu", (unsigned long)orph->inum);
return;
}



2020-03-03 17:53:27

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 176/176] kvm: nVMX: VMWRITE checks unsupported field before read-only field

From: Jim Mattson <[email protected]>

commit 693e02cc24090c379217138719d9d84e50036b24 upstream.

According to the SDM, VMWRITE checks to see if the secondary source
operand corresponds to an unsupported VMCS field before it checks to
see if the secondary source operand corresponds to a VM-exit
information field and the processor does not support writing to
VM-exit information fields.

Fixes: 49f705c5324aa ("KVM: nVMX: Implement VMREAD and VMWRITE")
Signed-off-by: Jim Mattson <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Reviewed-by: Peter Shier <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Reviewed-by: Jon Cargille <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/vmx/nested.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4940,6 +4940,12 @@ static int handle_vmwrite(struct kvm_vcp


field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf));
+
+ offset = vmcs_field_to_offset(field);
+ if (offset < 0)
+ return nested_vmx_failValid(vcpu,
+ VMXERR_UNSUPPORTED_VMCS_COMPONENT);
+
/*
* If the vCPU supports "VMWRITE to any supported field in the
* VMCS," then the "read-only" fields are actually read/write.
@@ -4956,11 +4962,6 @@ static int handle_vmwrite(struct kvm_vcp
if (!is_guest_mode(vcpu) && !is_shadow_field_rw(field))
copy_vmcs02_to_vmcs12_rare(vcpu, vmcs12);

- offset = vmcs_field_to_offset(field);
- if (offset < 0)
- return nested_vmx_failValid(vcpu,
- VMXERR_UNSUPPORTED_VMCS_COMPONENT);
-
/*
* Some Intel CPUs intentionally drop the reserved bits of the AR byte
* fields on VMWRITE. Emulate this behavior to ensure consistent KVM


2020-03-03 17:53:35

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 138/176] mwifiex: drop most magic numbers from mwifiex_process_tdls_action_frame()

From: Brian Norris <[email protected]>

commit 70e5b8f445fd27fde0c5583460e82539a7242424 upstream.

Before commit 1e58252e334d ("mwifiex: Fix heap overflow in
mmwifiex_process_tdls_action_frame()"),
mwifiex_process_tdls_action_frame() already had too many magic numbers.
But this commit just added a ton more, in the name of checking for
buffer overflows. That seems like a really bad idea.

Let's make these magic numbers a little less magic, by
(a) factoring out 'pos[1]' as 'ie_len'
(b) using 'sizeof' on the appropriate source or destination fields where
possible, instead of bare numbers
(c) dropping redundant checks, per below.

Regarding redundant checks: the beginning of the loop has this:

if (pos + 2 + pos[1] > end)
break;

but then individual 'case's include stuff like this:

if (pos > end - 3)
return;
if (pos[1] != 1)
return;

Note that the second 'return' (validating the length, pos[1]) combined
with the above condition (ensuring 'pos + 2 + length' doesn't exceed
'end'), makes the first 'return' (whose 'if' can be reworded as 'pos >
end - pos[1] - 2') redundant. Rather than unwind the magic numbers
there, just drop those conditions.
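
A stand-alone sketch of the cleaned-up TLV walk; the element ID,
destination buffer, and sizes are illustrative, not the driver's:

#include <stdio.h>
#include <string.h>

static void parse_ies(const unsigned char *pos, size_t ies_len)
{
	const unsigned char *end = pos + ies_len;
	unsigned char rates[32];

	for (; pos + 1 < end; pos += 2 + pos[1]) {
		unsigned char ie_len = pos[1];

		if (pos + 2 + ie_len > end)
			break;                          /* truncated element */
		if (pos[0] == 1 /* illustrative: WLAN_EID_SUPP_RATES */) {
			if (ie_len > sizeof(rates))
				return;                 /* oversized: reject */
			memcpy(rates, pos + 2, ie_len); /* bounded by sizeof */
			printf("copied %u rate bytes\n", ie_len);
		}
	}
}

int main(void)
{
	const unsigned char ies[] = { 1, 3, 0x82, 0x84, 0x8b };

	parse_ies(ies, sizeof(ies));
	return 0;
}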

Fixes: 1e58252e334d ("mwifiex: Fix heap overflow in mmwifiex_process_tdls_action_frame()")
Signed-off-by: Brian Norris <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/marvell/mwifiex/tdls.c | 75 ++++++++++------------------
1 file changed, 28 insertions(+), 47 deletions(-)

--- a/drivers/net/wireless/marvell/mwifiex/tdls.c
+++ b/drivers/net/wireless/marvell/mwifiex/tdls.c
@@ -894,7 +894,7 @@ void mwifiex_process_tdls_action_frame(s
u8 *peer, *pos, *end;
u8 i, action, basic;
u16 cap = 0;
- int ie_len = 0;
+ int ies_len = 0;

if (len < (sizeof(struct ethhdr) + 3))
return;
@@ -916,7 +916,7 @@ void mwifiex_process_tdls_action_frame(s
pos = buf + sizeof(struct ethhdr) + 4;
/* payload 1+ category 1 + action 1 + dialog 1 */
cap = get_unaligned_le16(pos);
- ie_len = len - sizeof(struct ethhdr) - TDLS_REQ_FIX_LEN;
+ ies_len = len - sizeof(struct ethhdr) - TDLS_REQ_FIX_LEN;
pos += 2;
break;

@@ -926,7 +926,7 @@ void mwifiex_process_tdls_action_frame(s
/* payload 1+ category 1 + action 1 + dialog 1 + status code 2*/
pos = buf + sizeof(struct ethhdr) + 6;
cap = get_unaligned_le16(pos);
- ie_len = len - sizeof(struct ethhdr) - TDLS_RESP_FIX_LEN;
+ ies_len = len - sizeof(struct ethhdr) - TDLS_RESP_FIX_LEN;
pos += 2;
break;

@@ -934,7 +934,7 @@ void mwifiex_process_tdls_action_frame(s
if (len < (sizeof(struct ethhdr) + TDLS_CONFIRM_FIX_LEN))
return;
pos = buf + sizeof(struct ethhdr) + TDLS_CONFIRM_FIX_LEN;
- ie_len = len - sizeof(struct ethhdr) - TDLS_CONFIRM_FIX_LEN;
+ ies_len = len - sizeof(struct ethhdr) - TDLS_CONFIRM_FIX_LEN;
break;
default:
mwifiex_dbg(priv->adapter, ERROR, "Unknown TDLS frame type.\n");
@@ -947,33 +947,33 @@ void mwifiex_process_tdls_action_frame(s

sta_ptr->tdls_cap.capab = cpu_to_le16(cap);

- for (end = pos + ie_len; pos + 1 < end; pos += 2 + pos[1]) {
- if (pos + 2 + pos[1] > end)
+ for (end = pos + ies_len; pos + 1 < end; pos += 2 + pos[1]) {
+ u8 ie_len = pos[1];
+
+ if (pos + 2 + ie_len > end)
break;

switch (*pos) {
case WLAN_EID_SUPP_RATES:
- if (pos[1] > 32)
+ if (ie_len > sizeof(sta_ptr->tdls_cap.rates))
return;
- sta_ptr->tdls_cap.rates_len = pos[1];
- for (i = 0; i < pos[1]; i++)
+ sta_ptr->tdls_cap.rates_len = ie_len;
+ for (i = 0; i < ie_len; i++)
sta_ptr->tdls_cap.rates[i] = pos[i + 2];
break;

case WLAN_EID_EXT_SUPP_RATES:
- if (pos[1] > 32)
+ if (ie_len > sizeof(sta_ptr->tdls_cap.rates))
return;
basic = sta_ptr->tdls_cap.rates_len;
- if (pos[1] > 32 - basic)
+ if (ie_len > sizeof(sta_ptr->tdls_cap.rates) - basic)
return;
- for (i = 0; i < pos[1]; i++)
+ for (i = 0; i < ie_len; i++)
sta_ptr->tdls_cap.rates[basic + i] = pos[i + 2];
- sta_ptr->tdls_cap.rates_len += pos[1];
+ sta_ptr->tdls_cap.rates_len += ie_len;
break;
case WLAN_EID_HT_CAPABILITY:
- if (pos > end - sizeof(struct ieee80211_ht_cap) - 2)
- return;
- if (pos[1] != sizeof(struct ieee80211_ht_cap))
+ if (ie_len != sizeof(struct ieee80211_ht_cap))
return;
/* copy the ie's value into ht_capb*/
memcpy((u8 *)&sta_ptr->tdls_cap.ht_capb, pos + 2,
@@ -981,59 +981,45 @@ void mwifiex_process_tdls_action_frame(s
sta_ptr->is_11n_enabled = 1;
break;
case WLAN_EID_HT_OPERATION:
- if (pos > end -
- sizeof(struct ieee80211_ht_operation) - 2)
- return;
- if (pos[1] != sizeof(struct ieee80211_ht_operation))
+ if (ie_len != sizeof(struct ieee80211_ht_operation))
return;
/* copy the ie's value into ht_oper*/
memcpy(&sta_ptr->tdls_cap.ht_oper, pos + 2,
sizeof(struct ieee80211_ht_operation));
break;
case WLAN_EID_BSS_COEX_2040:
- if (pos > end - 3)
- return;
- if (pos[1] != 1)
+ if (ie_len != sizeof(pos[2]))
return;
sta_ptr->tdls_cap.coex_2040 = pos[2];
break;
case WLAN_EID_EXT_CAPABILITY:
- if (pos > end - sizeof(struct ieee_types_header))
- return;
- if (pos[1] < sizeof(struct ieee_types_header))
+ if (ie_len < sizeof(struct ieee_types_header))
return;
- if (pos[1] > 8)
+ if (ie_len > 8)
return;
memcpy((u8 *)&sta_ptr->tdls_cap.extcap, pos,
sizeof(struct ieee_types_header) +
- min_t(u8, pos[1], 8));
+ min_t(u8, ie_len, 8));
break;
case WLAN_EID_RSN:
- if (pos > end - sizeof(struct ieee_types_header))
+ if (ie_len < sizeof(struct ieee_types_header))
return;
- if (pos[1] < sizeof(struct ieee_types_header))
- return;
- if (pos[1] > IEEE_MAX_IE_SIZE -
+ if (ie_len > IEEE_MAX_IE_SIZE -
sizeof(struct ieee_types_header))
return;
memcpy((u8 *)&sta_ptr->tdls_cap.rsn_ie, pos,
sizeof(struct ieee_types_header) +
- min_t(u8, pos[1], IEEE_MAX_IE_SIZE -
+ min_t(u8, ie_len, IEEE_MAX_IE_SIZE -
sizeof(struct ieee_types_header)));
break;
case WLAN_EID_QOS_CAPA:
- if (pos > end - 3)
- return;
- if (pos[1] != 1)
+ if (ie_len != sizeof(pos[2]))
return;
sta_ptr->tdls_cap.qos_info = pos[2];
break;
case WLAN_EID_VHT_OPERATION:
if (priv->adapter->is_hw_11ac_capable) {
- if (pos > end -
- sizeof(struct ieee80211_vht_operation) - 2)
- return;
- if (pos[1] !=
+ if (ie_len !=
sizeof(struct ieee80211_vht_operation))
return;
/* copy the ie's value into vhtoper*/
@@ -1043,10 +1029,7 @@ void mwifiex_process_tdls_action_frame(s
break;
case WLAN_EID_VHT_CAPABILITY:
if (priv->adapter->is_hw_11ac_capable) {
- if (pos > end -
- sizeof(struct ieee80211_vht_cap) - 2)
- return;
- if (pos[1] != sizeof(struct ieee80211_vht_cap))
+ if (ie_len != sizeof(struct ieee80211_vht_cap))
return;
/* copy the ie's value into vhtcap*/
memcpy((u8 *)&sta_ptr->tdls_cap.vhtcap, pos + 2,
@@ -1056,9 +1039,7 @@ void mwifiex_process_tdls_action_frame(s
break;
case WLAN_EID_AID:
if (priv->adapter->is_hw_11ac_capable) {
- if (pos > end - 4)
- return;
- if (pos[1] != 2)
+ if (ie_len != sizeof(u16))
return;
sta_ptr->tdls_cap.aid =
get_unaligned_le16((pos + 2));


2020-03-03 17:53:36

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 164/176] netfilter: nft_tunnel: no need to call htons() when dumping ports

From: Xin Long <[email protected]>

commit cf3e204a1ca5442190018a317d9ec181b4639bd6 upstream.

info->key.tp_src and tp_dst are __be16; when using nla_put_be16()
to dump them, htons() is not needed, so remove it in this patch.
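
To illustrate the bug being removed (hypothetical values, little-endian
host): tp_src already holds the port in network byte order, so an extra
htons() swaps the bytes back to host order before nla_put_be16() stores
the value verbatim into the attribute:

	__be16 sport = info->key.tp_src;   /* bytes 0x1f 0x90: port 8080 on the wire */

	nla_put_be16(skb, NFTA_TUNNEL_KEY_SPORT, sport);          /* correct */
	nla_put_be16(skb, NFTA_TUNNEL_KEY_SPORT, htons(sport));   /* bug: double swap,
								      reader sees 36895 */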

Fixes: af308b94a2a4 ("netfilter: nf_tables: add tunnel support")
Signed-off-by: Xin Long <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nft_tunnel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/netfilter/nft_tunnel.c
+++ b/net/netfilter/nft_tunnel.c
@@ -505,8 +505,8 @@ static int nft_tunnel_opts_dump(struct s
static int nft_tunnel_ports_dump(struct sk_buff *skb,
struct ip_tunnel_info *info)
{
- if (nla_put_be16(skb, NFTA_TUNNEL_KEY_SPORT, htons(info->key.tp_src)) < 0 ||
- nla_put_be16(skb, NFTA_TUNNEL_KEY_DPORT, htons(info->key.tp_dst)) < 0)
+ if (nla_put_be16(skb, NFTA_TUNNEL_KEY_SPORT, info->key.tp_src) < 0 ||
+ nla_put_be16(skb, NFTA_TUNNEL_KEY_DPORT, info->key.tp_dst) < 0)
return -1;

return 0;


2020-03-03 17:53:42

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 140/176] perf report: Fix no libunwind compiled warning break s390 issue

From: Jin Yao <[email protected]>

commit c3314a74f86dc00827e0945c8e5039fc3aebaa3c upstream.

Commit 800d3f561659 ("perf report: Add warning when libunwind not
compiled in") breaks the s390 platform. S390 uses libdw-dwarf-unwind for
call chain unwinding and has no support for libunwind.

So the warning "Please install libunwind development packages during
the perf build." causes confusion even though the call graph is
displayed correctly.

This patch adds checking for HAVE_DWARF_SUPPORT, which is set when
libdw-dwarf-unwind is compiled in.

Fixes: 800d3f561659 ("perf report: Add warning when libunwind not compiled in")
Signed-off-by: Jin Yao <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Tested-by: Thomas Richter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/perf/builtin-report.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -412,10 +412,10 @@ static int report__setup_sample_type(str
PERF_SAMPLE_BRANCH_ANY))
rep->nonany_branch_mode = true;

-#ifndef HAVE_LIBUNWIND_SUPPORT
+#if !defined(HAVE_LIBUNWIND_SUPPORT) && !defined(HAVE_DWARF_SUPPORT)
if (dwarf_callchain_users) {
- ui__warning("Please install libunwind development packages "
- "during the perf build.\n");
+ ui__warning("Please install libunwind or libdw "
+ "development packages during the perf build.\n");
}
#endif



2020-03-03 17:53:51

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 175/176] kvm: nVMX: VMWRITE checks VMCS-link pointer before VMCS field

From: Jim Mattson <[email protected]>

commit dd2d6042b7f4a5440705b4ffc6c4c2dba81a43b7 upstream.

According to the SDM, a VMWRITE in VMX non-root operation with an
invalid VMCS-link pointer results in VMfailInvalid before the validity
of the VMCS field in the secondary source operand is checked.

For consistency, modify both handle_vmwrite and handle_vmread, even
though there was no problem with the latter.

Fixes: 6d894f498f5d1 ("KVM: nVMX: vmread/vmwrite: Use shadow vmcs12 if running L2")
Signed-off-by: Jim Mattson <[email protected]>
Cc: Liran Alon <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Peter Shier <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Reviewed-by: Jon Cargille <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/vmx/nested.c | 59 +++++++++++++++++++---------------------------
1 file changed, 25 insertions(+), 34 deletions(-)

--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4808,32 +4808,28 @@ static int handle_vmread(struct kvm_vcpu
{
unsigned long field;
u64 field_value;
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned long exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
u32 vmx_instruction_info = vmcs_read32(VMX_INSTRUCTION_INFO);
int len;
gva_t gva = 0;
- struct vmcs12 *vmcs12;
+ struct vmcs12 *vmcs12 = is_guest_mode(vcpu) ? get_shadow_vmcs12(vcpu)
+ : get_vmcs12(vcpu);
struct x86_exception e;
short offset;

if (!nested_vmx_check_permission(vcpu))
return 1;

- if (to_vmx(vcpu)->nested.current_vmptr == -1ull)
+ /*
+ * In VMX non-root operation, when the VMCS-link pointer is -1ull,
+ * any VMREAD sets the ALU flags for VMfailInvalid.
+ */
+ if (vmx->nested.current_vmptr == -1ull ||
+ (is_guest_mode(vcpu) &&
+ get_vmcs12(vcpu)->vmcs_link_pointer == -1ull))
return nested_vmx_failInvalid(vcpu);

- if (!is_guest_mode(vcpu))
- vmcs12 = get_vmcs12(vcpu);
- else {
- /*
- * When vmcs->vmcs_link_pointer is -1ull, any VMREAD
- * to shadowed-field sets the ALU flags for VMfailInvalid.
- */
- if (get_vmcs12(vcpu)->vmcs_link_pointer == -1ull)
- return nested_vmx_failInvalid(vcpu);
- vmcs12 = get_shadow_vmcs12(vcpu);
- }
-
/* Decode instruction info and find the field to read */
field = kvm_register_readl(vcpu, (((vmx_instruction_info) >> 28) & 0xf));

@@ -4912,13 +4908,20 @@ static int handle_vmwrite(struct kvm_vcp
*/
u64 field_value = 0;
struct x86_exception e;
- struct vmcs12 *vmcs12;
+ struct vmcs12 *vmcs12 = is_guest_mode(vcpu) ? get_shadow_vmcs12(vcpu)
+ : get_vmcs12(vcpu);
short offset;

if (!nested_vmx_check_permission(vcpu))
return 1;

- if (vmx->nested.current_vmptr == -1ull)
+ /*
+ * In VMX non-root operation, when the VMCS-link pointer is -1ull,
+ * any VMWRITE sets the ALU flags for VMfailInvalid.
+ */
+ if (vmx->nested.current_vmptr == -1ull ||
+ (is_guest_mode(vcpu) &&
+ get_vmcs12(vcpu)->vmcs_link_pointer == -1ull))
return nested_vmx_failInvalid(vcpu);

if (vmx_instruction_info & (1u << 10))
@@ -4946,24 +4949,12 @@ static int handle_vmwrite(struct kvm_vcp
return nested_vmx_failValid(vcpu,
VMXERR_VMWRITE_READ_ONLY_VMCS_COMPONENT);

- if (!is_guest_mode(vcpu)) {
- vmcs12 = get_vmcs12(vcpu);
-
- /*
- * Ensure vmcs12 is up-to-date before any VMWRITE that dirties
- * vmcs12, else we may crush a field or consume a stale value.
- */
- if (!is_shadow_field_rw(field))
- copy_vmcs02_to_vmcs12_rare(vcpu, vmcs12);
- } else {
- /*
- * When vmcs->vmcs_link_pointer is -1ull, any VMWRITE
- * to shadowed-field sets the ALU flags for VMfailInvalid.
- */
- if (get_vmcs12(vcpu)->vmcs_link_pointer == -1ull)
- return nested_vmx_failInvalid(vcpu);
- vmcs12 = get_shadow_vmcs12(vcpu);
- }
+ /*
+ * Ensure vmcs12 is up-to-date before any VMWRITE that dirties
+ * vmcs12, else we may crush a field or consume a stale value.
+ */
+ if (!is_guest_mode(vcpu) && !is_shadow_field_rw(field))
+ copy_vmcs02_to_vmcs12_rare(vcpu, vmcs12);

offset = vmcs_field_to_offset(field);
if (offset < 0)


2020-03-03 17:54:03

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 171/176] mm/debug.c: always print flags in dump_page()

From: Vlastimil Babka <[email protected]>

commit 5b57b8f22709f07c0ab5921c94fd66e8c59c3e11 upstream.

Commit 76a1850e4572 ("mm/debug.c: __dump_page() prints an extra line")
inadvertently removed printing of page flags for pages that are neither
anon nor ksm nor have a mapping. Fix that.

Using pr_cont() again would be a solution, but the commit explicitly
removed its use. Avoiding the danger of mixing up split lines from
multiple CPUs might be beneficial for near-panic dumps like this, so fix
this without reintroducing pr_cont().

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 76a1850e4572 ("mm/debug.c: __dump_page() prints an extra line")
Signed-off-by: Vlastimil Babka <[email protected]>
Reported-by: Anshuman Khandual <[email protected]>
Reported-by: Michal Hocko <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/debug.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

--- a/mm/debug.c
+++ b/mm/debug.c
@@ -47,6 +47,7 @@ void __dump_page(struct page *page, cons
struct address_space *mapping;
bool page_poisoned = PagePoisoned(page);
int mapcount;
+ char *type = "";

/*
* If struct page is poisoned don't access Page*() functions as that
@@ -78,9 +79,9 @@ void __dump_page(struct page *page, cons
page, page_ref_count(page), mapcount,
page->mapping, page_to_pgoff(page));
if (PageKsm(page))
- pr_warn("ksm flags: %#lx(%pGp)\n", page->flags, &page->flags);
+ type = "ksm ";
else if (PageAnon(page))
- pr_warn("anon flags: %#lx(%pGp)\n", page->flags, &page->flags);
+ type = "anon ";
else if (mapping) {
if (mapping->host && mapping->host->i_dentry.first) {
struct dentry *dentry;
@@ -88,10 +89,11 @@ void __dump_page(struct page *page, cons
pr_warn("%ps name:\"%pd\"\n", mapping->a_ops, dentry);
} else
pr_warn("%ps\n", mapping->a_ops);
- pr_warn("flags: %#lx(%pGp)\n", page->flags, &page->flags);
}
BUILD_BUG_ON(ARRAY_SIZE(pageflag_names) != __NR_PAGEFLAGS + 1);

+ pr_warn("%sflags: %#lx(%pGp)\n", type, page->flags, &page->flags);
+
hex_only:
print_hex_dump(KERN_WARNING, "raw: ", DUMP_PREFIX_NONE, 32,
sizeof(unsigned long), page,


2020-03-03 17:54:06

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 170/176] locking/lockdep: Fix lockdep_stats indentation problem

From: Waiman Long <[email protected]>

commit a030f9767da1a6bbcec840fc54770eb11c2414b6 upstream.

It was found that two lines in the output of /proc/lockdep_stats have
indentation problem:

# cat /proc/lockdep_stats
:
in-process chains: 25057
stack-trace entries: 137827 [max: 524288]
number of stack traces: 7973
number of stack hash chains: 6355
combined max dependencies: 1356414598
hardirq-safe locks: 57
hardirq-unsafe locks: 1286
:

All the numbers displayed in /proc/lockdep_stats except the two stack
trace numbers are formatted with a field width of 11. To properly align
all the numbers, a field width of 11 is now added to the two stack
trace numbers.

Fixes: 8c779229d0f4 ("locking/lockdep: Report more stack trace statistics")
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/locking/lockdep_proc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -286,9 +286,9 @@ static int lockdep_stats_show(struct seq
seq_printf(m, " stack-trace entries: %11lu [max: %lu]\n",
nr_stack_trace_entries, MAX_STACK_TRACE_ENTRIES);
#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
- seq_printf(m, " number of stack traces: %llu\n",
+ seq_printf(m, " number of stack traces: %11llu\n",
lockdep_stack_trace_count());
- seq_printf(m, " number of stack hash chains: %llu\n",
+ seq_printf(m, " number of stack hash chains: %11llu\n",
lockdep_stack_hash_count());
#endif
seq_printf(m, " combined max dependencies: %11u\n",


2020-03-03 17:54:24

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 168/176] xfs: clear kernel only flags in XFS_IOC_ATTRMULTI_BY_HANDLE

From: Christoph Hellwig <[email protected]>

commit 953aa9d136f53e226448dbd801a905c28f8071bf upstream.

Don't allow passing arbitrary flags as they change behavior including
memory allocation that the call stack is not prepared for.

Fixes: ddbca70cc45c ("xfs: allocate xattr buffer on demand")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/xfs/libxfs/xfs_attr.h | 7 +++++--
fs/xfs/xfs_ioctl.c | 2 ++
fs/xfs/xfs_ioctl32.c | 2 ++
3 files changed, 9 insertions(+), 2 deletions(-)

--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -26,7 +26,7 @@ struct xfs_attr_list_context;
*========================================================================*/


-#define ATTR_DONTFOLLOW 0x0001 /* -- unused, from IRIX -- */
+#define ATTR_DONTFOLLOW 0x0001 /* -- ignored, from IRIX -- */
#define ATTR_ROOT 0x0002 /* use attrs in root (trusted) namespace */
#define ATTR_TRUST 0x0004 /* -- unused, from IRIX -- */
#define ATTR_SECURE 0x0008 /* use attrs in security namespace */
@@ -37,7 +37,10 @@ struct xfs_attr_list_context;
#define ATTR_KERNOVAL 0x2000 /* [kernel] get attr size only, not value */

#define ATTR_INCOMPLETE 0x4000 /* [kernel] return INCOMPLETE attr keys */
-#define ATTR_ALLOC 0x8000 /* allocate xattr buffer on demand */
+#define ATTR_ALLOC 0x8000 /* [kernel] allocate xattr buffer on demand */
+
+#define ATTR_KERNEL_FLAGS \
+ (ATTR_KERNOTIME | ATTR_KERNOVAL | ATTR_INCOMPLETE | ATTR_ALLOC)

#define XFS_ATTR_FLAGS \
{ ATTR_DONTFOLLOW, "DONTFOLLOW" }, \
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -462,6 +462,8 @@ xfs_attrmulti_by_handle(

error = 0;
for (i = 0; i < am_hreq.opcount; i++) {
+ ops[i].am_flags &= ~ATTR_KERNEL_FLAGS;
+
ops[i].am_error = strncpy_from_user((char *)attr_name,
ops[i].am_attrname, MAXNAMELEN);
if (ops[i].am_error == 0 || ops[i].am_error == MAXNAMELEN)
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -450,6 +450,8 @@ xfs_compat_attrmulti_by_handle(

error = 0;
for (i = 0; i < am_hreq.opcount; i++) {
+ ops[i].am_flags &= ~ATTR_KERNEL_FLAGS;
+
ops[i].am_error = strncpy_from_user((char *)attr_name,
compat_ptr(ops[i].am_attrname),
MAXNAMELEN);


2020-03-03 18:07:28

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH 5.5 072/176] bcache: ignore pending signals when creating gc and allocator thread

On 3/3/20 10:42 AM, Greg Kroah-Hartman wrote:
> From: Coly Li <[email protected]>
>
> [ Upstream commit 0b96da639a4874311e9b5156405f69ef9fc3bef8 ]
>
> When running a cache set, all the bcache btree nodes of this cache set
> will be checked by bch_btree_check(). If the bcache btree is very
> large, iterating all the btree nodes will occupy too much system memory
> and the bcache registering process might be selected and killed by the
> system OOM killer. kthread_run() will fail if the current process has a
> pending signal, therefore the kthread creation in run_cache_set() for
> the gc and allocator kernel threads will very probably fail for a very
> large bcache btree.
>
> Indeed such an OOM is safe and the registering process will exit after
> the registration is done. Therefore this patch flushes pending signals
> during cache set start up, specifically in bch_cache_allocator_start()
> and bch_gc_thread_start(), to make sure run_cache_set() won't fail for
> a large cached data set.

Ditto this one, of course.

Did someone send this in for stable? It's not marked stable in the
original commit.

--
Jens Axboe

2020-03-03 18:10:32

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 167/176] clk: qcom: rpmh: Sort OF match table

From: Bjorn Andersson <[email protected]>

commit 9e0cda721d18f44f1cd74d3a426931d71c1f1b30 upstream.

sc7180 was added to the end of the match table, sort the table.

Fixes: eee28109f871 ("clk: qcom: clk-rpmh: Add support for RPMHCC for SC7180")
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clk/qcom/clk-rpmh.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/clk/qcom/clk-rpmh.c
+++ b/drivers/clk/qcom/clk-rpmh.c
@@ -481,9 +481,9 @@ static int clk_rpmh_probe(struct platfor
}

static const struct of_device_id clk_rpmh_match_table[] = {
+ { .compatible = "qcom,sc7180-rpmh-clk", .data = &clk_rpmh_sc7180},
{ .compatible = "qcom,sdm845-rpmh-clk", .data = &clk_rpmh_sdm845},
{ .compatible = "qcom,sm8150-rpmh-clk", .data = &clk_rpmh_sm8150},
- { .compatible = "qcom,sc7180-rpmh-clk", .data = &clk_rpmh_sc7180},
{ }
};
MODULE_DEVICE_TABLE(of, clk_rpmh_match_table);


2020-03-03 18:10:34

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 165/176] netfilter: nf_flowtable: fix documentation

From: Matteo Croce <[email protected]>

commit 78e06cf430934fc3768c342cbebdd1013dcd6fa7 upstream.

In the flowtable documentation there is a missing semicolon; the
command as-is would give this error:

nftables.conf:5:27-33: Error: syntax error, unexpected devices, expecting newline or semicolon
hook ingress priority 0 devices = { br0, pppoe-data };
^^^^^^^
nftables.conf:4:12-13: Error: invalid hook (null)
flowtable ft {
^^

Fixes: 19b351f16fd9 ("netfilter: add flowtable documentation")
Signed-off-by: Matteo Croce <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
Documentation/networking/nf_flowtable.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/Documentation/networking/nf_flowtable.txt
+++ b/Documentation/networking/nf_flowtable.txt
@@ -76,7 +76,7 @@ flowtable and add one rule to your forwa

table inet x {
flowtable f {
- hook ingress priority 0 devices = { eth0, eth1 };
+ hook ingress priority 0; devices = { eth0, eth1 };
}
chain y {
type filter hook forward priority 0; policy accept;


2020-03-03 18:10:34

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 166/176] bus: tegra-aconnect: Remove PM_CLK dependency

From: Sameer Pujar <[email protected]>

commit 2f56acf818a08a9187ac8ec6e3d994fc13dc368d upstream.

The ACONNECT bus driver does not use the pm-clk interface anymore and
hence the dependency can be removed from its Kconfig option.

Fixes: 0d7dab926130 ("bus: tegra-aconnect: use devm_clk_*() helpers")
Signed-off-by: Sameer Pujar <[email protected]>
Acked-by: Jon Hunter <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/bus/Kconfig | 1 -
1 file changed, 1 deletion(-)

--- a/drivers/bus/Kconfig
+++ b/drivers/bus/Kconfig
@@ -139,7 +139,6 @@ config TEGRA_ACONNECT
tristate "Tegra ACONNECT Bus Driver"
depends on ARCH_TEGRA_210_SOC
depends on OF && PM
- select PM_CLK
help
Driver for the Tegra ACONNECT bus which is used to interface with
the devices inside the Audio Processing Engine (APE) for Tegra210.


2020-03-03 18:10:49

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 141/176] KVM: SVM: Override default MMIO mask if memory encryption is enabled

From: Tom Lendacky <[email protected]>

commit 52918ed5fcf05d97d257f4131e19479da18f5d16 upstream.

The KVM MMIO support uses bit 51 as the reserved bit to cause nested page
faults when a guest performs MMIO. The AMD memory encryption support uses
a CPUID function to define the encryption bit position. Given this, it is
possible that these bits can conflict.

Use svm_hardware_setup() to override the MMIO mask if memory encryption
support is enabled. Various checks are performed to ensure that the mask
is properly defined and rsvd_bits() is used to generate the new mask (as
was done prior to the change that necessitated this patch).
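
A worked example with assumed (illustrative) numbers: on a part with
boot_cpu_data.x86_phys_bits == 48 and the encryption bit reported at
position 47, the mask bit stays at 48 and the resulting MMIO mask
steers clear of the C-bit entirely:

	unsigned int enc_bit = 47;	/* CPUID 0x8000001f, EBX[5:0] (assumed) */
	unsigned int mask_bit = 48;	/* boot_cpu_data.x86_phys_bits (assumed) */
	u64 mask;

	if (enc_bit == mask_bit)	/* not the case here */
		mask_bit++;
	mask = rsvd_bits(48, 51) | PT_PRESENT_MASK;	/* mask_bit < 52 */
	/* had enc_bit been 48, mask_bit would bump to 49 -> rsvd_bits(49, 51) */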

Fixes: 28a1f3ac1d0c ("kvm: x86: Set highest physical address bits in non-present/reserved SPTEs")
Suggested-by: Sean Christopherson <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/svm.c | 43 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1307,6 +1307,47 @@ static void shrink_ple_window(struct kvm
}
}

+/*
+ * The default MMIO mask is a single bit (excluding the present bit),
+ * which could conflict with the memory encryption bit. Check for
+ * memory encryption support and override the default MMIO mask if
+ * memory encryption is enabled.
+ */
+static __init void svm_adjust_mmio_mask(void)
+{
+ unsigned int enc_bit, mask_bit;
+ u64 msr, mask;
+
+ /* If there is no memory encryption support, use existing mask */
+ if (cpuid_eax(0x80000000) < 0x8000001f)
+ return;
+
+ /* If memory encryption is not enabled, use existing mask */
+ rdmsrl(MSR_K8_SYSCFG, msr);
+ if (!(msr & MSR_K8_SYSCFG_MEM_ENCRYPT))
+ return;
+
+ enc_bit = cpuid_ebx(0x8000001f) & 0x3f;
+ mask_bit = boot_cpu_data.x86_phys_bits;
+
+ /* Increment the mask bit if it is the same as the encryption bit */
+ if (enc_bit == mask_bit)
+ mask_bit++;
+
+ /*
+ * If the mask bit location is below 52, then some bits above the
+ * physical addressing limit will always be reserved, so use the
+ * rsvd_bits() function to generate the mask. This mask, along with
+ * the present bit, will be used to generate a page fault with
+ * PFER.RSV = 1.
+ *
+ * If the mask bit location is 52 (or above), then clear the mask.
+ */
+ mask = (mask_bit < 52) ? rsvd_bits(mask_bit, 51) | PT_PRESENT_MASK : 0;
+
+ kvm_mmu_set_mmio_spte_mask(mask, mask, PT_WRITABLE_MASK | PT_USER_MASK);
+}
+
static __init int svm_hardware_setup(void)
{
int cpu;
@@ -1361,6 +1402,8 @@ static __init int svm_hardware_setup(voi
}
}

+ svm_adjust_mmio_mask();
+
for_each_possible_cpu(cpu) {
r = svm_cpu_init(cpu);
if (r)


2020-03-03 18:11:00

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 136/176] tipc: fix successful connect() but timed out

From: Tuong Lien <[email protected]>

commit 5391a87751a164b3194864126f3b016038abc9fe upstream.

In commit 9546a0b7ce00 ("tipc: fix wrong connect() return code"), we
fixed the issue of 'connect()' returning zero even though the connect
attempt had failed, by actually waiting for the connection to reach
'ESTABLISHED'. However, in conjunction with our 'lightweight'
connection setup mechanism, the approach has one drawback: the
following scenario can happen:

(server) (client)

+- accept()| | wait_for_conn()
| | |connect() -------+
| |<-------[SYN]---------| > sleeping
| | *CONNECTING |
|--------->*ESTABLISHED | |
|--------[ACK]-------->*ESTABLISHED > wakeup()
send()|--------[DATA]------->|\ > wakeup()
send()|--------[DATA]------->| | > wakeup()
. . . . |-> recvq .
. . . . | .
send()|--------[DATA]------->|/ > wakeup()
close()|--------[FIN]-------->*DISCONNECTING |
*DISCONNECTING | |
| ~~~~~~~~~~~~~~~~~~> schedule()
| wait again
.
.
| ETIMEDOUT

Upon receipt of the server 'ACK', the client becomes 'ESTABLISHED'
and the 'wait_for_conn()' process is woken up but not yet run.
Meanwhile, the server starts to send a number of data messages,
shortly followed by a 'close()', without waiting for any response from
the client, which forces the client socket into 'DISCONNECTING'
immediately. When the wait process is finally scheduled, it continues
to wait until the timer expires because of the unexpected socket
state. The client 'connect()' ends up with '-ETIMEDOUT' and is forced
to release the socket even though messages remain in its receive
queue.

Obviously the issue would not happen if the server added some delay
prior to its 'close()' (or if the number of 'DATA' messages were large
enough), but any kind of delay would make the connection
setup/shutdown "heavy". We solve this by simply allowing 'connect()'
to return zero in this particular case. The socket is already
'DISCONNECTING', so any further write will get '-EPIPE', but the
socket can still read the messages remaining in its receive queue.

Note: This solution doesn't break the previous one, as it deals with
a different situation, where the socket state is 'DISCONNECTING' but
there is no error (i.e. sk->sk_err = 0).

Fixes: 9546a0b7ce00 ("tipc: fix wrong connect() return code")
Acked-by: Ying Xue <[email protected]>
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: Tuong Lien <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/tipc/socket.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2441,6 +2441,8 @@ static int tipc_wait_for_connect(struct
return -ETIMEDOUT;
if (signal_pending(current))
return sock_intr_errno(*timeo_p);
+ if (sk->sk_state == TIPC_DISCONNECTING)
+ break;

add_wait_queue(sk_sleep(sk), &wait);
done = sk_wait_event(sk, timeo_p, tipc_sk_connected(sk),


2020-03-03 18:11:17

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 152/176] pwm: omap-dmtimer: put_device() after of_find_device_by_node()

From: Uwe Kleine-König <[email protected]>

commit c7cb3a1dd53f63c64fb2b567d0be130b92a44d91 upstream.

This was found by coccicheck:

drivers/pwm/pwm-omap-dmtimer.c:304:2-8: ERROR: missing put_device;
call of_find_device_by_node on line 255, but without a corresponding
object release within this function.

Reported-by: Markus Elfring <[email protected]>
Fixes: 6604c6556db9 ("pwm: Add PWM driver for OMAP using dual-mode timers")
Signed-off-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/pwm/pwm-omap-dmtimer.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)

--- a/drivers/pwm/pwm-omap-dmtimer.c
+++ b/drivers/pwm/pwm-omap-dmtimer.c
@@ -256,7 +256,7 @@ static int pwm_omap_dmtimer_probe(struct
if (!timer_pdev) {
dev_err(&pdev->dev, "Unable to find Timer pdev\n");
ret = -ENODEV;
- goto put;
+ goto err_find_timer_pdev;
}

timer_pdata = dev_get_platdata(&timer_pdev->dev);
@@ -264,7 +264,7 @@ static int pwm_omap_dmtimer_probe(struct
dev_dbg(&pdev->dev,
"dmtimer pdata structure NULL, deferring probe\n");
ret = -EPROBE_DEFER;
- goto put;
+ goto err_platdata;
}

pdata = timer_pdata->timer_ops;
@@ -283,19 +283,19 @@ static int pwm_omap_dmtimer_probe(struct
!pdata->write_counter) {
dev_err(&pdev->dev, "Incomplete dmtimer pdata structure\n");
ret = -EINVAL;
- goto put;
+ goto err_platdata;
}

if (!of_get_property(timer, "ti,timer-pwm", NULL)) {
dev_err(&pdev->dev, "Missing ti,timer-pwm capability\n");
ret = -ENODEV;
- goto put;
+ goto err_timer_property;
}

dm_timer = pdata->request_by_node(timer);
if (!dm_timer) {
ret = -EPROBE_DEFER;
- goto put;
+ goto err_request_timer;
}

omap = devm_kzalloc(&pdev->dev, sizeof(*omap), GFP_KERNEL);
@@ -352,7 +352,14 @@ err_pwmchip_add:
err_alloc_omap:

pdata->free(dm_timer);
-put:
+err_request_timer:
+
+err_timer_property:
+err_platdata:
+
+ put_device(&timer_pdev->dev);
+err_find_timer_pdev:
+
of_node_put(timer);

return ret;
@@ -372,6 +379,8 @@ static int pwm_omap_dmtimer_remove(struc

omap->pdata->free(omap->dm_timer);

+ put_device(&omap->dm_timer_pdev->dev);
+
mutex_destroy(&omap->mutex);

return 0;


2020-03-03 18:11:31

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 163/176] thermal: brcmstb_thermal: Do not use DT coefficients

From: Florian Fainelli <[email protected]>

commit e1ff6fc22f19e2af8adbad618526b80067911d40 upstream.

At the time the brcmstb_thermal driver and its binding were merged, the
DT binding did not make the coefficients properties a mandatory one,
therefore all users of the brcmstb_thermal driver out there have a
non-functional implementation with zero coefficients. Even if these
properties were provided, the formula used for computation is incorrect.

The coefficients are entirely process specific (right now, only 28nm
is supported) and not board or SoC specific; it is therefore
appropriate to hard-code them in the driver, keyed by the
compatibility string we are probed with, which has to be updated
whenever a new process is introduced.

We remove the existing coefficients definition since subsequent patches
are going to add support for a new process and will introduce new
coefficients as well.
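
With the hard-coded 28nm constants from the diff below, the conversion
becomes a fixed linear formula. A worked example (illustrative code
value):

	/* temp_mC = AVS_TMON_TEMP_OFFSET - code * AVS_TMON_TEMP_SLOPE */
	int temp_mC = 410040 - 700 * 487;	/* code 700 -> 69140 mC, ~69.1 C */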

Fixes: 9e03cf1b2dd5 ("thermal: add brcmstb AVS TMON driver")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Amit Kucheria <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/thermal/broadcom/brcmstb_thermal.c | 31 ++++++++---------------------
1 file changed, 9 insertions(+), 22 deletions(-)

--- a/drivers/thermal/broadcom/brcmstb_thermal.c
+++ b/drivers/thermal/broadcom/brcmstb_thermal.c
@@ -49,7 +49,7 @@
#define AVS_TMON_TP_TEST_ENABLE 0x20

/* Default coefficients */
-#define AVS_TMON_TEMP_SLOPE -487
+#define AVS_TMON_TEMP_SLOPE 487
#define AVS_TMON_TEMP_OFFSET 410040

/* HW related temperature constants */
@@ -108,23 +108,12 @@ struct brcmstb_thermal_priv {
struct thermal_zone_device *thermal;
};

-static void avs_tmon_get_coeffs(struct thermal_zone_device *tz, int *slope,
- int *offset)
-{
- *slope = thermal_zone_get_slope(tz);
- *offset = thermal_zone_get_offset(tz);
-}
-
/* Convert a HW code to a temperature reading (millidegree celsius) */
static inline int avs_tmon_code_to_temp(struct thermal_zone_device *tz,
u32 code)
{
- const int val = code & AVS_TMON_TEMP_MASK;
- int slope, offset;
-
- avs_tmon_get_coeffs(tz, &slope, &offset);
-
- return slope * val + offset;
+ return (AVS_TMON_TEMP_OFFSET -
+ (int)((code & AVS_TMON_TEMP_MAX) * AVS_TMON_TEMP_SLOPE));
}

/*
@@ -136,20 +125,18 @@ static inline int avs_tmon_code_to_temp(
static inline u32 avs_tmon_temp_to_code(struct thermal_zone_device *tz,
int temp, bool low)
{
- int slope, offset;
-
if (temp < AVS_TMON_TEMP_MIN)
- return AVS_TMON_TEMP_MAX; /* Maximum code value */
-
- avs_tmon_get_coeffs(tz, &slope, &offset);
+ return AVS_TMON_TEMP_MAX; /* Maximum code value */

- if (temp >= offset)
+ if (temp >= AVS_TMON_TEMP_OFFSET)
return 0; /* Minimum code value */

if (low)
- return (u32)(DIV_ROUND_UP(offset - temp, abs(slope)));
+ return (u32)(DIV_ROUND_UP(AVS_TMON_TEMP_OFFSET - temp,
+ AVS_TMON_TEMP_SLOPE));
else
- return (u32)((offset - temp) / abs(slope));
+ return (u32)((AVS_TMON_TEMP_OFFSET - temp) /
+ AVS_TMON_TEMP_SLOPE);
}

static int brcmstb_get_temp(void *data, int *temp)


2020-03-03 18:11:40

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 143/176] sched/fair: Optimize select_idle_cpu

From: Cheng Jian <[email protected]>

commit 60588bfa223ff675b95f866249f90616613fbe31 upstream.

select_idle_cpu() will scan the LLC domain for idle CPUs;
it's always expensive. So the next commit:

1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")

introduces a way to limit how many CPUs we scan.

But it consumes some of the 'nr' attempts on CPUs that are not
allowed for the task, and thus wastes them. The function then always
returns nr_cpumask_bits, and we can't find a CPU on which our task is
allowed to run.

The cpumask may be too big; as in select_idle_core(), use the
per_cpu_ptr 'select_idle_mask' to prevent stack overflow.
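
The shape of the fix, sketched (simplified from the diff below):
intersect the scan domain with the task's affinity once up front, so
no 'nr' attempts are wasted on disallowed CPUs, and reuse the per-CPU
scratch mask instead of a potentially huge on-stack cpumask:

	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);

	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
	for_each_cpu_wrap(cpu, cpus, target) {
		if (!--nr)			/* every attempt now counts */
			return si_cpu;
		if (available_idle_cpu(cpu))
			break;
	}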

Fixes: 1ad3aaf3fcd2 ("sched/core: Implement new approach to scale select_idle_cpu()")
Signed-off-by: Cheng Jian <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Srikar Dronamraju <[email protected]>
Reviewed-by: Vincent Guittot <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/sched/fair.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5828,6 +5828,7 @@ static inline int select_idle_smt(struct
*/
static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int target)
{
+ struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
struct sched_domain *this_sd;
u64 avg_cost, avg_idle;
u64 time, cost;
@@ -5859,11 +5860,11 @@ static int select_idle_cpu(struct task_s

time = cpu_clock(this);

- for_each_cpu_wrap(cpu, sched_domain_span(sd), target) {
+ cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+
+ for_each_cpu_wrap(cpu, cpus, target) {
if (!--nr)
return si_cpu;
- if (!cpumask_test_cpu(cpu, p->cpus_ptr))
- continue;
if (available_idle_cpu(cpu))
break;
if (si_cpu == -1 && sched_idle_cpu(cpu))


2020-03-03 18:11:46

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 106/176] netfilter: xt_hashlimit: reduce hashlimit_mutex scope for htable_put()

From: Cong Wang <[email protected]>

commit c4a3922d2d20c710f827d3a115ee338e8d0467df upstream.

It is unnecessary to hold hashlimit_mutex for htable_destroy()
as it is already removed from the global hashtable and its
refcount is already zero.

Also, switch hinfo->use to refcount_t so that we don't have
to hold the mutex until it reaches zero in htable_put().
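
The resulting teardown follows the standard refcount_dec_and_mutex_lock()
pattern; a sketch of why the mutex is now only held for the unlink, not
for the destroy:

	if (refcount_dec_and_mutex_lock(&hinfo->use, &hashlimit_mutex)) {
		/* last reference: unlink under the mutex... */
		hlist_del(&hinfo->node);
		mutex_unlock(&hashlimit_mutex);
		/* ...then destroy outside it; nobody can look us up anymore */
		htable_destroy(hinfo);
	}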

Reported-and-tested-by: [email protected]
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/xt_hashlimit.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -36,6 +36,7 @@
#include <linux/netfilter_ipv6/ip6_tables.h>
#include <linux/mutex.h>
#include <linux/kernel.h>
+#include <linux/refcount.h>
#include <uapi/linux/netfilter/xt_hashlimit.h>

#define XT_HASHLIMIT_ALL (XT_HASHLIMIT_HASH_DIP | XT_HASHLIMIT_HASH_DPT | \
@@ -114,7 +115,7 @@ struct dsthash_ent {

struct xt_hashlimit_htable {
struct hlist_node node; /* global list of all htables */
- int use;
+ refcount_t use;
u_int8_t family;
bool rnd_initialized;

@@ -315,7 +316,7 @@ static int htable_create(struct net *net
for (i = 0; i < hinfo->cfg.size; i++)
INIT_HLIST_HEAD(&hinfo->hash[i]);

- hinfo->use = 1;
+ refcount_set(&hinfo->use, 1);
hinfo->count = 0;
hinfo->family = family;
hinfo->rnd_initialized = false;
@@ -434,7 +435,7 @@ static struct xt_hashlimit_htable *htabl
hlist_for_each_entry(hinfo, &hashlimit_net->htables, node) {
if (!strcmp(name, hinfo->name) &&
hinfo->family == family) {
- hinfo->use++;
+ refcount_inc(&hinfo->use);
return hinfo;
}
}
@@ -443,12 +444,11 @@ static struct xt_hashlimit_htable *htabl

static void htable_put(struct xt_hashlimit_htable *hinfo)
{
- mutex_lock(&hashlimit_mutex);
- if (--hinfo->use == 0) {
+ if (refcount_dec_and_mutex_lock(&hinfo->use, &hashlimit_mutex)) {
hlist_del(&hinfo->node);
+ mutex_unlock(&hashlimit_mutex);
htable_destroy(hinfo);
}
- mutex_unlock(&hashlimit_mutex);
}

/* The algorithm used is the Simple Token Bucket Filter (TBF)


2020-03-03 18:11:48

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 100/176] cpufreq: Fix policy initialization for internal governor drivers

From: Rafael J. Wysocki <[email protected]>

commit f5739cb0b56590d68d8df8a44659893b6d0084c3 upstream.

Before commit 1e4f63aecb53 ("cpufreq: Avoid creating excessively
large stack frames") the initial value of the policy field in struct
cpufreq_policy set by the driver's ->init() callback was implicitly
passed from cpufreq_init_policy() to cpufreq_set_policy() if the
default governor was neither "performance" nor "powersave". After
that commit, however, cpufreq_init_policy() must take that case into
consideration explicitly and handle it as appropriate, so make that
happen.

Fixes: 1e4f63aecb53 ("cpufreq: Avoid creating excessively large stack frames")
Link: https://lore.kernel.org/linux-pm/[email protected]/
Reported-by: Artem Bityutskiy <[email protected]>
Cc: 5.4+ <[email protected]> # 5.4+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/cpufreq/cpufreq.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1074,9 +1074,17 @@ static int cpufreq_init_policy(struct cp
pol = policy->last_policy;
} else if (def_gov) {
pol = cpufreq_parse_policy(def_gov->name);
- } else {
- return -ENODATA;
+ /*
+ * In case the default governor is neiter "performance"
+ * nor "powersave", fall back to the initial policy
+ * value set by the driver.
+ */
+ if (pol == CPUFREQ_POLICY_UNKNOWN)
+ pol = policy->policy;
}
+ if (pol != CPUFREQ_POLICY_PERFORMANCE &&
+ pol != CPUFREQ_POLICY_POWERSAVE)
+ return -ENODATA;
}

return cpufreq_set_policy(policy, gov, pol);


2020-03-03 18:11:51

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 162/176] thermal: db8500: Depromote debug print

From: Linus Walleij <[email protected]>

commit c56dcfa3d4d0f49f0c37cd24886aa86db7aa7f30 upstream.

We are not interested in getting this debug print on our
console all the time.

Cc: Daniel Lezcano <[email protected]>
Cc: Stephan Gerhold <[email protected]>
Fixes: 6c375eccded4 ("thermal: db8500: Rewrite to be a pure OF sensor")
Signed-off-by: Linus Walleij <[email protected]>
Reviewed-by: Stephan Gerhold <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/thermal/db8500_thermal.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/thermal/db8500_thermal.c
+++ b/drivers/thermal/db8500_thermal.c
@@ -152,8 +152,8 @@ static irqreturn_t prcmu_high_irq_handle
db8500_thermal_update_config(th, idx, THERMAL_TREND_RAISING,
next_low, next_high);

- dev_info(&th->tz->device,
- "PRCMU set max %ld, min %ld\n", next_high, next_low);
+ dev_dbg(&th->tz->device,
+ "PRCMU set max %ld, min %ld\n", next_high, next_low);
} else if (idx == num_points - 1)
/* So we roof out 1 degree over the max point */
th->interpolated_temp = db8500_thermal_points[idx] + 1;


2020-03-03 18:11:52

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 132/176] net: atlantic: fix out of range usage of active_vlans array

From: Dmitry Bogdanov <[email protected]>

commit 5a292c89a84d49b598f8978f154bdda48b1072c0 upstream.

fix static checker warning:
drivers/net/ethernet/aquantia/atlantic/aq_filters.c:166 aq_check_approve_fvlan()
error: passing untrusted data to 'test_bit()'
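
The reason the data is "untrusted": vlan_tci packs the PCP and DEI bits
above the 12-bit VLAN ID, so the raw value can reach 0xffff while
active_vlans only has 4096 bits. A sketch of the masking (error
handling illustrative):

	u16 vid = be16_to_cpu(fsp->h_ext.vlan_tci) & VLAN_VID_MASK;	/* 0..4095 */

	if (!test_bit(vid, aq_nic->active_vlans))
		return -EINVAL;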

Reported-by: Dan Carpenter <[email protected]>
Fixes: 7975d2aff5af: ("net: aquantia: add support of rx-vlan-filter offload")
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/aquantia/atlantic/aq_filters.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
@@ -163,7 +163,7 @@ aq_check_approve_fvlan(struct aq_nic_s *
}

if ((aq_nic->ndev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
- (!test_bit(be16_to_cpu(fsp->h_ext.vlan_tci),
+ (!test_bit(be16_to_cpu(fsp->h_ext.vlan_tci) & VLAN_VID_MASK,
aq_nic->active_vlans))) {
netdev_err(aq_nic->ndev,
"ethtool: unknown vlan-id specified");


2020-03-03 18:12:01

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 131/176] net: atlantic: possible fault in transition to hibernation

From: Pavel Belous <[email protected]>

commit 52a22f4d6ff95e8bdca557765c04893eb5dd83fd upstream.

During hibernation freeze, aq_nic_stop could be invoked on a stopped
device. That may cause a panic on access to not-yet-allocated
vector/ring structures.

Add a check to stop the device only if it is not already stopped.

Similarly, during hibernation thaw after freeze, aq_nic_start could be
invoked on a not-yet-initialized net device, with the same result.

Add a check to start the device only if it is initialized. In our
case, this is the same as started.

Fixes: 8aaa112a57c1 ("net: atlantic: refactoring pm logic")
Signed-off-by: Pavel Belous <[email protected]>
Signed-off-by: Nikita Danilov <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
@@ -359,7 +359,8 @@ static int aq_suspend_common(struct devi
netif_device_detach(nic->ndev);
netif_tx_stop_all_queues(nic->ndev);

- aq_nic_stop(nic);
+ if (netif_running(nic->ndev))
+ aq_nic_stop(nic);

if (deep) {
aq_nic_deinit(nic, !nic->aq_hw->aq_nic_cfg->wol);
@@ -375,7 +376,7 @@ static int atl_resume_common(struct devi
{
struct pci_dev *pdev = to_pci_dev(dev);
struct aq_nic_s *nic;
- int ret;
+ int ret = 0;

nic = pci_get_drvdata(pdev);

@@ -390,9 +391,11 @@ static int atl_resume_common(struct devi
goto err_exit;
}

- ret = aq_nic_start(nic);
- if (ret)
- goto err_exit;
+ if (netif_running(nic->ndev)) {
+ ret = aq_nic_start(nic);
+ if (ret)
+ goto err_exit;
+ }

netif_device_attach(nic->ndev);
netif_tx_start_all_queues(nic->ndev);


2020-03-03 18:12:09

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 148/176] ima: ima/lsm policy rule loading logic bug fixes

From: Janne Karhunen <[email protected]>

commit 483ec26eed42bf050931d9a5c5f9f0b5f2ad5f3b upstream.

Keep the ima policy rules around from the beginning even if they appear
invalid at the time of loading, as they may become active after an lsm
policy load. However, loading a custom IMA policy with unknown LSM
labels is only safe after we have transitioned from the "built-in"
policy rules to a custom IMA policy.

The patch also fixes the rule re-use during the lsm policy reload and
makes some prints a bit more human-readable.

Changelog:
v4:
- Do not allow the initial policy load refer to non-existing lsm rules.
v3:
- Fix too wide policy rule matching for non-initialized LSMs
v2:
- Fix log prints

Fixes: b16942455193 ("ima: use the lsm policy update notifier")
Cc: Casey Schaufler <[email protected]>
Reported-by: Mimi Zohar <[email protected]>
Signed-off-by: Janne Karhunen <[email protected]>
Signed-off-by: Konsta Karsisto <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/integrity/ima/ima_policy.c | 44 +++++++++++++++++++++---------------
1 file changed, 26 insertions(+), 18 deletions(-)

--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -263,7 +263,7 @@ static void ima_lsm_free_rule(struct ima
static struct ima_rule_entry *ima_lsm_copy_rule(struct ima_rule_entry *entry)
{
struct ima_rule_entry *nentry;
- int i, result;
+ int i;

nentry = kmalloc(sizeof(*nentry), GFP_KERNEL);
if (!nentry)
@@ -277,7 +277,7 @@ static struct ima_rule_entry *ima_lsm_co
memset(nentry->lsm, 0, sizeof_field(struct ima_rule_entry, lsm));

for (i = 0; i < MAX_LSM_RULES; i++) {
- if (!entry->lsm[i].rule)
+ if (!entry->lsm[i].args_p)
continue;

nentry->lsm[i].type = entry->lsm[i].type;
@@ -286,13 +286,13 @@ static struct ima_rule_entry *ima_lsm_co
if (!nentry->lsm[i].args_p)
goto out_err;

- result = security_filter_rule_init(nentry->lsm[i].type,
- Audit_equal,
- nentry->lsm[i].args_p,
- &nentry->lsm[i].rule);
- if (result == -EINVAL)
- pr_warn("ima: rule for LSM \'%d\' is undefined\n",
- entry->lsm[i].type);
+ security_filter_rule_init(nentry->lsm[i].type,
+ Audit_equal,
+ nentry->lsm[i].args_p,
+ &nentry->lsm[i].rule);
+ if (!nentry->lsm[i].rule)
+ pr_warn("rule for LSM \'%s\' is undefined\n",
+ (char *)entry->lsm[i].args_p);
}
return nentry;

@@ -329,7 +329,7 @@ static void ima_lsm_update_rules(void)
list_for_each_entry_safe(entry, e, &ima_policy_rules, list) {
needs_update = 0;
for (i = 0; i < MAX_LSM_RULES; i++) {
- if (entry->lsm[i].rule) {
+ if (entry->lsm[i].args_p) {
needs_update = 1;
break;
}
@@ -339,8 +339,7 @@ static void ima_lsm_update_rules(void)

result = ima_lsm_update_rule(entry);
if (result) {
- pr_err("ima: lsm rule update error %d\n",
- result);
+ pr_err("lsm rule update error %d\n", result);
return;
}
}
@@ -357,7 +356,7 @@ int ima_lsm_policy_change(struct notifie
}

/**
- * ima_match_rules - determine whether an inode matches the measure rule.
+ * ima_match_rules - determine whether an inode matches the policy rule.
* @rule: a pointer to a rule
* @inode: a pointer to an inode
* @cred: a pointer to a credentials structure for user validation
@@ -415,9 +414,12 @@ static bool ima_match_rules(struct ima_r
int rc = 0;
u32 osid;

- if (!rule->lsm[i].rule)
- continue;
-
+ if (!rule->lsm[i].rule) {
+ if (!rule->lsm[i].args_p)
+ continue;
+ else
+ return false;
+ }
switch (i) {
case LSM_OBJ_USER:
case LSM_OBJ_ROLE:
@@ -823,8 +825,14 @@ static int ima_lsm_rule_init(struct ima_
entry->lsm[lsm_rule].args_p,
&entry->lsm[lsm_rule].rule);
if (!entry->lsm[lsm_rule].rule) {
- kfree(entry->lsm[lsm_rule].args_p);
- return -EINVAL;
+ pr_warn("rule for LSM \'%s\' is undefined\n",
+ (char *)entry->lsm[lsm_rule].args_p);
+
+ if (ima_rules == &ima_default_rules) {
+ kfree(entry->lsm[lsm_rule].args_p);
+ result = -EINVAL;
+ } else
+ result = 0;
}

return result;


2020-03-03 18:12:23

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 134/176] net/smc: no peer ID in CLC decline for SMCD

From: Ursula Braun <[email protected]>

commit 369537c97024dca99303a8d4d6ab38b4f54d3909 upstream.

Just SMCR requires a CLC Peer ID, but not SMCD. The field should be
zero for SMCD.

Fixes: c758dfddc1b5 ("net/smc: add SMC-D support in CLC messages")
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/smc/smc_clc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -372,7 +372,9 @@ int smc_clc_send_decline(struct smc_sock
dclc.hdr.length = htons(sizeof(struct smc_clc_msg_decline));
dclc.hdr.version = SMC_CLC_V1;
dclc.hdr.flag = (peer_diag_info == SMC_CLC_DECL_SYNCERR) ? 1 : 0;
- memcpy(dclc.id_for_peer, local_systemid, sizeof(local_systemid));
+ if (smc->conn.lgr && !smc->conn.lgr->is_smcd)
+ memcpy(dclc.id_for_peer, local_systemid,
+ sizeof(local_systemid));
dclc.peer_diagnosis = htonl(peer_diag_info);
memcpy(dclc.trl.eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER));



2020-03-03 18:12:31

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 128/176] net: atlantic: better loopback mode handling

From: Nikita Danilov <[email protected]>

commit b42726fcf76e9367e524392e0ead7e672cc0791c upstream.

Add checks to avoid enabling multiple loopback modes simultaneously.
It was also discovered that, for DMA loopback to function correctly,
promiscuous mode should be enabled on the device.
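
A sketch of the first check: OR the requested flags with the currently
set ones and count the loopback bits; more than one set bit means two
loopback modes would be active at once:

	u32 lb = (flags | priv_flags) & AQ_HW_LOOPBACK_MASK;

	if (hweight32(lb) > 1)		/* e.g. DMA + PHY loopback requested */
		return -EINVAL;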

Fixes: ea4b4d7fc106 ("net: atlantic: loopback tests via private flags")
Signed-off-by: Nikita Danilov <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: Dmitry Bogdanov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c | 5 +++++
drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c | 13 ++++++++-----
2 files changed, 13 insertions(+), 5 deletions(-)

--- a/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c
@@ -722,6 +722,11 @@ static int aq_ethtool_set_priv_flags(str
if (flags & ~AQ_PRIV_FLAGS_MASK)
return -EOPNOTSUPP;

+ if (hweight32((flags | priv_flags) & AQ_HW_LOOPBACK_MASK) > 1) {
+ netdev_info(ndev, "Can't enable more than one loopback simultaneously\n");
+ return -EINVAL;
+ }
+
cfg->priv_flags = flags;

if ((priv_flags ^ flags) & BIT(AQ_HW_LOOPBACK_DMA_NET)) {
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -885,13 +885,16 @@ static int hw_atl_b0_hw_packet_filter_se
{
struct aq_nic_cfg_s *cfg = self->aq_nic_cfg;
unsigned int i = 0U;
+ u32 vlan_promisc;
+ u32 l2_promisc;

- hw_atl_rpfl2promiscuous_mode_en_set(self,
- IS_FILTER_ENABLED(IFF_PROMISC));
+ l2_promisc = IS_FILTER_ENABLED(IFF_PROMISC) ||
+ !!(cfg->priv_flags & BIT(AQ_HW_LOOPBACK_DMA_NET));
+ vlan_promisc = l2_promisc || cfg->is_vlan_force_promisc;

- hw_atl_rpf_vlan_prom_mode_en_set(self,
- IS_FILTER_ENABLED(IFF_PROMISC) ||
- cfg->is_vlan_force_promisc);
+ hw_atl_rpfl2promiscuous_mode_en_set(self, l2_promisc);
+
+ hw_atl_rpf_vlan_prom_mode_en_set(self, vlan_promisc);

hw_atl_rpfl2multicast_flr_en_set(self,
IS_FILTER_ENABLED(IFF_ALLMULTI) &&


2020-03-03 18:12:31

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 142/176] KVM: Check for a bad hva before dropping into the ghc slow path

From: Sean Christopherson <[email protected]>

commit fcfbc617547fc6d9552cb6c1c563b6a90ee98085 upstream.

When reading/writing using the guest/host cache, check for a bad hva
before checking for a NULL memslot, which triggers the slow path for
handling cross-page accesses. Because the memslot is nullified on error
by __kvm_gfn_to_hva_cache_init(), if the bad hva is encountered after
crossing into a new page, then the kvm_{read,write}_guest() slow path
could potentially write/access the first chunk prior to detecting the
bad hva.

Arguably, performing a partial access is semantically correct from an
architectural perspective, but that behavior is certainly not intended.
In the original implementation, memslot was not explicitly nullified
and therefore the partial access behavior varied based on whether the
memslot itself was null, or if the hva was simply bad. The current
behavior was introduced as a seemingly unintentional side effect in
commit f1b9dd5eb86c ("kvm: Disallow wraparound in
kvm_gfn_to_hva_cache_init"), which justified the change with "since some
callers don't check the return code from this function, it seems
prudent to clear ghc->memslot in the event of an error".

Regardless of intent, the partial access is dependent on _not_ checking
the result of the cache initialization, which is arguably a bug in its
own right, at best simply weird.
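
The ordering that results from the fix, sketched: fail on a bad hva
before even considering the slow-path fallback, so no partial copy can
have happened by the time -EFAULT is returned:

	if (kvm_is_error_hva(ghc->hva))		/* bad hva: fail up front */
		return -EFAULT;

	if (unlikely(!ghc->memslot))		/* cross-page: slow path */
		return kvm_write_guest(kvm, gpa, data, len);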

Fixes: 8f964525a121 ("KVM: Allow cross page reads and writes from cached translations.")
Cc: Jim Mattson <[email protected]>
Cc: Andrew Honig <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
virt/kvm/kvm_main.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2287,12 +2287,12 @@ int kvm_write_guest_offset_cached(struct
if (slots->generation != ghc->generation)
__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len);

- if (unlikely(!ghc->memslot))
- return kvm_write_guest(kvm, gpa, data, len);
-
if (kvm_is_error_hva(ghc->hva))
return -EFAULT;

+ if (unlikely(!ghc->memslot))
+ return kvm_write_guest(kvm, gpa, data, len);
+
r = __copy_to_user((void __user *)ghc->hva + offset, data, len);
if (r)
return -EFAULT;
@@ -2320,12 +2320,12 @@ int kvm_read_guest_cached(struct kvm *kv
if (slots->generation != ghc->generation)
__kvm_gfn_to_hva_cache_init(slots, ghc, ghc->gpa, ghc->len);

- if (unlikely(!ghc->memslot))
- return kvm_read_guest(kvm, ghc->gpa, data, len);
-
if (kvm_is_error_hva(ghc->hva))
return -EFAULT;

+ if (unlikely(!ghc->memslot))
+ return kvm_read_guest(kvm, ghc->gpa, data, len);
+
r = __copy_from_user(data, (void __user *)ghc->hva, len);
if (r)
return -EFAULT;


2020-03-03 18:12:35

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 125/176] s390/qeth: fix off-by-one in RX copybreak check

From: Julian Wiedmann <[email protected]>

commit 54a61fbc020fd2e305680871c453abcf7fc0339b upstream.

The RX copybreak is intended as the _max_ frame length for which the
frame's data should be copied. So for frame_len == copybreak, don't
build an SG skb.
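
The boundary change is easy to see in isolation. A minimal userspace
sketch (values are illustrative, not the driver's):

#include <stdio.h>

int main(void)
{
	unsigned int copybreak = 64;	/* max frame length to copy */
	unsigned int frame_len = 64;	/* the boundary case the fix targets */

	/* old check: ">=" wrongly sent the boundary case to the SG path */
	printf("old: use_rx_sg=%d\n", frame_len >= copybreak);
	/* new check: only frames strictly larger than copybreak use SG */
	printf("new: use_rx_sg=%d\n", frame_len > copybreak);
	return 0;
}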

Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/s390/net/qeth_core_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -5142,7 +5142,7 @@ next_packet:
}

use_rx_sg = (card->options.cq == QETH_CQ_ENABLED) ||
- ((skb_len >= card->options.rx_sg_cb) &&
+ (skb_len > card->options.rx_sg_cb &&
!atomic_read(&card->force_alloc_skb) &&
!IS_OSN(card));



2020-03-03 18:12:38

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 120/176] kbuild: fix DT binding schema rule to detect command line changes

From: Masahiro Yamada <[email protected]>

commit 7a04960560640ac5b0b89461f7757322b57d0c7a upstream.

This if_changed_rule is not working properly; it cannot detect any
command line change.

The reason is that cmd-check in scripts/Kbuild.include compares
$(cmd_$@) and $(cmd_$1), but cmd_dtc_dt_yaml does not exist here.

For if_changed_rule to work properly, the stem part of cmd_* and rule_*
must match. Because this cmd_and_fixdep invokes cmd_dtc, this rule must
be named rule_dtc.

Fixes: 4f0e3a57d6eb ("kbuild: Add support for DT binding schema checks")
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
scripts/Makefile.lib | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -291,13 +291,13 @@ DT_TMP_SCHEMA := $(objtree)/$(DT_BINDING
quiet_cmd_dtb_check = CHECK $@
cmd_dtb_check = $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@ ;

-define rule_dtc_dt_yaml
+define rule_dtc
$(call cmd_and_fixdep,dtc,yaml)
$(call cmd,dtb_check)
endef

$(obj)/%.dt.yaml: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE
- $(call if_changed_rule,dtc_dt_yaml)
+ $(call if_changed_rule,dtc)

dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp)



2020-03-03 18:12:45

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 133/176] selftests: Install settings files to fix TIMEOUT failures

From: Michael Ellerman <[email protected]>

commit b9167c8078c3527de6da241c8a1a75a9224ed90a upstream.

Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second
timeout per test") added a 45 second timeout for tests, and also added
a way for tests to customise the timeout via a settings file.

For example the ftrace tests take multiple minutes to run, so they
were given longer in commit b43e78f65b1d ("tracing/selftests: Turn off
timeout setting").

This works when the tests are run from the source tree. However if the
tests are installed with "make -C tools/testing/selftests install",
the settings files are not copied into the install directory. When the
tests are then run from the install directory the longer timeouts are
not applied and the tests time out incorrectly.

So add the settings files to TEST_FILES of the appropriate Makefiles
to cause the settings files to be installed using the existing install
logic.

Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/testing/selftests/ftrace/Makefile | 2 +-
tools/testing/selftests/livepatch/Makefile | 2 ++
tools/testing/selftests/rseq/Makefile | 2 ++
tools/testing/selftests/rtc/Makefile | 2 ++
4 files changed, 7 insertions(+), 1 deletion(-)

--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -2,7 +2,7 @@
all:

TEST_PROGS := ftracetest
-TEST_FILES := test.d
+TEST_FILES := test.d settings
EXTRA_CLEAN := $(OUTPUT)/logs/*

include ../lib.mk
--- a/tools/testing/selftests/livepatch/Makefile
+++ b/tools/testing/selftests/livepatch/Makefile
@@ -8,4 +8,6 @@ TEST_PROGS := \
test-state.sh \
test-ftrace.sh

+TEST_FILES := settings
+
include ../lib.mk
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -19,6 +19,8 @@ TEST_GEN_PROGS_EXTENDED = librseq.so

TEST_PROGS = run_param_test.sh

+TEST_FILES := settings
+
include ../lib.mk

$(OUTPUT)/librseq.so: rseq.c rseq.h rseq-*.h
--- a/tools/testing/selftests/rtc/Makefile
+++ b/tools/testing/selftests/rtc/Makefile
@@ -6,4 +6,6 @@ TEST_GEN_PROGS = rtctest

TEST_GEN_PROGS_EXTENDED = setdate

+TEST_FILES := settings
+
include ../lib.mk


2020-03-03 18:12:50

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 116/176] drm/i915: Avoid recursing onto active vma from the shrinker

From: Chris Wilson <[email protected]>

commit 238734262142075056653b4de091458e0ca858f2 upstream.

We mark the vma as active while binding it in order to protect ourselves
from being shrunk under mempressure. This only works if we are strict in
not attempting to shrink active objects.

<6> [472.618968] Workqueue: events_unbound fence_work [i915]
<4> [472.618970] Call Trace:
<4> [472.618974] ? __schedule+0x2e5/0x810
<4> [472.618978] schedule+0x37/0xe0
<4> [472.618982] schedule_preempt_disabled+0xf/0x20
<4> [472.618984] __mutex_lock+0x281/0x9c0
<4> [472.618987] ? mark_held_locks+0x49/0x70
<4> [472.618989] ? _raw_spin_unlock_irqrestore+0x47/0x60
<4> [472.619038] ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619084] ? i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619122] i915_vma_unbind+0xae/0x110 [i915]
<4> [472.619165] i915_gem_object_unbind+0x1dc/0x400 [i915]
<4> [472.619208] i915_gem_shrink+0x328/0x660 [i915]
<4> [472.619250] ? i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619282] i915_gem_shrink_all+0x38/0x60 [i915]
<4> [472.619325] vm_alloc_page.constprop.25+0x1aa/0x240 [i915]
<4> [472.619330] ? rcu_read_lock_sched_held+0x4d/0x80
<4> [472.619363] ? __alloc_pd+0xb/0x30 [i915]
<4> [472.619366] ? module_assert_mutex_or_preempt+0xf/0x30
<4> [472.619368] ? __module_address+0x23/0xe0
<4> [472.619371] ? is_module_address+0x26/0x40
<4> [472.619374] ? static_obj+0x34/0x50
<4> [472.619376] ? lockdep_init_map+0x4d/0x1e0
<4> [472.619407] setup_page_dma+0xd/0x90 [i915]
<4> [472.619437] alloc_pd+0x29/0x50 [i915]
<4> [472.619470] __gen8_ppgtt_alloc+0x443/0x6b0 [i915]
<4> [472.619503] gen8_ppgtt_alloc+0xd7/0x300 [i915]
<4> [472.619535] ppgtt_bind_vma+0x2a/0xe0 [i915]
<4> [472.619577] __vma_bind+0x26/0x40 [i915]
<4> [472.619611] fence_work+0x1c/0x90 [i915]
<4> [472.619617] process_one_work+0x26a/0x620

Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 6f24e41022f28061368776ea1514db0a6e67a9b1)
Signed-off-by: Jani Nikula <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -257,8 +257,7 @@ unsigned long i915_gem_shrink_all(struct
with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
freed = i915_gem_shrink(i915, -1UL, NULL,
I915_SHRINK_BOUND |
- I915_SHRINK_UNBOUND |
- I915_SHRINK_ACTIVE);
+ I915_SHRINK_UNBOUND);
}

return freed;
@@ -337,7 +336,6 @@ i915_gem_shrinker_oom(struct notifier_bl
freed_pages = 0;
with_intel_runtime_pm(&i915->runtime_pm, wakeref)
freed_pages += i915_gem_shrink(i915, -1UL, NULL,
- I915_SHRINK_ACTIVE |
I915_SHRINK_BOUND |
I915_SHRINK_UNBOUND |
I915_SHRINK_WRITEBACK);


2020-03-03 18:12:51

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 113/176] i2c: jz4780: silence log flood on txabrt

From: Wolfram Sang <[email protected]>

commit 9e661cedcc0a072d91a32cb88e0515ea26e35711 upstream.

The printout for txabrt is way too talkative and is highly annoying with
scanning programs like 'i2cdetect'. Reduce it to the minimum; the rest
can be gleaned from I2C core debugging and datasheet information. Also,
make it a debug printout, it won't help the regular user.

Fixes: ba92222ed63a ("i2c: jz4780: Add i2c bus controller driver for Ingenic JZ4780")
Reported-by: H. Nikolaus Schaller <[email protected]>
Tested-by: H. Nikolaus Schaller <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/i2c/busses/i2c-jz4780.c | 36 ++----------------------------------
1 file changed, 2 insertions(+), 34 deletions(-)

--- a/drivers/i2c/busses/i2c-jz4780.c
+++ b/drivers/i2c/busses/i2c-jz4780.c
@@ -73,25 +73,6 @@
#define JZ4780_I2C_STA_TFNF BIT(1)
#define JZ4780_I2C_STA_ACT BIT(0)

-static const char * const jz4780_i2c_abrt_src[] = {
- "ABRT_7B_ADDR_NOACK",
- "ABRT_10ADDR1_NOACK",
- "ABRT_10ADDR2_NOACK",
- "ABRT_XDATA_NOACK",
- "ABRT_GCALL_NOACK",
- "ABRT_GCALL_READ",
- "ABRT_HS_ACKD",
- "SBYTE_ACKDET",
- "ABRT_HS_NORSTRT",
- "SBYTE_NORSTRT",
- "ABRT_10B_RD_NORSTRT",
- "ABRT_MASTER_DIS",
- "ARB_LOST",
- "SLVFLUSH_TXFIFO",
- "SLV_ARBLOST",
- "SLVRD_INTX",
-};
-
#define JZ4780_I2C_INTST_IGC BIT(11)
#define JZ4780_I2C_INTST_ISTT BIT(10)
#define JZ4780_I2C_INTST_ISTP BIT(9)
@@ -529,21 +510,8 @@ done:

static void jz4780_i2c_txabrt(struct jz4780_i2c *i2c, int src)
{
- int i;
-
- dev_err(&i2c->adap.dev, "txabrt: 0x%08x\n", src);
- dev_err(&i2c->adap.dev, "device addr=%x\n",
- jz4780_i2c_readw(i2c, JZ4780_I2C_TAR));
- dev_err(&i2c->adap.dev, "send cmd count:%d %d\n",
- i2c->cmd, i2c->cmd_buf[i2c->cmd]);
- dev_err(&i2c->adap.dev, "receive data count:%d %d\n",
- i2c->cmd, i2c->data_buf[i2c->cmd]);
-
- for (i = 0; i < 16; i++) {
- if (src & BIT(i))
- dev_dbg(&i2c->adap.dev, "I2C TXABRT[%d]=%s\n",
- i, jz4780_i2c_abrt_src[i]);
- }
+ dev_dbg(&i2c->adap.dev, "txabrt: 0x%08x, cmd: %d, send: %d, recv: %d\n",
+ src, i2c->cmd, i2c->cmd_buf[i2c->cmd], i2c->data_buf[i2c->cmd]);
}

static inline int jz4780_i2c_xfer_read(struct jz4780_i2c *i2c,


2020-03-03 18:12:53

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 108/176] HID: hiddev: Fix race in hiddev_disconnect()

From: [email protected] <[email protected]>

commit 5c02c447eaeda29d3da121a2e17b97ccaf579b51 upstream.

Syzbot reports that "hiddev" is used after it is freed in hiddev_disconnect().
The hiddev_disconnect() function sets "hiddev->exist = 0;" so
hiddev_release() can free it as soon as we drop the "existancelock"
lock. This patch moves the mutex_unlock(&hiddev->existancelock) call to
after we have finished using it.

Reported-by: [email protected]
Fixes: 7f77897ef2b6 ("HID: hiddev: fix potential use-after-free")
Suggested-by: Alan Stern <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/usbhid/hiddev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -932,9 +932,9 @@ void hiddev_disconnect(struct hid_device
hiddev->exist = 0;

if (hiddev->open) {
- mutex_unlock(&hiddev->existancelock);
hid_hw_close(hiddev->hid);
wake_up_interruptible(&hiddev->wait);
+ mutex_unlock(&hiddev->existancelock);
} else {
mutex_unlock(&hiddev->existancelock);
kfree(hiddev);


2020-03-03 18:13:03

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 080/176] nvme/pci: move cqe check after device shutdown

From: Keith Busch <[email protected]>

[ Upstream commit fa46c6fb5d61b1f17b06d7c6ef75478b576304c7 ]

Many users have reported nvme-triggered irq_startup() warnings during
shutdown. The driver uses the nvme queue's irq to synchronize scanning
for completions, and enabling an interrupt affined to only offline CPUs
triggers the alarming warning.

Move the final CQE check to after the device has been disabled and all
registered interrupts have been torn down, so that we do not have any
IRQ to synchronize.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206509
Reviewed-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvme/host/pci.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index da392b50f73e7..9c80f9f081496 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1401,6 +1401,23 @@ static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
nvme_poll_irqdisable(nvmeq, -1);
}

+/*
+ * Called only on a device that has been disabled and after all other threads
+ * that can check this device's completion queues have synced. This is the
+ * last chance for the driver to see a natural completion before
+ * nvme_cancel_request() terminates all incomplete requests.
+ */
+static void nvme_reap_pending_cqes(struct nvme_dev *dev)
+{
+ u16 start, end;
+ int i;
+
+ for (i = dev->ctrl.queue_count - 1; i > 0; i--) {
+ nvme_process_cq(&dev->queues[i], &start, &end, -1);
+ nvme_complete_cqes(&dev->queues[i], start, end);
+ }
+}
+
static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues,
int entry_size)
{
@@ -2235,11 +2252,6 @@ static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode)
if (timeout == 0)
return false;

- /* handle any remaining CQEs */
- if (opcode == nvme_admin_delete_cq &&
- !test_bit(NVMEQ_DELETE_ERROR, &nvmeq->flags))
- nvme_poll_irqdisable(nvmeq, -1);
-
sent--;
if (nr_queues)
goto retry;
@@ -2428,6 +2440,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
nvme_suspend_io_queues(dev);
nvme_suspend_queue(&dev->queues[0]);
nvme_pci_disable(dev);
+ nvme_reap_pending_cqes(dev);

blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
blk_mq_tagset_busy_iter(&dev->admin_tagset, nvme_cancel_request, &dev->ctrl);
--
2.20.1



2020-03-03 18:13:06

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 078/176] nvme/tcp: fix bug on double requeue when send fails

From: Anton Eidelman <[email protected]>

[ Upstream commit 2d570a7c0251c594489a2c16b82b14ae30345c03 ]

When nvme_tcp_io_work() fails to send to socket due to
connection close/reset, error_recovery work is triggered
from nvme_tcp_state_change() socket callback.
This cancels all the active requests in the tagset,
which requeues them.

The failed request, however, was also ended and thus requeued
individually, unless send returned -EPIPE. Another return code
that must be treated the same way is -ECONNRESET.

The double requeue triggered BUG_ON(blk_queued_rq(rq)) in
blk_mq_requeue_request(), reached from either the individual requeue
of the failed request or the bulk requeue from
blk_mq_tagset_busy_iter(..., nvme_cancel_request, ...).

Signed-off-by: Anton Eidelman <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvme/host/tcp.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 6d43b23a0fc8b..f8fa5c5b79f17 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1054,7 +1054,12 @@ static void nvme_tcp_io_work(struct work_struct *w)
} else if (unlikely(result < 0)) {
dev_err(queue->ctrl->ctrl.device,
"failed to send request %d\n", result);
- if (result != -EPIPE)
+
+ /*
+ * Fail the request unless peer closed the connection,
+ * in which case error recovery flow will complete all.
+ */
+ if ((result != -EPIPE) && (result != -ECONNRESET))
nvme_tcp_fail_request(queue->request);
nvme_tcp_done_send_req(queue);
return;
--
2.20.1



2020-03-03 18:13:22

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 103/176] net/smc: transfer fasync_list in case of fallback

From: Ursula Braun <[email protected]>

commit 67f562e3e147750a02b2a91d21a163fc44a1d13e upstream.

SMC does not work together with FASTOPEN. If sendmsg() is called with
flag MSG_FASTOPEN in SMC_INIT state, the SMC-socket switches to
fallback mode. To handle the previous ioctl FIOASYNC call correctly
in this case, it is necessary to transfer the socket wait queue
fasync_list to the internal TCP socket.

Reported-by: [email protected]
Fixes: ee9dfbef02d18 ("net/smc: handle sockopts forcing fallback")
Signed-off-by: Ursula Braun <[email protected]>
Signed-off-by: Karsten Graul <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/smc/af_smc.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -470,6 +470,8 @@ static void smc_switch_to_fallback(struc
if (smc->sk.sk_socket && smc->sk.sk_socket->file) {
smc->clcsock->file = smc->sk.sk_socket->file;
smc->clcsock->file->private_data = smc->clcsock;
+ smc->clcsock->wq.fasync_list =
+ smc->sk.sk_socket->wq.fasync_list;
}
}



2020-03-03 18:13:24

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 094/176] drm/radeon: Inline drm_get_pci_dev

From: Daniel Vetter <[email protected]>

commit eb12c957735b582607e5842a06d1f4c62e185c1d upstream.

It's the last user, and more importantly, it's the last non-legacy
user of anything in drm_pci.c.

The only tricky bit is the agp initialization. But a close look shows
that radeon does not use the drm_agp midlayer (the main use of that is
drm_bufs for legacy drivers), and instead could use the agp subsystem
directly (like nouveau does already). Hence we can just pull this in
too.

A further step would be to entirely drop the use of drm_device->agp,
but that feels like too much churn just for this patch.

Signed-off-by: Daniel Vetter <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: "David (ChunMing) Zhou" <[email protected]>
Cc: [email protected]
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/radeon/radeon_drv.c | 43 ++++++++++++++++++++++++++++++++++--
drivers/gpu/drm/radeon/radeon_kms.c | 6 +++++
2 files changed, 47 insertions(+), 2 deletions(-)

--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -37,6 +37,7 @@
#include <linux/vga_switcheroo.h>
#include <linux/mmu_notifier.h>

+#include <drm/drm_agpsupport.h>
#include <drm/drm_crtc_helper.h>
#include <drm/drm_drv.h>
#include <drm/drm_fb_helper.h>
@@ -325,6 +326,7 @@ static int radeon_pci_probe(struct pci_d
const struct pci_device_id *ent)
{
unsigned long flags = 0;
+ struct drm_device *dev;
int ret;

if (!ent)
@@ -365,7 +367,44 @@ static int radeon_pci_probe(struct pci_d
if (ret)
return ret;

- return drm_get_pci_dev(pdev, ent, &kms_driver);
+ dev = drm_dev_alloc(&kms_driver, &pdev->dev);
+ if (IS_ERR(dev))
+ return PTR_ERR(dev);
+
+ ret = pci_enable_device(pdev);
+ if (ret)
+ goto err_free;
+
+ dev->pdev = pdev;
+#ifdef __alpha__
+ dev->hose = pdev->sysdata;
+#endif
+
+ pci_set_drvdata(pdev, dev);
+
+ if (pci_find_capability(dev->pdev, PCI_CAP_ID_AGP))
+ dev->agp = drm_agp_init(dev);
+ if (dev->agp) {
+ dev->agp->agp_mtrr = arch_phys_wc_add(
+ dev->agp->agp_info.aper_base,
+ dev->agp->agp_info.aper_size *
+ 1024 * 1024);
+ }
+
+ ret = drm_dev_register(dev, ent->driver_data);
+ if (ret)
+ goto err_agp;
+
+ return 0;
+
+err_agp:
+ if (dev->agp)
+ arch_phys_wc_del(dev->agp->agp_mtrr);
+ kfree(dev->agp);
+ pci_disable_device(pdev);
+err_free:
+ drm_dev_put(dev);
+ return ret;
}

static void
@@ -575,7 +614,7 @@ radeon_get_crtc_scanout_position(struct

static struct drm_driver kms_driver = {
.driver_features =
- DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER,
+ DRIVER_GEM | DRIVER_RENDER,
.load = radeon_driver_load_kms,
.open = radeon_driver_open_kms,
.postclose = radeon_driver_postclose_kms,
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -31,6 +31,7 @@
#include <linux/uaccess.h>
#include <linux/vga_switcheroo.h>

+#include <drm/drm_agpsupport.h>
#include <drm/drm_fb_helper.h>
#include <drm/drm_file.h>
#include <drm/drm_ioctl.h>
@@ -77,6 +78,11 @@ void radeon_driver_unload_kms(struct drm
radeon_modeset_fini(rdev);
radeon_device_fini(rdev);

+ if (dev->agp)
+ arch_phys_wc_del(dev->agp->agp_mtrr);
+ kfree(dev->agp);
+ dev->agp = NULL;
+
done_free:
kfree(rdev);
dev->dev_private = NULL;


2020-03-03 18:13:24

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 093/176] drm/amdgpu: Drop DRIVER_USE_AGP

From: Daniel Vetter <[email protected]>

commit 8a3bddf67ce88b96531fb22c5a75d7f4dc41d155 upstream.

This doesn't do anything except auto-init drm_agp support when you
call drm_get_pci_dev(). Which amdgpu stopped doing with

commit b58c11314a1706bf094c489ef5cb28f76478c704
Author: Alex Deucher <[email protected]>
Date: Fri Jun 2 17:16:31 2017 -0400

drm/amdgpu: drop deprecated drm_get_pci_dev and drm_put_dev

No idea whether this was intentional or accidental breakage, but I
guess anyone who manages to boot this modern a gpu behind an agp
bridge deserves a prize. A prize I never expect anyone to ever collect
:-)

Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Xiaojie Yuan <[email protected]>
Cc: Evan Quan <[email protected]>
Cc: "Tianci.Yin" <[email protected]>
Cc: "Marek Olšák" <[email protected]>
Cc: Hans de Goede <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1357,7 +1357,7 @@ amdgpu_get_crtc_scanout_position(struct

static struct drm_driver kms_driver = {
.driver_features =
- DRIVER_USE_AGP | DRIVER_ATOMIC |
+ DRIVER_ATOMIC |
DRIVER_GEM |
DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ |
DRIVER_SYNCOBJ_TIMELINE,


2020-03-03 18:13:29

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 088/176] ACPI: watchdog: Fix gas->access_width usage

From: Mika Westerberg <[email protected]>

commit 2ba33a4e9e22ac4dda928d3e9b5978a3a2ded4e0 upstream.

The ACPI Generic Address Structure (GAS) access_width field is not in
bytes as the driver seems to expect in a few places, so fix this by
using the newly introduced macro ACPI_ACCESS_BYTE_WIDTH().
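
The GAS access_width is an encoded size (1 = byte, 2 = word, 3 = dword,
4 = qword) rather than a byte count. A small standalone sketch of the
conversion; the macro definition is reproduced here for illustration,
and ACPICA's actypes.h remains the authoritative source:

#include <stdio.h>

/* Reproduced from ACPICA's actypes.h for illustration. */
#define ACPI_ACCESS_BIT_WIDTH(size)	(1 << ((size) + 2))
#define ACPI_ACCESS_BYTE_WIDTH(size)	(ACPI_ACCESS_BIT_WIDTH(size) >> 3)

int main(void)
{
	unsigned int w;

	/* access_width 1..4 maps to 1, 2, 4 and 8 bytes */
	for (w = 1; w <= 4; w++)
		printf("access_width %u -> %u byte(s)\n",
		       w, ACPI_ACCESS_BYTE_WIDTH(w));
	return 0;
}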

Fixes: b1abf6fc4982 ("ACPI / watchdog: Fix off-by-one error at resource assignment")
Fixes: 058dfc767008 ("ACPI / watchdog: Add support for WDAT hardware watchdog")
Reported-by: Jean Delvare <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Cc: 4.16+ <[email protected]> # 4.16+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/acpi/acpi_watchdog.c | 3 +--
drivers/watchdog/wdat_wdt.c | 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/acpi/acpi_watchdog.c
+++ b/drivers/acpi/acpi_watchdog.c
@@ -126,12 +126,11 @@ void __init acpi_watchdog_init(void)
gas = &entries[i].register_region;

res.start = gas->address;
+ res.end = res.start + ACPI_ACCESS_BYTE_WIDTH(gas->access_width) - 1;
if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
res.flags = IORESOURCE_MEM;
- res.end = res.start + ALIGN(gas->access_width, 4) - 1;
} else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
res.flags = IORESOURCE_IO;
- res.end = res.start + gas->access_width - 1;
} else {
pr_warn("Unsupported address space: %u\n",
gas->space_id);
--- a/drivers/watchdog/wdat_wdt.c
+++ b/drivers/watchdog/wdat_wdt.c
@@ -389,7 +389,7 @@ static int wdat_wdt_probe(struct platfor

memset(&r, 0, sizeof(r));
r.start = gas->address;
- r.end = r.start + gas->access_width - 1;
+ r.end = r.start + ACPI_ACCESS_BYTE_WIDTH(gas->access_width) - 1;
if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
r.flags = IORESOURCE_MEM;
} else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {


2020-03-03 18:13:33

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 086/176] audit: always check the netlink payload length in audit_receive_msg()

From: Paul Moore <[email protected]>

commit 756125289285f6e55a03861bf4b6257aa3d19a93 upstream.

This patch ensures that we always check the netlink payload length
in audit_receive_msg() before we take any action on the payload
itself.
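
The two guard styles the patch applies can be condensed into a
kernel-style sketch (names here are illustrative, not the audit code's
own; data_len is assumed already validated as non-negative by the
netlink core):

#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/errno.h>

struct fixed_cmd { u32 a; u32 b; };		/* must arrive complete */
struct versioned_status { u32 mask; u32 enabled; }; /* may grow over time */

/* Fixed-size commands: reject truncated payloads outright. */
static int handle_fixed(const void *data, int data_len)
{
	const struct fixed_cmd *cmd;

	if (data_len < sizeof(*cmd))
		return -EINVAL;
	cmd = data;
	return cmd->a + cmd->b;		/* safe to dereference now */
}

/* Versioned structs: copy at most what was sent, zero the rest. */
static void handle_status(const void *data, int data_len,
			  struct versioned_status *s)
{
	memset(s, 0, sizeof(*s));
	memcpy(s, data, min_t(size_t, sizeof(*s), data_len));
}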

Cc: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Paul Moore <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/audit.c | 40 +++++++++++++++++++++-------------------
1 file changed, 21 insertions(+), 19 deletions(-)

--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1100,13 +1100,11 @@ static void audit_log_feature_change(int
audit_log_end(ab);
}

-static int audit_set_feature(struct sk_buff *skb)
+static int audit_set_feature(struct audit_features *uaf)
{
- struct audit_features *uaf;
int i;

BUILD_BUG_ON(AUDIT_LAST_FEATURE + 1 > ARRAY_SIZE(audit_feature_names));
- uaf = nlmsg_data(nlmsg_hdr(skb));

/* if there is ever a version 2 we should handle that here */

@@ -1174,6 +1172,7 @@ static int audit_receive_msg(struct sk_b
{
u32 seq;
void *data;
+ int data_len;
int err;
struct audit_buffer *ab;
u16 msg_type = nlh->nlmsg_type;
@@ -1187,6 +1186,7 @@ static int audit_receive_msg(struct sk_b

seq = nlh->nlmsg_seq;
data = nlmsg_data(nlh);
+ data_len = nlmsg_len(nlh);

switch (msg_type) {
case AUDIT_GET: {
@@ -1210,7 +1210,7 @@ static int audit_receive_msg(struct sk_b
struct audit_status s;
memset(&s, 0, sizeof(s));
/* guard against past and future API changes */
- memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh)));
+ memcpy(&s, data, min_t(size_t, sizeof(s), data_len));
if (s.mask & AUDIT_STATUS_ENABLED) {
err = audit_set_enabled(s.enabled);
if (err < 0)
@@ -1314,7 +1314,9 @@ static int audit_receive_msg(struct sk_b
return err;
break;
case AUDIT_SET_FEATURE:
- err = audit_set_feature(skb);
+ if (data_len < sizeof(struct audit_features))
+ return -EINVAL;
+ err = audit_set_feature(data);
if (err)
return err;
break;
@@ -1326,6 +1328,8 @@ static int audit_receive_msg(struct sk_b

err = audit_filter(msg_type, AUDIT_FILTER_USER);
if (err == 1) { /* match or error */
+ char *str = data;
+
err = 0;
if (msg_type == AUDIT_USER_TTY) {
err = tty_audit_push();
@@ -1333,26 +1337,24 @@ static int audit_receive_msg(struct sk_b
break;
}
audit_log_user_recv_msg(&ab, msg_type);
- if (msg_type != AUDIT_USER_TTY)
+ if (msg_type != AUDIT_USER_TTY) {
+ /* ensure NULL termination */
+ str[data_len - 1] = '\0';
audit_log_format(ab, " msg='%.*s'",
AUDIT_MESSAGE_TEXT_MAX,
- (char *)data);
- else {
- int size;
-
+ str);
+ } else {
audit_log_format(ab, " data=");
- size = nlmsg_len(nlh);
- if (size > 0 &&
- ((unsigned char *)data)[size - 1] == '\0')
- size--;
- audit_log_n_untrustedstring(ab, data, size);
+ if (data_len > 0 && str[data_len - 1] == '\0')
+ data_len--;
+ audit_log_n_untrustedstring(ab, str, data_len);
}
audit_log_end(ab);
}
break;
case AUDIT_ADD_RULE:
case AUDIT_DEL_RULE:
- if (nlmsg_len(nlh) < sizeof(struct audit_rule_data))
+ if (data_len < sizeof(struct audit_rule_data))
return -EINVAL;
if (audit_enabled == AUDIT_LOCKED) {
audit_log_common_recv_msg(audit_context(), &ab,
@@ -1364,7 +1366,7 @@ static int audit_receive_msg(struct sk_b
audit_log_end(ab);
return -EPERM;
}
- err = audit_rule_change(msg_type, seq, data, nlmsg_len(nlh));
+ err = audit_rule_change(msg_type, seq, data, data_len);
break;
case AUDIT_LIST_RULES:
err = audit_list_rules_send(skb, seq);
@@ -1379,7 +1381,7 @@ static int audit_receive_msg(struct sk_b
case AUDIT_MAKE_EQUIV: {
void *bufp = data;
u32 sizes[2];
- size_t msglen = nlmsg_len(nlh);
+ size_t msglen = data_len;
char *old, *new;

err = -EINVAL;
@@ -1455,7 +1457,7 @@ static int audit_receive_msg(struct sk_b

memset(&s, 0, sizeof(s));
/* guard against past and future API changes */
- memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh)));
+ memcpy(&s, data, min_t(size_t, sizeof(s), data_len));
/* check if new data is valid */
if ((s.enabled != 0 && s.enabled != 1) ||
(s.log_passwd != 0 && s.log_passwd != 1))


2020-03-03 18:13:40

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 046/176] RDMA/siw: Remove unwanted WARN_ON in siw_cm_llp_data_ready()

From: Krishnamraju Eraparaju <[email protected]>

[ Upstream commit 663218a3e715fd9339d143a3e10088316b180f4f ]

Warnings like the one below can fill up dmesg while disconnecting RDMA
connections.
Hence, remove the unwanted WARN_ON.

WARNING: CPU: 6 PID: 0 at drivers/infiniband/sw/siw/siw_cm.c:1229 siw_cm_llp_data_ready+0xc1/0xd0 [siw]
RIP: 0010:siw_cm_llp_data_ready+0xc1/0xd0 [siw]
Call Trace:
<IRQ>
tcp_data_queue+0x226/0xb40
tcp_rcv_established+0x220/0x620
tcp_v4_do_rcv+0x12a/0x1e0
tcp_v4_rcv+0xb05/0xc00
ip_local_deliver_finish+0x69/0x210
ip_local_deliver+0x6b/0xe0
ip_rcv+0x273/0x362
__netif_receive_skb_core+0xb35/0xc30
netif_receive_skb_internal+0x3d/0xb0
napi_gro_frags+0x13b/0x200
t4_ethrx_handler+0x433/0x7d0 [cxgb4]
process_responses+0x318/0x580 [cxgb4]
napi_rx_handler+0x14/0x100 [cxgb4]
net_rx_action+0x149/0x3b0
__do_softirq+0xe3/0x30a
irq_exit+0x100/0x110
do_IRQ+0x7f/0xe0
common_interrupt+0xf/0xf
</IRQ>

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Krishnamraju Eraparaju <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/sw/siw/siw_cm.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 3bccfef40e7e1..ac86363ce1a24 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1225,10 +1225,9 @@ static void siw_cm_llp_data_ready(struct sock *sk)
read_lock(&sk->sk_callback_lock);

cep = sk_to_cep(sk);
- if (!cep) {
- WARN_ON(1);
+ if (!cep)
goto out;
- }
+
siw_dbg_cep(cep, "state: %d\n", cep->state);

switch (cep->state) {
--
2.20.1



2020-03-03 18:13:45

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 069/176] ice: Use correct netif error function

From: Ben Shelton <[email protected]>

[ Upstream commit 1d8bd9927234081db15a1d42a7f99505244e3703 ]

Use the correct netif_msg_[tx,rx]_error() function to determine whether to
print the MDD event type.

Signed-off-by: Ben Shelton <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c9b35b202639d..7f71f06fa819c 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -1235,7 +1235,7 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
u16 queue = ((reg & GL_MDET_TX_TCLAN_QNUM_M) >>
GL_MDET_TX_TCLAN_QNUM_S);

- if (netif_msg_rx_err(pf))
+ if (netif_msg_tx_err(pf))
dev_info(dev, "Malicious Driver Detection event %d on TX queue %d PF# %d VF# %d\n",
event, queue, pf_num, vf_num);
wr32(hw, GL_MDET_TX_TCLAN, 0xffffffff);
--
2.20.1



2020-03-03 18:13:47

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 095/176] macintosh: therm_windtunnel: fix regression when instantiating devices

From: Wolfram Sang <[email protected]>

commit 38b17afb0ebb9ecd41418d3c08bcf9198af4349d upstream.

Removing attach_adapter from this driver caused a regression for at
least some machines. Those machines had the sensors described in their
DT, too, so they didn't need manual creation of the sensor devices. The
old code worked, though, because manual creation came first. Creation of
DT devices then failed later and caused error logs, but the sensors
worked nonetheless because of the manually created devices.

When removing attach_adapter, manual creation now comes later and loses
the race. The sensor devices were already registered via DT, yet with
another binding, so the driver could not be bound to them.

This fix refactors the code to remove the race and only manually creates
devices if there are no DT nodes present. Also, the DT binding is updated
to match both the DT and the manually created devices. Because we don't
know which device creation will be used at runtime, the code to start
the kthread is moved to do_probe() which will be called by both methods.

Fixes: 3e7bed52719d ("macintosh: therm_windtunnel: drop using attach_adapter")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723
Reported-by: Erhard Furtner <[email protected]>
Tested-by: Erhard Furtner <[email protected]>
Acked-by: Michael Ellerman <[email protected]> (powerpc)
Signed-off-by: Wolfram Sang <[email protected]>
Cc: [email protected] # v4.19+
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/macintosh/therm_windtunnel.c | 52 ++++++++++++++++++++---------------
1 file changed, 31 insertions(+), 21 deletions(-)

--- a/drivers/macintosh/therm_windtunnel.c
+++ b/drivers/macintosh/therm_windtunnel.c
@@ -300,9 +300,11 @@ static int control_loop(void *dummy)
/* i2c probing and setup */
/************************************************************************/

-static int
-do_attach( struct i2c_adapter *adapter )
+static void do_attach(struct i2c_adapter *adapter)
{
+ struct i2c_board_info info = { };
+ struct device_node *np;
+
/* scan 0x48-0x4f (DS1775) and 0x2c-2x2f (ADM1030) */
static const unsigned short scan_ds1775[] = {
0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f,
@@ -313,25 +315,24 @@ do_attach( struct i2c_adapter *adapter )
I2C_CLIENT_END
};

- if( strncmp(adapter->name, "uni-n", 5) )
- return 0;
-
- if( !x.running ) {
- struct i2c_board_info info;
+ if (x.running || strncmp(adapter->name, "uni-n", 5))
+ return;

- memset(&info, 0, sizeof(struct i2c_board_info));
- strlcpy(info.type, "therm_ds1775", I2C_NAME_SIZE);
+ np = of_find_compatible_node(adapter->dev.of_node, NULL, "MAC,ds1775");
+ if (np) {
+ of_node_put(np);
+ } else {
+ strlcpy(info.type, "MAC,ds1775", I2C_NAME_SIZE);
i2c_new_probed_device(adapter, &info, scan_ds1775, NULL);
+ }

- strlcpy(info.type, "therm_adm1030", I2C_NAME_SIZE);
+ np = of_find_compatible_node(adapter->dev.of_node, NULL, "MAC,adm1030");
+ if (np) {
+ of_node_put(np);
+ } else {
+ strlcpy(info.type, "MAC,adm1030", I2C_NAME_SIZE);
i2c_new_probed_device(adapter, &info, scan_adm1030, NULL);
-
- if( x.thermostat && x.fan ) {
- x.running = 1;
- x.poll_task = kthread_run(control_loop, NULL, "g4fand");
- }
}
- return 0;
}

static int
@@ -404,8 +405,8 @@ out:
enum chip { ds1775, adm1030 };

static const struct i2c_device_id therm_windtunnel_id[] = {
- { "therm_ds1775", ds1775 },
- { "therm_adm1030", adm1030 },
+ { "MAC,ds1775", ds1775 },
+ { "MAC,adm1030", adm1030 },
{ }
};
MODULE_DEVICE_TABLE(i2c, therm_windtunnel_id);
@@ -414,6 +415,7 @@ static int
do_probe(struct i2c_client *cl, const struct i2c_device_id *id)
{
struct i2c_adapter *adapter = cl->adapter;
+ int ret = 0;

if( !i2c_check_functionality(adapter, I2C_FUNC_SMBUS_WORD_DATA
| I2C_FUNC_SMBUS_WRITE_BYTE) )
@@ -421,11 +423,19 @@ do_probe(struct i2c_client *cl, const st

switch (id->driver_data) {
case adm1030:
- return attach_fan( cl );
+ ret = attach_fan(cl);
+ break;
case ds1775:
- return attach_thermostat(cl);
+ ret = attach_thermostat(cl);
+ break;
}
- return 0;
+
+ if (!x.running && x.thermostat && x.fan) {
+ x.running = 1;
+ x.poll_task = kthread_run(control_loop, NULL, "g4fand");
+ }
+
+ return ret;
}

static struct i2c_driver g4fan_driver = {


2020-03-03 18:13:52

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 084/176] ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()

From: Dan Carpenter <[email protected]>

commit 37b0b6b8b99c0e1c1f11abbe7cf49b6d03795b3f upstream.

If sbi->s_flex_groups_allocated is zero and the first allocation fails
then this code will crash. The problem is that "i--" will set "i" to
-1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1
is type promoted to unsigned and becomes UINT_MAX. Since UINT_MAX
is more than zero, the condition is true so we call kvfree(new_groups[-1]).
The loop will carry on freeing invalid memory until it crashes.
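
The promotion hazard is reproducible in a few lines of userspace C (a
standalone illustration, not the ext4 code):

#include <stdio.h>

int main(void)
{
	int i = -1;
	unsigned int allocated = 0;	/* stand-in for s_flex_groups_allocated */

	/*
	 * The usual arithmetic conversions promote 'i' to unsigned int,
	 * turning -1 into UINT_MAX, so the comparison holds and a loop
	 * like "for (i--; i >= allocated; i--)" keeps running with
	 * invalid indices.
	 */
	if (i >= allocated)
		printf("-1 >= 0 holds after promotion to unsigned\n");
	return 0;
}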

Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access")
Reviewed-by: Suraj Jitindar Singh <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/super.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -2346,7 +2346,7 @@ int ext4_alloc_flex_bg_array(struct supe
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
struct flex_groups **old_groups, **new_groups;
- int size, i;
+ int size, i, j;

if (!sbi->s_log_groups_per_flex)
return 0;
@@ -2367,8 +2367,8 @@ int ext4_alloc_flex_bg_array(struct supe
sizeof(struct flex_groups)),
GFP_KERNEL);
if (!new_groups[i]) {
- for (i--; i >= sbi->s_flex_groups_allocated; i--)
- kvfree(new_groups[i]);
+ for (j = sbi->s_flex_groups_allocated; j < i; j++)
+ kvfree(new_groups[j]);
kvfree(new_groups);
ext4_msg(sb, KERN_ERR,
"not enough memory for %d flex groups", size);


2020-03-03 18:13:57

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 063/176] net: ena: ena-com.c: prevent NULL pointer dereference

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit c207979f5ae10ed70aff1bb13f39f0736973de99 ]

comp_ctx can be NULL in a very rare case when an admin command is executed
during the execution of ena_remove().

The bug scenario is as follows:

* ena_destroy_device() sets the comp_ctx to be NULL
* An admin command is executed before unregister_netdev() runs;
this can still happen because our device can still receive callbacks
from the netdev infrastructure, such as ethtool commands.
* When attempting to access the comp_ctx, the bug occurs since it's set
to NULL

Fix:
Added a check that comp_ctx is not NULL

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_com.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 74743fd8a1e0a..304531332e70a 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -200,6 +200,11 @@ static void comp_ctxt_release(struct ena_com_admin_queue *queue,
static struct ena_comp_ctx *get_comp_ctxt(struct ena_com_admin_queue *queue,
u16 command_id, bool capture)
{
+ if (unlikely(!queue->comp_ctx)) {
+ pr_err("Completion context is NULL\n");
+ return NULL;
+ }
+
if (unlikely(command_id >= queue->q_depth)) {
pr_err("command id is larger than the queue size. cmd_id: %u queue size %d\n",
command_id, queue->q_depth);
--
2.20.1



2020-03-03 18:14:05

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 099/176] amdgpu/gmc_v9: save/restore sdpif regs during S3

From: Shirish S <[email protected]>

commit a3ed353cf8015ba84a0407a5dc3ffee038166ab0 upstream.

Fixes an S3 issue with IOMMU + S/G enabled @ 64M VRAM.

Suggested-by: Alex Deucher <[email protected]>
Signed-off-by: Shirish S <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 1
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 37 ++++++++++++-
drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h | 2
3 files changed, 39 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -192,6 +192,7 @@ struct amdgpu_gmc {
uint32_t srbm_soft_reset;
bool prt_warning;
uint64_t stolen_size;
+ uint32_t sdpif_register;
/* apertures */
u64 shared_aperture_start;
u64 shared_aperture_end;
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1204,6 +1204,19 @@ static void gmc_v9_0_init_golden_registe
}

/**
+ * gmc_v9_0_restore_registers - restores regs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * This restores register values, saved at suspend.
+ */
+static void gmc_v9_0_restore_registers(struct amdgpu_device *adev)
+{
+ if (adev->asic_type == CHIP_RAVEN)
+ WREG32(mmDCHUBBUB_SDPIF_MMIO_CNTRL_0, adev->gmc.sdpif_register);
+}
+
+/**
* gmc_v9_0_gart_enable - gart enable
*
* @adev: amdgpu_device pointer
@@ -1308,6 +1321,20 @@ static int gmc_v9_0_hw_init(void *handle
}

/**
+ * gmc_v9_0_save_registers - saves regs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * This saves potential register values that should be
+ * restored upon resume
+ */
+static void gmc_v9_0_save_registers(struct amdgpu_device *adev)
+{
+ if (adev->asic_type == CHIP_RAVEN)
+ adev->gmc.sdpif_register = RREG32(mmDCHUBBUB_SDPIF_MMIO_CNTRL_0);
+}
+
+/**
* gmc_v9_0_gart_disable - gart disable
*
* @adev: amdgpu_device pointer
@@ -1343,9 +1370,16 @@ static int gmc_v9_0_hw_fini(void *handle

static int gmc_v9_0_suspend(void *handle)
{
+ int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;

- return gmc_v9_0_hw_fini(adev);
+ r = gmc_v9_0_hw_fini(adev);
+ if (r)
+ return r;
+
+ gmc_v9_0_save_registers(adev);
+
+ return 0;
}

static int gmc_v9_0_resume(void *handle)
@@ -1353,6 +1387,7 @@ static int gmc_v9_0_resume(void *handle)
int r;
struct amdgpu_device *adev = (struct amdgpu_device *)handle;

+ gmc_v9_0_restore_registers(adev);
r = gmc_v9_0_hw_init(adev);
if (r)
return r;
--- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h
@@ -7376,6 +7376,8 @@
#define mmCRTC4_CRTC_DRR_CONTROL 0x0f3e
#define mmCRTC4_CRTC_DRR_CONTROL_BASE_IDX 2

+#define mmDCHUBBUB_SDPIF_MMIO_CNTRL_0 0x395d
+#define mmDCHUBBUB_SDPIF_MMIO_CNTRL_0_BASE_IDX 2

// addressBlock: dce_dc_fmt4_dispdec
// base address: 0x2000


2020-03-03 18:14:13

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 039/176] arm/ftrace: Fix BE text poking

From: Peter Zijlstra <[email protected]>

[ Upstream commit be993e44badc448add6a18d6f12b20615692c4c3 ]

The __patch_text() function already applies __opcode_to_mem_*(), so
when __opcode_to_mem_*() is not the identity (BE*), it is applied
twice, wrecking the instruction.
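
Since the conversion is a byte swap on BE kernels, applying it twice is
a net no-op, so the single swap the memory image actually needs never
happens. A trivial userspace illustration (assuming a 32-bit swap as a
stand-in for __opcode_to_mem_arm()):

#include <stdint.h>
#include <stdio.h>

/* Stand-in for __opcode_to_mem_arm() on a big-endian kernel. */
static uint32_t to_mem(uint32_t x)
{
	return __builtin_bswap32(x);
}

int main(void)
{
	uint32_t insn = 0xe1a00000;	/* ARM "mov r0, r0" (a nop) */

	printf("converted once:  %08x\n", (unsigned int)to_mem(insn));
	printf("converted twice: %08x\n", (unsigned int)to_mem(to_mem(insn)));
	/*
	 * Converting twice restores the CPU-order value, so the stored
	 * instruction ends up in the wrong byte order for the fetch.
	 */
	return 0;
}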

Fixes: 42e51f187f86 ("arm/ftrace: Use __patch_text()")
Reported-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Dmitry Osipenko <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/arm/kernel/ftrace.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index bda949fd84e8b..93caf757f1d5d 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -81,13 +81,10 @@ static int ftrace_modify_code(unsigned long pc, unsigned long old,
{
unsigned long replaced;

- if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) {
+ if (IS_ENABLED(CONFIG_THUMB2_KERNEL))
old = __opcode_to_mem_thumb32(old);
- new = __opcode_to_mem_thumb32(new);
- } else {
+ else
old = __opcode_to_mem_arm(old);
- new = __opcode_to_mem_arm(new);
- }

if (validate) {
if (probe_kernel_read(&replaced, (void *)pc, MCOUNT_INSN_SIZE))
--
2.20.1



2020-03-03 18:14:18

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 096/176] blktrace: Protect q->blk_trace with RCU

From: Jan Kara <[email protected]>

commit c780e86dd48ef6467a1146cf7d0fe1e05a635039 upstream.

KASAN is reporting that __blk_add_trace() has a use-after-free issue
when accessing q->blk_trace. Indeed the switching of block tracing (and
thus eventual freeing of q->blk_trace) is completely unsynchronized with
the currently running tracing and thus it can happen that the blk_trace
structure is being freed just while __blk_add_trace() works on it.
Protect accesses to q->blk_trace by RCU during tracing and make sure we
wait for the end of the RCU grace period when shutting down tracing.
Luckily that is a rare enough event that we can afford it. Note that
postponing the freeing of blk_trace to an RCU callback is better
avoided, as it could have unexpected user-visible side effects: the
debugfs files would still exist for a short while after block tracing
has been shut down.
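
The resulting pattern, condensed into a kernel-style sketch (simplified
and with illustrative names; the real patch additionally passes lockdep
conditions to rcu_dereference_protected() under blk_trace_mutex):

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct trace_state {
	int act_mask;
};

static struct trace_state __rcu *trace_ptr;

/* Reader side: pin the object only for the duration of the access. */
static bool trace_enabled(void)
{
	struct trace_state *bt;
	bool ret;

	rcu_read_lock();
	bt = rcu_dereference(trace_ptr);
	ret = bt && (bt->act_mask & 0x1);
	rcu_read_unlock();
	return ret;
}

/* Writer side: unpublish, wait out all readers, then free. */
static void trace_shutdown(void)
{
	struct trace_state *bt;

	bt = rcu_dereference_protected(trace_ptr, 1);
	RCU_INIT_POINTER(trace_ptr, NULL);
	synchronize_rcu();	/* every reader that saw 'bt' has finished */
	kfree(bt);
}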

Link: https://bugzilla.kernel.org/show_bug.cgi?id=205711
CC: [email protected]
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Tested-by: Ming Lei <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Reported-by: Tristan Madani <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/blkdev.h | 2
include/linux/blktrace_api.h | 18 ++++--
kernel/trace/blktrace.c | 114 +++++++++++++++++++++++++++++++------------
3 files changed, 97 insertions(+), 37 deletions(-)

--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -524,7 +524,7 @@ struct request_queue {
unsigned int sg_reserved_size;
int node;
#ifdef CONFIG_BLK_DEV_IO_TRACE
- struct blk_trace *blk_trace;
+ struct blk_trace __rcu *blk_trace;
struct mutex blk_trace_mutex;
#endif
/*
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -51,9 +51,13 @@ void __trace_note_message(struct blk_tra
**/
#define blk_add_cgroup_trace_msg(q, cg, fmt, ...) \
do { \
- struct blk_trace *bt = (q)->blk_trace; \
+ struct blk_trace *bt; \
+ \
+ rcu_read_lock(); \
+ bt = rcu_dereference((q)->blk_trace); \
if (unlikely(bt)) \
__trace_note_message(bt, cg, fmt, ##__VA_ARGS__);\
+ rcu_read_unlock(); \
} while (0)
#define blk_add_trace_msg(q, fmt, ...) \
blk_add_cgroup_trace_msg(q, NULL, fmt, ##__VA_ARGS__)
@@ -61,10 +65,14 @@ void __trace_note_message(struct blk_tra

static inline bool blk_trace_note_message_enabled(struct request_queue *q)
{
- struct blk_trace *bt = q->blk_trace;
- if (likely(!bt))
- return false;
- return bt->act_mask & BLK_TC_NOTIFY;
+ struct blk_trace *bt;
+ bool ret;
+
+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
+ ret = bt && (bt->act_mask & BLK_TC_NOTIFY);
+ rcu_read_unlock();
+ return ret;
}

extern void blk_add_driver_data(struct request_queue *q, struct request *rq,
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -335,6 +335,7 @@ static void put_probe_ref(void)

static void blk_trace_cleanup(struct blk_trace *bt)
{
+ synchronize_rcu();
blk_trace_free(bt);
put_probe_ref();
}
@@ -629,8 +630,10 @@ static int compat_blk_trace_setup(struct
static int __blk_trace_startstop(struct request_queue *q, int start)
{
int ret;
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ bt = rcu_dereference_protected(q->blk_trace,
+ lockdep_is_held(&q->blk_trace_mutex));
if (bt == NULL)
return -EINVAL;

@@ -740,8 +743,8 @@ int blk_trace_ioctl(struct block_device
void blk_trace_shutdown(struct request_queue *q)
{
mutex_lock(&q->blk_trace_mutex);
-
- if (q->blk_trace) {
+ if (rcu_dereference_protected(q->blk_trace,
+ lockdep_is_held(&q->blk_trace_mutex))) {
__blk_trace_startstop(q, 0);
__blk_trace_remove(q);
}
@@ -752,8 +755,10 @@ void blk_trace_shutdown(struct request_q
#ifdef CONFIG_BLK_CGROUP
static u64 blk_trace_bio_get_cgid(struct request_queue *q, struct bio *bio)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ /* We don't use the 'bt' value here except as an optimization... */
+ bt = rcu_dereference_protected(q->blk_trace, 1);
if (!bt || !(blk_tracer_flags.val & TRACE_BLK_OPT_CGROUP))
return 0;

@@ -796,10 +801,14 @@ blk_trace_request_get_cgid(struct reques
static void blk_add_trace_rq(struct request *rq, int error,
unsigned int nr_bytes, u32 what, u64 cgid)
{
- struct blk_trace *bt = rq->q->blk_trace;
+ struct blk_trace *bt;

- if (likely(!bt))
+ rcu_read_lock();
+ bt = rcu_dereference(rq->q->blk_trace);
+ if (likely(!bt)) {
+ rcu_read_unlock();
return;
+ }

if (blk_rq_is_passthrough(rq))
what |= BLK_TC_ACT(BLK_TC_PC);
@@ -808,6 +817,7 @@ static void blk_add_trace_rq(struct requ

__blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq),
rq->cmd_flags, what, error, 0, NULL, cgid);
+ rcu_read_unlock();
}

static void blk_add_trace_rq_insert(void *ignore,
@@ -853,14 +863,19 @@ static void blk_add_trace_rq_complete(vo
static void blk_add_trace_bio(struct request_queue *q, struct bio *bio,
u32 what, int error)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

- if (likely(!bt))
+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
+ if (likely(!bt)) {
+ rcu_read_unlock();
return;
+ }

__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
bio_op(bio), bio->bi_opf, what, error, 0, NULL,
blk_trace_bio_get_cgid(q, bio));
+ rcu_read_unlock();
}

static void blk_add_trace_bio_bounce(void *ignore,
@@ -905,11 +920,14 @@ static void blk_add_trace_getrq(void *ig
if (bio)
blk_add_trace_bio(q, bio, BLK_TA_GETRQ, 0);
else {
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
if (bt)
__blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_GETRQ, 0, 0,
NULL, 0);
+ rcu_read_unlock();
}
}

@@ -921,27 +939,35 @@ static void blk_add_trace_sleeprq(void *
if (bio)
blk_add_trace_bio(q, bio, BLK_TA_SLEEPRQ, 0);
else {
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
if (bt)
__blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_SLEEPRQ,
0, 0, NULL, 0);
+ rcu_read_unlock();
}
}

static void blk_add_trace_plug(void *ignore, struct request_queue *q)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
if (bt)
__blk_add_trace(bt, 0, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL, 0);
+ rcu_read_unlock();
}

static void blk_add_trace_unplug(void *ignore, struct request_queue *q,
unsigned int depth, bool explicit)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
if (bt) {
__be64 rpdu = cpu_to_be64(depth);
u32 what;
@@ -953,14 +979,17 @@ static void blk_add_trace_unplug(void *i

__blk_add_trace(bt, 0, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu, 0);
}
+ rcu_read_unlock();
}

static void blk_add_trace_split(void *ignore,
struct request_queue *q, struct bio *bio,
unsigned int pdu)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
if (bt) {
__be64 rpdu = cpu_to_be64(pdu);

@@ -969,6 +998,7 @@ static void blk_add_trace_split(void *ig
BLK_TA_SPLIT, bio->bi_status, sizeof(rpdu),
&rpdu, blk_trace_bio_get_cgid(q, bio));
}
+ rcu_read_unlock();
}

/**
@@ -988,11 +1018,15 @@ static void blk_add_trace_bio_remap(void
struct request_queue *q, struct bio *bio,
dev_t dev, sector_t from)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;
struct blk_io_trace_remap r;

- if (likely(!bt))
+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
+ if (likely(!bt)) {
+ rcu_read_unlock();
return;
+ }

r.device_from = cpu_to_be32(dev);
r.device_to = cpu_to_be32(bio_dev(bio));
@@ -1001,6 +1035,7 @@ static void blk_add_trace_bio_remap(void
__blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
bio_op(bio), bio->bi_opf, BLK_TA_REMAP, bio->bi_status,
sizeof(r), &r, blk_trace_bio_get_cgid(q, bio));
+ rcu_read_unlock();
}

/**
@@ -1021,11 +1056,15 @@ static void blk_add_trace_rq_remap(void
struct request *rq, dev_t dev,
sector_t from)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;
struct blk_io_trace_remap r;

- if (likely(!bt))
+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
+ if (likely(!bt)) {
+ rcu_read_unlock();
return;
+ }

r.device_from = cpu_to_be32(dev);
r.device_to = cpu_to_be32(disk_devt(rq->rq_disk));
@@ -1034,6 +1073,7 @@ static void blk_add_trace_rq_remap(void
__blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq),
rq_data_dir(rq), 0, BLK_TA_REMAP, 0,
sizeof(r), &r, blk_trace_request_get_cgid(q, rq));
+ rcu_read_unlock();
}

/**
@@ -1051,14 +1091,19 @@ void blk_add_driver_data(struct request_
struct request *rq,
void *data, size_t len)
{
- struct blk_trace *bt = q->blk_trace;
+ struct blk_trace *bt;

- if (likely(!bt))
+ rcu_read_lock();
+ bt = rcu_dereference(q->blk_trace);
+ if (likely(!bt)) {
+ rcu_read_unlock();
return;
+ }

__blk_add_trace(bt, blk_rq_trace_sector(rq), blk_rq_bytes(rq), 0, 0,
BLK_TA_DRV_DATA, 0, len, data,
blk_trace_request_get_cgid(q, rq));
+ rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(blk_add_driver_data);

@@ -1597,6 +1642,7 @@ static int blk_trace_remove_queue(struct
return -EINVAL;

put_probe_ref();
+ synchronize_rcu();
blk_trace_free(bt);
return 0;
}
@@ -1758,6 +1804,7 @@ static ssize_t sysfs_blk_trace_attr_show
struct hd_struct *p = dev_to_part(dev);
struct request_queue *q;
struct block_device *bdev;
+ struct blk_trace *bt;
ssize_t ret = -ENXIO;

bdev = bdget(part_devt(p));
@@ -1770,21 +1817,23 @@ static ssize_t sysfs_blk_trace_attr_show

mutex_lock(&q->blk_trace_mutex);

+ bt = rcu_dereference_protected(q->blk_trace,
+ lockdep_is_held(&q->blk_trace_mutex));
if (attr == &dev_attr_enable) {
- ret = sprintf(buf, "%u\n", !!q->blk_trace);
+ ret = sprintf(buf, "%u\n", !!bt);
goto out_unlock_bdev;
}

- if (q->blk_trace == NULL)
+ if (bt == NULL)
ret = sprintf(buf, "disabled\n");
else if (attr == &dev_attr_act_mask)
- ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
+ ret = blk_trace_mask2str(buf, bt->act_mask);
else if (attr == &dev_attr_pid)
- ret = sprintf(buf, "%u\n", q->blk_trace->pid);
+ ret = sprintf(buf, "%u\n", bt->pid);
else if (attr == &dev_attr_start_lba)
- ret = sprintf(buf, "%llu\n", q->blk_trace->start_lba);
+ ret = sprintf(buf, "%llu\n", bt->start_lba);
else if (attr == &dev_attr_end_lba)
- ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba);
+ ret = sprintf(buf, "%llu\n", bt->end_lba);

out_unlock_bdev:
mutex_unlock(&q->blk_trace_mutex);
@@ -1801,6 +1850,7 @@ static ssize_t sysfs_blk_trace_attr_stor
struct block_device *bdev;
struct request_queue *q;
struct hd_struct *p;
+ struct blk_trace *bt;
u64 value;
ssize_t ret = -EINVAL;

@@ -1831,8 +1881,10 @@ static ssize_t sysfs_blk_trace_attr_stor

mutex_lock(&q->blk_trace_mutex);

+ bt = rcu_dereference_protected(q->blk_trace,
+ lockdep_is_held(&q->blk_trace_mutex));
if (attr == &dev_attr_enable) {
- if (!!value == !!q->blk_trace) {
+ if (!!value == !!bt) {
ret = 0;
goto out_unlock_bdev;
}
@@ -1844,18 +1896,18 @@ static ssize_t sysfs_blk_trace_attr_stor
}

ret = 0;
- if (q->blk_trace == NULL)
+ if (bt == NULL)
ret = blk_trace_setup_queue(q, bdev);

if (ret == 0) {
if (attr == &dev_attr_act_mask)
- q->blk_trace->act_mask = value;
+ bt->act_mask = value;
else if (attr == &dev_attr_pid)
- q->blk_trace->pid = value;
+ bt->pid = value;
else if (attr == &dev_attr_start_lba)
- q->blk_trace->start_lba = value;
+ bt->start_lba = value;
else if (attr == &dev_attr_end_lba)
- q->blk_trace->end_lba = value;
+ bt->end_lba = value;
}

out_unlock_bdev:


2020-03-03 18:14:25

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 090/176] HID: ite: Only bind to keyboard USB interface on Acer SW5-012 keyboard dock

From: Hans de Goede <[email protected]>

commit beae56192a2570578ae45050e73c5ff9254f63e6 upstream.

Commit 8f18eca9ebc5 ("HID: ite: Add USB id match for Acer SW5-012 keyboard
dock") added the USB id for the Acer SW5-012's keyboard dock to the
hid-ite driver to fix the rfkill driver not working.

Most keyboard docks with an ITE 8595 keyboard/touchpad controller have the
"Wireless Radio Control" bits, which need the special hid-ite driver, on the
second USB interface (the mouse interface), and their touchpad only supports
mouse emulation, so using generic hid-input handling for anything but the
"Wireless Radio Control" bits is fine. On these devices we simply bind to
all USB interfaces.

But unlike other ITE8595-based keyboard docks, the Acer Aspire Switch 10
(SW5-012)'s touchpad not only does mouse emulation, it also supports
HID-multitouch, and all the keys including the "Wireless Radio Control"
bits have been moved to the first USB interface (the keyboard intf).

So we need hid-ite to handle the first (keyboard) USB interface and have
it NOT bind to the second (mouse) USB interface, so that it can be
handled by hid-multitouch.c and we get proper multi-touch support.

This commit changes the hid_device_id for the SW5-012 keyboard dock to
only match on hid devices from the HID_GROUP_GENERIC group; this way
hid-ite will not bind to the mouse/multi-touch interface, which has
HID_GROUP_MULTITOUCH_WIN_8 as its group.
This fixes the regression to mouse-emulation mode introduced by adding
the keyboard dock USB id.
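
For context: HID_USB_DEVICE() leaves the HID group as HID_GROUP_ANY, so it
matches every group, while HID_DEVICE() lets a driver pin the group
explicitly. A rough sketch of the two macros, paraphrased from
include/linux/hid.h (the header is authoritative):

#define HID_DEVICE(b, g, ven, prod) \
	.bus = (b), .group = (g), .vendor = (ven), .product = (prod)

/* Matches any group, including HID_GROUP_MULTITOUCH_WIN_8. */
#define HID_USB_DEVICE(ven, prod) \
	HID_DEVICE(BUS_USB, HID_GROUP_ANY, ven, prod)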

Cc: [email protected]
Fixes: 8f18eca9ebc5 ("HID: ite: Add USB id match for Acer SW5-012 keyboard dock")
Reported-by: Zdeněk Rampas <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hid/hid-ite.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/hid/hid-ite.c
+++ b/drivers/hid/hid-ite.c
@@ -41,8 +41,9 @@ static const struct hid_device_id ite_de
{ HID_USB_DEVICE(USB_VENDOR_ID_ITE, USB_DEVICE_ID_ITE8595) },
{ HID_USB_DEVICE(USB_VENDOR_ID_258A, USB_DEVICE_ID_258A_6A88) },
/* ITE8595 USB kbd ctlr, with Synaptics touchpad connected to it. */
- { HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS,
- USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) },
+ { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
+ USB_VENDOR_ID_SYNAPTICS,
+ USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) },
{ }
};
MODULE_DEVICE_TABLE(hid, ite_devices);


2020-03-03 18:14:33

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 092/176] HID: core: increase HID report buffer size to 8KiB

From: Johan Korsnes <[email protected]>

commit 84a4062632462c4320704fcdf8e99e89e94c0aba upstream.

We have a HID touch device that reports its opens and shorts test
results in HID buffers of size 8184 bytes. The maximum size of the HID
buffer is currently set to 4096 bytes, causing probe of this device to
fail. With this patch we increase the maximum size of the HID buffer to
8192 bytes, making device probe and acquisition of said buffers succeed.

Signed-off-by: Johan Korsnes <[email protected]>
Cc: Alan Stern <[email protected]>
Cc: Armando Visconti <[email protected]>
Cc: Jiri Kosina <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/hid.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -492,7 +492,7 @@ struct hid_report_enum {
};

#define HID_MIN_BUFFER_SIZE 64 /* make sure there is at least a packet size of space */
-#define HID_MAX_BUFFER_SIZE 4096 /* 4kb */
+#define HID_MAX_BUFFER_SIZE 8192 /* 8kb */
#define HID_CONTROL_FIFO_SIZE 256 /* to init devices with >100 reports */
#define HID_OUTPUT_FIFO_SIZE 64



2020-03-03 18:14:36

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 035/176] qmi_wwan: re-add DW5821e pre-production variant

From: Bjørn Mork <[email protected]>

[ Upstream commit 88bf54603f6f2c137dfee1abf6436ceac3528d2d ]

Commit f25e1392fdb5 removed support for the pre-production variant
of the Dell DW5821e to avoid probing another USB interface unnecessarily.
However, pre-production samples are found in the wild, and this lack
of support is causing problems for users of such samples. It is therefore
necessary to support both variants.

Matching on both interfaces 0 and 1 is not expected to cause any problem
with either variant, as only the QMI function will be probed successfully
on either. Interface 1 will be rejected based on the HID class for the
production variant:

T: Bus=01 Lev=03 Prnt=04 Port=00 Cnt=01 Dev#= 16 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 2
P: Vendor=413c ProdID=81d7 Rev=03.18
S: Manufacturer=DELL
S: Product=DW5821e Snapdragon X20 LTE
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option

And interface 0 will be rejected based on too few endpoints for the
pre-production variant:

T: Bus=01 Lev=02 Prnt=02 Port=03 Cnt=03 Dev#= 7 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 2
P: Vendor=413c ProdID=81d7 Rev= 3.18
S: Manufacturer=DELL
S: Product=DW5821e Snapdragon X20 LTE
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=
I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option

Fixes: f25e1392fdb5 ("qmi_wwan: fix interface number for DW5821e production firmware")
Link: https://whrl.pl/Rf0vNk
Reported-by: Lars Melin <[email protected]>
Cc: Aleksander Morgado <[email protected]>
Signed-off-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/usb/qmi_wwan.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 9485c8d1de8a3..839cef720cf64 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1363,6 +1363,7 @@ static const struct usb_device_id products[] = {
{QMI_FIXED_INTF(0x413c, 0x81b6, 8)}, /* Dell Wireless 5811e */
{QMI_FIXED_INTF(0x413c, 0x81b6, 10)}, /* Dell Wireless 5811e */
{QMI_FIXED_INTF(0x413c, 0x81d7, 0)}, /* Dell Wireless 5821e */
+ {QMI_FIXED_INTF(0x413c, 0x81d7, 1)}, /* Dell Wireless 5821e preproduction config */
{QMI_FIXED_INTF(0x413c, 0x81e0, 0)}, /* Dell Wireless 5821e with eSIM support*/
{QMI_FIXED_INTF(0x03f0, 0x4e1d, 8)}, /* HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module */
{QMI_FIXED_INTF(0x03f0, 0x9d1d, 1)}, /* HP lt4120 Snapdragon X5 LTE */
--
2.20.1



2020-03-03 18:14:37

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 034/176] s390/zcrypt: fix card and queue total counter wrap

From: Harald Freudenberger <[email protected]>

[ Upstream commit fcd98d4002539f1e381916fc1b6648938c1eac76 ]

The internal statistics counters for the total number of
requests processed per card and per queue used plain integers, so they
wrap after a rather large number of crypto requests. This
patch introduces u64 counters, which take much longer to wrap but
still may. The sysfs attribute request_count for card and queue
also used only %ld and now displays the counter value with %llu.

This is not a security-relevant fix. The int overflow that occurred
is not in any way exploitable as a security breach.
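
The one subtlety is the legacy ioctl path, which still exposes 32-bit
counters to userspace. A minimal sketch of the saturation logic,
mirroring the zcrypt_perdev_reqcnt() hunk below:

u64 cnt = zq->queue->total_request_count;

/* Old userspace expects a u32: saturate at UINT_MAX rather than
 * silently truncating the widened counter. */
reqcnt[card] = (cnt < UINT_MAX) ? (u32) cnt : UINT_MAX;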

Signed-off-by: Harald Freudenberger <[email protected]>
Signed-off-by: Vasily Gorbik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/s390/crypto/ap_bus.h | 4 ++--
drivers/s390/crypto/ap_card.c | 8 ++++----
drivers/s390/crypto/ap_queue.c | 6 +++---
drivers/s390/crypto/zcrypt_api.c | 16 +++++++++-------
4 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h
index bb35ba4a8d243..4348fdff1c61e 100644
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -162,7 +162,7 @@ struct ap_card {
unsigned int functions; /* AP device function bitfield. */
int queue_depth; /* AP queue depth.*/
int id; /* AP card number. */
- atomic_t total_request_count; /* # requests ever for this AP device.*/
+ atomic64_t total_request_count; /* # requests ever for this AP device.*/
};

#define to_ap_card(x) container_of((x), struct ap_card, ap_dev.device)
@@ -179,7 +179,7 @@ struct ap_queue {
enum ap_state state; /* State of the AP device. */
int pendingq_count; /* # requests on pendingq list. */
int requestq_count; /* # requests on requestq list. */
- int total_request_count; /* # requests ever for this AP device.*/
+ u64 total_request_count; /* # requests ever for this AP device.*/
int request_timeout; /* Request timeout in jiffies. */
struct timer_list timeout; /* Timer for request timeouts. */
struct list_head pendingq; /* List of message sent to AP queue. */
diff --git a/drivers/s390/crypto/ap_card.c b/drivers/s390/crypto/ap_card.c
index 63b4cc6cd7e59..e85bfca1ed163 100644
--- a/drivers/s390/crypto/ap_card.c
+++ b/drivers/s390/crypto/ap_card.c
@@ -63,13 +63,13 @@ static ssize_t request_count_show(struct device *dev,
char *buf)
{
struct ap_card *ac = to_ap_card(dev);
- unsigned int req_cnt;
+ u64 req_cnt;

req_cnt = 0;
spin_lock_bh(&ap_list_lock);
- req_cnt = atomic_read(&ac->total_request_count);
+ req_cnt = atomic64_read(&ac->total_request_count);
spin_unlock_bh(&ap_list_lock);
- return snprintf(buf, PAGE_SIZE, "%d\n", req_cnt);
+ return snprintf(buf, PAGE_SIZE, "%llu\n", req_cnt);
}

static ssize_t request_count_store(struct device *dev,
@@ -83,7 +83,7 @@ static ssize_t request_count_store(struct device *dev,
for_each_ap_queue(aq, ac)
aq->total_request_count = 0;
spin_unlock_bh(&ap_list_lock);
- atomic_set(&ac->total_request_count, 0);
+ atomic64_set(&ac->total_request_count, 0);

return count;
}
diff --git a/drivers/s390/crypto/ap_queue.c b/drivers/s390/crypto/ap_queue.c
index 37c3bdc3642dc..a317ab4849320 100644
--- a/drivers/s390/crypto/ap_queue.c
+++ b/drivers/s390/crypto/ap_queue.c
@@ -479,12 +479,12 @@ static ssize_t request_count_show(struct device *dev,
char *buf)
{
struct ap_queue *aq = to_ap_queue(dev);
- unsigned int req_cnt;
+ u64 req_cnt;

spin_lock_bh(&aq->lock);
req_cnt = aq->total_request_count;
spin_unlock_bh(&aq->lock);
- return snprintf(buf, PAGE_SIZE, "%d\n", req_cnt);
+ return snprintf(buf, PAGE_SIZE, "%llu\n", req_cnt);
}

static ssize_t request_count_store(struct device *dev,
@@ -676,7 +676,7 @@ void ap_queue_message(struct ap_queue *aq, struct ap_message *ap_msg)
list_add_tail(&ap_msg->list, &aq->requestq);
aq->requestq_count++;
aq->total_request_count++;
- atomic_inc(&aq->card->total_request_count);
+ atomic64_inc(&aq->card->total_request_count);
/* Send/receive as many request from the queue as possible. */
ap_wait(ap_sm_event_loop(aq, AP_EVENT_POLL));
spin_unlock_bh(&aq->lock);
diff --git a/drivers/s390/crypto/zcrypt_api.c b/drivers/s390/crypto/zcrypt_api.c
index 9157e728a362d..7fa0262e91af0 100644
--- a/drivers/s390/crypto/zcrypt_api.c
+++ b/drivers/s390/crypto/zcrypt_api.c
@@ -605,8 +605,8 @@ static inline bool zcrypt_card_compare(struct zcrypt_card *zc,
weight += atomic_read(&zc->load);
pref_weight += atomic_read(&pref_zc->load);
if (weight == pref_weight)
- return atomic_read(&zc->card->total_request_count) >
- atomic_read(&pref_zc->card->total_request_count);
+ return atomic64_read(&zc->card->total_request_count) >
+ atomic64_read(&pref_zc->card->total_request_count);
return weight > pref_weight;
}

@@ -1216,11 +1216,12 @@ static void zcrypt_qdepth_mask(char qdepth[], size_t max_adapters)
spin_unlock(&zcrypt_list_lock);
}

-static void zcrypt_perdev_reqcnt(int reqcnt[], size_t max_adapters)
+static void zcrypt_perdev_reqcnt(u32 reqcnt[], size_t max_adapters)
{
struct zcrypt_card *zc;
struct zcrypt_queue *zq;
int card;
+ u64 cnt;

memset(reqcnt, 0, sizeof(int) * max_adapters);
spin_lock(&zcrypt_list_lock);
@@ -1232,8 +1233,9 @@ static void zcrypt_perdev_reqcnt(int reqcnt[], size_t max_adapters)
|| card >= max_adapters)
continue;
spin_lock(&zq->queue->lock);
- reqcnt[card] = zq->queue->total_request_count;
+ cnt = zq->queue->total_request_count;
spin_unlock(&zq->queue->lock);
+ reqcnt[card] = (cnt < UINT_MAX) ? (u32) cnt : UINT_MAX;
}
}
local_bh_enable();
@@ -1411,9 +1413,9 @@ static long zcrypt_unlocked_ioctl(struct file *filp, unsigned int cmd,
return 0;
}
case ZCRYPT_PERDEV_REQCNT: {
- int *reqcnt;
+ u32 *reqcnt;

- reqcnt = kcalloc(AP_DEVICES, sizeof(int), GFP_KERNEL);
+ reqcnt = kcalloc(AP_DEVICES, sizeof(u32), GFP_KERNEL);
if (!reqcnt)
return -ENOMEM;
zcrypt_perdev_reqcnt(reqcnt, AP_DEVICES);
@@ -1470,7 +1472,7 @@ static long zcrypt_unlocked_ioctl(struct file *filp, unsigned int cmd,
}
case Z90STAT_PERDEV_REQCNT: {
/* the old ioctl supports only 64 adapters */
- int reqcnt[MAX_ZDEV_CARDIDS];
+ u32 reqcnt[MAX_ZDEV_CARDIDS];

zcrypt_perdev_reqcnt(reqcnt, MAX_ZDEV_CARDIDS);
if (copy_to_user((int __user *) arg, reqcnt, sizeof(reqcnt)))
--
2.20.1



2020-03-03 18:14:43

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 004/176] net: macb: ensure interface is not suspended on at91rm9200

From: Alexandre Belloni <[email protected]>

[ Upstream commit e6a41c23df0d5da01540d2abef41591589c0b4be ]

Because of autosuspend, at91ether_start is called with clocks disabled.
Ensure that pm_runtime doesn't suspend the interface as soon as it is
opened, as there is no pm_runtime support in the other relevant parts of
the platform support for at91rm9200.

Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support")
Signed-off-by: Alexandre Belloni <[email protected]>
Reviewed-by: Claudiu Beznea <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/cadence/macb_main.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -3751,6 +3751,10 @@ static int at91ether_open(struct net_dev
u32 ctl;
int ret;

+ ret = pm_runtime_get_sync(&lp->pdev->dev);
+ if (ret < 0)
+ return ret;
+
/* Clear internal statistics */
ctl = macb_readl(lp, NCR);
macb_writel(lp, NCR, ctl | MACB_BIT(CLRSTAT));
@@ -3815,7 +3819,7 @@ static int at91ether_close(struct net_de
q->rx_buffers, q->rx_buffers_dma);
q->rx_buffers = NULL;

- return 0;
+ return pm_runtime_put(&lp->pdev->dev);
}

/* Transmit packet */


2020-03-03 18:14:46

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 029/176] sched/fair: Prevent unlimited runtime on throttled group

From: Vincent Guittot <[email protected]>

[ Upstream commit 2a4b03ffc69f2dedc6388e9a6438b5f4c133a40d ]

When a running task is moved onto a throttled task group and there is no
other task enqueued on the CPU, the task can keep running using 100% CPU
regardless of the bandwidth allocated to the group and even though its
cfs_rq is throttled. Furthermore, the group entity of the cfs_rq and its
parents are not enqueued but only set as curr on their respective cfs_rqs.

We have the following sequence:

sched_move_task
-dequeue_task: dequeue task and group_entities.
-put_prev_task: put task and group entities.
-sched_change_group: move task to new group.
-enqueue_task: enqueue only task but not group entities because cfs_rq is
throttled.
-set_next_task : set task and group_entities as current sched_entity of
their cfs_rq.

Another impact is that the root cfs_rq runnable_load_avg at the root rq
stays null because the group entities are not enqueued. This situation
persists until an "external" event triggers a reschedule. Let's trigger
one immediately instead.

Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Ben Segall <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/sched/core.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 65ed821335dd5..9e7768dbd92d2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7068,8 +7068,15 @@ void sched_move_task(struct task_struct *tsk)

if (queued)
enqueue_task(rq, tsk, queue_flags);
- if (running)
+ if (running) {
set_next_task(rq, tsk);
+ /*
+ * After changing group, the running task may have joined a
+ * throttled one but it's still the running task. Trigger a
+ * resched to make sure that task can still run.
+ */
+ resched_curr(rq);
+ }

task_rq_unlock(rq, tsk, &rf);
}
--
2.20.1



2020-03-03 18:14:49

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 060/176] net: ena: fix incorrectly saving queue numbers when setting RSS indirection table

From: Arthur Kiyanovski <[email protected]>

[ Upstream commit 92569fd27f5cb0ccbdf7c7d70044b690e89a0277 ]

The indirection table holds the indices of the Rx queues. When we store it
during a set-indirection operation, we convert the indices to our internal
representation.

Our internal representation of the indices is: even indices for Tx and
odd indices for Rx, where every Tx/Rx pair is in consecutive order
starting from 0. For example, if the driver has 3 queues (3 for Tx and 3
for Rx) then the indices are as follows:
0 1 2 3 4 5
Tx Rx Tx Rx Tx Rx

The BUG:
The issue is that when we satisfy a get request for the indirection
table, we don't convert the indices back to the original representation.

The FIX:
Simply apply the inverse function to the indices of the indirection
table after we retrieve it.
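
As a worked example of the mapping and its inverse, using the macros
added to ena_netdev.h in the diff below (Rx queue 2 as the sample value):

ENA_IO_TXQ_IDX(2)                 == 2 * 2       == 4
ENA_IO_RXQ_IDX(2)                 == 2 * 2 + 1   == 5
ENA_IO_RXQ_IDX_TO_COMBINED_IDX(5) == (5 - 1) / 2 == 2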

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Sameeh Jubran <[email protected]>
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/amazon/ena/ena_ethtool.c | 24 ++++++++++++++++++-
drivers/net/ethernet/amazon/ena/ena_netdev.h | 2 ++
2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index 8be9df885bf4f..610a7c63e1742 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -636,6 +636,28 @@ static u32 ena_get_rxfh_key_size(struct net_device *netdev)
return ENA_HASH_KEY_SIZE;
}

+static int ena_indirection_table_get(struct ena_adapter *adapter, u32 *indir)
+{
+ struct ena_com_dev *ena_dev = adapter->ena_dev;
+ int i, rc;
+
+ if (!indir)
+ return 0;
+
+ rc = ena_com_indirect_table_get(ena_dev, indir);
+ if (rc)
+ return rc;
+
+ /* Our internal representation of the indices is: even indices
+ * for Tx and uneven indices for Rx. We need to convert the Rx
+ * indices to be consecutive
+ */
+ for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; i++)
+ indir[i] = ENA_IO_RXQ_IDX_TO_COMBINED_IDX(indir[i]);
+
+ return rc;
+}
+
static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
u8 *hfunc)
{
@@ -644,7 +666,7 @@ static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
u8 func;
int rc;

- rc = ena_com_indirect_table_get(adapter->ena_dev, indir);
+ rc = ena_indirection_table_get(adapter, indir);
if (rc)
return rc;

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index bffd778f2ce34..2fe5eeea6b695 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -129,6 +129,8 @@

#define ENA_IO_TXQ_IDX(q) (2 * (q))
#define ENA_IO_RXQ_IDX(q) (2 * (q) + 1)
+#define ENA_IO_TXQ_IDX_TO_COMBINED_IDX(q) ((q) / 2)
+#define ENA_IO_RXQ_IDX_TO_COMBINED_IDX(q) (((q) - 1) / 2)

#define ENA_MGMNT_IRQ_IDX 0
#define ENA_IO_IRQ_FIRST_IDX 1
--
2.20.1



2020-03-03 18:15:20

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 011/176] Revert "net: dev: introduce support for sch BYPASS for lockless qdisc"

From: Paolo Abeni <[email protected]>

[ Upstream commit 379349e9bc3b42b8b2f8f7a03f64a97623fff323 ]

This reverts commit ba27b4cdaaa66561aaedb2101876e563738d36fe

Ahmed reported out-of-order issues bisected to commit ba27b4cdaaa6
("net: dev: introduce support for sch BYPASS for lockless qdisc").
I can't find any working solution other than a plain revert.

This will introduce some minor performance regressions for
pfifo_fast qdisc. I plan to address them in net-next with more
indirect call wrapper boilerplate for qdiscs.

Reported-by: Ahmad Fatoum <[email protected]>
Fixes: ba27b4cdaaa6 ("net: dev: introduce support for sch BYPASS for lockless qdisc")
Signed-off-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/dev.c | 22 ++--------------------
1 file changed, 2 insertions(+), 20 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3607,26 +3607,8 @@ static inline int __dev_xmit_skb(struct
qdisc_calculate_pkt_len(skb, q);

if (q->flags & TCQ_F_NOLOCK) {
- if ((q->flags & TCQ_F_CAN_BYPASS) && READ_ONCE(q->empty) &&
- qdisc_run_begin(q)) {
- if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED,
- &q->state))) {
- __qdisc_drop(skb, &to_free);
- rc = NET_XMIT_DROP;
- goto end_run;
- }
- qdisc_bstats_cpu_update(q, skb);
-
- rc = NET_XMIT_SUCCESS;
- if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
- __qdisc_run(q);
-
-end_run:
- qdisc_run_end(q);
- } else {
- rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
- qdisc_run(q);
- }
+ rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
+ qdisc_run(q);

if (unlikely(to_free))
kfree_skb_list(to_free);


2020-03-03 18:16:02

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 022/176] net: add strict checks in netdev_name_node_alt_destroy()

From: Eric Dumazet <[email protected]>

[ Upstream commit e08ad80551b4b33c02f2fce1522f6c227d3976cf ]

netdev_name_node_alt_destroy() does a lookup over all
device names of a namespace.

We need to make sure the name belongs to the device
of interest, and that we do not destroy its primary
name, since we rely on it not being deleted:
dev->name_node would otherwise point to freed memory.

syzbot's report was the following:

BUG: KASAN: use-after-free in dev_net include/linux/netdevice.h:2206 [inline]
BUG: KASAN: use-after-free in mld_force_mld_version net/ipv6/mcast.c:1172 [inline]
BUG: KASAN: use-after-free in mld_in_v2_mode_only net/ipv6/mcast.c:1180 [inline]
BUG: KASAN: use-after-free in mld_in_v1_mode+0x203/0x230 net/ipv6/mcast.c:1190
Read of size 8 at addr ffff88809886c588 by task swapper/1/0

CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
__kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
kasan_report+0x12/0x20 mm/kasan/common.c:641
__asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
dev_net include/linux/netdevice.h:2206 [inline]
mld_force_mld_version net/ipv6/mcast.c:1172 [inline]
mld_in_v2_mode_only net/ipv6/mcast.c:1180 [inline]
mld_in_v1_mode+0x203/0x230 net/ipv6/mcast.c:1190
mld_send_initial_cr net/ipv6/mcast.c:2083 [inline]
mld_dad_timer_expire+0x24/0x230 net/ipv6/mcast.c:2118
call_timer_fn+0x1ac/0x780 kernel/time/timer.c:1404
expire_timers kernel/time/timer.c:1449 [inline]
__run_timers kernel/time/timer.c:1773 [inline]
__run_timers kernel/time/timer.c:1740 [inline]
run_timer_softirq+0x6c3/0x1790 kernel/time/timer.c:1786
__do_softirq+0x262/0x98c kernel/softirq.c:292
invoke_softirq kernel/softirq.c:373 [inline]
irq_exit+0x19b/0x1e0 kernel/softirq.c:413
exiting_irq arch/x86/include/asm/apic.h:546 [inline]
smp_apic_timer_interrupt+0x1a3/0x610 arch/x86/kernel/apic/apic.c:1146
apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:829
</IRQ>
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: 68 73 c5 f9 eb 8a cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 94 be 59 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 84 be 59 00 fb f4 <c3> cc 55 48 89 e5 41 57 41 56 41 55 41 54 53 e8 de 2a 74 f9 e8 09
RSP: 0018:ffffc90000d3fd68 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff13
RAX: 1ffffffff136761a RBX: ffff8880a99fc340 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000006 RDI: ffff8880a99fcbd4
RBP: ffffc90000d3fd98 R08: ffff8880a99fc340 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff8aa5a1c0 R14: 0000000000000000 R15: 0000000000000001
arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:686
default_idle_call+0x84/0xb0 kernel/sched/idle.c:94
cpuidle_idle_call kernel/sched/idle.c:154 [inline]
do_idle+0x3c8/0x6e0 kernel/sched/idle.c:269
cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:361
start_secondary+0x2f4/0x410 arch/x86/kernel/smpboot.c:264
secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:242

Allocated by task 10229:
save_stack+0x23/0x90 mm/kasan/common.c:72
set_track mm/kasan/common.c:80 [inline]
__kasan_kmalloc mm/kasan/common.c:515 [inline]
__kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:488
kasan_kmalloc+0x9/0x10 mm/kasan/common.c:529
__do_kmalloc_node mm/slab.c:3616 [inline]
__kmalloc_node+0x4e/0x70 mm/slab.c:3623
kmalloc_node include/linux/slab.h:578 [inline]
kvmalloc_node+0x68/0x100 mm/util.c:574
kvmalloc include/linux/mm.h:645 [inline]
kvzalloc include/linux/mm.h:653 [inline]
alloc_netdev_mqs+0x98/0xe40 net/core/dev.c:9797
rtnl_create_link+0x22d/0xaf0 net/core/rtnetlink.c:3047
__rtnl_newlink+0xf9f/0x1790 net/core/rtnetlink.c:3309
rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3377
rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
__sys_sendto+0x262/0x380 net/socket.c:1998
__do_compat_sys_socketcall net/compat.c:771 [inline]
__se_compat_sys_socketcall net/compat.c:719 [inline]
__ia32_compat_sys_socketcall+0x530/0x710 net/compat.c:719
do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

Freed by task 10229:
save_stack+0x23/0x90 mm/kasan/common.c:72
set_track mm/kasan/common.c:80 [inline]
kasan_set_free_info mm/kasan/common.c:337 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/common.c:476
kasan_slab_free+0xe/0x10 mm/kasan/common.c:485
__cache_free mm/slab.c:3426 [inline]
kfree+0x10a/0x2c0 mm/slab.c:3757
__netdev_name_node_alt_destroy+0x1ff/0x2a0 net/core/dev.c:322
netdev_name_node_alt_destroy+0x57/0x80 net/core/dev.c:334
rtnl_alt_ifname net/core/rtnetlink.c:3518 [inline]
rtnl_linkprop.isra.0+0x575/0x6f0 net/core/rtnetlink.c:3567
rtnl_dellinkprop+0x46/0x60 net/core/rtnetlink.c:3588
rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5438
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5456
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:672
____sys_sendmsg+0x753/0x880 net/socket.c:2343
___sys_sendmsg+0x100/0x170 net/socket.c:2397
__sys_sendmsg+0x105/0x1d0 net/socket.c:2430
__compat_sys_sendmsg net/compat.c:642 [inline]
__do_compat_sys_sendmsg net/compat.c:649 [inline]
__se_compat_sys_sendmsg net/compat.c:646 [inline]
__ia32_compat_sys_sendmsg+0x7a/0xb0 net/compat.c:646
do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline]
do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408
entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139

The buggy address belongs to the object at ffff88809886c000
which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 1416 bytes inside of
4096-byte region [ffff88809886c000, ffff88809886d000)
The buggy address belongs to the page:
page:ffffea0002621b00 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
flags: 0xfffe0000010200(slab|head)
raw: 00fffe0000010200 ffffea0002610d08 ffffea0002607608 ffff8880aa402000
raw: 0000000000000000 ffff88809886c000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88809886c480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88809886c500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88809886c580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88809886c600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88809886c680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 36fbf1e52bd3 ("net: rtnetlink: add linkprop commands to add and delete alternative ifnames")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Cc: Jiri Pirko <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/dev.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -330,6 +330,12 @@ int netdev_name_node_alt_destroy(struct
name_node = netdev_name_node_lookup(net, name);
if (!name_node)
return -ENOENT;
+ /* lookup might have found our primary name or a name belonging
+ * to another device.
+ */
+ if (name_node == dev->name_node || name_node->dev != dev)
+ return -EINVAL;
+
__netdev_name_node_alt_destroy(name_node);

return 0;


2020-03-03 18:16:08

by Greg KH

[permalink] [raw]
Subject: [PATCH 5.5 002/176] net: dsa: b53: Ensure the default VID is untagged

From: Florian Fainelli <[email protected]>

[ Upstream commit d965a5432d4c3e6b9c3d2bc1d4a800013bbf76f6 ]

We need to ensure that the default VID is untagged, otherwise the switch
will send tagged frames and the results can be problematic. This
is especially true with b53 switches that use VID 0 as their default
VLAN, since VID 0 has a special meaning.
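
For context, a rough sketch of b53_default_pvid(), paraphrased from
b53_common.c and worth double-checking against the driver source: the
older 5325/5365 switches default to VID 1, everything else to VID 0,
which is why the check in the diff below compares the VID against both 0
and the default PVID.

static u16 b53_default_pvid(struct b53_device *dev)
{
	/* 5325/5365 use VID 1 as their default VLAN; all other
	 * chips use VID 0, which has a special meaning. */
	if (is5325(dev) || is5365(dev))
		return 1;
	else
		return 0;
}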

Fixes: fea83353177a ("net: dsa: b53: Fix default VLAN ID")
Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/dsa/b53/b53_common.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1353,6 +1353,9 @@ void b53_vlan_add(struct dsa_switch *ds,

b53_get_vlan_entry(dev, vid, vl);

+ if (vid == 0 && vid == b53_default_pvid(dev))
+ untagged = true;
+
vl->members |= BIT(port);
if (untagged && !dsa_is_cpu_port(ds, port))
vl->untag |= BIT(port);


2020-03-03 18:18:43

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 072/176] bcache: ignore pending signals when creating gc and allocator thread

On Tue, Mar 03, 2020 at 10:58:29AM -0700, Jens Axboe wrote:
> On 3/3/20 10:42 AM, Greg Kroah-Hartman wrote:
> > From: Coly Li <[email protected]>
> >
> > [ Upstream commit 0b96da639a4874311e9b5156405f69ef9fc3bef8 ]
> >
> > When running a cache set, all the bcache btree nodes of this cache set
> > will be checked by bch_btree_check(). If the bcache btree is very large,
> > iterating all the btree nodes will occupy too much system memory and
> > the bcache registering process might be selected and killed by the
> > system OOM killer. kthread_run() will fail if the current process has a
> > pending signal, therefore the kthread creation in run_cache_set() for
> > the gc and allocator kernel threads will very probably fail for a very
> > large bcache btree.
> >
> > Indeed such an OOM is safe and the registering process will exit after
> > the registration is done. Therefore this patch flushes pending signals
> > during cache set start up, specifically in bch_cache_allocator_start()
> > and bch_gc_thread_start(), to make sure run_cache_set() won't fail for
> > large cached data sets.
>
> Ditto this one, of course.
>
> Did someone send this in for stable? It's not marked stable in the
> original commit.

I think the autobot grabbed it.

2020-03-03 22:12:31

by Jon Hunter

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review


On 03/03/2020 17:41, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.5.8 release.
> There are 176 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

All tests passing for Tegra ...

Test results for stable-v5.5:
13 builds: 13 pass, 0 fail
22 boots: 22 pass, 0 fail
40 tests: 40 pass, 0 fail

Linux version: 5.5.8-rc1-g3517b32c0774
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
tegra194-p2972-0000, tegra20-ventana,
tegra210-p2371-2180, tegra210-p3450-0000,
tegra30-cardhu-a04

Cheers
Jon

--
nvpublic

2020-03-03 23:02:59

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On 3/3/20 10:41 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.5.8 release.
> There are 176 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

2020-03-04 06:34:45

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Tue, Mar 03, 2020 at 04:01:34PM -0700, shuah wrote:
> On 3/3/20 10:41 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.5.8 release.
> > There are 176 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Compiled and booted on my test system. No dmesg regressions.

Great, thanks for testing these and letting me know.

greg k-h

2020-03-04 06:35:58

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Tue, Mar 03, 2020 at 10:11:19PM +0000, Jon Hunter wrote:
>
> On 03/03/2020 17:41, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.5.8 release.
> > There are 176 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> All tests passing for Tegra ...
>
> Test results for stable-v5.5:
> 13 builds: 13 pass, 0 fail
> 22 boots: 22 pass, 0 fail
> 40 tests: 40 pass, 0 fail
>
> Linux version: 5.5.8-rc1-g3517b32c0774
> Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
> tegra194-p2972-0000, tegra20-ventana,
> tegra210-p2371-2180, tegra210-p3450-0000,
> tegra30-cardhu-a04
>

thanks for testing all of these and letting me know.

greg k-h

2020-03-04 07:14:18

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Tue, 3 Mar 2020 at 23:16, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.5.8 release.
> There are 176 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Results from Linaro’s test farm.
Regressions detected on x86_64 and i386.

Test failure output:
CVE-2017-5715: VULN (IBRS+IBPB or retpoline+IBPB+RSB filling, is
needed to mitigate the vulnerability)

Test description:
CVE-2017-5715 branch target injection (Spectre Variant 2)

Impact: Kernel
Mitigation 1: new opcode via microcode update that should be used by
up to date compilers to protect the BTB (by flushing indirect branch
predictors)
Mitigation 2: introducing "retpoline" into compilers, and recompile
software/OS with it
Performance impact of the mitigation: high for mitigation 1, medium
for mitigation 2, depending on your CPU

ref:
https://github.com/speed47/spectre-meltdown-checker
https://qa-reports.linaro.org/lkft/linux-stable-rc-5.5-oe/tests/spectre-meltdown-checker-test/CVE-2017-5715
https://lkft.validation.linaro.org/scheduler/job/1264643#L21206

Summary
------------------------------------------------------------------------

kernel: 5.5.8-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.5.y
git commit: 3517b32c0774341d492140b2be08c4bf6d1a833e
git describe: v5.5.7-177-g3517b32c0774
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.5-oe/build/v5.5.7-177-g3517b32c0774

Regressions (compared to build v5.5.7)
------------------------------------------------------------------------

i386:
spectre-meltdown-checker-test:
* CVE-2017-5715

x86:
spectre-meltdown-checker-test:
* CVE-2017-5715

No fixes (compared to build v5.5.7)

Ran 25662 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- nxp-ls2088
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15
- x86

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libgpiod
* linux-log-parser
* perf
* ltp-cap_bounds-tests
* ltp-cpuhotplug-tests
* ltp-fcntl-locktests-tests
* ltp-fs-tests
* ltp-ipc-tests
* ltp-sched-tests
* network-basic-tests
* kvm-unit-tests
* libhugetlbfs
* ltp-commands-tests
* ltp-containers-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-m[
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* spectre-meltdown-checker-test
* v4l2-compliance
* ltp-cap_bounds-64k-page_size-tests
* ltp-cap_bounds-kasan-tests
* ltp-commands-64k-page_size-tests
* ltp-commands-kasan-tests
* ltp-containers-64k-page_size-tests
* ltp-containers-kasan-tests
* ltp-cpuhotplug-64k-page_size-tests
* ltp-cpuhotplug-kasan-tests
* ltp-crypto-64k-page_size-tests
* ltp-crypto-kasan-tests
* ltp-cve-64k-page_size-tests
* ltp-cve-kasan-tests
* ltp-dio-64k-page_size-tests
* ltp-dio-kasan-tests
* ltp-fcntl-locktests-64k-page_size-tests
* ltp-fcntl-locktests-kasan-tests
* ltp-filecaps-64k-page_size-tests
* ltp-filecaps-kasan-tests
* ltp-fs-64k-page_size-tests
* ltp-fs-kasan-tests
* ltp-fs_bind-64k-page_size-tests
* ltp-fs_bind-kasan-tests
* ltp-fs_perms_simple-64k-page_size-tests
* ltp-fs_perms_simple-kasan-tests
* ltp-fsx-64k-page_size-tests
* ltp-fsx-kasan-tests
* ltp-hugetlb-64k-page_size-tests
* ltp-hugetlb-kasan-tests
* ltp-io-64k-page_size-tests
* ltp-io-kasan-tests
* ltp-ipc-64k-page_size-tests
* ltp-ipc-kasan-tests
* ltp-math-64k-page_size-tests
* ltp-math-kasan-tests
* ltp-math-tests
* ltp-mm-64k-page_size-tests
* ltp-mm-kasan-tests
* ltp-nptl-64k-page_size-tests
* ltp-nptl-kasan-tests
* ltp-pty-64k-page_size-tests
* ltp-pty-kasan-tests
* ltp-sched-64k-page_size-tests
* ltp-sched-kasan-tests
* ltp-securebits-64k-page_size-tests
* ltp-securebits-kasan-tests
* ltp-syscalls-64k-page_size-tests
* ltp-syscalls-compat-tests
* ltp-syscalls-kasan-tests
* ltp-open-posix-tests
* ssuite
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

--
Linaro LKFT
https://lkft.linaro.org

2020-03-04 07:23:57

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On 03/03/20 18:42, Greg Kroah-Hartman wrote:
> From: Oliver Upton <[email protected]>
>
> commit 5ef8acbdd687c9d72582e2c05c0b9756efb37863 upstream.
>
> Since commit 5f3d45e7f282 ("kvm/x86: add support for
> MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> flag processor-based execution control for its L2 guest. KVM simply
> forwards any MTF VM-exits to the L1 guest, which works for normal
> instruction execution.
>
> However, when KVM needs to emulate an instruction on the behalf of an L2
> guest, the monitor trap flag is not emulated. Add the necessary logic to
> kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> instruction emulation for L2.
>
> Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> Signed-off-by: Oliver Upton <[email protected]>
> Signed-off-by: Paolo Bonzini <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>

Why is this included in a stable release? It was part of a series of
four patches and the prerequisites as far as I can see are not part of 5.5.

I have already said half a dozen times that I don't want any of the
autopick stuff for KVM. Is a Fixes tag sufficient to get patches into
stable now?

Paolo

> ---
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/include/uapi/asm/kvm.h | 1 +
> arch/x86/kvm/svm.c | 1 +
> arch/x86/kvm/vmx/nested.c | 35 ++++++++++++++++++++++++++++++++++-
> arch/x86/kvm/vmx/nested.h | 5 +++++
> arch/x86/kvm/vmx/vmx.c | 37 ++++++++++++++++++++++++++++++++++++-
> arch/x86/kvm/vmx/vmx.h | 3 +++
> arch/x86/kvm/x86.c | 2 ++
> 8 files changed, 83 insertions(+), 2 deletions(-)
>
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1092,6 +1092,7 @@ struct kvm_x86_ops {
> void (*run)(struct kvm_vcpu *vcpu);
> int (*handle_exit)(struct kvm_vcpu *vcpu);
> int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
> + void (*update_emulated_instruction)(struct kvm_vcpu *vcpu);
> void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
> u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
> void (*patch_hypercall)(struct kvm_vcpu *vcpu,
> --- a/arch/x86/include/uapi/asm/kvm.h
> +++ b/arch/x86/include/uapi/asm/kvm.h
> @@ -390,6 +390,7 @@ struct kvm_sync_regs {
> #define KVM_STATE_NESTED_GUEST_MODE 0x00000001
> #define KVM_STATE_NESTED_RUN_PENDING 0x00000002
> #define KVM_STATE_NESTED_EVMCS 0x00000004
> +#define KVM_STATE_NESTED_MTF_PENDING 0x00000008
>
> #define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001
> #define KVM_STATE_NESTED_SMM_VMXON 0x00000002
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -7311,6 +7311,7 @@ static struct kvm_x86_ops svm_x86_ops __
> .run = svm_vcpu_run,
> .handle_exit = handle_exit,
> .skip_emulated_instruction = skip_emulated_instruction,
> + .update_emulated_instruction = NULL,
> .set_interrupt_shadow = svm_set_interrupt_shadow,
> .get_interrupt_shadow = svm_get_interrupt_shadow,
> .patch_hypercall = svm_patch_hypercall,
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3616,8 +3616,15 @@ static int vmx_check_nested_events(struc
> unsigned long exit_qual;
> bool block_nested_events =
> vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
> + bool mtf_pending = vmx->nested.mtf_pending;
> struct kvm_lapic *apic = vcpu->arch.apic;
>
> + /*
> + * Clear the MTF state. If a higher priority VM-exit is delivered first,
> + * this state is discarded.
> + */
> + vmx->nested.mtf_pending = false;
> +
> if (lapic_in_kernel(vcpu) &&
> test_bit(KVM_APIC_INIT, &apic->pending_events)) {
> if (block_nested_events)
> @@ -3628,8 +3635,28 @@ static int vmx_check_nested_events(struc
> return 0;
> }
>
> + /*
> + * Process any exceptions that are not debug traps before MTF.
> + */
> + if (vcpu->arch.exception.pending &&
> + !vmx_pending_dbg_trap(vcpu) &&
> + nested_vmx_check_exception(vcpu, &exit_qual)) {
> + if (block_nested_events)
> + return -EBUSY;
> + nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> + return 0;
> + }
> +
> + if (mtf_pending) {
> + if (block_nested_events)
> + return -EBUSY;
> + nested_vmx_update_pending_dbg(vcpu);
> + nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
> + return 0;
> + }
> +
> if (vcpu->arch.exception.pending &&
> - nested_vmx_check_exception(vcpu, &exit_qual)) {
> + nested_vmx_check_exception(vcpu, &exit_qual)) {
> if (block_nested_events)
> return -EBUSY;
> nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> @@ -5742,6 +5769,9 @@ static int vmx_get_nested_state(struct k
>
> if (vmx->nested.nested_run_pending)
> kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
> +
> + if (vmx->nested.mtf_pending)
> + kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
> }
> }
>
> @@ -5922,6 +5952,9 @@ static int vmx_set_nested_state(struct k
> vmx->nested.nested_run_pending =
> !!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
>
> + vmx->nested.mtf_pending =
> + !!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
> +
> ret = -EINVAL;
> if (nested_cpu_has_shadow_vmcs(vmcs12) &&
> vmcs12->vmcs_link_pointer != -1ull) {
> --- a/arch/x86/kvm/vmx/nested.h
> +++ b/arch/x86/kvm/vmx/nested.h
> @@ -176,6 +176,11 @@ static inline bool nested_cpu_has_virtua
> return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
> }
>
> +static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
> +{
> + return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
> +}
> +
> static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
> {
> return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1595,6 +1595,40 @@ static int skip_emulated_instruction(str
> return 1;
> }
>
> +
> +/*
> + * Recognizes a pending MTF VM-exit and records the nested state for later
> + * delivery.
> + */
> +static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu)
> +{
> + struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> + struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> + if (!is_guest_mode(vcpu))
> + return;
> +
> + /*
> + * Per the SDM, MTF takes priority over debug-trap exceptions besides
> + * T-bit traps. As instruction emulation is completed (i.e. at the
> + * instruction boundary), any #DB exception pending delivery must be a
> + * debug-trap. Record the pending MTF state to be delivered in
> + * vmx_check_nested_events().
> + */
> + if (nested_cpu_has_mtf(vmcs12) &&
> + (!vcpu->arch.exception.pending ||
> + vcpu->arch.exception.nr == DB_VECTOR))
> + vmx->nested.mtf_pending = true;
> + else
> + vmx->nested.mtf_pending = false;
> +}
> +
> +static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
> +{
> + vmx_update_emulated_instruction(vcpu);
> + return skip_emulated_instruction(vcpu);
> +}
> +
> static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
> {
> /*
> @@ -7886,7 +7920,8 @@ static struct kvm_x86_ops vmx_x86_ops __
>
> .run = vmx_vcpu_run,
> .handle_exit = vmx_handle_exit,
> - .skip_emulated_instruction = skip_emulated_instruction,
> + .skip_emulated_instruction = vmx_skip_emulated_instruction,
> + .update_emulated_instruction = vmx_update_emulated_instruction,
> .set_interrupt_shadow = vmx_set_interrupt_shadow,
> .get_interrupt_shadow = vmx_get_interrupt_shadow,
> .patch_hypercall = vmx_patch_hypercall,
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -150,6 +150,9 @@ struct nested_vmx {
> /* L2 must run next, and mustn't decide to exit to L1. */
> bool nested_run_pending;
>
> + /* Pending MTF VM-exit into L1. */
> + bool mtf_pending;
> +
> struct loaded_vmcs vmcs02;
>
> /*
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6838,6 +6838,8 @@ restart:
> kvm_rip_write(vcpu, ctxt->eip);
> if (r && ctxt->tf)
> r = kvm_vcpu_do_singlestep(vcpu);
> + if (kvm_x86_ops->update_emulated_instruction)
> + kvm_x86_ops->update_emulated_instruction(vcpu);
> __kvm_set_rflags(vcpu, ctxt->eflags);
> }
>
>
>

2020-03-04 07:41:26

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On Tue, Mar 3, 2020 at 11:23 PM Paolo Bonzini <[email protected]> wrote:
>
> On 03/03/20 18:42, Greg Kroah-Hartman wrote:
> > From: Oliver Upton <[email protected]>
> >
> > commit 5ef8acbdd687c9d72582e2c05c0b9756efb37863 upstream.
> >
> > Since commit 5f3d45e7f282 ("kvm/x86: add support for
> > MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> > flag processor-based execution control for its L2 guest. KVM simply
> > forwards any MTF VM-exits to the L1 guest, which works for normal
> > instruction execution.
> >
> > However, when KVM needs to emulate an instruction on the behalf of an L2
> > guest, the monitor trap flag is not emulated. Add the necessary logic to
> > kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> > instruction emulation for L2.
> >
> > Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> > Signed-off-by: Oliver Upton <[email protected]>
> > Signed-off-by: Paolo Bonzini <[email protected]>
> > Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> Why is this included in a stable release? It was part of a series of
> four patches and the prerequisites as far as I can see are not part of 5.5.

It looks like these commits were already picked from upstream:

684c0422da71 ("KVM: nVMX: Handle pending #DB when injecting INIT VM-exit")
307f1cfa2696 ("KVM: x86: Mask off reserved bit from #DB exception payload")

These were bug fixes in their own right and were sensible for
backporting (though I didn't Cc stable, if I'm not mistaken).

but not:

a06230b62b89 ("KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS")

which this patch absolutely depends on.

Otherwise, I'll defer discussions regarding the suitability of this
patch for stable to Paolo.

--
Thanks,
Oliver

> I have already said half a dozen times that I don't want any of the
> autopick stuff for KVM. Is a Fixes tag sufficient to get patches into
> stable now?
>
> Paolo
>
> > ---
> > arch/x86/include/asm/kvm_host.h | 1 +
> > arch/x86/include/uapi/asm/kvm.h | 1 +
> > arch/x86/kvm/svm.c | 1 +
> > arch/x86/kvm/vmx/nested.c | 35 ++++++++++++++++++++++++++++++++++-
> > arch/x86/kvm/vmx/nested.h | 5 +++++
> > arch/x86/kvm/vmx/vmx.c | 37 ++++++++++++++++++++++++++++++++++++-
> > arch/x86/kvm/vmx/vmx.h | 3 +++
> > arch/x86/kvm/x86.c | 2 ++
> > 8 files changed, 83 insertions(+), 2 deletions(-)
> >
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1092,6 +1092,7 @@ struct kvm_x86_ops {
> > void (*run)(struct kvm_vcpu *vcpu);
> > int (*handle_exit)(struct kvm_vcpu *vcpu);
> > int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
> > + void (*update_emulated_instruction)(struct kvm_vcpu *vcpu);
> > void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
> > u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
> > void (*patch_hypercall)(struct kvm_vcpu *vcpu,
> > --- a/arch/x86/include/uapi/asm/kvm.h
> > +++ b/arch/x86/include/uapi/asm/kvm.h
> > @@ -390,6 +390,7 @@ struct kvm_sync_regs {
> > #define KVM_STATE_NESTED_GUEST_MODE 0x00000001
> > #define KVM_STATE_NESTED_RUN_PENDING 0x00000002
> > #define KVM_STATE_NESTED_EVMCS 0x00000004
> > +#define KVM_STATE_NESTED_MTF_PENDING 0x00000008
> >
> > #define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001
> > #define KVM_STATE_NESTED_SMM_VMXON 0x00000002
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -7311,6 +7311,7 @@ static struct kvm_x86_ops svm_x86_ops __
> > .run = svm_vcpu_run,
> > .handle_exit = handle_exit,
> > .skip_emulated_instruction = skip_emulated_instruction,
> > + .update_emulated_instruction = NULL,
> > .set_interrupt_shadow = svm_set_interrupt_shadow,
> > .get_interrupt_shadow = svm_get_interrupt_shadow,
> > .patch_hypercall = svm_patch_hypercall,
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -3616,8 +3616,15 @@ static int vmx_check_nested_events(struc
> > unsigned long exit_qual;
> > bool block_nested_events =
> > vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
> > + bool mtf_pending = vmx->nested.mtf_pending;
> > struct kvm_lapic *apic = vcpu->arch.apic;
> >
> > + /*
> > + * Clear the MTF state. If a higher priority VM-exit is delivered first,
> > + * this state is discarded.
> > + */
> > + vmx->nested.mtf_pending = false;
> > +
> > if (lapic_in_kernel(vcpu) &&
> > test_bit(KVM_APIC_INIT, &apic->pending_events)) {
> > if (block_nested_events)
> > @@ -3628,8 +3635,28 @@ static int vmx_check_nested_events(struc
> > return 0;
> > }
> >
> > + /*
> > + * Process any exceptions that are not debug traps before MTF.
> > + */
> > + if (vcpu->arch.exception.pending &&
> > + !vmx_pending_dbg_trap(vcpu) &&
> > + nested_vmx_check_exception(vcpu, &exit_qual)) {
> > + if (block_nested_events)
> > + return -EBUSY;
> > + nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> > + return 0;
> > + }
> > +
> > + if (mtf_pending) {
> > + if (block_nested_events)
> > + return -EBUSY;
> > + nested_vmx_update_pending_dbg(vcpu);
> > + nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
> > + return 0;
> > + }
> > +
> > if (vcpu->arch.exception.pending &&
> > - nested_vmx_check_exception(vcpu, &exit_qual)) {
> > + nested_vmx_check_exception(vcpu, &exit_qual)) {
> > if (block_nested_events)
> > return -EBUSY;
> > nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
> > @@ -5742,6 +5769,9 @@ static int vmx_get_nested_state(struct k
> >
> > if (vmx->nested.nested_run_pending)
> > kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
> > +
> > + if (vmx->nested.mtf_pending)
> > + kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
> > }
> > }
> >
> > @@ -5922,6 +5952,9 @@ static int vmx_set_nested_state(struct k
> > vmx->nested.nested_run_pending =
> > !!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
> >
> > + vmx->nested.mtf_pending =
> > + !!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
> > +
> > ret = -EINVAL;
> > if (nested_cpu_has_shadow_vmcs(vmcs12) &&
> > vmcs12->vmcs_link_pointer != -1ull) {
> > --- a/arch/x86/kvm/vmx/nested.h
> > +++ b/arch/x86/kvm/vmx/nested.h
> > @@ -176,6 +176,11 @@ static inline bool nested_cpu_has_virtua
> > return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
> > }
> >
> > +static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
> > +{
> > + return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
> > +}
> > +
> > static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
> > {
> > return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1595,6 +1595,40 @@ static int skip_emulated_instruction(str
> > return 1;
> > }
> >
> > +
> > +/*
> > + * Recognizes a pending MTF VM-exit and records the nested state for later
> > + * delivery.
> > + */
> > +static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu)
> > +{
> > + struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> > + struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > + if (!is_guest_mode(vcpu))
> > + return;
> > +
> > + /*
> > + * Per the SDM, MTF takes priority over debug-trap exceptions besides
> > + * T-bit traps. As instruction emulation is completed (i.e. at the
> > + * instruction boundary), any #DB exception pending delivery must be a
> > + * debug-trap. Record the pending MTF state to be delivered in
> > + * vmx_check_nested_events().
> > + */
> > + if (nested_cpu_has_mtf(vmcs12) &&
> > + (!vcpu->arch.exception.pending ||
> > + vcpu->arch.exception.nr == DB_VECTOR))
> > + vmx->nested.mtf_pending = true;
> > + else
> > + vmx->nested.mtf_pending = false;
> > +}
> > +
> > +static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
> > +{
> > + vmx_update_emulated_instruction(vcpu);
> > + return skip_emulated_instruction(vcpu);
> > +}
> > +
> > static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
> > {
> > /*
> > @@ -7886,7 +7920,8 @@ static struct kvm_x86_ops vmx_x86_ops __
> >
> > .run = vmx_vcpu_run,
> > .handle_exit = vmx_handle_exit,
> > - .skip_emulated_instruction = skip_emulated_instruction,
> > + .skip_emulated_instruction = vmx_skip_emulated_instruction,
> > + .update_emulated_instruction = vmx_update_emulated_instruction,
> > .set_interrupt_shadow = vmx_set_interrupt_shadow,
> > .get_interrupt_shadow = vmx_get_interrupt_shadow,
> > .patch_hypercall = vmx_patch_hypercall,
> > --- a/arch/x86/kvm/vmx/vmx.h
> > +++ b/arch/x86/kvm/vmx/vmx.h
> > @@ -150,6 +150,9 @@ struct nested_vmx {
> > /* L2 must run next, and mustn't decide to exit to L1. */
> > bool nested_run_pending;
> >
> > + /* Pending MTF VM-exit into L1. */
> > + bool mtf_pending;
> > +
> > struct loaded_vmcs vmcs02;
> >
> > /*
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -6838,6 +6838,8 @@ restart:
> > kvm_rip_write(vcpu, ctxt->eip);
> > if (r && ctxt->tf)
> > r = kvm_vcpu_do_singlestep(vcpu);
> > + if (kvm_x86_ops->update_emulated_instruction)
> > + kvm_x86_ops->update_emulated_instruction(vcpu);
> > __kvm_set_rflags(vcpu, ctxt->eflags);
> > }
> >
> >
> >
>
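
As an aside for anyone consuming the new UAPI bit: the MTF state added
above is exposed through KVM_GET_NESTED_STATE, so a userspace VMM can
check whether a synthesized MTF VM-exit is still pending. A minimal,
hypothetical sketch (the vcpu_fd setup is elided, and
KVM_STATE_NESTED_MTF_PENDING only exists in headers carrying this
patch):

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Return 1 if an MTF VM-exit to L1 is pending, 0 if not, -1 on error.
 * vcpu_fd is an already-configured KVM vCPU fd (creation elided). */
static int mtf_exit_pending(int vcpu_fd)
{
	struct {
		struct kvm_nested_state state;
		char payload[8192];	/* space for the vmcs12 data blob */
	} buf;

	memset(&buf, 0, sizeof(buf));
	buf.state.size = sizeof(buf);
	if (ioctl(vcpu_fd, KVM_GET_NESTED_STATE, &buf.state) < 0)
		return -1;
	return !!(buf.state.flags & KVM_STATE_NESTED_MTF_PENDING);
}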

2020-03-04 08:11:47

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On Tue, Mar 03, 2020 at 11:39:35PM -0800, Oliver Upton wrote:
> On Tue, Mar 3, 2020 at 11:23 PM Paolo Bonzini <[email protected]> wrote:
> >
> > On 03/03/20 18:42, Greg Kroah-Hartman wrote:
> > > From: Oliver Upton <[email protected]>
> > >
> > > commit 5ef8acbdd687c9d72582e2c05c0b9756efb37863 upstream.
> > >
> > > Since commit 5f3d45e7f282 ("kvm/x86: add support for
> > > MONITOR_TRAP_FLAG"), KVM has allowed an L1 guest to use the monitor trap
> > > flag processor-based execution control for its L2 guest. KVM simply
> > > forwards any MTF VM-exits to the L1 guest, which works for normal
> > > instruction execution.
> > >
> > > However, when KVM needs to emulate an instruction on the behalf of an L2
> > > guest, the monitor trap flag is not emulated. Add the necessary logic to
> > > kvm_skip_emulated_instruction() to synthesize an MTF VM-exit to L1 upon
> > > instruction emulation for L2.
> > >
> > > Fixes: 5f3d45e7f282 ("kvm/x86: add support for MONITOR_TRAP_FLAG")
> > > Signed-off-by: Oliver Upton <[email protected]>
> > > Signed-off-by: Paolo Bonzini <[email protected]>
> > > Signed-off-by: Greg Kroah-Hartman <[email protected]>
> >
> > Why is this included in a stable release? It was part of a series of
> > four patches and the prerequisites as far as I can see are not part of 5.5.
>
> It looks like these commits were already picked from upstream:
>
> 684c0422da71 ("KVM: nVMX: Handle pending #DB when injecting INIT VM-exit")
> 307f1cfa2696 ("KVM: x86: Mask off reserved bit from #DB exception payload")
>
> Which were bug fixes in their own right and were sensible for
> backporting (though I didn't cc stable if I'm not mistaken).
>
> but not:
>
> a06230b62b89 ("KVM: x86: Deliver exception payload on KVM_GET_VCPU_EVENTS")
>
> which this patch absolutely depends on.
>
> Otherwise, I'll defer discussions regarding the suitability of this
> patch for stable to Paolo.

I picked this patch up solely because the Fixes: tag showed that this
commit resolved something from a previous commit. The interdependencies
were not obvious, especially as it all seemed to build just fine here.

> > I have already said half a dozen times that I don't want any of the
> > autopick stuff for KVM. Is a Fixes tag sufficient to get patches into
> > stable now?

Yes, it can happen that a Fixes tag does cause a patch to get into
stable because I look out for that. I do that because a number of
maintainers/developers only put that tag in there, and also to catch
patches for when we backport stuff and then need to take a fix for that
backport (not the case here though).

I'll be glad to just put KVM into the "never apply any patches to
stable unless you explicitly mark it as such", but the sad fact is that
many recent KVM fixes for reported CVEs never had any "Cc: stable@vger"
markings. They only had "Fixes:" tags and so I have had to dig them out
of the tree and backport them myself in order to resolve those very
public issues.

So can I ask that you always properly tag things for stable? If so, I
will be glad to ignore Fixes: tags for KVM patches in the future.

I'll go drop this patch as well. Note, there are other KVM patches in
this release cycle also, can someone verify that I did not overreach for
them as well?

thanks,

greg k-h

2020-03-04 08:11:51

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, Mar 04, 2020 at 12:43:42PM +0530, Naresh Kamboju wrote:
> On Tue, 3 Mar 2020 at 23:16, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > This is the start of the stable review cycle for the 5.5.8 release.
> > There are 176 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Results from Linaro’s test farm.
> Regressions detected on x86_64 and i386.
>
> Test failure output:
> CVE-2017-5715: VULN (IBRS+IBPB or retpoline+IBPB+RSB filling, is
> needed to mitigate the vulnerability)
>
> Test description:
> CVE-2017-5715 branch target injection (Spectre Variant 2)
>
> Impact: Kernel
> Mitigation 1: new opcode via microcode update that should be used by
> up to date compilers to protect the BTB (by flushing indirect branch
> predictors)
> Mitigation 2: introducing "retpoline" into compilers, and recompile
> software/OS with it
> Performance impact of the mitigation: high for mitigation 1, medium
> for mitigation 2, depending on your CPU

So these are regressions or just new tests?

If regressions, can you do 'git bisect' to find the offending commit?

Also, are you sure you have an updated microcode on these machines and a
proper compiler for retpoline?

thanks,

greg k-h

2020-03-04 08:20:22

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On 04/03/20 09:10, Greg Kroah-Hartman wrote:
> I'll be glad to just put KVM into the "never apply any patches to
> stable unless you explicitly mark it as such", but the sad fact is that
> many recent KVM fixes for reported CVEs never had any "Cc: stable@vger"
> markings.

Hmm, I did miss it in 433f4ba1904100da65a311033f17a9bf586b287e and
acff78477b9b4f26ecdf65733a4ed77fe837e9dc, but that's going back to
August 2018, so I can do better but it's not too shabby a record. :)

> They only had "Fixes:" tags and so I have had to dig them out
> of the tree and backport them myself in order to resolve those very
> public issues.
>
> So can I ask that you always properly tag things for stable? If so, I
> will be glad to ignore Fixes: tags for KVM patches in the future.
>
> I'll go drop this patch as well. Note, there are other KVM patches in
> this release cycle also, can someone verify that I did not overreach for
> them as well?

I checked them and they are fine.

Paolo

2020-03-04 08:26:33

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On Wed, Mar 04, 2020 at 09:19:09AM +0100, Paolo Bonzini wrote:
> On 04/03/20 09:10, Greg Kroah-Hartman wrote:
> > I'll be glad to just put KVM into the "never apply any patches to
> > stable unless you explicitly mark it as such", but the sad fact is that
> > many recent KVM fixes for reported CVEs never had any "Cc: stable@vger"
> > markings.
>
> Hmm, I did miss it in 433f4ba1904100da65a311033f17a9bf586b287e and
> acff78477b9b4f26ecdf65733a4ed77fe837e9dc, but that's going back to
> August 2018, so I can do better but it's not too shabby a record. :)

35a571346a94 ("KVM: nVMX: Check IO instruction VM-exit conditions")
e71237d3ff1a ("KVM: nVMX: Refactor IO bitmap checks into helper function")

Were both from a few weeks ago and needed to resolve CVE-2020-2732 :(

> > They only had "Fixes:" tags and so I have had to dig them out
> > of the tree and backport them myself in order to resolve those very
> > public issues.
> >
> > So can I ask that you always properly tag things for stable? If so, I
> > will be glad to ignore Fixes: tags for KVM patches in the future.
> >
> > I'll go drop this patch as well. Note, there are other KVM patches in
> > this release cycle also, can someone verify that I did not overreach for
> > them as well?
>
> I checked them and they are fine.

Thank you for that.

greg k-h

2020-03-04 08:43:52

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On 04/03/20 09:26, Greg Kroah-Hartman wrote:
> On Wed, Mar 04, 2020 at 09:19:09AM +0100, Paolo Bonzini wrote:
>> On 04/03/20 09:10, Greg Kroah-Hartman wrote:
>>> I'll be glad to just put KVM into the "never apply any patches to
>>> stable unless you explicitly mark it as such", but the sad fact is that
>>> many recent KVM fixes for reported CVEs never had any "Cc: stable@vger"
>>> markings.
>>
>> Hmm, I did miss it in 433f4ba1904100da65a311033f17a9bf586b287e and
>> acff78477b9b4f26ecdf65733a4ed77fe837e9dc, but that's going back to
>> August 2018, so I can do better but it's not too shabby a record. :)
>
> 35a571346a94 ("KVM: nVMX: Check IO instruction VM-exit conditions")
> e71237d3ff1a ("KVM: nVMX: Refactor IO bitmap checks into helper function")
>
> Were both from a few weeks ago and needed to resolve CVE-2020-2732 :(

No, they weren't, only the patch that was CCed stable was needed to
resolve the CVE.

Remember that at this point a lot of bugfixes or vulnerabilities in KVM
exploit corner cases of the architecture and don't show up with the
usual guests (Linux, Windows, BSDs). Since we didn't have full
information on the impact on guests that people do run, we started with
the bare minimum (the two patches above) but only for 5.6. The idea was
to collect follow-up patches for 2-4 weeks, decide which subset was
stable-worthy, and only then post them as stable backport subsets.

Paolo

2020-03-04 08:48:30

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, Mar 04, 2020 at 09:11:28AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Mar 04, 2020 at 12:43:42PM +0530, Naresh Kamboju wrote:
> > On Tue, 3 Mar 2020 at 23:16, Greg Kroah-Hartman
> > <[email protected]> wrote:
> > >
> > > This is the start of the stable review cycle for the 5.5.8 release.
> > > There are 176 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> > >
> >
> > Results from Linaro’s test farm.
> > Regressions detected on x86_64 and i386.
> >
> > Test failure output:
> > CVE-2017-5715: VULN (IBRS+IBPB or retpoline+IBPB+RSB filling, is
> > needed to mitigate the vulnerability)
> >
> > Test description:
> > CVE-2017-5715 branch target injection (Spectre Variant 2)
> >
> > Impact: Kernel
> > Mitigation 1: new opcode via microcode update that should be used by
> > up to date compilers to protect the BTB (by flushing indirect branch
> > predictors)
> > Mitigation 2: introducing "retpoline" into compilers, and recompile
> > software/OS with it
> > Performance impact of the mitigation: high for mitigation 1, medium
> > for mitigation 2, depending on your CPU
>
> So these are regressions or just new tests?
>
> If regressions, can you do 'git bisect' to find the offending commit?
>
> Also, are you sure you have an updated microcode on these machines and a
> proper compiler for retpoline?

As an example of just how crazy that script is, here's the output of my
machine for that first CVE issue:

CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling)
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: YES (for firmware code only)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
* Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
* Kernel supports RSB filling: UNKNOWN (couldn't check (couldn't find your kernel image in /boot, if you used netboot, this is normal))
> STATUS: VULNERABLE (IBRS+IBPB or retpoline+IBPB+RSB filling, is needed to mitigate

So why is this "Vulnerable"? Because it didn't think it could find my
kernel image for some odd reason, despite it really being in /boot/ (I
don't use netboot)

So please verify that this really is a real issue, and not just the
script doing foolish things.

thanks,

greg k-h

2020-03-04 08:50:21

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, Mar 04, 2020 at 09:47:02AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Mar 04, 2020 at 09:11:28AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Mar 04, 2020 at 12:43:42PM +0530, Naresh Kamboju wrote:
> > > On Tue, 3 Mar 2020 at 23:16, Greg Kroah-Hartman
> > > <[email protected]> wrote:
> > > >
> > > > This is the start of the stable review cycle for the 5.5.8 release.
> > > > There are 176 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > > > Anything received after that time might be too late.
> > > >
> > > > The whole patch series can be found in one patch at:
> > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > > > or in the git tree and branch at:
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > > > and the diffstat can be found below.
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > > >
> > >
> > > Results from Linaro’s test farm.
> > > Regressions detected on x86_64 and i386.
> > >
> > > Test failure output:
> > > CVE-2017-5715: VULN (IBRS+IBPB or retpoline+IBPB+RSB filling, is
> > > needed to mitigate the vulnerability)
> > >
> > > Test description:
> > > CVE-2017-5715 branch target injection (Spectre Variant 2)
> > >
> > > Impact: Kernel
> > > Mitigation 1: new opcode via microcode update that should be used by
> > > up to date compilers to protect the BTB (by flushing indirect branch
> > > predictors)
> > > Mitigation 2: introducing "retpoline" into compilers, and recompile
> > > software/OS with it
> > > Performance impact of the mitigation: high for mitigation 1, medium
> > > for mitigation 2, depending on your CPU
> >
> > So these are regressions or just new tests?
> >
> > If regressions, can you do 'git bisect' to find the offending commit?
> >
> > Also, are you sure you have an updated microcode on these machines and a
> > proper compiler for retpoline?
>
> As an example of just how crazy that script is, here's the output of my
> machine for that first CVE issue:
>
> CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
> * Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling)
> * Mitigation 1
> * Kernel is compiled with IBRS support: YES
> * IBRS enabled and active: YES (for firmware code only)
> * Kernel is compiled with IBPB support: YES
> * IBPB enabled and active: YES
> * Mitigation 2
> * Kernel has branch predictor hardening (arm): NO
> * Kernel compiled with retpoline option: YES
> * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
> * Kernel supports RSB filling: UNKNOWN (couldn't check (couldn't find your kernel image in /boot, if you used netboot, this is normal))
> > STATUS: VULNERABLE (IBRS+IBPB or retpoline+IBPB+RSB filling, is needed to mitigate
>
> So why is this "Vulnerable"? Because it didn't think it could find my
> kernel image for some odd reason, despite it really being in /boot/ (I
> don't use netboot)
>
> So please verify that this really is a real issue, and not just the
> script doing foolish things.

And, if I tell the script where my kernel image is, suddenly all is
good:

CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
* Mitigation 1
* Kernel is compiled with IBRS support: YES
* IBRS enabled and active: N/A (not testable in offline mode)
* Kernel is compiled with IBPB support: YES
* IBPB enabled and active: N/A (not testable in offline mode)
* Mitigation 2
* Kernel has branch predictor hardening (arm): UNKNOWN
* Kernel compiled with retpoline option: UNKNOWN (couldn't read your kernel configuration)
* Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (offline mode: kernel supports IBRS + IBPB to mitigate the vulnerability)
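
(For reference, the script can be pointed at the image and config
explicitly, e.g. something like
"sudo ./spectre-meltdown-checker.sh --variant 2 --kernel /boot/vmlinuz-$(uname -r) --config /boot/config-$(uname -r)";
flag names quoted from memory, so check the script's --help before
relying on them.)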



2020-03-04 08:52:08

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 111/176] KVM: nVMX: Emulate MTF when performing instruction emulation

On Wed, Mar 04, 2020 at 09:43:18AM +0100, Paolo Bonzini wrote:
> On 04/03/20 09:26, Greg Kroah-Hartman wrote:
> > On Wed, Mar 04, 2020 at 09:19:09AM +0100, Paolo Bonzini wrote:
> >> On 04/03/20 09:10, Greg Kroah-Hartman wrote:
> >>> I'll be glad to just put KVM into the "never apply any patches to
> >>> stable unless you explicitly mark it as such", but the sad fact is that
> >>> many recent KVM fixes for reported CVEs never had any "Cc: stable@vger"
> >>> markings.
> >>
> >> Hmm, I did miss it in 433f4ba1904100da65a311033f17a9bf586b287e and
> >> acff78477b9b4f26ecdf65733a4ed77fe837e9dc, but that's going back to
> >> August 2018, so I can do better but it's not too shabby a record. :)
> >
> > 35a571346a94 ("KVM: nVMX: Check IO instruction VM-exit conditions")
> > e71237d3ff1a ("KVM: nVMX: Refactor IO bitmap checks into helper function")
> >
> > Were both from a few weeks ago and needed to resolve CVE-2020-2732 :(
>
> No, they weren't, only the patch that was CCed stable was needed to
> resolve the CVE.

Ah, that's not what was posted to oss-security :(

> Remember that at this point a lot of bugfixes or vulnerabilities in KVM
> exploit corner cases of the architecture and don't show up with the
> usual guests (Linux, Windows, BSDs). Since we didn't have full
> information on the impact on guests that people do run, we started with
> the bare minimum (the two patches above) but only for 5.6. The idea was
> to collect follow-up patches for 2-4 weeks, decide which subset was
> stable-worthy, and only then post them as stable backport subsets.

Ok, that's fine, but it would be good if someone told me about this so
that I knew what was going on when people asked me about this type of
thing :)

thanks,

greg k-h

2020-03-04 10:53:12

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, 4 Mar 2020 at 14:19, Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Wed, Mar 04, 2020 at 09:47:02AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Mar 04, 2020 at 09:11:28AM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Mar 04, 2020 at 12:43:42PM +0530, Naresh Kamboju wrote:
> > > > On Tue, 3 Mar 2020 at 23:16, Greg Kroah-Hartman
> > > > <[email protected]> wrote:
> > > > >
> > > > > This is the start of the stable review cycle for the 5.5.8 release.
> > > > > There are 176 patches in this series, all will be posted as a response
> > > > > to this one. If anyone has any issues with these being applied, please
> > > > > let me know.
> > > > >
> > > > > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > > > > Anything received after that time might be too late.
> > > > >
> > > > > The whole patch series can be found in one patch at:
> > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > > > > or in the git tree and branch at:
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > > > > and the diffstat can be found below.
> > > > >
> > > > > thanks,
> > > > >
> > > > > greg k-h
> > > > >
> > > >
> > > > Results from Linaro’s test farm.
> > > > Regressions detected on x86_64 and i386.
> > > >
> > > > Test failure output:
> > > > CVE-2017-5715: VULN (IBRS+IBPB or retpoline+IBPB+RSB filling, is
> > > > needed to mitigate the vulnerability)
> > > >
> > > > Test description:
> > > > CVE-2017-5715 branch target injection (Spectre Variant 2)
> > > >
> > > > Impact: Kernel
> > > > Mitigation 1: new opcode via microcode update that should be used by
> > > > up to date compilers to protect the BTB (by flushing indirect branch
> > > > predictors)
> > > > Mitigation 2: introducing "retpoline" into compilers, and recompile
> > > > software/OS with it
> > > > Performance impact of the mitigation: high for mitigation 1, medium
> > > > for mitigation 2, depending on your CPU
> > >
> > > So these are regressions or just new tests?
> > >
> > > If regressions, can you do 'git bisect' to find the offending commit?
> > >
> > > Also, are you sure you have an updated microcode on these machines and a
> > > proper compiler for retpoline?
> >
> > As an example of just how crazy that script is, here's the output of my
> > machine for that first CVE issue:
> >
> > CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
> > * Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling)
> > * Mitigation 1
> > * Kernel is compiled with IBRS support: YES
> > * IBRS enabled and active: YES (for firmware code only)
> > * Kernel is compiled with IBPB support: YES
> > * IBPB enabled and active: YES
> > * Mitigation 2
> > * Kernel has branch predictor hardening (arm): NO
> > * Kernel compiled with retpoline option: YES
> > * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
> > * Kernel supports RSB filling: UNKNOWN (couldn't check (couldn't find your kernel image in /boot, if you used netboot, this is normal))
> > > STATUS: VULNERABLE (IBRS+IBPB or retpoline+IBPB+RSB filling, is needed to mitigate
> >
> > So why is this "Vulnerable"? Because it didn't think it could find my
> > kernel image for some odd reason, despite it really being in /boot/ (I
> > don't use netboot)

Now I know the real reason why this test failed.
With this note, we can conclude this is not a regression.

No regressions on arm64, arm, x86_64, and i386 for 4.19, 5.4 and 5.5 branches.

Sorry for the noise.

--
Linaro LKFT
https://lkft.linaro.org

2020-03-04 11:53:39

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, Mar 04, 2020 at 04:22:30PM +0530, Naresh Kamboju wrote:
> On Wed, 4 Mar 2020 at 14:19, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > On Wed, Mar 04, 2020 at 09:47:02AM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Mar 04, 2020 at 09:11:28AM +0100, Greg Kroah-Hartman wrote:
> > > > On Wed, Mar 04, 2020 at 12:43:42PM +0530, Naresh Kamboju wrote:
> > > > > On Tue, 3 Mar 2020 at 23:16, Greg Kroah-Hartman
> > > > > <[email protected]> wrote:
> > > > > >
> > > > > > This is the start of the stable review cycle for the 5.5.8 release.
> > > > > > There are 176 patches in this series, all will be posted as a response
> > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > let me know.
> > > > > >
> > > > > > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > > > > > Anything received after that time might be too late.
> > > > > >
> > > > > > The whole patch series can be found in one patch at:
> > > > > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.5.8-rc1.gz
> > > > > > or in the git tree and branch at:
> > > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.5.y
> > > > > > and the diffstat can be found below.
> > > > > >
> > > > > > thanks,
> > > > > >
> > > > > > greg k-h
> > > > > >
> > > > >
> > > > > Results from Linaro’s test farm.
> > > > > Regressions detected on x86_64 and i386.
> > > > >
> > > > > Test failure output:
> > > > > CVE-2017-5715: VULN (IBRS+IBPB or retpoline+IBPB+RSB filling, is
> > > > > needed to mitigate the vulnerability)
> > > > >
> > > > > Test description:
> > > > > CVE-2017-5715 branch target injection (Spectre Variant 2)
> > > > >
> > > > > Impact: Kernel
> > > > > Mitigation 1: new opcode via microcode update that should be used by
> > > > > up to date compilers to protect the BTB (by flushing indirect branch
> > > > > predictors)
> > > > > Mitigation 2: introducing "retpoline" into compilers, and recompile
> > > > > software/OS with it
> > > > > Performance impact of the mitigation: high for mitigation 1, medium
> > > > > for mitigation 2, depending on your CPU
> > > >
> > > > So these are regressions or just new tests?
> > > >
> > > > If regressions, can you do 'git bisect' to find the offending commit?
> > > >
> > > > Also, are you sure you have an updated microcode on these machines and a
> > > > proper compiler for retpoline?
> > >
> > > As an example of just how crazy that script is, here's the output of my
> > > machine for that first CVE issue:
> > >
> > > CVE-2017-5715 aka 'Spectre Variant 2, branch target injection'
> > > * Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling)
> > > * Mitigation 1
> > > * Kernel is compiled with IBRS support: YES
> > > * IBRS enabled and active: YES (for firmware code only)
> > > * Kernel is compiled with IBPB support: YES
> > > * IBPB enabled and active: YES
> > > * Mitigation 2
> > > * Kernel has branch predictor hardening (arm): NO
> > > * Kernel compiled with retpoline option: YES
> > > * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
> > > * Kernel supports RSB filling: UNKNOWN (couldn't check (couldn't find your kernel image in /boot, if you used netboot, this is normal))
> > > > STATUS: VULNERABLE (IBRS+IBPB or retpoline+IBPB+RSB filling, is needed to mitigate
> > >
> > > So why is this "Vulnerable"? Because it didn't think it could find my
> > > kernel image for some odd reason, despite it really being in /boot/ (I
> > > don't use netboot)
>
> Now I know the real reason why this test failed.
> With this note we can conclude this is not a regression.
>
> No regressions on arm64, arm, x86_64, and i386 for 4.19, 5.4 and 5.5 branches.

Great, thanks for confirming and for testing all of these.

greg k-h

2020-03-04 15:27:41

by Dan Rue

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, Mar 04, 2020 at 12:52:32PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Mar 04, 2020 at 04:22:30PM +0530, Naresh Kamboju wrote:
> > > > So why is this "Vulnerable"? Because it didn't think it could find my
> > > > kernel image for some odd reason, despite it really being in /boot/ (I
> > > > don't use netboot)
> >
> > Now I know the real reason why this test failed.
> > With this note we can conclude this is not a regression.
> >
> > No regressions on arm64, arm, x86_64, and i386 for 4.19, 5.4 and 5.5 branches.
>
> Great, thanks for confirming and for testing all of these.

We originally added spectre-meltdown-checker to LKFT for informational
purposes, so that we could compare its report to any actual tests that
produce spectre/meltdown related failures (and help determine if the
problem is hardware/firmware or kernel). In practice, it's never been
helpful (because it's not an actual test) and so we'll be removing it
from LKFT.

Dan

>
> greg k-h

--
Linaro LKFT
https://lkft.linaro.org

2020-03-04 16:54:02

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Tue, Mar 03, 2020 at 06:41:04PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.5.8 release.
> There are 176 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 157 pass: 157 fail: 0
Qemu test results:
total: 423 pass: 423 fail: 0

Guenter

2020-03-04 17:19:36

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 5.5 000/176] 5.5.8-stable review

On Wed, Mar 04, 2020 at 08:53:12AM -0800, Guenter Roeck wrote:
> On Tue, Mar 03, 2020 at 06:41:04PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.5.8 release.
> > There are 176 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 05 Mar 2020 17:42:06 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 157 pass: 157 fail: 0
> Qemu test results:
> total: 423 pass: 423 fail: 0

Wonderful, thanks for testing these and letting me know.

greg k-h

2020-03-04 21:38:59

by walter harms

[permalink] [raw]
Subject: AW: [PATCH 5.5 110/176] MIPS: VPE: Fix a double free and a memory leak in release_vpe()


________________________________________
Von: [email protected] <[email protected]> im Auftrag von Greg Kroah-Hartman <[email protected]>
Gesendet: Dienstag, 3. M?rz 2020 18:42
An: [email protected]
Cc: Greg Kroah-Hartman; [email protected]; Christophe JAILLET; Paul Burton; [email protected]; [email protected]; [email protected]
Betreff: [PATCH 5.5 110/176] MIPS: VPE: Fix a double free and a memory leak in release_vpe()

From: Christophe JAILLET <[email protected]>

commit bef8e2dfceed6daeb6ca3e8d33f9c9d43b926580 upstream.

The pointer to the memory allocated by 'alloc_progmem()' is stored in
'v->load_addr', so that is the memory that should be freed by
'release_progmem()'.

'release_progmem()' is only a call to 'kfree()'.

With the current code, there is both a double free and a memory leak.
Fix it by passing the correct pointer to 'release_progmem()'.

Fixes: e01402b115ccc ("More AP / SP bits for the 34K, the Malta bits and things. Still wants")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/mips/kernel/vpe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/mips/kernel/vpe.c
+++ b/arch/mips/kernel/vpe.c
@@ -134,7 +134,7 @@ void release_vpe(struct vpe *v)
{
list_del(&v->list);
if (v->load_addr)
- release_progmem(v);
+ release_progmem(v->load_addr);
kfree(v);
}


since release_progmem() is kfree() it is also possible to drop "if (v->load_addr)"
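
i.e. (untested sketch, relying on kfree(NULL) being a no-op):

void release_vpe(struct vpe *v)
{
	list_del(&v->list);
	release_progmem(v->load_addr);
	kfree(v);
}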

jm2c

re,
wh

2020-03-04 22:15:39

by Christophe JAILLET

[permalink] [raw]
Subject: Re: AW: [PATCH 5.5 110/176] MIPS: VPE: Fix a double free and a memory leak in release_vpe()

On 04/03/2020 at 22:28, Walter Harms wrote:
> ________________________________________
> Von: [email protected] <[email protected]> im Auftrag von Greg Kroah-Hartman <[email protected]>
> Gesendet: Dienstag, 3. März 2020 18:42
> An: [email protected]
> Cc: Greg Kroah-Hartman; [email protected]; Christophe JAILLET; Paul Burton; [email protected]; [email protected]; [email protected]
> Betreff: [PATCH 5.5 110/176] MIPS: VPE: Fix a double free and a memory leak in release_vpe()
>
> From: Christophe JAILLET <[email protected]>
>
> commit bef8e2dfceed6daeb6ca3e8d33f9c9d43b926580 upstream.
>
> The pointer to the memory allocated by 'alloc_progmem()' is stored in
> 'v->load_addr', so that is the memory that should be freed by
> 'release_progmem()'.
>
> 'release_progmem()' is only a call to 'kfree()'.
>
> With the current code, there is both a double free and a memory leak.
> Fix it by passing the correct pointer to 'release_progmem()'.
>
> Fixes: e01402b115ccc ("More AP / SP bits for the 34K, the Malta bits and things. Still wants")
> Signed-off-by: Christophe JAILLET <[email protected]>
> Signed-off-by: Paul Burton <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> ---
> arch/mips/kernel/vpe.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- a/arch/mips/kernel/vpe.c
> +++ b/arch/mips/kernel/vpe.c
> @@ -134,7 +134,7 @@ void release_vpe(struct vpe *v)
> {
> list_del(&v->list);
> if (v->load_addr)
> - release_progmem(v);
> + release_progmem(v->load_addr);
> kfree(v);
> }
>
>
> since release_progmem() is kfree() it is also possible to drop "if (v->load_addr)"
>
> jm2c
>
> re,
> wh

Agreed.

My patch had the following comment after the patch description:
---
The 'if (v->load_addr)' looks also redundant, but, well, the code is old
and I feel lazy tonight to send another patch for only that.
---

git log shows nearly no updates since the end of 2015, so I kept my
proposal minimal :)

CJ