This is the start of the stable review cycle for the 5.8.3 release.
There are 232 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <[email protected]>
Linux 5.8.3-rc1
hersen wu <[email protected]>
drm/amd/display: dchubbub p-state warning during surface planes switch
Stylon Wang <[email protected]>
drm/amd/display: Fix dmesg warning from setting abm level
Sandeep Raghuraman <[email protected]>
drm/amdgpu: Fix bug where DPM is not enabled after hibernate and resume
Xin Xiong <[email protected]>
drm: fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
Marius Iacob <[email protected]>
drm: Added orientation quirk for ASUS tablet model T103HAF
Tomi Valkeinen <[email protected]>
drm/tidss: fix modeset init for DPI panels
Tomi Valkeinen <[email protected]>
drm/omap: force runtime PM suspend on system suspend
Imre Deak <[email protected]>
drm/dp_mst: Fix the DDC I2C device unregistration of an MST port
Imre Deak <[email protected]>
drm/dp_mst: Fix timeout handling of MST down messages
Imre Deak <[email protected]>
drm/dp_mst: Fix the DDC I2C device registration of an MST port
Denis Efremov <[email protected]>
drm/panfrost: Use kvfree() to free bo->sgts
Chris Wilson <[email protected]>
drm/i915/gt: Force the GT reset on shutdown
Denis Efremov <[email protected]>
drm/radeon: fix fb_div check in ni_init_smc_spll_table()
Geert Uytterhoeven <[email protected]>
sh: fault: Fix duplicate printing of "PC:"
Geert Uytterhoeven <[email protected]>
sh: landisk: Add missing initialization of sh_io_port_base
Zhang Rui <[email protected]>
perf/x86/rapl: Fix missing psys sysfs attributes
Daniel Díaz <[email protected]>
tools build feature: Quote CC and CXX for their arguments
Vincent Whitchurch <[email protected]>
perf bench mem: Always memset source before memcpy
Dinghao Liu <[email protected]>
ALSA: echoaudio: Fix potential Oops in snd_echo_resume()
Ondrej Mosnacek <[email protected]>
crypto: algif_aead - fix uninitialized ctx->init
Andy Shevchenko <[email protected]>
mfd: dln2: Run event handler loop under spinlock
Dhananjay Phadke <[email protected]>
i2c: iproc: fix race between client unreg and isr
Tiezhu Yang <[email protected]>
test_kmod: avoid potential double free in trigger_config_run_type()
Colin Ian King <[email protected]>
fs/ufs: avoid potential u32 multiplication overflow
Eric Biggers <[email protected]>
fs/minix: remove expected error message in block_to_path()
Eric Biggers <[email protected]>
fs/minix: fix block limit check for V1 filesystems
Eric Biggers <[email protected]>
fs/minix: set s_maxbytes correctly
Tiezhu Yang <[email protected]>
lib/test_lockup.c: fix return value of test_lockup_init()
Trond Myklebust <[email protected]>
NFS: Fix flexfiles read failover
Jeffrey Mitchell <[email protected]>
nfs: Fix getxattr kernel panic and memory overflow
Wang Hai <[email protected]>
net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
Krzysztof Kozlowski <[email protected]>
s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP
Wang Hai <[email protected]>
s390/test_unwind: fix possible memleak in test_unwind()
Dan Carpenter <[email protected]>
drm/vmwgfx: Fix two list_for_each loop exit tests
Dan Carpenter <[email protected]>
drm/vmwgfx: Use correct vmw_legacy_display_unit pointer
Dan Carpenter <[email protected]>
vdpa: Fix pointer math bug in vdpasim_get_config()
Christophe Leroy <[email protected]>
recordmcount: Fix build failure on non arm64
Michael S. Tsirkin <[email protected]>
vdpa_sim: init iommu lock
Andrii Nakryiko <[email protected]>
selftests/bpf: Fix silent Makefile output
Jin Yao <[email protected]>
perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
Colin Ian King <[email protected]>
Input: sentelic - fix error return when fsp_reg_write fails
Andrii Nakryiko <[email protected]>
selftests/bpf: Prevent runqslower from racing on building bpftool
Pawan Gupta <[email protected]>
x86/bugs/multihit: Fix mitigation reporting when VMX is not in use
Dilip Kota <[email protected]>
x86/tsr: Fix tsc frequency enumeration bug on Lightning Mountain SoC
Muchun Song <[email protected]>
kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE
Dan Carpenter <[email protected]>
md-cluster: Fix potential error pointer dereference in resize_bitmaps()
Tero Kristo <[email protected]>
watchdog: rti-wdt: balance pm runtime enable calls
Krzysztof Sobota <[email protected]>
watchdog: initialize device before misc_register
Scott Mayhew <[email protected]>
nfs: nfs_file_write() should check for writeback errors
Ewan D. Milne <[email protected]>
scsi: lpfc: nvmet: Avoid hang / use-after-free again when destroying targetport
Jin Yao <[email protected]>
perf evsel: Don't set sample_regs_intr/sample_regs_user for dummy event
Stafford Horne <[email protected]>
openrisc: Fix oops caused when dumping stack
Jane Chu <[email protected]>
libnvdimm/security: ensure sysfs poll thread woke up and fetch updated attr
Jane Chu <[email protected]>
libnvdimm/security: fix a typo
Nicolas Saenz Julienne <[email protected]>
clk: bcm2835: Do not use prediv with bcm2711's PLLs
Geert Uytterhoeven <[email protected]>
clk: hsdk: Fix bad dependency on IOMEM
Zhihao Cheng <[email protected]>
ubifs: Fix wrong orphan node deletion in ubifs_jnl_update|rename
Zhihao Cheng <[email protected]>
ubi: fastmap: Free fastmap next anchor peb during detach
Zhihao Cheng <[email protected]>
ubi: fastmap: Don't produce the initial next anchor PEB when fastmap is disabled
Scott Mayhew <[email protected]>
nfs: ensure correct writeback errors are returned on close()
Wolfram Sang <[email protected]>
i2c: rcar: avoid race when unregistering slave
Thomas Hebb <[email protected]>
tools build feature: Use CC and CXX from parent
Jiri Olsa <[email protected]>
perf tools: Fix term parsing for raw syntax
Rayagonda Kokatanur <[email protected]>
pwm: bcm-iproc: handle clk_get_rate() return
Qais Yousef <[email protected]>
sched/uclamp: Fix a deadlock when enabling uclamp static key
Sagi Grimberg <[email protected]>
nvme: fix deadlock in disconnect during scan_work and/or ana_work
Xu Wang <[email protected]>
clk: clk-atlas6: fix return value check in atlas6_clk_init()
Konrad Dybcio <[email protected]>
clk: qcom: gcc-sdm660: Fix up gcc_mss_mnoc_bimc_axi_clk
Wei Hu <[email protected]>
PCI: hv: Fix a timing issue which causes kdump to fail occasionally
Chao Yu <[email protected]>
f2fs: compress: fix to update isize when overwriting compressed file
Wolfram Sang <[email protected]>
i2c: rcar: slave: only send STOP event when we have been addressed
Jacob Pan <[email protected]>
iommu/vt-d: Disable multiple GPASID-dev bind
Jacob Pan <[email protected]>
iommu/vt-d: Warn on out-of-range invalidation address
Liu Yi L <[email protected]>
iommu/vt-d: Enforce PASID devTLB field mask
Liu Yi L <[email protected]>
iommu/vt-d: Handle non-page aligned address
Jonathan Marek <[email protected]>
clk: qcom: clk-alpha-pll: remove unused/incorrect PLL_CAL_VAL
Jonathan Marek <[email protected]>
clk: qcom: gcc: fix sm8150 GPU and NPU clocks
Colin Ian King <[email protected]>
iommu/omap: Check for failure of a call to omap_iommu_dump_ctx
Aneesh Kumar K.V <[email protected]>
selftests/powerpc: ptrace-pkey: Don't update expected UAMOR value
Aneesh Kumar K.V <[email protected]>
selftests/powerpc: ptrace-pkey: Update the test to mark an invalid pkey correctly
Aneesh Kumar K.V <[email protected]>
selftests/powerpc: ptrace-pkey: Rename variables to make it easier to follow code
Cristian Ciocaltea <[email protected]>
clk: actions: Fix h_clk for Actions S500 SoC
Chao Yu <[email protected]>
f2fs: compress: fix to avoid memory leak on cc->cpages
Tyler Hicks <[email protected]>
ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
Ming Lei <[email protected]>
dm rq: don't call blk_mq_queue_stopped() in dm_stop_queue()
Steve Longerbeam <[email protected]>
gpu: ipu-v3: image-convert: Wait for all EOFs before completing a tile
Steve Longerbeam <[email protected]>
gpu: ipu-v3: image-convert: Combine rotate/no-rotate irq handlers
Herbert Xu <[email protected]>
crypto: caam - Remove broken arc4 support
Sudeep Holla <[email protected]>
rtc: pl031: fix set_alarm by adding back call to alarm_irq_enable
Yan-Hsuan Chuang <[email protected]>
rtw88: pci: disable aspm for platform inter-op with module parameter
Yoshihiro Shimoda <[email protected]>
mmc: renesas_sdhi_internal_dmac: clean up the code for dma complete
Mark Zhang <[email protected]>
RDMA/counter: Allow manually bind QPs with different pids to same counter
Mark Zhang <[email protected]>
RDMA/counter: Only bind user QPs in auto mode
Vladimir Oltean <[email protected]>
devres: keep both device name and resource name in pretty name
Herbert Xu <[email protected]>
crypto: af_alg - Fix regression on empty requests
Johan Hovold <[email protected]>
USB: serial: ftdi_sio: fix break and sysrq handling
Johan Hovold <[email protected]>
USB: serial: ftdi_sio: clean up receive processing
Johan Hovold <[email protected]>
USB: serial: ftdi_sio: make process-packet buffer unsigned
Jesper Dangaard Brouer <[email protected]>
selftests/bpf: test_progs avoid minus shell exit codes
Jesper Dangaard Brouer <[email protected]>
selftests/bpf: test_progs use another shell exit on non-actions
Martin KaFai Lau <[email protected]>
bpf: selftests: Restore netns after each test
Jesper Dangaard Brouer <[email protected]>
selftests/bpf: Test_progs indicate to shell on non-actions
Qais Yousef <[email protected]>
sched/uclamp: Protect uclamp fast path code with static key
Yishai Hadas <[email protected]>
IB/uverbs: Set IOVA on IB MR in uverbs layer
Paul Kocialkowski <[email protected]>
media: rockchip: rga: Only set output CSC mode for RGB input
Paul Kocialkowski <[email protected]>
media: rockchip: rga: Introduce color fmt macros and refactor CSC mode logic
Dafna Hirschfeld <[email protected]>
media: staging: rkisp1: rsz: set default format if the given format is not RKISP1_ISP_SD_SRC
Dafna Hirschfeld <[email protected]>
media: staging: rkisp1: rename macros 'RKISP1_DIR_*' to 'RKISP1_ISP_SD_*'
Dafna Hirschfeld <[email protected]>
media: staging: rkisp1: remove macro RKISP1_DIR_SINK_SRC
Sebastian Reichel <[email protected]>
rtc: cpcap: fix range
Jason Gunthorpe <[email protected]>
RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()
Kamal Heib <[email protected]>
RDMA/ipoib: Return void from ipoib_ib_dev_stop()
Chen Tao <[email protected]>
drm/amdgpu/debugfs: fix memory leak when pm_runtime_get_sync failed
Qiushi Wu <[email protected]>
platform/chrome: cros_ec_ishtp: Fix a double-unlock issue
Kamal Dasu <[email protected]>
mtd: rawnand: brcmnand: ECC error handling on EDU transfers
Boris Brezillon <[email protected]>
mtd: rawnand: fsl_upm: Remove unused mtd var
Eric Dumazet <[email protected]>
octeontx2-af: change (struct qmem)->entry_sz from u8 to u16
Charles Keepax <[email protected]>
mfd: arizona: Ensure 32k clock is put on driver unbind and error
Herbert Xu <[email protected]>
crypto: algif_aead - Only wake up when ctx->more is zero
Paul Cercueil <[email protected]>
drm/ingenic: Fix incorrect assumption about plane->index
Liu Ying <[email protected]>
drm/imx: imx-ldb: Disable both channels for split mode in enc->disable()
Dan Williams <[email protected]>
libnvdimm: Validate command family indices
Sibi Sankar <[email protected]>
remoteproc: qcom_q6v5_mss: Validate modem blob firmware size before load
Sibi Sankar <[email protected]>
remoteproc: qcom_q6v5_mss: Validate MBA firmware size before load
Sibi Sankar <[email protected]>
remoteproc: qcom: q6v5: Update running state before requesting stop
Bob Peterson <[email protected]>
gfs2: Never call gfs2_block_zero_range with an open transaction
Andreas Gruenbacher <[email protected]>
gfs2: Fix refcount leak in gfs2_glock_poke
Adrian Hunter <[email protected]>
perf intel-pt: Fix duplicate branch after CBR
Adrian Hunter <[email protected]>
perf intel-pt: Fix FUP packet state
Masami Hiramatsu <[email protected]>
perf probe: Fix memory leakage when the probe point is not found
Masami Hiramatsu <[email protected]>
perf probe: Fix wrong variable warning when the probe point is not found
Masami Hiramatsu <[email protected]>
bootconfig: Fix to find the initargs correctly
Kees Cook <[email protected]>
module: Correctly truncate sysfs sections output
Johannes Thumshirn <[email protected]>
dm: don't call report zones for more than the user requested
John Dorminy <[email protected]>
dm ebs: Fix incorrect checking for REQ_OP_FLUSH
Anton Blanchard <[email protected]>
pseries: Fix 64 bit logical memory block panic
Jeff Layton <[email protected]>
ceph: handle zero-length feature mask in session messages
Jeff Layton <[email protected]>
ceph: set sec_context xattr on symlink creation
Ahmad Fatoum <[email protected]>
watchdog: f71808e_wdt: clear watchdog timeout occurred flag
Ahmad Fatoum <[email protected]>
watchdog: f71808e_wdt: remove use of wrong watchdog_info option
Ahmad Fatoum <[email protected]>
watchdog: f71808e_wdt: indicate WDIOF_CARDRESET support in watchdog_info.options
Steven Rostedt (VMware) <[email protected]>
tracing: Use trace_sched_process_free() instead of exit() for pid tracing
Kevin Hao <[email protected]>
tracing/hwlat: Honor the tracing_cpumask
Muchun Song <[email protected]>
kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
Chengming Zhou <[email protected]>
ftrace: Setup correct FTRACE_FL_REGS flags for module
Jia He <[email protected]>
mm/memory_hotplug: fix unpaired mem_hotplug_begin/done
Mike Kravetz <[email protected]>
cma: don't quit at first error when activating reserved areas
Michal Koutný <[email protected]>
mm/page_counter.c: fix protection usage propagation
Junxiao Bi <[email protected]>
ocfs2: change slot number type s16 to u16
Peter Zijlstra <[email protected]>
mm: fix kthread_use_mm() vs TLB invalidate
David Hildenbrand <[email protected]>
mm/shuffle: don't move pages between zones and don't read garbage memmaps
Mike Kravetz <[email protected]>
hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem
Hugh Dickins <[email protected]>
khugepaged: retract_page_tables() remember to test exit
Hugh Dickins <[email protected]>
khugepaged: collapse_pte_mapped_thp() protect the pmd lock
Peter Xu <[email protected]>
mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
Hugh Dickins <[email protected]>
khugepaged: collapse_pte_mapped_thp() flush the right range
Mikulas Patocka <[email protected]>
ext2: fix missing percpu_counter_inc
Mike Rapoport <[email protected]>
MIPS: SGI-IP27: always enable NUMA in Kconfig
Paul Cercueil <[email protected]>
MIPS: qi_lb60: Fix routing to audio amplifier
Huacai Chen <[email protected]>
MIPS: CPU#0 is not hotpluggable
Lukas Wunner <[email protected]>
driver core: Avoid binding drivers to dead devices
Vincent Duvert <[email protected]>
appletalk: Fix atalk_proc_init() return path
Johannes Berg <[email protected]>
mac80211: fix misplaced while instead of if
Coly Li <[email protected]>
bcache: use disk_{start,end}_io_acct() to count I/O for bcache device
Coly Li <[email protected]>
bcache: fix bio_{start,end}_io_acct with proper device
Coly Li <[email protected]>
bcache: avoid nr_stripes overflow in bcache_device_init()
Coly Li <[email protected]>
bcache: fix overflow in offset_to_stripe()
Coly Li <[email protected]>
bcache: allocate meta data pages as compound pages
ChangSyun Peng <[email protected]>
md/raid5: Fix Force reconstruct-write io stuck in degraded raid5
Kees Cook <[email protected]>
selftests/seccomp: Set NNP for TSYNC ESRCH flag test
Kees Cook <[email protected]>
net/compat: Add missing sock updates for SCM_RIGHTS
Kees Cook <[email protected]>
pidfd: Add missing sock updates for pidfd_getfd()
Zenghui Yu <[email protected]>
irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR
Huacai Chen <[email protected]>
irqchip/loongson-liointc: Fix misuse of gc->mask_cache
Jonathan McDowell <[email protected]>
net: stmmac: dwmac1000: provide multicast filter fallback
Jonathan McDowell <[email protected]>
net: ethernet: stmmac: Disable hardware multicast filter
Eugeniu Rosca <[email protected]>
media: vsp1: dl: Fix NULL pointer dereference on unbind
Mansur Alisha Shaik <[email protected]>
media: venus: fix multiple encoder crash
Paul Cercueil <[email protected]>
pinctrl: ingenic: Properly detect GPIO direction when configured for IRQ
Paul Cercueil <[email protected]>
pinctrl: ingenic: Enhance support for IRQ_TYPE_EDGE_BOTH
Michael Ellerman <[email protected]>
powerpc: Fix circular dependency between percpu.h and mmu.h
Michael Ellerman <[email protected]>
powerpc: Allow 4224 bytes of stack expansion for the signal frame
Christophe Leroy <[email protected]>
powerpc/ptdump: Fix build failure in hashpagetable.c
Paul Aurich <[email protected]>
cifs: Fix leak when handling lease break for cached root fid
Max Filippov <[email protected]>
xtensa: fix xtensa_pmu_setup prototype
Max Filippov <[email protected]>
xtensa: add missing exclusive access state management
Lorenzo Bianconi <[email protected]>
iio: imu: st_lsm6dsx: reset hw ts after resume
Alexandru Ardelean <[email protected]>
iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
Christian Eggers <[email protected]>
dt-bindings: iio: io-channel-mux: Fix compatible string in example code
Shaokun Zhang <[email protected]>
arm64: perf: Correct the event index in sysfs
Sibi Sankar <[email protected]>
arm64: dts: qcom: sc7180: Drop the unused non-MSA SID
Boleyn Su <[email protected]>
btrfs: check correct variable after allocation in btrfs_backref_iter_alloc
Pavel Machek <[email protected]>
btrfs: fix return value mixup in btrfs_get_extent
Josef Bacik <[email protected]>
btrfs: make sure SB_I_VERSION doesn't get unset by remount
Qu Wenruo <[email protected]>
btrfs: trim: fix underflow in trim length to prevent access beyond device boundary
Filipe Manana <[email protected]>
btrfs: fix memory leaks after failure to lookup checksums during inode logging
Qu Wenruo <[email protected]>
btrfs: inode: fix NULL pointer dereference if inode doesn't need compression
Josef Bacik <[email protected]>
btrfs: only search for left_info if there is no right_info in try_merge_free_space
David Sterba <[email protected]>
btrfs: fix messages after changing compression level by remount
Josef Bacik <[email protected]>
btrfs: don't show full path of bind mounts in subvol=
Filipe Manana <[email protected]>
btrfs: fix race between page release and a fast fsync
Josef Bacik <[email protected]>
btrfs: don't WARN if we abort a transaction with EROFS
Josef Bacik <[email protected]>
btrfs: sysfs: use NOFS for device creation
Josef Bacik <[email protected]>
btrfs: return EROFS for BTRFS_FS_STATE_ERROR cases
Qu Wenruo <[email protected]>
btrfs: avoid possible signal interruption of btrfs_drop_snapshot() on relocation tree
David Sterba <[email protected]>
btrfs: add missing check for nocow and compression inode flags
Qu Wenruo <[email protected]>
btrfs: relocation: review the call sites which can be interrupted by signal
Josef Bacik <[email protected]>
btrfs: move the chunk_mutex in btrfs_read_chunk_tree
Josef Bacik <[email protected]>
btrfs: open device without device_list_mutex
Johannes Thumshirn <[email protected]>
btrfs: pass checksum type via BTRFS_IOC_FS_INFO ioctl
Anand Jain <[email protected]>
btrfs: don't traverse into the seed devices in show_devname
Filipe Manana <[email protected]>
btrfs: remove no longer needed use of log_writers for the log root tree
Filipe Manana <[email protected]>
btrfs: only commit delayed items at fsync if we are logging a directory
Filipe Manana <[email protected]>
btrfs: stop incremening log_batch for the log root tree when syncing log
Filipe Manana <[email protected]>
btrfs: only commit the delayed inode when doing a full fsync
Tom Rix <[email protected]>
btrfs: ref-verify: fix memory leak in add_block_entry
Qu Wenruo <[email protected]>
btrfs: preallocate anon block device at first phase of snapshot creation
Qu Wenruo <[email protected]>
btrfs: don't allocate anonymous block device for user invisible roots
Qu Wenruo <[email protected]>
btrfs: free anon block device right after subvolume deletion
David Sterba <[email protected]>
btrfs: allow use of global block reserve for balance item deletion
Ansuel Smith <[email protected]>
PCI: qcom: Add support for tx term offset for rev 2.1.0
Ansuel Smith <[email protected]>
PCI: qcom: Define some PARF params needed for ipq8064 SoC
Rajat Jain <[email protected]>
PCI: Add device even if driver attach failed
Kai-Heng Feng <[email protected]>
PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
Ashok Raj <[email protected]>
PCI/ATS: Add pci_pri_supported() to check device or associated PF
Rafael J. Wysocki <[email protected]>
PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context()
Guenter Roeck <[email protected]>
genirq/PM: Always unlock IRQ descriptor in rearm_wake_irq()
Guenter Roeck <[email protected]>
genirq: Unlock irq descriptor after errors
Thomas Gleixner <[email protected]>
genirq/affinity: Make affinity setting if activated opt-in
Steve French <[email protected]>
SMB3: Fix mkdir when idsfromsid configured on mount
Steve French <[email protected]>
smb3: warn on confusing error scenario with sec=krb5
Takashi Iwai <[email protected]>
ALSA: hda/realtek - Fix unused variable warning
-------------
Diffstat:
Documentation/admin-guide/hw-vuln/multihit.rst | 4 +
.../bindings/iio/multiplexer/io-channel-mux.txt | 2 +-
Makefile | 4 +-
arch/arm64/boot/dts/qcom/sc7180-idp.dts | 2 +-
arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi | 2 +-
arch/arm64/kernel/perf_event.c | 13 +-
arch/mips/Kconfig | 1 +
arch/mips/boot/dts/ingenic/qi_lb60.dts | 2 +-
arch/mips/kernel/topology.c | 2 +-
arch/openrisc/kernel/stacktrace.c | 18 ++-
arch/powerpc/include/asm/percpu.h | 4 +-
arch/powerpc/mm/fault.c | 7 +-
arch/powerpc/mm/ptdump/hashpagetable.c | 2 +-
arch/powerpc/platforms/pseries/hotplug-memory.c | 2 +-
arch/s390/Kconfig | 1 +
arch/s390/lib/test_unwind.c | 1 +
arch/sh/boards/mach-landisk/setup.c | 3 +
arch/sh/mm/fault.c | 1 -
arch/x86/events/rapl.c | 2 +-
arch/x86/kernel/apic/vector.c | 4 +
arch/x86/kernel/cpu/bugs.c | 8 +-
arch/x86/kernel/tsc_msr.c | 9 +-
arch/xtensa/include/asm/thread_info.h | 4 +
arch/xtensa/kernel/asm-offsets.c | 3 +
arch/xtensa/kernel/entry.S | 11 ++
arch/xtensa/kernel/perf_event.c | 2 +-
crypto/af_alg.c | 11 +-
crypto/algif_aead.c | 10 +-
crypto/algif_skcipher.c | 11 +-
drivers/acpi/nfit/core.c | 11 +-
drivers/acpi/nfit/nfit.h | 1 -
drivers/base/dd.c | 4 +-
drivers/clk/Kconfig | 2 +-
drivers/clk/actions/owl-s500.c | 2 +-
drivers/clk/bcm/clk-bcm2835.c | 25 +++-
drivers/clk/qcom/clk-alpha-pll.c | 2 -
drivers/clk/qcom/gcc-sdm660.c | 3 +
drivers/clk/qcom/gcc-sm8150.c | 8 +-
drivers/clk/sirf/clk-atlas6.c | 2 +-
drivers/crypto/caam/caamalg.c | 29 -----
drivers/crypto/caam/compat.h | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 11 +-
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 ++++
.../drm/amd/display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 69 +++++++++-
drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c | 5 +-
drivers/gpu/drm/drm_dp_mst_topology.c | 41 +++---
drivers/gpu/drm/drm_panel_orientation_quirks.c | 6 +
drivers/gpu/drm/i915/gt/intel_gt.c | 5 +
drivers/gpu/drm/imx/imx-ldb.c | 7 +-
drivers/gpu/drm/ingenic/ingenic-drm.c | 2 +-
drivers/gpu/drm/omapdrm/dss/dispc.c | 1 +
drivers/gpu/drm/omapdrm/dss/dsi.c | 1 +
drivers/gpu/drm/omapdrm/dss/dss.c | 1 +
drivers/gpu/drm/omapdrm/dss/venc.c | 1 +
drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
drivers/gpu/drm/radeon/ni_dpm.c | 2 +-
drivers/gpu/drm/tidss/tidss_kms.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 8 +-
drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c | 5 +-
drivers/gpu/ipu-v3/ipu-image-convert.c | 145 +++++++++++++--------
drivers/i2c/busses/i2c-bcm-iproc.c | 13 +-
drivers/i2c/busses/i2c-rcar.c | 15 ++-
drivers/iio/dac/ad5592r-base.c | 4 +-
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h | 3 +-
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 23 ++--
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 2 +-
drivers/infiniband/core/counters.c | 4 +-
drivers/infiniband/core/uverbs_cmd.c | 4 +
drivers/infiniband/hw/cxgb4/mem.c | 1 -
drivers/infiniband/hw/mlx4/mr.c | 1 -
drivers/infiniband/ulp/ipoib/ipoib.h | 2 +-
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 67 +++++-----
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +
drivers/input/mouse/sentelic.c | 2 +-
drivers/iommu/intel/dmar.c | 21 ++-
drivers/iommu/intel/iommu.c | 7 +-
drivers/iommu/intel/svm.c | 22 ++--
drivers/iommu/omap-iommu-debug.c | 3 +
drivers/irqchip/irq-gic-v3-its.c | 15 ++-
drivers/irqchip/irq-loongson-liointc.c | 10 +-
drivers/md/bcache/bcache.h | 2 +-
drivers/md/bcache/bset.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/bcache/journal.c | 4 +-
drivers/md/bcache/request.c | 14 +-
drivers/md/bcache/super.c | 14 +-
drivers/md/bcache/writeback.c | 14 +-
drivers/md/bcache/writeback.h | 19 ++-
drivers/md/dm-ebs-target.c | 2 +-
drivers/md/dm-rq.c | 3 -
drivers/md/dm.c | 3 +-
drivers/md/md-cluster.c | 1 +
drivers/md/raid5.c | 3 +-
drivers/media/platform/qcom/venus/pm_helpers.c | 4 +
drivers/media/platform/rockchip/rga/rga-hw.c | 29 +++--
drivers/media/platform/rockchip/rga/rga-hw.h | 5 +
drivers/media/platform/vsp1/vsp1_dl.c | 2 +
drivers/mfd/arizona-core.c | 18 +++
drivers/mfd/dln2.c | 4 +
drivers/mmc/host/renesas_sdhi_internal_dmac.c | 18 ++-
drivers/mtd/nand/raw/brcmnand/brcmnand.c | 26 ++++
drivers/mtd/nand/raw/fsl_upm.c | 1 -
drivers/mtd/ubi/fastmap-wl.c | 5 +
drivers/mtd/ubi/wl.c | 3 +-
drivers/net/ethernet/marvell/octeontx2/af/common.h | 2 +-
drivers/net/ethernet/qualcomm/emac/emac.c | 17 ++-
.../net/ethernet/stmicro/stmmac/dwmac-ipq806x.c | 1 +
.../net/ethernet/stmicro/stmmac/dwmac1000_core.c | 3 +
drivers/net/wireless/realtek/rtw88/pci.c | 9 ++
drivers/nvdimm/bus.c | 16 +++
drivers/nvdimm/security.c | 13 +-
drivers/nvme/host/core.c | 15 +++
drivers/nvme/host/fabrics.c | 2 +-
drivers/nvme/host/fabrics.h | 3 +-
drivers/nvme/host/fc.c | 1 +
drivers/nvme/host/multipath.c | 18 ++-
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/rdma.c | 10 +-
drivers/nvme/host/tcp.c | 15 ++-
drivers/pci/ats.c | 15 +++
drivers/pci/bus.c | 6 +-
drivers/pci/controller/dwc/pcie-qcom.c | 41 +++++-
drivers/pci/controller/pci-hyperv.c | 71 +++++-----
drivers/pci/hotplug/acpiphp_glue.c | 14 +-
drivers/pci/quirks.c | 5 +-
drivers/pinctrl/pinctrl-ingenic.c | 9 +-
drivers/platform/chrome/cros_ec_ishtp.c | 4 +-
drivers/pwm/pwm-bcm-iproc.c | 9 +-
drivers/remoteproc/qcom_q6v5.c | 2 +
drivers/remoteproc/qcom_q6v5_mss.c | 11 +-
drivers/rtc/rtc-cpcap.c | 2 +-
drivers/rtc/rtc-pl031.c | 1 +
drivers/scsi/lpfc/lpfc_nvmet.c | 2 +-
drivers/staging/media/rkisp1/rkisp1-common.h | 3 +
drivers/staging/media/rkisp1/rkisp1-isp.c | 46 +++----
drivers/staging/media/rkisp1/rkisp1-resizer.c | 2 +-
drivers/usb/serial/ftdi_sio.c | 57 ++++----
drivers/vdpa/vdpa_sim/vdpa_sim.c | 3 +-
drivers/watchdog/f71808e_wdt.c | 13 +-
drivers/watchdog/rti_wdt.c | 2 +
drivers/watchdog/watchdog_dev.c | 18 +--
fs/btrfs/backref.c | 2 +-
fs/btrfs/ctree.h | 4 +-
fs/btrfs/disk-io.c | 78 ++++++++++-
fs/btrfs/disk-io.h | 2 +
fs/btrfs/extent-io-tree.h | 2 +
fs/btrfs/extent-tree.c | 23 +++-
fs/btrfs/extent_io.c | 18 ++-
fs/btrfs/free-space-cache.c | 4 +-
fs/btrfs/inode.c | 20 ++-
fs/btrfs/ioctl.c | 67 ++++++++--
fs/btrfs/ref-verify.c | 2 +
fs/btrfs/relocation.c | 12 +-
fs/btrfs/scrub.c | 2 +-
fs/btrfs/super.c | 51 +++++---
fs/btrfs/sysfs.c | 3 +
fs/btrfs/transaction.c | 7 +-
fs/btrfs/transaction.h | 2 +
fs/btrfs/tree-log.c | 43 ++----
fs/btrfs/volumes.c | 48 ++++++-
fs/ceph/dir.c | 4 +
fs/ceph/mds_client.c | 6 +-
fs/cifs/smb2inode.c | 1 +
fs/cifs/smb2misc.c | 73 ++++++++---
fs/cifs/smb2pdu.c | 2 +
fs/ext2/ialloc.c | 3 +-
fs/f2fs/compress.c | 2 +
fs/f2fs/data.c | 4 +
fs/gfs2/bmap.c | 69 +++++-----
fs/gfs2/glock.c | 4 +-
fs/minix/inode.c | 12 +-
fs/minix/itree_v1.c | 12 +-
fs/minix/itree_v2.c | 13 +-
fs/minix/minix.h | 1 -
fs/nfs/file.c | 17 ++-
fs/nfs/flexfilelayout/flexfilelayout.c | 50 +++++--
fs/nfs/nfs4file.c | 5 +-
fs/nfs/nfs4proc.c | 2 -
fs/nfs/nfs4xdr.c | 6 +-
fs/nfs/pnfs.c | 4 +-
fs/nfs/pnfs.h | 2 +-
fs/ocfs2/ocfs2.h | 4 +-
fs/ocfs2/suballoc.c | 4 +-
fs/ocfs2/super.c | 4 +-
fs/ubifs/journal.c | 10 +-
fs/ufs/super.c | 2 +-
include/crypto/if_alg.h | 4 +-
include/linux/fs.h | 10 ++
include/linux/hugetlb.h | 8 +-
include/linux/intel-iommu.h | 4 +-
include/linux/irq.h | 13 ++
include/linux/libnvdimm.h | 2 +
include/linux/pci-ats.h | 4 +
include/net/sock.h | 4 +
include/uapi/linux/btrfs.h | 14 +-
include/uapi/linux/ndctl.h | 4 +
init/main.c | 14 +-
kernel/irq/manage.c | 13 +-
kernel/irq/pm.c | 8 +-
kernel/kprobes.c | 24 +++-
kernel/kthread.c | 7 +-
kernel/module.c | 22 +++-
kernel/pid.c | 7 +-
kernel/sched/core.c | 81 +++++++++++-
kernel/sched/cpufreq_schedutil.c | 2 +-
kernel/sched/sched.h | 47 ++++++-
kernel/trace/ftrace.c | 15 ++-
kernel/trace/trace_events.c | 4 +-
kernel/trace/trace_hwlat.c | 5 +-
lib/devres.c | 11 +-
lib/test_kmod.c | 2 +-
lib/test_lockup.c | 4 +-
mm/cma.c | 23 ++--
mm/hugetlb.c | 39 +++---
mm/khugepaged.c | 70 +++++-----
mm/memory_hotplug.c | 5 +-
mm/page_counter.c | 6 +-
mm/rmap.c | 2 +-
mm/shuffle.c | 18 +--
net/appletalk/atalk_proc.c | 2 +
net/compat.c | 1 +
net/core/sock.c | 21 +++
net/mac80211/sta_info.c | 2 +-
scripts/recordmcount.c | 2 +
security/integrity/ima/ima_policy.c | 15 ++-
sound/pci/echoaudio/echoaudio.c | 2 -
sound/pci/hda/patch_realtek.c | 2 -
tools/build/Makefile.feature | 2 +-
tools/build/feature/Makefile | 2 -
tools/perf/bench/mem-functions.c | 21 +--
tools/perf/builtin-record.c | 4 +-
tools/perf/tests/parse-events.c | 37 +++++-
tools/perf/util/evsel.c | 6 +-
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 29 ++---
tools/perf/util/parse-events.c | 28 ++++
tools/perf/util/parse-events.h | 2 +
tools/perf/util/parse-events.l | 19 +--
tools/perf/util/probe-finder.c | 5 +-
tools/testing/selftests/bpf/Makefile | 49 +++----
tools/testing/selftests/bpf/test_progs.c | 33 ++++-
tools/testing/selftests/bpf/test_progs.h | 2 +
.../testing/selftests/powerpc/ptrace/ptrace-pkey.c | 55 ++++----
tools/testing/selftests/seccomp/seccomp_bpf.c | 5 +
244 files changed, 2095 insertions(+), 909 deletions(-)
From: Max Filippov <[email protected]>
commit a0fc1436f1f4f84e93144480bf30e0c958d135b6 upstream.
The result of the s32ex opcode is recorded in the ATOMCTL special
register and must be retrieved with the getex opcode. Context switch
between s32ex and getex may trash the ATOMCTL register and result in
duplicate update or missing update of the atomic variable.
Add atomctl8 field to the struct thread_info and use getex to swap
ATOMCTL bit 8 as a part of context switch.
Clear exclusive access monitor on kernel entry.
Cc: [email protected]
Fixes: f7c34874f04a ("xtensa: add exclusive atomics support")
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/xtensa/include/asm/thread_info.h | 4 ++++
arch/xtensa/kernel/asm-offsets.c | 3 +++
arch/xtensa/kernel/entry.S | 11 +++++++++++
3 files changed, 18 insertions(+)
--- a/arch/xtensa/include/asm/thread_info.h
+++ b/arch/xtensa/include/asm/thread_info.h
@@ -55,6 +55,10 @@ struct thread_info {
mm_segment_t addr_limit; /* thread address space */
unsigned long cpenable;
+#if XCHAL_HAVE_EXCLUSIVE
+ /* result of the most recent exclusive store */
+ unsigned long atomctl8;
+#endif
/* Allocate storage for extra user states and coprocessor states. */
#if XTENSA_HAVE_COPROCESSORS
--- a/arch/xtensa/kernel/asm-offsets.c
+++ b/arch/xtensa/kernel/asm-offsets.c
@@ -93,6 +93,9 @@ int main(void)
DEFINE(THREAD_RA, offsetof (struct task_struct, thread.ra));
DEFINE(THREAD_SP, offsetof (struct task_struct, thread.sp));
DEFINE(THREAD_CPENABLE, offsetof (struct thread_info, cpenable));
+#if XCHAL_HAVE_EXCLUSIVE
+ DEFINE(THREAD_ATOMCTL8, offsetof (struct thread_info, atomctl8));
+#endif
#if XTENSA_HAVE_COPROCESSORS
DEFINE(THREAD_XTREGS_CP0, offsetof(struct thread_info, xtregs_cp.cp0));
DEFINE(THREAD_XTREGS_CP1, offsetof(struct thread_info, xtregs_cp.cp1));
--- a/arch/xtensa/kernel/entry.S
+++ b/arch/xtensa/kernel/entry.S
@@ -374,6 +374,11 @@ common_exception:
s32i a2, a1, PT_LCOUNT
#endif
+#if XCHAL_HAVE_EXCLUSIVE
+ /* Clear exclusive access monitor set by interrupted code */
+ clrex
+#endif
+
/* It is now save to restore the EXC_TABLE_FIXUP variable. */
rsr a2, exccause
@@ -2020,6 +2025,12 @@ ENTRY(_switch_to)
s32i a3, a4, THREAD_CPENABLE
#endif
+#if XCHAL_HAVE_EXCLUSIVE
+ l32i a3, a5, THREAD_ATOMCTL8
+ getex a3
+ s32i a3, a4, THREAD_ATOMCTL8
+#endif
+
/* Flush register file. */
spill_registers_kernel
From: Steve French <[email protected]>
commit 0a018944eee913962bce8ffebbb121960d5125d9 upstream.
When mounting with Kerberos, users have been confused about the
default error returned in scenarios in which either keyutils is
not installed or the user did not properly acquire a krb5 ticket.
Log a warning message in the case that "ENOKEY" is returned
from the get_spnego_key upcall so that users can better understand
why mount failed in those two cases.
CC: Stable <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/cifs/smb2pdu.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -1387,6 +1387,8 @@ SMB2_auth_kerberos(struct SMB2_sess_data
spnego_key = cifs_get_spnego_key(ses);
if (IS_ERR(spnego_key)) {
rc = PTR_ERR(spnego_key);
+ if (rc == -ENOKEY)
+ cifs_dbg(VFS, "Verify user has a krb5 ticket and keyutils is installed\n");
spnego_key = NULL;
goto out;
}
From: Qu Wenruo <[email protected]>
commit 44d354abf33e92a5e73b965c84caf5a5d5e58a0b upstream.
Since most metadata reservation calls can return -EINTR when get
interrupted by fatal signal, we need to review the all the metadata
reservation call sites.
In relocation code, the metadata reservation happens in the following
sites:
- btrfs_block_rsv_refill() in merge_reloc_root()
merge_reloc_root() is a pretty critical section, we don't want to be
interrupted by signal, so change the flush status to
BTRFS_RESERVE_FLUSH_LIMIT, so it won't get interrupted by signal.
Since such change can be ENPSPC-prone, also shrink the amount of
metadata to reserve least amount avoid deadly ENOSPC there.
- btrfs_block_rsv_refill() in reserve_metadata_space()
It calls with BTRFS_RESERVE_FLUSH_LIMIT, which won't get interrupted
by signal.
- btrfs_block_rsv_refill() in prepare_to_relocate()
- btrfs_block_rsv_add() in prepare_to_relocate()
- btrfs_block_rsv_refill() in relocate_block_group()
- btrfs_delalloc_reserve_metadata() in relocate_file_extent_cluster()
- btrfs_start_transaction() in relocate_block_group()
- btrfs_start_transaction() in create_reloc_inode()
Can be interrupted by fatal signal and we can handle it easily.
For these call sites, just catch the -EINTR value in btrfs_balance()
and count them as canceled.
CC: [email protected] # 5.4+
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/relocation.c | 12 ++++++++++--
fs/btrfs/volumes.c | 17 ++++++++++++++++-
2 files changed, 26 insertions(+), 3 deletions(-)
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1686,12 +1686,20 @@ static noinline_for_stack int merge_relo
btrfs_unlock_up_safe(path, 0);
}
- min_reserved = fs_info->nodesize * (BTRFS_MAX_LEVEL - 1) * 2;
+ /*
+ * In merge_reloc_root(), we modify the upper level pointer to swap the
+ * tree blocks between reloc tree and subvolume tree. Thus for tree
+ * block COW, we COW at most from level 1 to root level for each tree.
+ *
+ * Thus the needed metadata size is at most root_level * nodesize,
+ * and * 2 since we have two trees to COW.
+ */
+ min_reserved = fs_info->nodesize * btrfs_root_level(root_item) * 2;
memset(&next_key, 0, sizeof(next_key));
while (1) {
ret = btrfs_block_rsv_refill(root, rc->block_rsv, min_reserved,
- BTRFS_RESERVE_FLUSH_ALL);
+ BTRFS_RESERVE_FLUSH_LIMIT);
if (ret) {
err = ret;
goto out;
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4150,7 +4150,22 @@ int btrfs_balance(struct btrfs_fs_info *
mutex_lock(&fs_info->balance_mutex);
if (ret == -ECANCELED && atomic_read(&fs_info->balance_pause_req))
btrfs_info(fs_info, "balance: paused");
- else if (ret == -ECANCELED && atomic_read(&fs_info->balance_cancel_req))
+ /*
+ * Balance can be canceled by:
+ *
+ * - Regular cancel request
+ * Then ret == -ECANCELED and balance_cancel_req > 0
+ *
+ * - Fatal signal to "btrfs" process
+ * Either the signal caught by wait_reserve_ticket() and callers
+ * got -EINTR, or caught by btrfs_should_cancel_balance() and
+ * got -ECANCELED.
+ * Either way, in this case balance_cancel_req = 0, and
+ * ret == -EINTR or ret == -ECANCELED.
+ *
+ * So here we only check the return value to catch canceled balance.
+ */
+ else if (ret == -ECANCELED || ret == -EINTR)
btrfs_info(fs_info, "balance: canceled");
else
btrfs_info(fs_info, "balance: ended with status: %d", ret);
From: Christophe Leroy <[email protected]>
commit 7c466b0807960edc13e4b855be85ea765df9a6cd upstream.
H_SUCCESS is only defined when CONFIG_PPC_PSERIES is defined.
!= H_SUCCESS means != 0. Modify the test accordingly.
Fixes: 65e701b2d2a8 ("powerpc/ptdump: drop non vital #ifdefs")
Cc: [email protected]
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/795158fc1d2b3dff3bf7347881947a887ea9391a.1592227105.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/mm/ptdump/hashpagetable.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -258,7 +258,7 @@ static int pseries_find(unsigned long ea
for (i = 0; i < HPTES_PER_GROUP; i += 4, hpte_group += 4) {
lpar_rc = plpar_pte_read_4(0, hpte_group, (void *)ptes);
- if (lpar_rc != H_SUCCESS)
+ if (lpar_rc)
continue;
for (j = 0; j < 4; j++) {
if (HPTE_V_COMPARE(ptes[j].v, want_v) &&
From: Filipe Manana <[email protected]>
commit 28a9579561bcb9082715e720eac93012e708ab94 upstream.
We are incrementing the log_batch atomic counter of the root log tree but
we never use that counter, it's used only for the log trees of subvolume
roots. We started doing it when we moved the log_batch and log_write
counters from the global, per fs, btrfs_fs_info structure, into the
btrfs_root structure in commit 7237f1833601dc ("Btrfs: fix tree logs
parallel sync").
So just stop doing it for the log root tree and add a comment over the
field declaration so inform it's used only for log trees of subvolume
roots.
This patch is part of a series that has the following patches:
1/4 btrfs: only commit the delayed inode when doing a full fsync
2/4 btrfs: only commit delayed items at fsync if we are logging a directory
3/4 btrfs: stop incremening log_batch for the log root tree when syncing log
4/4 btrfs: remove no longer needed use of log_writers for the log root tree
After the entire patchset applied I saw about 12% decrease on max latency
reported by dbench. The test was done on a qemu vm, with 8 cores, 16Gb of
ram, using kvm and using a raw NVMe device directly (no intermediary fs on
the host). The test was invoked like the following:
mkfs.btrfs -f /dev/sdk
mount -o ssd -o nospace_cache /dev/sdk /mnt/sdk
dbench -D /mnt/sdk -t 300 8
umount /mnt/dsk
CC: [email protected] # 5.4+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/ctree.h | 1 +
fs/btrfs/tree-log.c | 1 -
2 files changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1061,6 +1061,7 @@ struct btrfs_root {
struct list_head log_ctxs[2];
atomic_t log_writers;
atomic_t log_commit[2];
+ /* Used only for log trees of subvolumes, not for the log root tree */
atomic_t log_batch;
int log_transid;
/* No matter the commit succeeds or not*/
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3116,7 +3116,6 @@ int btrfs_sync_log(struct btrfs_trans_ha
btrfs_init_log_ctx(&root_log_ctx, NULL);
mutex_lock(&log_root_tree->log_mutex);
- atomic_inc(&log_root_tree->log_batch);
atomic_inc(&log_root_tree->log_writers);
index2 = log_root_tree->log_transid % 2;
From: Michael Ellerman <[email protected]>
commit 63dee5df43a31f3844efabc58972f0a206ca4534 upstream.
We have powerpc specific logic in our page fault handling to decide if
an access to an unmapped address below the stack pointer should expand
the stack VMA.
The code was originally added in 2004 "ported from 2.4". The rough
logic is that the stack is allowed to grow to 1MB with no extra
checking. Over 1MB the access must be within 2048 bytes of the stack
pointer, or be from a user instruction that updates the stack pointer.
The 2048 byte allowance below the stack pointer is there to cover the
288 byte "red zone" as well as the "about 1.5kB" needed by the signal
delivery code.
Unfortunately since then the signal frame has expanded, and is now
4224 bytes on 64-bit kernels with transactional memory enabled. This
means if a process has consumed more than 1MB of stack, and its stack
pointer lies less than 4224 bytes from the next page boundary, signal
delivery will fault when trying to expand the stack and the process
will see a SEGV.
The total size of the signal frame is the size of struct rt_sigframe
(which includes the red zone) plus __SIGNAL_FRAMESIZE (128 bytes on
64-bit).
The 2048 byte allowance was correct until 2008 as the signal frame
was:
struct rt_sigframe {
struct ucontext uc; /* 0 1440 */
/* --- cacheline 11 boundary (1408 bytes) was 32 bytes ago --- */
long unsigned int _unused[2]; /* 1440 16 */
unsigned int tramp[6]; /* 1456 24 */
struct siginfo * pinfo; /* 1480 8 */
void * puc; /* 1488 8 */
struct siginfo info; /* 1496 128 */
/* --- cacheline 12 boundary (1536 bytes) was 88 bytes ago --- */
char abigap[288]; /* 1624 288 */
/* size: 1920, cachelines: 15, members: 7 */
/* padding: 8 */
};
1920 + 128 = 2048
Then in commit ce48b2100785 ("powerpc: Add VSX context save/restore,
ptrace and signal support") (Jul 2008) the signal frame expanded to
2304 bytes:
struct rt_sigframe {
struct ucontext uc; /* 0 1696 */ <--
/* --- cacheline 13 boundary (1664 bytes) was 32 bytes ago --- */
long unsigned int _unused[2]; /* 1696 16 */
unsigned int tramp[6]; /* 1712 24 */
struct siginfo * pinfo; /* 1736 8 */
void * puc; /* 1744 8 */
struct siginfo info; /* 1752 128 */
/* --- cacheline 14 boundary (1792 bytes) was 88 bytes ago --- */
char abigap[288]; /* 1880 288 */
/* size: 2176, cachelines: 17, members: 7 */
/* padding: 8 */
};
2176 + 128 = 2304
At this point we should have been exposed to the bug, though as far as
I know it was never reported. I no longer have a system old enough to
easily test on.
Then in 2010 commit 320b2b8de126 ("mm: keep a guard page below a
grow-down stack segment") caused our stack expansion code to never
trigger, as there was always a VMA found for a write up to PAGE_SIZE
below r1.
That meant the bug was hidden as we continued to expand the signal
frame in commit 2b0a576d15e0 ("powerpc: Add new transactional memory
state to the signal context") (Feb 2013):
struct rt_sigframe {
struct ucontext uc; /* 0 1696 */
/* --- cacheline 13 boundary (1664 bytes) was 32 bytes ago --- */
struct ucontext uc_transact; /* 1696 1696 */ <--
/* --- cacheline 26 boundary (3328 bytes) was 64 bytes ago --- */
long unsigned int _unused[2]; /* 3392 16 */
unsigned int tramp[6]; /* 3408 24 */
struct siginfo * pinfo; /* 3432 8 */
void * puc; /* 3440 8 */
struct siginfo info; /* 3448 128 */
/* --- cacheline 27 boundary (3456 bytes) was 120 bytes ago --- */
char abigap[288]; /* 3576 288 */
/* size: 3872, cachelines: 31, members: 8 */
/* padding: 8 */
/* last cacheline: 32 bytes */
};
3872 + 128 = 4000
And commit 573ebfa6601f ("powerpc: Increase stack redzone for 64-bit
userspace to 512 bytes") (Feb 2014):
struct rt_sigframe {
struct ucontext uc; /* 0 1696 */
/* --- cacheline 13 boundary (1664 bytes) was 32 bytes ago --- */
struct ucontext uc_transact; /* 1696 1696 */
/* --- cacheline 26 boundary (3328 bytes) was 64 bytes ago --- */
long unsigned int _unused[2]; /* 3392 16 */
unsigned int tramp[6]; /* 3408 24 */
struct siginfo * pinfo; /* 3432 8 */
void * puc; /* 3440 8 */
struct siginfo info; /* 3448 128 */
/* --- cacheline 27 boundary (3456 bytes) was 120 bytes ago --- */
char abigap[512]; /* 3576 512 */ <--
/* size: 4096, cachelines: 32, members: 8 */
/* padding: 8 */
};
4096 + 128 = 4224
Then finally in 2017, commit 1be7107fbe18 ("mm: larger stack guard
gap, between vmas") exposed us to the existing bug, because it changed
the stack VMA to be the correct/real size, meaning our stack expansion
code is now triggered.
Fix it by increasing the allowance to 4224 bytes.
Hard-coding 4224 is obviously unsafe against future expansions of the
signal frame in the same way as the existing code. We can't easily use
sizeof() because the signal frame structure is not in a header. We
will either fix that, or rip out all the custom stack expansion
checking logic entirely.
Fixes: ce48b2100785 ("powerpc: Add VSX context save/restore, ptrace and signal support")
Cc: [email protected] # v2.6.27+
Reported-by: Tom Lane <[email protected]>
Tested-by: Daniel Axtens <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/mm/fault.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -267,6 +267,9 @@ static bool bad_kernel_fault(struct pt_r
return false;
}
+// This comes from 64-bit struct rt_sigframe + __SIGNAL_FRAMESIZE
+#define SIGFRAME_MAX_SIZE (4096 + 128)
+
static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
struct vm_area_struct *vma, unsigned int flags,
bool *must_retry)
@@ -274,7 +277,7 @@ static bool bad_stack_expansion(struct p
/*
* N.B. The POWER/Open ABI allows programs to access up to
* 288 bytes below the stack pointer.
- * The kernel signal delivery code writes up to about 1.5kB
+ * The kernel signal delivery code writes a bit over 4KB
* below the stack pointer (r1) before decrementing it.
* The exec code can write slightly over 640kB to the stack
* before setting the user r1. Thus we allow the stack to
@@ -299,7 +302,7 @@ static bool bad_stack_expansion(struct p
* between the last mapped region and the stack will
* expand the stack rather than segfaulting.
*/
- if (address + 2048 >= uregs->gpr[1])
+ if (address + SIGFRAME_MAX_SIZE >= uregs->gpr[1])
return false;
if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
From: Steve French <[email protected]>
commit c8c412f976124d85b8ded85c6ac3f760c12b63a3 upstream.
mkdir uses a compounded create operation which was not setting
the security descriptor on create of a directory. Fix so
mkdir now sets the mode and owner info properly when idsfromsid
and modefromsid are configured on the mount.
Signed-off-by: Steve French <[email protected]>
CC: Stable <[email protected]> # v5.8
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/cifs/smb2inode.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/cifs/smb2inode.c
+++ b/fs/cifs/smb2inode.c
@@ -115,6 +115,7 @@ smb2_compound_op(const unsigned int xid,
vars->oparms.fid = &fid;
vars->oparms.reconnect = false;
vars->oparms.mode = mode;
+ vars->oparms.cifs_sb = cifs_sb;
rqst[num_rqst].rq_iov = &vars->open_iov[0];
rqst[num_rqst].rq_nvec = SMB2_CREATE_IOV_SIZE;
From: Paul Cercueil <[email protected]>
commit 1c95348ba327fe8621d3680890c2341523d3524a upstream.
Ingenic SoCs don't natively support registering an interrupt for both
rising and falling edges. This has to be emulated in software.
Until now, this was emulated by switching back and forth between
IRQ_TYPE_EDGE_RISING and IRQ_TYPE_EDGE_FALLING according to the level of
the GPIO. While this worked most of the time, when used with GPIOs that
need debouncing, some events would be lost. For instance, between the
time a falling-edge interrupt happens and the interrupt handler
configures the hardware for rising-edge, the level of the pin may have
already risen, and the rising-edge event is lost.
To address that issue, instead of switching back and forth between
IRQ_TYPE_EDGE_RISING and IRQ_TYPE_EDGE_FALLING, we now switch back and
forth between IRQ_TYPE_LEVEL_LOW and IRQ_TYPE_LEVEL_HIGH. Since we
always switch in the interrupt handler, they actually permit to detect
level changes. In the example above, if the pin level rises before
switching the IRQ type from IRQ_TYPE_LEVEL_LOW to IRQ_TYPE_LEVEL_HIGH,
a new interrupt will raise as soon as the handler exits, and the
rising-edge event will be properly detected.
Fixes: e72394e2ea19 ("pinctrl: ingenic: Merge GPIO functionality")
Reported-by: João Henrique <[email protected]>
Signed-off-by: Paul Cercueil <[email protected]>
Tested-by: João Henrique <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pinctrl/pinctrl-ingenic.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/pinctrl/pinctrl-ingenic.c
+++ b/drivers/pinctrl/pinctrl-ingenic.c
@@ -1810,9 +1810,9 @@ static void ingenic_gpio_irq_ack(struct
*/
high = ingenic_gpio_get_value(jzgc, irq);
if (high)
- irq_set_type(jzgc, irq, IRQ_TYPE_EDGE_FALLING);
+ irq_set_type(jzgc, irq, IRQ_TYPE_LEVEL_LOW);
else
- irq_set_type(jzgc, irq, IRQ_TYPE_EDGE_RISING);
+ irq_set_type(jzgc, irq, IRQ_TYPE_LEVEL_HIGH);
}
if (jzgc->jzpc->info->version >= ID_JZ4760)
@@ -1848,7 +1848,7 @@ static int ingenic_gpio_irq_set_type(str
*/
bool high = ingenic_gpio_get_value(jzgc, irqd->hwirq);
- type = high ? IRQ_TYPE_EDGE_FALLING : IRQ_TYPE_EDGE_RISING;
+ type = high ? IRQ_TYPE_LEVEL_LOW : IRQ_TYPE_LEVEL_HIGH;
}
irq_set_type(jzgc, irqd->hwirq, type);
From: Rafael J. Wysocki <[email protected]>
commit dae68d7fd4930315389117e9da35b763f12238f9 upstream.
If context is not NULL in acpiphp_grab_context(), but the
is_going_away flag is set for the device's parent, the reference
counter of the context needs to be decremented before returning
NULL or the context will never be freed, so make that happen.
Fixes: edf5bf34d408 ("ACPI / dock: Use callback pointers from devices' ACPI hotplug contexts")
Reported-by: Vasily Averin <[email protected]>
Cc: 3.15+ <[email protected]> # 3.15+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -122,13 +122,21 @@ static struct acpiphp_context *acpiphp_g
struct acpiphp_context *context;
acpi_lock_hp_context();
+
context = acpiphp_get_context(adev);
- if (!context || context->func.parent->is_going_away) {
- acpi_unlock_hp_context();
- return NULL;
+ if (!context)
+ goto unlock;
+
+ if (context->func.parent->is_going_away) {
+ acpiphp_put_context(context);
+ context = NULL;
+ goto unlock;
}
+
get_bridge(context->func.parent);
acpiphp_put_context(context);
+
+unlock:
acpi_unlock_hp_context();
return context;
}
From: Josef Bacik <[email protected]>
commit f95ebdbed46a4d8b9fdb7bff109fdbb6fc9a6dc8 upstream.
If we got some sort of corruption via a read and call
btrfs_handle_fs_error() we'll set BTRFS_FS_STATE_ERROR on the fs and
complain. If a subsequent trans handle trips over this it'll get EROFS
and then abort. However at that point we're not aborting for the
original reason, we're aborting because we've been flipped read only.
We do not need to WARN_ON() here.
CC: [email protected] # 5.4+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/ctree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3198,7 +3198,7 @@ do { \
/* Report first abort since mount */ \
if (!test_and_set_bit(BTRFS_FS_STATE_TRANS_ABORTED, \
&((trans)->fs_info->fs_state))) { \
- if ((errno) != -EIO) { \
+ if ((errno) != -EIO && (errno) != -EROFS) { \
WARN(1, KERN_DEBUG \
"BTRFS: Transaction aborted (error %d)\n", \
(errno)); \
From: Filipe Manana <[email protected]>
commit 3d6448e631591756da36efb3ea6355ff6f383c3a upstream.
When releasing an extent map, done through the page release callback, we
can race with an ongoing fast fsync and cause the fsync to miss a new
extent and not log it. The steps for this to happen are the following:
1) A page is dirtied for some inode I;
2) Writeback for that page is triggered by a path other than fsync, for
example by the system due to memory pressure;
3) When the ordered extent for the extent (a single 4K page) finishes,
we unpin the corresponding extent map and set its generation to N,
the current transaction's generation;
4) The btrfs_releasepage() callback is invoked by the system due to
memory pressure for that no longer dirty page of inode I;
5) At the same time, some task calls fsync on inode I, joins transaction
N, and at btrfs_log_inode() it sees that the inode does not have the
full sync flag set, so we proceed with a fast fsync. But before we get
into btrfs_log_changed_extents() and lock the inode's extent map tree:
6) Through btrfs_releasepage() we end up at try_release_extent_mapping()
and we remove the extent map for the new 4Kb extent, because it is
neither pinned anymore nor locked. By calling remove_extent_mapping(),
we remove the extent map from the list of modified extents, since the
extent map does not have the logging flag set. We unlock the inode's
extent map tree;
7) The task doing the fast fsync now enters btrfs_log_changed_extents(),
locks the inode's extent map tree and iterates its list of modified
extents, which no longer has the 4Kb extent in it, so it does not log
the extent;
8) The fsync finishes;
9) Before transaction N is committed, a power failure happens. After
replaying the log, the 4K extent of inode I will be missing, since
it was not logged due to the race with try_release_extent_mapping().
So fix this by teaching try_release_extent_mapping() to not remove an
extent map if it's still in the list of modified extents.
Fixes: ff44c6e36dc9dc ("Btrfs: do not hold the write_lock on the extent tree while logging")
CC: [email protected] # 5.4+
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/extent_io.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4502,15 +4502,25 @@ int try_release_extent_mapping(struct pa
free_extent_map(em);
break;
}
- if (!test_range_bit(tree, em->start,
- extent_map_end(em) - 1,
- EXTENT_LOCKED, 0, NULL)) {
+ if (test_range_bit(tree, em->start,
+ extent_map_end(em) - 1,
+ EXTENT_LOCKED, 0, NULL))
+ goto next;
+ /*
+ * If it's not in the list of modified extents, used
+ * by a fast fsync, we can remove it. If it's being
+ * logged we can safely remove it since fsync took an
+ * extra reference on the em.
+ */
+ if (list_empty(&em->list) ||
+ test_bit(EXTENT_FLAG_LOGGING, &em->flags)) {
set_bit(BTRFS_INODE_NEEDS_FULL_SYNC,
&btrfs_inode->runtime_flags);
remove_extent_mapping(map, em);
/* once for the rb tree */
free_extent_map(em);
}
+next:
start = extent_map_end(em);
write_unlock(&map->lock);
From: Guenter Roeck <[email protected]>
commit e27b1636e9337d1a1d174b191e53d0f86421a822 upstream.
rearm_wake_irq() does not unlock the irq descriptor if the interrupt
is not suspended or if wakeup is not enabled on it.
Restucture the exit conditions so the unlock is always ensured.
Fixes: 3a79bc63d9075 ("PCI: irq: Introduce rearm_wake_irq()")
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/irq/pm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -185,14 +185,18 @@ void rearm_wake_irq(unsigned int irq)
unsigned long flags;
struct irq_desc *desc = irq_get_desc_buslock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
- if (!desc || !(desc->istate & IRQS_SUSPENDED) ||
- !irqd_is_wakeup_set(&desc->irq_data))
+ if (!desc)
return;
+ if (!(desc->istate & IRQS_SUSPENDED) ||
+ !irqd_is_wakeup_set(&desc->irq_data))
+ goto unlock;
+
desc->istate &= ~IRQS_SUSPENDED;
irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
__enable_irq(desc);
+unlock:
irq_put_desc_busunlock(desc, flags);
}
From: Mikulas Patocka <[email protected]>
commit bc2fbaa4d3808aef82dd1064a8e61c16549fe956 upstream.
sbi->s_freeinodes_counter is only decreased by the ext2 code, it is never
increased. This patch fixes it.
Note that sbi->s_freeinodes_counter is only used in the algorithm that
tries to find the group for new allocations, so this bug is not easily
visible (the only visibility is that the group finding algorithm selects
inoptinal result).
Link: https://lore.kernel.org/r/alpine.LRH.2.02.2004201538300.19436@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Signed-off-by: Jan Kara <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/ext2/ialloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/ext2/ialloc.c
+++ b/fs/ext2/ialloc.c
@@ -80,6 +80,7 @@ static void ext2_release_inode(struct su
if (dir)
le16_add_cpu(&desc->bg_used_dirs_count, -1);
spin_unlock(sb_bgl_lock(EXT2_SB(sb), group));
+ percpu_counter_inc(&EXT2_SB(sb)->s_freeinodes_counter);
if (dir)
percpu_counter_dec(&EXT2_SB(sb)->s_dirs_counter);
mark_buffer_dirty(bh);
@@ -528,7 +529,7 @@ got:
goto fail;
}
- percpu_counter_add(&sbi->s_freeinodes_counter, -1);
+ percpu_counter_dec(&sbi->s_freeinodes_counter);
if (S_ISDIR(mode))
percpu_counter_inc(&sbi->s_dirs_counter);
From: Peter Xu <[email protected]>
commit 75802ca66354a39ab8e35822747cd08b3384a99a upstream.
This is found by code observation only.
Firstly, the worst case scenario should assume the whole range was covered
by pmd sharing. The old algorithm might not work as expected for ranges
like (1g-2m, 1g+2m), where the adjusted range should be (0, 1g+2m) but the
expected range should be (0, 2g).
Since at it, remove the loop since it should not be required. With that,
the new code should be faster too when the invalidating range is huge.
Mike said:
: With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
: adjust to (0, 1g+2m) which is incorrect.
:
: We should cc stable. The original reason for adjusting the range was to
: prevent data corruption (getting wrong page). Since the range is not
: always adjusted correctly, the potential for corruption still exists.
:
: However, I am fairly confident that adjust_range_if_pmd_sharing_possible
: is only gong to be called in two cases:
:
: 1) for a single page
: 2) for range == entire vma
:
: In those cases, the current code should produce the correct results.
:
: To be safe, let's just cc stable.
Fixes: 017b1660df89 ("mm: migration: fix migration of huge PMD shared pages")
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/hugetlb.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5313,25 +5313,21 @@ static bool vma_shareable(struct vm_area
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end)
{
- unsigned long check_addr;
+ unsigned long a_start, a_end;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
- unsigned long a_start = check_addr & PUD_MASK;
- unsigned long a_end = a_start + PUD_SIZE;
+ /* Extend the range to be PUD aligned for a worst case scenario */
+ a_start = ALIGN_DOWN(*start, PUD_SIZE);
+ a_end = ALIGN(*end, PUD_SIZE);
- /*
- * If sharing is possible, adjust start/end if necessary.
- */
- if (range_in_vma(vma, a_start, a_end)) {
- if (a_start < *start)
- *start = a_start;
- if (a_end > *end)
- *end = a_end;
- }
- }
+ /*
+ * Intersect the range with the vma range, since pmd sharing won't be
+ * across vma after all
+ */
+ *start = max(vma->vm_start, a_start);
+ *end = min(vma->vm_end, a_end);
}
/*
From: Hugh Dickins <[email protected]>
commit 723a80dafed5c95889d48baab9aa433a6ffa0b4e upstream.
pmdp_collapse_flush() should be given the start address at which the huge
page is mapped, haddr: it was given addr, which at that point has been
used as a local variable, incremented to the end address of the extent.
Found by source inspection while chasing a hugepage locking bug, which I
then could not explain by this. At first I thought this was very bad;
then saw that all of the page translations that were not flushed would
actually still point to the right pages afterwards, so harmless; then
realized that I know nothing of how different architectures and models
cache intermediate paging structures, so maybe it matters after all -
particularly since the page table concerned is immediately freed.
Much easier to fix than to think about.
Fixes: 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP")
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Song Liu <[email protected]>
Cc: <[email protected]> [5.4+]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1502,7 +1502,7 @@ void collapse_pte_mapped_thp(struct mm_s
/* step 4: collapse pmd */
ptl = pmd_lock(vma->vm_mm, pmd);
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ _pmd = pmdp_collapse_flush(vma, haddr, pmd);
spin_unlock(ptl);
mm_dec_nr_ptes(mm);
pte_free(mm, pmd_pgtable(_pmd));
From: Coly Li <[email protected]>
commit 5fe48867856367142d91a82f2cbf7a57a24cbb70 upstream.
There are some meta data of bcache are allocated by multiple pages,
and they are used as bio bv_page for I/Os to the cache device. for
example cache_set->uuids, cache->disk_buckets, journal_write->data,
bset_tree->data.
For such meta data memory, all the allocated pages should be treated
as a single memory block. Then the memory management and underlying I/O
code can treat them more clearly.
This patch adds __GFP_COMP flag to all the location allocating >0 order
pages for the above mentioned meta data. Then their pages are treated
as compound pages now.
Signed-off-by: Coly Li <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/bcache/bset.c | 2 +-
drivers/md/bcache/btree.c | 2 +-
drivers/md/bcache/journal.c | 4 ++--
drivers/md/bcache/super.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -322,7 +322,7 @@ int bch_btree_keys_alloc(struct btree_ke
b->page_order = page_order;
- t->data = (void *) __get_free_pages(gfp, b->page_order);
+ t->data = (void *) __get_free_pages(__GFP_COMP|gfp, b->page_order);
if (!t->data)
goto err;
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -785,7 +785,7 @@ int bch_btree_cache_alloc(struct cache_s
mutex_init(&c->verify_lock);
c->verify_ondisk = (void *)
- __get_free_pages(GFP_KERNEL, ilog2(bucket_pages(c)));
+ __get_free_pages(GFP_KERNEL|__GFP_COMP, ilog2(bucket_pages(c)));
c->verify_data = mca_bucket_alloc(c, &ZERO_KEY, GFP_KERNEL);
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -999,8 +999,8 @@ int bch_journal_alloc(struct cache_set *
j->w[1].c = c;
if (!(init_fifo(&j->pin, JOURNAL_PIN, GFP_KERNEL)) ||
- !(j->w[0].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS)) ||
- !(j->w[1].data = (void *) __get_free_pages(GFP_KERNEL, JSET_BITS)))
+ !(j->w[0].data = (void *) __get_free_pages(GFP_KERNEL|__GFP_COMP, JSET_BITS)) ||
+ !(j->w[1].data = (void *) __get_free_pages(GFP_KERNEL|__GFP_COMP, JSET_BITS)))
return -ENOMEM;
return 0;
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1776,7 +1776,7 @@ void bch_cache_set_unregister(struct cac
}
#define alloc_bucket_pages(gfp, c) \
- ((void *) __get_free_pages(__GFP_ZERO|gfp, ilog2(bucket_pages(c))))
+ ((void *) __get_free_pages(__GFP_ZERO|__GFP_COMP|gfp, ilog2(bucket_pages(c))))
struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
{
From: Bob Peterson <[email protected]>
commit 70499cdfeb3625c87eebe4f7a7ea06fa7447e5df upstream.
Before this patch, some functions started transactions then they called
gfs2_block_zero_range. However, gfs2_block_zero_range, like writes, can
start transactions, which results in a recursive transaction error.
For example:
do_shrink
trunc_start
gfs2_trans_begin <------------------------------------------------
gfs2_block_zero_range
iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops);
iomap_apply ... iomap_zero_range_actor
iomap_begin
gfs2_iomap_begin
gfs2_iomap_begin_write
actor (iomap_zero_range_actor)
iomap_zero
iomap_write_begin
gfs2_iomap_page_prepare
gfs2_trans_begin <------------------------
This patch reorders the callers of gfs2_block_zero_range so that they
only start their transactions after the call. It also adds a BUG_ON to
ensure this doesn't happen again.
Fixes: 2257e468a63b ("gfs2: implement gfs2_block_zero_range using iomap_zero_range")
Cc: [email protected] # v5.5+
Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/gfs2/bmap.c | 69 ++++++++++++++++++++++++++++++++-------------------------
1 file changed, 39 insertions(+), 30 deletions(-)
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1351,9 +1351,15 @@ int gfs2_extent_map(struct inode *inode,
return ret;
}
+/*
+ * NOTE: Never call gfs2_block_zero_range with an open transaction because it
+ * uses iomap write to perform its actions, which begin their own transactions
+ * (iomap_begin, page_prepare, etc.)
+ */
static int gfs2_block_zero_range(struct inode *inode, loff_t from,
unsigned int length)
{
+ BUG_ON(current->journal_info);
return iomap_zero_range(inode, from, length, NULL, &gfs2_iomap_ops);
}
@@ -1414,6 +1420,16 @@ static int trunc_start(struct inode *ino
u64 oldsize = inode->i_size;
int error;
+ if (!gfs2_is_stuffed(ip)) {
+ unsigned int blocksize = i_blocksize(inode);
+ unsigned int offs = newsize & (blocksize - 1);
+ if (offs) {
+ error = gfs2_block_zero_range(inode, newsize,
+ blocksize - offs);
+ if (error)
+ return error;
+ }
+ }
if (journaled)
error = gfs2_trans_begin(sdp, RES_DINODE + RES_JDATA, GFS2_JTRUNC_REVOKES);
else
@@ -1427,19 +1443,10 @@ static int trunc_start(struct inode *ino
gfs2_trans_add_meta(ip->i_gl, dibh);
- if (gfs2_is_stuffed(ip)) {
+ if (gfs2_is_stuffed(ip))
gfs2_buffer_clear_tail(dibh, sizeof(struct gfs2_dinode) + newsize);
- } else {
- unsigned int blocksize = i_blocksize(inode);
- unsigned int offs = newsize & (blocksize - 1);
- if (offs) {
- error = gfs2_block_zero_range(inode, newsize,
- blocksize - offs);
- if (error)
- goto out;
- }
+ else
ip->i_diskflags |= GFS2_DIF_TRUNC_IN_PROG;
- }
i_size_write(inode, newsize);
ip->i_inode.i_mtime = ip->i_inode.i_ctime = current_time(&ip->i_inode);
@@ -2448,25 +2455,7 @@ int __gfs2_punch_hole(struct file *file,
loff_t start, end;
int error;
- start = round_down(offset, blocksize);
- end = round_up(offset + length, blocksize) - 1;
- error = filemap_write_and_wait_range(inode->i_mapping, start, end);
- if (error)
- return error;
-
- if (gfs2_is_jdata(ip))
- error = gfs2_trans_begin(sdp, RES_DINODE + 2 * RES_JDATA,
- GFS2_JTRUNC_REVOKES);
- else
- error = gfs2_trans_begin(sdp, RES_DINODE, 0);
- if (error)
- return error;
-
- if (gfs2_is_stuffed(ip)) {
- error = stuffed_zero_range(inode, offset, length);
- if (error)
- goto out;
- } else {
+ if (!gfs2_is_stuffed(ip)) {
unsigned int start_off, end_len;
start_off = offset & (blocksize - 1);
@@ -2489,6 +2478,26 @@ int __gfs2_punch_hole(struct file *file,
}
}
+ start = round_down(offset, blocksize);
+ end = round_up(offset + length, blocksize) - 1;
+ error = filemap_write_and_wait_range(inode->i_mapping, start, end);
+ if (error)
+ return error;
+
+ if (gfs2_is_jdata(ip))
+ error = gfs2_trans_begin(sdp, RES_DINODE + 2 * RES_JDATA,
+ GFS2_JTRUNC_REVOKES);
+ else
+ error = gfs2_trans_begin(sdp, RES_DINODE, 0);
+ if (error)
+ return error;
+
+ if (gfs2_is_stuffed(ip)) {
+ error = stuffed_zero_range(inode, offset, length);
+ if (error)
+ goto out;
+ }
+
if (gfs2_is_jdata(ip)) {
BUG_ON(!current->journal_info);
gfs2_journaled_truncate_range(inode, offset, length);
From: Max Filippov <[email protected]>
commit 6d65d3769d1910379e1cfa61ebf387efc6bfb22c upstream.
Fix the following build error in configurations with
CONFIG_XTENSA_VARIANT_HAVE_PERF_EVENTS=y:
arch/xtensa/kernel/perf_event.c:420:29: error: passing argument 3 of
‘cpuhp_setup_state’ from incompatible pointer type
Cc: [email protected]
Fixes: 25a77b55e74c ("xtensa/perf: Convert the hotplug notifier to state machine callbacks")
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/xtensa/kernel/perf_event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/xtensa/kernel/perf_event.c
+++ b/arch/xtensa/kernel/perf_event.c
@@ -399,7 +399,7 @@ static struct pmu xtensa_pmu = {
.read = xtensa_pmu_read,
};
-static int xtensa_pmu_setup(int cpu)
+static int xtensa_pmu_setup(unsigned int cpu)
{
unsigned i;
From: Adrian Hunter <[email protected]>
commit a58a057ce65b52125dd355b7d8b0d540ea267a5f upstream.
CBR events can result in a duplicate branch event, because the state
type defaults to a branch. Fix by clearing the state type.
Example: trace 'sleep' and hope for a frequency change
Before:
$ perf record -e intel_pt//u sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data ]
$ perf script --itrace=bpe > before.txt
After:
$ perf script --itrace=bpe > after.txt
$ diff -u before.txt after.txt
# --- before.txt 2020-07-07 14:42:18.191508098 +0300
# +++ after.txt 2020-07-07 14:42:36.587891753 +0300
@@ -29673,7 +29673,6 @@
sleep 93431 [007] 15411.619905: 1 branches:u: 0 [unknown] ([unknown]) => 7f0818abb2e0 clock_nanosleep@@GLIBC_2.17+0x0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
sleep 93431 [007] 15411.619905: 1 branches:u: 7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 0 [unknown] ([unknown])
sleep 93431 [007] 15411.720069: cbr: cbr: 15 freq: 1507 MHz ( 56%) 7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
- sleep 93431 [007] 15411.720069: 1 branches:u: 7f0818abb30c clock_nanosleep@@GLIBC_2.17+0x2c (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 0 [unknown] ([unknown])
sleep 93431 [007] 15411.720076: 1 branches:u: 0 [unknown] ([unknown]) => 7f0818abb30e clock_nanosleep@@GLIBC_2.17+0x2e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
sleep 93431 [007] 15411.720077: 1 branches:u: 7f0818abb323 clock_nanosleep@@GLIBC_2.17+0x43 (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 7f0818ac0eb7 __nanosleep+0x17 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
sleep 93431 [007] 15411.720077: 1 branches:u: 7f0818ac0ebf __nanosleep+0x1f (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => 55cb7e4c2827 rpl_nanosleep+0x97 (/usr/bin/sleep)
Fixes: 91de8684f1cff ("perf intel-pt: Cater for CBR change in PSB+")
Fixes: abe5a1d3e4bee ("perf intel-pt: Decoder to output CBR changes immediately")
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1977,8 +1977,10 @@ next:
* possibility of another CBR change that gets caught up
* in the PSB+.
*/
- if (decoder->cbr != decoder->cbr_seen)
+ if (decoder->cbr != decoder->cbr_seen) {
+ decoder->state.type = 0;
return 0;
+ }
break;
case INTEL_PT_PIP:
@@ -2019,8 +2021,10 @@ next:
case INTEL_PT_CBR:
intel_pt_calc_cbr(decoder);
- if (decoder->cbr != decoder->cbr_seen)
+ if (decoder->cbr != decoder->cbr_seen) {
+ decoder->state.type = 0;
return 0;
+ }
break;
case INTEL_PT_MODE_EXEC:
From: Sibi Sankar <[email protected]>
commit e013f455d95add874f310dc47c608e8c70692ae5 upstream.
The following mem abort is observed when the mba firmware size exceeds
the allocated mba region. MBA firmware size is restricted to a maximum
size of 1M and remaining memory region is used by modem debug policy
firmware when available. Hence verify whether the MBA firmware size lies
within the allocated memory region and is not greater than 1M before
loading.
Err Logs:
Unable to handle kernel paging request at virtual address
Mem abort info:
...
Call trace:
__memcpy+0x110/0x180
rproc_start+0x40/0x218
rproc_boot+0x5b4/0x608
state_store+0x54/0xf8
dev_attr_store+0x44/0x60
sysfs_kf_write+0x58/0x80
kernfs_fop_write+0x140/0x230
vfs_write+0xc4/0x208
ksys_write+0x74/0xf8
__arm64_sys_write+0x24/0x30
...
Reviewed-by: Bjorn Andersson <[email protected]>
Fixes: 051fb70fd4ea4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5")
Cc: [email protected]
Signed-off-by: Sibi Sankar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/remoteproc/qcom_q6v5_mss.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -407,6 +407,12 @@ static int q6v5_load(struct rproc *rproc
{
struct q6v5 *qproc = rproc->priv;
+ /* MBA is restricted to a maximum size of 1M */
+ if (fw->size > qproc->mba_size || fw->size > SZ_1M) {
+ dev_err(qproc->dev, "MBA firmware load failed\n");
+ return -EINVAL;
+ }
+
memcpy(qproc->mba_region, fw->data, fw->size);
return 0;
From: Sibi Sankar <[email protected]>
commit 135b9e8d1cd8ba5ac9ad9bcf24b464b7b052e5b8 upstream.
The following mem abort is observed when one of the modem blob firmware
size exceeds the allocated mpss region. Fix this by restricting the copy
size to segment size using request_firmware_into_buf before load.
Err Logs:
Unable to handle kernel paging request at virtual address
Mem abort info:
...
Call trace:
__memcpy+0x110/0x180
rproc_start+0xd0/0x190
rproc_boot+0x404/0x550
state_store+0x54/0xf8
dev_attr_store+0x44/0x60
sysfs_kf_write+0x58/0x80
kernfs_fop_write+0x140/0x230
vfs_write+0xc4/0x208
ksys_write+0x74/0xf8
...
Reviewed-by: Bjorn Andersson <[email protected]>
Fixes: 051fb70fd4ea4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5")
Cc: [email protected]
Signed-off-by: Sibi Sankar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/remoteproc/qcom_q6v5_mss.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/remoteproc/qcom_q6v5_mss.c
+++ b/drivers/remoteproc/qcom_q6v5_mss.c
@@ -1144,15 +1144,14 @@ static int q6v5_mpss_load(struct q6v5 *q
} else if (phdr->p_filesz) {
/* Replace "xxx.xxx" with "xxx.bxx" */
sprintf(fw_name + fw_name_len - 3, "b%02d", i);
- ret = request_firmware(&seg_fw, fw_name, qproc->dev);
+ ret = request_firmware_into_buf(&seg_fw, fw_name, qproc->dev,
+ ptr, phdr->p_filesz);
if (ret) {
dev_err(qproc->dev, "failed to load %s\n", fw_name);
iounmap(ptr);
goto release_firmware;
}
- memcpy(ptr, seg_fw->data, seg_fw->size);
-
release_firmware(seg_fw);
}
From: Andreas Gruenbacher <[email protected]>
commit c07bfb4d8fa1ee11c6d18b093d0bb6c8832d3626 upstream.
In gfs2_glock_poke, make sure gfs2_holder_uninit is called on the local
glock holder. Without that, we're leaking a glock and a pid reference.
Fixes: 9e8990dea926 ("gfs2: Smarter iopen glock waiting")
Cc: [email protected] # v5.8+
Signed-off-by: Andreas Gruenbacher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/gfs2/glock.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -790,9 +790,11 @@ static void gfs2_glock_poke(struct gfs2_
struct gfs2_holder gh;
int error;
- error = gfs2_glock_nq_init(gl, LM_ST_SHARED, flags, &gh);
+ gfs2_holder_init(gl, LM_ST_SHARED, flags, &gh);
+ error = gfs2_glock_nq(&gh);
if (!error)
gfs2_glock_dq(&gh);
+ gfs2_holder_uninit(&gh);
}
static bool gfs2_try_evict(struct gfs2_glock *gl)
From: Mansur Alisha Shaik <[email protected]>
commit e0eb34810113dbbf1ace57440cf48d514312a373 upstream.
Currently we are considering the instances which are available
in core->inst list for load calculation in min_loaded_core()
function, but this is incorrect because by the time we call
decide_core() for second instance, the third instance not
filled yet codec_freq_data pointer.
Solve this by considering the instances whose session has started.
Cc: [email protected] # v5.7+
Fixes: 4ebf969375bc ("media: venus: introduce core selection")
Tested-by: Douglas Anderson <[email protected]>
Signed-off-by: Mansur Alisha Shaik <[email protected]>
Signed-off-by: Stanimir Varbanov <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/platform/qcom/venus/pm_helpers.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/drivers/media/platform/qcom/venus/pm_helpers.c
+++ b/drivers/media/platform/qcom/venus/pm_helpers.c
@@ -496,6 +496,10 @@ min_loaded_core(struct venus_inst *inst,
list_for_each_entry(inst_pos, &core->instances, list) {
if (inst_pos == inst)
continue;
+
+ if (inst_pos->state != INST_START)
+ continue;
+
vpp_freq = inst_pos->clk_data.codec_freq_data->vpp_freq;
coreid = inst_pos->clk_data.core_id;
From: Jesper Dangaard Brouer <[email protected]>
[ Upstream commit b8c50df0cb3eb9008f8372e4ff0317eee993b8d1 ]
There are a number of places in test_progs that use minus-1 as the argument
to exit(). This is confusing as a process exit status is masked to be a
number between 0 and 255 as defined in man exit(3). Thus, users will see
status 255 instead of minus-1.
This patch use positive exit code 3 instead of minus-1. These cases are put
in the same group of infrastructure setup errors.
Fixes: fd27b1835e70 ("selftests/bpf: Reset process and thread affinity after each test/sub-test")
Fixes: 811d7e375d08 ("bpf: selftests: Restore netns after each test")
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/159410594499.1093222.11080787853132708654.stgit@firesoul
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/bpf/test_progs.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 0849735ebda8a..d498b6aa63a42 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -13,6 +13,7 @@
#include <execinfo.h> /* backtrace */
#define EXIT_NO_TEST 2
+#define EXIT_ERR_SETUP_INFRA 3
/* defined in test_progs.h */
struct test_env env = {};
@@ -113,13 +114,13 @@ static void reset_affinity() {
if (err < 0) {
stdio_restore();
fprintf(stderr, "Failed to reset process affinity: %d!\n", err);
- exit(-1);
+ exit(EXIT_ERR_SETUP_INFRA);
}
err = pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
if (err < 0) {
stdio_restore();
fprintf(stderr, "Failed to reset thread affinity: %d!\n", err);
- exit(-1);
+ exit(EXIT_ERR_SETUP_INFRA);
}
}
@@ -128,7 +129,7 @@ static void save_netns(void)
env.saved_netns_fd = open("/proc/self/ns/net", O_RDONLY);
if (env.saved_netns_fd == -1) {
perror("open(/proc/self/ns/net)");
- exit(-1);
+ exit(EXIT_ERR_SETUP_INFRA);
}
}
@@ -137,7 +138,7 @@ static void restore_netns(void)
if (setns(env.saved_netns_fd, CLONE_NEWNET) == -1) {
stdio_restore();
perror("setns(CLONE_NEWNS)");
- exit(-1);
+ exit(EXIT_ERR_SETUP_INFRA);
}
}
--
2.25.1
From: Kees Cook <[email protected]>
commit 4969f8a073977123504609d7310b42a588297aa4 upstream.
The sock counting (sock_update_netprioidx() and sock_update_classid())
was missing from pidfd's implementation of received fd installation. Add
a call to the new __receive_sock() helper.
Cc: Christian Brauner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Sargun Dhillon <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: 8649c322f75c ("pid: Implement pidfd_getfd syscall")
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/pid.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -42,6 +42,7 @@
#include <linux/sched/signal.h>
#include <linux/sched/task.h>
#include <linux/idr.h>
+#include <net/sock.h>
struct pid init_struct_pid = {
.count = REFCOUNT_INIT(1),
@@ -642,10 +643,12 @@ static int pidfd_getfd(struct pid *pid,
}
ret = get_unused_fd_flags(O_CLOEXEC);
- if (ret < 0)
+ if (ret < 0) {
fput(file);
- else
+ } else {
+ __receive_sock(file);
fd_install(ret, file);
+ }
return ret;
}
From: Chao Yu <[email protected]>
[ Upstream commit 02772fbfcba8597eef9d5c5f7f94087132d0c1d4 ]
Memory allocated for storing compressed pages' poitner should be
released after f2fs_write_compressed_pages(), otherwise it will
cause memory leak issue.
Signed-off-by: Chao Yu <[email protected]>
Fixes: 4c8ff7095bef ("f2fs: support data compression")
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/f2fs/compress.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1e02a8c106b0a..f6fbe61b1251e 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1353,6 +1353,8 @@ int f2fs_write_multi_pages(struct compress_ctx *cc,
err = f2fs_write_compressed_pages(cc, submitted,
wbc, io_type);
cops->destroy_compress_ctx(cc);
+ kfree(cc->cpages);
+ cc->cpages = NULL;
if (!err)
return 0;
f2fs_bug_on(F2FS_I_SB(cc->inode), err != -EAGAIN);
--
2.25.1
From: Steve Longerbeam <[email protected]>
[ Upstream commit dd81d821d0b3f77d949d0cac5c05c1f05b921d46 ]
Use a bit-mask of EOF irqs to determine when all required idmac
channel EOFs have been received for a tile conversion, and only do
tile completion processing after all EOFs have been received. Otherwise
it was found that a conversion would stall after the completion of a
tile and the start of the next tile, because the input/read idmac
channel had not completed and entered idle state, thus locking up the
channel when attempting to re-start it for the next tile.
Fixes: 0537db801bb01 ("gpu: ipu-v3: image-convert: reconfigure IC per tile")
Signed-off-by: Steve Longerbeam <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/ipu-v3/ipu-image-convert.c | 109 +++++++++++++++++++------
1 file changed, 82 insertions(+), 27 deletions(-)
diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c
index f8b031ded3cf2..aa1d4b6d278f7 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -137,6 +137,17 @@ struct ipu_image_convert_ctx;
struct ipu_image_convert_chan;
struct ipu_image_convert_priv;
+enum eof_irq_mask {
+ EOF_IRQ_IN = BIT(0),
+ EOF_IRQ_ROT_IN = BIT(1),
+ EOF_IRQ_OUT = BIT(2),
+ EOF_IRQ_ROT_OUT = BIT(3),
+};
+
+#define EOF_IRQ_COMPLETE (EOF_IRQ_IN | EOF_IRQ_OUT)
+#define EOF_IRQ_ROT_COMPLETE (EOF_IRQ_IN | EOF_IRQ_OUT | \
+ EOF_IRQ_ROT_IN | EOF_IRQ_ROT_OUT)
+
struct ipu_image_convert_ctx {
struct ipu_image_convert_chan *chan;
@@ -173,6 +184,9 @@ struct ipu_image_convert_ctx {
/* where to place converted tile in dest image */
unsigned int out_tile_map[MAX_TILES];
+ /* mask of completed EOF irqs at every tile conversion */
+ enum eof_irq_mask eof_mask;
+
struct list_head list;
};
@@ -189,6 +203,8 @@ struct ipu_image_convert_chan {
struct ipuv3_channel *rotation_out_chan;
/* the IPU end-of-frame irqs */
+ int in_eof_irq;
+ int rot_in_eof_irq;
int out_eof_irq;
int rot_out_eof_irq;
@@ -1380,6 +1396,9 @@ static int convert_start(struct ipu_image_convert_run *run, unsigned int tile)
dev_dbg(priv->ipu->dev, "%s: task %u: starting ctx %p run %p tile %u -> %u\n",
__func__, chan->ic_task, ctx, run, tile, dst_tile);
+ /* clear EOF irq mask */
+ ctx->eof_mask = 0;
+
if (ipu_rot_mode_is_irt(ctx->rot_mode)) {
/* swap width/height for resizer */
dest_width = d_image->tile[dst_tile].height;
@@ -1615,7 +1634,7 @@ static bool ic_settings_changed(struct ipu_image_convert_ctx *ctx)
}
/* hold irqlock when calling */
-static irqreturn_t do_irq(struct ipu_image_convert_run *run)
+static irqreturn_t do_tile_complete(struct ipu_image_convert_run *run)
{
struct ipu_image_convert_ctx *ctx = run->ctx;
struct ipu_image_convert_chan *chan = ctx->chan;
@@ -1700,6 +1719,7 @@ static irqreturn_t do_irq(struct ipu_image_convert_run *run)
ctx->cur_buf_num ^= 1;
}
+ ctx->eof_mask = 0; /* clear EOF irq mask for next tile */
ctx->next_tile++;
return IRQ_HANDLED;
done:
@@ -1715,8 +1735,9 @@ static irqreturn_t eof_irq(int irq, void *data)
struct ipu_image_convert_priv *priv = chan->priv;
struct ipu_image_convert_ctx *ctx;
struct ipu_image_convert_run *run;
+ irqreturn_t ret = IRQ_HANDLED;
+ bool tile_complete = false;
unsigned long flags;
- irqreturn_t ret;
spin_lock_irqsave(&chan->irqlock, flags);
@@ -1729,27 +1750,33 @@ static irqreturn_t eof_irq(int irq, void *data)
ctx = run->ctx;
- if (irq == chan->out_eof_irq) {
- if (ipu_rot_mode_is_irt(ctx->rot_mode)) {
- /* this is a rotation op, just ignore */
- ret = IRQ_HANDLED;
- goto out;
- }
- } else if (irq == chan->rot_out_eof_irq) {
+ if (irq == chan->in_eof_irq) {
+ ctx->eof_mask |= EOF_IRQ_IN;
+ } else if (irq == chan->out_eof_irq) {
+ ctx->eof_mask |= EOF_IRQ_OUT;
+ } else if (irq == chan->rot_in_eof_irq ||
+ irq == chan->rot_out_eof_irq) {
if (!ipu_rot_mode_is_irt(ctx->rot_mode)) {
/* this was NOT a rotation op, shouldn't happen */
dev_err(priv->ipu->dev,
"Unexpected rotation interrupt\n");
- ret = IRQ_HANDLED;
goto out;
}
+ ctx->eof_mask |= (irq == chan->rot_in_eof_irq) ?
+ EOF_IRQ_ROT_IN : EOF_IRQ_ROT_OUT;
} else {
dev_err(priv->ipu->dev, "Received unknown irq %d\n", irq);
ret = IRQ_NONE;
goto out;
}
- ret = do_irq(run);
+ if (ipu_rot_mode_is_irt(ctx->rot_mode))
+ tile_complete = (ctx->eof_mask == EOF_IRQ_ROT_COMPLETE);
+ else
+ tile_complete = (ctx->eof_mask == EOF_IRQ_COMPLETE);
+
+ if (tile_complete)
+ ret = do_tile_complete(run);
out:
spin_unlock_irqrestore(&chan->irqlock, flags);
return ret;
@@ -1783,6 +1810,10 @@ static void force_abort(struct ipu_image_convert_ctx *ctx)
static void release_ipu_resources(struct ipu_image_convert_chan *chan)
{
+ if (chan->in_eof_irq >= 0)
+ free_irq(chan->in_eof_irq, chan);
+ if (chan->rot_in_eof_irq >= 0)
+ free_irq(chan->rot_in_eof_irq, chan);
if (chan->out_eof_irq >= 0)
free_irq(chan->out_eof_irq, chan);
if (chan->rot_out_eof_irq >= 0)
@@ -1801,7 +1832,27 @@ static void release_ipu_resources(struct ipu_image_convert_chan *chan)
chan->in_chan = chan->out_chan = chan->rotation_in_chan =
chan->rotation_out_chan = NULL;
- chan->out_eof_irq = chan->rot_out_eof_irq = -1;
+ chan->in_eof_irq = -1;
+ chan->rot_in_eof_irq = -1;
+ chan->out_eof_irq = -1;
+ chan->rot_out_eof_irq = -1;
+}
+
+static int get_eof_irq(struct ipu_image_convert_chan *chan,
+ struct ipuv3_channel *channel)
+{
+ struct ipu_image_convert_priv *priv = chan->priv;
+ int ret, irq;
+
+ irq = ipu_idmac_channel_irq(priv->ipu, channel, IPU_IRQ_EOF);
+
+ ret = request_threaded_irq(irq, eof_irq, do_bh, 0, "ipu-ic", chan);
+ if (ret < 0) {
+ dev_err(priv->ipu->dev, "could not acquire irq %d\n", irq);
+ return ret;
+ }
+
+ return irq;
}
static int get_ipu_resources(struct ipu_image_convert_chan *chan)
@@ -1837,31 +1888,33 @@ static int get_ipu_resources(struct ipu_image_convert_chan *chan)
}
/* acquire the EOF interrupts */
- chan->out_eof_irq = ipu_idmac_channel_irq(priv->ipu,
- chan->out_chan,
- IPU_IRQ_EOF);
+ ret = get_eof_irq(chan, chan->in_chan);
+ if (ret < 0) {
+ chan->in_eof_irq = -1;
+ goto err;
+ }
+ chan->in_eof_irq = ret;
- ret = request_threaded_irq(chan->out_eof_irq, eof_irq, do_bh,
- 0, "ipu-ic", chan);
+ ret = get_eof_irq(chan, chan->rotation_in_chan);
if (ret < 0) {
- dev_err(priv->ipu->dev, "could not acquire irq %d\n",
- chan->out_eof_irq);
- chan->out_eof_irq = -1;
+ chan->rot_in_eof_irq = -1;
goto err;
}
+ chan->rot_in_eof_irq = ret;
- chan->rot_out_eof_irq = ipu_idmac_channel_irq(priv->ipu,
- chan->rotation_out_chan,
- IPU_IRQ_EOF);
+ ret = get_eof_irq(chan, chan->out_chan);
+ if (ret < 0) {
+ chan->out_eof_irq = -1;
+ goto err;
+ }
+ chan->out_eof_irq = ret;
- ret = request_threaded_irq(chan->rot_out_eof_irq, eof_irq, do_bh,
- 0, "ipu-ic", chan);
+ ret = get_eof_irq(chan, chan->rotation_out_chan);
if (ret < 0) {
- dev_err(priv->ipu->dev, "could not acquire irq %d\n",
- chan->rot_out_eof_irq);
chan->rot_out_eof_irq = -1;
goto err;
}
+ chan->rot_out_eof_irq = ret;
return 0;
err:
@@ -2440,6 +2493,8 @@ int ipu_image_convert_init(struct ipu_soc *ipu, struct device *dev)
chan->ic_task = i;
chan->priv = priv;
chan->dma_ch = &image_convert_dma_chan[i];
+ chan->in_eof_irq = -1;
+ chan->rot_in_eof_irq = -1;
chan->out_eof_irq = -1;
chan->rot_out_eof_irq = -1;
--
2.25.1
From: Coly Li <[email protected]>
commit 7a1481267999c02abf4a624515c1b5c7c1fccbd6 upstream.
offset_to_stripe() returns the stripe number (in type unsigned int) from
an offset (in type uint64_t) by the following calculation,
do_div(offset, d->stripe_size);
For large capacity backing device (e.g. 18TB) with small stripe size
(e.g. 4KB), the result is 4831838208 and exceeds UINT_MAX. The actual
returned value which caller receives is 536870912, due to the overflow.
Indeed in bcache_device_init(), bcache_device->nr_stripes is limited in
range [1, INT_MAX]. Therefore all valid stripe numbers in bcache are
in range [0, bcache_dev->nr_stripes - 1].
This patch adds a upper limition check in offset_to_stripe(): the max
valid stripe number should be less than bcache_device->nr_stripes. If
the calculated stripe number from do_div() is equal to or larger than
bcache_device->nr_stripe, -EINVAL will be returned. (Normally nr_stripes
is less than INT_MAX, exceeding upper limitation doesn't mean overflow,
therefore -EOVERFLOW is not used as error code.)
This patch also changes nr_stripes' type of struct bcache_device from
'unsigned int' to 'int', and return value type of offset_to_stripe()
from 'unsigned int' to 'int', to match their exact data ranges.
All locations where bcache_device->nr_stripes and offset_to_stripe() are
referenced also get updated for the above type change.
Reported-and-tested-by: Ken Raeburn <[email protected]>
Signed-off-by: Coly Li <[email protected]>
Cc: [email protected]
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783075
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/bcache/bcache.h | 2 +-
drivers/md/bcache/writeback.c | 14 +++++++++-----
drivers/md/bcache/writeback.h | 19 +++++++++++++++++--
3 files changed, 27 insertions(+), 8 deletions(-)
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -264,7 +264,7 @@ struct bcache_device {
#define BCACHE_DEV_UNLINK_DONE 2
#define BCACHE_DEV_WB_RUNNING 3
#define BCACHE_DEV_RATE_DW_RUNNING 4
- unsigned int nr_stripes;
+ int nr_stripes;
unsigned int stripe_size;
atomic_t *stripe_sectors_dirty;
unsigned long *full_dirty_stripes;
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -523,15 +523,19 @@ void bcache_dev_sectors_dirty_add(struct
uint64_t offset, int nr_sectors)
{
struct bcache_device *d = c->devices[inode];
- unsigned int stripe_offset, stripe, sectors_dirty;
+ unsigned int stripe_offset, sectors_dirty;
+ int stripe;
if (!d)
return;
+ stripe = offset_to_stripe(d, offset);
+ if (stripe < 0)
+ return;
+
if (UUID_FLASH_ONLY(&c->uuids[inode]))
atomic_long_add(nr_sectors, &c->flash_dev_dirty_sectors);
- stripe = offset_to_stripe(d, offset);
stripe_offset = offset & (d->stripe_size - 1);
while (nr_sectors) {
@@ -571,12 +575,12 @@ static bool dirty_pred(struct keybuf *bu
static void refill_full_stripes(struct cached_dev *dc)
{
struct keybuf *buf = &dc->writeback_keys;
- unsigned int start_stripe, stripe, next_stripe;
+ unsigned int start_stripe, next_stripe;
+ int stripe;
bool wrapped = false;
stripe = offset_to_stripe(&dc->disk, KEY_OFFSET(&buf->last_scanned));
-
- if (stripe >= dc->disk.nr_stripes)
+ if (stripe < 0)
stripe = 0;
start_stripe = stripe;
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -52,10 +52,22 @@ static inline uint64_t bcache_dev_sector
return ret;
}
-static inline unsigned int offset_to_stripe(struct bcache_device *d,
+static inline int offset_to_stripe(struct bcache_device *d,
uint64_t offset)
{
do_div(offset, d->stripe_size);
+
+ /* d->nr_stripes is in range [1, INT_MAX] */
+ if (unlikely(offset >= d->nr_stripes)) {
+ pr_err("Invalid stripe %llu (>= nr_stripes %d).\n",
+ offset, d->nr_stripes);
+ return -EINVAL;
+ }
+
+ /*
+ * Here offset is definitly smaller than INT_MAX,
+ * return it as int will never overflow.
+ */
return offset;
}
@@ -63,7 +75,10 @@ static inline bool bcache_dev_stripe_dir
uint64_t offset,
unsigned int nr_sectors)
{
- unsigned int stripe = offset_to_stripe(&dc->disk, offset);
+ int stripe = offset_to_stripe(&dc->disk, offset);
+
+ if (stripe < 0)
+ return false;
while (1) {
if (atomic_read(dc->disk.stripe_sectors_dirty + stripe))
From: Coly Li <[email protected]>
commit 65f0f017e7be8c70330372df23bcb2a407ecf02d upstream.
For some block devices which large capacity (e.g. 8TB) but small io_opt
size (e.g. 8 sectors), in bcache_device_init() the stripes number calcu-
lated by,
DIV_ROUND_UP_ULL(sectors, d->stripe_size);
might be overflow to the unsigned int bcache_device->nr_stripes.
This patch uses the uint64_t variable to store DIV_ROUND_UP_ULL()
and after the value is checked to be available in unsigned int range,
sets it to bache_device->nr_stripes. Then the overflow is avoided.
Reported-and-tested-by: Ken Raeburn <[email protected]>
Signed-off-by: Coly Li <[email protected]>
Cc: [email protected]
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783075
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/bcache/super.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -826,19 +826,19 @@ static int bcache_device_init(struct bca
struct request_queue *q;
const size_t max_stripes = min_t(size_t, INT_MAX,
SIZE_MAX / sizeof(atomic_t));
- size_t n;
+ uint64_t n;
int idx;
if (!d->stripe_size)
d->stripe_size = 1 << 31;
- d->nr_stripes = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
-
- if (!d->nr_stripes || d->nr_stripes > max_stripes) {
- pr_err("nr_stripes too large or invalid: %u (start sector beyond end of disk?)\n",
- (unsigned int)d->nr_stripes);
+ n = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
+ if (!n || n > max_stripes) {
+ pr_err("nr_stripes too large or invalid: %llu (start sector beyond end of disk?)\n",
+ n);
return -ENOMEM;
}
+ d->nr_stripes = n;
n = d->nr_stripes * sizeof(atomic_t);
d->stripe_sectors_dirty = kvzalloc(n, GFP_KERNEL);
From: Yishai Hadas <[email protected]>
[ Upstream commit 04c0a5fcfcf65aade2fb238b6336445f1a99b646 ]
Set IOVA on IB MR in uverbs layer to let all drivers have it, this
includes both reg/rereg MR flows.
As part of this change cleaned-up this setting from the drivers that
already did it by themselves in their user flows.
Fixes: e6f0330106f4 ("mlx4_ib: set user mr attributes in struct ib_mr")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Yishai Hadas <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/core/uverbs_cmd.c | 4 ++++
drivers/infiniband/hw/cxgb4/mem.c | 1 -
drivers/infiniband/hw/mlx4/mr.c | 1 -
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index b48b3f6e632d4..557644dcc9237 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -770,6 +770,7 @@ static int ib_uverbs_reg_mr(struct uverbs_attr_bundle *attrs)
mr->uobject = uobj;
atomic_inc(&pd->usecnt);
mr->res.type = RDMA_RESTRACK_MR;
+ mr->iova = cmd.hca_va;
rdma_restrack_uadd(&mr->res);
uobj->object = mr;
@@ -861,6 +862,9 @@ static int ib_uverbs_rereg_mr(struct uverbs_attr_bundle *attrs)
atomic_dec(&old_pd->usecnt);
}
+ if (cmd.flags & IB_MR_REREG_TRANS)
+ mr->iova = cmd.hca_va;
+
memset(&resp, 0, sizeof(resp));
resp.lkey = mr->lkey;
resp.rkey = mr->rkey;
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 962dc97a8ff2b..1e4f4e5255980 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -399,7 +399,6 @@ static int finish_mem_reg(struct c4iw_mr *mhp, u32 stag)
mmid = stag >> 8;
mhp->ibmr.rkey = mhp->ibmr.lkey = stag;
mhp->ibmr.length = mhp->attr.len;
- mhp->ibmr.iova = mhp->attr.va_fbo;
mhp->ibmr.page_size = 1U << (mhp->attr.page_size + 12);
pr_debug("mmid 0x%x mhp %p\n", mmid, mhp);
return xa_insert_irq(&mhp->rhp->mrs, mmid, mhp, GFP_KERNEL);
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 7e0b205c05eb3..d7c78f841d2f5 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -439,7 +439,6 @@ struct ib_mr *mlx4_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
mr->ibmr.rkey = mr->ibmr.lkey = mr->mmr.key;
mr->ibmr.length = length;
- mr->ibmr.iova = virt_addr;
mr->ibmr.page_size = 1U << shift;
return &mr->ibmr;
--
2.25.1
From: Herbert Xu <[email protected]>
[ Upstream commit 662bb52f50bca16a74fe92b487a14d7dccb85e1a ]
Some user-space programs rely on crypto requests that have no
control metadata. This broke when a check was added to require
the presence of control metadata with the ctx->init flag.
This patch fixes the regression by setting ctx->init as long as
one sendmsg(2) has been made, with or without a control message.
Reported-by: Sachin Sant <[email protected]>
Reported-by: Naresh Kamboju <[email protected]>
Fixes: f3c802a1f300 ("crypto: algif_aead - Only wake up when...")
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
crypto/af_alg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 9fcb91ea10c41..5882ed46f1adb 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -851,6 +851,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
err = -EINVAL;
goto unlock;
}
+ ctx->init = true;
if (init) {
ctx->enc = enc;
@@ -858,7 +859,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
memcpy(ctx->iv, con.iv->iv, ivsize);
ctx->aead_assoclen = con.aead_assoclen;
- ctx->init = true;
}
while (size) {
--
2.25.1
From: Sibi Sankar <[email protected]>
commit 5b7be880074c73540948f8fc597e0407b98fabfa upstream.
Sometimes the stop triggers a watchdog rather than a stop-ack. Update
the running state to false on requesting stop to skip the watchdog
instead.
Error Logs:
$ echo stop > /sys/class/remoteproc/remoteproc0/state
ipa 1e40000.ipa: received modem stopping event
remoteproc-modem: watchdog received: sys_m_smsm_mpss.c:291:APPS force stop
qcom-q6v5-mss 4080000.remoteproc-modem: port failed halt
ipa 1e40000.ipa: received modem offline event
remoteproc0: stopped remote processor 4080000.remoteproc-modem
Reviewed-by: Evan Green <[email protected]>
Fixes: 3b415c8fb263 ("remoteproc: q6v5: Extract common resource handling")
Cc: [email protected]
Signed-off-by: Sibi Sankar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/remoteproc/qcom_q6v5.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/remoteproc/qcom_q6v5.c
+++ b/drivers/remoteproc/qcom_q6v5.c
@@ -153,6 +153,8 @@ int qcom_q6v5_request_stop(struct qcom_q
{
int ret;
+ q6v5->running = false;
+
qcom_smem_state_update_bits(q6v5->state,
BIT(q6v5->stop_bit), BIT(q6v5->stop_bit));
From: Qais Yousef <[email protected]>
[ Upstream commit e65855a52b479f98674998cb23b21ef5a8144b04 ]
The following splat was caught when setting uclamp value of a task:
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:49
cpus_read_lock+0x68/0x130
static_key_enable+0x1c/0x38
__sched_setscheduler+0x900/0xad8
Fix by ensuring we enable the key outside of the critical section in
__sched_setscheduler()
Fixes: 46609ce22703 ("sched/uclamp: Protect uclamp fast path code with static key")
Signed-off-by: Qais Yousef <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/sched/core.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index db1e99756c400..f788cd61df212 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1248,6 +1248,15 @@ static int uclamp_validate(struct task_struct *p,
if (upper_bound > SCHED_CAPACITY_SCALE)
return -EINVAL;
+ /*
+ * We have valid uclamp attributes; make sure uclamp is enabled.
+ *
+ * We need to do that here, because enabling static branches is a
+ * blocking operation which obviously cannot be done while holding
+ * scheduler locks.
+ */
+ static_branch_enable(&sched_uclamp_used);
+
return 0;
}
@@ -1278,8 +1287,6 @@ static void __setscheduler_uclamp(struct task_struct *p,
if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
return;
- static_branch_enable(&sched_uclamp_used);
-
if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
attr->sched_util_min, true);
--
2.25.1
From: Ahmad Fatoum <[email protected]>
commit 4f39d575844148fbf3081571a1f3b4ae04150958 upstream.
The flag indicating a watchdog timeout having occurred normally persists
till Power-On Reset of the Fintek Super I/O chip. The user can clear it
by writing a `1' to the bit.
The driver doesn't offer a restart method, so regular system reboot
might not reset the Super I/O and if the watchdog isn't enabled, we
won't touch the register containing the bit on the next boot.
In this case all subsequent regular reboots will be wrongly flagged
by the driver as being caused by the watchdog.
Fix this by having the flag cleared after read. This is also done by
other drivers like those for the i6300esb and mpc8xxx_wdt.
Fixes: b97cb21a4634 ("watchdog: f71808e_wdt: Fix WDTMOUT_STS register read")
Cc: [email protected]
Signed-off-by: Ahmad Fatoum <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/watchdog/f71808e_wdt.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/drivers/watchdog/f71808e_wdt.c
+++ b/drivers/watchdog/f71808e_wdt.c
@@ -706,6 +706,13 @@ static int __init watchdog_init(int sioa
wdt_conf = superio_inb(sioaddr, F71808FG_REG_WDT_CONF);
watchdog.caused_reboot = wdt_conf & BIT(F71808FG_FLAG_WDTMOUT_STS);
+ /*
+ * We don't want WDTMOUT_STS to stick around till regular reboot.
+ * Write 1 to the bit to clear it to zero.
+ */
+ superio_outb(sioaddr, F71808FG_REG_WDT_CONF,
+ wdt_conf | BIT(F71808FG_FLAG_WDTMOUT_STS));
+
superio_exit(sioaddr);
err = watchdog_set_timeout(timeout);
From: Sagi Grimberg <[email protected]>
[ Upstream commit ecca390e80561debbfdb4dc96bf94595136889fa ]
A deadlock happens in the following scenario with multipath:
1) scan_work(nvme0) detects a new nsid while nvme0
is an optimized path to it, path nvme1 happens to be
inaccessible.
2) Before scan_work is complete nvme0 disconnect is initiated
nvme_delete_ctrl_sync() sets nvme0 state to NVME_CTRL_DELETING
3) scan_work(1) attempts to submit IO,
but nvme_path_is_optimized() observes nvme0 is not LIVE.
Since nvme1 is a possible path IO is requeued and scan_work hangs.
--
Workqueue: nvme-wq nvme_scan_work [nvme_core]
kernel: Call Trace:
kernel: __schedule+0x2b9/0x6c0
kernel: schedule+0x42/0xb0
kernel: io_schedule+0x16/0x40
kernel: do_read_cache_page+0x438/0x830
kernel: read_cache_page+0x12/0x20
kernel: read_dev_sector+0x27/0xc0
kernel: read_lba+0xc1/0x220
kernel: efi_partition+0x1e6/0x708
kernel: check_partition+0x154/0x244
kernel: rescan_partitions+0xae/0x280
kernel: __blkdev_get+0x40f/0x560
kernel: blkdev_get+0x3d/0x140
kernel: __device_add_disk+0x388/0x480
kernel: device_add_disk+0x13/0x20
kernel: nvme_mpath_set_live+0x119/0x140 [nvme_core]
kernel: nvme_update_ns_ana_state+0x5c/0x60 [nvme_core]
kernel: nvme_set_ns_ana_state+0x1e/0x30 [nvme_core]
kernel: nvme_parse_ana_log+0xa1/0x180 [nvme_core]
kernel: nvme_mpath_add_disk+0x47/0x90 [nvme_core]
kernel: nvme_validate_ns+0x396/0x940 [nvme_core]
kernel: nvme_scan_work+0x24f/0x380 [nvme_core]
kernel: process_one_work+0x1db/0x380
kernel: worker_thread+0x249/0x400
kernel: kthread+0x104/0x140
--
4) Delete also hangs in flush_work(ctrl->scan_work)
from nvme_remove_namespaces().
Similiarly a deadlock with ana_work may happen: if ana_work has started
and calls nvme_mpath_set_live and device_add_disk, it will
trigger I/O. When we trigger disconnect I/O will block because
our accessible (optimized) path is disconnecting, but the alternate
path is inaccessible, so I/O blocks. Then disconnect tries to flush
the ana_work and hangs.
[ 605.550896] Workqueue: nvme-wq nvme_ana_work [nvme_core]
[ 605.552087] Call Trace:
[ 605.552683] __schedule+0x2b9/0x6c0
[ 605.553507] schedule+0x42/0xb0
[ 605.554201] io_schedule+0x16/0x40
[ 605.555012] do_read_cache_page+0x438/0x830
[ 605.556925] read_cache_page+0x12/0x20
[ 605.557757] read_dev_sector+0x27/0xc0
[ 605.558587] amiga_partition+0x4d/0x4c5
[ 605.561278] check_partition+0x154/0x244
[ 605.562138] rescan_partitions+0xae/0x280
[ 605.563076] __blkdev_get+0x40f/0x560
[ 605.563830] blkdev_get+0x3d/0x140
[ 605.564500] __device_add_disk+0x388/0x480
[ 605.565316] device_add_disk+0x13/0x20
[ 605.566070] nvme_mpath_set_live+0x5e/0x130 [nvme_core]
[ 605.567114] nvme_update_ns_ana_state+0x2c/0x30 [nvme_core]
[ 605.568197] nvme_update_ana_state+0xca/0xe0 [nvme_core]
[ 605.569360] nvme_parse_ana_log+0xa1/0x180 [nvme_core]
[ 605.571385] nvme_read_ana_log+0x76/0x100 [nvme_core]
[ 605.572376] nvme_ana_work+0x15/0x20 [nvme_core]
[ 605.573330] process_one_work+0x1db/0x380
[ 605.574144] worker_thread+0x4d/0x400
[ 605.574896] kthread+0x104/0x140
[ 605.577205] ret_from_fork+0x35/0x40
[ 605.577955] INFO: task nvme:14044 blocked for more than 120 seconds.
[ 605.579239] Tainted: G OE 5.3.5-050305-generic #201910071830
[ 605.580712] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 605.582320] nvme D 0 14044 14043 0x00000000
[ 605.583424] Call Trace:
[ 605.583935] __schedule+0x2b9/0x6c0
[ 605.584625] schedule+0x42/0xb0
[ 605.585290] schedule_timeout+0x203/0x2f0
[ 605.588493] wait_for_completion+0xb1/0x120
[ 605.590066] __flush_work+0x123/0x1d0
[ 605.591758] __cancel_work_timer+0x10e/0x190
[ 605.593542] cancel_work_sync+0x10/0x20
[ 605.594347] nvme_mpath_stop+0x2f/0x40 [nvme_core]
[ 605.595328] nvme_stop_ctrl+0x12/0x50 [nvme_core]
[ 605.596262] nvme_do_delete_ctrl+0x3f/0x90 [nvme_core]
[ 605.597333] nvme_sysfs_delete+0x5c/0x70 [nvme_core]
[ 605.598320] dev_attr_store+0x17/0x30
Fix this by introducing a new state: NVME_CTRL_DELETE_NOIO, which will
indicate the phase of controller deletion where I/O cannot be allowed
to access the namespace. NVME_CTRL_DELETING still allows mpath I/O to
be issued to the bottom device, and only after we flush the ana_work
and scan_work (after nvme_stop_ctrl and nvme_prep_remove_namespaces)
we change the state to NVME_CTRL_DELETING_NOIO. Also we prevent ana_work
from re-firing by aborting early if we are not LIVE, so we should be safe
here.
In addition, change the transport drivers to follow the updated state
machine.
Fixes: 0d0b660f214d ("nvme: add ANA support")
Reported-by: Anton Eidelman <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvme/host/core.c | 15 +++++++++++++++
drivers/nvme/host/fabrics.c | 2 +-
drivers/nvme/host/fabrics.h | 3 ++-
drivers/nvme/host/fc.c | 1 +
drivers/nvme/host/multipath.c | 18 +++++++++++++++---
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/rdma.c | 10 ++++++----
drivers/nvme/host/tcp.c | 15 +++++++++------
8 files changed, 50 insertions(+), 15 deletions(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4ee2330c603e7..f38548e6d55ec 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -362,6 +362,16 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
break;
}
break;
+ case NVME_CTRL_DELETING_NOIO:
+ switch (old_state) {
+ case NVME_CTRL_DELETING:
+ case NVME_CTRL_DEAD:
+ changed = true;
+ /* FALLTHRU */
+ default:
+ break;
+ }
+ break;
case NVME_CTRL_DEAD:
switch (old_state) {
case NVME_CTRL_DELETING:
@@ -399,6 +409,7 @@ static bool nvme_state_terminal(struct nvme_ctrl *ctrl)
case NVME_CTRL_CONNECTING:
return false;
case NVME_CTRL_DELETING:
+ case NVME_CTRL_DELETING_NOIO:
case NVME_CTRL_DEAD:
return true;
default:
@@ -3344,6 +3355,7 @@ static ssize_t nvme_sysfs_show_state(struct device *dev,
[NVME_CTRL_RESETTING] = "resetting",
[NVME_CTRL_CONNECTING] = "connecting",
[NVME_CTRL_DELETING] = "deleting",
+ [NVME_CTRL_DELETING_NOIO]= "deleting (no IO)",
[NVME_CTRL_DEAD] = "dead",
};
@@ -3911,6 +3923,9 @@ void nvme_remove_namespaces(struct nvme_ctrl *ctrl)
if (ctrl->state == NVME_CTRL_DEAD)
nvme_kill_queues(ctrl);
+ /* this is a no-op when called from the controller reset handler */
+ nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING_NOIO);
+
down_write(&ctrl->namespaces_rwsem);
list_splice_init(&ctrl->namespaces, &ns_list);
up_write(&ctrl->namespaces_rwsem);
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 2a6c8190eeb76..4ec4829d62334 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -547,7 +547,7 @@ static struct nvmf_transport_ops *nvmf_lookup_transport(
blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
struct request *rq)
{
- if (ctrl->state != NVME_CTRL_DELETING &&
+ if (ctrl->state != NVME_CTRL_DELETING_NOIO &&
ctrl->state != NVME_CTRL_DEAD &&
!blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
return BLK_STS_RESOURCE;
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index a0ec40ab62eeb..a9c1e3b4585ec 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -182,7 +182,8 @@ bool nvmf_ip_options_match(struct nvme_ctrl *ctrl,
static inline bool nvmf_check_ready(struct nvme_ctrl *ctrl, struct request *rq,
bool queue_live)
{
- if (likely(ctrl->state == NVME_CTRL_LIVE))
+ if (likely(ctrl->state == NVME_CTRL_LIVE ||
+ ctrl->state == NVME_CTRL_DELETING))
return true;
return __nvmf_check_ready(ctrl, rq, queue_live);
}
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index e999a8c4b7e87..549f5b0fb0b4b 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -825,6 +825,7 @@ nvme_fc_ctrl_connectivity_loss(struct nvme_fc_ctrl *ctrl)
break;
case NVME_CTRL_DELETING:
+ case NVME_CTRL_DELETING_NOIO:
default:
/* no action to take - let it delete */
break;
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 57d51148e71b6..2672953233434 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -167,9 +167,18 @@ void nvme_mpath_clear_ctrl_paths(struct nvme_ctrl *ctrl)
static bool nvme_path_is_disabled(struct nvme_ns *ns)
{
- return ns->ctrl->state != NVME_CTRL_LIVE ||
- test_bit(NVME_NS_ANA_PENDING, &ns->flags) ||
- test_bit(NVME_NS_REMOVING, &ns->flags);
+ /*
+ * We don't treat NVME_CTRL_DELETING as a disabled path as I/O should
+ * still be able to complete assuming that the controller is connected.
+ * Otherwise it will fail immediately and return to the requeue list.
+ */
+ if (ns->ctrl->state != NVME_CTRL_LIVE &&
+ ns->ctrl->state != NVME_CTRL_DELETING)
+ return true;
+ if (test_bit(NVME_NS_ANA_PENDING, &ns->flags) ||
+ test_bit(NVME_NS_REMOVING, &ns->flags))
+ return true;
+ return false;
}
static struct nvme_ns *__nvme_find_path(struct nvme_ns_head *head, int node)
@@ -574,6 +583,9 @@ static void nvme_ana_work(struct work_struct *work)
{
struct nvme_ctrl *ctrl = container_of(work, struct nvme_ctrl, ana_work);
+ if (ctrl->state != NVME_CTRL_LIVE)
+ return;
+
nvme_read_ana_log(ctrl);
}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 09ffc3246f60e..e268f1d7e1a0f 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -186,6 +186,7 @@ enum nvme_ctrl_state {
NVME_CTRL_RESETTING,
NVME_CTRL_CONNECTING,
NVME_CTRL_DELETING,
+ NVME_CTRL_DELETING_NOIO,
NVME_CTRL_DEAD,
};
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index af0cfd25ed7a4..876859cd14e86 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1082,11 +1082,12 @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);
if (!changed) {
/*
- * state change failure is ok if we're in DELETING state,
+ * state change failure is ok if we started ctrl delete,
* unless we're during creation of a new controller to
* avoid races with teardown flow.
*/
- WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING);
+ WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING &&
+ ctrl->ctrl.state != NVME_CTRL_DELETING_NOIO);
WARN_ON_ONCE(new);
ret = -EINVAL;
goto destroy_io;
@@ -1139,8 +1140,9 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
- /* state change failure is ok if we're in DELETING state */
- WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING);
+ /* state change failure is ok if we started ctrl delete */
+ WARN_ON_ONCE(ctrl->ctrl.state != NVME_CTRL_DELETING &&
+ ctrl->ctrl.state != NVME_CTRL_DELETING_NOIO);
return;
}
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 83bb329d4113a..a6d2e3330a584 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1929,11 +1929,12 @@ static int nvme_tcp_setup_ctrl(struct nvme_ctrl *ctrl, bool new)
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_LIVE)) {
/*
- * state change failure is ok if we're in DELETING state,
+ * state change failure is ok if we started ctrl delete,
* unless we're during creation of a new controller to
* avoid races with teardown flow.
*/
- WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING);
+ WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
+ ctrl->state != NVME_CTRL_DELETING_NOIO);
WARN_ON_ONCE(new);
ret = -EINVAL;
goto destroy_io;
@@ -1989,8 +1990,9 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
blk_mq_unquiesce_queue(ctrl->admin_q);
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
- /* state change failure is ok if we're in DELETING state */
- WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING);
+ /* state change failure is ok if we started ctrl delete */
+ WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
+ ctrl->state != NVME_CTRL_DELETING_NOIO);
return;
}
@@ -2025,8 +2027,9 @@ static void nvme_reset_ctrl_work(struct work_struct *work)
nvme_tcp_teardown_ctrl(ctrl, false);
if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_CONNECTING)) {
- /* state change failure is ok if we're in DELETING state */
- WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING);
+ /* state change failure is ok if we started ctrl delete */
+ WARN_ON_ONCE(ctrl->state != NVME_CTRL_DELETING &&
+ ctrl->state != NVME_CTRL_DELETING_NOIO);
return;
}
--
2.25.1
From: Dafna Hirschfeld <[email protected]>
[ Upstream commit c247818a873adcb8488021eed38c330ea8b288a3 ]
The macros 'RKISP1_DIR_*' are flags that indicate on which
pads of the isp subdevice the media bus code is supported. So the
prefix RKISP1_ISP_SD_ is better.
Signed-off-by: Dafna Hirschfeld <[email protected]>
Acked-by: Helen Koike <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/staging/media/rkisp1/rkisp1-isp.c | 46 +++++++++++------------
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/staging/media/rkisp1/rkisp1-isp.c b/drivers/staging/media/rkisp1/rkisp1-isp.c
index 93ba2dd2fcda0..abfedb604303f 100644
--- a/drivers/staging/media/rkisp1/rkisp1-isp.c
+++ b/drivers/staging/media/rkisp1/rkisp1-isp.c
@@ -23,8 +23,8 @@
#define RKISP1_ISP_DEV_NAME RKISP1_DRIVER_NAME "_isp"
-#define RKISP1_DIR_SRC BIT(0)
-#define RKISP1_DIR_SINK BIT(1)
+#define RKISP1_ISP_SD_SRC BIT(0)
+#define RKISP1_ISP_SD_SINK BIT(1)
/*
* NOTE: MIPI controller and input MUX are also configured in this file.
@@ -61,119 +61,119 @@ static const struct rkisp1_isp_mbus_info rkisp1_isp_formats[] = {
{
.mbus_code = MEDIA_BUS_FMT_YUYV8_2X8,
.pixel_enc = V4L2_PIXEL_ENC_YUV,
- .direction = RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SRGGB10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_RGGB,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SBGGR10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_BGGR,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGBRG10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_GBRG,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGRBG10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_GRBG,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SRGGB12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_RGGB,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SBGGR12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_BGGR,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGBRG12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_GBRG,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGRBG12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_GRBG,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SRGGB8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_RGGB,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SBGGR8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_BGGR,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGBRG8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_GBRG,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGRBG8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_GRBG,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
+ .direction = RKISP1_ISP_SD_SINK | RKISP1_ISP_SD_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_YUYV8_1X16,
.pixel_enc = V4L2_PIXEL_ENC_YUV,
.mipi_dt = RKISP1_CIF_CSI2_DT_YUV422_8b,
.yuv_seq = RKISP1_CIF_ISP_ACQ_PROP_YCBYCR,
.bus_width = 16,
- .direction = RKISP1_DIR_SINK,
+ .direction = RKISP1_ISP_SD_SINK,
}, {
.mbus_code = MEDIA_BUS_FMT_YVYU8_1X16,
.pixel_enc = V4L2_PIXEL_ENC_YUV,
.mipi_dt = RKISP1_CIF_CSI2_DT_YUV422_8b,
.yuv_seq = RKISP1_CIF_ISP_ACQ_PROP_YCRYCB,
.bus_width = 16,
- .direction = RKISP1_DIR_SINK,
+ .direction = RKISP1_ISP_SD_SINK,
}, {
.mbus_code = MEDIA_BUS_FMT_UYVY8_1X16,
.pixel_enc = V4L2_PIXEL_ENC_YUV,
.mipi_dt = RKISP1_CIF_CSI2_DT_YUV422_8b,
.yuv_seq = RKISP1_CIF_ISP_ACQ_PROP_CBYCRY,
.bus_width = 16,
- .direction = RKISP1_DIR_SINK,
+ .direction = RKISP1_ISP_SD_SINK,
}, {
.mbus_code = MEDIA_BUS_FMT_VYUY8_1X16,
.pixel_enc = V4L2_PIXEL_ENC_YUV,
.mipi_dt = RKISP1_CIF_CSI2_DT_YUV422_8b,
.yuv_seq = RKISP1_CIF_ISP_ACQ_PROP_CRYCBY,
.bus_width = 16,
- .direction = RKISP1_DIR_SINK,
+ .direction = RKISP1_ISP_SD_SINK,
},
};
@@ -573,9 +573,9 @@ static int rkisp1_isp_enum_mbus_code(struct v4l2_subdev *sd,
int pos = 0;
if (code->pad == RKISP1_ISP_PAD_SINK_VIDEO) {
- dir = RKISP1_DIR_SINK;
+ dir = RKISP1_ISP_SD_SINK;
} else if (code->pad == RKISP1_ISP_PAD_SOURCE_VIDEO) {
- dir = RKISP1_DIR_SRC;
+ dir = RKISP1_ISP_SD_SRC;
} else {
if (code->index > 0)
return -EINVAL;
@@ -660,7 +660,7 @@ static void rkisp1_isp_set_src_fmt(struct rkisp1_isp *isp,
src_fmt->code = format->code;
mbus_info = rkisp1_isp_mbus_info_get(src_fmt->code);
- if (!mbus_info || !(mbus_info->direction & RKISP1_DIR_SRC)) {
+ if (!mbus_info || !(mbus_info->direction & RKISP1_ISP_SD_SRC)) {
src_fmt->code = RKISP1_DEF_SRC_PAD_FMT;
mbus_info = rkisp1_isp_mbus_info_get(src_fmt->code);
}
@@ -744,7 +744,7 @@ static void rkisp1_isp_set_sink_fmt(struct rkisp1_isp *isp,
which);
sink_fmt->code = format->code;
mbus_info = rkisp1_isp_mbus_info_get(sink_fmt->code);
- if (!mbus_info || !(mbus_info->direction & RKISP1_DIR_SINK)) {
+ if (!mbus_info || !(mbus_info->direction & RKISP1_ISP_SD_SINK)) {
sink_fmt->code = RKISP1_DEF_SINK_PAD_FMT;
mbus_info = rkisp1_isp_mbus_info_get(sink_fmt->code);
}
--
2.25.1
From: Ming Lei <[email protected]>
[ Upstream commit e766668c6cd49d741cfb49eaeb38998ba34d27bc ]
dm_stop_queue() only uses blk_mq_quiesce_queue() so it doesn't
formally stop the blk-mq queue; therefore there is no point making the
blk_mq_queue_stopped() check -- it will never be stopped.
In addition, even though dm_stop_queue() actually tries to quiesce hw
queues via blk_mq_quiesce_queue(), checking with blk_queue_quiesced()
to avoid unnecessary queue quiesce isn't reliable because: the
QUEUE_FLAG_QUIESCED flag is set before synchronize_rcu() and
dm_stop_queue() may be called when synchronize_rcu() from another
blk_mq_quiesce_queue() is in-progress.
Fixes: 7b17c2f7292ba ("dm: Fix a race condition related to stopping and starting queues")
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/md/dm-rq.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 85e0daabad49c..20745e2e34b94 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -70,9 +70,6 @@ void dm_start_queue(struct request_queue *q)
void dm_stop_queue(struct request_queue *q)
{
- if (blk_mq_queue_stopped(q))
- return;
-
blk_mq_quiesce_queue(q);
}
--
2.25.1
From: Stafford Horne <[email protected]>
[ Upstream commit 57b8e277c33620e115633cdf700a260b55095460 ]
When dumping a stack with 'cat /proc/#/stack' the kernel would oops.
For example:
# cat /proc/690/stack
Unable to handle kernel access
at virtual address 0x7fc60f58
Oops#: 0000
CPU #: 0
PC: c00097fc SR: 0000807f SP: d6f09b9c
GPR00: 00000000 GPR01: d6f09b9c GPR02: d6f09bb8 GPR03: d6f09bc4
GPR04: 7fc60f5c GPR05: c00099b4 GPR06: 00000000 GPR07: d6f09ba3
GPR08: ffffff00 GPR09: c0009804 GPR10: d6f08000 GPR11: 00000000
GPR12: ffffe000 GPR13: dbb86000 GPR14: 00000001 GPR15: dbb86250
GPR16: 7fc60f63 GPR17: 00000f5c GPR18: d6f09bc4 GPR19: 00000000
GPR20: c00099b4 GPR21: ffffffc0 GPR22: 00000000 GPR23: 00000000
GPR24: 00000001 GPR25: 000002c6 GPR26: d78b6850 GPR27: 00000001
GPR28: 00000000 GPR29: dbb86000 GPR30: ffffffff GPR31: dbb862fc
RES: 00000000 oGPR11: ffffffff
Process cat (pid: 702, stackpage=d79d6000)
Stack:
Call trace:
[<598977f2>] save_stack_trace_tsk+0x40/0x74
[<95063f0e>] stack_trace_save_tsk+0x44/0x58
[<b557bfdd>] proc_pid_stack+0xd0/0x13c
[<a2df8eda>] proc_single_show+0x6c/0xf0
[<e5a737b7>] seq_read+0x1b4/0x688
[<2d6c7480>] do_iter_read+0x208/0x248
[<2182a2fb>] vfs_readv+0x64/0x90
This was caused by the stack trace code in save_stack_trace_tsk using
the wrong stack pointer. It was using the user stack pointer instead of
the kernel stack pointer. Fix this by using the right stack.
Also for good measure we add try_get_task_stack/put_task_stack to ensure
the task is not lost while we are walking it's stack.
Fixes: eecac38b0423a ("openrisc: support framepointers and STACKTRACE_SUPPORT")
Signed-off-by: Stafford Horne <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/openrisc/kernel/stacktrace.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/openrisc/kernel/stacktrace.c b/arch/openrisc/kernel/stacktrace.c
index 43f140a28bc72..54d38809e22cb 100644
--- a/arch/openrisc/kernel/stacktrace.c
+++ b/arch/openrisc/kernel/stacktrace.c
@@ -13,6 +13,7 @@
#include <linux/export.h>
#include <linux/sched.h>
#include <linux/sched/debug.h>
+#include <linux/sched/task_stack.h>
#include <linux/stacktrace.h>
#include <asm/processor.h>
@@ -68,12 +69,25 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
{
unsigned long *sp = NULL;
+ if (!try_get_task_stack(tsk))
+ return;
+
if (tsk == current)
sp = (unsigned long *) &sp;
- else
- sp = (unsigned long *) KSTK_ESP(tsk);
+ else {
+ unsigned long ksp;
+
+ /* Locate stack from kernel context */
+ ksp = task_thread_info(tsk)->ksp;
+ ksp += STACK_FRAME_OVERHEAD; /* redzone */
+ ksp += sizeof(struct pt_regs);
+
+ sp = (unsigned long *) ksp;
+ }
unwind_stack(trace, sp, save_stack_address_nosched);
+
+ put_task_stack(tsk);
}
EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
--
2.25.1
From: Nicolas Saenz Julienne <[email protected]>
[ Upstream commit f34e4651ce66a754f41203284acf09b28b9dd955 ]
Contrary to previous SoCs, bcm2711 doesn't have a prescaler in the PLL
feedback loop. Bypass it by zeroing fb_prediv_mask when running on
bcm2711.
Note that, since the prediv configuration bits were re-purposed, this
was triggering miscalculations on all clocks hanging from the VPU clock,
notably the aux UART, making its output unintelligible.
Fixes: 42de9ad400af ("clk: bcm2835: Add BCM2711_CLOCK_EMMC2 support")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Nicolas Saenz Julienne <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Nathan Chancellor <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/bcm/clk-bcm2835.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/clk/bcm/clk-bcm2835.c b/drivers/clk/bcm/clk-bcm2835.c
index 6bb7efa12037b..011802f1a6df9 100644
--- a/drivers/clk/bcm/clk-bcm2835.c
+++ b/drivers/clk/bcm/clk-bcm2835.c
@@ -314,6 +314,7 @@ struct bcm2835_cprman {
struct device *dev;
void __iomem *regs;
spinlock_t regs_lock; /* spinlock for all clocks */
+ unsigned int soc;
/*
* Real names of cprman clock parents looked up through
@@ -525,6 +526,20 @@ static int bcm2835_pll_is_on(struct clk_hw *hw)
A2W_PLL_CTRL_PRST_DISABLE;
}
+static u32 bcm2835_pll_get_prediv_mask(struct bcm2835_cprman *cprman,
+ const struct bcm2835_pll_data *data)
+{
+ /*
+ * On BCM2711 there isn't a pre-divisor available in the PLL feedback
+ * loop. Bits 13:14 of ANA1 (PLLA,PLLB,PLLC,PLLD) have been re-purposed
+ * for to for VCO RANGE bits.
+ */
+ if (cprman->soc & SOC_BCM2711)
+ return 0;
+
+ return data->ana->fb_prediv_mask;
+}
+
static void bcm2835_pll_choose_ndiv_and_fdiv(unsigned long rate,
unsigned long parent_rate,
u32 *ndiv, u32 *fdiv)
@@ -582,7 +597,7 @@ static unsigned long bcm2835_pll_get_rate(struct clk_hw *hw,
ndiv = (a2wctrl & A2W_PLL_CTRL_NDIV_MASK) >> A2W_PLL_CTRL_NDIV_SHIFT;
pdiv = (a2wctrl & A2W_PLL_CTRL_PDIV_MASK) >> A2W_PLL_CTRL_PDIV_SHIFT;
using_prediv = cprman_read(cprman, data->ana_reg_base + 4) &
- data->ana->fb_prediv_mask;
+ bcm2835_pll_get_prediv_mask(cprman, data);
if (using_prediv) {
ndiv *= 2;
@@ -665,6 +680,7 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw,
struct bcm2835_pll *pll = container_of(hw, struct bcm2835_pll, hw);
struct bcm2835_cprman *cprman = pll->cprman;
const struct bcm2835_pll_data *data = pll->data;
+ u32 prediv_mask = bcm2835_pll_get_prediv_mask(cprman, data);
bool was_using_prediv, use_fb_prediv, do_ana_setup_first;
u32 ndiv, fdiv, a2w_ctl;
u32 ana[4];
@@ -682,7 +698,7 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw,
for (i = 3; i >= 0; i--)
ana[i] = cprman_read(cprman, data->ana_reg_base + i * 4);
- was_using_prediv = ana[1] & data->ana->fb_prediv_mask;
+ was_using_prediv = ana[1] & prediv_mask;
ana[0] &= ~data->ana->mask0;
ana[0] |= data->ana->set0;
@@ -692,10 +708,10 @@ static int bcm2835_pll_set_rate(struct clk_hw *hw,
ana[3] |= data->ana->set3;
if (was_using_prediv && !use_fb_prediv) {
- ana[1] &= ~data->ana->fb_prediv_mask;
+ ana[1] &= ~prediv_mask;
do_ana_setup_first = true;
} else if (!was_using_prediv && use_fb_prediv) {
- ana[1] |= data->ana->fb_prediv_mask;
+ ana[1] |= prediv_mask;
do_ana_setup_first = false;
} else {
do_ana_setup_first = true;
@@ -2238,6 +2254,7 @@ static int bcm2835_clk_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, cprman);
cprman->onecell.num = asize;
+ cprman->soc = pdata->soc;
hws = cprman->onecell.hws;
for (i = 0; i < asize; i++) {
--
2.25.1
From: Jane Chu <[email protected]>
[ Upstream commit dad42d17558f316e9e807698cd4207359b636084 ]
commit d78c620a2e82 ("libnvdimm/security: Introduce a 'frozen' attribute")
introduced a typo, causing a 'nvdimm->sec.flags' update being overwritten
by the subsequent update meant for 'nvdimm->sec.ext_flags'.
Link: https://lore.kernel.org/r/[email protected]
Fixes: d78c620a2e82 ("libnvdimm/security: Introduce a 'frozen' attribute")
Cc: Dan Williams <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Signed-off-by: Jane Chu <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvdimm/security.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c
index 4cef69bd3c1bd..8f3971cf16541 100644
--- a/drivers/nvdimm/security.c
+++ b/drivers/nvdimm/security.c
@@ -457,7 +457,7 @@ void __nvdimm_security_overwrite_query(struct nvdimm *nvdimm)
clear_bit(NDD_WORK_PENDING, &nvdimm->flags);
put_device(&nvdimm->dev);
nvdimm->sec.flags = nvdimm_security_flags(nvdimm, NVDIMM_USER);
- nvdimm->sec.flags = nvdimm_security_flags(nvdimm, NVDIMM_MASTER);
+ nvdimm->sec.ext_flags = nvdimm_security_flags(nvdimm, NVDIMM_MASTER);
}
void nvdimm_security_overwrite_query(struct work_struct *work)
--
2.25.1
From: Jane Chu <[email protected]>
[ Upstream commit 7f674025d9f7321dea11b802cc0ab3f09cbe51c5 ]
commit 7d988097c546 ("acpi/nfit, libnvdimm/security: Add security DSM overwrite support")
adds a sysfs_notify_dirent() to wake up userspace poll thread when the "overwrite"
operation has completed. But the notification is issued before the internal
dimm security state and flags have been updated, so the userspace poll thread
wakes up and fetches the not-yet-updated attr and falls back to sleep, forever.
But if user from another terminal issue "ndctl wait-overwrite nmemX" again,
the command returns instantly.
Link: https://lore.kernel.org/r/[email protected]
Fixes: 7d988097c546 ("acpi/nfit, libnvdimm/security: Add security DSM overwrite support")
Cc: Dave Jiang <[email protected]>
Cc: Dan Williams <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Signed-off-by: Jane Chu <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/nvdimm/security.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/nvdimm/security.c b/drivers/nvdimm/security.c
index 8f3971cf16541..4b80150e4afa7 100644
--- a/drivers/nvdimm/security.c
+++ b/drivers/nvdimm/security.c
@@ -450,14 +450,19 @@ void __nvdimm_security_overwrite_query(struct nvdimm *nvdimm)
else
dev_dbg(&nvdimm->dev, "overwrite completed\n");
- if (nvdimm->sec.overwrite_state)
- sysfs_notify_dirent(nvdimm->sec.overwrite_state);
+ /*
+ * Mark the overwrite work done and update dimm security flags,
+ * then send a sysfs event notification to wake up userspace
+ * poll threads to picked up the changed state.
+ */
nvdimm->sec.overwrite_tmo = 0;
clear_bit(NDD_SECURITY_OVERWRITE, &nvdimm->flags);
clear_bit(NDD_WORK_PENDING, &nvdimm->flags);
- put_device(&nvdimm->dev);
nvdimm->sec.flags = nvdimm_security_flags(nvdimm, NVDIMM_USER);
nvdimm->sec.ext_flags = nvdimm_security_flags(nvdimm, NVDIMM_MASTER);
+ if (nvdimm->sec.overwrite_state)
+ sysfs_notify_dirent(nvdimm->sec.overwrite_state);
+ put_device(&nvdimm->dev);
}
void nvdimm_security_overwrite_query(struct work_struct *work)
--
2.25.1
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit bd8548d0dcdab514e08e35a3451667486d879dae ]
CONFIG_IOMEM does not exist. The correct symbol to depend on is
CONFIG_HAS_IOMEM.
Fixes: 1e7468bd9d30a21e ("clk: Specify IOMEM dependency for HSDK pll driver")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
index 326f91b2dda9f..5f952e111ab5a 100644
--- a/drivers/clk/Kconfig
+++ b/drivers/clk/Kconfig
@@ -50,7 +50,7 @@ source "drivers/clk/versatile/Kconfig"
config CLK_HSDK
bool "PLL Driver for HSDK platform"
depends on OF || COMPILE_TEST
- depends on IOMEM
+ depends on HAS_IOMEM
help
This driver supports the HSDK core, system, ddr, tunnel and hdmi PLLs
control.
--
2.25.1
From: Scott Mayhew <[email protected]>
[ Upstream commit 67dd23f9e6fbaf163431912ef5599c5e0693476c ]
nfs_wb_all() calls filemap_write_and_wait(), which uses
filemap_check_errors() to determine the error to return.
filemap_check_errors() only looks at the mapping->flags and will
therefore only return either -ENOSPC or -EIO. To ensure that the
correct error is returned on close(), nfs{,4}_file_flush() should call
filemap_check_wb_err() which looks at the errseq value in
mapping->wb_err without consuming it.
Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with
generic one")
Signed-off-by: Scott Mayhew <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/file.c | 5 ++++-
fs/nfs/nfs4file.c | 5 ++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index f96367a2463e3..d72496efa17b0 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -140,6 +140,7 @@ static int
nfs_file_flush(struct file *file, fl_owner_t id)
{
struct inode *inode = file_inode(file);
+ errseq_t since;
dprintk("NFS: flush(%pD2)\n", file);
@@ -148,7 +149,9 @@ nfs_file_flush(struct file *file, fl_owner_t id)
return 0;
/* Flush writes to the server and return any errors */
- return nfs_wb_all(inode);
+ since = filemap_sample_wb_err(file->f_mapping);
+ nfs_wb_all(inode);
+ return filemap_check_wb_err(file->f_mapping, since);
}
ssize_t
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index 8e5d6223ddd35..a339707654673 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -110,6 +110,7 @@ static int
nfs4_file_flush(struct file *file, fl_owner_t id)
{
struct inode *inode = file_inode(file);
+ errseq_t since;
dprintk("NFS: flush(%pD2)\n", file);
@@ -125,7 +126,9 @@ nfs4_file_flush(struct file *file, fl_owner_t id)
return filemap_fdatawrite(file->f_mapping);
/* Flush writes to the server and return any errors */
- return nfs_wb_all(inode);
+ since = filemap_sample_wb_err(file->f_mapping);
+ nfs_wb_all(inode);
+ return filemap_check_wb_err(file->f_mapping, since);
}
#ifdef CONFIG_NFS_V4_2
--
2.25.1
From: Zhihao Cheng <[email protected]>
[ Upstream commit c3fc1a3919e35a9d8157ed3ae6fd0a478293ba2c ]
ubi_wl_entry related with the fm_next_anchor PEB is not freed during
detach, which causes a memory leak.
Don't forget to release fm_next_anchor PEB while detaching ubi from
mtd when CONFIG_MTD_UBI_FASTMAP is enabled.
Signed-off-by: Zhihao Cheng <[email protected]>
Fixes: 4b68bf9a69d22d ("ubi: Select fastmap anchor PEBs considering...")
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mtd/ubi/fastmap-wl.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index 83afc00e365a5..28f55f9cf7153 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -381,6 +381,11 @@ static void ubi_fastmap_close(struct ubi_device *ubi)
ubi->fm_anchor = NULL;
}
+ if (ubi->fm_next_anchor) {
+ return_unused_peb(ubi, ubi->fm_next_anchor);
+ ubi->fm_next_anchor = NULL;
+ }
+
if (ubi->fm) {
for (i = 0; i < ubi->fm->used_blocks; i++)
kfree(ubi->fm->e[i]);
--
2.25.1
From: Ewan D. Milne <[email protected]>
[ Upstream commit af6de8c60fe9433afa73cea6fcccdccd98ad3e5e ]
We cannot wait on a completion object in the lpfc_nvme_targetport structure
in the _destroy_targetport() code path because the NVMe/fc transport will
free that structure immediately after the .targetport_delete() callback.
This results in a use-after-free, and a crash if slub_debug=FZPU is
enabled.
An earlier fix put put the completion on the stack, but commit 2a0fb340fcc8
("scsi: lpfc: Correct localport timeout duration error") subsequently
changed the code to reference the completion through a pointer in the
object rather than the local stack variable. Fix this by using the stack
variable directly.
Link: https://lore.kernel.org/r/[email protected]
Fixes: 2a0fb340fcc8 ("scsi: lpfc: Correct localport timeout duration error")
Reviewed-by: James Smart <[email protected]>
Signed-off-by: Ewan D. Milne <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/lpfc/lpfc_nvmet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/lpfc/lpfc_nvmet.c b/drivers/scsi/lpfc/lpfc_nvmet.c
index 88760416a8cbd..fcd9d4c2f1ee0 100644
--- a/drivers/scsi/lpfc/lpfc_nvmet.c
+++ b/drivers/scsi/lpfc/lpfc_nvmet.c
@@ -2112,7 +2112,7 @@ lpfc_nvmet_destroy_targetport(struct lpfc_hba *phba)
}
tgtp->tport_unreg_cmp = &tport_unreg_cmp;
nvmet_fc_unregister_targetport(phba->targetport);
- if (!wait_for_completion_timeout(tgtp->tport_unreg_cmp,
+ if (!wait_for_completion_timeout(&tport_unreg_cmp,
msecs_to_jiffies(LPFC_NVMET_WAIT_TMO)))
lpfc_printf_log(phba, KERN_ERR, LOG_NVME,
"6179 Unreg targetport x%px timeout "
--
2.25.1
From: Zhihao Cheng <[email protected]>
[ Upstream commit 3b185255bb2f34fa6927619b9ef27f192a3d9f5a ]
Following process triggers a memleak caused by forgetting to release the
initial next anchor PEB (CONFIG_MTD_UBI_FASTMAP is disabled):
1. attach -> __erase_worker -> produce the initial next anchor PEB
2. detach -> ubi_fastmap_close (Do nothing, it should have released the
initial next anchor PEB)
Don't produce the initial next anchor PEB in __erase_worker() when fastmap
is disabled.
Signed-off-by: Zhihao Cheng <[email protected]>
Suggested-by: Sascha Hauer <[email protected]>
Fixes: f9c34bb529975fe ("ubi: Fix producing anchor PEBs")
Reported-by: [email protected]
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mtd/ubi/wl.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 27636063ed1bb..42cac572f82dc 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -1086,7 +1086,8 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
if (!err) {
spin_lock(&ubi->wl_lock);
- if (!ubi->fm_next_anchor && e->pnum < UBI_FM_MAX_START) {
+ if (!ubi->fm_disabled && !ubi->fm_next_anchor &&
+ e->pnum < UBI_FM_MAX_START) {
/* Abort anchor production, if needed it will be
* enabled again in the wear leveling started below.
*/
--
2.25.1
From: Jonathan Marek <[email protected]>
[ Upstream commit c8b9002f44e4a1d2771b2f59f6de900864b1f9d7 ]
0x44 isn't a register offset, it is the value that goes into CAL_L_VAL.
Fixes: 548a909597d5 ("clk: qcom: clk-alpha-pll: Add support for Trion PLLs")
Signed-off-by: Jonathan Marek <[email protected]>
Tested-by: Dmitry Baryshkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/qcom/clk-alpha-pll.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/clk/qcom/clk-alpha-pll.c b/drivers/clk/qcom/clk-alpha-pll.c
index 9b2dfa08acb2a..1325139173c95 100644
--- a/drivers/clk/qcom/clk-alpha-pll.c
+++ b/drivers/clk/qcom/clk-alpha-pll.c
@@ -56,7 +56,6 @@
#define PLL_STATUS(p) ((p)->offset + (p)->regs[PLL_OFF_STATUS])
#define PLL_OPMODE(p) ((p)->offset + (p)->regs[PLL_OFF_OPMODE])
#define PLL_FRAC(p) ((p)->offset + (p)->regs[PLL_OFF_FRAC])
-#define PLL_CAL_VAL(p) ((p)->offset + (p)->regs[PLL_OFF_CAL_VAL])
const u8 clk_alpha_pll_regs[][PLL_OFF_MAX_REGS] = {
[CLK_ALPHA_PLL_TYPE_DEFAULT] = {
@@ -115,7 +114,6 @@ const u8 clk_alpha_pll_regs[][PLL_OFF_MAX_REGS] = {
[PLL_OFF_STATUS] = 0x30,
[PLL_OFF_OPMODE] = 0x38,
[PLL_OFF_ALPHA_VAL] = 0x40,
- [PLL_OFF_CAL_VAL] = 0x44,
},
[CLK_ALPHA_PLL_TYPE_LUCID] = {
[PLL_OFF_L_VAL] = 0x04,
--
2.25.1
From: Wolfram Sang <[email protected]>
[ Upstream commit c7c9e914f9a0478fba4dc6f227cfd69cf84a4063 ]
Due to the lockless design of the driver, it is theoretically possible
to access a NULL pointer, if a slave interrupt was running while we were
unregistering the slave. To make this rock solid, disable the interrupt
for a short time while we are clearing the interrupt_enable register.
This patch is purely based on code inspection. The OOPS is super-hard to
trigger because clearing SAR (the address) makes interrupts even more
unlikely to happen as well. While here, reinit SCR to SDBS because this
bit should always be set according to documentation. There is no effect,
though, because the interface is disabled.
Fixes: 7b814d852af6 ("i2c: rcar: avoid race when unregistering slave client")
Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Niklas Söderlund <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/i2c/busses/i2c-rcar.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
index 76c615be5acae..9e883474db8ce 100644
--- a/drivers/i2c/busses/i2c-rcar.c
+++ b/drivers/i2c/busses/i2c-rcar.c
@@ -866,12 +866,14 @@ static int rcar_unreg_slave(struct i2c_client *slave)
WARN_ON(!priv->slave);
- /* disable irqs and ensure none is running before clearing ptr */
+ /* ensure no irq is running before clearing ptr */
+ disable_irq(priv->irq);
rcar_i2c_write(priv, ICSIER, 0);
- rcar_i2c_write(priv, ICSCR, 0);
+ rcar_i2c_write(priv, ICSSR, 0);
+ enable_irq(priv->irq);
+ rcar_i2c_write(priv, ICSCR, SDBS);
rcar_i2c_write(priv, ICSAR, 0); /* Gen2: must be 0 if not using slave */
- synchronize_irq(priv->irq);
priv->slave = NULL;
pm_runtime_put(rcar_i2c_priv_to_dev(priv));
--
2.25.1
From: Scott Mayhew <[email protected]>
[ Upstream commit ce368536dd614452407dc31e2449eb84681a06af ]
The NFS_CONTEXT_ERROR_WRITE flag (as well as the check of said flag) was
removed by commit 6fbda89b257f. The absence of an error check allows
writes to be continually queued up for a server that may no longer be
able to handle them. Fix it by adding an error check using the generic
error reporting functions.
Fixes: 6fbda89b257f ("NFS: Replace custom error reporting mechanism with generic one")
Signed-off-by: Scott Mayhew <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/file.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index d72496efa17b0..63940a7a70be1 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -590,12 +590,14 @@ static const struct vm_operations_struct nfs_file_vm_ops = {
.page_mkwrite = nfs_vm_page_mkwrite,
};
-static int nfs_need_check_write(struct file *filp, struct inode *inode)
+static int nfs_need_check_write(struct file *filp, struct inode *inode,
+ int error)
{
struct nfs_open_context *ctx;
ctx = nfs_file_open_context(filp);
- if (nfs_ctx_key_to_expire(ctx, inode))
+ if (nfs_error_is_fatal_on_server(error) ||
+ nfs_ctx_key_to_expire(ctx, inode))
return 1;
return 0;
}
@@ -606,6 +608,8 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
struct inode *inode = file_inode(file);
unsigned long written = 0;
ssize_t result;
+ errseq_t since;
+ int error;
result = nfs_key_timeout_notify(file, inode);
if (result)
@@ -630,6 +634,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
if (iocb->ki_pos > i_size_read(inode))
nfs_revalidate_mapping(inode, file->f_mapping);
+ since = filemap_sample_wb_err(file->f_mapping);
nfs_start_io_write(inode);
result = generic_write_checks(iocb, from);
if (result > 0) {
@@ -648,7 +653,8 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
goto out;
/* Return error values */
- if (nfs_need_check_write(file, inode)) {
+ error = filemap_check_wb_err(file->f_mapping, since);
+ if (nfs_need_check_write(file, inode, error)) {
int err = nfs_wb_all(inode);
if (err < 0)
result = err;
--
2.25.1
From: Thomas Hebb <[email protected]>
[ Upstream commit e3232c2f39acafd5a29128425bc30b9884642cfa ]
commit c8c188679ccf ("tools build: Use the same CC for feature detection
and actual build") changed these assignments from unconditional (:=) to
conditional (?=) so that they wouldn't clobber values from the
environment. However, conditional assignment does not work properly for
variables that Make implicitly sets, among which are CC and CXX. To
quote tools/scripts/Makefile.include, which handles this properly:
# Makefiles suck: This macro sets a default value of $(2) for the
# variable named by $(1), unless the variable has been set by
# environment or command line. This is necessary for CC and AR
# because make sets default values, so the simpler ?= approach
# won't work as expected.
In other words, the conditional assignments will not run even if the
variables are not overridden in the environment; Make will set CC to
"cc" and CXX to "g++" when it starts[1], meaning the variables are not
empty by the time the conditional assignments are evaluated. This breaks
cross-compilation when CROSS_COMPILE is set but CC isn't, since "cc"
gets used for feature detection instead of the cross compiler (and
likewise for CXX).
To fix the issue, just pass down the values of CC and CXX computed by
the parent Makefile, which gets included by the Makefile that actually
builds whatever we're detecting features for and so is guaranteed to
have good values. This is a better solution anyway, since it means we
aren't trying to replicate the logic of the parent build system and so
don't risk it getting out of sync.
Leave PKG_CONFIG alone, since 1) there's no common logic to compute it
in Makefile.include, and 2) it's not an implicit variable, so
conditional assignment works properly.
[1] https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html
Fixes: c8c188679ccf ("tools build: Use the same CC for feature detection and actual build")
Signed-off-by: Thomas Hebb <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: David Carrillo-Cisneros <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Igor Lubashev <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Quentin Monnet <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: thomas hebb <[email protected]>
Link: http://lore.kernel.org/lkml/0a6e69d1736b0fa231a648f50b0cce5d8a6734ef.1595822871.git.tommyhebb@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/build/Makefile.feature | 2 +-
tools/build/feature/Makefile | 2 --
2 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index cb152370fdefd..774f0b0ca28ac 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -8,7 +8,7 @@ endif
feature_check = $(eval $(feature_check_code))
define feature_check_code
- feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
+ feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CC=$(CC) CXX=$(CXX) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
endef
feature_set = $(eval $(feature_set_code))
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index b1f0321180f5c..93b590d81209c 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -74,8 +74,6 @@ FILES= \
FILES := $(addprefix $(OUTPUT),$(FILES))
-CC ?= $(CROSS_COMPILE)gcc
-CXX ?= $(CROSS_COMPILE)g++
PKG_CONFIG ?= $(CROSS_COMPILE)pkg-config
LLVM_CONFIG ?= llvm-config
CLANG ?= clang
--
2.25.1
From: Liu Yi L <[email protected]>
[ Upstream commit 288d08e78008828416ffaa85ef274b4e29ef3dae ]
Address information for device TLB invalidation comes from userspace
when device is directly assigned to a guest with vIOMMU support.
VT-d requires page aligned address. This patch checks and enforce
address to be page aligned, otherwise reserved bits can be set in the
invalidation descriptor. Unrecoverable fault will be reported due to
non-zero value in the reserved bits.
Fixes: 61a06a16e36d8 ("iommu/vt-d: Support flushing more translation cache types")
Signed-off-by: Liu Yi L <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/intel/dmar.c | 21 +++++++++++++++++++--
1 file changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 16f47041f1bf5..ec23a2f0b5f8d 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1459,9 +1459,26 @@ void qi_flush_dev_iotlb_pasid(struct intel_iommu *iommu, u16 sid, u16 pfsid,
* Max Invs Pending (MIP) is set to 0 for now until we have DIT in
* ECAP.
*/
- desc.qw1 |= addr & ~mask;
- if (size_order)
+ if (addr & GENMASK_ULL(size_order + VTD_PAGE_SHIFT, 0))
+ pr_warn_ratelimited("Invalidate non-aligned address %llx, order %d\n",
+ addr, size_order);
+
+ /* Take page address */
+ desc.qw1 = QI_DEV_EIOTLB_ADDR(addr);
+
+ if (size_order) {
+ /*
+ * Existing 0s in address below size_order may be the least
+ * significant bit, we must set them to 1s to avoid having
+ * smaller size than desired.
+ */
+ desc.qw1 |= GENMASK_ULL(size_order + VTD_PAGE_SHIFT - 1,
+ VTD_PAGE_SHIFT);
+ /* Clear size_order bit to indicate size */
+ desc.qw1 &= ~mask;
+ /* Set the S bit to indicate flushing more than 1 page */
desc.qw1 |= QI_DEV_EIOTLB_SIZE;
+ }
qi_submit_sync(iommu, &desc, 1, 0);
}
--
2.25.1
From: Eric Biggers <[email protected]>
[ Upstream commit 0a12c4a8069607247cb8edc3b035a664e636fd9a ]
The minix filesystem reads its maximum file size from its on-disk
superblock. This value isn't necessarily a multiple of the block size.
When it's not, the V1 block mapping code doesn't allow mapping the last
possible block. Commit 6ed6a722f9ab ("minixfs: fix block limit check")
fixed this in the V2 mapping code. Fix it in the V1 mapping code too.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Qiujun Huang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/minix/itree_v1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/minix/itree_v1.c b/fs/minix/itree_v1.c
index c0d418209ead1..405573a79aab4 100644
--- a/fs/minix/itree_v1.c
+++ b/fs/minix/itree_v1.c
@@ -29,7 +29,7 @@ static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, inode->i_sb->s_bdev);
- } else if (block >= inode->i_sb->s_maxbytes/BLOCK_SIZE) {
+ } else if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes) {
if (printk_ratelimit())
printk("MINIX-fs: block_to_path: "
"block %ld too big on dev %pg\n",
--
2.25.1
From: Jin Yao <[email protected]>
[ Upstream commit c4735d990268399da9133b0ad445e488ece009ad ]
Since commit 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis"),
a dummy event is added to capture mmaps.
But if we run perf-record as,
# perf record -e cycles:p -IXMM0 -a -- sleep 1
Error:
dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
The issue is, if we enable the extended regs (-IXMM0), but the
pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel
will return -EOPNOTSUPP error.
See following code:
/* in kernel/events/core.c */
static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
{
....
if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
has_extended_regs(event))
ret = -EOPNOTSUPP;
....
}
For software dummy event, the PMU should not be set with
PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy
event has possibility to be set with PERF_REG_EXTENDED_MASK bit.
In evsel__config, /* tools/perf/util/evsel.c */
if (opts->sample_intr_regs) {
attr->sample_regs_intr = opts->sample_intr_regs;
}
If we use -IXMM0, the attr>sample_regs_intr will be set with
PERF_REG_EXTENDED_MASK bit.
It doesn't make sense to set attr->sample_regs_intr for a
software dummy event.
This patch adds dummy event checking before setting
attr->sample_regs_intr and attr->sample_regs_user.
After:
# ./perf record -e cycles:p -IXMM0 -a -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ]
Committer notes:
Adrian said this when providing his Acked-by:
"
This is fine. It will not break PT.
no_aux_samples is useful for evsels that have been added by the code rather
than requested by the user. For old kernels PT adds sched_switch tracepoint
to track context switches (before the current context switch event was
added) and having auxiliary sample information unnecessarily uses up space
in the perf buffer.
"
Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis")
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/perf/util/evsel.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index ef802f6d40c17..6a79cfdf96cb6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1014,12 +1014,14 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
if (callchain && callchain->enabled && !evsel->no_aux_samples)
evsel__config_callchain(evsel, opts, callchain);
- if (opts->sample_intr_regs && !evsel->no_aux_samples) {
+ if (opts->sample_intr_regs && !evsel->no_aux_samples &&
+ !evsel__is_dummy_event(evsel)) {
attr->sample_regs_intr = opts->sample_intr_regs;
evsel__set_sample_bit(evsel, REGS_INTR);
}
- if (opts->sample_user_regs && !evsel->no_aux_samples) {
+ if (opts->sample_user_regs && !evsel->no_aux_samples &&
+ !evsel__is_dummy_event(evsel)) {
attr->sample_regs_user |= opts->sample_user_regs;
evsel__set_sample_bit(evsel, REGS_USER);
}
--
2.25.1
From: Zhihao Cheng <[email protected]>
[ Upstream commit 094b6d1295474f338201b846a1f15e72eb0b12cf ]
There a wrong orphan node deleting in error handling path in
ubifs_jnl_update() and ubifs_jnl_rename(), which may cause
following error msg:
UBIFS error (ubi0:0 pid 1522): ubifs_delete_orphan [ubifs]:
missing orphan ino 65
Fix this by checking whether the node has been operated for
adding to orphan list before being deleted,
Signed-off-by: Zhihao Cheng <[email protected]>
Fixes: 823838a486888cf484e ("ubifs: Add hashes to the tree node cache")
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ubifs/journal.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index e5ec1afe1c668..2cf05f87565c2 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -539,7 +539,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
const struct fscrypt_name *nm, const struct inode *inode,
int deletion, int xent)
{
- int err, dlen, ilen, len, lnum, ino_offs, dent_offs;
+ int err, dlen, ilen, len, lnum, ino_offs, dent_offs, orphan_added = 0;
int aligned_dlen, aligned_ilen, sync = IS_DIRSYNC(dir);
int last_reference = !!(deletion && inode->i_nlink == 0);
struct ubifs_inode *ui = ubifs_inode(inode);
@@ -630,6 +630,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
goto out_finish;
}
ui->del_cmtno = c->cmt_no;
+ orphan_added = 1;
}
err = write_head(c, BASEHD, dent, len, &lnum, &dent_offs, sync);
@@ -702,7 +703,7 @@ int ubifs_jnl_update(struct ubifs_info *c, const struct inode *dir,
kfree(dent);
out_ro:
ubifs_ro_mode(c, err);
- if (last_reference)
+ if (orphan_added)
ubifs_delete_orphan(c, inode->i_ino);
finish_reservation(c);
return err;
@@ -1218,7 +1219,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
void *p;
union ubifs_key key;
struct ubifs_dent_node *dent, *dent2;
- int err, dlen1, dlen2, ilen, lnum, offs, len;
+ int err, dlen1, dlen2, ilen, lnum, offs, len, orphan_added = 0;
int aligned_dlen1, aligned_dlen2, plen = UBIFS_INO_NODE_SZ;
int last_reference = !!(new_inode && new_inode->i_nlink == 0);
int move = (old_dir != new_dir);
@@ -1334,6 +1335,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
goto out_finish;
}
new_ui->del_cmtno = c->cmt_no;
+ orphan_added = 1;
}
err = write_head(c, BASEHD, dent, len, &lnum, &offs, sync);
@@ -1415,7 +1417,7 @@ int ubifs_jnl_rename(struct ubifs_info *c, const struct inode *old_dir,
release_head(c, BASEHD);
out_ro:
ubifs_ro_mode(c, err);
- if (last_reference)
+ if (orphan_added)
ubifs_delete_orphan(c, new_inode->i_ino);
out_finish:
finish_reservation(c);
--
2.25.1
From: Rayagonda Kokatanur <[email protected]>
[ Upstream commit 6ced5ff0be8e94871ba846dfbddf69d21363f3d7 ]
Handle clk_get_rate() returning 0 to avoid possible division by zero.
Fixes: daa5abc41c80 ("pwm: Add support for Broadcom iProc PWM controller")
Signed-off-by: Rayagonda Kokatanur <[email protected]>
Signed-off-by: Scott Branden <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Reviewed-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Thierry Reding <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pwm/pwm-bcm-iproc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/pwm/pwm-bcm-iproc.c b/drivers/pwm/pwm-bcm-iproc.c
index 1f829edd8ee70..d392a828fc493 100644
--- a/drivers/pwm/pwm-bcm-iproc.c
+++ b/drivers/pwm/pwm-bcm-iproc.c
@@ -85,8 +85,6 @@ static void iproc_pwmc_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
u64 tmp, multi, rate;
u32 value, prescale;
- rate = clk_get_rate(ip->clk);
-
value = readl(ip->base + IPROC_PWM_CTRL_OFFSET);
if (value & BIT(IPROC_PWM_CTRL_EN_SHIFT(pwm->hwpwm)))
@@ -99,6 +97,13 @@ static void iproc_pwmc_get_state(struct pwm_chip *chip, struct pwm_device *pwm,
else
state->polarity = PWM_POLARITY_INVERSED;
+ rate = clk_get_rate(ip->clk);
+ if (rate == 0) {
+ state->period = 0;
+ state->duty_cycle = 0;
+ return;
+ }
+
value = readl(ip->base + IPROC_PWM_PRESCALE_OFFSET);
prescale = value >> IPROC_PWM_PRESCALE_SHIFT(pwm->hwpwm);
prescale &= IPROC_PWM_PRESCALE_MAX;
--
2.25.1
From: Stylon Wang <[email protected]>
commit c5892a10218214d729699ab61bad6fc109baf0ce upstream.
[Why]
Setting abm level does not correctly update CRTC state. As a result
no surface update is added to dc stream state and triggers warning.
[How]
Correctly update CRTC state when setting abm level property.
CC: Stable <[email protected]>
Signed-off-by: Stylon Wang <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Eryk Brol <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 23 ++++++++++++++++++++++
1 file changed, 23 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8686,6 +8686,29 @@ static int amdgpu_dm_atomic_check(struct
if (ret)
goto fail;
+ /* Check connector changes */
+ for_each_oldnew_connector_in_state(state, connector, old_con_state, new_con_state, i) {
+ struct dm_connector_state *dm_old_con_state = to_dm_connector_state(old_con_state);
+ struct dm_connector_state *dm_new_con_state = to_dm_connector_state(new_con_state);
+
+ /* Skip connectors that are disabled or part of modeset already. */
+ if (!old_con_state->crtc && !new_con_state->crtc)
+ continue;
+
+ if (!new_con_state->crtc)
+ continue;
+
+ new_crtc_state = drm_atomic_get_crtc_state(state, new_con_state->crtc);
+ if (IS_ERR(new_crtc_state)) {
+ ret = PTR_ERR(new_crtc_state);
+ goto fail;
+ }
+
+ if (dm_old_con_state->abm_level !=
+ dm_new_con_state->abm_level)
+ new_crtc_state->connectors_changed = true;
+ }
+
#if defined(CONFIG_DRM_AMD_DC_DCN)
if (!compute_mst_dsc_configs_for_state(state, dm_state->context))
goto fail;
From: Xu Wang <[email protected]>
[ Upstream commit 12b90b40854a8461a02ef19f6f4474cc88d64b66 ]
In case of error, the function clk_register() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().
Signed-off-by: Xu Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Barry Song <[email protected]>
Fixes: 7bf21bc81f28 ("clk: sirf: re-arch to make the codes support both prima2 and atlas6")
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/sirf/clk-atlas6.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/sirf/clk-atlas6.c b/drivers/clk/sirf/clk-atlas6.c
index c84d5bab7ac28..b95483bb6a5ec 100644
--- a/drivers/clk/sirf/clk-atlas6.c
+++ b/drivers/clk/sirf/clk-atlas6.c
@@ -135,7 +135,7 @@ static void __init atlas6_clk_init(struct device_node *np)
for (i = pll1; i < maxclk; i++) {
atlas6_clks[i] = clk_register(NULL, atlas6_clk_hw_array[i]);
- BUG_ON(!atlas6_clks[i]);
+ BUG_ON(IS_ERR(atlas6_clks[i]));
}
clk_register_clkdev(atlas6_clks[cpu], NULL, "cpu");
clk_register_clkdev(atlas6_clks[io], NULL, "io");
--
2.25.1
From: Jiri Olsa <[email protected]>
[ Upstream commit 4929e95a1400e45b4b5a87fd3ce10273444187d4 ]
Jin Yao reported issue with possible conflict between raw events and
term values in pmu event syntax.
Currently following syntax is resolved as raw event with 0xead value:
uncore_imc_free_running/read/
instead of using 'read' term from uncore_imc_free_running pmu, because
'read' is correct raw event syntax with 0xead value.
To solve this issue we do following:
- check existing terms during rXXXX syntax processing
and make them priority in case of conflict
- allow pmu/r0x1234/ syntax to be able to specify conflicting
raw event (implemented in previous patch)
Also add automated tests for this and perf_pmu__parse_cleanup call to
parse_events_terms, so the test gets properly cleaned up.
Fixes: 3a6c51e4d66c ("perf parser: Add support to specify rXXX event with pmu")
Reported-by: Jin Yao <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Jin Yao <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/perf/tests/parse-events.c | 37 ++++++++++++++++++++++++++++++++-
tools/perf/util/parse-events.c | 28 +++++++++++++++++++++++++
tools/perf/util/parse-events.h | 2 ++
tools/perf/util/parse-events.l | 19 ++++++++++-------
4 files changed, 77 insertions(+), 9 deletions(-)
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 895188b63f963..6a2ec6ec0d0ef 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -631,6 +631,34 @@ static int test__checkterms_simple(struct list_head *terms)
TEST_ASSERT_VAL("wrong val", term->val.num == 1);
TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "umask"));
+ /*
+ * read
+ *
+ * The perf_pmu__test_parse_init injects 'read' term into
+ * perf_pmu_events_list, so 'read' is evaluated as read term
+ * and not as raw event with 'ead' hex value.
+ */
+ term = list_entry(term->list.next, struct parse_events_term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_USER);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+ TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "read"));
+
+ /*
+ * r0xead
+ *
+ * To be still able to pass 'ead' value with 'r' syntax,
+ * we added support to parse 'r0xHEX' event.
+ */
+ term = list_entry(term->list.next, struct parse_events_term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_CONFIG);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 0xead);
+ TEST_ASSERT_VAL("wrong config", !term->config);
return 0;
}
@@ -1776,7 +1804,7 @@ struct terms_test {
static struct terms_test test__terms[] = {
[0] = {
- .str = "config=10,config1,config2=3,umask=1",
+ .str = "config=10,config1,config2=3,umask=1,read,r0xead",
.check = test__checkterms_simple,
},
};
@@ -1836,6 +1864,13 @@ static int test_term(struct terms_test *t)
INIT_LIST_HEAD(&terms);
+ /*
+ * The perf_pmu__test_parse_init prepares perf_pmu_events_list
+ * which gets freed in parse_events_terms.
+ */
+ if (perf_pmu__test_parse_init())
+ return -1;
+
ret = parse_events_terms(&terms, t->str);
if (ret) {
pr_debug("failed to parse terms '%s', err %d\n",
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 3decbb203846a..4476de0e678aa 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2017,6 +2017,32 @@ static void perf_pmu__parse_init(void)
perf_pmu__parse_cleanup();
}
+/*
+ * This function injects special term in
+ * perf_pmu_events_list so the test code
+ * can check on this functionality.
+ */
+int perf_pmu__test_parse_init(void)
+{
+ struct perf_pmu_event_symbol *list;
+
+ list = malloc(sizeof(*list) * 1);
+ if (!list)
+ return -ENOMEM;
+
+ list->type = PMU_EVENT_SYMBOL;
+ list->symbol = strdup("read");
+
+ if (!list->symbol) {
+ free(list);
+ return -ENOMEM;
+ }
+
+ perf_pmu_events_list = list;
+ perf_pmu_events_list_num = 1;
+ return 0;
+}
+
enum perf_pmu_event_symbol_type
perf_pmu__parse_check(const char *name)
{
@@ -2078,6 +2104,8 @@ int parse_events_terms(struct list_head *terms, const char *str)
int ret;
ret = parse_events__scanner(str, &parse_state);
+ perf_pmu__parse_cleanup();
+
if (!ret) {
list_splice(parse_state.terms, terms);
zfree(&parse_state.terms);
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 1fe23a2f9b36e..0b8cdb7270f04 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -253,4 +253,6 @@ static inline bool is_sdt_event(char *str __maybe_unused)
}
#endif /* HAVE_LIBELF_SUPPORT */
+int perf_pmu__test_parse_init(void);
+
#endif /* __PERF_PARSE_EVENTS_H */
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 002802e17059e..7332d16cb4fc7 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -41,14 +41,6 @@ static int value(yyscan_t scanner, int base)
return __value(yylval, text, base, PE_VALUE);
}
-static int raw(yyscan_t scanner)
-{
- YYSTYPE *yylval = parse_events_get_lval(scanner);
- char *text = parse_events_get_text(scanner);
-
- return __value(yylval, text + 1, 16, PE_RAW);
-}
-
static int str(yyscan_t scanner, int token)
{
YYSTYPE *yylval = parse_events_get_lval(scanner);
@@ -72,6 +64,17 @@ static int str(yyscan_t scanner, int token)
return token;
}
+static int raw(yyscan_t scanner)
+{
+ YYSTYPE *yylval = parse_events_get_lval(scanner);
+ char *text = parse_events_get_text(scanner);
+
+ if (perf_pmu__parse_check(text) == PMU_EVENT_SYMBOL)
+ return str(scanner, PE_NAME);
+
+ return __value(yylval, text + 1, 16, PE_RAW);
+}
+
static bool isbpf_suffix(char *text)
{
int len = strlen(text);
--
2.25.1
From: Sandeep Raghuraman <[email protected]>
commit f87812284172a9809820d10143b573d833cd3f75 upstream.
Reproducing bug report here:
After hibernating and resuming, DPM is not enabled. This remains the case
even if you test hibernate using the steps here:
https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html
I debugged the problem, and figured out that in the file hardwaremanager.c,
in the function, phm_enable_dynamic_state_management(), the check
'if (!hwmgr->pp_one_vf && smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev) && adev->in_suspend)'
returns true for the hibernate case, and false for the suspend case.
This means that for the hibernate case, the AMDGPU driver doesn't enable DPM
(even though it should) and simply returns from that function.
In the suspend case, it goes ahead and enables DPM, even though it doesn't need to.
I debugged further, and found out that in the case of suspend, for the
CIK/Hawaii GPUs, smum_is_dpm_running(hwmgr) returns false, while in the case of
hibernate, smum_is_dpm_running(hwmgr) returns true.
For CIK, the ci_is_dpm_running() function calls the ci_is_smc_ram_running() function,
which is ultimately used to determine if DPM is currently enabled or not,
and this seems to provide the wrong answer.
I've changed the ci_is_dpm_running() function to instead use the same method that
some other AMD GPU chips do (e.g Fiji), which seems to read the voltage controller.
I've tested on my R9 390 and it seems to work correctly for both suspend and
hibernate use cases, and has been stable so far.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208839
Signed-off-by: Sandeep Raghuraman <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/ci_smumgr.c
@@ -2725,7 +2725,10 @@ static int ci_initialize_mc_reg_table(st
static bool ci_is_dpm_running(struct pp_hwmgr *hwmgr)
{
- return ci_is_smc_ram_running(hwmgr);
+ return (1 == PHM_READ_INDIRECT_FIELD(hwmgr->device,
+ CGS_IND_REG__SMC, FEATURE_STATUS,
+ VOLTAGE_CONTROLLER_ON))
+ ? true : false;
}
static int ci_smu_init(struct pp_hwmgr *hwmgr)
From: Eric Biggers <[email protected]>
[ Upstream commit f666f9fb9a36f1c833b9d18923572f0e4d304754 ]
When truncating a file to a size within the last allowed logical block,
block_to_path() is called with the *next* block. This exceeds the limit,
causing the "block %ld too big" error message to be printed.
This case isn't actually an error; there are just no more blocks past that
point. So, remove this error message.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Qiujun Huang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/minix/itree_v1.c | 12 ++++++------
fs/minix/itree_v2.c | 12 ++++++------
2 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/fs/minix/itree_v1.c b/fs/minix/itree_v1.c
index 405573a79aab4..1fed906042aa8 100644
--- a/fs/minix/itree_v1.c
+++ b/fs/minix/itree_v1.c
@@ -29,12 +29,12 @@ static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, inode->i_sb->s_bdev);
- } else if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes) {
- if (printk_ratelimit())
- printk("MINIX-fs: block_to_path: "
- "block %ld too big on dev %pg\n",
- block, inode->i_sb->s_bdev);
- } else if (block < 7) {
+ return 0;
+ }
+ if ((u64)block * BLOCK_SIZE >= inode->i_sb->s_maxbytes)
+ return 0;
+
+ if (block < 7) {
offsets[n++] = block;
} else if ((block -= 7) < 512) {
offsets[n++] = 7;
diff --git a/fs/minix/itree_v2.c b/fs/minix/itree_v2.c
index ee8af2f9e2828..9d00f31a2d9d1 100644
--- a/fs/minix/itree_v2.c
+++ b/fs/minix/itree_v2.c
@@ -32,12 +32,12 @@ static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, sb->s_bdev);
- } else if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes) {
- if (printk_ratelimit())
- printk("MINIX-fs: block_to_path: "
- "block %ld too big on dev %pg\n",
- block, sb->s_bdev);
- } else if (block < DIRCOUNT) {
+ return 0;
+ }
+ if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes)
+ return 0;
+
+ if (block < DIRCOUNT) {
offsets[n++] = block;
} else if ((block -= DIRCOUNT) < INDIRCOUNT(sb)) {
offsets[n++] = DIRCOUNT;
--
2.25.1
From: Imre Deak <[email protected]>
commit 58c1721787be8a6ff28b4e5b6ce395915476871e upstream.
This fixes the following use-after-free problem in case an MST down
message times out, while waiting for the response for it:
[ 449.022841] [drm:drm_dp_mst_wait_tx_reply.isra.26] timedout msg send 0000000080ba7fa2 2 0
[ 449.022898] ------------[ cut here ]------------
[ 449.022903] list_add corruption. prev->next should be next (ffff88847dae32c0), but was 6b6b6b6b6b6b6b6b. (prev=ffff88847db1c140).
[ 449.022931] WARNING: CPU: 2 PID: 22 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
[ 449.022935] Modules linked in: asix usbnet mii snd_hda_codec_hdmi mei_hdcp i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep e1000e snd_hda_core ptp snd_pcm pps_core mei_me mei intel_lpss_pci prime_numbers
[ 449.022966] CPU: 2 PID: 22 Comm: kworker/2:0 Not tainted 5.7.0-rc3-CI-Patchwork_17536+ #1
[ 449.022970] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.2457.A16.1912270059 12/27/2019
[ 449.022976] Workqueue: events_long drm_dp_mst_link_probe_work
[ 449.022982] RIP: 0010:__list_add_valid+0x4d/0x70
[ 449.022987] Code: c3 48 89 d1 48 c7 c7 f0 e7 32 82 48 89 c2 e8 3a 49 b7 ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 40 e8 32 82 e8 23 49 b7 ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 90 e8 32 82 e8
[ 449.022991] RSP: 0018:ffffc900001abcb0 EFLAGS: 00010286
[ 449.022995] RAX: 0000000000000000 RBX: ffff88847dae2d58 RCX: 0000000000000001
[ 449.022999] RDX: 0000000080000001 RSI: ffff88849d914978 RDI: 00000000ffffffff
[ 449.023002] RBP: ffff88847dae32c0 R08: ffff88849d914978 R09: 0000000000000000
[ 449.023006] R10: ffffc900001abcb8 R11: 0000000000000000 R12: ffff888490d98400
[ 449.023009] R13: ffff88847dae3230 R14: ffff88847db1c140 R15: ffff888490d98540
[ 449.023013] FS: 0000000000000000(0000) GS:ffff88849ff00000(0000) knlGS:0000000000000000
[ 449.023017] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 449.023021] CR2: 00007fb96fafdc63 CR3: 0000000005610004 CR4: 0000000000760ee0
[ 449.023025] PKRU: 55555554
[ 449.023028] Call Trace:
[ 449.023034] drm_dp_queue_down_tx+0x59/0x110
[ 449.023041] ? rcu_read_lock_sched_held+0x4d/0x80
[ 449.023050] ? kmem_cache_alloc_trace+0x2a6/0x2d0
[ 449.023060] drm_dp_send_link_address+0x74/0x870
[ 449.023065] ? __slab_free+0x3e1/0x5c0
[ 449.023071] ? lockdep_hardirqs_on+0xe0/0x1c0
[ 449.023078] ? lockdep_hardirqs_on+0xe0/0x1c0
[ 449.023097] drm_dp_check_and_send_link_address+0x9a/0xc0
[ 449.023106] drm_dp_mst_link_probe_work+0x9e/0x160
[ 449.023117] process_one_work+0x268/0x600
[ 449.023124] ? __schedule+0x307/0x8d0
[ 449.023139] worker_thread+0x37/0x380
[ 449.023149] ? process_one_work+0x600/0x600
[ 449.023153] kthread+0x140/0x160
[ 449.023159] ? kthread_park+0x80/0x80
[ 449.023169] ret_from_fork+0x24/0x50
Fixes: d308a881a591 ("drm/dp_mst: Kill the second sideband tx slot, save the world")
Cc: Lyude Paul <[email protected]>
Cc: Sean Paul <[email protected]>
Cc: Wayne Lin <[email protected]>
Cc: <[email protected]> # v3.17+
Signed-off-by: Imre Deak <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/drm_dp_mst_topology.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1197,7 +1197,8 @@ static int drm_dp_mst_wait_tx_reply(stru
/* remove from q */
if (txmsg->state == DRM_DP_SIDEBAND_TX_QUEUED ||
- txmsg->state == DRM_DP_SIDEBAND_TX_START_SEND)
+ txmsg->state == DRM_DP_SIDEBAND_TX_START_SEND ||
+ txmsg->state == DRM_DP_SIDEBAND_TX_SENT)
list_del(&txmsg->next);
}
out:
From: Imre Deak <[email protected]>
commit d8bd15b37d328a935a4fc695fed8b19157503950 upstream.
During the initial MST probing an MST port's I2C device will be
registered using the kdev of the DRM device as a parent. Later after MST
Connection Status Notifications this I2C device will be re-registered
with the kdev of the port's connector. This will also move
inconsistently the I2C device's sysfs entry from the DRM device's sysfs
dir to the connector's dir.
Fix the above by keeping the DRM kdev as the parent of the I2C device.
Ideally the connector's kdev would be used as a parent, similarly to
non-MST connectors, however that needs some more refactoring to ensure
the connector's kdev is already available early enough. So keep the
existing (initial) behavior for now.
Cc: <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Reviewed-by: Stanislav Lisovskiy <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/drm_dp_mst_topology.c | 28 ++++++++++++++++------------
1 file changed, 16 insertions(+), 12 deletions(-)
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -88,8 +88,8 @@ static int drm_dp_send_enum_path_resourc
static bool drm_dp_validate_guid(struct drm_dp_mst_topology_mgr *mgr,
u8 *guid);
-static int drm_dp_mst_register_i2c_bus(struct drm_dp_aux *aux);
-static void drm_dp_mst_unregister_i2c_bus(struct drm_dp_aux *aux);
+static int drm_dp_mst_register_i2c_bus(struct drm_dp_mst_port *port);
+static void drm_dp_mst_unregister_i2c_bus(struct drm_dp_mst_port *port);
static void drm_dp_mst_kick_tx(struct drm_dp_mst_topology_mgr *mgr);
#define DBG_PREFIX "[dp_mst]"
@@ -1966,7 +1966,7 @@ drm_dp_port_set_pdt(struct drm_dp_mst_po
}
/* remove i2c over sideband */
- drm_dp_mst_unregister_i2c_bus(&port->aux);
+ drm_dp_mst_unregister_i2c_bus(port);
} else {
mutex_lock(&mgr->lock);
drm_dp_mst_topology_put_mstb(port->mstb);
@@ -1981,7 +1981,7 @@ drm_dp_port_set_pdt(struct drm_dp_mst_po
if (port->pdt != DP_PEER_DEVICE_NONE) {
if (drm_dp_mst_is_end_device(port->pdt, port->mcs)) {
/* add i2c over sideband */
- ret = drm_dp_mst_register_i2c_bus(&port->aux);
+ ret = drm_dp_mst_register_i2c_bus(port);
} else {
lct = drm_dp_calculate_rad(port, rad);
mstb = drm_dp_add_mst_branch_device(lct, rad);
@@ -5346,22 +5346,26 @@ static const struct i2c_algorithm drm_dp
/**
* drm_dp_mst_register_i2c_bus() - register an I2C adapter for I2C-over-AUX
- * @aux: DisplayPort AUX channel
+ * @port: The port to add the I2C bus on
*
* Returns 0 on success or a negative error code on failure.
*/
-static int drm_dp_mst_register_i2c_bus(struct drm_dp_aux *aux)
+static int drm_dp_mst_register_i2c_bus(struct drm_dp_mst_port *port)
{
+ struct drm_dp_aux *aux = &port->aux;
+ struct device *parent_dev = port->mgr->dev->dev;
+
aux->ddc.algo = &drm_dp_mst_i2c_algo;
aux->ddc.algo_data = aux;
aux->ddc.retries = 3;
aux->ddc.class = I2C_CLASS_DDC;
aux->ddc.owner = THIS_MODULE;
- aux->ddc.dev.parent = aux->dev;
- aux->ddc.dev.of_node = aux->dev->of_node;
+ /* FIXME: set the kdev of the port's connector as parent */
+ aux->ddc.dev.parent = parent_dev;
+ aux->ddc.dev.of_node = parent_dev->of_node;
- strlcpy(aux->ddc.name, aux->name ? aux->name : dev_name(aux->dev),
+ strlcpy(aux->ddc.name, aux->name ? aux->name : dev_name(parent_dev),
sizeof(aux->ddc.name));
return i2c_add_adapter(&aux->ddc);
@@ -5369,11 +5373,11 @@ static int drm_dp_mst_register_i2c_bus(s
/**
* drm_dp_mst_unregister_i2c_bus() - unregister an I2C-over-AUX adapter
- * @aux: DisplayPort AUX channel
+ * @port: The port to remove the I2C bus from
*/
-static void drm_dp_mst_unregister_i2c_bus(struct drm_dp_aux *aux)
+static void drm_dp_mst_unregister_i2c_bus(struct drm_dp_mst_port *port)
{
- i2c_del_adapter(&aux->ddc);
+ i2c_del_adapter(&port->aux.ddc);
}
/**
From: Tomi Valkeinen <[email protected]>
commit ecfdedd7da5d54416db5ca0f851264dca8736f59 upstream.
Use SET_LATE_SYSTEM_SLEEP_PM_OPS in DSS submodules to force runtime PM
suspend and resume.
We use suspend late version so that omapdrm's system suspend callback is
called first, as that will disable all the display outputs after which
it's safe to force DSS into suspend.
Signed-off-by: Tomi Valkeinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Tony Lindgren <[email protected]>
Fixes: cef766300353 ("drm/omap: Prepare DSS for probing without legacy platform data")
Cc: [email protected] # v5.7+
Tested-by: Tony Lindgren <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/omapdrm/dss/dispc.c | 1 +
drivers/gpu/drm/omapdrm/dss/dsi.c | 1 +
drivers/gpu/drm/omapdrm/dss/dss.c | 1 +
drivers/gpu/drm/omapdrm/dss/venc.c | 1 +
4 files changed, 4 insertions(+)
--- a/drivers/gpu/drm/omapdrm/dss/dispc.c
+++ b/drivers/gpu/drm/omapdrm/dss/dispc.c
@@ -4915,6 +4915,7 @@ static int dispc_runtime_resume(struct d
static const struct dev_pm_ops dispc_pm_ops = {
.runtime_suspend = dispc_runtime_suspend,
.runtime_resume = dispc_runtime_resume,
+ SET_LATE_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
};
struct platform_driver omap_dispchw_driver = {
--- a/drivers/gpu/drm/omapdrm/dss/dsi.c
+++ b/drivers/gpu/drm/omapdrm/dss/dsi.c
@@ -5467,6 +5467,7 @@ static int dsi_runtime_resume(struct dev
static const struct dev_pm_ops dsi_pm_ops = {
.runtime_suspend = dsi_runtime_suspend,
.runtime_resume = dsi_runtime_resume,
+ SET_LATE_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
};
struct platform_driver omap_dsihw_driver = {
--- a/drivers/gpu/drm/omapdrm/dss/dss.c
+++ b/drivers/gpu/drm/omapdrm/dss/dss.c
@@ -1614,6 +1614,7 @@ static int dss_runtime_resume(struct dev
static const struct dev_pm_ops dss_pm_ops = {
.runtime_suspend = dss_runtime_suspend,
.runtime_resume = dss_runtime_resume,
+ SET_LATE_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
};
struct platform_driver omap_dsshw_driver = {
--- a/drivers/gpu/drm/omapdrm/dss/venc.c
+++ b/drivers/gpu/drm/omapdrm/dss/venc.c
@@ -902,6 +902,7 @@ static int venc_runtime_resume(struct de
static const struct dev_pm_ops venc_pm_ops = {
.runtime_suspend = venc_runtime_suspend,
.runtime_resume = venc_runtime_resume,
+ SET_LATE_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
};
static const struct of_device_id venc_of_match[] = {
From: Chris Wilson <[email protected]>
commit 7c4541a37bbbf83c0f16f779e85eb61d9348ed29 upstream.
Before we return control to the system, and letting it reuse all the
pages being accessed by HW, we must disable the HW. At the moment, we
dare not reset the GPU if it will clobber the display, but once we know
the display has been disabled, we can proceed with the reset as we
shutdown the module. We know the next user must reinitialise the HW for
their purpose.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/489
Signed-off-by: Chris Wilson <[email protected]>
Cc: [email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/i915/gt/intel_gt.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -616,6 +616,11 @@ void intel_gt_driver_unregister(struct i
void intel_gt_driver_release(struct intel_gt *gt)
{
struct i915_address_space *vm;
+ intel_wakeref_t wakeref;
+
+ /* Scrub all HW state upon release */
+ with_intel_runtime_pm(gt->uncore->rpm, wakeref)
+ __intel_gt_reset(gt, ALL_ENGINES);
vm = fetch_and_zero(>->vm);
if (vm) /* FIXME being called twice on error paths :( */
From: hersen wu <[email protected]>
commit 8b0379a85762b516c7b46aed7dbf2a4947c00564 upstream.
[Why]
ramp_up_dispclk_with_dpp is to change dispclk, dppclk and dprefclk
according to bandwidth requirement. call stack: rv1_update_clocks -->
update_clocks --> dcn10_prepare_bandwidth / dcn10_optimize_bandwidth
--> prepare_bandwidth / optimize_bandwidth. before change dcn hw,
prepare_bandwidth will be called first to allow enough clock,
watermark for change, after end of dcn hw change, optimize_bandwidth
is executed to lower clock to save power for new dcn hw settings.
below is sequence of commit_planes_for_stream:
step 1: prepare_bandwidth - raise clock to have enough bandwidth
step 2: lock_doublebuffer_enable
step 3: pipe_control_lock(true) - make dchubp register change will
not take effect right way
step 4: apply_ctx_for_surface - program dchubp
step 5: pipe_control_lock(false) - dchubp register change take effect
step 6: optimize_bandwidth --> dc_post_update_surfaces_to_stream
for full_date, optimize clock to save power
at end of step 1, dcn clocks (dprefclk, dispclk, dppclk) may be
changed for new dchubp configuration. but real dcn hub dchubps are
still running with old configuration until end of step 5. this need
clocks settings at step 1 should not less than that before step 1.
this is checked by two conditions: 1. if (should_set_clock(safe_to_lower
, new_clocks->dispclk_khz, clk_mgr_base->clks.dispclk_khz) ||
new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz)
2. request_dpp_div = new_clocks->dispclk_khz > new_clocks->dppclk_khz
the second condition is based on new dchubp configuration. dppclk
for new dchubp may be different from dppclk before step 1.
for example, before step 1, dchubps are as below:
pipe 0: recout=(0,40,1920,980) viewport=(0,0,1920,979)
pipe 1: recout=(0,0,1920,1080) viewport=(0,0,1920,1080)
for dppclk for pipe0 need dppclk = dispclk
new dchubp pipe split configuration:
pipe 0: recout=(0,0,960,1080) viewport=(0,0,960,1080)
pipe 1: recout=(960,0,960,1080) viewport=(960,0,960,1080)
dppclk only needs dppclk = dispclk /2.
dispclk, dppclk are not lock by otg master lock. they take effect
after step 1. during this transition, dispclk are the same, but
dppclk is changed to half of previous clock for old dchubp
configuration between step 1 and step 6. This may cause p-state
warning intermittently.
[How]
for new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz, we
need make sure dppclk are not changed to less between step 1 and 6.
for new_clocks->dispclk_khz > clk_mgr_base->clks.dispclk_khz,
new display clock is raised, but we do not know ratio of
new_clocks->dispclk_khz and clk_mgr_base->clks.dispclk_khz,
new_clocks->dispclk_khz /2 does not guarantee equal or higher than
old dppclk. we could ignore power saving different between
dppclk = displck and dppclk = dispclk / 2 between step 1 and step 6.
as long as safe_to_lower = false, set dpclk = dispclk to simplify
condition check.
CC: Stable <[email protected]>
Signed-off-by: Hersen Wu <[email protected]>
Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Eryk Brol <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/amd/display/dc/clk_mgr/dcn10/rv1_clk_mgr.c | 69 ++++++++++++-
1 file changed, 67 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn10/rv1_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn10/rv1_clk_mgr.c
@@ -85,12 +85,77 @@ static int rv1_determine_dppclk_threshol
return disp_clk_threshold;
}
-static void ramp_up_dispclk_with_dpp(struct clk_mgr_internal *clk_mgr, struct dc *dc, struct dc_clocks *new_clocks)
+static void ramp_up_dispclk_with_dpp(
+ struct clk_mgr_internal *clk_mgr,
+ struct dc *dc,
+ struct dc_clocks *new_clocks,
+ bool safe_to_lower)
{
int i;
int dispclk_to_dpp_threshold = rv1_determine_dppclk_threshold(clk_mgr, new_clocks);
bool request_dpp_div = new_clocks->dispclk_khz > new_clocks->dppclk_khz;
+ /* this function is to change dispclk, dppclk and dprefclk according to
+ * bandwidth requirement. Its call stack is rv1_update_clocks -->
+ * update_clocks --> dcn10_prepare_bandwidth / dcn10_optimize_bandwidth
+ * --> prepare_bandwidth / optimize_bandwidth. before change dcn hw,
+ * prepare_bandwidth will be called first to allow enough clock,
+ * watermark for change, after end of dcn hw change, optimize_bandwidth
+ * is executed to lower clock to save power for new dcn hw settings.
+ *
+ * below is sequence of commit_planes_for_stream:
+ *
+ * step 1: prepare_bandwidth - raise clock to have enough bandwidth
+ * step 2: lock_doublebuffer_enable
+ * step 3: pipe_control_lock(true) - make dchubp register change will
+ * not take effect right way
+ * step 4: apply_ctx_for_surface - program dchubp
+ * step 5: pipe_control_lock(false) - dchubp register change take effect
+ * step 6: optimize_bandwidth --> dc_post_update_surfaces_to_stream
+ * for full_date, optimize clock to save power
+ *
+ * at end of step 1, dcn clocks (dprefclk, dispclk, dppclk) may be
+ * changed for new dchubp configuration. but real dcn hub dchubps are
+ * still running with old configuration until end of step 5. this need
+ * clocks settings at step 1 should not less than that before step 1.
+ * this is checked by two conditions: 1. if (should_set_clock(safe_to_lower
+ * , new_clocks->dispclk_khz, clk_mgr_base->clks.dispclk_khz) ||
+ * new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz)
+ * 2. request_dpp_div = new_clocks->dispclk_khz > new_clocks->dppclk_khz
+ *
+ * the second condition is based on new dchubp configuration. dppclk
+ * for new dchubp may be different from dppclk before step 1.
+ * for example, before step 1, dchubps are as below:
+ * pipe 0: recout=(0,40,1920,980) viewport=(0,0,1920,979)
+ * pipe 1: recout=(0,0,1920,1080) viewport=(0,0,1920,1080)
+ * for dppclk for pipe0 need dppclk = dispclk
+ *
+ * new dchubp pipe split configuration:
+ * pipe 0: recout=(0,0,960,1080) viewport=(0,0,960,1080)
+ * pipe 1: recout=(960,0,960,1080) viewport=(960,0,960,1080)
+ * dppclk only needs dppclk = dispclk /2.
+ *
+ * dispclk, dppclk are not lock by otg master lock. they take effect
+ * after step 1. during this transition, dispclk are the same, but
+ * dppclk is changed to half of previous clock for old dchubp
+ * configuration between step 1 and step 6. This may cause p-state
+ * warning intermittently.
+ *
+ * for new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz, we
+ * need make sure dppclk are not changed to less between step 1 and 6.
+ * for new_clocks->dispclk_khz > clk_mgr_base->clks.dispclk_khz,
+ * new display clock is raised, but we do not know ratio of
+ * new_clocks->dispclk_khz and clk_mgr_base->clks.dispclk_khz,
+ * new_clocks->dispclk_khz /2 does not guarantee equal or higher than
+ * old dppclk. we could ignore power saving different between
+ * dppclk = displck and dppclk = dispclk / 2 between step 1 and step 6.
+ * as long as safe_to_lower = false, set dpclk = dispclk to simplify
+ * condition check.
+ * todo: review this change for other asic.
+ **/
+ if (!safe_to_lower)
+ request_dpp_div = false;
+
/* set disp clk to dpp clk threshold */
clk_mgr->funcs->set_dispclk(clk_mgr, dispclk_to_dpp_threshold);
@@ -209,7 +274,7 @@ static void rv1_update_clocks(struct clk
/* program dispclk on = as a w/a for sleep resume clock ramping issues */
if (should_set_clock(safe_to_lower, new_clocks->dispclk_khz, clk_mgr_base->clks.dispclk_khz)
|| new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz) {
- ramp_up_dispclk_with_dpp(clk_mgr, dc, new_clocks);
+ ramp_up_dispclk_with_dpp(clk_mgr, dc, new_clocks, safe_to_lower);
clk_mgr_base->clks.dispclk_khz = new_clocks->dispclk_khz;
send_request_to_lower = true;
}
From: Denis Efremov <[email protected]>
commit 114427b8927a4def2942b2b886f7e4aeae289ccb upstream.
Use kvfree() to free bo->sgts, because the memory is allocated with
kvmalloc_array() in panfrost_mmu_map_fault_addr().
Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
Cc: [email protected]
Signed-off-by: Denis Efremov <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Signed-off-by: Steven Price <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/panfrost/panfrost_gem.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.c
@@ -46,7 +46,7 @@ static void panfrost_gem_free_object(str
sg_free_table(&bo->sgts[i]);
}
}
- kfree(bo->sgts);
+ kvfree(bo->sgts);
}
drm_gem_shmem_free_object(obj);
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -486,7 +486,7 @@ static int panfrost_mmu_map_fault_addr(s
pages = kvmalloc_array(bo->base.base.size >> PAGE_SHIFT,
sizeof(struct page *), GFP_KERNEL | __GFP_ZERO);
if (!pages) {
- kfree(bo->sgts);
+ kvfree(bo->sgts);
bo->sgts = NULL;
mutex_unlock(&bo->base.pages_lock);
ret = -ENOMEM;
From: Tomi Valkeinen <[email protected]>
commit a72a6a16d51034045cb6355924b62221a8221ca3 upstream.
The connector type for DISPC's DPI videoport was set the LVDS instead of
DPI. This causes any DPI panel setup to fail with tidss, making all DPI
panels unusable.
Fix this by using correct connector type.
Signed-off-by: Tomi Valkeinen <[email protected]>
Fixes: 32a1795f57eecc39749017 ("drm/tidss: New driver for TI Keystone platform Display SubSystem")
Cc: [email protected] # v5.7+
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Jyri Sarha <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/tidss/tidss_kms.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/tidss/tidss_kms.c
+++ b/drivers/gpu/drm/tidss/tidss_kms.c
@@ -154,7 +154,7 @@ static int tidss_dispc_modeset_init(stru
break;
case DISPC_VP_DPI:
enc_type = DRM_MODE_ENCODER_DPI;
- conn_type = DRM_MODE_CONNECTOR_LVDS;
+ conn_type = DRM_MODE_CONNECTOR_DPI;
break;
default:
WARN_ON(1);
From: Marius Iacob <[email protected]>
commit b5ac98cbb8e5e30c34ebc837d1e5a3982d2b5f5c upstream.
Signed-off-by: Marius Iacob <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/drm_panel_orientation_quirks.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/gpu/drm/drm_panel_orientation_quirks.c
+++ b/drivers/gpu/drm/drm_panel_orientation_quirks.c
@@ -121,6 +121,12 @@ static const struct dmi_system_id orient
DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T101HA"),
},
.driver_data = (void *)&lcd800x1280_rightside_up,
+ }, { /* Asus T103HAF */
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T103HAF"),
+ },
+ .driver_data = (void *)&lcd800x1280_rightside_up,
}, { /* GPD MicroPC (generic strings, also match on bios date) */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Default string"),
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit 845d9156febcd6b3b20c0c2c8d73b461b48e844c ]
Somewhere along the patch handling path, both the old "printk(KERN_ALERT
....)" and the new "pr_alert(...)" were retained, leading to the
duplicate printing of "PC:".
Drop the old one.
Fixes: eaabf98b0932a540 ("sh: fault: modernize printing of kernel messages")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Rich Felker <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/sh/mm/fault.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/sh/mm/fault.c b/arch/sh/mm/fault.c
index fbe1f2fe9a8c8..acd1c75994983 100644
--- a/arch/sh/mm/fault.c
+++ b/arch/sh/mm/fault.c
@@ -208,7 +208,6 @@ show_fault_oops(struct pt_regs *regs, unsigned long address)
if (!oops_may_print())
return;
- printk(KERN_ALERT "PC:");
pr_alert("BUG: unable to handle kernel %s at %08lx\n",
address < PAGE_SIZE ? "NULL pointer dereference"
: "paging request",
--
2.25.1
From: Imre Deak <[email protected]>
commit 7d11507605a75fcb2159b93d1fe8213b363a2c20 upstream.
The WARN below triggers during the removal of an MST port. The problem
is that the parent device's (the connector's kdev) sysfs directory is
removed recursively when the connector is unregistered (even though the
I2C device holds a reference on the parent device). To fix this set
first the Peer Device Type to none which will remove the I2C device.
Note that atm, inconsistently, the parent of the I2C device is initially set to
the DRM kdev and after a Connection Status Notification the parent may be reset
to be the connector's kdev. This problem is addressed by the next patch.
[ 4462.989299] ------------[ cut here ]------------
[ 4463.014940] sysfs group 'power' not found for kobject 'i2c-24'
[ 4463.034664] WARNING: CPU: 0 PID: 970 at fs/sysfs/group.c:281 sysfs_remove_group+0x71/0x80
[ 4463.044357] Modules linked in: snd_hda_intel i915 drm_kms_helper(O) drm netconsole snd_hda_codec_hdmi mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul snd_intel_dspcf
g crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core asix usbnet kvm_intel mii i2c_algo_bit snd_pcm syscopyarea sysfillrect e1000e sysimgblt fb_sys_fops prim
e_numbers ptp pps_core i2c_i801 r8169 mei_me realtek mei [last unloaded: drm]
[ 4463.044399] CPU: 0 PID: 970 Comm: kworker/0:2 Tainted: G O 5.7.0+ #172
[ 4463.044402] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP
[ 4463.044423] Workqueue: events drm_dp_delayed_destroy_work [drm_kms_helper]
[ 4463.044428] RIP: 0010:sysfs_remove_group+0x71/0x80
[ 4463.044431] Code: 48 89 df 5b 5d 41 5c e9 cd b6 ff ff 48 89 df e8 95 b4 ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 20 0f 3f 82 e8 9f c5 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f
84 00 00 00 00 00 48 85 f6 74 31 41 54
[ 4463.044433] RSP: 0018:ffffc900018bfbf0 EFLAGS: 00010282
[ 4463.044436] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[ 4463.044439] RDX: 0000000080000001 RSI: ffff88849e828f38 RDI: 00000000ffffffff
[ 4463.052970] [drm:drm_atomic_get_plane_state [drm]] Added [PLANE:100:plane 2B] 00000000c2160caa state to 00000000d172564a
[ 4463.070533] RBP: ffffffff820cea20 R08: ffff88847f4b8958 R09: 0000000000000000
[ 4463.070535] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88848a725018
[ 4463.070537] R13: 0000000000000000 R14: ffffffff827090e0 R15: 0000000000000002
[ 4463.070539] FS: 0000000000000000(0000) GS:ffff88849e800000(0000) knlGS:0000000000000000
[ 4463.070541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4463.070543] CR2: 00007fdf8a756538 CR3: 0000000489684001 CR4: 0000000000760ef0
[ 4463.070545] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4463.070547] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 4463.070549] PKRU: 55555554
[ 4463.070551] Call Trace:
[ 4463.070560] device_del+0x84/0x400
[ 4463.070571] cdev_device_del+0x10/0x30
[ 4463.070578] put_i2c_dev+0x69/0x80
[ 4463.070584] i2cdev_detach_adapter+0x2e/0x60
[ 4463.070591] notifier_call_chain+0x34/0x90
[ 4463.070599] blocking_notifier_call_chain+0x3f/0x60
[ 4463.070606] device_del+0x7c/0x400
[ 4463.087817] ? lockdep_init_map_waits+0x57/0x210
[ 4463.087825] device_unregister+0x11/0x60
[ 4463.087829] i2c_del_adapter+0x249/0x310
[ 4463.087846] drm_dp_port_set_pdt+0x6b/0x2c0 [drm_kms_helper]
[ 4463.087862] drm_dp_delayed_destroy_work+0x2af/0x350 [drm_kms_helper]
[ 4463.087876] process_one_work+0x268/0x600
[ 4463.105438] ? __schedule+0x30c/0x920
[ 4463.105451] worker_thread+0x37/0x380
[ 4463.105457] ? process_one_work+0x600/0x600
[ 4463.105462] kthread+0x140/0x160
[ 4463.105466] ? kthread_park+0x80/0x80
[ 4463.105474] ret_from_fork+0x24/0x50
Cc: <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Reviewed-by: Stanislav Lisovskiy <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/drm_dp_mst_topology.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -4642,12 +4642,13 @@ static void drm_dp_tx_work(struct work_s
static inline void
drm_dp_delayed_destroy_port(struct drm_dp_mst_port *port)
{
+ drm_dp_port_set_pdt(port, DP_PEER_DEVICE_NONE, port->mcs);
+
if (port->connector) {
drm_connector_unregister(port->connector);
drm_connector_put(port->connector);
}
- drm_dp_port_set_pdt(port, DP_PEER_DEVICE_NONE, port->mcs);
drm_dp_mst_put_port_malloc(port);
}
From: Dinghao Liu <[email protected]>
[ Upstream commit 5a25de6df789cc805a9b8ba7ab5deef5067af47e ]
Freeing chip on error may lead to an Oops at the next time
the system goes to resume. Fix this by removing all
snd_echo_free() calls on error.
Fixes: 47b5d028fdce8 ("ALSA: Echoaudio - Add suspend support #2")
Signed-off-by: Dinghao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
sound/pci/echoaudio/echoaudio.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/sound/pci/echoaudio/echoaudio.c b/sound/pci/echoaudio/echoaudio.c
index 0941a7a17623a..456219a665a79 100644
--- a/sound/pci/echoaudio/echoaudio.c
+++ b/sound/pci/echoaudio/echoaudio.c
@@ -2158,7 +2158,6 @@ static int snd_echo_resume(struct device *dev)
if (err < 0) {
kfree(commpage_bak);
dev_err(dev, "resume init_hw err=%d\n", err);
- snd_echo_free(chip);
return err;
}
@@ -2185,7 +2184,6 @@ static int snd_echo_resume(struct device *dev)
if (request_irq(pci->irq, snd_echo_interrupt, IRQF_SHARED,
KBUILD_MODNAME, chip)) {
dev_err(chip->card->dev, "cannot grab irq\n");
- snd_echo_free(chip);
return -EBUSY;
}
chip->irq = pci->irq;
--
2.25.1
From: Andrii Nakryiko <[email protected]>
[ Upstream commit d5ca590525cfbd87ca307dcf498a566e2e7c1767 ]
99aacebecb75 ("selftests: do not use .ONESHELL") removed .ONESHELL, which
changes how Makefile "silences" multi-command target recipes. selftests/bpf's
Makefile relied (a somewhat unknowingly) on .ONESHELL behavior of silencing
all commands within the recipe if the first command contains @ symbol.
Removing .ONESHELL exposed this hack.
This patch fixes the issue by explicitly silencing each command with $(Q).
Also explicitly define fallback rule for building *.o from *.c, instead of
relying on non-silent inherited rule. This was causing a non-silent output for
bench.o object file.
Fixes: 92f7440ecc93 ("selftests/bpf: More succinct Makefile output")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/bpf/Makefile | 44 +++++++++++++++-------------
1 file changed, 24 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index dab182ffec320..4f322d5388757 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -102,7 +102,7 @@ endif
OVERRIDE_TARGETS := 1
override define CLEAN
$(call msg,CLEAN)
- $(RM) -r $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) $(EXTRA_CLEAN)
+ $(Q)$(RM) -r $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) $(EXTRA_CLEAN)
endef
include ../lib.mk
@@ -122,17 +122,21 @@ $(notdir $(TEST_GEN_PROGS) \
$(TEST_GEN_PROGS_EXTENDED) \
$(TEST_CUSTOM_PROGS)): %: $(OUTPUT)/% ;
+$(OUTPUT)/%.o: %.c
+ $(call msg,CC,,$@)
+ $(Q)$(CC) $(CFLAGS) -c $(filter %.c,$^) $(LDLIBS) -o $@
+
$(OUTPUT)/%:%.c
$(call msg,BINARY,,$@)
- $(LINK.c) $^ $(LDLIBS) -o $@
+ $(Q)$(LINK.c) $^ $(LDLIBS) -o $@
$(OUTPUT)/urandom_read: urandom_read.c
$(call msg,BINARY,,$@)
- $(CC) $(LDFLAGS) -o $@ $< $(LDLIBS) -Wl,--build-id
+ $(Q)$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS) -Wl,--build-id
$(OUTPUT)/test_stub.o: test_stub.c $(BPFOBJ)
$(call msg,CC,,$@)
- $(CC) -c $(CFLAGS) -o $@ $<
+ $(Q)$(CC) -c $(CFLAGS) -o $@ $<
VMLINUX_BTF_PATHS := $(if $(O),$(O)/vmlinux) \
$(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \
@@ -180,11 +184,11 @@ $(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \
$(BUILD_DIR)/libbpf $(BUILD_DIR)/bpftool $(INCLUDE_DIR):
$(call msg,MKDIR,,$@)
- mkdir -p $@
+ $(Q)mkdir -p $@
$(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) | $(BPFTOOL) $(INCLUDE_DIR)
$(call msg,GEN,,$@)
- $(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@
+ $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@
# Get Clang's default includes on this system, as opposed to those seen by
# '-target bpf'. This fixes "missing" files on some architectures/distros,
@@ -222,28 +226,28 @@ $(OUTPUT)/flow_dissector_load.o: flow_dissector_load.h
# $4 - LDFLAGS
define CLANG_BPF_BUILD_RULE
$(call msg,CLNG-LLC,$(TRUNNER_BINARY),$2)
- ($(CLANG) $3 -O2 -target bpf -emit-llvm \
+ $(Q)($(CLANG) $3 -O2 -target bpf -emit-llvm \
-c $1 -o - || echo "BPF obj compilation failed") | \
$(LLC) -mattr=dwarfris -march=bpf -mcpu=v3 $4 -filetype=obj -o $2
endef
# Similar to CLANG_BPF_BUILD_RULE, but with disabled alu32
define CLANG_NOALU32_BPF_BUILD_RULE
$(call msg,CLNG-LLC,$(TRUNNER_BINARY),$2)
- ($(CLANG) $3 -O2 -target bpf -emit-llvm \
+ $(Q)($(CLANG) $3 -O2 -target bpf -emit-llvm \
-c $1 -o - || echo "BPF obj compilation failed") | \
$(LLC) -march=bpf -mcpu=v2 $4 -filetype=obj -o $2
endef
# Similar to CLANG_BPF_BUILD_RULE, but using native Clang and bpf LLC
define CLANG_NATIVE_BPF_BUILD_RULE
$(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2)
- ($(CLANG) $3 -O2 -emit-llvm \
+ $(Q)($(CLANG) $3 -O2 -emit-llvm \
-c $1 -o - || echo "BPF obj compilation failed") | \
$(LLC) -march=bpf -mcpu=v3 $4 -filetype=obj -o $2
endef
# Build BPF object using GCC
define GCC_BPF_BUILD_RULE
$(call msg,GCC-BPF,$(TRUNNER_BINARY),$2)
- $(BPF_GCC) $3 $4 -O2 -c $1 -o $2
+ $(Q)$(BPF_GCC) $3 $4 -O2 -c $1 -o $2
endef
SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
@@ -285,7 +289,7 @@ ifeq ($($(TRUNNER_OUTPUT)-dir),)
$(TRUNNER_OUTPUT)-dir := y
$(TRUNNER_OUTPUT):
$$(call msg,MKDIR,,$$@)
- mkdir -p $$@
+ $(Q)mkdir -p $$@
endif
# ensure we set up BPF objects generation rule just once for a given
@@ -305,7 +309,7 @@ $(TRUNNER_BPF_SKELS): $(TRUNNER_OUTPUT)/%.skel.h: \
$(TRUNNER_OUTPUT)/%.o \
| $(BPFTOOL) $(TRUNNER_OUTPUT)
$$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@)
- $$(BPFTOOL) gen skeleton $$< > $$@
+ $(Q)$$(BPFTOOL) gen skeleton $$< > $$@
endif
# ensure we set up tests.h header generation rule just once
@@ -329,7 +333,7 @@ $(TRUNNER_TEST_OBJS): $(TRUNNER_OUTPUT)/%.test.o: \
$(TRUNNER_BPF_SKELS) \
$$(BPFOBJ) | $(TRUNNER_OUTPUT)
$$(call msg,TEST-OBJ,$(TRUNNER_BINARY),$$@)
- cd $$(@D) && $$(CC) -I. $$(CFLAGS) -c $(CURDIR)/$$< $$(LDLIBS) -o $$(@F)
+ $(Q)cd $$(@D) && $$(CC) -I. $$(CFLAGS) -c $(CURDIR)/$$< $$(LDLIBS) -o $$(@F)
$(TRUNNER_EXTRA_OBJS): $(TRUNNER_OUTPUT)/%.o: \
%.c \
@@ -337,20 +341,20 @@ $(TRUNNER_EXTRA_OBJS): $(TRUNNER_OUTPUT)/%.o: \
$(TRUNNER_TESTS_HDR) \
$$(BPFOBJ) | $(TRUNNER_OUTPUT)
$$(call msg,EXT-OBJ,$(TRUNNER_BINARY),$$@)
- $$(CC) $$(CFLAGS) -c $$< $$(LDLIBS) -o $$@
+ $(Q)$$(CC) $$(CFLAGS) -c $$< $$(LDLIBS) -o $$@
# only copy extra resources if in flavored build
$(TRUNNER_BINARY)-extras: $(TRUNNER_EXTRA_FILES) | $(TRUNNER_OUTPUT)
ifneq ($2,)
$$(call msg,EXT-COPY,$(TRUNNER_BINARY),$(TRUNNER_EXTRA_FILES))
- cp -a $$^ $(TRUNNER_OUTPUT)/
+ $(Q)cp -a $$^ $(TRUNNER_OUTPUT)/
endif
$(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS) \
$(TRUNNER_EXTRA_OBJS) $$(BPFOBJ) \
| $(TRUNNER_BINARY)-extras
$$(call msg,BINARY,,$$@)
- $$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@
+ $(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@
endef
@@ -403,17 +407,17 @@ verifier/tests.h: verifier/*.c
) > verifier/tests.h)
$(OUTPUT)/test_verifier: test_verifier.c verifier/tests.h $(BPFOBJ) | $(OUTPUT)
$(call msg,BINARY,,$@)
- $(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@
+ $(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@
# Make sure we are able to include and link libbpf against c++.
$(OUTPUT)/test_cpp: test_cpp.cpp $(OUTPUT)/test_core_extern.skel.h $(BPFOBJ)
$(call msg,CXX,,$@)
- $(CXX) $(CFLAGS) $^ $(LDLIBS) -o $@
+ $(Q)$(CXX) $(CFLAGS) $^ $(LDLIBS) -o $@
# Benchmark runner
$(OUTPUT)/bench_%.o: benchs/bench_%.c bench.h
$(call msg,CC,,$@)
- $(CC) $(CFLAGS) -c $(filter %.c,$^) $(LDLIBS) -o $@
+ $(Q)$(CC) $(CFLAGS) -c $(filter %.c,$^) $(LDLIBS) -o $@
$(OUTPUT)/bench_rename.o: $(OUTPUT)/test_overhead.skel.h
$(OUTPUT)/bench_trigger.o: $(OUTPUT)/trigger_bench.skel.h
$(OUTPUT)/bench_ringbufs.o: $(OUTPUT)/ringbuf_bench.skel.h \
@@ -426,7 +430,7 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \
$(OUTPUT)/bench_trigger.o \
$(OUTPUT)/bench_ringbufs.o
$(call msg,BINARY,,$@)
- $(CC) $(LDFLAGS) -o $@ $(filter %.a %.o,$^) $(LDLIBS)
+ $(Q)$(CC) $(LDFLAGS) -o $@ $(filter %.a %.o,$^) $(LDLIBS)
EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) \
prog_tests/tests.h map_tests/tests.h verifier/tests.h \
--
2.25.1
From: Christophe Leroy <[email protected]>
[ Upstream commit 3df14264ad9930733a8166e5bd0eccc1727564bb ]
Commit ea0eada45632 leads to the following build failure on powerpc:
HOSTCC scripts/recordmcount
scripts/recordmcount.c: In function 'arm64_is_fake_mcount':
scripts/recordmcount.c:440: error: 'R_AARCH64_CALL26' undeclared (first use in this function)
scripts/recordmcount.c:440: error: (Each undeclared identifier is reported only once
scripts/recordmcount.c:440: error: for each function it appears in.)
make[2]: *** [scripts/recordmcount] Error 1
Make sure R_AARCH64_CALL26 is always defined.
Fixes: ea0eada45632 ("recordmcount: only record relocation of type R_AARCH64_CALL26 on arm64.")
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Steven Rostedt (VMware) <[email protected]>
Acked-by: Gregory Herrero <[email protected]>
Cc: Gregory Herrero <[email protected]>
Link: https://lore.kernel.org/r/5ca1be21fa6ebf73203b45fd9aadd2bafb5e6b15.1597049145.git.christophe.leroy@csgroup.eu
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
scripts/recordmcount.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index e59022b3f1254..b9c2ee7ab43fa 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -42,6 +42,8 @@
#define R_ARM_THM_CALL 10
#define R_ARM_CALL 28
+#define R_AARCH64_CALL26 283
+
static int fd_map; /* File descriptor for file being modified. */
static int mmap_failed; /* Boolean flag. */
static char gpfx; /* prefix for global symbol name (sometimes '_') */
--
2.25.1
From: Michael S. Tsirkin <[email protected]>
[ Upstream commit 1e3e792650d2c0df8dd796906275b7c79e278664 ]
The patch adding the iommu lock did not initialize it.
The struct is zero-initialized so this is mostly a problem
when using lockdep.
Reported-by: kernel test robot <[email protected]>
Cc: Max Gurtovoy <[email protected]>
Fixes: 0ea9ee430e74 ("vdpasim: protect concurrent access to iommu iotlb")
Signed-off-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 8ac6f341dcc16..412fa85ba3653 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -331,6 +331,7 @@ static struct vdpasim *vdpasim_create(void)
INIT_WORK(&vdpasim->work, vdpasim_work);
spin_lock_init(&vdpasim->lock);
+ spin_lock_init(&vdpasim->iommu_lock);
dev = &vdpasim->vdpa.dev;
dev->coherent_dma_mask = DMA_BIT_MASK(64);
--
2.25.1
From: Dan Carpenter <[email protected]>
[ Upstream commit e8abe1de43dac658dacbd04a4543e0c988a8d386 ]
The error handling calls md_bitmap_free(bitmap) which checks for NULL
but will Oops if we pass an error pointer. Let's set "bitmap" to NULL
on this error path.
Fixes: afd756286083 ("md-cluster/raid10: resize all the bitmaps before start reshape")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Guoqing Jiang <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/md/md-cluster.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/md/md-cluster.c b/drivers/md/md-cluster.c
index 73fd50e779754..d50737ec40394 100644
--- a/drivers/md/md-cluster.c
+++ b/drivers/md/md-cluster.c
@@ -1139,6 +1139,7 @@ static int resize_bitmaps(struct mddev *mddev, sector_t newsize, sector_t oldsiz
bitmap = get_bitmap_from_slot(mddev, i);
if (IS_ERR(bitmap)) {
pr_err("can't get bitmap from slot %d\n", i);
+ bitmap = NULL;
goto out;
}
counts = &bitmap->counts;
--
2.25.1
From: Colin Ian King <[email protected]>
[ Upstream commit 88b2e9b06381551b707d980627ad0591191f7a2d ]
The 64 bit ino is being compared to the product of two u32 values,
however, the multiplication is being performed using a 32 bit multiply so
there is a potential of an overflow. To be fully safe, cast uspi->s_ncg
to a u64 to ensure a 64 bit multiplication occurs to avoid any chance of
overflow.
Fixes: f3e2a520f5fb ("ufs: NFS support")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Evgeniy Dushistov <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Addresses-Coverity: ("Unintentional integer overflow")
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/ufs/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 1da0be667409b..e3b69fb280e8c 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -101,7 +101,7 @@ static struct inode *ufs_nfs_get_inode(struct super_block *sb, u64 ino, u32 gene
struct ufs_sb_private_info *uspi = UFS_SB(sb)->s_uspi;
struct inode *inode;
- if (ino < UFS_ROOTINO || ino > uspi->s_ncg * uspi->s_ipg)
+ if (ino < UFS_ROOTINO || ino > (u64)uspi->s_ncg * uspi->s_ipg)
return ERR_PTR(-ESTALE);
inode = ufs_iget(sb, ino);
--
2.25.1
From: Dilip Kota <[email protected]>
[ Upstream commit 7d98585860d845e36ee612832a5ff021f201dbaf ]
Frequency descriptor of Lightning Mountain SoC doesn't have all the
frequency entries so resulting in the below failure causing a kernel hang:
Error MSR_FSB_FREQ index 15 is unknown
tsc: Fast TSC calibration failed
So, add all the frequency entries in the Lightning Mountain SoC frequency
descriptor.
Fixes: 0cc5359d8fd45 ("x86/cpu: Update init data for new Airmont CPU model")
Fixes: 812c2d7506fd ("x86/tsc_msr: Use named struct initializers")
Signed-off-by: Dilip Kota <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/211c643ae217604b46cbec43a2c0423946dc7d2d.1596440057.git.eswara.kota@linux.intel.com
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/kernel/tsc_msr.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/tsc_msr.c b/arch/x86/kernel/tsc_msr.c
index 4fec6f3a1858b..a654a9b4b77c0 100644
--- a/arch/x86/kernel/tsc_msr.c
+++ b/arch/x86/kernel/tsc_msr.c
@@ -133,10 +133,15 @@ static const struct freq_desc freq_desc_ann = {
.mask = 0x0f,
};
-/* 24 MHz crystal? : 24 * 13 / 4 = 78 MHz */
+/*
+ * 24 MHz crystal? : 24 * 13 / 4 = 78 MHz
+ * Frequency step for Lightning Mountain SoC is fixed to 78 MHz,
+ * so all the frequency entries are 78000.
+ */
static const struct freq_desc freq_desc_lgm = {
.use_msr_plat = true,
- .freqs = { 78000, 78000, 78000, 78000, 78000, 78000, 78000, 78000 },
+ .freqs = { 78000, 78000, 78000, 78000, 78000, 78000, 78000, 78000,
+ 78000, 78000, 78000, 78000, 78000, 78000, 78000, 78000 },
.mask = 0x0f,
};
--
2.25.1
From: Denis Efremov <[email protected]>
commit f29aa08852e1953e461f2d47ab13c34e14bc08b3 upstream.
clk_s is checked twice in a row in ni_init_smc_spll_table().
fb_div should be checked instead.
Fixes: 69e0b57a91ad ("drm/radeon/kms: add dpm support for cayman (v5)")
Cc: [email protected]
Signed-off-by: Denis Efremov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/radeon/ni_dpm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/radeon/ni_dpm.c
+++ b/drivers/gpu/drm/radeon/ni_dpm.c
@@ -2124,7 +2124,7 @@ static int ni_init_smc_spll_table(struct
if (p_div & ~(SMC_NISLANDS_SPLL_DIV_TABLE_PDIV_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_PDIV_SHIFT))
ret = -EINVAL;
- if (clk_s & ~(SMC_NISLANDS_SPLL_DIV_TABLE_CLKS_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_CLKS_SHIFT))
+ if (fb_div & ~(SMC_NISLANDS_SPLL_DIV_TABLE_FBDIV_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_FBDIV_SHIFT))
ret = -EINVAL;
if (fb_div & ~(SMC_NISLANDS_SPLL_DIV_TABLE_FBDIV_MASK >> SMC_NISLANDS_SPLL_DIV_TABLE_FBDIV_SHIFT))
From: Xin Xiong <[email protected]>
commit a34a0a632dd991a371fec56431d73279f9c54029 upstream.
drm_dp_mst_allocate_vcpi() invokes
drm_dp_mst_topology_get_port_validated(), which increases the refcount
of the "port".
These reference counting issues take place in two exception handling
paths separately. Either when “slots” is less than 0 or when
drm_dp_init_vcpi() returns a negative value, the function forgets to
reduce the refcnt increased drm_dp_mst_topology_get_port_validated(),
which results in a refcount leak.
Fix these issues by pulling up the error handling when "slots" is less
than 0, and calling drm_dp_mst_topology_put_port() before termination
when drm_dp_init_vcpi() returns a negative value.
Fixes: 1e797f556c61 ("drm/dp: Split drm_dp_mst_allocate_vcpi")
Cc: <[email protected]> # v4.12+
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: Xin Xiong <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
Signed-off-by: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719154545.GA41231@xin-virtual-machine
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/drm_dp_mst_topology.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -4262,11 +4262,11 @@ bool drm_dp_mst_allocate_vcpi(struct drm
{
int ret;
- port = drm_dp_mst_topology_get_port_validated(mgr, port);
- if (!port)
+ if (slots < 0)
return false;
- if (slots < 0)
+ port = drm_dp_mst_topology_get_port_validated(mgr, port);
+ if (!port)
return false;
if (port->vcpi.vcpi > 0) {
@@ -4282,6 +4282,7 @@ bool drm_dp_mst_allocate_vcpi(struct drm
if (ret) {
DRM_DEBUG_KMS("failed to init vcpi slots=%d max=63 ret=%d\n",
DIV_ROUND_UP(pbn, mgr->pbn_div), ret);
+ drm_dp_mst_topology_put_port(port);
goto out;
}
DRM_DEBUG_KMS("initing vcpi for pbn=%d slots=%d\n",
From: Zhang Rui <[email protected]>
[ Upstream commit 4bb5fcb97a5df0bbc0a27e0252b1e7ce140a8431 ]
This fixes a problem introduced by commit:
5fb5273a905c ("perf/x86/rapl: Use new MSR detection interface")
that perf event sysfs attributes for psys RAPL domain are missing.
Fixes: 5fb5273a905c ("perf/x86/rapl: Use new MSR detection interface")
Signed-off-by: Zhang Rui <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Reviewed-by: Len Brown <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/x86/events/rapl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index 0f2bf59f43541..51ff9a3618c95 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -665,7 +665,7 @@ static const struct attribute_group *rapl_attr_update[] = {
&rapl_events_pkg_group,
&rapl_events_ram_group,
&rapl_events_gpu_group,
- &rapl_events_gpu_group,
+ &rapl_events_psys_group,
NULL,
};
--
2.25.1
From: Daniel Díaz <[email protected]>
[ Upstream commit fa5c893181ed2ca2f96552f50073786d2cfce6c0 ]
When using a cross-compilation environment, such as OpenEmbedded,
the CC an CXX variables are set to something more than just a
command: there are arguments (such as --sysroot) that need to be
passed on to the compiler so that the right set of headers and
libraries are used.
For the particular case that our systems detected, CC is set to
the following:
export CC="aarch64-linaro-linux-gcc --sysroot=/oe/build/tmp/work/machine/perf/1.0-r9/recipe-sysroot"
Without quotes, detection is as follows:
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libbfd: [ OFF ]
... libcap: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
... libaio: [ OFF ]
... libzstd: [ OFF ]
... disassembler-four-args: [ OFF ]
Makefile.config:414: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
Makefile.perf:230: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:69: recipe for target 'all' failed
make: *** [all] Error 2
With CC and CXX quoted, some of those features are now detected.
Fixes: e3232c2f39ac ("tools build feature: Use CC and CXX from parent")
Signed-off-by: Daniel Díaz <[email protected]>
Reviewed-by: Thomas Hebb <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/build/Makefile.feature | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 774f0b0ca28ac..e7818b44b48ee 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -8,7 +8,7 @@ endif
feature_check = $(eval $(feature_check_code))
define feature_check_code
- feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CC=$(CC) CXX=$(CXX) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
+ feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CC="$(CC)" CXX="$(CXX)" CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
endef
feature_set = $(eval $(feature_set_code))
--
2.25.1
From: Eric Biggers <[email protected]>
[ Upstream commit 32ac86efff91a3e4ef8c3d1cadd4559e23c8e73a ]
The minix filesystem leaves super_block::s_maxbytes at MAX_NON_LFS rather
than setting it to the actual filesystem-specific limit. This is broken
because it means userspace doesn't see the standard behavior like getting
EFBIG and SIGXFSZ when exceeding the maximum file size.
Fix this by setting s_maxbytes correctly.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Qiujun Huang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/minix/inode.c | 12 +++++++-----
fs/minix/itree_v1.c | 2 +-
fs/minix/itree_v2.c | 3 +--
fs/minix/minix.h | 1 -
4 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 0dd929346f3f3..7b09a9158e401 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -150,8 +150,10 @@ static int minix_remount (struct super_block * sb, int * flags, char * data)
return 0;
}
-static bool minix_check_superblock(struct minix_sb_info *sbi)
+static bool minix_check_superblock(struct super_block *sb)
{
+ struct minix_sb_info *sbi = minix_sb(sb);
+
if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
return false;
@@ -161,7 +163,7 @@ static bool minix_check_superblock(struct minix_sb_info *sbi)
* of indirect blocks which places the limit well above U32_MAX.
*/
if (sbi->s_version == MINIX_V1 &&
- sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+ sb->s_maxbytes > (7 + 512 + 512*512) * BLOCK_SIZE)
return false;
return true;
@@ -202,7 +204,7 @@ static int minix_fill_super(struct super_block *s, void *data, int silent)
sbi->s_zmap_blocks = ms->s_zmap_blocks;
sbi->s_firstdatazone = ms->s_firstdatazone;
sbi->s_log_zone_size = ms->s_log_zone_size;
- sbi->s_max_size = ms->s_max_size;
+ s->s_maxbytes = ms->s_max_size;
s->s_magic = ms->s_magic;
if (s->s_magic == MINIX_SUPER_MAGIC) {
sbi->s_version = MINIX_V1;
@@ -233,7 +235,7 @@ static int minix_fill_super(struct super_block *s, void *data, int silent)
sbi->s_zmap_blocks = m3s->s_zmap_blocks;
sbi->s_firstdatazone = m3s->s_firstdatazone;
sbi->s_log_zone_size = m3s->s_log_zone_size;
- sbi->s_max_size = m3s->s_max_size;
+ s->s_maxbytes = m3s->s_max_size;
sbi->s_ninodes = m3s->s_ninodes;
sbi->s_nzones = m3s->s_zones;
sbi->s_dirsize = 64;
@@ -245,7 +247,7 @@ static int minix_fill_super(struct super_block *s, void *data, int silent)
} else
goto out_no_fs;
- if (!minix_check_superblock(sbi))
+ if (!minix_check_superblock(s))
goto out_illegal_sb;
/*
diff --git a/fs/minix/itree_v1.c b/fs/minix/itree_v1.c
index 046cc96ee7adb..c0d418209ead1 100644
--- a/fs/minix/itree_v1.c
+++ b/fs/minix/itree_v1.c
@@ -29,7 +29,7 @@ static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, inode->i_sb->s_bdev);
- } else if (block >= (minix_sb(inode->i_sb)->s_max_size/BLOCK_SIZE)) {
+ } else if (block >= inode->i_sb->s_maxbytes/BLOCK_SIZE) {
if (printk_ratelimit())
printk("MINIX-fs: block_to_path: "
"block %ld too big on dev %pg\n",
diff --git a/fs/minix/itree_v2.c b/fs/minix/itree_v2.c
index f7fc7eccccccd..ee8af2f9e2828 100644
--- a/fs/minix/itree_v2.c
+++ b/fs/minix/itree_v2.c
@@ -32,8 +32,7 @@ static int block_to_path(struct inode * inode, long block, int offsets[DEPTH])
if (block < 0) {
printk("MINIX-fs: block_to_path: block %ld < 0 on dev %pg\n",
block, sb->s_bdev);
- } else if ((u64)block * (u64)sb->s_blocksize >=
- minix_sb(sb)->s_max_size) {
+ } else if ((u64)block * (u64)sb->s_blocksize >= sb->s_maxbytes) {
if (printk_ratelimit())
printk("MINIX-fs: block_to_path: "
"block %ld too big on dev %pg\n",
diff --git a/fs/minix/minix.h b/fs/minix/minix.h
index df081e8afcc3c..168d45d3de73e 100644
--- a/fs/minix/minix.h
+++ b/fs/minix/minix.h
@@ -32,7 +32,6 @@ struct minix_sb_info {
unsigned long s_zmap_blocks;
unsigned long s_firstdatazone;
unsigned long s_log_zone_size;
- unsigned long s_max_size;
int s_dirsize;
int s_namelen;
struct buffer_head ** s_imap;
--
2.25.1
From: Geert Uytterhoeven <[email protected]>
[ Upstream commit 0c64a0dce51faa9c706fdf1f957d6f19878f4b81 ]
The Landisk setup code maps the CF IDE area using ioremap_prot(), and
passes the resulting virtual addresses to the pata_platform driver,
disguising them as I/O port addresses. Hence the pata_platform driver
translates them again using ioport_map().
As CONFIG_GENERIC_IOMAP=n, and CONFIG_HAS_IOPORT_MAP=y, the
SuperH-specific mapping code in arch/sh/kernel/ioport.c translates
I/O port addresses to virtual addresses by adding sh_io_port_base, which
defaults to -1, thus breaking the assumption of an identity mapping.
Fix this by setting sh_io_port_base to zero.
Fixes: 37b7a97884ba64bf ("sh: machvec IO death.")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Rich Felker <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/sh/boards/mach-landisk/setup.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/sh/boards/mach-landisk/setup.c b/arch/sh/boards/mach-landisk/setup.c
index 16b4d8b0bb850..2c44b94f82fb2 100644
--- a/arch/sh/boards/mach-landisk/setup.c
+++ b/arch/sh/boards/mach-landisk/setup.c
@@ -82,6 +82,9 @@ device_initcall(landisk_devices_setup);
static void __init landisk_setup(char **cmdline_p)
{
+ /* I/O port identity mapping */
+ __set_io_port_base(0);
+
/* LED ON */
__raw_writeb(__raw_readb(PA_LED) | 0x03, PA_LED);
--
2.25.1
From: Andrii Nakryiko <[email protected]>
[ Upstream commit 6bcaf41f9613278cd5897fc80ab93033bda8efaa ]
runqslower's Makefile is building/installing bpftool into
$(OUTPUT)/sbin/bpftool, which coincides with $(DEFAULT_BPFTOOL). In practice
this means that often when building selftests from scratch (after `make
clean`), selftests are racing with runqslower to simultaneously build bpftool
and one of the two processes fail due to file being busy. Prevent this race by
explicitly order-depending on $(BPFTOOL_DEFAULT).
Fixes: a2c9652f751e ("selftests: Refactor build to remove tools/lib/bpf from include path")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/bpf/Makefile | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 22aaec74ea0ab..dab182ffec320 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -141,7 +141,9 @@ VMLINUX_BTF_PATHS := $(if $(O),$(O)/vmlinux) \
/boot/vmlinux-$(shell uname -r)
VMLINUX_BTF := $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
-$(OUTPUT)/runqslower: $(BPFOBJ)
+DEFAULT_BPFTOOL := $(SCRATCH_DIR)/sbin/bpftool
+
+$(OUTPUT)/runqslower: $(BPFOBJ) | $(DEFAULT_BPFTOOL)
$(Q)$(MAKE) $(submake_extras) -C $(TOOLSDIR)/bpf/runqslower \
OUTPUT=$(SCRATCH_DIR)/ VMLINUX_BTF=$(VMLINUX_BTF) \
BPFOBJ=$(BPFOBJ) BPF_INCLUDE=$(INCLUDE_DIR) && \
@@ -163,7 +165,6 @@ $(OUTPUT)/test_netcnt: cgroup_helpers.c
$(OUTPUT)/test_sock_fields: cgroup_helpers.c
$(OUTPUT)/test_sysctl: cgroup_helpers.c
-DEFAULT_BPFTOOL := $(SCRATCH_DIR)/sbin/bpftool
BPFTOOL ?= $(DEFAULT_BPFTOOL)
$(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
$(BPFOBJ) | $(BUILD_DIR)/bpftool
--
2.25.1
From: Tiezhu Yang <[email protected]>
[ Upstream commit 3adf3bae0d612357da516d39e1584f1547eb6e86 ]
Since filp_open() returns an error pointer, we should use IS_ERR() to
check the return value and then return PTR_ERR() if failed to get the
actual return value instead of always -EINVAL.
E.g. without this patch:
[root@localhost loongson]# ls no_such_file
ls: cannot access no_such_file: No such file or directory
[root@localhost loongson]# modprobe test_lockup file_path=no_such_file lock_sb_umount time_secs=60 state=S
modprobe: ERROR: could not insert 'test_lockup': Invalid argument
[root@localhost loongson]# dmesg | tail -1
[ 126.100596] test_lockup: cannot find file_path
With this patch:
[root@localhost loongson]# ls no_such_file
ls: cannot access no_such_file: No such file or directory
[root@localhost loongson]# modprobe test_lockup file_path=no_such_file lock_sb_umount time_secs=60 state=S
modprobe: ERROR: could not insert 'test_lockup': Unknown symbol in module, or unknown parameter (see dmesg)
[root@localhost loongson]# dmesg | tail -1
[ 95.134362] test_lockup: failed to open no_such_file: -2
Fixes: aecd42df6d39 ("lib/test_lockup.c: add parameters for locking generic vfs locks")
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Kees Cook <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
lib/test_lockup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/test_lockup.c b/lib/test_lockup.c
index bd7c7ff39f6be..e7202763a1688 100644
--- a/lib/test_lockup.c
+++ b/lib/test_lockup.c
@@ -512,8 +512,8 @@ static int __init test_lockup_init(void)
if (test_file_path[0]) {
test_file = filp_open(test_file_path, O_RDONLY, 0);
if (IS_ERR(test_file)) {
- pr_err("cannot find file_path\n");
- return -EINVAL;
+ pr_err("failed to open %s: %ld\n", test_file_path, PTR_ERR(test_file));
+ return PTR_ERR(test_file);
}
test_inode = file_inode(test_file);
} else if (test_lock_inode ||
--
2.25.1
From: Pawan Gupta <[email protected]>
[ Upstream commit f29dfa53cc8ae6ad93bae619bcc0bf45cab344f7 ]
On systems that have virtualization disabled or unsupported, sysfs
mitigation for X86_BUG_ITLB_MULTIHIT is reported incorrectly as:
$ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
KVM: Vulnerable
System is not vulnerable to DoS attack from a rogue guest when
virtualization is disabled or unsupported in the hardware. Change the
mitigation reporting for these cases.
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reported-by: Nelson Dsouza <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Pawan Gupta <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/0ba029932a816179b9d14a30db38f0f11ef1f166.1594925782.git.pawan.kumar.gupta@linux.intel.com
Signed-off-by: Sasha Levin <[email protected]>
---
Documentation/admin-guide/hw-vuln/multihit.rst | 4 ++++
arch/x86/kernel/cpu/bugs.c | 8 +++++++-
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
index ba9988d8bce50..140e4cec38c33 100644
--- a/Documentation/admin-guide/hw-vuln/multihit.rst
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -80,6 +80,10 @@ The possible values in this file are:
- The processor is not vulnerable.
* - KVM: Mitigation: Split huge pages
- Software changes mitigate this issue.
+ * - KVM: Mitigation: VMX unsupported
+ - KVM is not vulnerable because Virtual Machine Extensions (VMX) is not supported.
+ * - KVM: Mitigation: VMX disabled
+ - KVM is not vulnerable because Virtual Machine Extensions (VMX) is disabled.
* - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 0b71970d2d3d2..b0802d45abd30 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -31,6 +31,7 @@
#include <asm/intel-family.h>
#include <asm/e820/api.h>
#include <asm/hypervisor.h>
+#include <asm/tlbflush.h>
#include "cpu.h"
@@ -1556,7 +1557,12 @@ static ssize_t l1tf_show_state(char *buf)
static ssize_t itlb_multihit_show_state(char *buf)
{
- if (itlb_multihit_kvm_mitigation)
+ if (!boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+ !boot_cpu_has(X86_FEATURE_VMX))
+ return sprintf(buf, "KVM: Mitigation: VMX unsupported\n");
+ else if (!(cr4_read_shadow() & X86_CR4_VMXE))
+ return sprintf(buf, "KVM: Mitigation: VMX disabled\n");
+ else if (itlb_multihit_kvm_mitigation)
return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
else
return sprintf(buf, "KVM: Vulnerable\n");
--
2.25.1
From: Colin Ian King <[email protected]>
[ Upstream commit ea38f06e0291986eb93beb6d61fd413607a30ca4 ]
Currently when the call to fsp_reg_write fails -EIO is not being returned
because the count is being returned instead of the return value in retval.
Fix this by returning the value in retval instead of count.
Addresses-Coverity: ("Unused value")
Fixes: fc69f4a6af49 ("Input: add new driver for Sentelic Finger Sensing Pad")
Signed-off-by: Colin Ian King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/input/mouse/sentelic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/input/mouse/sentelic.c b/drivers/input/mouse/sentelic.c
index e99d9bf1a267d..e78c4c7eda34d 100644
--- a/drivers/input/mouse/sentelic.c
+++ b/drivers/input/mouse/sentelic.c
@@ -441,7 +441,7 @@ static ssize_t fsp_attr_set_setreg(struct psmouse *psmouse, void *data,
fsp_reg_write_enable(psmouse, false);
- return count;
+ return retval;
}
PSMOUSE_DEFINE_WO_ATTR(setreg, S_IWUSR, NULL, fsp_attr_set_setreg);
--
2.25.1
From: Vincent Whitchurch <[email protected]>
[ Upstream commit 1beaef29c34154ccdcb3f1ae557f6883eda18840 ]
For memcpy, the source pages are memset to zero only when --cycles is
used. This leads to wildly different results with or without --cycles,
since all sources pages are likely to be mapped to the same zero page
without explicit writes.
Before this fix:
$ export cmd="./perf stat -e LLC-loads -- ./perf bench \
mem memcpy -s 1024MB -l 100 -f default"
$ $cmd
2,935,826 LLC-loads
3.821677452 seconds time elapsed
$ $cmd --cycles
217,533,436 LLC-loads
8.616725985 seconds time elapsed
After this fix:
$ $cmd
214,459,686 LLC-loads
8.674301124 seconds time elapsed
$ $cmd --cycles
214,758,651 LLC-loads
8.644480006 seconds time elapsed
Fixes: 47b5757bac03c338 ("perf bench mem: Move boilerplate memory allocation to the infrastructure")
Signed-off-by: Vincent Whitchurch <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/perf/bench/mem-functions.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/tools/perf/bench/mem-functions.c b/tools/perf/bench/mem-functions.c
index 9235b76501be8..19d45c377ac18 100644
--- a/tools/perf/bench/mem-functions.c
+++ b/tools/perf/bench/mem-functions.c
@@ -223,12 +223,8 @@ static int bench_mem_common(int argc, const char **argv, struct bench_mem_info *
return 0;
}
-static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, void *dst)
+static void memcpy_prefault(memcpy_t fn, size_t size, void *src, void *dst)
{
- u64 cycle_start = 0ULL, cycle_end = 0ULL;
- memcpy_t fn = r->fn.memcpy;
- int i;
-
/* Make sure to always prefault zero pages even if MMAP_THRESH is crossed: */
memset(src, 0, size);
@@ -237,6 +233,15 @@ static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, vo
* to not measure page fault overhead:
*/
fn(dst, src, size);
+}
+
+static u64 do_memcpy_cycles(const struct function *r, size_t size, void *src, void *dst)
+{
+ u64 cycle_start = 0ULL, cycle_end = 0ULL;
+ memcpy_t fn = r->fn.memcpy;
+ int i;
+
+ memcpy_prefault(fn, size, src, dst);
cycle_start = get_cycles();
for (i = 0; i < nr_loops; ++i)
@@ -252,11 +257,7 @@ static double do_memcpy_gettimeofday(const struct function *r, size_t size, void
memcpy_t fn = r->fn.memcpy;
int i;
- /*
- * We prefault the freshly allocated memory range here,
- * to not measure page fault overhead:
- */
- fn(dst, src, size);
+ memcpy_prefault(fn, size, src, dst);
BUG_ON(gettimeofday(&tv_start, NULL));
for (i = 0; i < nr_loops; ++i)
--
2.25.1
From: Ondrej Mosnacek <[email protected]>
[ Upstream commit 21dfbcd1f5cbff9cf2f9e7e43475aed8d072b0dd ]
In skcipher_accept_parent_nokey() the whole af_alg_ctx structure is
cleared by memset() after allocation, so add such memset() also to
aead_accept_parent_nokey() so that the new "init" field is also
initialized to zero. Without that the initial ctx->init checks might
randomly return true and cause errors.
While there, also remove the redundant zero assignments in both
functions.
Found via libkcapi testsuite.
Cc: Stephan Mueller <[email protected]>
Fixes: f3c802a1f300 ("crypto: algif_aead - Only wake up when ctx->more is zero")
Suggested-by: Herbert Xu <[email protected]>
Signed-off-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
crypto/algif_aead.c | 6 ------
crypto/algif_skcipher.c | 7 +------
2 files changed, 1 insertion(+), 12 deletions(-)
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index d48d2156e6210..43c6aa784858b 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -558,12 +558,6 @@ static int aead_accept_parent_nokey(void *private, struct sock *sk)
INIT_LIST_HEAD(&ctx->tsgl_list);
ctx->len = len;
- ctx->used = 0;
- atomic_set(&ctx->rcvused, 0);
- ctx->more = 0;
- ctx->merge = 0;
- ctx->enc = 0;
- ctx->aead_assoclen = 0;
crypto_init_wait(&ctx->wait);
ask->private = ctx;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index a51ba22fef58f..81c4022285a7c 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -333,6 +333,7 @@ static int skcipher_accept_parent_nokey(void *private, struct sock *sk)
ctx = sock_kmalloc(sk, len, GFP_KERNEL);
if (!ctx)
return -ENOMEM;
+ memset(ctx, 0, len);
ctx->iv = sock_kmalloc(sk, crypto_skcipher_ivsize(tfm),
GFP_KERNEL);
@@ -340,16 +341,10 @@ static int skcipher_accept_parent_nokey(void *private, struct sock *sk)
sock_kfree_s(sk, ctx, len);
return -ENOMEM;
}
-
memset(ctx->iv, 0, crypto_skcipher_ivsize(tfm));
INIT_LIST_HEAD(&ctx->tsgl_list);
ctx->len = len;
- ctx->used = 0;
- atomic_set(&ctx->rcvused, 0);
- ctx->more = 0;
- ctx->merge = 0;
- ctx->enc = 0;
crypto_init_wait(&ctx->wait);
ask->private = ctx;
--
2.25.1
From: Andy Shevchenko <[email protected]>
[ Upstream commit 3d858942250820b9adc35f963a257481d6d4c81d ]
The event handler loop must be run with interrupts disabled.
Otherwise we will have a warning:
[ 1970.785649] irq 31 handler lineevent_irq_handler+0x0/0x20 enabled interrupts
[ 1970.792739] WARNING: CPU: 0 PID: 0 at kernel/irq/handle.c:159 __handle_irq_event_percpu+0x162/0x170
[ 1970.860732] RIP: 0010:__handle_irq_event_percpu+0x162/0x170
...
[ 1970.946994] Call Trace:
[ 1970.949446] <IRQ>
[ 1970.951471] handle_irq_event_percpu+0x2c/0x80
[ 1970.955921] handle_irq_event+0x23/0x43
[ 1970.959766] handle_simple_irq+0x57/0x70
[ 1970.963695] generic_handle_irq+0x42/0x50
[ 1970.967717] dln2_rx+0xc1/0x210 [dln2]
[ 1970.971479] ? usb_hcd_unmap_urb_for_dma+0xa6/0x1c0
[ 1970.976362] __usb_hcd_giveback_urb+0x77/0xe0
[ 1970.980727] usb_giveback_urb_bh+0x8e/0xe0
[ 1970.984837] tasklet_action_common.isra.0+0x4a/0xe0
...
Recently xHCI driver switched to tasklets in the commit 36dc01657b49
("usb: host: xhci: Support running urb giveback in tasklet context").
The handle_irq_event_* functions are expected to be called with interrupts
disabled and they rightfully complain here because we run in tasklet context
with interrupts enabled.
Use a event spinlock to protect event handler from being interrupted.
Note, that there are only two users of this GPIO and ADC drivers and both of
them are using generic_handle_irq() which makes above happen.
Fixes: 338a12814297 ("mfd: Add support for Diolan DLN-2 devices")
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mfd/dln2.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/mfd/dln2.c b/drivers/mfd/dln2.c
index 39276fa626d2b..83e676a096dc1 100644
--- a/drivers/mfd/dln2.c
+++ b/drivers/mfd/dln2.c
@@ -287,7 +287,11 @@ static void dln2_rx(struct urb *urb)
len = urb->actual_length - sizeof(struct dln2_header);
if (handle == DLN2_HANDLE_EVENT) {
+ unsigned long flags;
+
+ spin_lock_irqsave(&dln2->event_cb_lock, flags);
dln2_run_event_callbacks(dln2, id, echo, data, len);
+ spin_unlock_irqrestore(&dln2->event_cb_lock, flags);
} else {
/* URB will be re-submitted in _dln2_transfer (free_rx_slot) */
if (dln2_transfer_complete(dln2, urb, handle, echo))
--
2.25.1
From: Dhananjay Phadke <[email protected]>
[ Upstream commit b1eef236f50ba6afea680da039ef3a2ca9c43d11 ]
When i2c client unregisters, synchronize irq before setting
iproc_i2c->slave to NULL.
(1) disable_irq()
(2) Mask event enable bits in control reg
(3) Erase slave address (avoid further writes to rx fifo)
(4) Flush tx and rx FIFOs
(5) Clear pending event (interrupt) bits in status reg
(6) enable_irq()
(7) Set client pointer to NULL
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000318
[ 371.020421] pc : bcm_iproc_i2c_isr+0x530/0x11f0
[ 371.025098] lr : __handle_irq_event_percpu+0x6c/0x170
[ 371.030309] sp : ffff800010003e40
[ 371.033727] x29: ffff800010003e40 x28: 0000000000000060
[ 371.039206] x27: ffff800010ca9de0 x26: ffff800010f895df
[ 371.044686] x25: ffff800010f18888 x24: ffff0008f7ff3600
[ 371.050165] x23: 0000000000000003 x22: 0000000001600000
[ 371.055645] x21: ffff800010f18888 x20: 0000000001600000
[ 371.061124] x19: ffff0008f726f080 x18: 0000000000000000
[ 371.066603] x17: 0000000000000000 x16: 0000000000000000
[ 371.072082] x15: 0000000000000000 x14: 0000000000000000
[ 371.077561] x13: 0000000000000000 x12: 0000000000000001
[ 371.083040] x11: 0000000000000000 x10: 0000000000000040
[ 371.088519] x9 : ffff800010f317c8 x8 : ffff800010f317c0
[ 371.093999] x7 : ffff0008f805b3b0 x6 : 0000000000000000
[ 371.099478] x5 : ffff0008f7ff36a4 x4 : ffff8008ee43d000
[ 371.104957] x3 : 0000000000000000 x2 : ffff8000107d64c0
[ 371.110436] x1 : 00000000c00000af x0 : 0000000000000000
[ 371.115916] Call trace:
[ 371.118439] bcm_iproc_i2c_isr+0x530/0x11f0
[ 371.122754] __handle_irq_event_percpu+0x6c/0x170
[ 371.127606] handle_irq_event_percpu+0x34/0x88
[ 371.132189] handle_irq_event+0x40/0x120
[ 371.136234] handle_fasteoi_irq+0xcc/0x1a0
[ 371.140459] generic_handle_irq+0x24/0x38
[ 371.144594] __handle_domain_irq+0x60/0xb8
[ 371.148820] gic_handle_irq+0xc0/0x158
[ 371.152687] el1_irq+0xb8/0x140
[ 371.155927] arch_cpu_idle+0x10/0x18
[ 371.159615] do_idle+0x204/0x290
[ 371.162943] cpu_startup_entry+0x24/0x60
[ 371.166990] rest_init+0xb0/0xbc
[ 371.170322] arch_call_rest_init+0xc/0x14
[ 371.174458] start_kernel+0x404/0x430
Fixes: c245d94ed106 ("i2c: iproc: Add multi byte read-write support for slave mode")
Signed-off-by: Dhananjay Phadke <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Acked-by: Ray Jui <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/i2c/busses/i2c-bcm-iproc.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-bcm-iproc.c b/drivers/i2c/busses/i2c-bcm-iproc.c
index 8a3c98866fb7e..688e928188214 100644
--- a/drivers/i2c/busses/i2c-bcm-iproc.c
+++ b/drivers/i2c/busses/i2c-bcm-iproc.c
@@ -1078,7 +1078,7 @@ static int bcm_iproc_i2c_unreg_slave(struct i2c_client *slave)
if (!iproc_i2c->slave)
return -EINVAL;
- iproc_i2c->slave = NULL;
+ disable_irq(iproc_i2c->irq);
/* disable all slave interrupts */
tmp = iproc_i2c_rd_reg(iproc_i2c, IE_OFFSET);
@@ -1091,6 +1091,17 @@ static int bcm_iproc_i2c_unreg_slave(struct i2c_client *slave)
tmp &= ~BIT(S_CFG_EN_NIC_SMB_ADDR3_SHIFT);
iproc_i2c_wr_reg(iproc_i2c, S_CFG_SMBUS_ADDR_OFFSET, tmp);
+ /* flush TX/RX FIFOs */
+ tmp = (BIT(S_FIFO_RX_FLUSH_SHIFT) | BIT(S_FIFO_TX_FLUSH_SHIFT));
+ iproc_i2c_wr_reg(iproc_i2c, S_FIFO_CTRL_OFFSET, tmp);
+
+ /* clear all pending slave interrupts */
+ iproc_i2c_wr_reg(iproc_i2c, IS_OFFSET, ISR_MASK_SLAVE);
+
+ iproc_i2c->slave = NULL;
+
+ enable_irq(iproc_i2c->irq);
+
return 0;
}
--
2.25.1
From: Tiezhu Yang <[email protected]>
[ Upstream commit 0776d1231bec0c7ab43baf440a3f5ef5f49dd795 ]
Reset the member "test_fs" of the test configuration after a call of the
function "kfree_const" to a null pointer so that a double memory release
will not be performed.
Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Tiezhu Yang <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Luis Chamberlain <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Chuck Lever <[email protected]>
Cc: David Howells <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: James Morris <[email protected]>
Cc: Jarkko Sakkinen <[email protected]>
Cc: J. Bruce Fields <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Lars Ellenberg <[email protected]>
Cc: Nikolay Aleksandrov <[email protected]>
Cc: Philipp Reisner <[email protected]>
Cc: Roopa Prabhu <[email protected]>
Cc: "Serge E. Hallyn" <[email protected]>
Cc: Sergei Trofimovich <[email protected]>
Cc: Sergey Kvachonok <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Tony Vroon <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
lib/test_kmod.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/test_kmod.c b/lib/test_kmod.c
index e651c37d56dbd..eab52770070d6 100644
--- a/lib/test_kmod.c
+++ b/lib/test_kmod.c
@@ -745,7 +745,7 @@ static int trigger_config_run_type(struct kmod_test_device *test_dev,
break;
case TEST_KMOD_FS_TYPE:
kfree_const(config->test_fs);
- config->test_driver = NULL;
+ config->test_fs = NULL;
copied = config_copy_test_fs(config, test_str,
strlen(test_str));
break;
--
2.25.1
From: Trond Myklebust <[email protected]>
[ Upstream commit 563c53e73b8b6ec842828736f77e633f7b0911e9 ]
The current mirrored read failover code is correctly resetting the mirror
index between failed reads, however it is not able to actually flip the
RPC call over to the next RPC client.
The end result is that we keep resending the RPC call to the same client
over and over.
The fix is to use the pnfs_read_resend_pnfs() mechanism to schedule a
new RPC call, but we need to add the ability to pass in a mirror
index so that we always retry the next mirror in the list.
Fixes: 166bd5b889ac ("pNFS/flexfiles: Fix layoutstats handling during read failovers")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/flexfilelayout/flexfilelayout.c | 50 ++++++++++++++++++--------
fs/nfs/pnfs.c | 4 ++-
fs/nfs/pnfs.h | 2 +-
3 files changed, 40 insertions(+), 16 deletions(-)
diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c b/fs/nfs/flexfilelayout/flexfilelayout.c
index de03e440b7eef..048272d60a165 100644
--- a/fs/nfs/flexfilelayout/flexfilelayout.c
+++ b/fs/nfs/flexfilelayout/flexfilelayout.c
@@ -790,6 +790,19 @@ ff_layout_choose_best_ds_for_read(struct pnfs_layout_segment *lseg,
return ff_layout_choose_any_ds_for_read(lseg, start_idx, best_idx);
}
+static struct nfs4_pnfs_ds *
+ff_layout_get_ds_for_read(struct nfs_pageio_descriptor *pgio, int *best_idx)
+{
+ struct pnfs_layout_segment *lseg = pgio->pg_lseg;
+ struct nfs4_pnfs_ds *ds;
+
+ ds = ff_layout_choose_best_ds_for_read(lseg, pgio->pg_mirror_idx,
+ best_idx);
+ if (ds || !pgio->pg_mirror_idx)
+ return ds;
+ return ff_layout_choose_best_ds_for_read(lseg, 0, best_idx);
+}
+
static void
ff_layout_pg_get_read(struct nfs_pageio_descriptor *pgio,
struct nfs_page *req,
@@ -840,7 +853,7 @@ ff_layout_pg_init_read(struct nfs_pageio_descriptor *pgio,
goto out_nolseg;
}
- ds = ff_layout_choose_best_ds_for_read(pgio->pg_lseg, 0, &ds_idx);
+ ds = ff_layout_get_ds_for_read(pgio, &ds_idx);
if (!ds) {
if (!ff_layout_no_fallback_to_mds(pgio->pg_lseg))
goto out_mds;
@@ -1028,11 +1041,24 @@ static void ff_layout_reset_write(struct nfs_pgio_header *hdr, bool retry_pnfs)
}
}
+static void ff_layout_resend_pnfs_read(struct nfs_pgio_header *hdr)
+{
+ u32 idx = hdr->pgio_mirror_idx + 1;
+ int new_idx = 0;
+
+ if (ff_layout_choose_any_ds_for_read(hdr->lseg, idx + 1, &new_idx))
+ ff_layout_send_layouterror(hdr->lseg);
+ else
+ pnfs_error_mark_layout_for_return(hdr->inode, hdr->lseg);
+ pnfs_read_resend_pnfs(hdr, new_idx);
+}
+
static void ff_layout_reset_read(struct nfs_pgio_header *hdr)
{
struct rpc_task *task = &hdr->task;
pnfs_layoutcommit_inode(hdr->inode, false);
+ pnfs_error_mark_layout_for_return(hdr->inode, hdr->lseg);
if (!test_and_set_bit(NFS_IOHDR_REDO, &hdr->flags)) {
dprintk("%s Reset task %5u for i/o through MDS "
@@ -1234,6 +1260,12 @@ static void ff_layout_io_track_ds_error(struct pnfs_layout_segment *lseg,
break;
case NFS4ERR_NXIO:
ff_layout_mark_ds_unreachable(lseg, idx);
+ /*
+ * Don't return the layout if this is a read and we still
+ * have layouts to try
+ */
+ if (opnum == OP_READ)
+ break;
/* Fallthrough */
default:
pnfs_error_mark_layout_for_return(lseg->pls_layout->plh_inode,
@@ -1247,7 +1279,6 @@ static void ff_layout_io_track_ds_error(struct pnfs_layout_segment *lseg,
static int ff_layout_read_done_cb(struct rpc_task *task,
struct nfs_pgio_header *hdr)
{
- int new_idx = hdr->pgio_mirror_idx;
int err;
if (task->tk_status < 0) {
@@ -1267,10 +1298,6 @@ static int ff_layout_read_done_cb(struct rpc_task *task,
clear_bit(NFS_IOHDR_RESEND_MDS, &hdr->flags);
switch (err) {
case -NFS4ERR_RESET_TO_PNFS:
- if (ff_layout_choose_best_ds_for_read(hdr->lseg,
- hdr->pgio_mirror_idx + 1,
- &new_idx))
- goto out_layouterror;
set_bit(NFS_IOHDR_RESEND_PNFS, &hdr->flags);
return task->tk_status;
case -NFS4ERR_RESET_TO_MDS:
@@ -1281,10 +1308,6 @@ static int ff_layout_read_done_cb(struct rpc_task *task,
}
return 0;
-out_layouterror:
- ff_layout_read_record_layoutstats_done(task, hdr);
- ff_layout_send_layouterror(hdr->lseg);
- hdr->pgio_mirror_idx = new_idx;
out_eagain:
rpc_restart_call_prepare(task);
return -EAGAIN;
@@ -1411,10 +1434,9 @@ static void ff_layout_read_release(void *data)
struct nfs_pgio_header *hdr = data;
ff_layout_read_record_layoutstats_done(&hdr->task, hdr);
- if (test_bit(NFS_IOHDR_RESEND_PNFS, &hdr->flags)) {
- ff_layout_send_layouterror(hdr->lseg);
- pnfs_read_resend_pnfs(hdr);
- } else if (test_bit(NFS_IOHDR_RESEND_MDS, &hdr->flags))
+ if (test_bit(NFS_IOHDR_RESEND_PNFS, &hdr->flags))
+ ff_layout_resend_pnfs_read(hdr);
+ else if (test_bit(NFS_IOHDR_RESEND_MDS, &hdr->flags))
ff_layout_reset_read(hdr);
pnfs_generic_rw_release(data);
}
diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index d61dac48dff50..75e988caf3cd7 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2939,7 +2939,8 @@ pnfs_try_to_read_data(struct nfs_pgio_header *hdr,
}
/* Resend all requests through pnfs. */
-void pnfs_read_resend_pnfs(struct nfs_pgio_header *hdr)
+void pnfs_read_resend_pnfs(struct nfs_pgio_header *hdr,
+ unsigned int mirror_idx)
{
struct nfs_pageio_descriptor pgio;
@@ -2950,6 +2951,7 @@ void pnfs_read_resend_pnfs(struct nfs_pgio_header *hdr)
nfs_pageio_init_read(&pgio, hdr->inode, false,
hdr->completion_ops);
+ pgio.pg_mirror_idx = mirror_idx;
hdr->task.tk_status = nfs_pageio_resend(&pgio, hdr);
}
}
diff --git a/fs/nfs/pnfs.h b/fs/nfs/pnfs.h
index 8e0ada581b92e..2661c44c62db4 100644
--- a/fs/nfs/pnfs.h
+++ b/fs/nfs/pnfs.h
@@ -311,7 +311,7 @@ int _pnfs_return_layout(struct inode *);
int pnfs_commit_and_return_layout(struct inode *);
void pnfs_ld_write_done(struct nfs_pgio_header *);
void pnfs_ld_read_done(struct nfs_pgio_header *);
-void pnfs_read_resend_pnfs(struct nfs_pgio_header *);
+void pnfs_read_resend_pnfs(struct nfs_pgio_header *, unsigned int mirror_idx);
struct pnfs_layout_segment *pnfs_update_layout(struct inode *ino,
struct nfs_open_context *ctx,
loff_t pos,
--
2.25.1
From: Wang Hai <[email protected]>
[ Upstream commit 50caa777a3a24d7027748e96265728ce748b41ef ]
Fix the missing clk_disable_unprepare() before return
from emac_clks_phase1_init() in the error handling case.
Fixes: b9b17debc69d ("net: emac: emac gigabit ethernet controller driver")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Acked-by: Timur Tabi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/qualcomm/emac/emac.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c b/drivers/net/ethernet/qualcomm/emac/emac.c
index 20b1b43a0e393..1166b98d8bb2c 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -474,13 +474,24 @@ static int emac_clks_phase1_init(struct platform_device *pdev,
ret = clk_prepare_enable(adpt->clk[EMAC_CLK_CFG_AHB]);
if (ret)
- return ret;
+ goto disable_clk_axi;
ret = clk_set_rate(adpt->clk[EMAC_CLK_HIGH_SPEED], 19200000);
if (ret)
- return ret;
+ goto disable_clk_cfg_ahb;
+
+ ret = clk_prepare_enable(adpt->clk[EMAC_CLK_HIGH_SPEED]);
+ if (ret)
+ goto disable_clk_cfg_ahb;
- return clk_prepare_enable(adpt->clk[EMAC_CLK_HIGH_SPEED]);
+ return 0;
+
+disable_clk_cfg_ahb:
+ clk_disable_unprepare(adpt->clk[EMAC_CLK_CFG_AHB]);
+disable_clk_axi:
+ clk_disable_unprepare(adpt->clk[EMAC_CLK_AXI]);
+
+ return ret;
}
/* Enable clocks; needs emac_clks_phase1_init to be called before */
--
2.25.1
From: Jin Yao <[email protected]>
[ Upstream commit 1101c872c8c7869c78dc106ae820040f36807eda ]
We received an error report that perf-record caused 'Segmentation fault'
on a newly system (e.g. on the new installed ubuntu).
(gdb) backtrace
#0 __read_once_size (size=4, res=<synthetic pointer>, p=0x14) at /root/0-jinyao/acme/tools/include/linux/compiler.h:139
#1 atomic_read (v=0x14) at /root/0-jinyao/acme/tools/include/asm/../../arch/x86/include/asm/atomic.h:28
#2 refcount_read (r=0x14) at /root/0-jinyao/acme/tools/include/linux/refcount.h:65
#3 perf_mmap__read_init (map=map@entry=0x0) at mmap.c:177
#4 0x0000561ce5c0de39 in perf_evlist__poll_thread (arg=0x561ce68584d0) at util/sideband_evlist.c:62
#5 0x00007fad78491609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6 0x00007fad7823c103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
The root cause is, evlist__add_bpf_sb_event() just returns 0 if
HAVE_LIBBPF_SUPPORT is not defined (inline function path). So it will
not create a valid evsel for side-band event.
But perf-record still creates BPF side band thread to process the
side-band event, then the error happpens.
We can reproduce this issue by removing the libelf-dev. e.g.
1. apt-get remove libelf-dev
2. perf record -a -- sleep 1
root@test:~# ./perf record -a -- sleep 1
perf: Segmentation fault
Obtained 6 stack frames.
./perf(+0x28eee8) [0x5562d6ef6ee8]
/lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7fbfdc65f210]
./perf(+0x342e74) [0x5562d6faae74]
./perf(+0x257e39) [0x5562d6ebfe39]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fbfdc990609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fbfdc73b103]
Segmentation fault (core dumped)
To fix this issue,
1. We either install the missing libraries to let HAVE_LIBBPF_SUPPORT
be defined.
e.g. apt-get install libelf-dev and install other related libraries.
2. Use this patch to skip the side-band event setup if HAVE_LIBBPF_SUPPORT
is not set.
Committer notes:
The side band thread is not used just with BPF, it is also used with
--switch-output-event, so narrow the ifdef to the BPF specific part.
Fixes: 23cbb41c939a ("perf record: Move side band evlist setup to separate routine")
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/perf/builtin-record.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index a37e7910e9e90..23ea934f30b34 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1489,7 +1489,7 @@ static int record__setup_sb_evlist(struct record *rec)
evlist__set_cb(rec->sb_evlist, record__process_signal_event, rec);
rec->thread_id = pthread_self();
}
-
+#ifdef HAVE_LIBBPF_SUPPORT
if (!opts->no_bpf_event) {
if (rec->sb_evlist == NULL) {
rec->sb_evlist = evlist__new();
@@ -1505,7 +1505,7 @@ static int record__setup_sb_evlist(struct record *rec)
return -1;
}
}
-
+#endif
if (perf_evlist__start_sb_thread(rec->sb_evlist, &rec->opts.target)) {
pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n");
opts->no_bpf_event = true;
--
2.25.1
From: Jeffrey Mitchell <[email protected]>
[ Upstream commit b4487b93545214a9db8cbf32e86411677b0cca21 ]
Move the buffer size check to decode_attr_security_label() before memcpy()
Only call memcpy() if the buffer is large enough
Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS")
Signed-off-by: Jeffrey Mitchell <[email protected]>
[Trond: clean up duplicate test of label->len != 0]
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/nfs/nfs4proc.c | 2 --
fs/nfs/nfs4xdr.c | 6 +++++-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 2e2dac29a9e91..45e0585e0667c 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5845,8 +5845,6 @@ static int _nfs4_get_security_label(struct inode *inode, void *buf,
return ret;
if (!(fattr.valid & NFS_ATTR_FATTR_V4_SECURITY_LABEL))
return -ENOENT;
- if (buflen < label.len)
- return -ERANGE;
return 0;
}
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 47817ef0aadb1..4e0d8a3b89b67 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -4166,7 +4166,11 @@ static int decode_attr_security_label(struct xdr_stream *xdr, uint32_t *bitmap,
return -EIO;
if (len < NFS4_MAXLABELLEN) {
if (label) {
- memcpy(label->label, p, len);
+ if (label->len) {
+ if (label->len < len)
+ return -ERANGE;
+ memcpy(label->label, p, len);
+ }
label->len = len;
label->pi = pi;
label->lfs = lfs;
--
2.25.1
From: Dan Carpenter <[email protected]>
[ Upstream commit 4437c1152ce0e57ab8f401aa696ea6291cc07ab1 ]
These if statements are supposed to be true if we ended the
list_for_each_entry() loops without hitting a break statement but they
don't work.
In the first loop, we increment "i" after the "if (i == unit)" condition
so we don't necessarily know that "i" is not equal to unit at the end of
the loop.
In the second loop we exit when mode is not pointing to a valid
drm_display_mode struct so it doesn't make sense to check "mode->type".
Fixes: a278724aa23c ("drm/vmwgfx: Implement fbdev on kms v2")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 04d66592f6050..b7a9cee69ea72 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -2578,7 +2578,7 @@ int vmw_kms_fbdev_init_data(struct vmw_private *dev_priv,
++i;
}
- if (i != unit) {
+ if (&con->head == &dev_priv->dev->mode_config.connector_list) {
DRM_ERROR("Could not find initial display unit.\n");
ret = -EINVAL;
goto out_unlock;
@@ -2602,13 +2602,13 @@ int vmw_kms_fbdev_init_data(struct vmw_private *dev_priv,
break;
}
- if (mode->type & DRM_MODE_TYPE_PREFERRED)
- *p_mode = mode;
- else {
+ if (&mode->head == &con->modes) {
WARN_ONCE(true, "Could not find initial preferred mode.\n");
*p_mode = list_first_entry(&con->modes,
struct drm_display_mode,
head);
+ } else {
+ *p_mode = mode;
}
out_unlock:
--
2.25.1
From: Dan Carpenter <[email protected]>
[ Upstream commit cf16fe9243bfa2863491026fc727618c7c593c84 ]
If "offset" is non-zero then we end up copying from beyond the end of
the config because of pointer math. We can fix this by casting the
struct to a u8 pointer.
Fixes: 2c53d0f64c06 ("vdpasim: vDPA device simulator")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/20200406144552.GF68494@mwanda
Signed-off-by: Michael S. Tsirkin <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index 412fa85ba3653..67956db75013f 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -522,7 +522,7 @@ static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset,
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
if (offset + len < sizeof(struct virtio_net_config))
- memcpy(buf, &vdpasim->config + offset, len);
+ memcpy(buf, (u8 *)&vdpasim->config + offset, len);
}
static void vdpasim_set_config(struct vdpa_device *vdpa, unsigned int offset,
--
2.25.1
From: Wang Hai <[email protected]>
[ Upstream commit 75d3e7f4769d276a056efa1cc7f08de571fc9b4b ]
test_unwind() misses to call kfree(bt) in an error path.
Add the missed function call to fix it.
Fixes: 0610154650f1 ("s390/test_unwind: print verbose unwinding results")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Acked-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/lib/test_unwind.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/s390/lib/test_unwind.c b/arch/s390/lib/test_unwind.c
index 32b7a30b2485d..b0b12b46bc572 100644
--- a/arch/s390/lib/test_unwind.c
+++ b/arch/s390/lib/test_unwind.c
@@ -63,6 +63,7 @@ static noinline int test_unwind(struct task_struct *task, struct pt_regs *regs,
break;
if (state.reliable && !addr) {
pr_err("unwind state reliable but addr is 0\n");
+ kfree(bt);
return -EINVAL;
}
sprint_symbol(sym, addr);
--
2.25.1
From: Muchun Song <[email protected]>
[ Upstream commit 10de795a5addd1962406796a6e13ba6cc0fc6bee ]
Fix compiler warning(as show below) for !CONFIG_KPROBES_ON_FTRACE.
kernel/kprobes.c: In function 'kill_kprobe':
kernel/kprobes.c:1116:33: warning: statement with no effect
[-Wunused-value]
1116 | #define disarm_kprobe_ftrace(p) (-ENODEV)
| ^
kernel/kprobes.c:2154:3: note: in expansion of macro
'disarm_kprobe_ftrace'
2154 | disarm_kprobe_ftrace(p);
Link: https://lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Stephen Rothwell <[email protected]>
Fixes: 0cb2f1372baa ("kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler")
Acked-by: Masami Hiramatsu <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: Muchun Song <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/kprobes.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f54d4a29fc26e..72af5d37e9ff1 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1079,9 +1079,20 @@ static int disarm_kprobe_ftrace(struct kprobe *p)
ipmodify ? &kprobe_ipmodify_enabled : &kprobe_ftrace_enabled);
}
#else /* !CONFIG_KPROBES_ON_FTRACE */
-#define prepare_kprobe(p) arch_prepare_kprobe(p)
-#define arm_kprobe_ftrace(p) (-ENODEV)
-#define disarm_kprobe_ftrace(p) (-ENODEV)
+static inline int prepare_kprobe(struct kprobe *p)
+{
+ return arch_prepare_kprobe(p);
+}
+
+static inline int arm_kprobe_ftrace(struct kprobe *p)
+{
+ return -ENODEV;
+}
+
+static inline int disarm_kprobe_ftrace(struct kprobe *p)
+{
+ return -ENODEV;
+}
#endif
/* Arm a kprobe with text_mutex */
--
2.25.1
From: Chao Yu <[email protected]>
[ Upstream commit 944dd22ea4475bd11180fd2f431a4a547ca4d8f5 ]
We missed to update isize of compressed file in write_end() with
below case:
cluster size is 16KB
- write 14KB data from offset 0
- overwrite 16KB data from offset 0
Fixes: 4c8ff7095bef ("f2fs: support data compression")
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/f2fs/data.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 326c63879ddc8..6e9017e6a8197 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -3432,6 +3432,10 @@ static int f2fs_write_end(struct file *file,
if (f2fs_compressed_file(inode) && fsdata) {
f2fs_compress_write_end(inode, fsdata, page->index, copied);
f2fs_update_time(F2FS_I_SB(inode), REQ_TIME);
+
+ if (pos + copied > i_size_read(inode) &&
+ !f2fs_verity_in_progress(inode))
+ f2fs_i_size_write(inode, pos + copied);
return copied;
}
#endif
--
2.25.1
From: Colin Ian King <[email protected]>
[ Upstream commit dee9d154f40c58d02f69acdaa5cfd1eae6ebc28b ]
It is possible for the call to omap_iommu_dump_ctx to return
a negative error number, so check for the failure and return
the error number rather than pass the negative value to
simple_read_from_buffer.
Fixes: 14e0e6796a0d ("OMAP: iommu: add initial debugfs support")
Signed-off-by: Colin Ian King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Addresses-Coverity: ("Improper use of negative value")
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/omap-iommu-debug.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/omap-iommu-debug.c b/drivers/iommu/omap-iommu-debug.c
index 8e19bfa94121e..a99afb5d9011c 100644
--- a/drivers/iommu/omap-iommu-debug.c
+++ b/drivers/iommu/omap-iommu-debug.c
@@ -98,8 +98,11 @@ static ssize_t debug_read_regs(struct file *file, char __user *userbuf,
mutex_lock(&iommu_debug_lock);
bytes = omap_iommu_dump_ctx(obj, p, count);
+ if (bytes < 0)
+ goto err;
bytes = simple_read_from_buffer(userbuf, count, ppos, buf, bytes);
+err:
mutex_unlock(&iommu_debug_lock);
kfree(buf);
--
2.25.1
From: Wolfram Sang <[email protected]>
[ Upstream commit 314139f9f0abdba61ed9a8463bbcb0bf900ac5a2 ]
When the SSR interrupt is activated, it will detect every STOP condition
on the bus, not only the ones after we have been addressed. So, enable
this interrupt only after we have been addressed, and disable it
otherwise.
Fixes: de20d1857dd6 ("i2c: rcar: add slave support")
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/i2c/busses/i2c-rcar.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
index 2e3e1bb750134..76c615be5acae 100644
--- a/drivers/i2c/busses/i2c-rcar.c
+++ b/drivers/i2c/busses/i2c-rcar.c
@@ -583,13 +583,14 @@ static bool rcar_i2c_slave_irq(struct rcar_i2c_priv *priv)
rcar_i2c_write(priv, ICSIER, SDR | SSR | SAR);
}
- rcar_i2c_write(priv, ICSSR, ~SAR & 0xff);
+ /* Clear SSR, too, because of old STOPs to other clients than us */
+ rcar_i2c_write(priv, ICSSR, ~(SAR | SSR) & 0xff);
}
/* master sent stop */
if (ssr_filtered & SSR) {
i2c_slave_event(priv->slave, I2C_SLAVE_STOP, &value);
- rcar_i2c_write(priv, ICSIER, SAR | SSR);
+ rcar_i2c_write(priv, ICSIER, SAR);
rcar_i2c_write(priv, ICSSR, ~SSR & 0xff);
}
@@ -853,7 +854,7 @@ static int rcar_reg_slave(struct i2c_client *slave)
priv->slave = slave;
rcar_i2c_write(priv, ICSAR, slave->addr);
rcar_i2c_write(priv, ICSSR, 0);
- rcar_i2c_write(priv, ICSIER, SAR | SSR);
+ rcar_i2c_write(priv, ICSIER, SAR);
rcar_i2c_write(priv, ICSCR, SIE | SDBS);
return 0;
--
2.25.1
From: Jacob Pan <[email protected]>
[ Upstream commit 1ff00279655d95ae9c285c39878aedf9ff008d25 ]
For guest requested IOTLB invalidation, address and mask are provided as
part of the invalidation data. VT-d HW silently ignores any address bits
below the mask. SW shall also allow such case but give warning if
address does not align with the mask. This patch relax the fault
handling from error to warning and proceed with invalidation request
with the given mask.
Fixes: 6ee1b77ba3ac0 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Jacob Pan <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/intel/iommu.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e7bce09a9f735..04e82f1756010 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5452,13 +5452,12 @@ intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev,
switch (BIT(cache_type)) {
case IOMMU_CACHE_INV_TYPE_IOTLB:
+ /* HW will ignore LSB bits based on address mask */
if (inv_info->granularity == IOMMU_INV_GRANU_ADDR &&
size &&
(inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1))) {
- pr_err_ratelimited("Address out of range, 0x%llx, size order %llu\n",
+ pr_err_ratelimited("User address not aligned, 0x%llx, size order %llu\n",
inv_info->addr_info.addr, size);
- ret = -ERANGE;
- goto out_unlock;
}
/*
--
2.25.1
From: Krzysztof Kozlowski <[email protected]>
[ Upstream commit 929a343b858612100cb09443a8aaa20d4a4706d3 ]
The VFIO_AP uses ap_driver_register() (and deregister) functions
implemented in ap_bus.c (compiled into ap.o). However the ap.o will be
built only if CONFIG_ZCRYPT is selected.
This was not visible before commit e93a1695d7fb ("iommu: Enable compile
testing for some of drivers") because the CONFIG_VFIO_AP depends on
CONFIG_S390_AP_IOMMU which depends on the missing CONFIG_ZCRYPT. After
adding COMPILE_TEST, it is possible to select a configuration with
VFIO_AP and S390_AP_IOMMU but without the ZCRYPT.
Add proper dependency to the VFIO_AP to fix build errors:
ERROR: modpost: "ap_driver_register" [drivers/s390/crypto/vfio_ap.ko] undefined!
ERROR: modpost: "ap_driver_unregister" [drivers/s390/crypto/vfio_ap.ko] undefined!
Reported-by: kernel test robot <[email protected]>
Fixes: e93a1695d7fb ("iommu: Enable compile testing for some of drivers")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/s390/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index c7d7ede6300c5..4907a5149a8a3 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -769,6 +769,7 @@ config VFIO_AP
def_tristate n
prompt "VFIO support for AP devices"
depends on S390_AP_IOMMU && VFIO_MDEV_DEVICE && KVM
+ depends on ZCRYPT
help
This driver grants access to Adjunct Processor (AP) devices
via the VFIO mediated device interface.
--
2.25.1
From: Tero Kristo <[email protected]>
[ Upstream commit d5b29c2c5ba2bd5bbdb5b744659984185d17d079 ]
PM runtime should be disabled in the fail path of probe and when
the driver is removed.
Fixes: 2d63908bdbfb ("watchdog: Add K3 RTI watchdog support")
Signed-off-by: Tero Kristo <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/watchdog/rti_wdt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
index d456dd72d99a0..c904496fff65e 100644
--- a/drivers/watchdog/rti_wdt.c
+++ b/drivers/watchdog/rti_wdt.c
@@ -211,6 +211,7 @@ static int rti_wdt_probe(struct platform_device *pdev)
err_iomap:
pm_runtime_put_sync(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
return ret;
}
@@ -221,6 +222,7 @@ static int rti_wdt_remove(struct platform_device *pdev)
watchdog_unregister_device(&wdt->wdd);
pm_runtime_put(&pdev->dev);
+ pm_runtime_disable(&pdev->dev);
return 0;
}
--
2.25.1
From: Konrad Dybcio <[email protected]>
[ Upstream commit 3386af51d3bcebcba3f7becdb1ef2e384abe90cf ]
Add missing halt_check, hwcg_reg and hwcg_bit properties.
These were likely omitted when porting the driver upstream.
Signed-off-by: Konrad Dybcio <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Fixes: f2a76a2955c0 ("clk: qcom: Add Global Clock controller (GCC) driver for SDM660")
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/qcom/gcc-sdm660.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/clk/qcom/gcc-sdm660.c b/drivers/clk/qcom/gcc-sdm660.c
index bf5730832ef3d..c6fb57cd576f5 100644
--- a/drivers/clk/qcom/gcc-sdm660.c
+++ b/drivers/clk/qcom/gcc-sdm660.c
@@ -1715,6 +1715,9 @@ static struct clk_branch gcc_mss_cfg_ahb_clk = {
static struct clk_branch gcc_mss_mnoc_bimc_axi_clk = {
.halt_reg = 0x8a004,
+ .halt_check = BRANCH_HALT,
+ .hwcg_reg = 0x8a004,
+ .hwcg_bit = 1,
.clkr = {
.enable_reg = 0x8a004,
.enable_mask = BIT(0),
--
2.25.1
From: Krzysztof Sobota <[email protected]>
[ Upstream commit cb36e29bb0e4b0c33c3d5866a0a4aebace4c99b7 ]
When watchdog device is being registered, it calls misc_register that
makes watchdog available for systemd to open. This is a data race
scenario, because when device is open it may still have device struct
not initialized - this in turn causes a crash. This patch moves
device initialization before misc_register call and it solves the
problem printed below.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1 at lib/kobject.c:612 kobject_get+0x50/0x54
kobject: '(null)' ((ptrval)): is not initialized, yet kobject_get() is being called.
Modules linked in: k2_reset_status(O) davinci_wdt(+) sfn_platform_hwbcn(O) fsmddg_sfn(O) clk_misc_mmap(O) clk_sw_bcn(O) fsp_reset(O) cma_mod(O) slave_sup_notif(O) fpga_master(O) latency(O+) evnotify(O) enable_arm_pmu(O) xge(O) rio_mport_cdev br_netfilter bridge stp llc nvrd_checksum(O) ipv6
CPU: 3 PID: 1 Comm: systemd Tainted: G O 4.19.113-g2579778-fsm4_k2 #1
Hardware name: Keystone
[<c02126c4>] (unwind_backtrace) from [<c020da94>] (show_stack+0x18/0x1c)
[<c020da94>] (show_stack) from [<c07f87d8>] (dump_stack+0xb4/0xe8)
[<c07f87d8>] (dump_stack) from [<c0221f70>] (__warn+0xfc/0x114)
[<c0221f70>] (__warn) from [<c0221fd8>] (warn_slowpath_fmt+0x50/0x74)
[<c0221fd8>] (warn_slowpath_fmt) from [<c07fd394>] (kobject_get+0x50/0x54)
[<c07fd394>] (kobject_get) from [<c0602ce8>] (get_device+0x1c/0x24)
[<c0602ce8>] (get_device) from [<c06961e0>] (watchdog_open+0x90/0xf0)
[<c06961e0>] (watchdog_open) from [<c06001dc>] (misc_open+0x130/0x17c)
[<c06001dc>] (misc_open) from [<c0388228>] (chrdev_open+0xec/0x1a8)
[<c0388228>] (chrdev_open) from [<c037fa98>] (do_dentry_open+0x204/0x3cc)
[<c037fa98>] (do_dentry_open) from [<c0391e2c>] (path_openat+0x330/0x1148)
[<c0391e2c>] (path_openat) from [<c0394518>] (do_filp_open+0x78/0xec)
[<c0394518>] (do_filp_open) from [<c0381100>] (do_sys_open+0x130/0x1f4)
[<c0381100>] (do_sys_open) from [<c0201000>] (ret_fast_syscall+0x0/0x28)
Exception stack(0xd2ceffa8 to 0xd2cefff0)
ffa0: b6f69968 00000000 ffffff9c b6ebd210 000a0001 00000000
ffc0: b6f69968 00000000 00000000 00000142 fffffffd ffffffff 00b65530 bed7bb78
ffe0: 00000142 bed7ba70 b6cc2503 b6cc41d6
---[ end trace 7b16eb105513974f ]---
------------[ cut here ]------------
WARNING: CPU: 3 PID: 1 at lib/refcount.c:153 kobject_get+0x24/0x54
refcount_t: increment on 0; use-after-free.
Modules linked in: k2_reset_status(O) davinci_wdt(+) sfn_platform_hwbcn(O) fsmddg_sfn(O) clk_misc_mmap(O) clk_sw_bcn(O) fsp_reset(O) cma_mod(O) slave_sup_notif(O) fpga_master(O) latency(O+) evnotify(O) enable_arm_pmu(O) xge(O) rio_mport_cdev br_netfilter bridge stp llc nvrd_checksum(O) ipv6
CPU: 3 PID: 1 Comm: systemd Tainted: G W O 4.19.113-g2579778-fsm4_k2 #1
Hardware name: Keystone
[<c02126c4>] (unwind_backtrace) from [<c020da94>] (show_stack+0x18/0x1c)
[<c020da94>] (show_stack) from [<c07f87d8>] (dump_stack+0xb4/0xe8)
[<c07f87d8>] (dump_stack) from [<c0221f70>] (__warn+0xfc/0x114)
[<c0221f70>] (__warn) from [<c0221fd8>] (warn_slowpath_fmt+0x50/0x74)
[<c0221fd8>] (warn_slowpath_fmt) from [<c07fd368>] (kobject_get+0x24/0x54)
[<c07fd368>] (kobject_get) from [<c0602ce8>] (get_device+0x1c/0x24)
[<c0602ce8>] (get_device) from [<c06961e0>] (watchdog_open+0x90/0xf0)
[<c06961e0>] (watchdog_open) from [<c06001dc>] (misc_open+0x130/0x17c)
[<c06001dc>] (misc_open) from [<c0388228>] (chrdev_open+0xec/0x1a8)
[<c0388228>] (chrdev_open) from [<c037fa98>] (do_dentry_open+0x204/0x3cc)
[<c037fa98>] (do_dentry_open) from [<c0391e2c>] (path_openat+0x330/0x1148)
[<c0391e2c>] (path_openat) from [<c0394518>] (do_filp_open+0x78/0xec)
[<c0394518>] (do_filp_open) from [<c0381100>] (do_sys_open+0x130/0x1f4)
[<c0381100>] (do_sys_open) from [<c0201000>] (ret_fast_syscall+0x0/0x28)
Exception stack(0xd2ceffa8 to 0xd2cefff0)
ffa0: b6f69968 00000000 ffffff9c b6ebd210 000a0001 00000000
ffc0: b6f69968 00000000 00000000 00000142 fffffffd ffffffff 00b65530 bed7bb78
ffe0: 00000142 bed7ba70 b6cc2503 b6cc41d6
---[ end trace 7b16eb1055139750 ]---
Fixes: 72139dfa2464 ("watchdog: Fix the race between the release of watchdog_core_data and cdev")
Reviewed-by: Guenter Roeck <[email protected]>
Reviewed-by: Alexander Sverdlin <[email protected]>
Signed-off-by: Krzysztof Sobota <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/watchdog/watchdog_dev.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 7e4cd34a8c20e..b535f5fa279b9 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -994,6 +994,15 @@ static int watchdog_cdev_register(struct watchdog_device *wdd)
if (IS_ERR_OR_NULL(watchdog_kworker))
return -ENODEV;
+ device_initialize(&wd_data->dev);
+ wd_data->dev.devt = MKDEV(MAJOR(watchdog_devt), wdd->id);
+ wd_data->dev.class = &watchdog_class;
+ wd_data->dev.parent = wdd->parent;
+ wd_data->dev.groups = wdd->groups;
+ wd_data->dev.release = watchdog_core_data_release;
+ dev_set_drvdata(&wd_data->dev, wdd);
+ dev_set_name(&wd_data->dev, "watchdog%d", wdd->id);
+
kthread_init_work(&wd_data->work, watchdog_ping_work);
hrtimer_init(&wd_data->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
wd_data->timer.function = watchdog_timer_expired;
@@ -1014,15 +1023,6 @@ static int watchdog_cdev_register(struct watchdog_device *wdd)
}
}
- device_initialize(&wd_data->dev);
- wd_data->dev.devt = MKDEV(MAJOR(watchdog_devt), wdd->id);
- wd_data->dev.class = &watchdog_class;
- wd_data->dev.parent = wdd->parent;
- wd_data->dev.groups = wdd->groups;
- wd_data->dev.release = watchdog_core_data_release;
- dev_set_drvdata(&wd_data->dev, wdd);
- dev_set_name(&wd_data->dev, "watchdog%d", wdd->id);
-
/* Fill in the data structures */
cdev_init(&wd_data->cdev, &watchdog_fops);
--
2.25.1
From: Jonathan Marek <[email protected]>
[ Upstream commit 667f39b59b494d96ae70f4217637db2ebbee3df0 ]
Fix the parents and set BRANCH_HALT_SKIP. From the downstream driver it
should be a 500us delay and not skip, however this matches what was done
for other clocks that had 500us delay in downstream.
Fixes: f73a4230d5bb ("clk: qcom: gcc: Add GPU and NPU clocks for SM8150")
Signed-off-by: Jonathan Marek <[email protected]>
Tested-by: Dmitry Baryshkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/qcom/gcc-sm8150.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/qcom/gcc-sm8150.c b/drivers/clk/qcom/gcc-sm8150.c
index 72524cf110487..55e9d6d75a0cd 100644
--- a/drivers/clk/qcom/gcc-sm8150.c
+++ b/drivers/clk/qcom/gcc-sm8150.c
@@ -1617,6 +1617,7 @@ static struct clk_branch gcc_gpu_cfg_ahb_clk = {
};
static struct clk_branch gcc_gpu_gpll0_clk_src = {
+ .halt_check = BRANCH_HALT_SKIP,
.clkr = {
.enable_reg = 0x52004,
.enable_mask = BIT(15),
@@ -1632,13 +1633,14 @@ static struct clk_branch gcc_gpu_gpll0_clk_src = {
};
static struct clk_branch gcc_gpu_gpll0_div_clk_src = {
+ .halt_check = BRANCH_HALT_SKIP,
.clkr = {
.enable_reg = 0x52004,
.enable_mask = BIT(16),
.hw.init = &(struct clk_init_data){
.name = "gcc_gpu_gpll0_div_clk_src",
.parent_hws = (const struct clk_hw *[]){
- &gcc_gpu_gpll0_clk_src.clkr.hw },
+ &gpll0_out_even.clkr.hw },
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
.ops = &clk_branch2_ops,
@@ -1729,6 +1731,7 @@ static struct clk_branch gcc_npu_cfg_ahb_clk = {
};
static struct clk_branch gcc_npu_gpll0_clk_src = {
+ .halt_check = BRANCH_HALT_SKIP,
.clkr = {
.enable_reg = 0x52004,
.enable_mask = BIT(18),
@@ -1744,13 +1747,14 @@ static struct clk_branch gcc_npu_gpll0_clk_src = {
};
static struct clk_branch gcc_npu_gpll0_div_clk_src = {
+ .halt_check = BRANCH_HALT_SKIP,
.clkr = {
.enable_reg = 0x52004,
.enable_mask = BIT(19),
.hw.init = &(struct clk_init_data){
.name = "gcc_npu_gpll0_div_clk_src",
.parent_hws = (const struct clk_hw *[]){
- &gcc_npu_gpll0_clk_src.clkr.hw },
+ &gpll0_out_even.clkr.hw },
.num_parents = 1,
.flags = CLK_SET_RATE_PARENT,
.ops = &clk_branch2_ops,
--
2.25.1
From: Dan Carpenter <[email protected]>
[ Upstream commit 1d2c0c565bc0da25f5e899a862fb58e612b222df ]
The "entry" pointer is an offset from the list head and it doesn't
point to a valid vmw_legacy_display_unit struct. Presumably the
intent was to point to the last entry.
Also the "i++" wasn't used so I have removed that as well.
Fixes: d7e1958dbe4a ("drm/vmwgfx: Support older hardware.")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Roland Scheidegger <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
index 16dafff5cab19..009f1742bed51 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
@@ -81,7 +81,7 @@ static int vmw_ldu_commit_list(struct vmw_private *dev_priv)
struct vmw_legacy_display_unit *entry;
struct drm_framebuffer *fb = NULL;
struct drm_crtc *crtc = NULL;
- int i = 0;
+ int i;
/* If there is no display topology the host just assumes
* that the guest will set the same layout as the host.
@@ -92,12 +92,11 @@ static int vmw_ldu_commit_list(struct vmw_private *dev_priv)
crtc = &entry->base.crtc;
w = max(w, crtc->x + crtc->mode.hdisplay);
h = max(h, crtc->y + crtc->mode.vdisplay);
- i++;
}
if (crtc == NULL)
return 0;
- fb = entry->base.crtc.primary->state->fb;
+ fb = crtc->primary->state->fb;
return vmw_kms_write_svga(dev_priv, w, h, fb->pitches[0],
fb->format->cpp[0] * 8,
--
2.25.1
From: Martin KaFai Lau <[email protected]>
[ Upstream commit 811d7e375d08312dba23f3b6bf7e58ec14aa5dcb ]
It is common for networking tests creating its netns and making its own
setting under this new netns (e.g. changing tcp sysctl). If the test
forgot to restore to the original netns, it would affect the
result of other tests.
This patch saves the original netns at the beginning and then restores it
after every test. Since the restore "setns()" is not expensive, it does it
on all tests without tracking if a test has created a new netns or not.
The new restore_netns() could also be done in test__end_subtest() such
that each subtest will get an automatic netns reset. However,
the individual test would lose flexibility to have total control
on netns for its own subtests. In some cases, forcing a test to do
unnecessary netns re-configure for each subtest is time consuming.
e.g. In my vm, forcing netns re-configure on each subtest in sk_assign.c
increased the runtime from 1s to 8s. On top of that, test_progs.c
is also doing per-test (instead of per-subtest) cleanup for cgroup.
Thus, this patch also does per-test restore_netns(). The only existing
per-subtest cleanup is reset_affinity() and no test is depending on this.
Thus, it is removed from test__end_subtest() to give a consistent
expectation to the individual tests. test_progs.c only ensures
any affinity/netns/cgroup change made by an earlier test does not
affect the following tests.
Signed-off-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/bpf/test_progs.c | 23 +++++++++++++++++++++--
tools/testing/selftests/bpf/test_progs.h | 2 ++
2 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index da70a4f72f547..6218b2b5a3f62 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -121,6 +121,24 @@ static void reset_affinity() {
}
}
+static void save_netns(void)
+{
+ env.saved_netns_fd = open("/proc/self/ns/net", O_RDONLY);
+ if (env.saved_netns_fd == -1) {
+ perror("open(/proc/self/ns/net)");
+ exit(-1);
+ }
+}
+
+static void restore_netns(void)
+{
+ if (setns(env.saved_netns_fd, CLONE_NEWNET) == -1) {
+ stdio_restore();
+ perror("setns(CLONE_NEWNS)");
+ exit(-1);
+ }
+}
+
void test__end_subtest()
{
struct prog_test_def *test = env.test;
@@ -138,8 +156,6 @@ void test__end_subtest()
test->test_num, test->subtest_num,
test->subtest_name, sub_error_cnt ? "FAIL" : "OK");
- reset_affinity();
-
free(test->subtest_name);
test->subtest_name = NULL;
}
@@ -643,6 +659,7 @@ int main(int argc, char **argv)
return -1;
}
+ save_netns();
stdio_hijack();
for (i = 0; i < prog_test_cnt; i++) {
struct prog_test_def *test = &prog_test_defs[i];
@@ -673,6 +690,7 @@ int main(int argc, char **argv)
test->error_cnt ? "FAIL" : "OK");
reset_affinity();
+ restore_netns();
if (test->need_cgroup_cleanup)
cleanup_cgroup_environment();
}
@@ -686,6 +704,7 @@ int main(int argc, char **argv)
free_str_set(&env.subtest_selector.blacklist);
free_str_set(&env.subtest_selector.whitelist);
free(env.subtest_selector.num_set);
+ close(env.saved_netns_fd);
if (env.succ_cnt + env.fail_cnt + env.skip_cnt == 0)
return EXIT_FAILURE;
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index f4503c926acad..b809246039181 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -78,6 +78,8 @@ struct test_env {
int sub_succ_cnt; /* successful sub-tests */
int fail_cnt; /* total failed tests + sub-tests */
int skip_cnt; /* skipped tests */
+
+ int saved_netns_fd;
};
extern struct test_env env;
--
2.25.1
From: Jesper Dangaard Brouer <[email protected]>
[ Upstream commit 6c92bd5cd4650c39dd929565ee172984c680fead ]
When a user selects a non-existing test the summary is printed with
indication 0 for all info types, and shell "success" (EXIT_SUCCESS) is
indicated. This can be understood by a human end-user, but for shell
scripting is it useful to indicate a shell failure (EXIT_FAILURE).
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/159363984736.930467.17956007131403952343.stgit@firesoul
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/bpf/test_progs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 54fa5fa688ce9..da70a4f72f547 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -687,5 +687,8 @@ int main(int argc, char **argv)
free_str_set(&env.subtest_selector.whitelist);
free(env.subtest_selector.num_set);
+ if (env.succ_cnt + env.fail_cnt + env.skip_cnt == 0)
+ return EXIT_FAILURE;
+
return env.fail_cnt ? EXIT_FAILURE : EXIT_SUCCESS;
}
--
2.25.1
From: Aneesh Kumar K.V <[email protected]>
[ Upstream commit 3563b9bea0ca7f53e4218b5e268550341a49f333 ]
With commit 4a4a5e5d2aad ("powerpc/pkeys: key allocation/deallocation
must not change pkey registers") we are not updating UAMOR on key
allocation. So don't update the expected uamor value in the test.
Fixes: 4a4a5e5d2aad ("powerpc/pkeys: key allocation/deallocation must not change pkey registers")
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
index bc33d748d95b4..3694613f418f6 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -101,15 +101,20 @@ static int child(struct shared_info *info)
*/
info->invalid_amr = info->amr2 | (~0x0UL & ~info->expected_uamor);
+ /*
+ * if PKEY_DISABLE_EXECUTE succeeded we should update the expected_iamr
+ */
if (disable_execute)
info->expected_iamr |= 1ul << pkeyshift(pkey1);
else
info->expected_iamr &= ~(1ul << pkeyshift(pkey1));
- info->expected_iamr &= ~(1ul << pkeyshift(pkey2) | 1ul << pkeyshift(pkey3));
+ /*
+ * We allocated pkey2 and pkey 3 above. Clear the IAMR bits.
+ */
+ info->expected_iamr &= ~(1ul << pkeyshift(pkey2));
+ info->expected_iamr &= ~(1ul << pkeyshift(pkey3));
- info->expected_uamor |= 3ul << pkeyshift(pkey1) |
- 3ul << pkeyshift(pkey2);
/*
* Create an IAMR value different from expected value.
* Kernel will reject an IAMR and UAMOR change.
--
2.25.1
From: Wei Hu <[email protected]>
[ Upstream commit d6af2ed29c7c1c311b96dac989dcb991e90ee195 ]
Kdump could fail sometime on Hyper-V guest because the retry in
hv_pci_enter_d0() releases child device structures in hv_pci_bus_exit().
Although there is a second asynchronous device relations message sending
from the host, if this message arrives to the guest after
hv_send_resource_allocated() is called, the retry would fail.
Fix the problem by moving retry to hv_pci_probe() and start the retry
from hv_pci_query_relations() call. This will cause a device relations
message to arrive to the guest synchronously; the guest would then be
able to rebuild the child device structures before calling
hv_send_resource_allocated().
Link: https://lore.kernel.org/r/[email protected]
Fixes: c81992e7f4aa ("PCI: hv: Retry PCI bus D0 entry on invalid device state")
Signed-off-by: Wei Hu <[email protected]>
[[email protected]: fixed a comment and commit log]
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/pci/controller/pci-hyperv.c | 71 +++++++++++++++--------------
1 file changed, 37 insertions(+), 34 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index bf40ff09c99d6..d0033ff6c1437 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2759,10 +2759,8 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
struct pci_bus_d0_entry *d0_entry;
struct hv_pci_compl comp_pkt;
struct pci_packet *pkt;
- bool retry = true;
int ret;
-enter_d0_retry:
/*
* Tell the host that the bus is ready to use, and moved into the
* powered-on state. This includes telling the host which region
@@ -2789,38 +2787,6 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
if (ret)
goto exit;
- /*
- * In certain case (Kdump) the pci device of interest was
- * not cleanly shut down and resource is still held on host
- * side, the host could return invalid device status.
- * We need to explicitly request host to release the resource
- * and try to enter D0 again.
- */
- if (comp_pkt.completion_status < 0 && retry) {
- retry = false;
-
- dev_err(&hdev->device, "Retrying D0 Entry\n");
-
- /*
- * Hv_pci_bus_exit() calls hv_send_resource_released()
- * to free up resources of its child devices.
- * In the kdump kernel we need to set the
- * wslot_res_allocated to 255 so it scans all child
- * devices to release resources allocated in the
- * normal kernel before panic happened.
- */
- hbus->wslot_res_allocated = 255;
-
- ret = hv_pci_bus_exit(hdev, true);
-
- if (ret == 0) {
- kfree(pkt);
- goto enter_d0_retry;
- }
- dev_err(&hdev->device,
- "Retrying D0 failed with ret %d\n", ret);
- }
-
if (comp_pkt.completion_status < 0) {
dev_err(&hdev->device,
"PCI Pass-through VSP failed D0 Entry with status %x\n",
@@ -3058,6 +3024,7 @@ static int hv_pci_probe(struct hv_device *hdev,
struct hv_pcibus_device *hbus;
u16 dom_req, dom;
char *name;
+ bool enter_d0_retry = true;
int ret;
/*
@@ -3178,11 +3145,47 @@ static int hv_pci_probe(struct hv_device *hdev,
if (ret)
goto free_fwnode;
+retry:
ret = hv_pci_query_relations(hdev);
if (ret)
goto free_irq_domain;
ret = hv_pci_enter_d0(hdev);
+ /*
+ * In certain case (Kdump) the pci device of interest was
+ * not cleanly shut down and resource is still held on host
+ * side, the host could return invalid device status.
+ * We need to explicitly request host to release the resource
+ * and try to enter D0 again.
+ * Since the hv_pci_bus_exit() call releases structures
+ * of all its child devices, we need to start the retry from
+ * hv_pci_query_relations() call, requesting host to send
+ * the synchronous child device relations message before this
+ * information is needed in hv_send_resources_allocated()
+ * call later.
+ */
+ if (ret == -EPROTO && enter_d0_retry) {
+ enter_d0_retry = false;
+
+ dev_err(&hdev->device, "Retrying D0 Entry\n");
+
+ /*
+ * Hv_pci_bus_exit() calls hv_send_resources_released()
+ * to free up resources of its child devices.
+ * In the kdump kernel we need to set the
+ * wslot_res_allocated to 255 so it scans all child
+ * devices to release resources allocated in the
+ * normal kernel before panic happened.
+ */
+ hbus->wslot_res_allocated = 255;
+ ret = hv_pci_bus_exit(hdev, true);
+
+ if (ret == 0)
+ goto retry;
+
+ dev_err(&hdev->device,
+ "Retrying D0 failed with ret %d\n", ret);
+ }
if (ret)
goto free_irq_domain;
--
2.25.1
From: Jacob Pan <[email protected]>
[ Upstream commit d315e9e684d1efd4cb2e8cd70b8d71dec02fcf1f ]
For the unlikely use case where multiple aux domains from the same pdev
are attached to a single guest and then bound to a single process
(thus same PASID) within that guest, we cannot easily support this case
by refcounting the number of users. As there is only one SL page table
per PASID while we have multiple aux domains thus multiple SL page tables
for the same PASID.
Extra unbinding guest PASID can happen due to race between normal and
exception cases. Termination of one aux domain may affect others unless
we actively track and switch aux domains to ensure the validity of SL
page tables and TLB states in the shared PASID entry.
Support for sharing second level PGDs across domains can reduce the
complexity but this is not available due to the limitations on VFIO
container architecture. We can revisit this decision once sharing PGDs
are available.
Overall, the complexity and potential glitch do not warrant this unlikely
use case thereby removed by this patch.
Fixes: 56722a4398a30 ("iommu/vt-d: Add bind guest PASID support")
Signed-off-by: Liu Yi L <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Cc: Kevin Tian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/iommu/intel/svm.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 6c87c807a0abb..d386853121a26 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -277,20 +277,16 @@ int intel_svm_bind_gpasid(struct iommu_domain *domain, struct device *dev,
goto out;
}
+ /*
+ * Do not allow multiple bindings of the same device-PASID since
+ * there is only one SL page tables per PASID. We may revisit
+ * once sharing PGD across domains are supported.
+ */
for_each_svm_dev(sdev, svm, dev) {
- /*
- * For devices with aux domains, we should allow
- * multiple bind calls with the same PASID and pdev.
- */
- if (iommu_dev_feature_enabled(dev,
- IOMMU_DEV_FEAT_AUX)) {
- sdev->users++;
- } else {
- dev_warn_ratelimited(dev,
- "Already bound with PASID %u\n",
- svm->pasid);
- ret = -EBUSY;
- }
+ dev_warn_ratelimited(dev,
+ "Already bound with PASID %u\n",
+ svm->pasid);
+ ret = -EBUSY;
goto out;
}
} else {
--
2.25.1
From: Sebastian Reichel <[email protected]>
[ Upstream commit 3180cfabf6fbf982ca6d1a6eb56334647cc1416b ]
Unbreak CPCAP driver, which has one more bit in the day counter
increasing the max. range from 2014 to 2058. The original commit
introducing the range limit was obviously wrong, since the driver
has only been written in 2017 (3 years after 14 bits would have
run out).
Fixes: d2377f8cc5a7 ("rtc: cpcap: set range")
Reported-by: Sicelo A. Mhlongo <[email protected]>
Reported-by: Dev Null <[email protected]>
Signed-off-by: Sebastian Reichel <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Tested-by: Merlijn Wajer <[email protected]>
Acked-by: Tony Lindgren <[email protected]>
Acked-by: Merlijn Wajer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/rtc/rtc-cpcap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/rtc/rtc-cpcap.c b/drivers/rtc/rtc-cpcap.c
index a603f1f211250..800667d73a6fb 100644
--- a/drivers/rtc/rtc-cpcap.c
+++ b/drivers/rtc/rtc-cpcap.c
@@ -261,7 +261,7 @@ static int cpcap_rtc_probe(struct platform_device *pdev)
return PTR_ERR(rtc->rtc_dev);
rtc->rtc_dev->ops = &cpcap_rtc_ops;
- rtc->rtc_dev->range_max = (1 << 14) * SECS_PER_DAY - 1;
+ rtc->rtc_dev->range_max = (timeu64_t) (DAY_MASK + 1) * SECS_PER_DAY - 1;
err = cpcap_get_vendor(dev, rtc->regmap, &rtc->vendor);
if (err)
--
2.25.1
From: Liu Yi L <[email protected]>
[ Upstream commit 5f77d6ca5ca74e4b4a5e2e010f7ff50c45dea326 ]
Set proper masks to avoid invalid input spillover to reserved bits.
Signed-off-by: Liu Yi L <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Reviewed-by: Eric Auger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/intel-iommu.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 04bd9279c3fb3..711bdca975be3 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -381,8 +381,8 @@ enum {
#define QI_DEV_EIOTLB_ADDR(a) ((u64)(a) & VTD_PAGE_MASK)
#define QI_DEV_EIOTLB_SIZE (((u64)1) << 11)
-#define QI_DEV_EIOTLB_GLOB(g) ((u64)g)
-#define QI_DEV_EIOTLB_PASID(p) (((u64)p) << 32)
+#define QI_DEV_EIOTLB_GLOB(g) ((u64)(g) & 0x1)
+#define QI_DEV_EIOTLB_PASID(p) ((u64)((p) & 0xfffff) << 32)
#define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0xffff) << 16)
#define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4)
#define QI_DEV_EIOTLB_PFSID(pfsid) (((u64)(pfsid & 0xf) << 12) | \
--
2.25.1
From: Aneesh Kumar K.V <[email protected]>
[ Upstream commit 0eaa3b5ca7b5a76e3783639c828498343be66a01 ]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
.../selftests/powerpc/ptrace/ptrace-pkey.c | 30 ++++++++-----------
1 file changed, 12 insertions(+), 18 deletions(-)
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
index f9216c7a1829e..bc33d748d95b4 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -66,11 +66,6 @@ static int sys_pkey_alloc(unsigned long flags, unsigned long init_access_rights)
return syscall(__NR_pkey_alloc, flags, init_access_rights);
}
-static int sys_pkey_free(int pkey)
-{
- return syscall(__NR_pkey_free, pkey);
-}
-
static int child(struct shared_info *info)
{
unsigned long reg;
@@ -100,7 +95,11 @@ static int child(struct shared_info *info)
info->amr1 |= 3ul << pkeyshift(pkey1);
info->amr2 |= 3ul << pkeyshift(pkey2);
- info->invalid_amr |= info->amr2 | 3ul << pkeyshift(pkey3);
+ /*
+ * invalid amr value where we try to force write
+ * things which are deined by a uamor setting.
+ */
+ info->invalid_amr = info->amr2 | (~0x0UL & ~info->expected_uamor);
if (disable_execute)
info->expected_iamr |= 1ul << pkeyshift(pkey1);
@@ -111,17 +110,12 @@ static int child(struct shared_info *info)
info->expected_uamor |= 3ul << pkeyshift(pkey1) |
3ul << pkeyshift(pkey2);
- info->invalid_iamr |= 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
- info->invalid_uamor |= 3ul << pkeyshift(pkey1);
-
/*
- * We won't use pkey3. We just want a plausible but invalid key to test
- * whether ptrace will let us write to AMR bits we are not supposed to.
- *
- * This also tests whether the kernel restores the UAMOR permissions
- * after a key is freed.
+ * Create an IAMR value different from expected value.
+ * Kernel will reject an IAMR and UAMOR change.
*/
- sys_pkey_free(pkey3);
+ info->invalid_iamr = info->expected_iamr | (1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2));
+ info->invalid_uamor = info->expected_uamor & ~(0x3ul << pkeyshift(pkey1));
printf("%-30s AMR: %016lx pkey1: %d pkey2: %d pkey3: %d\n",
user_write, info->amr1, pkey1, pkey2, pkey3);
@@ -196,9 +190,9 @@ static int parent(struct shared_info *info, pid_t pid)
PARENT_SKIP_IF_UNSUPPORTED(ret, &info->child_sync);
PARENT_FAIL_IF(ret, &info->child_sync);
- info->amr1 = info->amr2 = info->invalid_amr = regs[0];
- info->expected_iamr = info->invalid_iamr = regs[1];
- info->expected_uamor = info->invalid_uamor = regs[2];
+ info->amr1 = info->amr2 = regs[0];
+ info->expected_iamr = regs[1];
+ info->expected_uamor = regs[2];
/* Wake up child so that it can set itself up. */
ret = prod_child(&info->child_sync);
--
2.25.1
From: Paul Kocialkowski <[email protected]>
[ Upstream commit ded874ece29d3fe2abd3775810a06056067eb68c ]
This introduces two macros: RGA_COLOR_FMT_IS_YUV and RGA_COLOR_FMT_IS_RGB
which allow quick checking of the colorspace familily of a RGA color format.
These macros are then used to refactor the logic for CSC mode selection.
The two nested tests for input colorspace are simplified into a single one,
with a logical and, making the whole more readable.
Signed-off-by: Paul Kocialkowski <[email protected]>
Reviewed-by: Ezequiel Garcia <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/media/platform/rockchip/rga/rga-hw.c | 23 +++++++++-----------
drivers/media/platform/rockchip/rga/rga-hw.h | 5 +++++
2 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
index 4be6dcf292fff..5607ee8d19176 100644
--- a/drivers/media/platform/rockchip/rga/rga-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga-hw.c
@@ -200,22 +200,19 @@ static void rga_cmd_set_trans_info(struct rga_ctx *ctx)
dst_info.data.format = ctx->out.fmt->hw_format;
dst_info.data.swap = ctx->out.fmt->color_swap;
- if (ctx->in.fmt->hw_format >= RGA_COLOR_FMT_YUV422SP) {
- if (ctx->out.fmt->hw_format < RGA_COLOR_FMT_YUV422SP) {
- switch (ctx->in.colorspace) {
- case V4L2_COLORSPACE_REC709:
- src_info.data.csc_mode =
- RGA_SRC_CSC_MODE_BT709_R0;
- break;
- default:
- src_info.data.csc_mode =
- RGA_SRC_CSC_MODE_BT601_R0;
- break;
- }
+ if (RGA_COLOR_FMT_IS_YUV(ctx->in.fmt->hw_format) &&
+ RGA_COLOR_FMT_IS_RGB(ctx->out.fmt->hw_format)) {
+ switch (ctx->in.colorspace) {
+ case V4L2_COLORSPACE_REC709:
+ src_info.data.csc_mode = RGA_SRC_CSC_MODE_BT709_R0;
+ break;
+ default:
+ src_info.data.csc_mode = RGA_SRC_CSC_MODE_BT601_R0;
+ break;
}
}
- if (ctx->out.fmt->hw_format >= RGA_COLOR_FMT_YUV422SP) {
+ if (RGA_COLOR_FMT_IS_YUV(ctx->out.fmt->hw_format)) {
switch (ctx->out.colorspace) {
case V4L2_COLORSPACE_REC709:
dst_info.data.csc_mode = RGA_SRC_CSC_MODE_BT709_R0;
diff --git a/drivers/media/platform/rockchip/rga/rga-hw.h b/drivers/media/platform/rockchip/rga/rga-hw.h
index 96cb0314dfa70..e8917e5630a48 100644
--- a/drivers/media/platform/rockchip/rga/rga-hw.h
+++ b/drivers/media/platform/rockchip/rga/rga-hw.h
@@ -95,6 +95,11 @@
#define RGA_COLOR_FMT_CP_8BPP 15
#define RGA_COLOR_FMT_MASK 15
+#define RGA_COLOR_FMT_IS_YUV(fmt) \
+ (((fmt) >= RGA_COLOR_FMT_YUV422SP) && ((fmt) < RGA_COLOR_FMT_CP_1BPP))
+#define RGA_COLOR_FMT_IS_RGB(fmt) \
+ ((fmt) < RGA_COLOR_FMT_YUV422SP)
+
#define RGA_COLOR_NONE_SWAP 0
#define RGA_COLOR_RB_SWAP 1
#define RGA_COLOR_ALPHA_SWAP 2
--
2.25.1
From: Qais Yousef <[email protected]>
[ Upstream commit 46609ce227039fd192e0ecc7d940bed587fd2c78 ]
There is a report that when uclamp is enabled, a netperf UDP test
regresses compared to a kernel compiled without uclamp.
https://lore.kernel.org/lkml/[email protected]/
While investigating the root cause, there were no sign that the uclamp
code is doing anything particularly expensive but could suffer from bad
cache behavior under certain circumstances that are yet to be
understood.
https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
To reduce the pressure on the fast path anyway, add a static key that is
by default will skip executing uclamp logic in the
enqueue/dequeue_task() fast path until it's needed.
As soon as the user start using util clamp by:
1. Changing uclamp value of a task with sched_setattr()
2. Modifying the default sysctl_sched_util_clamp_{min, max}
3. Modifying the default cpu.uclamp.{min, max} value in cgroup
We flip the static key now that the user has opted to use util clamp.
Effectively re-introducing uclamp logic in the enqueue/dequeue_task()
fast path. It stays on from that point forward until the next reboot.
This should help minimize the effect of util clamp on workloads that
don't need it but still allow distros to ship their kernels with uclamp
compiled in by default.
SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since now we can end
up with unbalanced call to uclamp_rq_dec_id() if we flip the key while
a task is running in the rq. Since we know it is harmless we just
quietly return if we attempt a uclamp_rq_dec_id() when
rq->uclamp[].bucket[].tasks is 0.
In schedutil, we introduce a new uclamp_is_enabled() helper which takes
the static key into account to ensure RT boosting behavior is retained.
The following results demonstrates how this helps on 2 Sockets Xeon E5
2x10-Cores system.
nouclamp uclamp uclamp-static-key
Hmean send-64 162.43 ( 0.00%) 157.84 * -2.82%* 163.39 * 0.59%*
Hmean send-128 324.71 ( 0.00%) 314.78 * -3.06%* 326.18 * 0.45%*
Hmean send-256 641.55 ( 0.00%) 628.67 * -2.01%* 648.12 * 1.02%*
Hmean send-1024 2525.28 ( 0.00%) 2448.26 * -3.05%* 2543.73 * 0.73%*
Hmean send-2048 4836.14 ( 0.00%) 4712.08 * -2.57%* 4867.69 * 0.65%*
Hmean send-3312 7540.83 ( 0.00%) 7425.45 * -1.53%* 7621.06 * 1.06%*
Hmean send-4096 9124.53 ( 0.00%) 8948.82 * -1.93%* 9276.25 * 1.66%*
Hmean send-8192 15589.67 ( 0.00%) 15486.35 * -0.66%* 15819.98 * 1.48%*
Hmean send-16384 26386.47 ( 0.00%) 25752.25 * -2.40%* 26773.74 * 1.47%*
The perf diff between nouclamp and uclamp-static-key when uclamp is
disabled in the fast path:
8.73% -1.55% [kernel.kallsyms] [k] try_to_wake_up
0.07% +0.04% [kernel.kallsyms] [k] deactivate_task
0.13% -0.02% [kernel.kallsyms] [k] activate_task
The diff between nouclamp and uclamp-static-key when uclamp is enabled
in the fast path:
8.73% -0.72% [kernel.kallsyms] [k] try_to_wake_up
0.13% +0.39% [kernel.kallsyms] [k] activate_task
0.07% +0.38% [kernel.kallsyms] [k] deactivate_task
Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
Reported-by: Mel Gorman <[email protected]>
Signed-off-by: Qais Yousef <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/sched/core.c | 74 +++++++++++++++++++++++++++++++-
kernel/sched/cpufreq_schedutil.c | 2 +-
kernel/sched/sched.h | 47 +++++++++++++++++++-
3 files changed, 119 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c3cbdc436e2e4..db1e99756c400 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -794,6 +794,26 @@ unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
/* All clamps are required to be less or equal than these values */
static struct uclamp_se uclamp_default[UCLAMP_CNT];
+/*
+ * This static key is used to reduce the uclamp overhead in the fast path. It
+ * primarily disables the call to uclamp_rq_{inc, dec}() in
+ * enqueue/dequeue_task().
+ *
+ * This allows users to continue to enable uclamp in their kernel config with
+ * minimum uclamp overhead in the fast path.
+ *
+ * As soon as userspace modifies any of the uclamp knobs, the static key is
+ * enabled, since we have an actual users that make use of uclamp
+ * functionality.
+ *
+ * The knobs that would enable this static key are:
+ *
+ * * A task modifying its uclamp value with sched_setattr().
+ * * An admin modifying the sysctl_sched_uclamp_{min, max} via procfs.
+ * * An admin modifying the cgroup cpu.uclamp.{min, max}
+ */
+DEFINE_STATIC_KEY_FALSE(sched_uclamp_used);
+
/* Integer rounded range for each bucket */
#define UCLAMP_BUCKET_DELTA DIV_ROUND_CLOSEST(SCHED_CAPACITY_SCALE, UCLAMP_BUCKETS)
@@ -990,10 +1010,38 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
lockdep_assert_held(&rq->lock);
+ /*
+ * If sched_uclamp_used was enabled after task @p was enqueued,
+ * we could end up with unbalanced call to uclamp_rq_dec_id().
+ *
+ * In this case the uc_se->active flag should be false since no uclamp
+ * accounting was performed at enqueue time and we can just return
+ * here.
+ *
+ * Need to be careful of the following enqeueue/dequeue ordering
+ * problem too
+ *
+ * enqueue(taskA)
+ * // sched_uclamp_used gets enabled
+ * enqueue(taskB)
+ * dequeue(taskA)
+ * // Must not decrement bukcet->tasks here
+ * dequeue(taskB)
+ *
+ * where we could end up with stale data in uc_se and
+ * bucket[uc_se->bucket_id].
+ *
+ * The following check here eliminates the possibility of such race.
+ */
+ if (unlikely(!uc_se->active))
+ return;
+
bucket = &uc_rq->bucket[uc_se->bucket_id];
+
SCHED_WARN_ON(!bucket->tasks);
if (likely(bucket->tasks))
bucket->tasks--;
+
uc_se->active = false;
/*
@@ -1021,6 +1069,15 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
{
enum uclamp_id clamp_id;
+ /*
+ * Avoid any overhead until uclamp is actually used by the userspace.
+ *
+ * The condition is constructed such that a NOP is generated when
+ * sched_uclamp_used is disabled.
+ */
+ if (!static_branch_unlikely(&sched_uclamp_used))
+ return;
+
if (unlikely(!p->sched_class->uclamp_enabled))
return;
@@ -1036,6 +1093,15 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
{
enum uclamp_id clamp_id;
+ /*
+ * Avoid any overhead until uclamp is actually used by the userspace.
+ *
+ * The condition is constructed such that a NOP is generated when
+ * sched_uclamp_used is disabled.
+ */
+ if (!static_branch_unlikely(&sched_uclamp_used))
+ return;
+
if (unlikely(!p->sched_class->uclamp_enabled))
return;
@@ -1144,8 +1210,10 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
update_root_tg = true;
}
- if (update_root_tg)
+ if (update_root_tg) {
+ static_branch_enable(&sched_uclamp_used);
uclamp_update_root_tg();
+ }
/*
* We update all RUNNABLE tasks only when task groups are in use.
@@ -1210,6 +1278,8 @@ static void __setscheduler_uclamp(struct task_struct *p,
if (likely(!(attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)))
return;
+ static_branch_enable(&sched_uclamp_used);
+
if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP_MIN) {
uclamp_se_set(&p->uclamp_req[UCLAMP_MIN],
attr->sched_util_min, true);
@@ -7442,6 +7512,8 @@ static ssize_t cpu_uclamp_write(struct kernfs_open_file *of, char *buf,
if (req.ret)
return req.ret;
+ static_branch_enable(&sched_uclamp_used);
+
mutex_lock(&uclamp_mutex);
rcu_read_lock();
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 7fbaee24c824f..dc6835bc64907 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -210,7 +210,7 @@ unsigned long schedutil_cpu_util(int cpu, unsigned long util_cfs,
unsigned long dl_util, util, irq;
struct rq *rq = cpu_rq(cpu);
- if (!IS_BUILTIN(CONFIG_UCLAMP_TASK) &&
+ if (!uclamp_is_used() &&
type == FREQUENCY_UTIL && rt_rq_is_runnable(&rq->rt)) {
return max;
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 877fb08eb1b04..c82857e2e288a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -862,6 +862,8 @@ struct uclamp_rq {
unsigned int value;
struct uclamp_bucket bucket[UCLAMP_BUCKETS];
};
+
+DECLARE_STATIC_KEY_FALSE(sched_uclamp_used);
#endif /* CONFIG_UCLAMP_TASK */
/*
@@ -2349,12 +2351,35 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
#ifdef CONFIG_UCLAMP_TASK
unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
+/**
+ * uclamp_rq_util_with - clamp @util with @rq and @p effective uclamp values.
+ * @rq: The rq to clamp against. Must not be NULL.
+ * @util: The util value to clamp.
+ * @p: The task to clamp against. Can be NULL if you want to clamp
+ * against @rq only.
+ *
+ * Clamps the passed @util to the max(@rq, @p) effective uclamp values.
+ *
+ * If sched_uclamp_used static key is disabled, then just return the util
+ * without any clamping since uclamp aggregation at the rq level in the fast
+ * path is disabled, rendering this operation a NOP.
+ *
+ * Use uclamp_eff_value() if you don't care about uclamp values at rq level. It
+ * will return the correct effective uclamp value of the task even if the
+ * static key is disabled.
+ */
static __always_inline
unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
struct task_struct *p)
{
- unsigned long min_util = READ_ONCE(rq->uclamp[UCLAMP_MIN].value);
- unsigned long max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value);
+ unsigned long min_util;
+ unsigned long max_util;
+
+ if (!static_branch_likely(&sched_uclamp_used))
+ return util;
+
+ min_util = READ_ONCE(rq->uclamp[UCLAMP_MIN].value);
+ max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value);
if (p) {
min_util = max(min_util, uclamp_eff_value(p, UCLAMP_MIN));
@@ -2371,6 +2396,19 @@ unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
return clamp(util, min_util, max_util);
}
+
+/*
+ * When uclamp is compiled in, the aggregation at rq level is 'turned off'
+ * by default in the fast path and only gets turned on once userspace performs
+ * an operation that requires it.
+ *
+ * Returns true if userspace opted-in to use uclamp and aggregation at rq level
+ * hence is active.
+ */
+static inline bool uclamp_is_used(void)
+{
+ return static_branch_likely(&sched_uclamp_used);
+}
#else /* CONFIG_UCLAMP_TASK */
static inline
unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
@@ -2378,6 +2416,11 @@ unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
{
return util;
}
+
+static inline bool uclamp_is_used(void)
+{
+ return false;
+}
#endif /* CONFIG_UCLAMP_TASK */
#ifdef arch_scale_freq_capacity
--
2.25.1
From: Paul Kocialkowski <[email protected]>
[ Upstream commit 0f879bab72f47e8ba2421a984e7acfa763d3e84e ]
Setting the output CSC mode is required for a YUV output, but must not
be set when the input is also YUV. Doing this (as tested with a YUV420P
to YUV420P conversion) results in wrong colors.
Adapt the logic to only set the output CSC mode when the output is YUV and
the input is RGB. Also add a comment to clarify the rationale.
Fixes: f7e7b48e6d79 ("[media] rockchip/rga: v4l2 m2m support")
Signed-off-by: Paul Kocialkowski <[email protected]>
Reviewed-by: Ezequiel Garcia <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/media/platform/rockchip/rga/rga-hw.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
index 5607ee8d19176..aaa96f256356b 100644
--- a/drivers/media/platform/rockchip/rga/rga-hw.c
+++ b/drivers/media/platform/rockchip/rga/rga-hw.c
@@ -200,6 +200,11 @@ static void rga_cmd_set_trans_info(struct rga_ctx *ctx)
dst_info.data.format = ctx->out.fmt->hw_format;
dst_info.data.swap = ctx->out.fmt->color_swap;
+ /*
+ * CSC mode must only be set when the colorspace families differ between
+ * input and output. It must remain unset (zeroed) if both are the same.
+ */
+
if (RGA_COLOR_FMT_IS_YUV(ctx->in.fmt->hw_format) &&
RGA_COLOR_FMT_IS_RGB(ctx->out.fmt->hw_format)) {
switch (ctx->in.colorspace) {
@@ -212,7 +217,8 @@ static void rga_cmd_set_trans_info(struct rga_ctx *ctx)
}
}
- if (RGA_COLOR_FMT_IS_YUV(ctx->out.fmt->hw_format)) {
+ if (RGA_COLOR_FMT_IS_RGB(ctx->in.fmt->hw_format) &&
+ RGA_COLOR_FMT_IS_YUV(ctx->out.fmt->hw_format)) {
switch (ctx->out.colorspace) {
case V4L2_COLORSPACE_REC709:
dst_info.data.csc_mode = RGA_SRC_CSC_MODE_BT709_R0;
--
2.25.1
From: Herbert Xu <[email protected]>
[ Upstream commit eeedb618378f8a09779546a3eeac16b000447d62 ]
The arc4 algorithm requires storing state in the request context
in order to allow more than one encrypt/decrypt operation. As this
driver does not seem to do that, it means that using it for more
than one operation is broken.
Fixes: eaed71a44ad9 ("crypto: caam - add ecb(*) support")
Link: https://lore.kernel.org/linux-crypto/CAMj1kXGvMe_A_iQ43Pmygg9xaAM-RLy=_M=v+eg--8xNmv9P+w@mail.gmail.com
Link: https://lore.kernel.org/linux-crypto/[email protected]
Signed-off-by: Herbert Xu <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Acked-by: Horia Geantă <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/crypto/caam/caamalg.c | 29 -----------------------------
drivers/crypto/caam/compat.h | 1 -
2 files changed, 30 deletions(-)
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index bf90a4fcabd1f..8149ac4d6ef22 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -810,12 +810,6 @@ static int ctr_skcipher_setkey(struct crypto_skcipher *skcipher,
return skcipher_setkey(skcipher, key, keylen, ctx1_iv_off);
}
-static int arc4_skcipher_setkey(struct crypto_skcipher *skcipher,
- const u8 *key, unsigned int keylen)
-{
- return skcipher_setkey(skcipher, key, keylen, 0);
-}
-
static int des_skcipher_setkey(struct crypto_skcipher *skcipher,
const u8 *key, unsigned int keylen)
{
@@ -1967,21 +1961,6 @@ static struct caam_skcipher_alg driver_algs[] = {
},
.caam.class1_alg_type = OP_ALG_ALGSEL_3DES | OP_ALG_AAI_ECB,
},
- {
- .skcipher = {
- .base = {
- .cra_name = "ecb(arc4)",
- .cra_driver_name = "ecb-arc4-caam",
- .cra_blocksize = ARC4_BLOCK_SIZE,
- },
- .setkey = arc4_skcipher_setkey,
- .encrypt = skcipher_encrypt,
- .decrypt = skcipher_decrypt,
- .min_keysize = ARC4_MIN_KEY_SIZE,
- .max_keysize = ARC4_MAX_KEY_SIZE,
- },
- .caam.class1_alg_type = OP_ALG_ALGSEL_ARC4 | OP_ALG_AAI_ECB,
- },
};
static struct caam_aead_alg driver_aeads[] = {
@@ -3457,7 +3436,6 @@ int caam_algapi_init(struct device *ctrldev)
struct caam_drv_private *priv = dev_get_drvdata(ctrldev);
int i = 0, err = 0;
u32 aes_vid, aes_inst, des_inst, md_vid, md_inst, ccha_inst, ptha_inst;
- u32 arc4_inst;
unsigned int md_limit = SHA512_DIGEST_SIZE;
bool registered = false, gcm_support;
@@ -3477,8 +3455,6 @@ int caam_algapi_init(struct device *ctrldev)
CHA_ID_LS_DES_SHIFT;
aes_inst = cha_inst & CHA_ID_LS_AES_MASK;
md_inst = (cha_inst & CHA_ID_LS_MD_MASK) >> CHA_ID_LS_MD_SHIFT;
- arc4_inst = (cha_inst & CHA_ID_LS_ARC4_MASK) >>
- CHA_ID_LS_ARC4_SHIFT;
ccha_inst = 0;
ptha_inst = 0;
@@ -3499,7 +3475,6 @@ int caam_algapi_init(struct device *ctrldev)
md_inst = mdha & CHA_VER_NUM_MASK;
ccha_inst = rd_reg32(&priv->ctrl->vreg.ccha) & CHA_VER_NUM_MASK;
ptha_inst = rd_reg32(&priv->ctrl->vreg.ptha) & CHA_VER_NUM_MASK;
- arc4_inst = rd_reg32(&priv->ctrl->vreg.afha) & CHA_VER_NUM_MASK;
gcm_support = aesa & CHA_VER_MISC_AES_GCM;
}
@@ -3522,10 +3497,6 @@ int caam_algapi_init(struct device *ctrldev)
if (!aes_inst && (alg_sel == OP_ALG_ALGSEL_AES))
continue;
- /* Skip ARC4 algorithms if not supported by device */
- if (!arc4_inst && alg_sel == OP_ALG_ALGSEL_ARC4)
- continue;
-
/*
* Check support for AES modes not available
* on LP devices.
diff --git a/drivers/crypto/caam/compat.h b/drivers/crypto/caam/compat.h
index 60e2a54c19f11..c3c22a8de4c00 100644
--- a/drivers/crypto/caam/compat.h
+++ b/drivers/crypto/caam/compat.h
@@ -43,7 +43,6 @@
#include <crypto/akcipher.h>
#include <crypto/scatterwalk.h>
#include <crypto/skcipher.h>
-#include <crypto/arc4.h>
#include <crypto/internal/skcipher.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/rsa.h>
--
2.25.1
From: Mark Zhang <[email protected]>
[ Upstream commit c9f557421e505f75da4234a6af8eff46bc08614b ]
In auto mode only bind user QPs to a dynamic counter, since this feature
is mainly used for system statistic and diagnostic purpose, while there's
no need to counter kernel QPs so far.
Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Zhang <[email protected]>
Reviewed-by: Maor Gottlieb <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/core/counters.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 738d1faf4bba5..6deb1901fbd02 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -288,7 +288,7 @@ int rdma_counter_bind_qp_auto(struct ib_qp *qp, u8 port)
struct rdma_counter *counter;
int ret;
- if (!qp->res.valid)
+ if (!qp->res.valid || rdma_is_kernel_res(&qp->res))
return 0;
if (!rdma_is_port_valid(dev, port))
--
2.25.1
From: Mark Zhang <[email protected]>
[ Upstream commit cbeb7d896c0f296451ffa7b67e7706786b8364c8 ]
In manual mode allow bind user QPs with different pids to same counter,
since this is allowed in auto mode.
Bind kernel QPs and user QPs to the same counter are not allowed.
Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Zhang <[email protected]>
Reviewed-by: Maor Gottlieb <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/core/counters.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 6deb1901fbd02..417ebf4d8ba9b 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -483,7 +483,7 @@ int rdma_counter_bind_qpn(struct ib_device *dev, u8 port,
goto err;
}
- if (counter->res.task != qp->res.task) {
+ if (rdma_is_kernel_res(&counter->res) != rdma_is_kernel_res(&qp->res)) {
ret = -EINVAL;
goto err_task;
}
--
2.25.1
From: Tyler Hicks <[email protected]>
[ Upstream commit 5f3e92657bbfb63ad3109433d843c89996114b03 ]
Verifying that a file hash is not blacklisted is currently only
supported for files with appended signatures (modsig). In the future,
this might change.
For now, the "appraise_flag" option is only appropriate for appraise
actions and its "blacklist" value is only appropriate when
CONFIG_IMA_APPRAISE_MODSIG is enabled and "appraise_flag=blacklist" is
only appropriate when "appraise_type=imasig|modsig" is also present.
Make this clear at policy load so that IMA policy authors don't assume
that other uses of "appraise_flag=blacklist" are supported.
Fixes: 273df864cf74 ("ima: Check against blacklisted hashes for files with modsig")
Signed-off-by: Tyler Hicks <[email protected]>
Reivewed-by: Nayna Jain <[email protected]>
Tested-by: Nayna Jain <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
security/integrity/ima/ima_policy.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
index 3e3e568c81309..a59bf2f5b2d4f 100644
--- a/security/integrity/ima/ima_policy.c
+++ b/security/integrity/ima/ima_policy.c
@@ -1035,6 +1035,11 @@ static bool ima_validate_rule(struct ima_rule_entry *entry)
return false;
}
+ /* Ensure that combinations of flags are compatible with each other */
+ if (entry->flags & IMA_CHECK_BLACKLIST &&
+ !(entry->flags & IMA_MODSIG_ALLOWED))
+ return false;
+
return true;
}
@@ -1371,9 +1376,17 @@ static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
result = -EINVAL;
break;
case Opt_appraise_flag:
+ if (entry->action != APPRAISE) {
+ result = -EINVAL;
+ break;
+ }
+
ima_log_string(ab, "appraise_flag", args[0].from);
- if (strstr(args[0].from, "blacklist"))
+ if (IS_ENABLED(CONFIG_IMA_APPRAISE_MODSIG) &&
+ strstr(args[0].from, "blacklist"))
entry->flags |= IMA_CHECK_BLACKLIST;
+ else
+ result = -EINVAL;
break;
case Opt_permit_directio:
entry->flags |= IMA_PERMIT_DIRECTIO;
--
2.25.1
From: Johannes Thumshirn <[email protected]>
commit a9cb9f4148ef6bb8fabbdaa85c42b2171fbd5a0d upstream.
Don't call report zones for more zones than the user actually requested,
otherwise this can lead to out-of-bounds accesses in the callback
functions.
Such a situation can happen if the target's ->report_zones() callback
function returns 0 because we've reached the end of the target and then
restart the report zones on the second target.
We're again calling into ->report_zones() and ultimately into the user
supplied callback function but when we're not subtracting the number of
zones already processed this may lead to out-of-bounds accesses in the
user callbacks.
Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Damien Le Moal <[email protected]>
Fixes: d41003513e61 ("block: rework zone reporting")
Cc: [email protected] # v5.5+
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/dm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -504,7 +504,8 @@ static int dm_blk_report_zones(struct ge
}
args.tgt = tgt;
- ret = tgt->type->report_zones(tgt, &args, nr_zones);
+ ret = tgt->type->report_zones(tgt, &args,
+ nr_zones - args.zone_idx);
if (ret < 0)
goto out;
} while (args.zone_idx < nr_zones &&
From: Kees Cook <[email protected]>
commit 11990a5bd7e558e9203c1070fc52fb6f0488e75b upstream.
The only-root-readable /sys/module/$module/sections/$section files
did not truncate their output to the available buffer size. While most
paths into the kernfs read handlers end up using PAGE_SIZE buffers,
it's possible to get there through other paths (e.g. splice, sendfile).
Actually limit the output to the "count" passed into the read function,
and report it back correctly. *sigh*
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/20200805002015.GE23458@shao2-debian
Fixes: ed66f991bb19 ("module: Refactor section attr into bin attribute")
Cc: [email protected]
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Jessica Yu <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/module.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1520,18 +1520,34 @@ struct module_sect_attrs {
struct module_sect_attr attrs[];
};
+#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4))
static ssize_t module_sect_read(struct file *file, struct kobject *kobj,
struct bin_attribute *battr,
char *buf, loff_t pos, size_t count)
{
struct module_sect_attr *sattr =
container_of(battr, struct module_sect_attr, battr);
+ char bounce[MODULE_SECT_READ_SIZE + 1];
+ size_t wrote;
if (pos != 0)
return -EINVAL;
- return sprintf(buf, "0x%px\n",
- kallsyms_show_value(file->f_cred) ? (void *)sattr->address : NULL);
+ /*
+ * Since we're a binary read handler, we must account for the
+ * trailing NUL byte that sprintf will write: if "buf" is
+ * too small to hold the NUL, or the NUL is exactly the last
+ * byte, the read will look like it got truncated by one byte.
+ * Since there is no way to ask sprintf nicely to not write
+ * the NUL, we have to use a bounce buffer.
+ */
+ wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n",
+ kallsyms_show_value(file->f_cred)
+ ? (void *)sattr->address : NULL);
+ count = min(count, wrote);
+ memcpy(buf, bounce, count);
+
+ return count;
}
static void free_sect_attrs(struct module_sect_attrs *sect_attrs)
@@ -1580,7 +1596,7 @@ static void add_sect_attrs(struct module
goto out;
sect_attrs->nsections++;
sattr->battr.read = module_sect_read;
- sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4);
+ sattr->battr.size = MODULE_SECT_READ_SIZE;
sattr->battr.attr.mode = 0400;
*(gattr++) = &(sattr++)->battr;
}
From: Dafna Hirschfeld <[email protected]>
[ Upstream commit b861d139a36a4593498932bfec957bdcc7d98eb3 ]
The macro RKISP1_DIR_SINK_SRC is a mask of two flags.
The macro hides the fact that it's a mask and the code
is actually more clear if we replace it the with bitwise-or explicitly.
Signed-off-by: Dafna Hirschfeld <[email protected]>
Acked-by: Helen Koike <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/staging/media/rkisp1/rkisp1-isp.c | 25 +++++++++++------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/drivers/staging/media/rkisp1/rkisp1-isp.c b/drivers/staging/media/rkisp1/rkisp1-isp.c
index dc2b59a0160a8..93ba2dd2fcda0 100644
--- a/drivers/staging/media/rkisp1/rkisp1-isp.c
+++ b/drivers/staging/media/rkisp1/rkisp1-isp.c
@@ -25,7 +25,6 @@
#define RKISP1_DIR_SRC BIT(0)
#define RKISP1_DIR_SINK BIT(1)
-#define RKISP1_DIR_SINK_SRC (RKISP1_DIR_SINK | RKISP1_DIR_SRC)
/*
* NOTE: MIPI controller and input MUX are also configured in this file.
@@ -69,84 +68,84 @@ static const struct rkisp1_isp_mbus_info rkisp1_isp_formats[] = {
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_RGGB,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SBGGR10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_BGGR,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGBRG10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_GBRG,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGRBG10_1X10,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW10,
.bayer_pat = RKISP1_RAW_GRBG,
.bus_width = 10,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SRGGB12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_RGGB,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SBGGR12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_BGGR,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGBRG12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_GBRG,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGRBG12_1X12,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW12,
.bayer_pat = RKISP1_RAW_GRBG,
.bus_width = 12,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SRGGB8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_RGGB,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SBGGR8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_BGGR,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGBRG8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_GBRG,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_SGRBG8_1X8,
.pixel_enc = V4L2_PIXEL_ENC_BAYER,
.mipi_dt = RKISP1_CIF_CSI2_DT_RAW8,
.bayer_pat = RKISP1_RAW_GRBG,
.bus_width = 8,
- .direction = RKISP1_DIR_SINK_SRC,
+ .direction = RKISP1_DIR_SINK | RKISP1_DIR_SRC,
}, {
.mbus_code = MEDIA_BUS_FMT_YUYV8_1X16,
.pixel_enc = V4L2_PIXEL_ENC_YUV,
--
2.25.1
From: Steve Longerbeam <[email protected]>
[ Upstream commit 0f6245f42ce9b7e4d20f2cda8d5f12b55a44d7d1 ]
Combine the rotate_irq() and norotate_irq() handlers into a single
eof_irq() handler.
Signed-off-by: Steve Longerbeam <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/ipu-v3/ipu-image-convert.c | 58 +++++++++-----------------
1 file changed, 20 insertions(+), 38 deletions(-)
diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c b/drivers/gpu/ipu-v3/ipu-image-convert.c
index eeca50d9a1ee4..f8b031ded3cf2 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -1709,9 +1709,10 @@ static irqreturn_t do_irq(struct ipu_image_convert_run *run)
return IRQ_WAKE_THREAD;
}
-static irqreturn_t norotate_irq(int irq, void *data)
+static irqreturn_t eof_irq(int irq, void *data)
{
struct ipu_image_convert_chan *chan = data;
+ struct ipu_image_convert_priv *priv = chan->priv;
struct ipu_image_convert_ctx *ctx;
struct ipu_image_convert_run *run;
unsigned long flags;
@@ -1728,45 +1729,26 @@ static irqreturn_t norotate_irq(int irq, void *data)
ctx = run->ctx;
- if (ipu_rot_mode_is_irt(ctx->rot_mode)) {
- /* this is a rotation operation, just ignore */
- spin_unlock_irqrestore(&chan->irqlock, flags);
- return IRQ_HANDLED;
- }
-
- ret = do_irq(run);
-out:
- spin_unlock_irqrestore(&chan->irqlock, flags);
- return ret;
-}
-
-static irqreturn_t rotate_irq(int irq, void *data)
-{
- struct ipu_image_convert_chan *chan = data;
- struct ipu_image_convert_priv *priv = chan->priv;
- struct ipu_image_convert_ctx *ctx;
- struct ipu_image_convert_run *run;
- unsigned long flags;
- irqreturn_t ret;
-
- spin_lock_irqsave(&chan->irqlock, flags);
-
- /* get current run and its context */
- run = chan->current_run;
- if (!run) {
+ if (irq == chan->out_eof_irq) {
+ if (ipu_rot_mode_is_irt(ctx->rot_mode)) {
+ /* this is a rotation op, just ignore */
+ ret = IRQ_HANDLED;
+ goto out;
+ }
+ } else if (irq == chan->rot_out_eof_irq) {
+ if (!ipu_rot_mode_is_irt(ctx->rot_mode)) {
+ /* this was NOT a rotation op, shouldn't happen */
+ dev_err(priv->ipu->dev,
+ "Unexpected rotation interrupt\n");
+ ret = IRQ_HANDLED;
+ goto out;
+ }
+ } else {
+ dev_err(priv->ipu->dev, "Received unknown irq %d\n", irq);
ret = IRQ_NONE;
goto out;
}
- ctx = run->ctx;
-
- if (!ipu_rot_mode_is_irt(ctx->rot_mode)) {
- /* this was NOT a rotation operation, shouldn't happen */
- dev_err(priv->ipu->dev, "Unexpected rotation interrupt\n");
- spin_unlock_irqrestore(&chan->irqlock, flags);
- return IRQ_HANDLED;
- }
-
ret = do_irq(run);
out:
spin_unlock_irqrestore(&chan->irqlock, flags);
@@ -1859,7 +1841,7 @@ static int get_ipu_resources(struct ipu_image_convert_chan *chan)
chan->out_chan,
IPU_IRQ_EOF);
- ret = request_threaded_irq(chan->out_eof_irq, norotate_irq, do_bh,
+ ret = request_threaded_irq(chan->out_eof_irq, eof_irq, do_bh,
0, "ipu-ic", chan);
if (ret < 0) {
dev_err(priv->ipu->dev, "could not acquire irq %d\n",
@@ -1872,7 +1854,7 @@ static int get_ipu_resources(struct ipu_image_convert_chan *chan)
chan->rotation_out_chan,
IPU_IRQ_EOF);
- ret = request_threaded_irq(chan->rot_out_eof_irq, rotate_irq, do_bh,
+ ret = request_threaded_irq(chan->rot_out_eof_irq, eof_irq, do_bh,
0, "ipu-ic", chan);
if (ret < 0) {
dev_err(priv->ipu->dev, "could not acquire irq %d\n",
--
2.25.1
From: Jeff Layton <[email protected]>
commit 02e37571f9e79022498fd0525c073b07e9d9ac69 upstream.
Most session messages contain a feature mask, but the MDS will
routinely send a REJECT message with one that is zero-length.
Commit 0fa8263367db ("ceph: fix endianness bug when handling MDS
session feature bits") fixed the decoding of the feature mask,
but failed to account for the MDS sending a zero-length feature
mask. This causes REJECT message decoding to fail.
Skip trying to decode a feature mask if the word count is zero.
Cc: [email protected]
URL: https://tracker.ceph.com/issues/46823
Fixes: 0fa8263367db ("ceph: fix endianness bug when handling MDS session feature bits")
Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: Ilya Dryomov <[email protected]>
Tested-by: Patrick Donnelly <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/ceph/mds_client.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -3279,8 +3279,10 @@ static void handle_session(struct ceph_m
goto bad;
/* version >= 3, feature bits */
ceph_decode_32_safe(&p, end, len, bad);
- ceph_decode_64_safe(&p, end, features, bad);
- p += len - sizeof(features);
+ if (len) {
+ ceph_decode_64_safe(&p, end, features, bad);
+ p += len - sizeof(features);
+ }
}
mutex_lock(&mdsc->mutex);
From: Jesper Dangaard Brouer <[email protected]>
[ Upstream commit 3220fb667842a9725cbb71656f406eadb03c094b ]
This is a follow up adjustment to commit 6c92bd5cd465 ("selftests/bpf:
Test_progs indicate to shell on non-actions"), that returns shell exit
indication EXIT_FAILURE (value 1) when user selects a non-existing test.
The problem with using EXIT_FAILURE is that a shell script cannot tell
the difference between a non-existing test and the test failing.
This patch uses value 2 as shell exit indication.
(Aside note unrecognized option parameters use value 64).
Fixes: 6c92bd5cd465 ("selftests/bpf: Test_progs indicate to shell on non-actions")
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/159410593992.1093222.90072558386094370.stgit@firesoul
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/bpf/test_progs.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 6218b2b5a3f62..0849735ebda8a 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -12,6 +12,8 @@
#include <string.h>
#include <execinfo.h> /* backtrace */
+#define EXIT_NO_TEST 2
+
/* defined in test_progs.h */
struct test_env env = {};
@@ -707,7 +709,7 @@ int main(int argc, char **argv)
close(env.saved_netns_fd);
if (env.succ_cnt + env.fail_cnt + env.skip_cnt == 0)
- return EXIT_FAILURE;
+ return EXIT_NO_TEST;
return env.fail_cnt ? EXIT_FAILURE : EXIT_SUCCESS;
}
--
2.25.1
From: Johan Hovold <[email protected]>
[ Upstream commit ab4cc4ef6724ea588e835fc1e764c4b4407a70b7 ]
Use an unsigned type for the process-packet buffer argument and give it
a more apt name.
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/serial/ftdi_sio.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index 9ad44a96dfe3a..96b9e2768ac5c 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -2480,12 +2480,12 @@ static int ftdi_prepare_write_buffer(struct usb_serial_port *port,
#define FTDI_RS_ERR_MASK (FTDI_RS_BI | FTDI_RS_PE | FTDI_RS_FE | FTDI_RS_OE)
static int ftdi_process_packet(struct usb_serial_port *port,
- struct ftdi_private *priv, char *packet, int len)
+ struct ftdi_private *priv, unsigned char *buf, int len)
{
+ unsigned char status;
+ unsigned char *ch;
int i;
- char status;
char flag;
- char *ch;
if (len < 2) {
dev_dbg(&port->dev, "malformed packet\n");
@@ -2495,7 +2495,7 @@ static int ftdi_process_packet(struct usb_serial_port *port,
/* Compare new line status to the old one, signal if different/
N.B. packet may be processed more than once, but differences
are only processed once. */
- status = packet[0] & FTDI_STATUS_B0_MASK;
+ status = buf[0] & FTDI_STATUS_B0_MASK;
if (status != priv->prev_status) {
char diff_status = status ^ priv->prev_status;
@@ -2521,7 +2521,7 @@ static int ftdi_process_packet(struct usb_serial_port *port,
}
/* save if the transmitter is empty or not */
- if (packet[1] & FTDI_RS_TEMT)
+ if (buf[1] & FTDI_RS_TEMT)
priv->transmit_empty = 1;
else
priv->transmit_empty = 0;
@@ -2535,29 +2535,29 @@ static int ftdi_process_packet(struct usb_serial_port *port,
* data payload to avoid over-reporting.
*/
flag = TTY_NORMAL;
- if (packet[1] & FTDI_RS_ERR_MASK) {
+ if (buf[1] & FTDI_RS_ERR_MASK) {
/* Break takes precedence over parity, which takes precedence
* over framing errors */
- if (packet[1] & FTDI_RS_BI) {
+ if (buf[1] & FTDI_RS_BI) {
flag = TTY_BREAK;
port->icount.brk++;
usb_serial_handle_break(port);
- } else if (packet[1] & FTDI_RS_PE) {
+ } else if (buf[1] & FTDI_RS_PE) {
flag = TTY_PARITY;
port->icount.parity++;
- } else if (packet[1] & FTDI_RS_FE) {
+ } else if (buf[1] & FTDI_RS_FE) {
flag = TTY_FRAME;
port->icount.frame++;
}
/* Overrun is special, not associated with a char */
- if (packet[1] & FTDI_RS_OE) {
+ if (buf[1] & FTDI_RS_OE) {
port->icount.overrun++;
tty_insert_flip_char(&port->port, 0, TTY_OVERRUN);
}
}
port->icount.rx += len;
- ch = packet + 2;
+ ch = buf + 2;
if (port->port.console && port->sysrq) {
for (i = 0; i < len; i++, ch++) {
--
2.25.1
From: Johan Hovold <[email protected]>
[ Upstream commit ce054039ba5e47b75a3be02a00274e52b06a6456 ]
Clean up receive processing by dropping the character pointer and
keeping the length argument unchanged throughout the function.
Also make it more apparent that sysrq processing can consume a
characters by adding an explicit continue.
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/serial/ftdi_sio.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index 96b9e2768ac5c..33f1cca7eaa61 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -2483,7 +2483,6 @@ static int ftdi_process_packet(struct usb_serial_port *port,
struct ftdi_private *priv, unsigned char *buf, int len)
{
unsigned char status;
- unsigned char *ch;
int i;
char flag;
@@ -2526,8 +2525,7 @@ static int ftdi_process_packet(struct usb_serial_port *port,
else
priv->transmit_empty = 0;
- len -= 2;
- if (!len)
+ if (len == 2)
return 0; /* status only */
/*
@@ -2556,19 +2554,20 @@ static int ftdi_process_packet(struct usb_serial_port *port,
}
}
- port->icount.rx += len;
- ch = buf + 2;
+ port->icount.rx += len - 2;
if (port->port.console && port->sysrq) {
- for (i = 0; i < len; i++, ch++) {
- if (!usb_serial_handle_sysrq_char(port, *ch))
- tty_insert_flip_char(&port->port, *ch, flag);
+ for (i = 2; i < len; i++) {
+ if (usb_serial_handle_sysrq_char(port, buf[i]))
+ continue;
+ tty_insert_flip_char(&port->port, buf[i], flag);
}
} else {
- tty_insert_flip_string_fixed_flag(&port->port, ch, flag, len);
+ tty_insert_flip_string_fixed_flag(&port->port, buf + 2, flag,
+ len - 2);
}
- return len;
+ return len - 2;
}
static void ftdi_process_read_urb(struct urb *urb)
--
2.25.1
From: Chen Tao <[email protected]>
[ Upstream commit 3e4aeff36e9212a939290c0ca70d4931c4ad1950 ]
Fix memory leak in amdgpu_debugfs_gpr_read not freeing data when
pm_runtime_get_sync failed.
Fixes: a9ffe2a983383 ("drm/amdgpu/debugfs: properly handle runtime pm")
Signed-off-by: Chen Tao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index f87b225437fc3..bd5061fbe031e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -973,7 +973,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user *buf,
r = pm_runtime_get_sync(adev->ddev->dev);
if (r < 0)
- return r;
+ goto err;
r = amdgpu_virt_enable_access_debugfs(adev);
if (r < 0)
@@ -1003,7 +1003,7 @@ static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user *buf,
value = data[result >> 2];
r = put_user(value, (uint32_t *)buf);
if (r) {
- result = r;
+ amdgpu_virt_disable_access_debugfs(adev);
goto err;
}
@@ -1012,11 +1012,14 @@ static ssize_t amdgpu_debugfs_gpr_read(struct file *f, char __user *buf,
size -= 4;
}
-err:
- pm_runtime_put_autosuspend(adev->ddev->dev);
kfree(data);
amdgpu_virt_disable_access_debugfs(adev);
return result;
+
+err:
+ pm_runtime_put_autosuspend(adev->ddev->dev);
+ kfree(data);
+ return r;
}
/**
--
2.25.1
From: Boris Brezillon <[email protected]>
[ Upstream commit ccc49eff77bee2885447a032948959a134029fe3 ]
The mtd var in fun_wait_rnb() is now unused, let's get rid of it and
fix the warning resulting from this unused var.
Fixes: 50a487e7719c ("mtd: rawnand: Pass a nand_chip object to chip->dev_ready()")
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Miquel Raynal <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mtd/nand/raw/fsl_upm.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/fsl_upm.c b/drivers/mtd/nand/raw/fsl_upm.c
index 627deb26db512..76d1032cd35e8 100644
--- a/drivers/mtd/nand/raw/fsl_upm.c
+++ b/drivers/mtd/nand/raw/fsl_upm.c
@@ -62,7 +62,6 @@ static int fun_chip_ready(struct nand_chip *chip)
static void fun_wait_rnb(struct fsl_upm_nand *fun)
{
if (fun->rnb_gpio[fun->mchip_number] >= 0) {
- struct mtd_info *mtd = nand_to_mtd(&fun->chip);
int cnt = 1000000;
while (--cnt && !fun_chip_ready(&fun->chip))
--
2.25.1
From: Jeff Layton <[email protected]>
commit b748fc7a8763a5b3f8149f12c45711cd73ef8176 upstream.
Symlink inodes should have the security context set in their xattrs on
creation. We already set the context on creation, but we don't attach
the pagelist. The effect is that symlink inodes don't get an SELinux
context set on them at creation, so they end up unlabeled instead of
inheriting the proper context. Make it do so.
Cc: [email protected]
Signed-off-by: Jeff Layton <[email protected]>
Reviewed-by: Ilya Dryomov <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/ceph/dir.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -930,6 +930,10 @@ static int ceph_symlink(struct inode *di
req->r_num_caps = 2;
req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL;
req->r_dentry_unless = CEPH_CAP_FILE_EXCL;
+ if (as_ctx.pagelist) {
+ req->r_pagelist = as_ctx.pagelist;
+ as_ctx.pagelist = NULL;
+ }
err = ceph_mdsc_do_request(mdsc, dir, req);
if (!err && !req->r_reply_info.head->is_dentry)
err = ceph_handle_notrace_create(dir, dentry);
From: Dan Williams <[email protected]>
commit 92fe2aa859f52ce6aa595ca97fec110dc7100e63 upstream.
The ND_CMD_CALL format allows for a general passthrough of passlisted
commands targeting a given command set. However there is no validation
of the family index relative to what the bus supports.
- Update the NFIT bus implementation (the only one that supports
ND_CMD_CALL passthrough) to also passlist the valid set of command
family indices.
- Update the generic __nd_ioctl() path to validate that field on behalf
of all implementations.
Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Len Brown <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/acpi/nfit/core.c | 11 +++++++++--
drivers/acpi/nfit/nfit.h | 1 -
drivers/nvdimm/bus.c | 16 ++++++++++++++++
include/linux/libnvdimm.h | 2 ++
include/uapi/linux/ndctl.h | 4 ++++
5 files changed, 31 insertions(+), 3 deletions(-)
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1823,6 +1823,7 @@ static void populate_shutdown_status(str
static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
struct nfit_mem *nfit_mem, u32 device_handle)
{
+ struct nvdimm_bus_descriptor *nd_desc = &acpi_desc->nd_desc;
struct acpi_device *adev, *adev_dimm;
struct device *dev = acpi_desc->dev;
unsigned long dsm_mask, label_mask;
@@ -1834,6 +1835,7 @@ static int acpi_nfit_add_dimm(struct acp
/* nfit test assumes 1:1 relationship between commands and dsms */
nfit_mem->dsm_mask = acpi_desc->dimm_cmd_force_en;
nfit_mem->family = NVDIMM_FAMILY_INTEL;
+ set_bit(NVDIMM_FAMILY_INTEL, &nd_desc->dimm_family_mask);
if (dcr->valid_fields & ACPI_NFIT_CONTROL_MFG_INFO_VALID)
sprintf(nfit_mem->id, "%04x-%02x-%04x-%08x",
@@ -1886,10 +1888,13 @@ static int acpi_nfit_add_dimm(struct acp
* Note, that checking for function0 (bit0) tells us if any commands
* are reachable through this GUID.
*/
+ clear_bit(NVDIMM_FAMILY_INTEL, &nd_desc->dimm_family_mask);
for (i = 0; i <= NVDIMM_FAMILY_MAX; i++)
- if (acpi_check_dsm(adev_dimm->handle, to_nfit_uuid(i), 1, 1))
+ if (acpi_check_dsm(adev_dimm->handle, to_nfit_uuid(i), 1, 1)) {
+ set_bit(i, &nd_desc->dimm_family_mask);
if (family < 0 || i == default_dsm_family)
family = i;
+ }
/* limit the supported commands to those that are publicly documented */
nfit_mem->family = family;
@@ -2153,6 +2158,9 @@ static void acpi_nfit_init_dsms(struct a
nd_desc->cmd_mask = acpi_desc->bus_cmd_force_en;
nd_desc->bus_dsm_mask = acpi_desc->bus_nfit_cmd_force_en;
+ set_bit(ND_CMD_CALL, &nd_desc->cmd_mask);
+ set_bit(NVDIMM_BUS_FAMILY_NFIT, &nd_desc->bus_family_mask);
+
adev = to_acpi_dev(acpi_desc);
if (!adev)
return;
@@ -2160,7 +2168,6 @@ static void acpi_nfit_init_dsms(struct a
for (i = ND_CMD_ARS_CAP; i <= ND_CMD_CLEAR_ERROR; i++)
if (acpi_check_dsm(adev->handle, guid, 1, 1ULL << i))
set_bit(i, &nd_desc->cmd_mask);
- set_bit(ND_CMD_CALL, &nd_desc->cmd_mask);
dsm_mask =
(1 << ND_CMD_ARS_CAP) |
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -33,7 +33,6 @@
| ACPI_NFIT_MEM_RESTORE_FAILED | ACPI_NFIT_MEM_FLUSH_FAILED \
| ACPI_NFIT_MEM_NOT_ARMED | ACPI_NFIT_MEM_MAP_FAILED)
-#define NVDIMM_FAMILY_MAX NVDIMM_FAMILY_HYPERV
#define NVDIMM_CMD_MAX 31
#define NVDIMM_STANDARD_CMDMASK \
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -1037,9 +1037,25 @@ static int __nd_ioctl(struct nvdimm_bus
dimm_name = "bus";
}
+ /* Validate command family support against bus declared support */
if (cmd == ND_CMD_CALL) {
+ unsigned long *mask;
+
if (copy_from_user(&pkg, p, sizeof(pkg)))
return -EFAULT;
+
+ if (nvdimm) {
+ if (pkg.nd_family > NVDIMM_FAMILY_MAX)
+ return -EINVAL;
+ mask = &nd_desc->dimm_family_mask;
+ } else {
+ if (pkg.nd_family > NVDIMM_BUS_FAMILY_MAX)
+ return -EINVAL;
+ mask = &nd_desc->bus_family_mask;
+ }
+
+ if (!test_bit(pkg.nd_family, mask))
+ return -EINVAL;
}
if (!desc ||
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -78,6 +78,8 @@ struct nvdimm_bus_descriptor {
const struct attribute_group **attr_groups;
unsigned long bus_dsm_mask;
unsigned long cmd_mask;
+ unsigned long dimm_family_mask;
+ unsigned long bus_family_mask;
struct module *module;
char *provider_name;
struct device_node *of_node;
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -245,6 +245,10 @@ struct nd_cmd_pkg {
#define NVDIMM_FAMILY_MSFT 3
#define NVDIMM_FAMILY_HYPERV 4
#define NVDIMM_FAMILY_PAPR 5
+#define NVDIMM_FAMILY_MAX NVDIMM_FAMILY_PAPR
+
+#define NVDIMM_BUS_FAMILY_NFIT 0
+#define NVDIMM_BUS_FAMILY_MAX NVDIMM_BUS_FAMILY_NFIT
#define ND_IOCTL_CALL _IOWR(ND_IOCTL, ND_CMD_CALL,\
struct nd_cmd_pkg)
From: Liu Ying <[email protected]>
commit 3b2a999582c467d1883716b37ffcc00178a13713 upstream.
Both of the two LVDS channels should be disabled for split mode
in the encoder's ->disable() callback, because they are enabled
in the encoder's ->enable() callback.
Fixes: 6556f7f82b9c ("drm: imx: Move imx-drm driver out of staging")
Cc: Philipp Zabel <[email protected]>
Cc: Sascha Hauer <[email protected]>
Cc: Pengutronix Kernel Team <[email protected]>
Cc: NXP Linux Team <[email protected]>
Cc: <[email protected]>
Signed-off-by: Liu Ying <[email protected]>
Signed-off-by: Philipp Zabel <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/imx/imx-ldb.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/imx/imx-ldb.c
+++ b/drivers/gpu/drm/imx/imx-ldb.c
@@ -304,18 +304,19 @@ static void imx_ldb_encoder_disable(stru
{
struct imx_ldb_channel *imx_ldb_ch = enc_to_imx_ldb_ch(encoder);
struct imx_ldb *ldb = imx_ldb_ch->ldb;
+ int dual = ldb->ldb_ctrl & LDB_SPLIT_MODE_EN;
int mux, ret;
drm_panel_disable(imx_ldb_ch->panel);
- if (imx_ldb_ch == &ldb->channel[0])
+ if (imx_ldb_ch == &ldb->channel[0] || dual)
ldb->ldb_ctrl &= ~LDB_CH0_MODE_EN_MASK;
- else if (imx_ldb_ch == &ldb->channel[1])
+ if (imx_ldb_ch == &ldb->channel[1] || dual)
ldb->ldb_ctrl &= ~LDB_CH1_MODE_EN_MASK;
regmap_write(ldb->regmap, IOMUXC_GPR2, ldb->ldb_ctrl);
- if (ldb->ldb_ctrl & LDB_SPLIT_MODE_EN) {
+ if (dual) {
clk_disable_unprepare(ldb->clk[0]);
clk_disable_unprepare(ldb->clk[1]);
}
From: Herbert Xu <[email protected]>
[ Upstream commit f3c802a1f30013f8f723b62d7fa49eb9e991da23 ]
AEAD does not support partial requests so we must not wake up
while ctx->more is set. In order to distinguish between the
case of no data sent yet and a zero-length request, a new init
flag has been added to ctx.
SKCIPHER has also been modified to ensure that at least a block
of data is available if there is more data to come.
Fixes: 2d97591ef43d ("crypto: af_alg - consolidation of...")
Signed-off-by: Herbert Xu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
crypto/af_alg.c | 11 ++++++++---
crypto/algif_aead.c | 4 ++--
crypto/algif_skcipher.c | 4 ++--
include/crypto/if_alg.h | 4 +++-
4 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 28fc323e3fe30..9fcb91ea10c41 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -635,6 +635,7 @@ void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
if (!ctx->used)
ctx->merge = 0;
+ ctx->init = ctx->more;
}
EXPORT_SYMBOL_GPL(af_alg_pull_tsgl);
@@ -734,9 +735,10 @@ EXPORT_SYMBOL_GPL(af_alg_wmem_wakeup);
*
* @sk socket of connection to user space
* @flags If MSG_DONTWAIT is set, then only report if function would sleep
+ * @min Set to minimum request size if partial requests are allowed.
* @return 0 when writable memory is available, < 0 upon error
*/
-int af_alg_wait_for_data(struct sock *sk, unsigned flags)
+int af_alg_wait_for_data(struct sock *sk, unsigned flags, unsigned min)
{
DEFINE_WAIT_FUNC(wait, woken_wake_function);
struct alg_sock *ask = alg_sk(sk);
@@ -754,7 +756,9 @@ int af_alg_wait_for_data(struct sock *sk, unsigned flags)
if (signal_pending(current))
break;
timeout = MAX_SCHEDULE_TIMEOUT;
- if (sk_wait_event(sk, &timeout, (ctx->used || !ctx->more),
+ if (sk_wait_event(sk, &timeout,
+ ctx->init && (!ctx->more ||
+ (min && ctx->used >= min)),
&wait)) {
err = 0;
break;
@@ -843,7 +847,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
}
lock_sock(sk);
- if (!ctx->more && ctx->used) {
+ if (ctx->init && (init || !ctx->more)) {
err = -EINVAL;
goto unlock;
}
@@ -854,6 +858,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
memcpy(ctx->iv, con.iv->iv, ivsize);
ctx->aead_assoclen = con.aead_assoclen;
+ ctx->init = true;
}
while (size) {
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 0ae000a61c7f5..d48d2156e6210 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -106,8 +106,8 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
size_t usedpages = 0; /* [in] RX bufs to be used from user */
size_t processed = 0; /* [in] TX bufs to be consumed */
- if (!ctx->used) {
- err = af_alg_wait_for_data(sk, flags);
+ if (!ctx->init || ctx->more) {
+ err = af_alg_wait_for_data(sk, flags, 0);
if (err)
return err;
}
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index ec5567c87a6df..a51ba22fef58f 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -61,8 +61,8 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
int err = 0;
size_t len = 0;
- if (!ctx->used) {
- err = af_alg_wait_for_data(sk, flags);
+ if (!ctx->init || (ctx->more && ctx->used < bs)) {
+ err = af_alg_wait_for_data(sk, flags, bs);
if (err)
return err;
}
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 088c1ded27148..ee6412314f8f3 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -135,6 +135,7 @@ struct af_alg_async_req {
* SG?
* @enc: Cryptographic operation to be performed when
* recvmsg is invoked.
+ * @init: True if metadata has been sent.
* @len: Length of memory allocated for this data structure.
*/
struct af_alg_ctx {
@@ -151,6 +152,7 @@ struct af_alg_ctx {
bool more;
bool merge;
bool enc;
+ bool init;
unsigned int len;
};
@@ -226,7 +228,7 @@ unsigned int af_alg_count_tsgl(struct sock *sk, size_t bytes, size_t offset);
void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
size_t dst_offset);
void af_alg_wmem_wakeup(struct sock *sk);
-int af_alg_wait_for_data(struct sock *sk, unsigned flags);
+int af_alg_wait_for_data(struct sock *sk, unsigned flags, unsigned min);
int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
unsigned int ivsize);
ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
--
2.25.1
From: Anton Blanchard <[email protected]>
commit 89c140bbaeee7a55ed0360a88f294ead2b95201b upstream.
Booting with a 4GB LMB size causes us to panic:
qemu-system-ppc64: OS terminated: OS panic:
Memory block size not suitable: 0x0
Fix pseries_memory_block_size() to handle 64 bit LMBs.
Cc: [email protected]
Signed-off-by: Anton Blanchard <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/platforms/pseries/hotplug-memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -27,7 +27,7 @@ static bool rtas_hp_event;
unsigned long pseries_memory_block_size(void)
{
struct device_node *np;
- unsigned int memblock_size = MIN_MEMORY_BLOCK_SIZE;
+ u64 memblock_size = MIN_MEMORY_BLOCK_SIZE;
struct resource r;
np = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
From: John Dorminy <[email protected]>
commit 4cb6f22612511ff2aba4c33fb0f281cae7c23772 upstream.
REQ_OP_FLUSH was being treated as a flag, but the operation
part of bio->bi_opf must be treated as a whole. Change to
accessing the operation part via bio_op(bio) and checking
for equality.
Signed-off-by: John Dorminy <[email protected]>
Acked-by: Heinz Mauelshagen <[email protected]>
Fixes: d3c7b35c20d60 ("dm: add emulated block size target")
Cc: [email protected]
Signed-off-by: Mike Snitzer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/dm-ebs-target.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-ebs-target.c
+++ b/drivers/md/dm-ebs-target.c
@@ -363,7 +363,7 @@ static int ebs_map(struct dm_target *ti,
bio_set_dev(bio, ec->dev->bdev);
bio->bi_iter.bi_sector = ec->start + dm_target_offset(ti, bio->bi_iter.bi_sector);
- if (unlikely(bio->bi_opf & REQ_OP_FLUSH))
+ if (unlikely(bio_op(bio) == REQ_OP_FLUSH))
return DM_MAPIO_REMAPPED;
/*
* Only queue for bufio processing in case of partial or overlapping buffers
From: Kamal Heib <[email protected]>
[ Upstream commit 95a5631f6c9f3045f26245e6045244652204dfdb ]
The return value from ipoib_ib_dev_stop() is always 0 - change it to be
void.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kamal Heib <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/ulp/ipoib/ipoib.h | 2 +-
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 4 +---
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 9a3379c49541f..9ce6a36fe48ed 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -515,7 +515,7 @@ void ipoib_ib_dev_cleanup(struct net_device *dev);
int ipoib_ib_dev_open_default(struct net_device *dev);
int ipoib_ib_dev_open(struct net_device *dev);
-int ipoib_ib_dev_stop(struct net_device *dev);
+void ipoib_ib_dev_stop(struct net_device *dev);
void ipoib_ib_dev_up(struct net_device *dev);
void ipoib_ib_dev_down(struct net_device *dev);
int ipoib_ib_dev_stop_default(struct net_device *dev);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index da3c5315bbb51..6ee64c25aaff4 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -846,7 +846,7 @@ int ipoib_ib_dev_stop_default(struct net_device *dev)
return 0;
}
-int ipoib_ib_dev_stop(struct net_device *dev)
+void ipoib_ib_dev_stop(struct net_device *dev)
{
struct ipoib_dev_priv *priv = ipoib_priv(dev);
@@ -854,8 +854,6 @@ int ipoib_ib_dev_stop(struct net_device *dev)
clear_bit(IPOIB_FLAG_INITIALIZED, &priv->flags);
ipoib_flush_ah(dev);
-
- return 0;
}
int ipoib_ib_dev_open_default(struct net_device *dev)
--
2.25.1
From: Kevin Hao <[email protected]>
commit 96b4833b6827a62c295b149213c68b559514c929 upstream.
In calculation of the cpu mask for the hwlat kernel thread, the wrong
cpu mask is used instead of the tracing_cpumask, this causes the
tracing/tracing_cpumask useless for hwlat tracer. Fixes it.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Kevin Hao <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/trace/trace_hwlat.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace_hwlat.c
+++ b/kernel/trace/trace_hwlat.c
@@ -283,6 +283,7 @@ static bool disable_migrate;
static void move_to_next_cpu(void)
{
struct cpumask *current_mask = &save_cpumask;
+ struct trace_array *tr = hwlat_trace;
int next_cpu;
if (disable_migrate)
@@ -296,7 +297,7 @@ static void move_to_next_cpu(void)
goto disable;
get_online_cpus();
- cpumask_and(current_mask, cpu_online_mask, tracing_buffer_mask);
+ cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
next_cpu = cpumask_next(smp_processor_id(), current_mask);
put_online_cpus();
@@ -373,7 +374,7 @@ static int start_kthread(struct trace_ar
/* Just pick the first CPU on first iteration */
current_mask = &save_cpumask;
get_online_cpus();
- cpumask_and(current_mask, cpu_online_mask, tracing_buffer_mask);
+ cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
put_online_cpus();
next_cpu = cpumask_first(current_mask);
From: Sudeep Holla <[email protected]>
[ Upstream commit 4df2ef85f0efe44505f511ca5e4455585f53a2da ]
Commit c8ff5841a90b ("rtc: pl031: switch to rtc_time64_to_tm/rtc_tm_to_time64")
seemed to have accidentally removed the call to pl031_alarm_irq_enable
from pl031_set_alarm while switching to 64-bit apis.
Let us add back the same to get the set alarm functionality back.
Fixes: c8ff5841a90b ("rtc: pl031: switch to rtc_time64_to_tm/rtc_tm_to_time64")
Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Tested-by: Valentin Schneider <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/rtc/rtc-pl031.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/rtc/rtc-pl031.c b/drivers/rtc/rtc-pl031.c
index 40d7450a1ce49..c6b89273feba8 100644
--- a/drivers/rtc/rtc-pl031.c
+++ b/drivers/rtc/rtc-pl031.c
@@ -275,6 +275,7 @@ static int pl031_set_alarm(struct device *dev, struct rtc_wkalrm *alarm)
struct pl031_local *ldata = dev_get_drvdata(dev);
writel(rtc_tm_to_time64(&alarm->time), ldata->base + RTC_MR);
+ pl031_alarm_irq_enable(dev, alarm->enabled);
return 0;
}
--
2.25.1
From: Vladimir Oltean <[email protected]>
[ Upstream commit 35bd8c07db2ce8fd2834ef866240613a4ef982e7 ]
Sometimes debugging a device is easiest using devmem on its register
map, and that can be seen with /proc/iomem. But some device drivers have
many memory regions. Take for example a networking switch. Its memory
map used to look like this in /proc/iomem:
1fc000000-1fc3fffff : pcie@1f0000000
1fc000000-1fc3fffff : 0000:00:00.5
1fc010000-1fc01ffff : sys
1fc030000-1fc03ffff : rew
1fc060000-1fc0603ff : s2
1fc070000-1fc0701ff : devcpu_gcb
1fc080000-1fc0800ff : qs
1fc090000-1fc0900cb : ptp
1fc100000-1fc10ffff : port0
1fc110000-1fc11ffff : port1
1fc120000-1fc12ffff : port2
1fc130000-1fc13ffff : port3
1fc140000-1fc14ffff : port4
1fc150000-1fc15ffff : port5
1fc200000-1fc21ffff : qsys
1fc280000-1fc28ffff : ana
But after the patch in Fixes: was applied, the information is now
presented in a much more opaque way:
1fc000000-1fc3fffff : pcie@1f0000000
1fc000000-1fc3fffff : 0000:00:00.5
1fc010000-1fc01ffff : 0000:00:00.5
1fc030000-1fc03ffff : 0000:00:00.5
1fc060000-1fc0603ff : 0000:00:00.5
1fc070000-1fc0701ff : 0000:00:00.5
1fc080000-1fc0800ff : 0000:00:00.5
1fc090000-1fc0900cb : 0000:00:00.5
1fc100000-1fc10ffff : 0000:00:00.5
1fc110000-1fc11ffff : 0000:00:00.5
1fc120000-1fc12ffff : 0000:00:00.5
1fc130000-1fc13ffff : 0000:00:00.5
1fc140000-1fc14ffff : 0000:00:00.5
1fc150000-1fc15ffff : 0000:00:00.5
1fc200000-1fc21ffff : 0000:00:00.5
1fc280000-1fc28ffff : 0000:00:00.5
That patch made a fair comment that /proc/iomem might be confusing when
it shows resources without an associated device, but we can do better
than just hide the resource name altogether. Namely, we can print the
device name _and_ the resource name. Like this:
1fc000000-1fc3fffff : pcie@1f0000000
1fc000000-1fc3fffff : 0000:00:00.5
1fc010000-1fc01ffff : 0000:00:00.5 sys
1fc030000-1fc03ffff : 0000:00:00.5 rew
1fc060000-1fc0603ff : 0000:00:00.5 s2
1fc070000-1fc0701ff : 0000:00:00.5 devcpu_gcb
1fc080000-1fc0800ff : 0000:00:00.5 qs
1fc090000-1fc0900cb : 0000:00:00.5 ptp
1fc100000-1fc10ffff : 0000:00:00.5 port0
1fc110000-1fc11ffff : 0000:00:00.5 port1
1fc120000-1fc12ffff : 0000:00:00.5 port2
1fc130000-1fc13ffff : 0000:00:00.5 port3
1fc140000-1fc14ffff : 0000:00:00.5 port4
1fc150000-1fc15ffff : 0000:00:00.5 port5
1fc200000-1fc21ffff : 0000:00:00.5 qsys
1fc280000-1fc28ffff : 0000:00:00.5 ana
Fixes: 8d84b18f5678 ("devres: always use dev_name() in devm_ioremap_resource()")
Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
lib/devres.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/lib/devres.c b/lib/devres.c
index 6ef51f159c54b..ca0d28727ccef 100644
--- a/lib/devres.c
+++ b/lib/devres.c
@@ -119,6 +119,7 @@ __devm_ioremap_resource(struct device *dev, const struct resource *res,
{
resource_size_t size;
void __iomem *dest_ptr;
+ char *pretty_name;
BUG_ON(!dev);
@@ -129,7 +130,15 @@ __devm_ioremap_resource(struct device *dev, const struct resource *res,
size = resource_size(res);
- if (!devm_request_mem_region(dev, res->start, size, dev_name(dev))) {
+ if (res->name)
+ pretty_name = devm_kasprintf(dev, GFP_KERNEL, "%s %s",
+ dev_name(dev), res->name);
+ else
+ pretty_name = devm_kstrdup(dev, dev_name(dev), GFP_KERNEL);
+ if (!pretty_name)
+ return IOMEM_ERR_PTR(-ENOMEM);
+
+ if (!devm_request_mem_region(dev, res->start, size, pretty_name)) {
dev_err(dev, "can't request region for resource %pR\n", res);
return IOMEM_ERR_PTR(-EBUSY);
}
--
2.25.1
From: Masami Hiramatsu <[email protected]>
commit 12d572e785b15bc764e956caaa8a4c846fd15694 upstream.
Fix the memory leakage in debuginfo__find_trace_events() when the probe
point is not found in the debuginfo. If there is no probe point found in
the debuginfo, debuginfo__find_probes() will NOT return -ENOENT, but 0.
Thus the caller of debuginfo__find_probes() must check the tf.ntevs and
release the allocated memory for the array of struct probe_trace_event.
The current code releases the memory only if the debuginfo__find_probes()
hits an error but not checks tf.ntevs. In the result, the memory allocated
on *tevs are not released if tf.ntevs == 0.
This fixes the memory leakage by checking tf.ntevs == 0 in addition to
ret < 0.
Fixes: ff741783506c ("perf probe: Introduce debuginfo to encapsulate dwarf information")
Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Srikar Dronamraju <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/159438668346.62703.10887420400718492503.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/perf/util/probe-finder.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1467,7 +1467,7 @@ int debuginfo__find_trace_events(struct
if (ret >= 0 && tf.pf.skip_empty_arg)
ret = fill_empty_trace_arg(pev, tf.tevs, tf.ntevs);
- if (ret < 0) {
+ if (ret < 0 || tf.ntevs == 0) {
for (i = 0; i < tf.ntevs; i++)
clear_probe_trace_event(&tf.tevs[i]);
zfree(tevs);
From: Eric Dumazet <[email protected]>
[ Upstream commit 393415203f5c916b5907e0a7c89f4c2c5a9c5505 ]
We need to increase TSO_HEADER_SIZE from 128 to 256.
Since otx2_sq_init() calls qmem_alloc() with TSO_HEADER_SIZE,
we need to change (struct qmem)->entry_sz to avoid truncation to 0.
Fixes: 7a37245ef23f ("octeontx2-af: NPA block admin queue init")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Sunil Goutham <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/marvell/octeontx2/af/common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/common.h b/drivers/net/ethernet/marvell/octeontx2/af/common.h
index cd33c2e6ca5fc..f48eb66ed021b 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/common.h
+++ b/drivers/net/ethernet/marvell/octeontx2/af/common.h
@@ -43,7 +43,7 @@ struct qmem {
void *base;
dma_addr_t iova;
int alloc_sz;
- u8 entry_sz;
+ u16 entry_sz;
u8 align;
u32 qsize;
};
--
2.25.1
From: Adrian Hunter <[email protected]>
commit 401136bb084fd021acd9f8c51b52fe0a25e326b2 upstream.
While walking code towards a FUP ip, the packet state is
INTEL_PT_STATE_FUP or INTEL_PT_STATE_FUP_NO_TIP. That was mishandled
resulting in the state becoming INTEL_PT_STATE_IN_SYNC prematurely. The
result was an occasional lost EXSTOP event.
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/perf/util/intel-pt-decoder/intel-pt-decoder.c | 21 ++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
--- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
+++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c
@@ -1164,6 +1164,7 @@ static int intel_pt_walk_fup(struct inte
return 0;
if (err == -EAGAIN ||
intel_pt_fup_with_nlip(decoder, &intel_pt_insn, ip, err)) {
+ decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
if (intel_pt_fup_event(decoder))
return 0;
return -EAGAIN;
@@ -1942,17 +1943,13 @@ next:
}
if (decoder->set_fup_mwait)
no_tip = true;
+ if (no_tip)
+ decoder->pkt_state = INTEL_PT_STATE_FUP_NO_TIP;
+ else
+ decoder->pkt_state = INTEL_PT_STATE_FUP;
err = intel_pt_walk_fup(decoder);
- if (err != -EAGAIN) {
- if (err)
- return err;
- if (no_tip)
- decoder->pkt_state =
- INTEL_PT_STATE_FUP_NO_TIP;
- else
- decoder->pkt_state = INTEL_PT_STATE_FUP;
- return 0;
- }
+ if (err != -EAGAIN)
+ return err;
if (no_tip) {
no_tip = false;
break;
@@ -2599,15 +2596,11 @@ const struct intel_pt_state *intel_pt_de
err = intel_pt_walk_tip(decoder);
break;
case INTEL_PT_STATE_FUP:
- decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
err = intel_pt_walk_fup(decoder);
if (err == -EAGAIN)
err = intel_pt_walk_fup_tip(decoder);
- else if (!err)
- decoder->pkt_state = INTEL_PT_STATE_FUP;
break;
case INTEL_PT_STATE_FUP_NO_TIP:
- decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
err = intel_pt_walk_fup(decoder);
if (err == -EAGAIN)
err = intel_pt_walk_trace(decoder);
From: Masami Hiramatsu <[email protected]>
commit 477d08478170469d10b533624342d13701e24b34 upstream.
Since the parse_args() stops parsing at '--', bootconfig_params()
will never get the '--' as param and initargs_found never be true.
In the result, if we pass some init arguments via the bootconfig,
those are always appended to the kernel command line with '--'
even if the kernel command line already has '--'.
To fix this correctly, check the return value of parse_args()
and set initargs_found true if the return value is not an error
but a valid address.
Link: https://lkml.kernel.org/r/159650953285.270383.14822353843556363851.stgit@devnote2
Fixes: f61872bb58a1 ("bootconfig: Use parse_args() to find bootconfig and '--'")
Cc: [email protected]
Reported-by: Arvind Sankar <[email protected]>
Suggested-by: Arvind Sankar <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
init/main.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
--- a/init/main.c
+++ b/init/main.c
@@ -387,8 +387,6 @@ static int __init bootconfig_params(char
{
if (strcmp(param, "bootconfig") == 0) {
bootconfig_found = true;
- } else if (strcmp(param, "--") == 0) {
- initargs_found = true;
}
return 0;
}
@@ -399,19 +397,23 @@ static void __init setup_boot_config(con
const char *msg;
int pos;
u32 size, csum;
- char *data, *copy;
+ char *data, *copy, *err;
int ret;
/* Cut out the bootconfig data even if we have no bootconfig option */
data = get_boot_config_from_initrd(&size, &csum);
strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
- parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
- bootconfig_params);
+ err = parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
+ bootconfig_params);
- if (!bootconfig_found)
+ if (IS_ERR(err) || !bootconfig_found)
return;
+ /* parse_args() stops at '--' and returns an address */
+ if (err)
+ initargs_found = true;
+
if (!data) {
pr_err("'bootconfig' found on command line, but no bootconfig found\n");
return;
From: Charles Keepax <[email protected]>
[ Upstream commit ddff6c45b21d0437ce0c85f8ac35d7b5480513d7 ]
Whilst it doesn't matter if the internal 32k clock register settings
are cleaned up on exit, as the part will be turned off losing any
settings, hence the driver hasn't historially bothered. The external
clock should however be cleaned up, as it could cause clocks to be
left on, and will at best generate a warning on unbind.
Add clean up on both the probe error path and unbind for the 32k
clock.
Fixes: cdd8da8cc66b ("mfd: arizona: Add gating of external MCLKn clocks")
Signed-off-by: Charles Keepax <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mfd/arizona-core.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/mfd/arizona-core.c b/drivers/mfd/arizona-core.c
index f73cf76d1373d..a5e443110fc3d 100644
--- a/drivers/mfd/arizona-core.c
+++ b/drivers/mfd/arizona-core.c
@@ -1426,6 +1426,15 @@ int arizona_dev_init(struct arizona *arizona)
arizona_irq_exit(arizona);
err_pm:
pm_runtime_disable(arizona->dev);
+
+ switch (arizona->pdata.clk32k_src) {
+ case ARIZONA_32KZ_MCLK1:
+ case ARIZONA_32KZ_MCLK2:
+ arizona_clk32k_disable(arizona);
+ break;
+ default:
+ break;
+ }
err_reset:
arizona_enable_reset(arizona);
regulator_disable(arizona->dcvdd);
@@ -1448,6 +1457,15 @@ int arizona_dev_exit(struct arizona *arizona)
regulator_disable(arizona->dcvdd);
regulator_put(arizona->dcvdd);
+ switch (arizona->pdata.clk32k_src) {
+ case ARIZONA_32KZ_MCLK1:
+ case ARIZONA_32KZ_MCLK2:
+ arizona_clk32k_disable(arizona);
+ break;
+ default:
+ break;
+ }
+
mfd_remove_devices(arizona->dev);
arizona_free_irq(arizona, ARIZONA_IRQ_UNDERCLOCKED, arizona);
arizona_free_irq(arizona, ARIZONA_IRQ_OVERCLOCKED, arizona);
--
2.25.1
From: Masami Hiramatsu <[email protected]>
commit 11fd3eb874e73ee8069bcfd54e3c16fa7ce56fe6 upstream.
Fix a wrong "variable not found" warning when the probe point is not
found in the debuginfo.
Since the debuginfo__find_probes() can return 0 even if it does not find
given probe point in the debuginfo, fill_empty_trace_arg() can be called
with tf.ntevs == 0 and it can emit a wrong warning. To fix this, reject
ntevs == 0 in fill_empty_trace_arg().
E.g. without this patch;
# perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
Failed to find the location of the '%di' variable at this address.
Perhaps it has been optimized out.
Use -V with the --range option to show '%di' location range.
Added new events:
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
You can now use it in all perf tools, such as:
perf record -e probe_libc:memcpy -aR sleep 1
With this;
# perf probe -x /lib64/libc-2.30.so -a "memcpy arg1=%di"
Added new events:
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
probe_libc:memcpy (on memcpy in /usr/lib64/libc-2.30.so with arg1=%di)
You can now use it in all perf tools, such as:
perf record -e probe_libc:memcpy -aR sleep 1
Fixes: cb4027308570 ("perf probe: Trace a magic number if variable is not found")
Reported-by: Andi Kleen <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Srikar Dronamraju <[email protected]>
Tested-by: Andi Kleen <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/159438667364.62703.2200642186798763202.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/perf/util/probe-finder.c | 3 +++
1 file changed, 3 insertions(+)
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1408,6 +1408,9 @@ static int fill_empty_trace_arg(struct p
char *type;
int i, j, ret;
+ if (!ntevs)
+ return -ENOENT;
+
for (i = 0; i < pev->nargs; i++) {
type = NULL;
for (j = 0; j < ntevs; j++) {
From: Coly Li <[email protected]>
commit a2f32ee8fd853cec8860f883d98afc3a339546de upstream.
Commit 85750aeb748f ("bcache: use bio_{start,end}_io_acct") moves the
io account code to the location after bio_set_dev(bio, dc->bdev) in
cached_dev_make_request(). Then the account is performed incorrectly on
backing device, indeed the I/O should be counted to bcache device like
/dev/bcache0.
With the mistaken I/O account, iostat does not display I/O counts for
bcache device and all the numbers go to backing device. In writeback
mode, the hard drive may have 340K+ IOPS which is impossible and wrong
for spinning disk.
This patch introduces bch_bio_start_io_acct() and bch_bio_end_io_acct(),
which switches bio->bi_disk to bcache device before calling
bio_start_io_acct() or bio_end_io_acct(). Now the I/Os are counted to
bcache device, and bcache device, cache device and backing device have
their correct I/O count information back.
Fixes: 85750aeb748f ("bcache: use bio_{start,end}_io_acct")
Signed-off-by: Coly Li <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/bcache/request.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -617,6 +617,28 @@ static void cache_lookup(struct closure
/* Common code for the make_request functions */
+static inline void bch_bio_start_io_acct(struct gendisk *acct_bi_disk,
+ struct bio *bio,
+ unsigned long *start_time)
+{
+ struct gendisk *saved_bi_disk = bio->bi_disk;
+
+ bio->bi_disk = acct_bi_disk;
+ *start_time = bio_start_io_acct(bio);
+ bio->bi_disk = saved_bi_disk;
+}
+
+static inline void bch_bio_end_io_acct(struct gendisk *acct_bi_disk,
+ struct bio *bio,
+ unsigned long start_time)
+{
+ struct gendisk *saved_bi_disk = bio->bi_disk;
+
+ bio->bi_disk = acct_bi_disk;
+ bio_end_io_acct(bio, start_time);
+ bio->bi_disk = saved_bi_disk;
+}
+
static void request_endio(struct bio *bio)
{
struct closure *cl = bio->bi_private;
@@ -668,7 +690,7 @@ static void backing_request_endio(struct
static void bio_complete(struct search *s)
{
if (s->orig_bio) {
- bio_end_io_acct(s->orig_bio, s->start_time);
+ bch_bio_end_io_acct(s->d->disk, s->orig_bio, s->start_time);
trace_bcache_request_end(s->d, s->orig_bio);
s->orig_bio->bi_status = s->iop.status;
bio_endio(s->orig_bio);
@@ -728,7 +750,7 @@ static inline struct search *search_allo
s->recoverable = 1;
s->write = op_is_write(bio_op(bio));
s->read_dirty_data = 0;
- s->start_time = bio_start_io_acct(bio);
+ bch_bio_start_io_acct(d->disk, bio, &s->start_time);
s->iop.c = d->c;
s->iop.bio = NULL;
@@ -1080,7 +1102,7 @@ static void detached_dev_end_io(struct b
bio->bi_end_io = ddip->bi_end_io;
bio->bi_private = ddip->bi_private;
- bio_end_io_acct(bio, ddip->start_time);
+ bch_bio_end_io_acct(ddip->d->disk, bio, ddip->start_time);
if (bio->bi_status) {
struct cached_dev *dc = container_of(ddip->d,
@@ -1105,7 +1127,8 @@ static void detached_dev_do_request(stru
*/
ddip = kzalloc(sizeof(struct detached_dev_io_private), GFP_NOIO);
ddip->d = d;
- ddip->start_time = bio_start_io_acct(bio);
+ bch_bio_start_io_acct(d->disk, bio, &ddip->start_time);
+
ddip->bi_end_io = bio->bi_end_io;
ddip->bi_private = bio->bi_private;
bio->bi_end_io = detached_dev_end_io;
From: Ahmad Fatoum <[email protected]>
commit 802141462d844f2e6a4d63a12260d79b7afc4c34 upstream.
The flags that should be or-ed into the watchdog_info.options by drivers
all start with WDIOF_, e.g. WDIOF_SETTIMEOUT, which indicates that the
driver's watchdog_ops has a usable set_timeout.
WDIOC_SETTIMEOUT was used instead, which expands to 0xc0045706, which
equals:
WDIOF_FANFAULT | WDIOF_EXTERN1 | WDIOF_PRETIMEOUT | WDIOF_ALARMONLY |
WDIOF_MAGICCLOSE | 0xc0045000
These were so far indicated to userspace on WDIOC_GETSUPPORT.
As the driver has not yet been migrated to the new watchdog kernel API,
the constant can just be dropped without substitute.
Fixes: 96cb4eb019ce ("watchdog: f71808e_wdt: new watchdog driver for Fintek F71808E and F71882FG")
Cc: [email protected]
Signed-off-by: Ahmad Fatoum <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/watchdog/f71808e_wdt.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/drivers/watchdog/f71808e_wdt.c
+++ b/drivers/watchdog/f71808e_wdt.c
@@ -690,8 +690,7 @@ static int __init watchdog_init(int sioa
* into the module have been registered yet.
*/
watchdog.sioaddr = sioaddr;
- watchdog.ident.options = WDIOC_SETTIMEOUT
- | WDIOF_MAGICCLOSE
+ watchdog.ident.options = WDIOF_MAGICCLOSE
| WDIOF_KEEPALIVEPING
| WDIOF_CARDRESET;
From: Johan Hovold <[email protected]>
[ Upstream commit 733fff67941dad64b8a630450b8372b1873edc41 ]
Only the last NUL in a packet should be flagged as a break character,
for example, to avoid dropping unrelated characters when IGNBRK is set.
Also make sysrq work by consuming the break character instead of having
it immediately cancel the sysrq request, and by not processing it
prematurely to avoid triggering a sysrq based on an unrelated character
received in the same packet (which was received *before* the break).
Note that the break flag can be left set also for a packet received
immediately following a break and that and an ending NUL in such a
packet will continue to be reported as a break as there's no good way to
tell it apart from an actual break.
Tested on FT232R and FT232H.
Fixes: 72fda3ca6fc1 ("USB: serial: ftd_sio: implement sysrq handling on break")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/usb/serial/ftdi_sio.c | 24 +++++++++++++++++-------
1 file changed, 17 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index 33f1cca7eaa61..07b146d7033a6 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -2483,6 +2483,7 @@ static int ftdi_process_packet(struct usb_serial_port *port,
struct ftdi_private *priv, unsigned char *buf, int len)
{
unsigned char status;
+ bool brkint = false;
int i;
char flag;
@@ -2534,13 +2535,17 @@ static int ftdi_process_packet(struct usb_serial_port *port,
*/
flag = TTY_NORMAL;
if (buf[1] & FTDI_RS_ERR_MASK) {
- /* Break takes precedence over parity, which takes precedence
- * over framing errors */
- if (buf[1] & FTDI_RS_BI) {
- flag = TTY_BREAK;
+ /*
+ * Break takes precedence over parity, which takes precedence
+ * over framing errors. Note that break is only associated
+ * with the last character in the buffer and only when it's a
+ * NUL.
+ */
+ if (buf[1] & FTDI_RS_BI && buf[len - 1] == '\0') {
port->icount.brk++;
- usb_serial_handle_break(port);
- } else if (buf[1] & FTDI_RS_PE) {
+ brkint = true;
+ }
+ if (buf[1] & FTDI_RS_PE) {
flag = TTY_PARITY;
port->icount.parity++;
} else if (buf[1] & FTDI_RS_FE) {
@@ -2556,8 +2561,13 @@ static int ftdi_process_packet(struct usb_serial_port *port,
port->icount.rx += len - 2;
- if (port->port.console && port->sysrq) {
+ if (brkint || (port->port.console && port->sysrq)) {
for (i = 2; i < len; i++) {
+ if (brkint && i == len - 1) {
+ if (usb_serial_handle_break(port))
+ return len - 3;
+ flag = TTY_BREAK;
+ }
if (usb_serial_handle_sysrq_char(port, buf[i]))
continue;
tty_insert_flip_char(&port->port, buf[i], flag);
--
2.25.1
From: Yan-Hsuan Chuang <[email protected]>
[ Upstream commit 68aa716b7dd36f55e080da9e27bc594346334c41 ]
Some platforms cannot read the DBI register successfully for the
ASPM settings. After the read failed, the bus could be unstable,
and the device just became unavailable [1]. For those platforms,
the ASPM should be disabled. But as the ASPM can help the driver
to save the power consumption in power save mode, the ASPM is still
needed. So, add a module parameter for them to disable it, then
the device can still work, while others can benefit from the less
power consumption that brings by ASPM enabled.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=206411
[2] Note that my lenovo T430 is the same.
Fixes: 3dff7c6e3749 ("rtw88: allows to enable/disable HCI link PS mechanism")
Signed-off-by: Yan-Hsuan Chuang <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/wireless/realtek/rtw88/pci.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index 8228db9a5fc86..3413973bc4750 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -14,8 +14,11 @@
#include "debug.h"
static bool rtw_disable_msi;
+static bool rtw_pci_disable_aspm;
module_param_named(disable_msi, rtw_disable_msi, bool, 0644);
+module_param_named(disable_aspm, rtw_pci_disable_aspm, bool, 0644);
MODULE_PARM_DESC(disable_msi, "Set Y to disable MSI interrupt support");
+MODULE_PARM_DESC(disable_aspm, "Set Y to disable PCI ASPM support");
static u32 rtw_pci_tx_queue_idx_addr[] = {
[RTW_TX_QUEUE_BK] = RTK_PCI_TXBD_IDX_BKQ,
@@ -1200,6 +1203,9 @@ static void rtw_pci_clkreq_set(struct rtw_dev *rtwdev, bool enable)
u8 value;
int ret;
+ if (rtw_pci_disable_aspm)
+ return;
+
ret = rtw_dbi_read8(rtwdev, RTK_PCIE_LINK_CFG, &value);
if (ret) {
rtw_err(rtwdev, "failed to read CLKREQ_L1, ret=%d", ret);
@@ -1219,6 +1225,9 @@ static void rtw_pci_aspm_set(struct rtw_dev *rtwdev, bool enable)
u8 value;
int ret;
+ if (rtw_pci_disable_aspm)
+ return;
+
ret = rtw_dbi_read8(rtwdev, RTK_PCIE_LINK_CFG, &value);
if (ret) {
rtw_err(rtwdev, "failed to read ASPM, ret=%d", ret);
--
2.25.1
From: ChangSyun Peng <[email protected]>
commit a1c6ae3d9f3dd6aa5981a332a6f700cf1c25edef upstream.
In degraded raid5, we need to read parity to do reconstruct-write when
data disks fail. However, we can not read parity from
handle_stripe_dirtying() in force reconstruct-write mode.
Reproducible Steps:
1. Create degraded raid5
mdadm -C /dev/md2 --assume-clean -l5 -n3 /dev/sda2 /dev/sdb2 missing
2. Set rmw_level to 0
echo 0 > /sys/block/md2/md/rmw_level
3. IO to raid5
Now some io may be stuck in raid5. We can use handle_stripe_fill() to read
the parity in this situation.
Cc: <[email protected]> # v4.4+
Reviewed-by: Alex Wu <[email protected]>
Reviewed-by: BingJing Chang <[email protected]>
Reviewed-by: Danny Shih <[email protected]>
Signed-off-by: ChangSyun Peng <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/raid5.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3607,6 +3607,7 @@ static int need_this_block(struct stripe
* is missing/faulty, then we need to read everything we can.
*/
if (sh->raid_conf->level != 6 &&
+ sh->raid_conf->rmw_level != PARITY_DISABLE_RMW &&
sh->sector < sh->raid_conf->mddev->recovery_cp)
/* reconstruct-write isn't being forced */
return 0;
@@ -4842,7 +4843,7 @@ static void handle_stripe(struct stripe_
* or to load a block that is being partially written.
*/
if (s.to_read || s.non_overwrite
- || (conf->level == 6 && s.to_write && s.failed)
+ || (s.to_write && s.failed)
|| (s.syncing && (s.uptodate + s.compute < disks))
|| s.replacing
|| s.expanding)
From: Huacai Chen <[email protected]>
commit c9c73a05413ea4a465cae1cb3593b01b190a233f upstream.
In gc->mask_cache bits, 1 means enabled and 0 means disabled, but in the
loongson-liointc driver mask_cache is misused by reverting its meaning.
This patch fix the bug and update the comments as well.
Fixes: dbb152267908c4b2c3639492a ("irqchip: Add driver for Loongson I/O Local Interrupt Controller")
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Jiaxun Yang <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/irqchip/irq-loongson-liointc.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
--- a/drivers/irqchip/irq-loongson-liointc.c
+++ b/drivers/irqchip/irq-loongson-liointc.c
@@ -60,7 +60,7 @@ static void liointc_chained_handle_irq(s
if (!pending) {
/* Always blame LPC IRQ if we have that bug */
if (handler->priv->has_lpc_irq_errata &&
- (handler->parent_int_map & ~gc->mask_cache &
+ (handler->parent_int_map & gc->mask_cache &
BIT(LIOINTC_ERRATA_IRQ)))
pending = BIT(LIOINTC_ERRATA_IRQ);
else
@@ -132,11 +132,11 @@ static void liointc_resume(struct irq_ch
irq_gc_lock_irqsave(gc, flags);
/* Disable all at first */
writel(0xffffffff, gc->reg_base + LIOINTC_REG_INTC_DISABLE);
- /* Revert map cache */
+ /* Restore map cache */
for (i = 0; i < LIOINTC_CHIP_IRQ; i++)
writeb(priv->map_cache[i], gc->reg_base + i);
- /* Revert mask cache */
- writel(~gc->mask_cache, gc->reg_base + LIOINTC_REG_INTC_ENABLE);
+ /* Restore mask cache */
+ writel(gc->mask_cache, gc->reg_base + LIOINTC_REG_INTC_ENABLE);
irq_gc_unlock_irqrestore(gc, flags);
}
@@ -244,7 +244,7 @@ int __init liointc_of_init(struct device
ct->chip.irq_mask_ack = irq_gc_mask_disable_reg;
ct->chip.irq_set_type = liointc_set_type;
- gc->mask_cache = 0xffffffff;
+ gc->mask_cache = 0;
priv->gc = gc;
for (i = 0; i < LIOINTC_NUM_PARENT; i++) {
From: Ahmad Fatoum <[email protected]>
commit e871e93fb08a619dfc015974a05768ed6880fd82 upstream.
The driver supports populating bootstatus with WDIOF_CARDRESET, but so
far userspace couldn't portably determine whether absence of this flag
meant no watchdog reset or no driver support. Or-in the bit to fix this.
Fixes: b97cb21a4634 ("watchdog: f71808e_wdt: Fix WDTMOUT_STS register read")
Cc: [email protected]
Signed-off-by: Ahmad Fatoum <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/watchdog/f71808e_wdt.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/watchdog/f71808e_wdt.c
+++ b/drivers/watchdog/f71808e_wdt.c
@@ -692,7 +692,8 @@ static int __init watchdog_init(int sioa
watchdog.sioaddr = sioaddr;
watchdog.ident.options = WDIOC_SETTIMEOUT
| WDIOF_MAGICCLOSE
- | WDIOF_KEEPALIVEPING;
+ | WDIOF_KEEPALIVEPING
+ | WDIOF_CARDRESET;
snprintf(watchdog.ident.identity,
sizeof(watchdog.ident.identity), "%s watchdog",
From: Qiushi Wu <[email protected]>
[ Upstream commit aaa3cbbac326c95308e315f1ab964a3369c4d07d ]
In function cros_ec_ishtp_probe(), "up_write" is already called
before function "cros_ec_dev_init". But "up_write" will be called
again after the calling of the function "cros_ec_dev_init" failed.
Thus add a call of the function “down_write” in this if branch
for the completion of the exception handling.
Fixes: 26a14267aff2 ("platform/chrome: Add ChromeOS EC ISHTP driver")
Signed-off-by: Qiushi Wu <[email protected]>
Tested-by: Mathew King <[email protected]>
Signed-off-by: Enric Balletbo i Serra <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/platform/chrome/cros_ec_ishtp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/chrome/cros_ec_ishtp.c b/drivers/platform/chrome/cros_ec_ishtp.c
index ed794a7ddba9b..81364029af367 100644
--- a/drivers/platform/chrome/cros_ec_ishtp.c
+++ b/drivers/platform/chrome/cros_ec_ishtp.c
@@ -681,8 +681,10 @@ static int cros_ec_ishtp_probe(struct ishtp_cl_device *cl_device)
/* Register croc_ec_dev mfd */
rv = cros_ec_dev_init(client_data);
- if (rv)
+ if (rv) {
+ down_write(&init_lock);
goto end_cros_ec_dev_init_error;
+ }
return 0;
--
2.25.1
From: Kamal Dasu <[email protected]>
[ Upstream commit 4551e78ad98add1f16b70cf286d5aad3ce7bcd4c ]
Implement ECC correctable and uncorrectable error handling for EDU
reads. If ECC correctable bitflips are encountered on EDU transfer,
read page again using PIO. This is needed due to a NAND controller
limitation where corrected data is not transferred to the DMA buffer
on ECC error. This applies to ECC correctable errors that are reported
by the controller hardware based on set number of bitflips threshold in
the controller threshold register, bitflips below the threshold are
corrected silently and are not reported by the controller hardware.
Fixes: a5d53ad26a8b ("mtd: rawnand: brcmnand: Add support for flash-edu for dma transfers")
Signed-off-by: Kamal Dasu <[email protected]>
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/linux-mtd/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mtd/nand/raw/brcmnand/brcmnand.c | 26 ++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
index ac934a715a194..a4033d32a7103 100644
--- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c
@@ -1918,6 +1918,22 @@ static int brcmnand_edu_trans(struct brcmnand_host *host, u64 addr, u32 *buf,
edu_writel(ctrl, EDU_STOP, 0); /* force stop */
edu_readl(ctrl, EDU_STOP);
+ if (!ret && edu_cmd == EDU_CMD_READ) {
+ u64 err_addr = 0;
+
+ /*
+ * check for ECC errors here, subpage ECC errors are
+ * retained in ECC error address register
+ */
+ err_addr = brcmnand_get_uncorrecc_addr(ctrl);
+ if (!err_addr) {
+ err_addr = brcmnand_get_correcc_addr(ctrl);
+ if (err_addr)
+ ret = -EUCLEAN;
+ } else
+ ret = -EBADMSG;
+ }
+
return ret;
}
@@ -2124,6 +2140,7 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
u64 err_addr = 0;
int err;
bool retry = true;
+ bool edu_err = false;
dev_dbg(ctrl->dev, "read %llx -> %p\n", (unsigned long long)addr, buf);
@@ -2141,6 +2158,10 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
else
return -EIO;
}
+
+ if (has_edu(ctrl) && err_addr)
+ edu_err = true;
+
} else {
if (oob)
memset(oob, 0x99, mtd->oobsize);
@@ -2188,6 +2209,11 @@ static int brcmnand_read(struct mtd_info *mtd, struct nand_chip *chip,
if (mtd_is_bitflip(err)) {
unsigned int corrected = brcmnand_count_corrected(ctrl);
+ /* in case of EDU correctable error we read again using PIO */
+ if (edu_err)
+ err = brcmnand_read_by_pio(mtd, chip, addr, trans, buf,
+ oob, &err_addr);
+
dev_dbg(ctrl->dev, "corrected error at 0x%llx\n",
(unsigned long long)err_addr);
mtd->ecc_stats.corrected += corrected;
--
2.25.1
From: Yoshihiro Shimoda <[email protected]>
[ Upstream commit 2b26e34e9af3fa24fa1266e9ea2d66a1f7d62dc0 ]
To add end() operation in the future, clean the code of
renesas_sdhi_internal_dmac_complete_tasklet_fn(). No behavior change.
Signed-off-by: Yoshihiro Shimoda <[email protected]>
Link: https://lore.kernel.org/r/1590044466-28372-3-git-send-email-yoshihiro.shimoda.uh@renesas.com
Tested-by: Wolfram Sang <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/mmc/host/renesas_sdhi_internal_dmac.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
index 47ac53e912411..201b8ed37f2e0 100644
--- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
+++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
@@ -229,15 +229,12 @@ static void renesas_sdhi_internal_dmac_issue_tasklet_fn(unsigned long arg)
DTRAN_CTRL_DM_START);
}
-static void renesas_sdhi_internal_dmac_complete_tasklet_fn(unsigned long arg)
+static bool renesas_sdhi_internal_dmac_complete(struct tmio_mmc_host *host)
{
- struct tmio_mmc_host *host = (struct tmio_mmc_host *)arg;
enum dma_data_direction dir;
- spin_lock_irq(&host->lock);
-
if (!host->data)
- goto out;
+ return false;
if (host->data->flags & MMC_DATA_READ)
dir = DMA_FROM_DEVICE;
@@ -250,6 +247,17 @@ static void renesas_sdhi_internal_dmac_complete_tasklet_fn(unsigned long arg)
if (dir == DMA_FROM_DEVICE)
clear_bit(SDHI_INTERNAL_DMAC_RX_IN_USE, &global_flags);
+ return true;
+}
+
+static void renesas_sdhi_internal_dmac_complete_tasklet_fn(unsigned long arg)
+{
+ struct tmio_mmc_host *host = (struct tmio_mmc_host *)arg;
+
+ spin_lock_irq(&host->lock);
+ if (!renesas_sdhi_internal_dmac_complete(host))
+ goto out;
+
tmio_mmc_do_data_irq(host);
out:
spin_unlock_irq(&host->lock);
--
2.25.1
From: Cristian Ciocaltea <[email protected]>
[ Upstream commit f47ee279d25fb0e010cae5d6e758e39b40eb6378 ]
The h_clk clock in the Actions Semi S500 SoC clock driver has an
invalid parent. Replace with the correct one.
Fixes: ed6b4795ece4 ("clk: actions: Add clock driver for S500 SoC")
Signed-off-by: Cristian Ciocaltea <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Link: https://lore.kernel.org/r/c57e7ebabfa970014f073b92fe95b47d3e5a70b1.1593788312.git.cristian.ciocaltea@gmail.com
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/clk/actions/owl-s500.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/actions/owl-s500.c b/drivers/clk/actions/owl-s500.c
index e2007ac4d235d..0eb83a0b70bcc 100644
--- a/drivers/clk/actions/owl-s500.c
+++ b/drivers/clk/actions/owl-s500.c
@@ -183,7 +183,7 @@ static OWL_GATE(timer_clk, "timer_clk", "hosc", CMU_DEVCLKEN1, 27, 0, 0);
static OWL_GATE(hdmi_clk, "hdmi_clk", "hosc", CMU_DEVCLKEN1, 3, 0, 0);
/* divider clocks */
-static OWL_DIVIDER(h_clk, "h_clk", "ahbprevdiv_clk", CMU_BUSCLK1, 12, 2, NULL, 0, 0);
+static OWL_DIVIDER(h_clk, "h_clk", "ahbprediv_clk", CMU_BUSCLK1, 12, 2, NULL, 0, 0);
static OWL_DIVIDER(rmii_ref_clk, "rmii_ref_clk", "ethernet_pll_clk", CMU_ETHERNETPLL, 1, 1, rmii_ref_div_table, 0, 0);
/* factor clocks */
--
2.25.1
From: Aneesh Kumar K.V <[email protected]>
[ Upstream commit 9a11f12e0a6c374b3ef1ce81e32ce477d28eb1b8 ]
Rename variable to indicate that they are invalid values which we will
use to test ptrace update of pkeys.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
.../selftests/powerpc/ptrace/ptrace-pkey.c | 26 +++++++++----------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
index bdbbbe8431e03..f9216c7a1829e 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-pkey.c
@@ -44,7 +44,7 @@ struct shared_info {
unsigned long amr2;
/* AMR value that ptrace should refuse to write to the child. */
- unsigned long amr3;
+ unsigned long invalid_amr;
/* IAMR value the parent expects to read from the child. */
unsigned long expected_iamr;
@@ -57,8 +57,8 @@ struct shared_info {
* (even though they're valid ones) because userspace doesn't have
* access to those registers.
*/
- unsigned long new_iamr;
- unsigned long new_uamor;
+ unsigned long invalid_iamr;
+ unsigned long invalid_uamor;
};
static int sys_pkey_alloc(unsigned long flags, unsigned long init_access_rights)
@@ -100,7 +100,7 @@ static int child(struct shared_info *info)
info->amr1 |= 3ul << pkeyshift(pkey1);
info->amr2 |= 3ul << pkeyshift(pkey2);
- info->amr3 |= info->amr2 | 3ul << pkeyshift(pkey3);
+ info->invalid_amr |= info->amr2 | 3ul << pkeyshift(pkey3);
if (disable_execute)
info->expected_iamr |= 1ul << pkeyshift(pkey1);
@@ -111,8 +111,8 @@ static int child(struct shared_info *info)
info->expected_uamor |= 3ul << pkeyshift(pkey1) |
3ul << pkeyshift(pkey2);
- info->new_iamr |= 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
- info->new_uamor |= 3ul << pkeyshift(pkey1);
+ info->invalid_iamr |= 1ul << pkeyshift(pkey1) | 1ul << pkeyshift(pkey2);
+ info->invalid_uamor |= 3ul << pkeyshift(pkey1);
/*
* We won't use pkey3. We just want a plausible but invalid key to test
@@ -196,9 +196,9 @@ static int parent(struct shared_info *info, pid_t pid)
PARENT_SKIP_IF_UNSUPPORTED(ret, &info->child_sync);
PARENT_FAIL_IF(ret, &info->child_sync);
- info->amr1 = info->amr2 = info->amr3 = regs[0];
- info->expected_iamr = info->new_iamr = regs[1];
- info->expected_uamor = info->new_uamor = regs[2];
+ info->amr1 = info->amr2 = info->invalid_amr = regs[0];
+ info->expected_iamr = info->invalid_iamr = regs[1];
+ info->expected_uamor = info->invalid_uamor = regs[2];
/* Wake up child so that it can set itself up. */
ret = prod_child(&info->child_sync);
@@ -234,10 +234,10 @@ static int parent(struct shared_info *info, pid_t pid)
return ret;
/* Write invalid AMR value in child. */
- ret = ptrace_write_regs(pid, NT_PPC_PKEY, &info->amr3, 1);
+ ret = ptrace_write_regs(pid, NT_PPC_PKEY, &info->invalid_amr, 1);
PARENT_FAIL_IF(ret, &info->child_sync);
- printf("%-30s AMR: %016lx\n", ptrace_write_running, info->amr3);
+ printf("%-30s AMR: %016lx\n", ptrace_write_running, info->invalid_amr);
/* Wake up child so that it can verify it didn't change. */
ret = prod_child(&info->child_sync);
@@ -249,7 +249,7 @@ static int parent(struct shared_info *info, pid_t pid)
/* Try to write to IAMR. */
regs[0] = info->amr1;
- regs[1] = info->new_iamr;
+ regs[1] = info->invalid_iamr;
ret = ptrace_write_regs(pid, NT_PPC_PKEY, regs, 2);
PARENT_FAIL_IF(!ret, &info->child_sync);
@@ -257,7 +257,7 @@ static int parent(struct shared_info *info, pid_t pid)
ptrace_write_running, regs[0], regs[1]);
/* Try to write to IAMR and UAMOR. */
- regs[2] = info->new_uamor;
+ regs[2] = info->invalid_uamor;
ret = ptrace_write_regs(pid, NT_PPC_PKEY, regs, 3);
PARENT_FAIL_IF(!ret, &info->child_sync);
--
2.25.1
From: Kees Cook <[email protected]>
commit e4d05028a07f505a08802a6d1b11674c149df2b3 upstream.
The TSYNC ESRCH flag test will fail for regular users because NNP was
not set yet. Add NNP setting.
Fixes: 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together")
Cc: [email protected]
Reviewed-by: Tycho Andersen <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/testing/selftests/seccomp/seccomp_bpf.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -3258,6 +3258,11 @@ TEST(user_notification_with_tsync)
int ret;
unsigned int flags;
+ ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
+ ASSERT_EQ(0, ret) {
+ TH_LOG("Kernel does not support PR_SET_NO_NEW_PRIVS!");
+ }
+
/* these were exclusive */
flags = SECCOMP_FILTER_FLAG_NEW_LISTENER |
SECCOMP_FILTER_FLAG_TSYNC;
From: Mike Kravetz <[email protected]>
commit 3a5139f1c5bb76d69756fb8f13fffa173e261153 upstream.
The routine cma_init_reserved_areas is designed to activate all
reserved cma areas. It quits when it first encounters an error.
This can leave some areas in a state where they are reserved but
not activated. There is no feedback to code which performed the
reservation. Attempting to allocate memory from areas in such a
state will result in a BUG.
Modify cma_init_reserved_areas to always attempt to activate all
areas. The called routine, cma_activate_area is responsible for
leaving the area in a valid state. No one is making active use
of returned error codes, so change the routine to void.
How to reproduce: This example uses kernelcore, hugetlb and cma
as an easy way to reproduce. However, this is a more general cma
issue.
Two node x86 VM 16GB total, 8GB per node
Kernel command line parameters, kernelcore=4G hugetlb_cma=8G
Related boot time messages,
hugetlb_cma: reserve 8192 MiB, up to 4096 MiB per node
cma: Reserved 4096 MiB at 0x0000000100000000
hugetlb_cma: reserved 4096 MiB on node 0
cma: Reserved 4096 MiB at 0x0000000300000000
hugetlb_cma: reserved 4096 MiB on node 1
cma: CMA area hugetlb could not be activated
# echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
...
Call Trace:
bitmap_find_next_zero_area_off+0x51/0x90
cma_alloc+0x1a5/0x310
alloc_fresh_huge_page+0x78/0x1a0
alloc_pool_huge_page+0x6f/0xf0
set_max_huge_pages+0x10c/0x250
nr_hugepages_store_common+0x92/0x120
? __kmalloc+0x171/0x270
kernfs_fop_write+0xc1/0x1a0
vfs_write+0xc7/0x1f0
ksys_write+0x5f/0xe0
do_syscall_64+0x4d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator")
Signed-off-by: Mike Kravetz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Roman Gushchin <[email protected]>
Acked-by: Barry Song <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Michal Nazarewicz <[email protected]>
Cc: Kyungmin Park <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/cma.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -93,17 +93,15 @@ static void cma_clear_bitmap(struct cma
mutex_unlock(&cma->lock);
}
-static int __init cma_activate_area(struct cma *cma)
+static void __init cma_activate_area(struct cma *cma)
{
unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
unsigned i = cma->count >> pageblock_order;
struct zone *zone;
cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
- if (!cma->bitmap) {
- cma->count = 0;
- return -ENOMEM;
- }
+ if (!cma->bitmap)
+ goto out_error;
WARN_ON_ONCE(!pfn_valid(pfn));
zone = page_zone(pfn_to_page(pfn));
@@ -133,25 +131,22 @@ static int __init cma_activate_area(stru
spin_lock_init(&cma->mem_head_lock);
#endif
- return 0;
+ return;
not_in_zone:
- pr_err("CMA area %s could not be activated\n", cma->name);
bitmap_free(cma->bitmap);
+out_error:
cma->count = 0;
- return -EINVAL;
+ pr_err("CMA area %s could not be activated\n", cma->name);
+ return;
}
static int __init cma_init_reserved_areas(void)
{
int i;
- for (i = 0; i < cma_area_count; i++) {
- int ret = cma_activate_area(&cma_areas[i]);
-
- if (ret)
- return ret;
- }
+ for (i = 0; i < cma_area_count; i++)
+ cma_activate_area(&cma_areas[i]);
return 0;
}
From: Dafna Hirschfeld <[email protected]>
[ Upstream commit 206003b18bb264521607440752814ccff59f91f3 ]
When setting the sink format of the 'rkisp1_resizer'
the format should be supported by 'rkisp1_isp' on
the video source pad. This patch checks this condition
and sets the format to default if the condition is false.
Fixes: 56e3b29f9f6b "media: staging: rkisp1: add streaming paths"
Signed-off-by: Dafna Hirschfeld <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Acked-by: Helen Koike <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/staging/media/rkisp1/rkisp1-common.h | 3 +++
drivers/staging/media/rkisp1/rkisp1-isp.c | 3 ---
drivers/staging/media/rkisp1/rkisp1-resizer.c | 2 +-
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/media/rkisp1/rkisp1-common.h b/drivers/staging/media/rkisp1/rkisp1-common.h
index 0c4fe503adc90..12bd9d05050db 100644
--- a/drivers/staging/media/rkisp1/rkisp1-common.h
+++ b/drivers/staging/media/rkisp1/rkisp1-common.h
@@ -22,6 +22,9 @@
#include "rkisp1-regs.h"
#include "uapi/rkisp1-config.h"
+#define RKISP1_ISP_SD_SRC BIT(0)
+#define RKISP1_ISP_SD_SINK BIT(1)
+
#define RKISP1_ISP_MAX_WIDTH 4032
#define RKISP1_ISP_MAX_HEIGHT 3024
#define RKISP1_ISP_MIN_WIDTH 32
diff --git a/drivers/staging/media/rkisp1/rkisp1-isp.c b/drivers/staging/media/rkisp1/rkisp1-isp.c
index abfedb604303f..b21a67aea433c 100644
--- a/drivers/staging/media/rkisp1/rkisp1-isp.c
+++ b/drivers/staging/media/rkisp1/rkisp1-isp.c
@@ -23,9 +23,6 @@
#define RKISP1_ISP_DEV_NAME RKISP1_DRIVER_NAME "_isp"
-#define RKISP1_ISP_SD_SRC BIT(0)
-#define RKISP1_ISP_SD_SINK BIT(1)
-
/*
* NOTE: MIPI controller and input MUX are also configured in this file.
* This is because ISP Subdev describes not only ISP submodule (input size,
diff --git a/drivers/staging/media/rkisp1/rkisp1-resizer.c b/drivers/staging/media/rkisp1/rkisp1-resizer.c
index e188944941b58..a2b35961bc8b7 100644
--- a/drivers/staging/media/rkisp1/rkisp1-resizer.c
+++ b/drivers/staging/media/rkisp1/rkisp1-resizer.c
@@ -542,7 +542,7 @@ static void rkisp1_rsz_set_sink_fmt(struct rkisp1_resizer *rsz,
which);
sink_fmt->code = format->code;
mbus_info = rkisp1_isp_mbus_info_get(sink_fmt->code);
- if (!mbus_info) {
+ if (!mbus_info || !(mbus_info->direction & RKISP1_ISP_SD_SRC)) {
sink_fmt->code = RKISP1_DEF_FMT;
mbus_info = rkisp1_isp_mbus_info_get(sink_fmt->code);
}
--
2.25.1
From: Jia He <[email protected]>
commit b4223a510e2ab1bf0f971d50af7c1431014b25ad upstream.
When check_memblock_offlined_cb() returns failed rc(e.g. the memblock is
online at that time), mem_hotplug_begin/done is unpaired in such case.
Therefore a warning:
Call Trace:
percpu_up_write+0x33/0x40
try_remove_memory+0x66/0x120
? _cond_resched+0x19/0x30
remove_memory+0x2b/0x40
dev_dax_kmem_remove+0x36/0x72 [kmem]
device_release_driver_internal+0xf0/0x1c0
device_release_driver+0x12/0x20
bus_remove_device+0xe1/0x150
device_del+0x17b/0x3e0
unregister_dev_dax+0x29/0x60
devm_action_release+0x15/0x20
release_nodes+0x19a/0x1e0
devres_release_all+0x3f/0x50
device_release_driver_internal+0x100/0x1c0
driver_detach+0x4c/0x8f
bus_remove_driver+0x5c/0xd0
driver_unregister+0x31/0x50
dax_pmem_exit+0x10/0xfe0 [dax_pmem]
Fixes: f1037ec0cc8a ("mm/memory_hotplug: fix remove_memory() lockdep splat")
Signed-off-by: Jia He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Dan Williams <[email protected]>
Cc: <[email protected]> [5.6+]
Cc: Andy Lutomirski <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chuhong Yuan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Kaly Xin <[email protected]>
Cc: Logan Gunthorpe <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/memory_hotplug.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1742,7 +1742,7 @@ static int __ref try_remove_memory(int n
*/
rc = walk_memory_blocks(start, size, NULL, check_memblock_offlined_cb);
if (rc)
- goto done;
+ return rc;
/* remove memmap entry */
firmware_map_remove(start, start + size, "System RAM");
@@ -1766,9 +1766,8 @@ static int __ref try_remove_memory(int n
try_offline_node(nid);
-done:
mem_hotplug_done();
- return rc;
+ return 0;
}
/**
From: Junxiao Bi <[email protected]>
commit 38d51b2dd171ad973afc1f5faab825ed05a2d5e9 upstream.
Dan Carpenter reported the following static checker warning.
fs/ocfs2/super.c:1269 ocfs2_parse_options() warn: '(-1)' 65535 can't fit into 32767 'mopt->slot'
fs/ocfs2/suballoc.c:859 ocfs2_init_inode_steal_slot() warn: '(-1)' 65535 can't fit into 32767 'osb->s_inode_steal_slot'
fs/ocfs2/suballoc.c:867 ocfs2_init_meta_steal_slot() warn: '(-1)' 65535 can't fit into 32767 'osb->s_meta_steal_slot'
That's because OCFS2_INVALID_SLOT is (u16)-1. Slot number in ocfs2 can be
never negative, so change s16 to u16.
Fixes: 9277f8334ffc ("ocfs2: fix value of OCFS2_INVALID_SLOT")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Junxiao Bi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Gang He <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/ocfs2/ocfs2.h | 4 ++--
fs/ocfs2/suballoc.c | 4 ++--
fs/ocfs2/super.c | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -327,8 +327,8 @@ struct ocfs2_super
spinlock_t osb_lock;
u32 s_next_generation;
unsigned long osb_flags;
- s16 s_inode_steal_slot;
- s16 s_meta_steal_slot;
+ u16 s_inode_steal_slot;
+ u16 s_meta_steal_slot;
atomic_t s_num_inodes_stolen;
atomic_t s_num_meta_stolen;
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -879,9 +879,9 @@ static void __ocfs2_set_steal_slot(struc
{
spin_lock(&osb->osb_lock);
if (type == INODE_ALLOC_SYSTEM_INODE)
- osb->s_inode_steal_slot = slot;
+ osb->s_inode_steal_slot = (u16)slot;
else if (type == EXTENT_ALLOC_SYSTEM_INODE)
- osb->s_meta_steal_slot = slot;
+ osb->s_meta_steal_slot = (u16)slot;
spin_unlock(&osb->osb_lock);
}
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -78,7 +78,7 @@ struct mount_options
unsigned long commit_interval;
unsigned long mount_opt;
unsigned int atime_quantum;
- signed short slot;
+ unsigned short slot;
int localalloc_opt;
unsigned int resv_level;
int dir_resv_level;
@@ -1349,7 +1349,7 @@ static int ocfs2_parse_options(struct su
goto bail;
}
if (option)
- mopt->slot = (s16)option;
+ mopt->slot = (u16)option;
break;
case Opt_commit:
if (match_int(&args[0], &option)) {
From: Paul Cercueil <[email protected]>
commit ca43f274e03f91c533643299ae4984965ce03205 upstream.
plane->index is NOT the index of the color plane in a YUV frame.
Actually, a YUV frame is represented by a single drm_plane, even though
it contains three Y, U, V planes.
v2-v3: No change
Cc: [email protected] # v5.3
Fixes: 90b86fcc47b4 ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs")
Signed-off-by: Paul Cercueil <[email protected]>
Reviewed-by: Sam Ravnborg <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/ingenic/ingenic-drm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/ingenic/ingenic-drm.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm.c
@@ -386,7 +386,7 @@ static void ingenic_drm_plane_atomic_upd
addr = drm_fb_cma_get_gem_addr(state->fb, state, 0);
width = state->src_w >> 16;
height = state->src_h >> 16;
- cpp = state->fb->format->cpp[plane->index];
+ cpp = state->fb->format->cpp[0];
priv->dma_hwdesc->addr = addr;
priv->dma_hwdesc->cmd = width * height * cpp / 4;
From: Peter Zijlstra <[email protected]>
commit 38cf307c1f2011d413750c5acb725456f47d9172 upstream.
For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable. This then presents the
following race condition:
CPU0 CPU1
flush_tlb_mm(mm) use_mm(mm)
<send-IPI>
tsk->active_mm = mm;
<IPI>
if (tsk->active_mm == mm)
// flush TLBs
</IPI>
switch_mm(old_mm,mm,tsk);
Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.
Avoid this by disabling IRQs across changing ->active_mm and
switch_mm().
Of the (SMP) architectures that have IPI based TLB invalidate:
Alpha - checks active_mm
ARC - ASID specific
IA64 - checks active_mm
MIPS - ASID specific flush
OpenRISC - shoots down world
PARISC - shoots down world
SH - ASID specific
SPARC - ASID specific
x86 - N/A
xtensa - checks active_mm
So at the very least Alpha, IA64 and Xtensa are suspect.
On top of this, for scheduler consistency we need at least preemption
disabled across changing tsk->mm and doing switch_mm(), which is
currently provided by task_lock(), but that's not sufficient for
PREEMPT_RT.
[[email protected]: add comment]
Reported-by: Andy Lutomirski <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/kthread.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1239,13 +1239,16 @@ void kthread_use_mm(struct mm_struct *mm
WARN_ON_ONCE(tsk->mm);
task_lock(tsk);
+ /* Hold off tlb flush IPIs while switching mm's */
+ local_irq_disable();
active_mm = tsk->active_mm;
if (active_mm != mm) {
mmgrab(mm);
tsk->active_mm = mm;
}
tsk->mm = mm;
- switch_mm(active_mm, mm, tsk);
+ switch_mm_irqs_off(active_mm, mm, tsk);
+ local_irq_enable();
task_unlock(tsk);
#ifdef finish_arch_post_lock_switch
finish_arch_post_lock_switch();
@@ -1274,9 +1277,11 @@ void kthread_unuse_mm(struct mm_struct *
task_lock(tsk);
sync_mm_rss(mm);
+ local_irq_disable();
tsk->mm = NULL;
/* active_mm is still 'mm' */
enter_lazy_tlb(mm, tsk);
+ local_irq_enable();
task_unlock(tsk);
}
EXPORT_SYMBOL_GPL(kthread_unuse_mm);
From: Steven Rostedt (VMware) <[email protected]>
commit afcab636657421f7ebfa0783a91f90256bba0091 upstream.
On exit, if a process is preempted after the trace_sched_process_exit()
tracepoint but before the process is done exiting, then when it gets
scheduled in, the function tracers will not filter it properly against the
function tracing pid filters.
That is because the function tracing pid filters hooks to the
sched_process_exit() tracepoint to remove the exiting task's pid from the
filter list. Because the filtering happens at the sched_switch tracepoint,
when the exiting task schedules back in to finish up the exit, it will no
longer be in the function pid filtering tables.
This was noticeable in the notrace self tests on a preemptable kernel, as
the tests would fail as it exits and preempted after being taken off the
notrace filter table and on scheduling back in it would not be in the
notrace list, and then the ending of the exit function would trace. The test
detected this and would fail.
Cc: [email protected]
Cc: Namhyung Kim <[email protected]>
Fixes: 1e10486ffee0a ("ftrace: Add 'function-fork' trace option")
Fixes: c37775d57830a ("tracing: Add infrastructure to allow set_event_pid to follow children"
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/trace/ftrace.c | 4 ++--
kernel/trace/trace_events.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6969,12 +6969,12 @@ void ftrace_pid_follow_fork(struct trace
if (enable) {
register_trace_sched_process_fork(ftrace_pid_follow_sched_process_fork,
tr);
- register_trace_sched_process_exit(ftrace_pid_follow_sched_process_exit,
+ register_trace_sched_process_free(ftrace_pid_follow_sched_process_exit,
tr);
} else {
unregister_trace_sched_process_fork(ftrace_pid_follow_sched_process_fork,
tr);
- unregister_trace_sched_process_exit(ftrace_pid_follow_sched_process_exit,
+ unregister_trace_sched_process_free(ftrace_pid_follow_sched_process_exit,
tr);
}
}
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -538,12 +538,12 @@ void trace_event_follow_fork(struct trac
if (enable) {
register_trace_prio_sched_process_fork(event_filter_pid_sched_process_fork,
tr, INT_MIN);
- register_trace_prio_sched_process_exit(event_filter_pid_sched_process_exit,
+ register_trace_prio_sched_process_free(event_filter_pid_sched_process_exit,
tr, INT_MAX);
} else {
unregister_trace_sched_process_fork(event_filter_pid_sched_process_fork,
tr);
- unregister_trace_sched_process_exit(event_filter_pid_sched_process_exit,
+ unregister_trace_sched_process_free(event_filter_pid_sched_process_exit,
tr);
}
}
From: Jason Gunthorpe <[email protected]>
[ Upstream commit 65936bf25f90fe440bb2d11624c7d10fab266639 ]
ipoib_mcast_carrier_on_task() insanely open codes a rtnl_lock() such that
the only time flush_workqueue() can be called is if it also clears
IPOIB_FLAG_OPER_UP.
Thus the flush inside ipoib_flush_ah() will deadlock if it gets unlucky
enough, and lockdep doesn't help us to find it early:
CPU0 CPU1 CPU2
__ipoib_ib_dev_flush()
down_read(vlan_rwsem)
ipoib_vlan_add()
rtnl_trylock()
down_write(vlan_rwsem)
ipoib_mcast_carrier_on_task()
while (!rtnl_trylock())
msleep(20);
ipoib_flush_ah()
flush_workqueue(priv->wq)
Clean up the ah_reaper related functions and lifecycle to make sense:
- Start/Stop of the reaper should only be done in open/stop NDOs, not in
any other places
- cancel and flush of the reaper should only happen in the stop NDO.
cancel is only functional when combined with IPOIB_STOP_REAPER.
- Non-stop places were flushing the AH's just need to flush out dead AH's
synchronously and ignore the background task completely. It is fully
locked and harmless to leave running.
Which ultimately fixes the ABBA deadlock by removing the unnecessary
flush_workqueue() from the problematic place under the vlan_rwsem.
Fixes: efc82eeeae4e ("IB/ipoib: No longer use flush as a parameter")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: Kamal Heib <[email protected]>
Tested-by: Kamal Heib <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 65 ++++++++++-------------
drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +
2 files changed, 31 insertions(+), 36 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 6ee64c25aaff4..494f413dc3c6c 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -670,13 +670,12 @@ int ipoib_send(struct net_device *dev, struct sk_buff *skb,
return rc;
}
-static void __ipoib_reap_ah(struct net_device *dev)
+static void ipoib_reap_dead_ahs(struct ipoib_dev_priv *priv)
{
- struct ipoib_dev_priv *priv = ipoib_priv(dev);
struct ipoib_ah *ah, *tah;
unsigned long flags;
- netif_tx_lock_bh(dev);
+ netif_tx_lock_bh(priv->dev);
spin_lock_irqsave(&priv->lock, flags);
list_for_each_entry_safe(ah, tah, &priv->dead_ahs, list)
@@ -687,37 +686,37 @@ static void __ipoib_reap_ah(struct net_device *dev)
}
spin_unlock_irqrestore(&priv->lock, flags);
- netif_tx_unlock_bh(dev);
+ netif_tx_unlock_bh(priv->dev);
}
void ipoib_reap_ah(struct work_struct *work)
{
struct ipoib_dev_priv *priv =
container_of(work, struct ipoib_dev_priv, ah_reap_task.work);
- struct net_device *dev = priv->dev;
- __ipoib_reap_ah(dev);
+ ipoib_reap_dead_ahs(priv);
if (!test_bit(IPOIB_STOP_REAPER, &priv->flags))
queue_delayed_work(priv->wq, &priv->ah_reap_task,
round_jiffies_relative(HZ));
}
-static void ipoib_flush_ah(struct net_device *dev)
+static void ipoib_start_ah_reaper(struct ipoib_dev_priv *priv)
{
- struct ipoib_dev_priv *priv = ipoib_priv(dev);
-
- cancel_delayed_work(&priv->ah_reap_task);
- flush_workqueue(priv->wq);
- ipoib_reap_ah(&priv->ah_reap_task.work);
+ clear_bit(IPOIB_STOP_REAPER, &priv->flags);
+ queue_delayed_work(priv->wq, &priv->ah_reap_task,
+ round_jiffies_relative(HZ));
}
-static void ipoib_stop_ah(struct net_device *dev)
+static void ipoib_stop_ah_reaper(struct ipoib_dev_priv *priv)
{
- struct ipoib_dev_priv *priv = ipoib_priv(dev);
-
set_bit(IPOIB_STOP_REAPER, &priv->flags);
- ipoib_flush_ah(dev);
+ cancel_delayed_work(&priv->ah_reap_task);
+ /*
+ * After ipoib_stop_ah_reaper() we always go through
+ * ipoib_reap_dead_ahs() which ensures the work is really stopped and
+ * does a final flush out of the dead_ah's list
+ */
}
static int recvs_pending(struct net_device *dev)
@@ -846,16 +845,6 @@ int ipoib_ib_dev_stop_default(struct net_device *dev)
return 0;
}
-void ipoib_ib_dev_stop(struct net_device *dev)
-{
- struct ipoib_dev_priv *priv = ipoib_priv(dev);
-
- priv->rn_ops->ndo_stop(dev);
-
- clear_bit(IPOIB_FLAG_INITIALIZED, &priv->flags);
- ipoib_flush_ah(dev);
-}
-
int ipoib_ib_dev_open_default(struct net_device *dev)
{
struct ipoib_dev_priv *priv = ipoib_priv(dev);
@@ -899,10 +888,7 @@ int ipoib_ib_dev_open(struct net_device *dev)
return -1;
}
- clear_bit(IPOIB_STOP_REAPER, &priv->flags);
- queue_delayed_work(priv->wq, &priv->ah_reap_task,
- round_jiffies_relative(HZ));
-
+ ipoib_start_ah_reaper(priv);
if (priv->rn_ops->ndo_open(dev)) {
pr_warn("%s: Failed to open dev\n", dev->name);
goto dev_stop;
@@ -913,13 +899,20 @@ int ipoib_ib_dev_open(struct net_device *dev)
return 0;
dev_stop:
- set_bit(IPOIB_STOP_REAPER, &priv->flags);
- cancel_delayed_work(&priv->ah_reap_task);
- set_bit(IPOIB_FLAG_INITIALIZED, &priv->flags);
- ipoib_ib_dev_stop(dev);
+ ipoib_stop_ah_reaper(priv);
return -1;
}
+void ipoib_ib_dev_stop(struct net_device *dev)
+{
+ struct ipoib_dev_priv *priv = ipoib_priv(dev);
+
+ priv->rn_ops->ndo_stop(dev);
+
+ clear_bit(IPOIB_FLAG_INITIALIZED, &priv->flags);
+ ipoib_stop_ah_reaper(priv);
+}
+
void ipoib_pkey_dev_check_presence(struct net_device *dev)
{
struct ipoib_dev_priv *priv = ipoib_priv(dev);
@@ -1230,7 +1223,7 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
ipoib_mcast_dev_flush(dev);
if (oper_up)
set_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
- ipoib_flush_ah(dev);
+ ipoib_reap_dead_ahs(priv);
}
if (level >= IPOIB_FLUSH_NORMAL)
@@ -1305,7 +1298,7 @@ void ipoib_ib_dev_cleanup(struct net_device *dev)
* the neighbor garbage collection is stopped and reaped.
* That should all be done now, so make a final ah flush.
*/
- ipoib_stop_ah(dev);
+ ipoib_reap_dead_ahs(priv);
clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 3cfb682b91b0a..ef60e8e4ae67b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1976,6 +1976,8 @@ static void ipoib_ndo_uninit(struct net_device *dev)
/* no more works over the priv->wq */
if (priv->wq) {
+ /* See ipoib_mcast_carrier_on_task() */
+ WARN_ON(test_bit(IPOIB_FLAG_OPER_UP, &priv->flags));
flush_workqueue(priv->wq);
destroy_workqueue(priv->wq);
priv->wq = NULL;
--
2.25.1
From: Hugh Dickins <[email protected]>
commit 119a5fc16105b2b9383a6e2a7800b2ef861b2975 upstream.
When retract_page_tables() removes a page table to make way for a huge
pmd, it holds huge page lock, i_mmap_lock_write, mmap_write_trylock and
pmd lock; but when collapse_pte_mapped_thp() does the same (to handle the
case when the original mmap_write_trylock had failed), only
mmap_write_trylock and pmd lock are held.
That's not enough. One machine has twice crashed under load, with "BUG:
spinlock bad magic" and GPF on 6b6b6b6b6b6b6b6b. Examining the second
crash, page_vma_mapped_walk_done()'s spin_unlock of pvmw->ptl (serving
page_referenced() on a file THP, that had found a page table at *pmd)
discovers that the page table page and its lock have already been freed by
the time it comes to unlock.
Follow the example of retract_page_tables(), but we only need one of huge
page lock or i_mmap_lock_write to secure against this: because it's the
narrower lock, and because it simplifies collapse_pte_mapped_thp() to know
the hpage earlier, choose to rely on huge page lock here.
Fixes: 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP")
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Song Liu <[email protected]>
Cc: <[email protected]> [5.4+]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/khugepaged.c | 44 +++++++++++++++++++-------------------------
1 file changed, 19 insertions(+), 25 deletions(-)
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1412,7 +1412,7 @@ void collapse_pte_mapped_thp(struct mm_s
{
unsigned long haddr = addr & HPAGE_PMD_MASK;
struct vm_area_struct *vma = find_vma(mm, haddr);
- struct page *hpage = NULL;
+ struct page *hpage;
pte_t *start_pte, *pte;
pmd_t *pmd, _pmd;
spinlock_t *ptl;
@@ -1432,9 +1432,17 @@ void collapse_pte_mapped_thp(struct mm_s
if (!hugepage_vma_check(vma, vma->vm_flags | VM_HUGEPAGE))
return;
+ hpage = find_lock_page(vma->vm_file->f_mapping,
+ linear_page_index(vma, haddr));
+ if (!hpage)
+ return;
+
+ if (!PageHead(hpage))
+ goto drop_hpage;
+
pmd = mm_find_pmd(mm, haddr);
if (!pmd)
- return;
+ goto drop_hpage;
start_pte = pte_offset_map_lock(mm, pmd, haddr, &ptl);
@@ -1453,30 +1461,11 @@ void collapse_pte_mapped_thp(struct mm_s
page = vm_normal_page(vma, addr, *pte);
- if (!page || !PageCompound(page))
- goto abort;
-
- if (!hpage) {
- hpage = compound_head(page);
- /*
- * The mapping of the THP should not change.
- *
- * Note that uprobe, debugger, or MAP_PRIVATE may
- * change the page table, but the new page will
- * not pass PageCompound() check.
- */
- if (WARN_ON(hpage->mapping != vma->vm_file->f_mapping))
- goto abort;
- }
-
/*
- * Confirm the page maps to the correct subpage.
- *
- * Note that uprobe, debugger, or MAP_PRIVATE may change
- * the page table, but the new page will not pass
- * PageCompound() check.
+ * Note that uprobe, debugger, or MAP_PRIVATE may change the
+ * page table, but the new page will not be a subpage of hpage.
*/
- if (WARN_ON(hpage + i != page))
+ if (hpage + i != page)
goto abort;
count++;
}
@@ -1495,7 +1484,7 @@ void collapse_pte_mapped_thp(struct mm_s
pte_unmap_unlock(start_pte, ptl);
/* step 3: set proper refcount and mm_counters. */
- if (hpage) {
+ if (count) {
page_ref_sub(hpage, count);
add_mm_counter(vma->vm_mm, mm_counter_file(hpage), -count);
}
@@ -1506,10 +1495,15 @@ void collapse_pte_mapped_thp(struct mm_s
spin_unlock(ptl);
mm_dec_nr_ptes(mm);
pte_free(mm, pmd_pgtable(_pmd));
+
+drop_hpage:
+ unlock_page(hpage);
+ put_page(hpage);
return;
abort:
pte_unmap_unlock(start_pte, ptl);
+ goto drop_hpage;
}
static int khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
From: Muchun Song <[email protected]>
commit 0cb2f1372baa60af8456388a574af6133edd7d80 upstream.
We found a case of kernel panic on our server. The stack trace is as
follows(omit some irrelevant information):
BUG: kernel NULL pointer dereference, address: 0000000000000080
RIP: 0010:kprobe_ftrace_handler+0x5e/0xe0
RSP: 0018:ffffb512c6550998 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8e9d16eea018 RCX: 0000000000000000
RDX: ffffffffbe1179c0 RSI: ffffffffc0535564 RDI: ffffffffc0534ec0
RBP: ffffffffc0534ec1 R08: ffff8e9d1bbb0f00 R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8e9d1f797060 R14: 000000000000bacc R15: ffff8e9ce13eca00
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000080 CR3: 00000008453d0005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
ftrace_ops_assist_func+0x56/0xe0
ftrace_call+0x5/0x34
tcpa_statistic_send+0x5/0x130 [ttcp_engine]
The tcpa_statistic_send is the function being kprobed. After analysis,
the root cause is that the fourth parameter regs of kprobe_ftrace_handler
is NULL. Why regs is NULL? We use the crash tool to analyze the kdump.
crash> dis tcpa_statistic_send -r
<tcpa_statistic_send>: callq 0xffffffffbd8018c0 <ftrace_caller>
The tcpa_statistic_send calls ftrace_caller instead of ftrace_regs_caller.
So it is reasonable that the fourth parameter regs of kprobe_ftrace_handler
is NULL. In theory, we should call the ftrace_regs_caller instead of the
ftrace_caller. After in-depth analysis, we found a reproducible path.
Writing a simple kernel module which starts a periodic timer. The
timer's handler is named 'kprobe_test_timer_handler'. The module
name is kprobe_test.ko.
1) insmod kprobe_test.ko
2) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}'
3) echo 0 > /proc/sys/kernel/ftrace_enabled
4) rmmod kprobe_test
5) stop step 2) kprobe
6) insmod kprobe_test.ko
7) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}'
We mark the kprobe as GONE but not disarm the kprobe in the step 4).
The step 5) also do not disarm the kprobe when unregister kprobe. So
we do not remove the ip from the filter. In this case, when the module
loads again in the step 6), we will replace the code to ftrace_caller
via the ftrace_module_enable(). When we register kprobe again, we will
not replace ftrace_caller to ftrace_regs_caller because the ftrace is
disabled in the step 3). So the step 7) will trigger kernel panic. Fix
this problem by disarming the kprobe when the module is going away.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: ae6aa16fdc16 ("kprobes: introduce ftrace based optimization")
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Muchun Song <[email protected]>
Co-developed-by: Chengming Zhou <[email protected]>
Signed-off-by: Chengming Zhou <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/kprobes.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2113,6 +2113,13 @@ static void kill_kprobe(struct kprobe *p
* the original probed function (which will be freed soon) any more.
*/
arch_remove_kprobe(p);
+
+ /*
+ * The module is going away. We should disarm the kprobe which
+ * is using ftrace.
+ */
+ if (kprobe_ftrace(p))
+ disarm_kprobe_ftrace(p);
}
/* Disable one kprobe */
From: Kees Cook <[email protected]>
commit d9539752d23283db4692384a634034f451261e29 upstream.
Add missed sock updates to compat path via a new helper, which will be
used more in coming patches. (The net/core/scm.c code is left as-is here
to assist with -stable backports for the compat path.)
Cc: Christoph Hellwig <[email protected]>
Cc: Sargun Dhillon <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: [email protected]
Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
Acked-by: Christian Brauner <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/net/sock.h | 4 ++++
net/compat.c | 1 +
net/core/sock.c | 21 +++++++++++++++++++++
3 files changed, 26 insertions(+)
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -891,6 +891,8 @@ static inline int sk_memalloc_socks(void
{
return static_branch_unlikely(&memalloc_socks_key);
}
+
+void __receive_sock(struct file *file);
#else
static inline int sk_memalloc_socks(void)
@@ -898,6 +900,8 @@ static inline int sk_memalloc_socks(void
return 0;
}
+static inline void __receive_sock(struct file *file)
+{ }
#endif
static inline gfp_t sk_gfp_mask(const struct sock *sk, gfp_t gfp_mask)
--- a/net/compat.c
+++ b/net/compat.c
@@ -309,6 +309,7 @@ void scm_detach_fds_compat(struct msghdr
break;
}
/* Bump the usage count and install the file. */
+ __receive_sock(fp[i]);
fd_install(new_fd, get_file(fp[i]));
}
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2842,6 +2842,27 @@ int sock_no_mmap(struct file *file, stru
}
EXPORT_SYMBOL(sock_no_mmap);
+/*
+ * When a file is received (via SCM_RIGHTS, etc), we must bump the
+ * various sock-based usage counts.
+ */
+void __receive_sock(struct file *file)
+{
+ struct socket *sock;
+ int error;
+
+ /*
+ * The resulting value of "error" is ignored here since we only
+ * need to take action when the file is a socket and testing
+ * "sock" for NULL is sufficient.
+ */
+ sock = sock_from_file(file, &error);
+ if (sock) {
+ sock_update_netprioidx(&sock->sk->sk_cgrp_data);
+ sock_update_classid(&sock->sk->sk_cgrp_data);
+ }
+}
+
ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags)
{
ssize_t res;
From: Chengming Zhou <[email protected]>
commit 8a224ffb3f52b0027f6b7279854c71a31c48fc97 upstream.
When module loaded and enabled, we will use __ftrace_replace_code
for module if any ftrace_ops referenced it found. But we will get
wrong ftrace_addr for module rec in ftrace_get_addr_new, because
rec->flags has not been setup correctly. It can cause the callback
function of a ftrace_ops has FTRACE_OPS_FL_SAVE_REGS to be called
with pt_regs set to NULL.
So setup correct FTRACE_FL_REGS flags for rec when we call
referenced_filters to find ftrace_ops references it.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 8c4f3c3fa9681 ("ftrace: Check module functions being traced on reload")
Signed-off-by: Chengming Zhou <[email protected]>
Signed-off-by: Muchun Song <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/trace/ftrace.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6187,8 +6187,11 @@ static int referenced_filters(struct dyn
int cnt = 0;
for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
- if (ops_references_rec(ops, rec))
- cnt++;
+ if (ops_references_rec(ops, rec)) {
+ cnt++;
+ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
+ rec->flags |= FTRACE_FL_REGS;
+ }
}
return cnt;
@@ -6367,8 +6370,8 @@ void ftrace_module_enable(struct module
if (ftrace_start_up)
cnt += referenced_filters(rec);
- /* This clears FTRACE_FL_DISABLED */
- rec->flags = cnt;
+ rec->flags &= ~FTRACE_FL_DISABLED;
+ rec->flags += cnt;
if (ftrace_start_up && cnt) {
int failed = __ftrace_replace_code(rec, 1);
From: David Hildenbrand <[email protected]>
commit 4a93025cbe4a0b19d1a25a2d763a3d2018bad0d9 upstream.
Especially with memory hotplug, we can have offline sections (with a
garbage memmap) and overlapping zones. We have to make sure to only touch
initialized memmaps (online sections managed by the buddy) and that the
zone matches, to not move pages between zones.
To test if this can actually happen, I added a simple
BUG_ON(page_zone(page_i) != page_zone(page_j));
right before the swap. When hotplugging a 256M DIMM to a 4G x86-64 VM and
onlining the first memory block "online_movable" and the second memory
block "online_kernel", it will trigger the BUG, as both zones (NORMAL and
MOVABLE) overlap.
This might result in all kinds of weird situations (e.g., double
allocations, list corruptions, unmovable allocations ending up in the
movable zone).
Fixes: e900a918b098 ("mm: shuffle initial free memory to improve memory-side-cache utilization")
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Wei Yang <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Dan Williams <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: <[email protected]> [5.2+]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/shuffle.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -58,25 +58,25 @@ module_param_call(shuffle, shuffle_store
* For two pages to be swapped in the shuffle, they must be free (on a
* 'free_area' lru), have the same order, and have the same migratetype.
*/
-static struct page * __meminit shuffle_valid_page(unsigned long pfn, int order)
+static struct page * __meminit shuffle_valid_page(struct zone *zone,
+ unsigned long pfn, int order)
{
- struct page *page;
+ struct page *page = pfn_to_online_page(pfn);
/*
* Given we're dealing with randomly selected pfns in a zone we
* need to ask questions like...
*/
- /* ...is the pfn even in the memmap? */
- if (!pfn_valid_within(pfn))
+ /* ... is the page managed by the buddy? */
+ if (!page)
return NULL;
- /* ...is the pfn in a present section or a hole? */
- if (!pfn_in_present_section(pfn))
+ /* ... is the page assigned to the same zone? */
+ if (page_zone(page) != zone)
return NULL;
/* ...is the page free and currently on a free_area list? */
- page = pfn_to_page(pfn);
if (!PageBuddy(page))
return NULL;
@@ -123,7 +123,7 @@ void __meminit __shuffle_zone(struct zon
* page_j randomly selected in the span @zone_start_pfn to
* @spanned_pages.
*/
- page_i = shuffle_valid_page(i, order);
+ page_i = shuffle_valid_page(z, i, order);
if (!page_i)
continue;
@@ -137,7 +137,7 @@ void __meminit __shuffle_zone(struct zon
j = z->zone_start_pfn +
ALIGN_DOWN(get_random_long() % z->spanned_pages,
order_pages);
- page_j = shuffle_valid_page(j, order);
+ page_j = shuffle_valid_page(z, j, order);
if (page_j && page_j != page_i)
break;
}
From: Michal Koutný <[email protected]>
commit a6f23d14ec7d7d02220ad8bb2774be3322b9aeec upstream.
When workload runs in cgroups that aren't directly below root cgroup and
their parent specifies reclaim protection, it may end up ineffective.
The reason is that propagate_protected_usage() is not called in all
hierarchy up. All the protected usage is incorrectly accumulated in the
workload's parent. This means that siblings_low_usage is overestimated
and effective protection underestimated. Even though it is transitional
phenomenon (uncharge path does correct propagation and fixes the wrong
children_low_usage), it can undermine the intended protection
unexpectedly.
We have noticed this problem while seeing a swap out in a descendant of a
protected memcg (intermediate node) while the parent was conveniently
under its protection limit and the memory pressure was external to that
hierarchy. Michal has pinpointed this down to the wrong
siblings_low_usage which led to the unwanted reclaim.
The fix is simply updating children_low_usage in respective ancestors also
in the charging path.
Fixes: 230671533d64 ("mm: memory.low hierarchical behavior")
Signed-off-by: Michal Koutný <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: <[email protected]> [4.18+]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/page_counter.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -72,7 +72,7 @@ void page_counter_charge(struct page_cou
long new;
new = atomic_long_add_return(nr_pages, &c->usage);
- propagate_protected_usage(counter, new);
+ propagate_protected_usage(c, new);
/*
* This is indeed racy, but we can live with some
* inaccuracy in the watermark.
@@ -116,7 +116,7 @@ bool page_counter_try_charge(struct page
new = atomic_long_add_return(nr_pages, &c->usage);
if (new > c->max) {
atomic_long_sub(nr_pages, &c->usage);
- propagate_protected_usage(counter, new);
+ propagate_protected_usage(c, new);
/*
* This is racy, but we can live with some
* inaccuracy in the failcnt.
@@ -125,7 +125,7 @@ bool page_counter_try_charge(struct page
*fail = c;
goto failed;
}
- propagate_protected_usage(counter, new);
+ propagate_protected_usage(c, new);
/*
* Just like with failcnt, we can live with some
* inaccuracy in the watermark.
From: Mike Kravetz <[email protected]>
commit 34ae204f18519f0920bd50a644abd6fefc8dbfcf upstream.
Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization") requires callers of huge_pte_alloc to hold i_mmap_rwsem
in at least read mode. This is because the explicit locking in
huge_pmd_share (called by huge_pte_alloc) was removed. When restructuring
the code, the call to huge_pte_alloc in the else block at the beginning of
hugetlb_fault was missed.
Unfortunately, that else clause is exercised when there is no page table
entry. This will likely lead to a call to huge_pmd_share. If
huge_pmd_share thinks pmd sharing is possible, it will traverse the
mapping tree (i_mmap) without holding i_mmap_rwsem. If someone else is
modifying the tree, bad things such as addressing exceptions or worse
could happen.
Simply remove the else clause. It should have been removed previously.
The code following the else will call huge_pte_alloc with the appropriate
locking.
To prevent this type of issue in the future, add routines to assert that
i_mmap_rwsem is held, and call these routines in huge pmd sharing
routines.
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Suggested-by: Matthew Wilcox <[email protected]>
Signed-off-by: Mike Kravetz <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Kirill A.Shutemov" <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Prakash Sangappa <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/fs.h | 10 ++++++++++
include/linux/hugetlb.h | 8 +++++---
mm/hugetlb.c | 15 +++++++--------
mm/rmap.c | 2 +-
4 files changed, 23 insertions(+), 12 deletions(-)
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -549,6 +549,16 @@ static inline void i_mmap_unlock_read(st
up_read(&mapping->i_mmap_rwsem);
}
+static inline void i_mmap_assert_locked(struct address_space *mapping)
+{
+ lockdep_assert_held(&mapping->i_mmap_rwsem);
+}
+
+static inline void i_mmap_assert_write_locked(struct address_space *mapping)
+{
+ lockdep_assert_held_write(&mapping->i_mmap_rwsem);
+}
+
/*
* Might pages of this file be mapped into userspace?
*/
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -164,7 +164,8 @@ pte_t *huge_pte_alloc(struct mm_struct *
unsigned long addr, unsigned long sz);
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz);
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep);
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep);
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end);
struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
@@ -203,8 +204,9 @@ static inline struct address_space *huge
return NULL;
}
-static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr,
- pte_t *ptep)
+static inline int huge_pmd_unshare(struct mm_struct *mm,
+ struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
return 0;
}
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3952,7 +3952,7 @@ void __unmap_hugepage_range(struct mmu_g
continue;
ptl = huge_pte_lock(h, mm, ptep);
- if (huge_pmd_unshare(mm, &address, ptep)) {
+ if (huge_pmd_unshare(mm, vma, &address, ptep)) {
spin_unlock(ptl);
/*
* We just unmapped a page of PMDs by clearing a PUD.
@@ -4539,10 +4539,6 @@ vm_fault_t hugetlb_fault(struct mm_struc
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
VM_FAULT_SET_HINDEX(hstate_index(h));
- } else {
- ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
- if (!ptep)
- return VM_FAULT_OOM;
}
/*
@@ -5019,7 +5015,7 @@ unsigned long hugetlb_change_protection(
if (!ptep)
continue;
ptl = huge_pte_lock(h, mm, ptep);
- if (huge_pmd_unshare(mm, &address, ptep)) {
+ if (huge_pmd_unshare(mm, vma, &address, ptep)) {
pages++;
spin_unlock(ptl);
shared_pmd = true;
@@ -5400,12 +5396,14 @@ out:
* returns: 1 successfully unmapped a shared pte page
* 0 the underlying pte page is not shared, or it is the last user
*/
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
pgd_t *pgd = pgd_offset(mm, *addr);
p4d_t *p4d = p4d_offset(pgd, *addr);
pud_t *pud = pud_offset(p4d, *addr);
+ i_mmap_assert_write_locked(vma->vm_file->f_mapping);
BUG_ON(page_count(virt_to_page(ptep)) == 0);
if (page_count(virt_to_page(ptep)) == 1)
return 0;
@@ -5423,7 +5421,8 @@ pte_t *huge_pmd_share(struct mm_struct *
return NULL;
}
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
return 0;
}
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1469,7 +1469,7 @@ static bool try_to_unmap_one(struct page
* do this outside rmap routines.
*/
VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
- if (huge_pmd_unshare(mm, &address, pvmw.pte)) {
+ if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
/*
* huge_pmd_unshare unmapped an entire PMD
* page. There is no way of knowing exactly
From: Hugh Dickins <[email protected]>
commit 18e77600f7a1ed69f8ce46c9e11cad0985712dfa upstream.
Only once have I seen this scenario (and forgot even to notice what forced
the eventual crash): a sequence of "BUG: Bad page map" alerts from
vm_normal_page(), from zap_pte_range() servicing exit_mmap();
pmd:00000000, pte values corresponding to data in physical page 0.
The pte mappings being zapped in this case were supposed to be from a huge
page of ext4 text (but could as well have been shmem): my belief is that
it was racing with collapse_file()'s retract_page_tables(), found *pmd
pointing to a page table, locked it, but *pmd had become 0 by the time
start_pte was decided.
In most cases, that possibility is excluded by holding mmap lock; but
exit_mmap() proceeds without mmap lock. Most of what's run by khugepaged
checks khugepaged_test_exit() after acquiring mmap lock:
khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
for example. But retract_page_tables() did not: fix that.
The fix is for retract_page_tables() to check khugepaged_test_exit(),
after acquiring mmap lock, before doing anything to the page table.
Getting the mmap lock serializes with __mmput(), which briefly takes and
drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
mm_users makes sure we don't touch the page table once exit_mmap() might
reach it, since exit_mmap() will be proceeding without mmap lock, not
expecting anyone to be racing with it.
Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Song Liu <[email protected]>
Cc: <[email protected]> [4.8+]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
mm/khugepaged.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,6 +1532,7 @@ out:
static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
{
struct vm_area_struct *vma;
+ struct mm_struct *mm;
unsigned long addr;
pmd_t *pmd, _pmd;
@@ -1560,7 +1561,8 @@ static void retract_page_tables(struct a
continue;
if (vma->vm_end < addr + HPAGE_PMD_SIZE)
continue;
- pmd = mm_find_pmd(vma->vm_mm, addr);
+ mm = vma->vm_mm;
+ pmd = mm_find_pmd(mm, addr);
if (!pmd)
continue;
/*
@@ -1570,17 +1572,19 @@ static void retract_page_tables(struct a
* mmap_lock while holding page lock. Fault path does it in
* reverse order. Trylock is a way to avoid deadlock.
*/
- if (mmap_write_trylock(vma->vm_mm)) {
- spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);
- /* assume page table is clear */
- _pmd = pmdp_collapse_flush(vma, addr, pmd);
- spin_unlock(ptl);
- mmap_write_unlock(vma->vm_mm);
- mm_dec_nr_ptes(vma->vm_mm);
- pte_free(vma->vm_mm, pmd_pgtable(_pmd));
+ if (mmap_write_trylock(mm)) {
+ if (!khugepaged_test_exit(mm)) {
+ spinlock_t *ptl = pmd_lock(mm, pmd);
+ /* assume page table is clear */
+ _pmd = pmdp_collapse_flush(vma, addr, pmd);
+ spin_unlock(ptl);
+ mm_dec_nr_ptes(mm);
+ pte_free(mm, pmd_pgtable(_pmd));
+ }
+ mmap_write_unlock(mm);
} else {
/* Try again later */
- khugepaged_add_pte_mapped_thp(vma->vm_mm, addr);
+ khugepaged_add_pte_mapped_thp(mm, addr);
}
}
i_mmap_unlock_write(mapping);
From: Mike Rapoport <[email protected]>
commit 6c86a3029ce3b44597526909f2e39a77a497f640 upstream.
When a configuration has NUMA disabled and SGI_IP27 enabled, the build
fails:
CC kernel/bounds.s
CC arch/mips/kernel/asm-offsets.s
In file included from arch/mips/include/asm/topology.h:11,
from include/linux/topology.h:36,
from include/linux/gfp.h:9,
from include/linux/slab.h:15,
from include/linux/crypto.h:19,
from include/crypto/hash.h:11,
from include/linux/uio.h:10,
from include/linux/socket.h:8,
from include/linux/compat.h:15,
from arch/mips/kernel/asm-offsets.c:12:
include/linux/topology.h: In function 'numa_node_id':
arch/mips/include/asm/mach-ip27/topology.h:16:27: error: implicit declaration of function 'cputonasid'; did you mean 'cpu_vpe_id'? [-Werror=implicit-function-declaration]
#define cpu_to_node(cpu) (cputonasid(cpu))
^~~~~~~~~~
include/linux/topology.h:119:9: note: in expansion of macro 'cpu_to_node'
return cpu_to_node(raw_smp_processor_id());
^~~~~~~~~~~
include/linux/topology.h: In function 'cpu_cpu_mask':
arch/mips/include/asm/mach-ip27/topology.h:19:7: error: implicit declaration of function 'hub_data' [-Werror=implicit-function-declaration]
&hub_data(node)->h_cpus)
^~~~~~~~
include/linux/topology.h:210:9: note: in expansion of macro 'cpumask_of_node'
return cpumask_of_node(cpu_to_node(cpu));
^~~~~~~~~~~~~~~
arch/mips/include/asm/mach-ip27/topology.h:19:21: error: invalid type argument of '->' (have 'int')
&hub_data(node)->h_cpus)
^~
include/linux/topology.h:210:9: note: in expansion of macro 'cpumask_of_node'
return cpumask_of_node(cpu_to_node(cpu));
^~~~~~~~~~~~~~~
Before switch from discontigmem to sparsemem, there always was
CONFIG_NEED_MULTIPLE_NODES=y because it was selected by DISCONTIGMEM.
Without DISCONTIGMEM it is possible to have SPARSEMEM without NUMA for
SGI_IP27 and as many things there rely on custom node definition, the
build breaks.
As Thomas noted "... there are right now too many places in IP27 code,
which assumes NUMA enabled", the simplest solution would be to always
enable NUMA for SGI-IP27 builds.
Reported-by: kernel test robot <[email protected]>
Fixes: 397dc00e249e ("mips: sgi-ip27: switch from DISCONTIGMEM to SPARSEMEM")
Cc: [email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/Kconfig | 1 +
1 file changed, 1 insertion(+)
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -678,6 +678,7 @@ config SGI_IP27
select SYS_SUPPORTS_NUMA
select SYS_SUPPORTS_SMP
select MIPS_L1_CACHE_SHIFT_7
+ select NUMA
help
This are the SGI Origin 200, Origin 2000 and Onyx 2 Graphics
workstations. To compile a Linux kernel that runs on these, say Y
From: Johannes Berg <[email protected]>
commit 5981fe5b0529ba25d95f37d7faa434183ad618c5 upstream.
This never was intended to be a 'while' loop, it should've
just been an 'if' instead of 'while'. Fix this.
I noticed this while applying another patch from Ben that
intended to fix a busy loop at this spot.
Cc: [email protected]
Fixes: b16798f5b907 ("mac80211: mark station unauthorized before key removal")
Reported-by: Ben Greear <[email protected]>
Link: https://lore.kernel.org/r/20200803110209.253009ae41ff.I3522aad099392b31d5cf2dcca34cbac7e5832dde@changeid
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mac80211/sta_info.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1050,7 +1050,7 @@ static void __sta_info_destroy_part2(str
might_sleep();
lockdep_assert_held(&local->sta_mtx);
- while (sta->sta_state == IEEE80211_STA_AUTHORIZED) {
+ if (sta->sta_state == IEEE80211_STA_AUTHORIZED) {
ret = sta_info_move_state(sta, IEEE80211_STA_ASSOC);
WARN_ON_ONCE(ret);
}
From: Qu Wenruo <[email protected]>
commit 1e6e238c3002ea3611465ce5f32777ddd6a40126 upstream.
[BUG]
There is a bug report of NULL pointer dereference caused in
compress_file_extent():
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Workqueue: btrfs-delalloc btrfs_delalloc_helper [btrfs]
NIP [c008000006dd4d34] compress_file_range.constprop.41+0x75c/0x8a0 [btrfs]
LR [c008000006dd4d1c] compress_file_range.constprop.41+0x744/0x8a0 [btrfs]
Call Trace:
[c000000c69093b00] [c008000006dd4d1c] compress_file_range.constprop.41+0x744/0x8a0 [btrfs] (unreliable)
[c000000c69093bd0] [c008000006dd4ebc] async_cow_start+0x44/0xa0 [btrfs]
[c000000c69093c10] [c008000006e14824] normal_work_helper+0xdc/0x598 [btrfs]
[c000000c69093c80] [c0000000001608c0] process_one_work+0x2c0/0x5b0
[c000000c69093d10] [c000000000160c38] worker_thread+0x88/0x660
[c000000c69093db0] [c00000000016b55c] kthread+0x1ac/0x1c0
[c000000c69093e20] [c00000000000b660] ret_from_kernel_thread+0x5c/0x7c
---[ end trace f16954aa20d822f6 ]---
[CAUSE]
For the following execution route of compress_file_range(), it's
possible to hit NULL pointer dereference:
compress_file_extent()
|- pages = NULL;
|- start = async_chunk->start = 0;
|- end = async_chunk = 4095;
|- nr_pages = 1;
|- inode_need_compress() == false; <<< Possible, see later explanation
| Now, we have nr_pages = 1, pages = NULL
|- cont:
|- ret = cow_file_range_inline();
|- if (ret <= 0) {
|- for (i = 0; i < nr_pages; i++) {
|- WARN_ON(pages[i]->mapping); <<< Crash
To enter above call execution branch, we need the following race:
Thread 1 (chattr) | Thread 2 (writeback)
--------------------------+------------------------------
| btrfs_run_delalloc_range
| |- inode_need_compress = true
| |- cow_file_range_async()
btrfs_ioctl_set_flag() |
|- binode_flags |= |
BTRFS_INODE_NOCOMPRESS |
| compress_file_range()
| |- inode_need_compress = false
| |- nr_page = 1 while pages = NULL
| | Then hit the crash
[FIX]
This patch will fix it by checking @pages before doing accessing it.
This patch is only designed as a hot fix and easy to backport.
More elegant fix may make btrfs only check inode_need_compress() once to
avoid such race, but that would be another story.
Reported-by: Luciano Chavez <[email protected]>
Fixes: 4d3a800ebb12 ("btrfs: merge nr_pages input and output parameter in compress_pages")
CC: [email protected] # 4.14.x: cecc8d9038d16: btrfs: Move free_pages_out label in inline extent handling branch in compress_file_range
CC: [email protected] # 4.14+
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/inode.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -650,12 +650,18 @@ cont:
page_error_op |
PAGE_END_WRITEBACK);
- for (i = 0; i < nr_pages; i++) {
- WARN_ON(pages[i]->mapping);
- put_page(pages[i]);
+ /*
+ * Ensure we only free the compressed pages if we have
+ * them allocated, as we can still reach here with
+ * inode_need_compress() == false.
+ */
+ if (pages) {
+ for (i = 0; i < nr_pages; i++) {
+ WARN_ON(pages[i]->mapping);
+ put_page(pages[i]);
+ }
+ kfree(pages);
}
- kfree(pages);
-
return 0;
}
}
From: Jonathan McDowell <[email protected]>
commit df43dd526e6609769ae513a81443c7aa727c8ca3 upstream.
The IPQ806x does not appear to have a functional multicast ethernet
address filter. This was observed as a failure to correctly receive IPv6
packets on a LAN to the all stations address. Checking the vendor driver
shows that it does not attempt to enable the multicast filter and
instead falls back to receiving all multicast packets, internally
setting ALLMULTI.
Use the new fallback support in the dwmac1000 driver to correctly
achieve the same with the mainline IPQ806x driver. Confirmed to fix IPv6
functionality on an RB3011 router.
Cc: [email protected]
Signed-off-by: Jonathan McDowell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c
@@ -351,6 +351,7 @@ static int ipq806x_gmac_probe(struct pla
plat_dat->has_gmac = true;
plat_dat->bsp_priv = gmac;
plat_dat->fix_mac_speed = ipq806x_gmac_fix_mac_speed;
+ plat_dat->multicast_filter_bins = 0;
err = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res);
if (err)
From: Paul Cercueil <[email protected]>
commit 84e7a946da71f678affacea301f6d5cb4d9784e8 upstream.
The PAT1 register contains information about the IRQ type (edge/level)
for input GPIOs with IRQ enabled, and the direction for non-IRQ GPIOs.
So it makes sense to read it only if the GPIO has no interrupt
configured, otherwise input GPIOs configured for level IRQs are
misdetected as output GPIOs.
Fixes: ebd6651418b6 ("pinctrl: ingenic: Implement .get_direction for GPIO chips")
Reported-by: João Henrique <[email protected]>
Signed-off-by: Paul Cercueil <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pinctrl/pinctrl-ingenic.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/pinctrl/pinctrl-ingenic.c
+++ b/drivers/pinctrl/pinctrl-ingenic.c
@@ -1955,7 +1955,8 @@ static int ingenic_gpio_get_direction(st
unsigned int pin = gc->base + offset;
if (jzpc->info->version >= ID_JZ4760) {
- if (ingenic_get_pin_config(jzpc, pin, JZ4760_GPIO_PAT1))
+ if (ingenic_get_pin_config(jzpc, pin, JZ4760_GPIO_INT) ||
+ ingenic_get_pin_config(jzpc, pin, JZ4760_GPIO_PAT1))
return GPIO_LINE_DIRECTION_IN;
return GPIO_LINE_DIRECTION_OUT;
}
From: Alexandru Ardelean <[email protected]>
commit 65afb0932a81c1de719ceee0db0b276094b10ac8 upstream.
There are 2 exit paths where the lock isn't held, but try to unlock the
mutex when exiting. In these places we should just return from the
function.
A neater approach would be to cleanup the ad5592r_read_raw(), but that
would make this patch more difficult to backport to stable versions.
Fixes 56ca9db862bf3: ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs")
Reported-by: Charles Stanhope <[email protected]>
Signed-off-by: Alexandru Ardelean <[email protected]>
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/iio/dac/ad5592r-base.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iio/dac/ad5592r-base.c
+++ b/drivers/iio/dac/ad5592r-base.c
@@ -413,7 +413,7 @@ static int ad5592r_read_raw(struct iio_d
s64 tmp = *val * (3767897513LL / 25LL);
*val = div_s64_rem(tmp, 1000000000LL, val2);
- ret = IIO_VAL_INT_PLUS_MICRO;
+ return IIO_VAL_INT_PLUS_MICRO;
} else {
int mult;
@@ -444,7 +444,7 @@ static int ad5592r_read_raw(struct iio_d
ret = IIO_VAL_INT;
break;
default:
- ret = -EINVAL;
+ return -EINVAL;
}
unlock:
From: Eugeniu Rosca <[email protected]>
commit c92d30e4b78dc331909f8c6056c2792aa14e2166 upstream.
In commit f3b98e3c4d2e16 ("media: vsp1: Provide support for extended
command pools"), the vsp pointer used for referencing the VSP1 device
structure from a command pool during vsp1_dl_ext_cmd_pool_destroy() was
not populated.
Correctly assign the pointer to prevent the following
null-pointer-dereference when removing the device:
[*] h3ulcb-kf #>
echo fea28000.vsp > /sys/bus/platform/devices/fea28000.vsp/driver/unbind
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000007318be000
[0000000000000028] pgd=00000007333a1003, pud=00000007333a6003, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 486 Comm: sh Not tainted 5.7.0-rc6-arm64-renesas-00118-ge644645abf47 #185
Hardware name: Renesas H3ULCB Kingfisher board based on r8a77951 (DT)
pstate: 40000005 (nZcv daif -PAN -UAO)
pc : vsp1_dlm_destroy+0xe4/0x11c
lr : vsp1_dlm_destroy+0xc8/0x11c
sp : ffff800012963b60
x29: ffff800012963b60 x28: ffff0006f83fc440
x27: 0000000000000000 x26: ffff0006f5e13e80
x25: ffff0006f5e13ed0 x24: ffff0006f5e13ed0
x23: ffff0006f5e13ed0 x22: dead000000000122
x21: ffff0006f5e3a080 x20: ffff0006f5df2938
x19: ffff0006f5df2980 x18: 0000000000000003
x17: 0000000000000000 x16: 0000000000000016
x15: 0000000000000003 x14: 00000000000393c0
x13: ffff800011a5ec18 x12: ffff800011d8d000
x11: ffff0006f83fcc68 x10: ffff800011a53d70
x9 : ffff8000111f3000 x8 : 0000000000000000
x7 : 0000000000210d00 x6 : 0000000000000000
x5 : ffff800010872e60 x4 : 0000000000000004
x3 : 0000000078068000 x2 : ffff800012781000
x1 : 0000000000002c00 x0 : 0000000000000000
Call trace:
vsp1_dlm_destroy+0xe4/0x11c
vsp1_wpf_destroy+0x10/0x20
vsp1_entity_destroy+0x24/0x4c
vsp1_destroy_entities+0x54/0x130
vsp1_remove+0x1c/0x40
platform_drv_remove+0x28/0x50
__device_release_driver+0x178/0x220
device_driver_detach+0x44/0xc0
unbind_store+0xe0/0x104
drv_attr_store+0x20/0x30
sysfs_kf_write+0x48/0x70
kernfs_fop_write+0x148/0x230
__vfs_write+0x18/0x40
vfs_write+0xdc/0x1c4
ksys_write+0x68/0xf0
__arm64_sys_write+0x18/0x20
el0_svc_common.constprop.0+0x70/0x170
do_el0_svc+0x20/0x80
el0_sync_handler+0x134/0x1b0
el0_sync+0x140/0x180
Code: b40000c2 f9403a60 d2800084 a9400663 (f9401400)
---[ end trace 3875369841fb288a ]---
Fixes: f3b98e3c4d2e16 ("media: vsp1: Provide support for extended command pools")
Cc: [email protected] # v4.19+
Signed-off-by: Eugeniu Rosca <[email protected]>
Reviewed-by: Kieran Bingham <[email protected]>
Tested-by: Kieran Bingham <[email protected]>
Reviewed-by: Laurent Pinchart <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/platform/vsp1/vsp1_dl.c | 2 ++
1 file changed, 2 insertions(+)
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -431,6 +431,8 @@ vsp1_dl_cmd_pool_create(struct vsp1_devi
if (!pool)
return NULL;
+ pool->vsp1 = vsp1;
+
spin_lock_init(&pool->lock);
INIT_LIST_HEAD(&pool->free);
From: Vincent Duvert <[email protected]>
commit d0f6ba2ef2c1c95069509e71402e7d6d43452512 upstream.
Add a missing return statement to atalk_proc_init so it doesn't return
-ENOMEM when successful. This allows the appletalk module to load
properly.
Fixes: e2bcd8b0ce6e ("appletalk: use remove_proc_subtree to simplify procfs code")
Link: https://www.downtowndougbrown.com/2020/08/hacking-up-a-fix-for-the-broken-appletalk-kernel-module-in-linux-5-1-and-newer/
Reported-by: Christopher KOBAYASHI <[email protected]>
Reported-by: Doug Brown <[email protected]>
Signed-off-by: Vincent Duvert <[email protected]>
[lukas: add missing tags]
Signed-off-by: Lukas Wunner <[email protected]>
Cc: [email protected] # v5.1+
Cc: Yue Haibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/appletalk/atalk_proc.c | 2 ++
1 file changed, 2 insertions(+)
--- a/net/appletalk/atalk_proc.c
+++ b/net/appletalk/atalk_proc.c
@@ -229,6 +229,8 @@ int __init atalk_proc_init(void)
sizeof(struct aarp_iter_state), NULL))
goto out;
+ return 0;
+
out:
remove_proc_subtree("atalk", init_net.proc_net);
return -ENOMEM;
From: Paul Aurich <[email protected]>
commit baf57b56d3604880ccb3956ec6c62ea894f5de99 upstream.
Handling a lease break for the cached root didn't free the
smb2_lease_break_work allocation, resulting in a leak:
unreferenced object 0xffff98383a5af480 (size 128):
comm "cifsd", pid 684, jiffies 4294936606 (age 534.868s)
hex dump (first 32 bytes):
c0 ff ff ff 1f 00 00 00 88 f4 5a 3a 38 98 ff ff ..........Z:8...
88 f4 5a 3a 38 98 ff ff 80 88 d6 8a ff ff ff ff ..Z:8...........
backtrace:
[<0000000068957336>] smb2_is_valid_oplock_break+0x1fa/0x8c0
[<0000000073b70b9e>] cifs_demultiplex_thread+0x73d/0xcc0
[<00000000905fa372>] kthread+0x11c/0x150
[<0000000079378e4e>] ret_from_fork+0x22/0x30
Avoid this leak by only allocating when necessary.
Fixes: a93864d93977 ("cifs: add lease tracking to the cached root fid")
Signed-off-by: Paul Aurich <[email protected]>
CC: Stable <[email protected]> # v4.18+
Reviewed-by: Aurelien Aptel <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/cifs/smb2misc.c | 73 +++++++++++++++++++++++++++++++++++++----------------
1 file changed, 52 insertions(+), 21 deletions(-)
--- a/fs/cifs/smb2misc.c
+++ b/fs/cifs/smb2misc.c
@@ -508,15 +508,31 @@ cifs_ses_oplock_break(struct work_struct
kfree(lw);
}
+static void
+smb2_queue_pending_open_break(struct tcon_link *tlink, __u8 *lease_key,
+ __le32 new_lease_state)
+{
+ struct smb2_lease_break_work *lw;
+
+ lw = kmalloc(sizeof(struct smb2_lease_break_work), GFP_KERNEL);
+ if (!lw) {
+ cifs_put_tlink(tlink);
+ return;
+ }
+
+ INIT_WORK(&lw->lease_break, cifs_ses_oplock_break);
+ lw->tlink = tlink;
+ lw->lease_state = new_lease_state;
+ memcpy(lw->lease_key, lease_key, SMB2_LEASE_KEY_SIZE);
+ queue_work(cifsiod_wq, &lw->lease_break);
+}
+
static bool
-smb2_tcon_has_lease(struct cifs_tcon *tcon, struct smb2_lease_break *rsp,
- struct smb2_lease_break_work *lw)
+smb2_tcon_has_lease(struct cifs_tcon *tcon, struct smb2_lease_break *rsp)
{
- bool found;
__u8 lease_state;
struct list_head *tmp;
struct cifsFileInfo *cfile;
- struct cifs_pending_open *open;
struct cifsInodeInfo *cinode;
int ack_req = le32_to_cpu(rsp->Flags &
SMB2_NOTIFY_BREAK_LEASE_FLAG_ACK_REQUIRED);
@@ -546,22 +562,29 @@ smb2_tcon_has_lease(struct cifs_tcon *tc
cfile->oplock_level = lease_state;
cifs_queue_oplock_break(cfile);
- kfree(lw);
return true;
}
- found = false;
+ return false;
+}
+
+static struct cifs_pending_open *
+smb2_tcon_find_pending_open_lease(struct cifs_tcon *tcon,
+ struct smb2_lease_break *rsp)
+{
+ __u8 lease_state = le32_to_cpu(rsp->NewLeaseState);
+ int ack_req = le32_to_cpu(rsp->Flags &
+ SMB2_NOTIFY_BREAK_LEASE_FLAG_ACK_REQUIRED);
+ struct cifs_pending_open *open;
+ struct cifs_pending_open *found = NULL;
+
list_for_each_entry(open, &tcon->pending_opens, olist) {
if (memcmp(open->lease_key, rsp->LeaseKey,
SMB2_LEASE_KEY_SIZE))
continue;
if (!found && ack_req) {
- found = true;
- memcpy(lw->lease_key, open->lease_key,
- SMB2_LEASE_KEY_SIZE);
- lw->tlink = cifs_get_tlink(open->tlink);
- queue_work(cifsiod_wq, &lw->lease_break);
+ found = open;
}
cifs_dbg(FYI, "found in the pending open list\n");
@@ -582,14 +605,7 @@ smb2_is_valid_lease_break(char *buffer)
struct TCP_Server_Info *server;
struct cifs_ses *ses;
struct cifs_tcon *tcon;
- struct smb2_lease_break_work *lw;
-
- lw = kmalloc(sizeof(struct smb2_lease_break_work), GFP_KERNEL);
- if (!lw)
- return false;
-
- INIT_WORK(&lw->lease_break, cifs_ses_oplock_break);
- lw->lease_state = rsp->NewLeaseState;
+ struct cifs_pending_open *open;
cifs_dbg(FYI, "Checking for lease break\n");
@@ -607,11 +623,27 @@ smb2_is_valid_lease_break(char *buffer)
spin_lock(&tcon->open_file_lock);
cifs_stats_inc(
&tcon->stats.cifs_stats.num_oplock_brks);
- if (smb2_tcon_has_lease(tcon, rsp, lw)) {
+ if (smb2_tcon_has_lease(tcon, rsp)) {
spin_unlock(&tcon->open_file_lock);
spin_unlock(&cifs_tcp_ses_lock);
return true;
}
+ open = smb2_tcon_find_pending_open_lease(tcon,
+ rsp);
+ if (open) {
+ __u8 lease_key[SMB2_LEASE_KEY_SIZE];
+ struct tcon_link *tlink;
+
+ tlink = cifs_get_tlink(open->tlink);
+ memcpy(lease_key, open->lease_key,
+ SMB2_LEASE_KEY_SIZE);
+ spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_tcp_ses_lock);
+ smb2_queue_pending_open_break(tlink,
+ lease_key,
+ rsp->NewLeaseState);
+ return true;
+ }
spin_unlock(&tcon->open_file_lock);
if (tcon->crfid.is_valid &&
@@ -629,7 +661,6 @@ smb2_is_valid_lease_break(char *buffer)
}
}
spin_unlock(&cifs_tcp_ses_lock);
- kfree(lw);
cifs_dbg(FYI, "Can not process lease break - no lease matched\n");
return false;
}
From: Sibi Sankar <[email protected]>
commit 08257610302159e08fd4f5d33787807374ea63c7 upstream.
Having a non-MSA (Modem Self-Authentication) SID bypassed breaks modem
sandboxing i.e if a transaction were to originate from it, the hardware
memory protections units (XPUs) would fail to flag them (any transaction
originating from modem are historically termed as an MSA transaction).
Drop the unused non-MSA modem SID on SC7180 SoCs and cheza so that SMMU
continues to block them.
Tested-by: Douglas Anderson <[email protected]>
Reviewed-by: Sai Prakash Ranjan <[email protected]>
Reviewed-by: Douglas Anderson <[email protected]>
Fixes: bec71ba243e95 ("arm64: dts: qcom: sc7180: Update Q6V5 MSS node")
Fixes: 68aee4af5f620 ("arm64: dts: qcom: sdm845-cheza: Add iommus property")
Cc: [email protected]
Reported-by: Sai Prakash Ranjan <[email protected]>
Signed-off-by: Sibi Sankar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Andersson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/boot/dts/qcom/sc7180-idp.dts | 2 +-
arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm64/boot/dts/qcom/sc7180-idp.dts
+++ b/arch/arm64/boot/dts/qcom/sc7180-idp.dts
@@ -312,7 +312,7 @@
&remoteproc_mpss {
status = "okay";
compatible = "qcom,sc7180-mss-pil";
- iommus = <&apps_smmu 0x460 0x1>, <&apps_smmu 0x444 0x3>;
+ iommus = <&apps_smmu 0x461 0x0>, <&apps_smmu 0x444 0x3>;
memory-region = <&mba_mem &mpss_mem>;
};
--- a/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845-cheza.dtsi
@@ -634,7 +634,7 @@ ap_ts_i2c: &i2c14 {
};
&mss_pil {
- iommus = <&apps_smmu 0x780 0x1>,
+ iommus = <&apps_smmu 0x781 0x0>,
<&apps_smmu 0x724 0x3>;
};
From: Josef Bacik <[email protected]>
commit 18c850fdc5a801bad4977b0f1723761d42267e45 upstream.
There's long existed a lockdep splat because we open our bdev's under
the ->device_list_mutex at mount time, which acquires the bd_mutex.
Usually this goes unnoticed, but if you do loopback devices at all
suddenly the bd_mutex comes with a whole host of other dependencies,
which results in the splat when you mount a btrfs file system.
======================================================
WARNING: possible circular locking dependency detected
5.8.0-0.rc3.1.fc33.x86_64+debug #1 Not tainted
------------------------------------------------------
systemd-journal/509 is trying to acquire lock:
ffff970831f84db0 (&fs_info->reloc_mutex){+.+.}-{3:3}, at: btrfs_record_root_in_trans+0x44/0x70 [btrfs]
but task is already holding lock:
ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #6 (sb_pagefaults){.+.+}-{0:0}:
__sb_start_write+0x13e/0x220
btrfs_page_mkwrite+0x59/0x560 [btrfs]
do_page_mkwrite+0x4f/0x130
do_wp_page+0x3b0/0x4f0
handle_mm_fault+0xf47/0x1850
do_user_addr_fault+0x1fc/0x4b0
exc_page_fault+0x88/0x300
asm_exc_page_fault+0x1e/0x30
-> #5 (&mm->mmap_lock#2){++++}-{3:3}:
__might_fault+0x60/0x80
_copy_from_user+0x20/0xb0
get_sg_io_hdr+0x9a/0xb0
scsi_cmd_ioctl+0x1ea/0x2f0
cdrom_ioctl+0x3c/0x12b4
sr_block_ioctl+0xa4/0xd0
block_ioctl+0x3f/0x50
ksys_ioctl+0x82/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #4 (&cd->lock){+.+.}-{3:3}:
__mutex_lock+0x7b/0x820
sr_block_open+0xa2/0x180
__blkdev_get+0xdd/0x550
blkdev_get+0x38/0x150
do_dentry_open+0x16b/0x3e0
path_openat+0x3c9/0xa00
do_filp_open+0x75/0x100
do_sys_openat2+0x8a/0x140
__x64_sys_openat+0x46/0x70
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #3 (&bdev->bd_mutex){+.+.}-{3:3}:
__mutex_lock+0x7b/0x820
__blkdev_get+0x6a/0x550
blkdev_get+0x85/0x150
blkdev_get_by_path+0x2c/0x70
btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs]
open_fs_devices+0x88/0x240 [btrfs]
btrfs_open_devices+0x92/0xa0 [btrfs]
btrfs_mount_root+0x250/0x490 [btrfs]
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
vfs_kern_mount.part.0+0x71/0xb0
btrfs_mount+0x119/0x380 [btrfs]
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
do_mount+0x8c6/0xca0
__x64_sys_mount+0x8e/0xd0
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #2 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
__mutex_lock+0x7b/0x820
btrfs_run_dev_stats+0x36/0x420 [btrfs]
commit_cowonly_roots+0x91/0x2d0 [btrfs]
btrfs_commit_transaction+0x4e6/0x9f0 [btrfs]
btrfs_sync_file+0x38a/0x480 [btrfs]
__x64_sys_fdatasync+0x47/0x80
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #1 (&fs_info->tree_log_mutex){+.+.}-{3:3}:
__mutex_lock+0x7b/0x820
btrfs_commit_transaction+0x48e/0x9f0 [btrfs]
btrfs_sync_file+0x38a/0x480 [btrfs]
__x64_sys_fdatasync+0x47/0x80
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #0 (&fs_info->reloc_mutex){+.+.}-{3:3}:
__lock_acquire+0x1241/0x20c0
lock_acquire+0xb0/0x400
__mutex_lock+0x7b/0x820
btrfs_record_root_in_trans+0x44/0x70 [btrfs]
start_transaction+0xd2/0x500 [btrfs]
btrfs_dirty_inode+0x44/0xd0 [btrfs]
file_update_time+0xc6/0x120
btrfs_page_mkwrite+0xda/0x560 [btrfs]
do_page_mkwrite+0x4f/0x130
do_wp_page+0x3b0/0x4f0
handle_mm_fault+0xf47/0x1850
do_user_addr_fault+0x1fc/0x4b0
exc_page_fault+0x88/0x300
asm_exc_page_fault+0x1e/0x30
other info that might help us debug this:
Chain exists of:
&fs_info->reloc_mutex --> &mm->mmap_lock#2 --> sb_pagefaults
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(sb_pagefaults);
lock(&mm->mmap_lock#2);
lock(sb_pagefaults);
lock(&fs_info->reloc_mutex);
*** DEADLOCK ***
3 locks held by systemd-journal/509:
#0: ffff97083bdec8b8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x12e/0x4b0
#1: ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs]
#2: ffff97083144d6a8 (sb_internal){.+.+}-{0:0}, at: start_transaction+0x3f8/0x500 [btrfs]
stack backtrace:
CPU: 0 PID: 509 Comm: systemd-journal Not tainted 5.8.0-0.rc3.1.fc33.x86_64+debug #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack+0x92/0xc8
check_noncircular+0x134/0x150
__lock_acquire+0x1241/0x20c0
lock_acquire+0xb0/0x400
? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
? lock_acquire+0xb0/0x400
? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
__mutex_lock+0x7b/0x820
? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
? kvm_sched_clock_read+0x14/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0xc/0xb0
btrfs_record_root_in_trans+0x44/0x70 [btrfs]
start_transaction+0xd2/0x500 [btrfs]
btrfs_dirty_inode+0x44/0xd0 [btrfs]
file_update_time+0xc6/0x120
btrfs_page_mkwrite+0xda/0x560 [btrfs]
? sched_clock+0x5/0x10
do_page_mkwrite+0x4f/0x130
do_wp_page+0x3b0/0x4f0
handle_mm_fault+0xf47/0x1850
do_user_addr_fault+0x1fc/0x4b0
exc_page_fault+0x88/0x300
? asm_exc_page_fault+0x8/0x30
asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fa3972fdbfe
Code: Bad RIP value.
Fix this by not holding the ->device_list_mutex at this point. The
device_list_mutex exists to protect us from modifying the device list
while the file system is running.
However it can also be modified by doing a scan on a device. But this
action is specifically protected by the uuid_mutex, which we are holding
here. We cannot race with opening at this point because we have the
->s_mount lock held during the mount. Not having the
->device_list_mutex here is perfectly safe as we're not going to change
the devices at this point.
CC: [email protected] # 4.19+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add some comments ]
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/volumes.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -245,7 +245,9 @@ static int __btrfs_map_block(struct btrf
*
* global::fs_devs - add, remove, updates to the global list
*
- * does not protect: manipulation of the fs_devices::devices list!
+ * does not protect: manipulation of the fs_devices::devices list in general
+ * but in mount context it could be used to exclude list modifications by eg.
+ * scan ioctl
*
* btrfs_device::name - renames (write side), read is RCU
*
@@ -258,6 +260,9 @@ static int __btrfs_map_block(struct btrf
* may be used to exclude some operations from running concurrently without any
* modifications to the list (see write_all_supers)
*
+ * Is not required at mount and close times, because our device list is
+ * protected by the uuid_mutex at that point.
+ *
* balance_mutex
* -------------
* protects balance structures (status, state) and context accessed from
@@ -602,6 +607,11 @@ static int btrfs_free_stale_devices(cons
return ret;
}
+/*
+ * This is only used on mount, and we are protected from competing things
+ * messing with our fs_devices by the uuid_mutex, thus we do not need the
+ * fs_devices->device_list_mutex here.
+ */
static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
struct btrfs_device *device, fmode_t flags,
void *holder)
@@ -1229,8 +1239,14 @@ int btrfs_open_devices(struct btrfs_fs_d
int ret;
lockdep_assert_held(&uuid_mutex);
+ /*
+ * The device_list_mutex cannot be taken here in case opening the
+ * underlying device takes further locks like bd_mutex.
+ *
+ * We also don't need the lock here as this is called during mount and
+ * exclusion is provided by uuid_mutex
+ */
- mutex_lock(&fs_devices->device_list_mutex);
if (fs_devices->opened) {
fs_devices->opened++;
ret = 0;
@@ -1238,7 +1254,6 @@ int btrfs_open_devices(struct btrfs_fs_d
list_sort(NULL, &fs_devices->devices, devid_cmp);
ret = open_fs_devices(fs_devices, flags, holder);
}
- mutex_unlock(&fs_devices->device_list_mutex);
return ret;
}
From: Josef Bacik <[email protected]>
commit a47bd78d0c44621efb98b525d04d60dc4d1a79b0 upstream.
Dave hit this splat during testing btrfs/078:
======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc6-default+ #1191 Not tainted
------------------------------------------------------
kswapd0/75 is trying to acquire lock:
ffffa040e9d04ff8 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
but task is already holding lock:
ffffffff8b0c8040 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (fs_reclaim){+.+.}-{0:0}:
__lock_acquire+0x56f/0xaa0
lock_acquire+0xa3/0x440
fs_reclaim_acquire.part.0+0x25/0x30
__kmalloc_track_caller+0x49/0x330
kstrdup+0x2e/0x60
__kernfs_new_node.constprop.0+0x44/0x250
kernfs_new_node+0x25/0x50
kernfs_create_link+0x34/0xa0
sysfs_do_create_link_sd+0x5e/0xd0
btrfs_sysfs_add_devices_dir+0x65/0x100 [btrfs]
btrfs_init_new_device+0x44c/0x12b0 [btrfs]
btrfs_ioctl+0xc3c/0x25c0 [btrfs]
ksys_ioctl+0x68/0xa0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x50/0xe0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
__lock_acquire+0x56f/0xaa0
lock_acquire+0xa3/0x440
__mutex_lock+0xa0/0xaf0
btrfs_chunk_alloc+0x137/0x3e0 [btrfs]
find_free_extent+0xb44/0xfb0 [btrfs]
btrfs_reserve_extent+0x9b/0x180 [btrfs]
btrfs_alloc_tree_block+0xc1/0x350 [btrfs]
alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
__btrfs_cow_block+0x143/0x7a0 [btrfs]
btrfs_cow_block+0x15f/0x310 [btrfs]
push_leaf_right+0x150/0x240 [btrfs]
split_leaf+0x3cd/0x6d0 [btrfs]
btrfs_search_slot+0xd14/0xf70 [btrfs]
btrfs_insert_empty_items+0x64/0xc0 [btrfs]
__btrfs_commit_inode_delayed_items+0xb2/0x840 [btrfs]
btrfs_async_run_delayed_root+0x10e/0x1d0 [btrfs]
btrfs_work_helper+0x2f9/0x650 [btrfs]
process_one_work+0x22c/0x600
worker_thread+0x50/0x3b0
kthread+0x137/0x150
ret_from_fork+0x1f/0x30
-> #0 (&delayed_node->mutex){+.+.}-{3:3}:
check_prev_add+0x98/0xa20
validate_chain+0xa8c/0x2a00
__lock_acquire+0x56f/0xaa0
lock_acquire+0xa3/0x440
__mutex_lock+0xa0/0xaf0
__btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
btrfs_evict_inode+0x3bf/0x560 [btrfs]
evict+0xd6/0x1c0
dispose_list+0x48/0x70
prune_icache_sb+0x54/0x80
super_cache_scan+0x121/0x1a0
do_shrink_slab+0x175/0x420
shrink_slab+0xb1/0x2e0
shrink_node+0x192/0x600
balance_pgdat+0x31f/0x750
kswapd+0x206/0x510
kthread+0x137/0x150
ret_from_fork+0x1f/0x30
other info that might help us debug this:
Chain exists of:
&delayed_node->mutex --> &fs_info->chunk_mutex --> fs_reclaim
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(&fs_info->chunk_mutex);
lock(fs_reclaim);
lock(&delayed_node->mutex);
*** DEADLOCK ***
3 locks held by kswapd0/75:
#0: ffffffff8b0c8040 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
#1: ffffffff8b0b50b8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x54/0x2e0
#2: ffffa040e057c0e8 (&type->s_umount_key#26){++++}-{3:3}, at: trylock_super+0x16/0x50
stack backtrace:
CPU: 2 PID: 75 Comm: kswapd0 Not tainted 5.8.0-rc6-default+ #1191
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
Call Trace:
dump_stack+0x78/0xa0
check_noncircular+0x16f/0x190
check_prev_add+0x98/0xa20
validate_chain+0xa8c/0x2a00
__lock_acquire+0x56f/0xaa0
lock_acquire+0xa3/0x440
? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
__mutex_lock+0xa0/0xaf0
? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
? __lock_acquire+0x56f/0xaa0
? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
? lock_acquire+0xa3/0x440
? btrfs_evict_inode+0x138/0x560 [btrfs]
? btrfs_evict_inode+0x2fe/0x560 [btrfs]
? __btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
__btrfs_release_delayed_node.part.0+0x3f/0x310 [btrfs]
btrfs_evict_inode+0x3bf/0x560 [btrfs]
evict+0xd6/0x1c0
dispose_list+0x48/0x70
prune_icache_sb+0x54/0x80
super_cache_scan+0x121/0x1a0
do_shrink_slab+0x175/0x420
shrink_slab+0xb1/0x2e0
shrink_node+0x192/0x600
balance_pgdat+0x31f/0x750
kswapd+0x206/0x510
? _raw_spin_unlock_irqrestore+0x3e/0x50
? finish_wait+0x90/0x90
? balance_pgdat+0x750/0x750
kthread+0x137/0x150
? kthread_stop+0x2a0/0x2a0
ret_from_fork+0x1f/0x30
This is because we're holding the chunk_mutex while adding this device
and adding its sysfs entries. We actually hold different locks in
different places when calling this function, the dev_replace semaphore
for instance in dev replace, so instead of moving this call around
simply wrap it's operations in NOFS.
CC: [email protected] # 4.14+
Reported-by: David Sterba <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/sysfs.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1273,7 +1273,9 @@ int btrfs_sysfs_add_devices_dir(struct b
{
int error = 0;
struct btrfs_device *dev;
+ unsigned int nofs_flag;
+ nofs_flag = memalloc_nofs_save();
list_for_each_entry(dev, &fs_devices->devices, dev_list) {
if (one_device && one_device != dev)
@@ -1301,6 +1303,7 @@ int btrfs_sysfs_add_devices_dir(struct b
break;
}
}
+ memalloc_nofs_restore(nofs_flag);
return error;
}
From: Josef Bacik <[email protected]>
commit faa008899a4db21a2df99833cb4ff6fa67009a20 upstream.
There's some inconsistency around SB_I_VERSION handling with mount and
remount. Since we don't really want it to be off ever just work around
this by making sure we don't get the flag cleared on remount.
There's a tiny cpu cost of setting the bit, otherwise all changes to
i_version also change some of the times (ctime/mtime) so the inode needs
to be synced. We wouldn't save anything by disabling it.
Reported-by: Eric Sandeen <[email protected]>
CC: [email protected] # 5.4+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add perf impact analysis ]
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/super.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1897,6 +1897,12 @@ static int btrfs_remount(struct super_bl
set_bit(BTRFS_FS_OPEN, &fs_info->flags);
}
out:
+ /*
+ * We need to set SB_I_VERSION here otherwise it'll get cleared by VFS,
+ * since the absence of the flag means it can be toggled off by remount.
+ */
+ *flags |= SB_I_VERSION;
+
wake_up_process(fs_info->transaction_kthread);
btrfs_remount_cleanup(fs_info, old_opts);
return 0;
From: Michael Ellerman <[email protected]>
commit 0c83b277ada72b585e6a3e52b067669df15bcedb upstream.
Recently random.h started including percpu.h (see commit
f227e3ec3b5c ("random32: update the net random state on interrupt and
activity")), which broke corenet64_smp_defconfig:
In file included from /linux/arch/powerpc/include/asm/paca.h:18,
from /linux/arch/powerpc/include/asm/percpu.h:13,
from /linux/include/linux/random.h:14,
from /linux/lib/uuid.c:14:
/linux/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name 'next_tlbcam_idx'
139 | DECLARE_PER_CPU(int, next_tlbcam_idx);
This is due to a circular header dependency:
asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
includes asm/mmu.h
Which means DECLARE_PER_CPU() isn't defined when mmu.h needs it.
We can fix it by moving the include of paca.h below the include of
asm-generic/percpu.h.
This moves the include of paca.h out of the #ifdef __powerpc64__, but
that is OK because paca.h is almost entirely inside #ifdef
CONFIG_PPC64 anyway.
It also moves the include of paca.h out of the #ifdef CONFIG_SMP,
which could possibly break something, but seems to have no ill
effects.
Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity")
Cc: [email protected] # v5.8
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/include/asm/percpu.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/powerpc/include/asm/percpu.h
+++ b/arch/powerpc/include/asm/percpu.h
@@ -10,8 +10,6 @@
#ifdef CONFIG_SMP
-#include <asm/paca.h>
-
#define __my_cpu_offset local_paca->data_offset
#endif /* CONFIG_SMP */
@@ -19,4 +17,6 @@
#include <asm-generic/percpu.h>
+#include <asm/paca.h>
+
#endif /* _ASM_POWERPC_PERCPU_H_ */
From: Anand Jain <[email protected]>
commit 4faf55b03823e96c44dc4e364520000ed3b12fdb upstream.
->show_devname currently shows the lowest devid in the list. As the seed
devices have the lowest devid in the sprouted filesystem, the userland
tool such as findmnt end up seeing seed device instead of the device from
the read-writable sprouted filesystem. As shown below.
mount /dev/sda /btrfs
mount: /btrfs: WARNING: device write-protected, mounted read-only.
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sda /btrfs 899f7027-3e46-4626-93e7-7d4c9ad19111
btrfs dev add -f /dev/sdb /btrfs
umount /btrfs
mount /dev/sdb /btrfs
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sda /btrfs 899f7027-3e46-4626-93e7-7d4c9ad19111
All sprouts from a single seed will show the same seed device and the
same fsid. That's confusing.
This is causing problems in our prototype as there isn't any reference
to the sprout file-system(s) which is being used for actual read and
write.
This was added in the patch which implemented the show_devname in btrfs
commit 9c5085c14798 ("Btrfs: implement ->show_devname").
I tried to look for any particular reason that we need to show the seed
device, there isn't any.
So instead, do not traverse through the seed devices, just show the
lowest devid in the sprouted fsid.
After the patch:
mount /dev/sda /btrfs
mount: /btrfs: WARNING: device write-protected, mounted read-only.
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sda /btrfs 899f7027-3e46-4626-93e7-7d4c9ad19111
btrfs dev add -f /dev/sdb /btrfs
mount -o rw,remount /dev/sdb /btrfs
findmnt --output SOURCE,TARGET,UUID /btrfs
SOURCE TARGET UUID
/dev/sdb /btrfs 595ca0e6-b82e-46b5-b9e2-c72a6928be48
mount /dev/sda /btrfs1
mount: /btrfs1: WARNING: device write-protected, mounted read-only.
btrfs dev add -f /dev/sdc /btrfs1
findmnt --output SOURCE,TARGET,UUID /btrfs1
SOURCE TARGET UUID
/dev/sdc /btrfs1 ca1dbb7a-8446-4f95-853c-a20f3f82bdbb
cat /proc/self/mounts | grep btrfs
/dev/sdb /btrfs btrfs rw,relatime,noacl,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc /btrfs1 btrfs ro,relatime,noacl,space_cache,subvolid=5,subvol=/ 0 0
Reported-by: Martin K. Petersen <[email protected]>
CC: [email protected] # 4.19+
Tested-by: Martin K. Petersen <[email protected]>
Signed-off-by: Anand Jain <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/super.c | 21 +++++++--------------
1 file changed, 7 insertions(+), 14 deletions(-)
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -2296,9 +2296,7 @@ static int btrfs_unfreeze(struct super_b
static int btrfs_show_devname(struct seq_file *m, struct dentry *root)
{
struct btrfs_fs_info *fs_info = btrfs_sb(root->d_sb);
- struct btrfs_fs_devices *cur_devices;
struct btrfs_device *dev, *first_dev = NULL;
- struct list_head *head;
/*
* Lightweight locking of the devices. We should not need
@@ -2308,18 +2306,13 @@ static int btrfs_show_devname(struct seq
* least until the rcu_read_unlock.
*/
rcu_read_lock();
- cur_devices = fs_info->fs_devices;
- while (cur_devices) {
- head = &cur_devices->devices;
- list_for_each_entry_rcu(dev, head, dev_list) {
- if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
- continue;
- if (!dev->name)
- continue;
- if (!first_dev || dev->devid < first_dev->devid)
- first_dev = dev;
- }
- cur_devices = cur_devices->seed;
+ list_for_each_entry_rcu(dev, &fs_info->fs_devices->devices, dev_list) {
+ if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
+ continue;
+ if (!dev->name)
+ continue;
+ if (!first_dev || dev->devid < first_dev->devid)
+ first_dev = dev;
}
if (first_dev)
From: Filipe Manana <[email protected]>
commit a93e01682e283f6de09d6ce8f805dc52a2e942fb upstream.
When syncing the log, we used to update the log root tree without holding
neither the log_mutex of the subvolume root nor the log_mutex of log root
tree.
We used to have two critical sections delimited by the log_mutex of the
log root tree, so in the first one we incremented the log_writers of the
log root tree and on the second one we decremented it and waited for the
log_writers counter to go down to zero. This was because the update of
the log root tree happened between the two critical sections.
The use of two critical sections allowed a little bit more of parallelism
and required the use of the log_writers counter, necessary to make sure
we didn't miss any log root tree update when we have multiple tasks trying
to sync the log in parallel.
However after commit 06989c799f0481 ("Btrfs: fix race updating log root
item during fsync") the log root tree update was moved into a critical
section delimited by the subvolume's log_mutex. Later another commit
moved the log tree update from that critical section into the second
critical section delimited by the log_mutex of the log root tree. Both
commits addressed different bugs.
The end result is that the first critical section delimited by the
log_mutex of the log root tree became pointless, since there's nothing
done between it and the second critical section, we just have an unlock
of the log_mutex followed by a lock operation. This means we can merge
both critical sections, as the first one does almost nothing now, and we
can stop using the log_writers counter of the log root tree, which was
incremented in the first critical section and decremented in the second
criticial section, used to make sure no one in the second critical section
started writeback of the log root tree before some other task updated it.
So just remove the mutex_unlock() followed by mutex_lock() of the log root
tree, as well as the use of the log_writers counter for the log root tree.
This patch is part of a series that has the following patches:
1/4 btrfs: only commit the delayed inode when doing a full fsync
2/4 btrfs: only commit delayed items at fsync if we are logging a directory
3/4 btrfs: stop incremening log_batch for the log root tree when syncing log
4/4 btrfs: remove no longer needed use of log_writers for the log root tree
After the entire patchset applied I saw about 12% decrease on max latency
reported by dbench. The test was done on a qemu vm, with 8 cores, 16Gb of
ram, using kvm and using a raw NVMe device directly (no intermediary fs on
the host). The test was invoked like the following:
mkfs.btrfs -f /dev/sdk
mount -o ssd -o nospace_cache /dev/sdk /mnt/sdk
dbench -D /mnt/sdk -t 300 8
umount /mnt/dsk
CC: [email protected] # 5.4+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/ctree.h | 1 +
fs/btrfs/tree-log.c | 13 -------------
2 files changed, 1 insertion(+), 13 deletions(-)
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1059,6 +1059,7 @@ struct btrfs_root {
wait_queue_head_t log_writer_wait;
wait_queue_head_t log_commit_wait[2];
struct list_head log_ctxs[2];
+ /* Used only for log trees of subvolumes, not for the log root tree */
atomic_t log_writers;
atomic_t log_commit[2];
/* Used only for log trees of subvolumes, not for the log root tree */
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3116,28 +3116,17 @@ int btrfs_sync_log(struct btrfs_trans_ha
btrfs_init_log_ctx(&root_log_ctx, NULL);
mutex_lock(&log_root_tree->log_mutex);
- atomic_inc(&log_root_tree->log_writers);
index2 = log_root_tree->log_transid % 2;
list_add_tail(&root_log_ctx.list, &log_root_tree->log_ctxs[index2]);
root_log_ctx.log_transid = log_root_tree->log_transid;
- mutex_unlock(&log_root_tree->log_mutex);
-
- mutex_lock(&log_root_tree->log_mutex);
-
/*
* Now we are safe to update the log_root_tree because we're under the
* log_mutex, and we're a current writer so we're holding the commit
* open until we drop the log_mutex.
*/
ret = update_log_root(trans, log, &new_root_item);
-
- if (atomic_dec_and_test(&log_root_tree->log_writers)) {
- /* atomic_dec_and_test implies a barrier */
- cond_wake_up_nomb(&log_root_tree->log_writer_wait);
- }
-
if (ret) {
if (!list_empty(&root_log_ctx.list))
list_del_init(&root_log_ctx.list);
@@ -3183,8 +3172,6 @@ int btrfs_sync_log(struct btrfs_trans_ha
root_log_ctx.log_transid - 1);
}
- wait_for_writer(log_root_tree);
-
/*
* now that we've moved on to the tree of log tree roots,
* check the full commit flag again
From: Filipe Manana <[email protected]>
commit 8c8648dd1f6d62aeb912deeb788b6ac33cb782e7 upstream.
Commit 2c2c452b0cafdc ("Btrfs: fix fsync when extend references are added
to an inode") forced a commit of the delayed inode when logging an inode
in order to ensure we would end up logging the inode item during a full
fsync. By committing the delayed inode, we updated the inode item in the
fs/subvolume tree and then later when copying items from leafs modified in
the current transaction into the log tree (with copy_inode_items_to_log())
we ended up copying the inode item from the fs/subvolume tree into the log
tree. Logging an up to date version of the inode item is required to make
sure at log replay time we get the link count fixup triggered among other
things (replay xattr deletes, etc). The test case generic/040 from fstests
exercises the bug which that commit fixed.
However for a fast fsync we don't need to commit the delayed inode because
we always log an up to date version of the inode item based on the struct
btrfs_inode we have in-memory. We started doing this for fast fsyncs since
commit e4545de5b035c7 ("Btrfs: fix fsync data loss after append write").
So just stop committing the delayed inode if we are doing a fast fsync,
we are only wasting time and adding contention on fs/subvolume tree.
This patch is part of a series that has the following patches:
1/4 btrfs: only commit the delayed inode when doing a full fsync
2/4 btrfs: only commit delayed items at fsync if we are logging a directory
3/4 btrfs: stop incremening log_batch for the log root tree when syncing log
4/4 btrfs: remove no longer needed use of log_writers for the log root tree
After the entire patchset applied I saw about 12% decrease on max latency
reported by dbench. The test was done on a qemu vm, with 8 cores, 16Gb of
ram, using kvm and using a raw NVMe device directly (no intermediary fs on
the host). The test was invoked like the following:
mkfs.btrfs -f /dev/sdk
mount -o ssd -o nospace_cache /dev/sdk /mnt/sdk
dbench -D /mnt/sdk -t 300 8
umount /mnt/dsk
CC: [email protected] # 5.4+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/tree-log.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -5130,7 +5130,7 @@ static int btrfs_log_inode(struct btrfs_
struct btrfs_key max_key;
struct btrfs_root *log = root->log_root;
int err = 0;
- int ret;
+ int ret = 0;
bool fast_search = false;
u64 ino = btrfs_ino(inode);
struct extent_map_tree *em_tree = &inode->extent_tree;
@@ -5167,14 +5167,16 @@ static int btrfs_log_inode(struct btrfs_
/*
* Only run delayed items if we are a dir or a new file.
- * Otherwise commit the delayed inode only, which is needed in
- * order for the log replay code to mark inodes for link count
- * fixup (create temporary BTRFS_TREE_LOG_FIXUP_OBJECTID items).
+ * Otherwise commit the delayed inode only if the full sync flag is set,
+ * as we want to make sure an up to date version is in the subvolume
+ * tree so copy_inode_items_to_log() / copy_items() can find it and copy
+ * it to the log tree. For a non full sync, we always log the inode item
+ * based on the in-memory struct btrfs_inode which is always up to date.
*/
if (S_ISDIR(inode->vfs_inode.i_mode) ||
inode->generation > fs_info->last_trans_committed)
ret = btrfs_commit_inode_delayed_items(trans, inode);
- else
+ else if (test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags))
ret = btrfs_commit_inode_delayed_inode(inode);
if (ret) {
From: Qu Wenruo <[email protected]>
commit 851fd730a743e072badaf67caf39883e32439431 upstream.
[BUG]
When a lot of subvolumes are created, there is a user report about
transaction aborted:
BTRFS: Transaction aborted (error -24)
WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
Call Trace:
create_pending_snapshots+0x82/0xa0 [btrfs]
btrfs_commit_transaction+0x275/0x8c0 [btrfs]
btrfs_mksubvol+0x4b9/0x500 [btrfs]
btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
btrfs_ioctl+0x11a4/0x2da0 [btrfs]
do_vfs_ioctl+0xa9/0x640
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 33f2f83f3d5250e9 ]---
BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
BTRFS info (device sda1): forced readonly
BTRFS warning (device sda1): Skipping commit of aborted transaction.
BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
[CAUSE]
The error is EMFILE (Too many files open) and comes from the anonymous
block device allocation. The ids are in a shared pool of size 1<<20.
The ids are assigned to live subvolumes, ie. the root structure exists
in memory (eg. after creation or after the root appears in some path).
The pool could be exhausted if the numbers are not reclaimed fast
enough, after subvolume deletion or if other system component uses the
anon block devices.
[WORKAROUND]
Since it's not possible to completely solve the problem, we can only
minimize the time the id is allocated to a subvolume root.
Firstly, we can reduce the use of anon_dev by trees that are not
subvolume roots, like data reloc tree.
This patch will do extra check on root objectid, to skip roots that
don't need anon_dev. Currently it's only data reloc tree and orphan
roots.
Reported-by: Greed Rong <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@mail.gmail.com/
CC: [email protected] # 4.4+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/disk-io.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1428,9 +1428,16 @@ static int btrfs_init_fs_root(struct btr
spin_lock_init(&root->ino_cache_lock);
init_waitqueue_head(&root->ino_cache_wait);
- ret = get_anon_bdev(&root->anon_dev);
- if (ret)
- goto fail;
+ /*
+ * Don't assign anonymous block device to roots that are not exposed to
+ * userspace, the id pool is limited to 1M
+ */
+ if (is_fstree(root->root_key.objectid) &&
+ btrfs_root_refs(&root->root_item) > 0) {
+ ret = get_anon_bdev(&root->anon_dev);
+ if (ret)
+ goto fail;
+ }
mutex_lock(&root->objectid_mutex);
ret = btrfs_find_highest_objectid(root,
From: Jonathan McDowell <[email protected]>
commit 592d751c1e174df5ff219946908b005eb48934b3 upstream.
If we don't have a hardware multicast filter available then instead of
silently failing to listen for the requested ethernet broadcast
addresses fall back to receiving all multicast packets, in a similar
fashion to other drivers with no multicast filter.
Cc: [email protected]
Signed-off-by: Jonathan McDowell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -164,6 +164,9 @@ static void dwmac1000_set_filter(struct
value = GMAC_FRAME_FILTER_PR | GMAC_FRAME_FILTER_PCF;
} else if (dev->flags & IFF_ALLMULTI) {
value = GMAC_FRAME_FILTER_PM; /* pass all multi */
+ } else if (!netdev_mc_empty(dev) && (mcbitslog2 == 0)) {
+ /* Fall back to all multicast if we've no filter */
+ value = GMAC_FRAME_FILTER_PM;
} else if (!netdev_mc_empty(dev)) {
struct netdev_hw_addr *ha;
From: David Sterba <[email protected]>
commit 27942c9971cc405c60432eca9395e514a2ae9f5e upstream.
Reported by Forza on IRC that remounting with compression options does
not reflect the change in level, or at least it does not appear to do so
according to the messages:
mount -o compress=zstd:1 /dev/sda /mnt
mount -o remount,compress=zstd:15 /mnt
does not print the change to the level to syslog:
[ 41.366060] BTRFS info (device vda): use zstd compression, level 1
[ 41.368254] BTRFS info (device vda): disk space caching is enabled
[ 41.390429] BTRFS info (device vda): disk space caching is enabled
What really happens is that the message is lost but the level is actualy
changed.
There's another weird output, if compression is reset to 'no':
[ 45.413776] BTRFS info (device vda): use no compression, level 4
To fix that, save the previous compression level and print the message
in that case too and use separate message for 'no' compression.
CC: [email protected] # 4.19+
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/super.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -449,6 +449,7 @@ int btrfs_parse_options(struct btrfs_fs_
char *compress_type;
bool compress_force = false;
enum btrfs_compression_type saved_compress_type;
+ int saved_compress_level;
bool saved_compress_force;
int no_compress = 0;
@@ -531,6 +532,7 @@ int btrfs_parse_options(struct btrfs_fs_
info->compress_type : BTRFS_COMPRESS_NONE;
saved_compress_force =
btrfs_test_opt(info, FORCE_COMPRESS);
+ saved_compress_level = info->compress_level;
if (token == Opt_compress ||
token == Opt_compress_force ||
strncmp(args[0].from, "zlib", 4) == 0) {
@@ -575,6 +577,8 @@ int btrfs_parse_options(struct btrfs_fs_
no_compress = 0;
} else if (strncmp(args[0].from, "no", 2) == 0) {
compress_type = "no";
+ info->compress_level = 0;
+ info->compress_type = 0;
btrfs_clear_opt(info->mount_opt, COMPRESS);
btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
compress_force = false;
@@ -595,11 +599,11 @@ int btrfs_parse_options(struct btrfs_fs_
*/
btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
}
- if ((btrfs_test_opt(info, COMPRESS) &&
- (info->compress_type != saved_compress_type ||
- compress_force != saved_compress_force)) ||
- (!btrfs_test_opt(info, COMPRESS) &&
- no_compress == 1)) {
+ if (no_compress == 1) {
+ btrfs_info(info, "use no compression");
+ } else if ((info->compress_type != saved_compress_type) ||
+ (compress_force != saved_compress_force) ||
+ (info->compress_level != saved_compress_level)) {
btrfs_info(info, "%s %s compression, level %d",
(compress_force) ? "force" : "use",
compress_type, info->compress_level);
From: Zenghui Yu <[email protected]>
commit 3af9571cd585efafc2facbd8dbd407317ff898cf upstream.
The GICv4.1 spec tells us that it's CONSTRAINED UNPREDICTABLE to issue a
register-based invalidation operation for a vPEID not mapped to that RD,
or another RD within the same CommonLPIAff group.
To follow this rule, commit f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual
exclusion between vPE affinity change and RD access") tried to address the
race between the RD accesses and the vPE affinity change, but somehow
forgot to take GICR_INVALLR into account. Let's take the vpe_lock before
evaluating vpe->col_idx to fix it.
Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access")
Signed-off-by: Zenghui Yu <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -4090,18 +4090,22 @@ static void its_vpe_4_1_deschedule(struc
static void its_vpe_4_1_invall(struct its_vpe *vpe)
{
void __iomem *rdbase;
+ unsigned long flags;
u64 val;
+ int cpu;
val = GICR_INVALLR_V;
val |= FIELD_PREP(GICR_INVALLR_VPEID, vpe->vpe_id);
/* Target the redistributor this vPE is currently known on */
- raw_spin_lock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock);
- rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base;
+ cpu = vpe_to_cpuid_lock(vpe, &flags);
+ raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock);
+ rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base;
gic_write_lpir(val, rdbase + GICR_INVALLR);
wait_for_syncr(rdbase);
- raw_spin_unlock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock);
+ raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock);
+ vpe_to_cpuid_unlock(vpe, flags);
}
static int its_vpe_4_1_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
From: Christian Eggers <[email protected]>
commit add48ba425192c6e04ce70549129cacd01e2a09e upstream.
The correct compatible string is "gpio-mux" (see
bindings/mux/gpio-mux.txt).
Cc: [email protected] # v4.13+
Reviewed-by: Peter Rosin <[email protected]>
Signed-off-by: Christian Eggers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Documentation/devicetree/bindings/iio/multiplexer/io-channel-mux.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/iio/multiplexer/io-channel-mux.txt
+++ b/Documentation/devicetree/bindings/iio/multiplexer/io-channel-mux.txt
@@ -21,7 +21,7 @@ controller state. The mux controller sta
Example:
mux: mux-controller {
- compatible = "mux-gpio";
+ compatible = "gpio-mux";
#mux-control-cells = <0>;
mux-gpios = <&pioA 0 GPIO_ACTIVE_HIGH>,
From: Qu Wenruo <[email protected]>
commit c57dd1f2f6a7cd1bb61802344f59ccdc5278c983 upstream.
[BUG]
The following script can lead to tons of beyond device boundary access:
mkfs.btrfs -f $dev -b 10G
mount $dev $mnt
trimfs $mnt
btrfs filesystem resize 1:-1G $mnt
trimfs $mnt
[CAUSE]
Since commit 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to
find_first_clear_extent_bit"), we try to avoid trimming ranges that's
already trimmed.
So we check device->alloc_state by finding the first range which doesn't
have CHUNK_TRIMMED and CHUNK_ALLOCATED not set.
But if we shrunk the device, that bits are not cleared, thus we could
easily got a range starts beyond the shrunk device size.
This results the returned @start and @end are all beyond device size,
then we call "end = min(end, device->total_bytes -1);" making @end
smaller than device size.
Then finally we goes "len = end - start + 1", totally underflow the
result, and lead to the beyond-device-boundary access.
[FIX]
This patch will fix the problem in two ways:
- Clear CHUNK_TRIMMED | CHUNK_ALLOCATED bits when shrinking device
This is the root fix
- Add extra safety check when trimming free device extents
We check and warn if the returned range is already beyond current
device.
Link: https://github.com/kdave/btrfs-progs/issues/282
Fixes: 929be17a9b49 ("btrfs: Switch btrfs_trim_free_extents to find_first_clear_extent_bit")
CC: [email protected] # 5.4+
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/extent-io-tree.h | 2 ++
fs/btrfs/extent-tree.c | 14 ++++++++++++++
fs/btrfs/volumes.c | 4 ++++
3 files changed, 20 insertions(+)
--- a/fs/btrfs/extent-io-tree.h
+++ b/fs/btrfs/extent-io-tree.h
@@ -34,6 +34,8 @@ struct io_failure_record;
*/
#define CHUNK_ALLOCATED EXTENT_DIRTY
#define CHUNK_TRIMMED EXTENT_DEFRAG
+#define CHUNK_STATE_MASK (CHUNK_ALLOCATED | \
+ CHUNK_TRIMMED)
enum {
IO_TREE_FS_PINNED_EXTENTS,
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -33,6 +33,7 @@
#include "delalloc-space.h"
#include "block-group.h"
#include "discard.h"
+#include "rcu-string.h"
#undef SCRAMBLE_DELAYED_REFS
@@ -5668,6 +5669,19 @@ static int btrfs_trim_free_extents(struc
&start, &end,
CHUNK_TRIMMED | CHUNK_ALLOCATED);
+ /* Check if there are any CHUNK_* bits left */
+ if (start > device->total_bytes) {
+ WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
+ btrfs_warn_in_rcu(fs_info,
+"ignoring attempt to trim beyond device size: offset %llu length %llu device %s device size %llu",
+ start, end - start + 1,
+ rcu_str_deref(device->name),
+ device->total_bytes);
+ mutex_unlock(&fs_info->chunk_mutex);
+ ret = 0;
+ break;
+ }
+
/* Ensure we skip the reserved area in the first 1M */
start = max_t(u64, start, SZ_1M);
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4720,6 +4720,10 @@ again:
}
mutex_lock(&fs_info->chunk_mutex);
+ /* Clear all state bits beyond the shrunk device size */
+ clear_extent_bits(&device->alloc_state, new_size, (u64)-1,
+ CHUNK_STATE_MASK);
+
btrfs_device_set_disk_total_bytes(device, new_size);
if (list_empty(&device->post_commit_list))
list_add_tail(&device->post_commit_list,
From: Josef Bacik <[email protected]>
commit 3ef3959b29c4a5bd65526ab310a1a18ae533172a upstream.
Chris Murphy reported a problem where rpm ostree will bind mount a bunch
of things for whatever voodoo it's doing. But when it does this
/proc/mounts shows something like
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo/bar 0 0
Despite subvolid=256 being subvol=/foo. This is because we're just
spitting out the dentry of the mount point, which in the case of bind
mounts is the source path for the mountpoint. Instead we should spit
out the path to the actual subvol. Fix this by looking up the name for
the subvolid we have mounted. With this fix the same test looks like
this
/dev/sda /mnt/test btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
/dev/sda /mnt/test/baz btrfs rw,relatime,subvolid=256,subvol=/foo 0 0
Reported-by: Chris Murphy <[email protected]>
CC: [email protected] # 4.4+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/super.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1312,6 +1312,7 @@ static int btrfs_show_options(struct seq
{
struct btrfs_fs_info *info = btrfs_sb(dentry->d_sb);
const char *compress_type;
+ const char *subvol_name;
if (btrfs_test_opt(info, DEGRADED))
seq_puts(seq, ",degraded");
@@ -1398,8 +1399,13 @@ static int btrfs_show_options(struct seq
seq_puts(seq, ",ref_verify");
seq_printf(seq, ",subvolid=%llu",
BTRFS_I(d_inode(dentry))->root->root_key.objectid);
- seq_puts(seq, ",subvol=");
- seq_dentry(seq, dentry, " \t\n\\");
+ subvol_name = btrfs_get_subvol_name_from_objectid(info,
+ BTRFS_I(d_inode(dentry))->root->root_key.objectid);
+ if (!IS_ERR(subvol_name)) {
+ seq_puts(seq, ",subvol=");
+ seq_escape(seq, subvol_name, " \t\n\\");
+ kfree(subvol_name);
+ }
return 0;
}
From: Josef Bacik <[email protected]>
commit fbabd4a36faaf74c83142d0b3d950c11ec14fda1 upstream.
Eric reported seeing this message while running generic/475
BTRFS: error (device dm-3) in btrfs_sync_log:3084: errno=-117 Filesystem corrupted
Full stack trace:
BTRFS: error (device dm-0) in btrfs_commit_transaction:2323: errno=-5 IO failure (Error while writing out transaction)
BTRFS info (device dm-0): forced readonly
BTRFS warning (device dm-0): Skipping commit of aborted transaction.
------------[ cut here ]------------
BTRFS: error (device dm-0) in cleanup_transaction:1894: errno=-5 IO failure
BTRFS: Transaction aborted (error -117)
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c6480 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c6488 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c6490 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c6498 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c64a0 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c64a8 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c64b0 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c64b8 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3555 rw 0,0 sector 0x1c64c0 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3572 rw 0,0 sector 0x1b85e8 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3572 rw 0,0 sector 0x1b85f0 len 4096 err no 10
WARNING: CPU: 3 PID: 23985 at fs/btrfs/tree-log.c:3084 btrfs_sync_log+0xbc8/0xd60 [btrfs]
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d4288 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d4290 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d4298 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42a0 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42a8 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42b0 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42b8 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42c0 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42c8 len 4096 err no 10
BTRFS warning (device dm-0): direct IO failed ino 3548 rw 0,0 sector 0x1d42d0 len 4096 err no 10
CPU: 3 PID: 23985 Comm: fsstress Tainted: G W L 5.8.0-rc4-default+ #1181
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
RIP: 0010:btrfs_sync_log+0xbc8/0xd60 [btrfs]
RSP: 0018:ffff909a44d17bd0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
RDX: ffff8f3be41cb940 RSI: ffffffffb0108d2b RDI: ffffffffb0108ff7
RBP: ffff909a44d17e70 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000037988 R12: ffff8f3bd20e4000
R13: ffff8f3bd20e4428 R14: 00000000ffffff8b R15: ffff909a44d17c70
FS: 00007f6a6ed3fb80(0000) GS:ffff8f3c3dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6a6ed3e000 CR3: 00000000525c0003 CR4: 0000000000160ee0
Call Trace:
? finish_wait+0x90/0x90
? __mutex_unlock_slowpath+0x45/0x2a0
? lock_acquire+0xa3/0x440
? lockref_put_or_lock+0x9/0x30
? dput+0x20/0x4a0
? dput+0x20/0x4a0
? do_raw_spin_unlock+0x4b/0xc0
? _raw_spin_unlock+0x1f/0x30
btrfs_sync_file+0x335/0x490 [btrfs]
do_fsync+0x38/0x70
__x64_sys_fsync+0x10/0x20
do_syscall_64+0x50/0xe0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f6a6ef1b6e3
Code: Bad RIP value.
RSP: 002b:00007ffd01e20038 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 000000000007a120 RCX: 00007f6a6ef1b6e3
RDX: 00007ffd01e1ffa0 RSI: 00007ffd01e1ffa0 RDI: 0000000000000003
RBP: 0000000000000003 R08: 0000000000000001 R09: 00007ffd01e2004c
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000009f
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffffb007fe0b>] copy_process+0x67b/0x1b00
softirqs last enabled at (0): [<ffffffffb007fe0b>] copy_process+0x67b/0x1b00
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace af146e0e38433456 ]---
BTRFS: error (device dm-0) in btrfs_sync_log:3084: errno=-117 Filesystem corrupted
This ret came from btrfs_write_marked_extents(). If we get an aborted
transaction via EIO before, we'll see it in btree_write_cache_pages()
and return EUCLEAN, which gets printed as "Filesystem corrupted".
Except we shouldn't be returning EUCLEAN here, we need to be returning
EROFS because EUCLEAN is reserved for actual corruption, not IO errors.
We are inconsistent about our handling of BTRFS_FS_STATE_ERROR
elsewhere, but we want to use EROFS for this particular case. The
original transaction abort has the real error code for why we ended up
with an aborted transaction, all subsequent actions just need to return
EROFS because they may not have a trans handle and have no idea about
the original cause of the abort.
After patch "btrfs: don't WARN if we abort a transaction with EROFS" the
stacktrace will not be dumped either.
Reported-by: Eric Sandeen <[email protected]>
CC: [email protected] # 5.4+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add full test stacktrace ]
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/extent_io.c | 2 +-
fs/btrfs/scrub.c | 2 +-
fs/btrfs/transaction.c | 5 ++++-
3 files changed, 6 insertions(+), 3 deletions(-)
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4127,7 +4127,7 @@ retry:
if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
ret = flush_write_bio(&epd);
} else {
- ret = -EUCLEAN;
+ ret = -EROFS;
end_write_bio(&epd, ret);
}
return ret;
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3758,7 +3758,7 @@ static noinline_for_stack int scrub_supe
struct btrfs_fs_info *fs_info = sctx->fs_info;
if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
- return -EIO;
+ return -EROFS;
/* Seed devices of a new filesystem has their own generation. */
if (scrub_dev->fs_devices != fs_info->fs_devices)
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -937,7 +937,10 @@ static int __btrfs_end_transaction(struc
if (TRANS_ABORTED(trans) ||
test_bit(BTRFS_FS_STATE_ERROR, &info->fs_state)) {
wake_up_process(info->transaction_kthread);
- err = -EIO;
+ if (TRANS_ABORTED(trans))
+ err = trans->aborted;
+ else
+ err = -EROFS;
}
kmem_cache_free(btrfs_trans_handle_cachep, trans);
From: Qu Wenruo <[email protected]>
commit 082b6c970f02fefd278c7833880cda29691a5f34 upstream.
[BUG]
When a lot of subvolumes are created, there is a user report about
transaction aborted caused by slow anonymous block device reclaim:
BTRFS: Transaction aborted (error -24)
WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs]
RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs]
Call Trace:
create_pending_snapshots+0x82/0xa0 [btrfs]
btrfs_commit_transaction+0x275/0x8c0 [btrfs]
btrfs_mksubvol+0x4b9/0x500 [btrfs]
btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs]
btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs]
btrfs_ioctl+0x11a4/0x2da0 [btrfs]
do_vfs_ioctl+0xa9/0x640
ksys_ioctl+0x67/0x90
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x5a/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 33f2f83f3d5250e9 ]---
BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown
BTRFS info (device sda1): forced readonly
BTRFS warning (device sda1): Skipping commit of aborted transaction.
BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown
[CAUSE]
The anonymous device pool is shared and its size is 1M. It's possible to
hit that limit if the subvolume deletion is not fast enough and the
subvolumes to be cleaned keep the ids allocated.
[WORKAROUND]
We can't avoid the anon device pool exhaustion but we can shorten the
time the id is attached to the subvolume root once the subvolume becomes
invisible to the user.
Reported-by: Greed Rong <[email protected]>
Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@mail.gmail.com/
CC: [email protected] # 4.4+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/inode.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4041,6 +4041,8 @@ int btrfs_delete_subvolume(struct inode
}
}
+ free_anon_bdev(dest->anon_dev);
+ dest->anon_dev = 0;
out_end_trans:
trans->block_rsv = NULL;
trans->bytes_reserved = 0;
From: David Sterba <[email protected]>
commit f37c563bab4297024c300b05c8f48430e323809d upstream.
User Forza reported on IRC that some invalid combinations of file
attributes are accepted by chattr.
The NODATACOW and compression file flags/attributes are mutually
exclusive, but they could be set by 'chattr +c +C' on an empty file. The
nodatacow will be in effect because it's checked first in
btrfs_run_delalloc_range.
Extend the flag validation to catch the following cases:
- input flags are conflicting
- old and new flags are conflicting
- initialize the local variable with inode flags after inode ls locked
Inode attributes take precedence over mount options and are an
independent setting.
Nocompress would be a no-op with nodatacow, but we don't want to mix
any compression-related options with nodatacow.
CC: [email protected] # 4.4+
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/ioctl.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -164,8 +164,11 @@ static int btrfs_ioctl_getflags(struct f
return 0;
}
-/* Check if @flags are a supported and valid set of FS_*_FL flags */
-static int check_fsflags(unsigned int flags)
+/*
+ * Check if @flags are a supported and valid set of FS_*_FL flags and that
+ * the old and new flags are not conflicting
+ */
+static int check_fsflags(unsigned int old_flags, unsigned int flags)
{
if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \
FS_NOATIME_FL | FS_NODUMP_FL | \
@@ -174,9 +177,19 @@ static int check_fsflags(unsigned int fl
FS_NOCOW_FL))
return -EOPNOTSUPP;
+ /* COMPR and NOCOMP on new/old are valid */
if ((flags & FS_NOCOMP_FL) && (flags & FS_COMPR_FL))
return -EINVAL;
+ if ((flags & FS_COMPR_FL) && (flags & FS_NOCOW_FL))
+ return -EINVAL;
+
+ /* NOCOW and compression options are mutually exclusive */
+ if ((old_flags & FS_NOCOW_FL) && (flags & (FS_COMPR_FL | FS_NOCOMP_FL)))
+ return -EINVAL;
+ if ((flags & FS_NOCOW_FL) && (old_flags & (FS_COMPR_FL | FS_NOCOMP_FL)))
+ return -EINVAL;
+
return 0;
}
@@ -190,7 +203,7 @@ static int btrfs_ioctl_setflags(struct f
unsigned int fsflags, old_fsflags;
int ret;
const char *comp = NULL;
- u32 binode_flags = binode->flags;
+ u32 binode_flags;
if (!inode_owner_or_capable(inode))
return -EPERM;
@@ -201,22 +214,23 @@ static int btrfs_ioctl_setflags(struct f
if (copy_from_user(&fsflags, arg, sizeof(fsflags)))
return -EFAULT;
- ret = check_fsflags(fsflags);
- if (ret)
- return ret;
-
ret = mnt_want_write_file(file);
if (ret)
return ret;
inode_lock(inode);
-
fsflags = btrfs_mask_fsflags_for_type(inode, fsflags);
old_fsflags = btrfs_inode_flags_to_fsflags(binode->flags);
+
ret = vfs_ioc_setflags_prepare(inode, old_fsflags, fsflags);
if (ret)
goto out_unlock;
+ ret = check_fsflags(old_fsflags, fsflags);
+ if (ret)
+ goto out_unlock;
+
+ binode_flags = binode->flags;
if (fsflags & FS_SYNC_FL)
binode_flags |= BTRFS_INODE_SYNC;
else
From: Ansuel Smith <[email protected]>
commit 5149901e9e6deca487c01cc434a3ac4125c7b00b upstream.
Set some specific value for Tx De-Emphasis, Tx Swing and Rx equalization
needed on some ipq8064 based device (Netgear R7800 for example). Without
this the system locks on kernel load.
Link: https://lore.kernel.org/r/[email protected]
Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver")
Signed-off-by: Ansuel Smith <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Acked-by: Stanimir Varbanov <[email protected]>
Cc: [email protected] # v4.5+
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/pci/controller/dwc/pcie-qcom.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -77,6 +77,18 @@
#define DBI_RO_WR_EN 1
#define PERST_DELAY_US 1000
+/* PARF registers */
+#define PCIE20_PARF_PCS_DEEMPH 0x34
+#define PCS_DEEMPH_TX_DEEMPH_GEN1(x) ((x) << 16)
+#define PCS_DEEMPH_TX_DEEMPH_GEN2_3_5DB(x) ((x) << 8)
+#define PCS_DEEMPH_TX_DEEMPH_GEN2_6DB(x) ((x) << 0)
+
+#define PCIE20_PARF_PCS_SWING 0x38
+#define PCS_SWING_TX_SWING_FULL(x) ((x) << 8)
+#define PCS_SWING_TX_SWING_LOW(x) ((x) << 0)
+
+#define PCIE20_PARF_CONFIG_BITS 0x50
+#define PHY_RX0_EQ(x) ((x) << 24)
#define PCIE20_v3_PARF_SLV_ADDR_SPACE_SIZE 0x358
#define SLV_ADDR_SPACE_SZ 0x10000000
@@ -286,6 +298,7 @@ static int qcom_pcie_init_2_1_0(struct q
struct qcom_pcie_resources_2_1_0 *res = &pcie->res.v2_1_0;
struct dw_pcie *pci = pcie->pci;
struct device *dev = pci->dev;
+ struct device_node *node = dev->of_node;
u32 val;
int ret;
@@ -330,6 +343,17 @@ static int qcom_pcie_init_2_1_0(struct q
val &= ~BIT(0);
writel(val, pcie->parf + PCIE20_PARF_PHY_CTRL);
+ if (of_device_is_compatible(node, "qcom,pcie-ipq8064")) {
+ writel(PCS_DEEMPH_TX_DEEMPH_GEN1(24) |
+ PCS_DEEMPH_TX_DEEMPH_GEN2_3_5DB(24) |
+ PCS_DEEMPH_TX_DEEMPH_GEN2_6DB(34),
+ pcie->parf + PCIE20_PARF_PCS_DEEMPH);
+ writel(PCS_SWING_TX_SWING_FULL(120) |
+ PCS_SWING_TX_SWING_LOW(120),
+ pcie->parf + PCIE20_PARF_PCS_SWING);
+ writel(PHY_RX0_EQ(4), pcie->parf + PCIE20_PARF_CONFIG_BITS);
+ }
+
/* enable external reference clock */
val = readl(pcie->parf + PCIE20_PARF_PHY_REFCLK);
val |= BIT(16);
From: Tom Rix <[email protected]>
commit d60ba8de1164e1b42e296ff270c622a070ef8fe7 upstream.
clang static analysis flags this error
fs/btrfs/ref-verify.c:290:3: warning: Potential leak of memory pointed to by 're' [unix.Malloc]
kfree(be);
^~~~~
The problem is in this block of code:
if (root_objectid) {
struct root_entry *exist_re;
exist_re = insert_root_entry(&exist->roots, re);
if (exist_re)
kfree(re);
}
There is no 'else' block freeing when root_objectid is 0. Add the
missing kfree to the else branch.
Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool")
CC: [email protected] # 4.19+
Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/ref-verify.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -286,6 +286,8 @@ static struct block_entry *add_block_ent
exist_re = insert_root_entry(&exist->roots, re);
if (exist_re)
kfree(re);
+ } else {
+ kfree(re);
}
kfree(be);
return exist;
From: Josef Bacik <[email protected]>
commit 01d01caf19ff7c537527d352d169c4368375c0a1 upstream.
We are currently getting this lockdep splat in btrfs/161:
======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc5+ #20 Tainted: G E
------------------------------------------------------
mount/678048 is trying to acquire lock:
ffff9b769f15b6e0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: clone_fs_devices+0x4d/0x170 [btrfs]
but task is already holding lock:
ffff9b76abdb08d0 (&fs_info->chunk_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x6a/0x800 [btrfs]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}:
__mutex_lock+0x8b/0x8f0
btrfs_init_new_device+0x2d2/0x1240 [btrfs]
btrfs_ioctl+0x1de/0x2d20 [btrfs]
ksys_ioctl+0x87/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #0 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
__lock_acquire+0x1240/0x2460
lock_acquire+0xab/0x360
__mutex_lock+0x8b/0x8f0
clone_fs_devices+0x4d/0x170 [btrfs]
btrfs_read_chunk_tree+0x330/0x800 [btrfs]
open_ctree+0xb7c/0x18ce [btrfs]
btrfs_mount_root.cold+0x13/0xfa [btrfs]
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
do_mount+0x7de/0xb30
__x64_sys_mount+0x8e/0xd0
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&fs_info->chunk_mutex);
lock(&fs_devs->device_list_mutex);
lock(&fs_info->chunk_mutex);
lock(&fs_devs->device_list_mutex);
*** DEADLOCK ***
3 locks held by mount/678048:
#0: ffff9b75ff5fb0e0 (&type->s_umount_key#63/1){+.+.}-{3:3}, at: alloc_super+0xb5/0x380
#1: ffffffffc0c2fbc8 (uuid_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x54/0x800 [btrfs]
#2: ffff9b76abdb08d0 (&fs_info->chunk_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x6a/0x800 [btrfs]
stack backtrace:
CPU: 2 PID: 678048 Comm: mount Tainted: G E 5.8.0-rc5+ #20
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011
Call Trace:
dump_stack+0x96/0xd0
check_noncircular+0x162/0x180
__lock_acquire+0x1240/0x2460
? asm_sysvec_apic_timer_interrupt+0x12/0x20
lock_acquire+0xab/0x360
? clone_fs_devices+0x4d/0x170 [btrfs]
__mutex_lock+0x8b/0x8f0
? clone_fs_devices+0x4d/0x170 [btrfs]
? rcu_read_lock_sched_held+0x52/0x60
? cpumask_next+0x16/0x20
? module_assert_mutex_or_preempt+0x14/0x40
? __module_address+0x28/0xf0
? clone_fs_devices+0x4d/0x170 [btrfs]
? static_obj+0x4f/0x60
? lockdep_init_map_waits+0x43/0x200
? clone_fs_devices+0x4d/0x170 [btrfs]
clone_fs_devices+0x4d/0x170 [btrfs]
btrfs_read_chunk_tree+0x330/0x800 [btrfs]
open_ctree+0xb7c/0x18ce [btrfs]
? super_setup_bdi_name+0x79/0xd0
btrfs_mount_root.cold+0x13/0xfa [btrfs]
? vfs_parse_fs_string+0x84/0xb0
? rcu_read_lock_sched_held+0x52/0x60
? kfree+0x2b5/0x310
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
? cred_has_capability+0x7c/0x120
? rcu_read_lock_sched_held+0x52/0x60
? legacy_get_tree+0x30/0x50
legacy_get_tree+0x30/0x50
vfs_get_tree+0x28/0xc0
do_mount+0x7de/0xb30
? memdup_user+0x4e/0x90
__x64_sys_mount+0x8e/0xd0
do_syscall_64+0x52/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This is because btrfs_read_chunk_tree() can come upon DEV_EXTENT's and
then read the device, which takes the device_list_mutex. The
device_list_mutex needs to be taken before the chunk_mutex, so this is a
problem. We only really need the chunk mutex around adding the chunk,
so move the mutex around read_one_chunk.
An argument could be made that we don't even need the chunk_mutex here
as it's during mount, and we are protected by various other locks.
However we already have special rules for ->device_list_mutex, and I'd
rather not have another special case for ->chunk_mutex.
CC: [email protected] # 4.19+
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/volumes.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7064,7 +7064,6 @@ int btrfs_read_chunk_tree(struct btrfs_f
* otherwise we don't need it.
*/
mutex_lock(&uuid_mutex);
- mutex_lock(&fs_info->chunk_mutex);
/*
* It is possible for mount and umount to race in such a way that
@@ -7109,7 +7108,9 @@ int btrfs_read_chunk_tree(struct btrfs_f
} else if (found_key.type == BTRFS_CHUNK_ITEM_KEY) {
struct btrfs_chunk *chunk;
chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
+ mutex_lock(&fs_info->chunk_mutex);
ret = read_one_chunk(&found_key, leaf, chunk);
+ mutex_unlock(&fs_info->chunk_mutex);
if (ret)
goto error;
}
@@ -7139,7 +7140,6 @@ int btrfs_read_chunk_tree(struct btrfs_f
}
ret = 0;
error:
- mutex_unlock(&fs_info->chunk_mutex);
mutex_unlock(&uuid_mutex);
btrfs_free_path(path);
From: Shaokun Zhang <[email protected]>
commit 539707caa1a89ee4efc57b4e4231c20c46575ccc upstream.
When PMU event ID is equal or greater than 0x4000, it will be reduced
by 0x4000 and it is not the raw number in the sysfs. Let's correct it
and obtain the raw event ID.
Before this patch:
cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed
event=0x001
After this patch:
cat /sys/bus/event_source/devices/armv8_pmuv3_0/events/sample_feed
event=0x4001
Signed-off-by: Shaokun Zhang <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[will: fixed formatting of 'if' condition]
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/kernel/perf_event.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -155,7 +155,7 @@ armv8pmu_events_sysfs_show(struct device
pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr);
- return sprintf(page, "event=0x%03llx\n", pmu_attr->id);
+ return sprintf(page, "event=0x%04llx\n", pmu_attr->id);
}
#define ARMV8_EVENT_ATTR(name, config) \
@@ -244,10 +244,13 @@ armv8pmu_event_attr_is_visible(struct ko
test_bit(pmu_attr->id, cpu_pmu->pmceid_bitmap))
return attr->mode;
- pmu_attr->id -= ARMV8_PMUV3_EXT_COMMON_EVENT_BASE;
- if (pmu_attr->id < ARMV8_PMUV3_MAX_COMMON_EVENTS &&
- test_bit(pmu_attr->id, cpu_pmu->pmceid_ext_bitmap))
- return attr->mode;
+ if (pmu_attr->id >= ARMV8_PMUV3_EXT_COMMON_EVENT_BASE) {
+ u64 id = pmu_attr->id - ARMV8_PMUV3_EXT_COMMON_EVENT_BASE;
+
+ if (id < ARMV8_PMUV3_MAX_COMMON_EVENTS &&
+ test_bit(id, cpu_pmu->pmceid_ext_bitmap))
+ return attr->mode;
+ }
return 0;
}
From: Johannes Thumshirn <[email protected]>
commit 137c541821a83debb63b3fa8abdd1cbc41bdf3a1 upstream.
With the recent addition of filesystem checksum types other than CRC32c,
it is not anymore hard-coded which checksum type a btrfs filesystem uses.
Up to now there is no good way to read the filesystem checksum, apart from
reading the filesystem UUID and then query sysfs for the checksum type.
Add a new csum_type and csum_size fields to the BTRFS_IOC_FS_INFO ioctl
command which usually is used to query filesystem features. Also add a
flags member indicating that the kernel responded with a set csum_type and
csum_size field.
For compatibility reasons, only return the csum_type and csum_size if
the BTRFS_FS_INFO_FLAG_CSUM_INFO flag was passed to the kernel. Also
clear any unknown flags so we don't pass false positives to user-space
newer than the kernel.
To simplify further additions to the ioctl, also switch the padding to a
u8 array. Pahole was used to verify the result of this switch:
The csum members are added before flags, which might look odd, but this
is to keep the alignment requirements and not to introduce holes in the
structure.
$ pahole -C btrfs_ioctl_fs_info_args fs/btrfs/btrfs.ko
struct btrfs_ioctl_fs_info_args {
__u64 max_id; /* 0 8 */
__u64 num_devices; /* 8 8 */
__u8 fsid[16]; /* 16 16 */
__u32 nodesize; /* 32 4 */
__u32 sectorsize; /* 36 4 */
__u32 clone_alignment; /* 40 4 */
__u16 csum_type; /* 44 2 */
__u16 csum_size; /* 46 2 */
__u64 flags; /* 48 8 */
__u8 reserved[968]; /* 56 968 */
/* size: 1024, cachelines: 16, members: 10 */
};
Fixes: 3951e7f050ac ("btrfs: add xxhash64 to checksumming algorithms")
Fixes: 3831bf0094ab ("btrfs: add sha256 to checksumming algorithm")
CC: [email protected] # 5.5+
Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/ioctl.c | 16 +++++++++++++---
include/uapi/linux/btrfs.h | 14 ++++++++++++--
2 files changed, 25 insertions(+), 5 deletions(-)
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3217,11 +3217,15 @@ static long btrfs_ioctl_fs_info(struct b
struct btrfs_ioctl_fs_info_args *fi_args;
struct btrfs_device *device;
struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
+ u64 flags_in;
int ret = 0;
- fi_args = kzalloc(sizeof(*fi_args), GFP_KERNEL);
- if (!fi_args)
- return -ENOMEM;
+ fi_args = memdup_user(arg, sizeof(*fi_args));
+ if (IS_ERR(fi_args))
+ return PTR_ERR(fi_args);
+
+ flags_in = fi_args->flags;
+ memset(fi_args, 0, sizeof(*fi_args));
rcu_read_lock();
fi_args->num_devices = fs_devices->num_devices;
@@ -3237,6 +3241,12 @@ static long btrfs_ioctl_fs_info(struct b
fi_args->sectorsize = fs_info->sectorsize;
fi_args->clone_alignment = fs_info->sectorsize;
+ if (flags_in & BTRFS_FS_INFO_FLAG_CSUM_INFO) {
+ fi_args->csum_type = btrfs_super_csum_type(fs_info->super_copy);
+ fi_args->csum_size = btrfs_super_csum_size(fs_info->super_copy);
+ fi_args->flags |= BTRFS_FS_INFO_FLAG_CSUM_INFO;
+ }
+
if (copy_to_user(arg, fi_args, sizeof(*fi_args)))
ret = -EFAULT;
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -243,6 +243,13 @@ struct btrfs_ioctl_dev_info_args {
__u8 path[BTRFS_DEVICE_PATH_NAME_MAX]; /* out */
};
+/*
+ * Retrieve information about the filesystem
+ */
+
+/* Request information about checksum type and size */
+#define BTRFS_FS_INFO_FLAG_CSUM_INFO (1 << 0)
+
struct btrfs_ioctl_fs_info_args {
__u64 max_id; /* out */
__u64 num_devices; /* out */
@@ -250,8 +257,11 @@ struct btrfs_ioctl_fs_info_args {
__u32 nodesize; /* out */
__u32 sectorsize; /* out */
__u32 clone_alignment; /* out */
- __u32 reserved32;
- __u64 reserved[122]; /* pad to 1k */
+ /* See BTRFS_FS_INFO_FLAG_* */
+ __u16 csum_type; /* out */
+ __u16 csum_size; /* out */
+ __u64 flags; /* in/out */
+ __u8 reserved[968]; /* pad to 1k */
};
/*
From: Lorenzo Bianconi <[email protected]>
commit a1bab9396c2d98c601ce81c27567159dfbc10c19 upstream.
Reset hw time samples generator after system resume in order to avoid
disalignment between system and device time reference since FIFO
batching and time samples generator are disabled during suspend.
Fixes: 213451076bd3 ("iio: imu: st_lsm6dsx: add hw timestamp support")
Tested-by: Sean Nyekjaer <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h | 3 +--
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 23 +++++++++++++++--------
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 2 +-
3 files changed, 17 insertions(+), 11 deletions(-)
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
@@ -436,8 +436,7 @@ int st_lsm6dsx_update_watermark(struct s
u16 watermark);
int st_lsm6dsx_update_fifo(struct st_lsm6dsx_sensor *sensor, bool enable);
int st_lsm6dsx_flush_fifo(struct st_lsm6dsx_hw *hw);
-int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
- enum st_lsm6dsx_fifo_mode fifo_mode);
+int st_lsm6dsx_resume_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_read_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_read_tagged_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_check_odr(struct st_lsm6dsx_sensor *sensor, u32 odr, u8 *val);
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
@@ -184,8 +184,8 @@ static int st_lsm6dsx_update_decimators(
return err;
}
-int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
- enum st_lsm6dsx_fifo_mode fifo_mode)
+static int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
+ enum st_lsm6dsx_fifo_mode fifo_mode)
{
unsigned int data;
@@ -302,6 +302,18 @@ static int st_lsm6dsx_reset_hw_ts(struct
return 0;
}
+int st_lsm6dsx_resume_fifo(struct st_lsm6dsx_hw *hw)
+{
+ int err;
+
+ /* reset hw ts counter */
+ err = st_lsm6dsx_reset_hw_ts(hw);
+ if (err < 0)
+ return err;
+
+ return st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+}
+
/*
* Set max bulk read to ST_LSM6DSX_MAX_WORD_LEN/ST_LSM6DSX_MAX_TAGGED_WORD_LEN
* in order to avoid a kmalloc for each bus access
@@ -675,12 +687,7 @@ int st_lsm6dsx_update_fifo(struct st_lsm
goto out;
if (fifo_mask) {
- /* reset hw ts counter */
- err = st_lsm6dsx_reset_hw_ts(hw);
- if (err < 0)
- goto out;
-
- err = st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+ err = st_lsm6dsx_resume_fifo(hw);
if (err < 0)
goto out;
}
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
@@ -2458,7 +2458,7 @@ static int __maybe_unused st_lsm6dsx_res
}
if (hw->fifo_mask)
- err = st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+ err = st_lsm6dsx_resume_fifo(hw);
return err;
}
From: Josef Bacik <[email protected]>
commit bf53d4687b8f3f6b752f091eb85f62369a515dfd upstream.
In try_to_merge_free_space we attempt to find entries to the left and
right of the entry we are adding to see if they can be merged. We
search for an entry past our current info (saved into right_info), and
then if right_info exists and it has a rb_prev() we save the rb_prev()
into left_info.
However there's a slight problem in the case that we have a right_info,
but no entry previous to that entry. At that point we will search for
an entry just before the info we're attempting to insert. This will
simply find right_info again, and assign it to left_info, making them
both the same pointer.
Now if right_info _can_ be merged with the range we're inserting, we'll
add it to the info and free right_info. However further down we'll
access left_info, which was right_info, and thus get a use-after-free.
Fix this by only searching for the left entry if we don't find a right
entry at all.
The CVE referenced had a specially crafted file system that could
trigger this use-after-free. However with the tree checker improvements
we no longer trigger the conditions for the UAF. But the original
conditions still apply, hence this fix.
Reference: CVE-2019-19448
Fixes: 963030817060 ("Btrfs: use hybrid extents+bitmap rb tree for free space")
CC: [email protected] # 4.4+
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/free-space-cache.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2281,7 +2281,7 @@ out:
static bool try_merge_free_space(struct btrfs_free_space_ctl *ctl,
struct btrfs_free_space *info, bool update_stat)
{
- struct btrfs_free_space *left_info;
+ struct btrfs_free_space *left_info = NULL;
struct btrfs_free_space *right_info;
bool merged = false;
u64 offset = info->offset;
@@ -2297,7 +2297,7 @@ static bool try_merge_free_space(struct
if (right_info && rb_prev(&right_info->offset_index))
left_info = rb_entry(rb_prev(&right_info->offset_index),
struct btrfs_free_space, offset_index);
- else
+ else if (!right_info)
left_info = tree_search_offset(ctl, offset - 1, 0, 0);
/* See try_merge_free_space() comment. */
From: Lukas Wunner <[email protected]>
commit 654888327e9f655a9d55ad477a9583e90e8c9b5c upstream.
Commit 3451a495ef24 ("driver core: Establish order of operations for
device_add and device_del via bitflag") sought to prevent asynchronous
driver binding to a device which is being removed. It added a
per-device "dead" flag which is checked in the following code paths:
* asynchronous binding in __driver_attach_async_helper()
* synchronous binding in device_driver_attach()
* asynchronous binding in __device_attach_async_helper()
It did *not* check the flag upon:
* synchronous binding in __device_attach()
However __device_attach() may also be called asynchronously from:
deferred_probe_work_func()
bus_probe_device()
device_initial_probe()
__device_attach()
So if the commit's intention was to check the "dead" flag in all
asynchronous code paths, then a check is also necessary in
__device_attach(). Add the missing check.
Fixes: 3451a495ef24 ("driver core: Establish order of operations for device_add and device_del via bitflag")
Signed-off-by: Lukas Wunner <[email protected]>
Cc: [email protected] # v5.1+
Cc: Alexander Duyck <[email protected]>
Link: https://lore.kernel.org/r/de88a23a6fe0ef70f7cfd13c8aea9ab51b4edab6.1594214103.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/base/dd.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -844,7 +844,9 @@ static int __device_attach(struct device
int ret = 0;
device_lock(dev);
- if (dev->driver) {
+ if (dev->p->dead) {
+ goto out_unlock;
+ } else if (dev->driver) {
if (device_is_bound(dev)) {
ret = 1;
goto out_unlock;
From: Boleyn Su <[email protected]>
commit c15c2ec07a26b251040943a1a9f90d3037f041e5 upstream.
The `if (!ret)` check will always be false and it may result in
ret->path being dereferenced while it is a NULL pointer.
Fixes: a37f232b7b65 ("btrfs: backref: introduce the skeleton of btrfs_backref_iter")
CC: [email protected] # 5.8+
Reviewed-by: Nikolay Borisov <[email protected]>
Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Boleyn Su <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/backref.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -2303,7 +2303,7 @@ struct btrfs_backref_iter *btrfs_backref
return NULL;
ret->path = btrfs_alloc_path();
- if (!ret) {
+ if (!ret->path) {
kfree(ret);
return NULL;
}
From: Coly Li <[email protected]>
commit c5be1f2c5bab1538aa29cd42e226d6b80391e3ff upstream.
This patch is a fix to patch "bcache: fix bio_{start,end}_io_acct with
proper device". The previous patch uses a hack to temporarily set
bi_disk to bcache device, which is mistaken too.
As Christoph suggests, this patch uses disk_{start,end}_io_acct() to
count I/O for bcache device in the correct way.
Fixes: 85750aeb748f ("bcache: use bio_{start,end}_io_acct")
Signed-off-by: Coly Li <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/md/bcache/request.c | 37 +++++++++----------------------------
1 file changed, 9 insertions(+), 28 deletions(-)
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -617,28 +617,6 @@ static void cache_lookup(struct closure
/* Common code for the make_request functions */
-static inline void bch_bio_start_io_acct(struct gendisk *acct_bi_disk,
- struct bio *bio,
- unsigned long *start_time)
-{
- struct gendisk *saved_bi_disk = bio->bi_disk;
-
- bio->bi_disk = acct_bi_disk;
- *start_time = bio_start_io_acct(bio);
- bio->bi_disk = saved_bi_disk;
-}
-
-static inline void bch_bio_end_io_acct(struct gendisk *acct_bi_disk,
- struct bio *bio,
- unsigned long start_time)
-{
- struct gendisk *saved_bi_disk = bio->bi_disk;
-
- bio->bi_disk = acct_bi_disk;
- bio_end_io_acct(bio, start_time);
- bio->bi_disk = saved_bi_disk;
-}
-
static void request_endio(struct bio *bio)
{
struct closure *cl = bio->bi_private;
@@ -690,7 +668,9 @@ static void backing_request_endio(struct
static void bio_complete(struct search *s)
{
if (s->orig_bio) {
- bch_bio_end_io_acct(s->d->disk, s->orig_bio, s->start_time);
+ /* Count on bcache device */
+ disk_end_io_acct(s->d->disk, bio_op(s->orig_bio), s->start_time);
+
trace_bcache_request_end(s->d, s->orig_bio);
s->orig_bio->bi_status = s->iop.status;
bio_endio(s->orig_bio);
@@ -750,8 +730,8 @@ static inline struct search *search_allo
s->recoverable = 1;
s->write = op_is_write(bio_op(bio));
s->read_dirty_data = 0;
- bch_bio_start_io_acct(d->disk, bio, &s->start_time);
-
+ /* Count on the bcache device */
+ s->start_time = disk_start_io_acct(d->disk, bio_sectors(bio), bio_op(bio));
s->iop.c = d->c;
s->iop.bio = NULL;
s->iop.inode = d->id;
@@ -1102,7 +1082,8 @@ static void detached_dev_end_io(struct b
bio->bi_end_io = ddip->bi_end_io;
bio->bi_private = ddip->bi_private;
- bch_bio_end_io_acct(ddip->d->disk, bio, ddip->start_time);
+ /* Count on the bcache device */
+ disk_end_io_acct(ddip->d->disk, bio_op(bio), ddip->start_time);
if (bio->bi_status) {
struct cached_dev *dc = container_of(ddip->d,
@@ -1127,8 +1108,8 @@ static void detached_dev_do_request(stru
*/
ddip = kzalloc(sizeof(struct detached_dev_io_private), GFP_NOIO);
ddip->d = d;
- bch_bio_start_io_acct(d->disk, bio, &ddip->start_time);
-
+ /* Count on the bcache device */
+ ddip->start_time = disk_start_io_acct(d->disk, bio_sectors(bio), bio_op(bio));
ddip->bi_end_io = bio->bi_end_io;
ddip->bi_private = bio->bi_private;
bio->bi_end_io = detached_dev_end_io;
From: Huacai Chen <[email protected]>
commit 9cce844abf07b683cff5f0273977d5f8d0af94c7 upstream.
Now CPU#0 is not hotpluggable on MIPS, so prevent to create /sys/devices
/system/cpu/cpu0/online which confuses some user-space tools.
Cc: [email protected]
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/kernel/topology.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/mips/kernel/topology.c
+++ b/arch/mips/kernel/topology.c
@@ -20,7 +20,7 @@ static int __init topology_init(void)
for_each_present_cpu(i) {
struct cpu *c = &per_cpu(cpu_devices, i);
- c->hotpluggable = 1;
+ c->hotpluggable = !!i;
ret = register_cpu(c, i);
if (ret)
printk(KERN_WARNING "topology_init: register_cpu %d "
From: Pavel Machek <[email protected]>
commit 881a3a11c2b858fe9b69ef79ac5ee9978a266dc9 upstream.
btrfs_get_extent() sets variable ret, but out: error path expect error
to be in variable err so the error code is lost.
Fixes: 6bf9e4bd6a27 ("btrfs: inode: Verify inode mode to avoid NULL pointer dereference")
CC: [email protected] # 5.4+
Reviewed-by: Nikolay Borisov <[email protected]>
Signed-off-by: Pavel Machek (CIP) <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/btrfs/inode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6633,7 +6633,7 @@ struct extent_map *btrfs_get_extent(stru
extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
/* Only regular file could have regular/prealloc extent */
if (!S_ISREG(inode->vfs_inode.i_mode)) {
- ret = -EUCLEAN;
+ err = -EUCLEAN;
btrfs_crit(fs_info,
"regular/prealloc extent found for non-regular inode %llu",
btrfs_ino(inode));
From: Paul Cercueil <[email protected]>
commit 0889a67a9e7a56ba39af223d536630b20b877fda upstream.
The ROUT (right channel output of audio codec) was connected to INL
(left channel of audio amplifier) instead of INR (right channel of audio
amplifier).
Fixes: 8ddebad15e9b ("MIPS: qi_lb60: Migrate to devicetree")
Cc: [email protected] # v5.3
Signed-off-by: Paul Cercueil <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/boot/dts/ingenic/qi_lb60.dts | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/mips/boot/dts/ingenic/qi_lb60.dts
+++ b/arch/mips/boot/dts/ingenic/qi_lb60.dts
@@ -69,7 +69,7 @@
"Speaker", "OUTL",
"Speaker", "OUTR",
"INL", "LOUT",
- "INL", "ROUT";
+ "INR", "ROUT";
simple-audio-card,aux-devs = <&>;
On Thu, 20 Aug 2020 at 14:55, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.8.3 release.
> There are 232 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
> Herbert Xu <[email protected]>
> crypto: af_alg - Fix regression on empty requests
Results from Linaro’s test farm.
Regressions detected.
ltp-crypto-tests:
* af_alg02
ltp-cve-tests:
* cve-2017-17805
af_alg02.c:52: BROK: Timed out while reading from request socket.
We are running the LTP 20200515 tag released test suite.
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/crypto/af_alg02.c
Summary
------------------------------------------------------------------------
kernel: 5.8.3-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.8.y
git commit: 201fff807310ce10485bcff294d47be95f3769eb
git describe: v5.8.2-233-g201fff807310
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.2-233-g201fff807310
Regressions (compared to build v5.8.2)
------------------------------------------------------------------------
x15:
ltp-crypto-tests:
* af_alg02
ltp-cve-tests:
* cve-2017-17805
--
Linaro LKFT
https://lkft.linaro.org
On Thu, Aug 20, 2020 at 08:57:57PM +0530, Naresh Kamboju wrote:
> On Thu, 20 Aug 2020 at 14:55, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > This is the start of the stable review cycle for the 5.8.3 release.
> > There are 232 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> > Herbert Xu <[email protected]>
> > crypto: af_alg - Fix regression on empty requests
>
> Results from Linaro’s test farm.
> Regressions detected.
>
> ltp-crypto-tests:
> * af_alg02
> ltp-cve-tests:
> * cve-2017-17805
>
> af_alg02.c:52: BROK: Timed out while reading from request socket.
> We are running the LTP 20200515 tag released test suite.
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/crypto/af_alg02.c
>
> Summary
> ------------------------------------------------------------------------
>
> kernel: 5.8.3-rc1
> git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> git branch: linux-5.8.y
> git commit: 201fff807310ce10485bcff294d47be95f3769eb
> git describe: v5.8.2-233-g201fff807310
> Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.2-233-g201fff807310
>
> Regressions (compared to build v5.8.2)
> ------------------------------------------------------------------------
>
> x15:
> ltp-crypto-tests:
> * af_alg02
>
> ltp-cve-tests:
> * cve-2017-17805
>
Looks like this test is still "broken" because it assumes behavior that isn't
clearly specified, as previously discussed at
https://lkml.kernel.org/r/[email protected].
I sent out LTP patches to fix it:
https://lkml.kernel.org/linux-crypto/[email protected]/T/#u
- Eric
On Thu, Aug 20, 2020 at 11:17:31AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.8.3 release.
> There are 232 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> Anything received after that time might be too late.
>
Build results:
total: 154 pass: 154 fail: 0
Qemu test results:
total: 430 pass: 430 fail: 0
Tested-by: Guenter Roeck <[email protected]>
Guenter
On 8/20/20 3:17 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.8.3 release.
> There are 232 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan <[email protected]>
thanks,
-- Shuah
Hi all,
> On Thu, Aug 20, 2020 at 08:57:57PM +0530, Naresh Kamboju wrote:
> > On Thu, 20 Aug 2020 at 14:55, Greg Kroah-Hartman
> > <[email protected]> wrote:
> > > This is the start of the stable review cycle for the 5.8.3 release.
> > > There are 232 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > > Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> > > Anything received after that time might be too late.
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > > and the diffstat can be found below.
> > > thanks,
> > > greg k-h
> > > Herbert Xu <[email protected]>
> > > crypto: af_alg - Fix regression on empty requests
> > Results from Linaro’s test farm.
> > Regressions detected.
> > ltp-crypto-tests:
> > * af_alg02
> > ltp-cve-tests:
> > * cve-2017-17805
> > af_alg02.c:52: BROK: Timed out while reading from request socket.
> > We are running the LTP 20200515 tag released test suite.
> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/crypto/af_alg02.c
> > Summary
> > ------------------------------------------------------------------------
> > kernel: 5.8.3-rc1
> > git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > git branch: linux-5.8.y
> > git commit: 201fff807310ce10485bcff294d47be95f3769eb
> > git describe: v5.8.2-233-g201fff807310
> > Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.2-233-g201fff807310
> > Regressions (compared to build v5.8.2)
> > ------------------------------------------------------------------------
> > x15:
> > ltp-crypto-tests:
> > * af_alg02
> > ltp-cve-tests:
> > * cve-2017-17805
> Looks like this test is still "broken" because it assumes behavior that isn't
> clearly specified, as previously discussed at
> https://lkml.kernel.org/r/[email protected].
> I sent out LTP patches to fix it:
> https://lkml.kernel.org/linux-crypto/[email protected]/T/#u
FYI fix for LTP merged.
Kind regards,
Petr
> - Eric
On Thu, Aug 20, 2020 at 05:47:46PM -0600, Shuah Khan wrote:
> On 8/20/20 3:17 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.8.3 release.
> > There are 232 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
> >
>
> Compiled and booted on my test system. No dmesg regressions.
>
> Tested-by: Shuah Khan <[email protected]>
Thanks for testing all of these and letting me know.
greg k-h
On Thu, Aug 20, 2020 at 01:06:31PM -0700, Guenter Roeck wrote:
> On Thu, Aug 20, 2020 at 11:17:31AM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.8.3 release.
> > There are 232 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 154 pass: 154 fail: 0
> Qemu test results:
> total: 430 pass: 430 fail: 0
>
> Tested-by: Guenter Roeck <[email protected]>
Thanks for testing all of these and letting me know.
greg k-h
On Thu, Aug 20, 2020 at 08:57:57PM +0530, Naresh Kamboju wrote:
> On Thu, 20 Aug 2020 at 14:55, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > This is the start of the stable review cycle for the 5.8.3 release.
> > There are 232 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> > Herbert Xu <[email protected]>
> > crypto: af_alg - Fix regression on empty requests
>
> Results from Linaro’s test farm.
> Regressions detected.
>
> ltp-crypto-tests:
> * af_alg02
> ltp-cve-tests:
> * cve-2017-17805
>
> af_alg02.c:52: BROK: Timed out while reading from request socket.
> We are running the LTP 20200515 tag released test suite.
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/crypto/af_alg02.c
Looks like the crypto tests are now fixed :)
Anyway, thanks for testing all of these and letting me know.
greg k-h
On Fri, 21 Aug 2020 at 16:45, Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Thu, Aug 20, 2020 at 08:57:57PM +0530, Naresh Kamboju wrote:
> > On Thu, 20 Aug 2020 at 14:55, Greg Kroah-Hartman
> > <[email protected]> wrote:
> > >
> > > This is the start of the stable review cycle for the 5.8.3 release.
> > > There are 232 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Sat, 22 Aug 2020 09:15:09 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.8.3-rc1.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.8.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > > Herbert Xu <[email protected]>
> > > crypto: af_alg - Fix regression on empty requests
> >
> > Results from Linaro’s test farm.
> > Regressions detected.
> >
> > ltp-crypto-tests:
> > * af_alg02
> > ltp-cve-tests:
> > * cve-2017-17805
> >
> > af_alg02.c:52: BROK: Timed out while reading from request socket.
> > We are running the LTP 20200515 tag released test suite.
> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/crypto/af_alg02.c
>
> Looks like the crypto tests are now fixed :)
>
> Anyway, thanks for testing all of these and letting me know.
Apart from the reported LTP crypto test case problem all other results
look good to me.
Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------
kernel: 5.8.3
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.8.y
git commit: a1101e94767e2d5da5bcbee12573d96a1c8be5bb
git describe: v5.8.3
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-5.8-oe/build/v5.8.3
No regressions (compared to build v5.8.2)
No fixes (compared to build v5.8.2)
Ran 30256 total tests in the following environments and test suites.
Environments
--------------
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- juno-r2-compat
- juno-r2-kasan
- nxp-ls2088
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15
- x86
- x86-kasan
Test Suites
-----------
* build
* linux-log-parser
* ltp-commands-tests
* ltp-containers-tests
* ltp-dio-tests
* ltp-fs-tests
* ltp-io-tests
* ltp-math-tests
* ltp-tracing-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* v4l2-compliance
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-controllers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-hugetlb-tests
* ltp-ipc-tests
* ltp-mm-tests
* ltp-open-posix-tests
* ltp-sched-tests
* network-basic-tests
* igt-gpu-tools
--
Linaro LKFT
https://lkft.linaro.org
On 08/20/20 11:19, Greg Kroah-Hartman wrote:
> From: Qais Yousef <[email protected]>
>
> [ Upstream commit 46609ce227039fd192e0ecc7d940bed587fd2c78 ]
>
> There is a report that when uclamp is enabled, a netperf UDP test
> regresses compared to a kernel compiled without uclamp.
>
> https://lore.kernel.org/lkml/[email protected]/
>
> While investigating the root cause, there were no sign that the uclamp
> code is doing anything particularly expensive but could suffer from bad
> cache behavior under certain circumstances that are yet to be
> understood.
>
> https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
>
> To reduce the pressure on the fast path anyway, add a static key that is
> by default will skip executing uclamp logic in the
> enqueue/dequeue_task() fast path until it's needed.
>
> As soon as the user start using util clamp by:
>
> 1. Changing uclamp value of a task with sched_setattr()
> 2. Modifying the default sysctl_sched_util_clamp_{min, max}
> 3. Modifying the default cpu.uclamp.{min, max} value in cgroup
>
> We flip the static key now that the user has opted to use util clamp.
> Effectively re-introducing uclamp logic in the enqueue/dequeue_task()
> fast path. It stays on from that point forward until the next reboot.
>
> This should help minimize the effect of util clamp on workloads that
> don't need it but still allow distros to ship their kernels with uclamp
> compiled in by default.
>
> SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since now we can end
> up with unbalanced call to uclamp_rq_dec_id() if we flip the key while
> a task is running in the rq. Since we know it is harmless we just
> quietly return if we attempt a uclamp_rq_dec_id() when
> rq->uclamp[].bucket[].tasks is 0.
>
> In schedutil, we introduce a new uclamp_is_enabled() helper which takes
> the static key into account to ensure RT boosting behavior is retained.
>
> The following results demonstrates how this helps on 2 Sockets Xeon E5
> 2x10-Cores system.
>
> nouclamp uclamp uclamp-static-key
> Hmean send-64 162.43 ( 0.00%) 157.84 * -2.82%* 163.39 * 0.59%*
> Hmean send-128 324.71 ( 0.00%) 314.78 * -3.06%* 326.18 * 0.45%*
> Hmean send-256 641.55 ( 0.00%) 628.67 * -2.01%* 648.12 * 1.02%*
> Hmean send-1024 2525.28 ( 0.00%) 2448.26 * -3.05%* 2543.73 * 0.73%*
> Hmean send-2048 4836.14 ( 0.00%) 4712.08 * -2.57%* 4867.69 * 0.65%*
> Hmean send-3312 7540.83 ( 0.00%) 7425.45 * -1.53%* 7621.06 * 1.06%*
> Hmean send-4096 9124.53 ( 0.00%) 8948.82 * -1.93%* 9276.25 * 1.66%*
> Hmean send-8192 15589.67 ( 0.00%) 15486.35 * -0.66%* 15819.98 * 1.48%*
> Hmean send-16384 26386.47 ( 0.00%) 25752.25 * -2.40%* 26773.74 * 1.47%*
>
> The perf diff between nouclamp and uclamp-static-key when uclamp is
> disabled in the fast path:
>
> 8.73% -1.55% [kernel.kallsyms] [k] try_to_wake_up
> 0.07% +0.04% [kernel.kallsyms] [k] deactivate_task
> 0.13% -0.02% [kernel.kallsyms] [k] activate_task
>
> The diff between nouclamp and uclamp-static-key when uclamp is enabled
> in the fast path:
>
> 8.73% -0.72% [kernel.kallsyms] [k] try_to_wake_up
> 0.13% +0.39% [kernel.kallsyms] [k] activate_task
> 0.07% +0.38% [kernel.kallsyms] [k] deactivate_task
>
> Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
> Reported-by: Mel Gorman <[email protected]>
> Signed-off-by: Qais Yousef <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Tested-by: Lukasz Luba <[email protected]>
> Link: https://lkml.kernel.org/r/[email protected]
> Signed-off-by: Sasha Levin <[email protected]>
> ---
Greg/Peter/Mel
Should this go to 5.4 too? Not saying it should, but I don't know if distros
could care about potential performance hit that this patch addresses.
Thanks
--
Qais Yousef
On Thu, Aug 27, 2020 at 02:53:31PM +0100, Qais Yousef wrote:
> On 08/20/20 11:19, Greg Kroah-Hartman wrote:
> > From: Qais Yousef <[email protected]>
> >
> > [ Upstream commit 46609ce227039fd192e0ecc7d940bed587fd2c78 ]
> >
> > There is a report that when uclamp is enabled, a netperf UDP test
> > regresses compared to a kernel compiled without uclamp.
> >
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > While investigating the root cause, there were no sign that the uclamp
> > code is doing anything particularly expensive but could suffer from bad
> > cache behavior under certain circumstances that are yet to be
> > understood.
> >
> > https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
> >
> > To reduce the pressure on the fast path anyway, add a static key that is
> > by default will skip executing uclamp logic in the
> > enqueue/dequeue_task() fast path until it's needed.
> >
> > As soon as the user start using util clamp by:
> >
> > 1. Changing uclamp value of a task with sched_setattr()
> > 2. Modifying the default sysctl_sched_util_clamp_{min, max}
> > 3. Modifying the default cpu.uclamp.{min, max} value in cgroup
> >
> > We flip the static key now that the user has opted to use util clamp.
> > Effectively re-introducing uclamp logic in the enqueue/dequeue_task()
> > fast path. It stays on from that point forward until the next reboot.
> >
> > This should help minimize the effect of util clamp on workloads that
> > don't need it but still allow distros to ship their kernels with uclamp
> > compiled in by default.
> >
> > SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since now we can end
> > up with unbalanced call to uclamp_rq_dec_id() if we flip the key while
> > a task is running in the rq. Since we know it is harmless we just
> > quietly return if we attempt a uclamp_rq_dec_id() when
> > rq->uclamp[].bucket[].tasks is 0.
> >
> > In schedutil, we introduce a new uclamp_is_enabled() helper which takes
> > the static key into account to ensure RT boosting behavior is retained.
> >
> > The following results demonstrates how this helps on 2 Sockets Xeon E5
> > 2x10-Cores system.
> >
> > nouclamp uclamp uclamp-static-key
> > Hmean send-64 162.43 ( 0.00%) 157.84 * -2.82%* 163.39 * 0.59%*
> > Hmean send-128 324.71 ( 0.00%) 314.78 * -3.06%* 326.18 * 0.45%*
> > Hmean send-256 641.55 ( 0.00%) 628.67 * -2.01%* 648.12 * 1.02%*
> > Hmean send-1024 2525.28 ( 0.00%) 2448.26 * -3.05%* 2543.73 * 0.73%*
> > Hmean send-2048 4836.14 ( 0.00%) 4712.08 * -2.57%* 4867.69 * 0.65%*
> > Hmean send-3312 7540.83 ( 0.00%) 7425.45 * -1.53%* 7621.06 * 1.06%*
> > Hmean send-4096 9124.53 ( 0.00%) 8948.82 * -1.93%* 9276.25 * 1.66%*
> > Hmean send-8192 15589.67 ( 0.00%) 15486.35 * -0.66%* 15819.98 * 1.48%*
> > Hmean send-16384 26386.47 ( 0.00%) 25752.25 * -2.40%* 26773.74 * 1.47%*
> >
> > The perf diff between nouclamp and uclamp-static-key when uclamp is
> > disabled in the fast path:
> >
> > 8.73% -1.55% [kernel.kallsyms] [k] try_to_wake_up
> > 0.07% +0.04% [kernel.kallsyms] [k] deactivate_task
> > 0.13% -0.02% [kernel.kallsyms] [k] activate_task
> >
> > The diff between nouclamp and uclamp-static-key when uclamp is enabled
> > in the fast path:
> >
> > 8.73% -0.72% [kernel.kallsyms] [k] try_to_wake_up
> > 0.13% +0.39% [kernel.kallsyms] [k] activate_task
> > 0.07% +0.38% [kernel.kallsyms] [k] deactivate_task
> >
> > Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
> > Reported-by: Mel Gorman <[email protected]>
> > Signed-off-by: Qais Yousef <[email protected]>
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > Tested-by: Lukasz Luba <[email protected]>
> > Link: https://lkml.kernel.org/r/[email protected]
> > Signed-off-by: Sasha Levin <[email protected]>
> > ---
>
> Greg/Peter/Mel
>
> Should this go to 5.4 too? Not saying it should, but I don't know if distros
> could care about potential performance hit that this patch addresses.
If you want to provide a backported version of this to 5.4.y, that you
have tested that works properly, I will be glad to queue it up.
thanks,
greg k-h
On 08/27/20 17:55, Greg Kroah-Hartman wrote:
> On Thu, Aug 27, 2020 at 02:53:31PM +0100, Qais Yousef wrote:
> > On 08/20/20 11:19, Greg Kroah-Hartman wrote:
> > > From: Qais Yousef <[email protected]>
> > >
> > > [ Upstream commit 46609ce227039fd192e0ecc7d940bed587fd2c78 ]
> > >
> > > There is a report that when uclamp is enabled, a netperf UDP test
> > > regresses compared to a kernel compiled without uclamp.
> > >
> > > https://lore.kernel.org/lkml/[email protected]/
> > >
> > > While investigating the root cause, there were no sign that the uclamp
> > > code is doing anything particularly expensive but could suffer from bad
> > > cache behavior under certain circumstances that are yet to be
> > > understood.
> > >
> > > https://lore.kernel.org/lkml/20200616110824.dgkkbyapn3io6wik@e107158-lin/
> > >
> > > To reduce the pressure on the fast path anyway, add a static key that is
> > > by default will skip executing uclamp logic in the
> > > enqueue/dequeue_task() fast path until it's needed.
> > >
> > > As soon as the user start using util clamp by:
> > >
> > > 1. Changing uclamp value of a task with sched_setattr()
> > > 2. Modifying the default sysctl_sched_util_clamp_{min, max}
> > > 3. Modifying the default cpu.uclamp.{min, max} value in cgroup
> > >
> > > We flip the static key now that the user has opted to use util clamp.
> > > Effectively re-introducing uclamp logic in the enqueue/dequeue_task()
> > > fast path. It stays on from that point forward until the next reboot.
> > >
> > > This should help minimize the effect of util clamp on workloads that
> > > don't need it but still allow distros to ship their kernels with uclamp
> > > compiled in by default.
> > >
> > > SCHED_WARN_ON() in uclamp_rq_dec_id() was removed since now we can end
> > > up with unbalanced call to uclamp_rq_dec_id() if we flip the key while
> > > a task is running in the rq. Since we know it is harmless we just
> > > quietly return if we attempt a uclamp_rq_dec_id() when
> > > rq->uclamp[].bucket[].tasks is 0.
> > >
> > > In schedutil, we introduce a new uclamp_is_enabled() helper which takes
> > > the static key into account to ensure RT boosting behavior is retained.
> > >
> > > The following results demonstrates how this helps on 2 Sockets Xeon E5
> > > 2x10-Cores system.
> > >
> > > nouclamp uclamp uclamp-static-key
> > > Hmean send-64 162.43 ( 0.00%) 157.84 * -2.82%* 163.39 * 0.59%*
> > > Hmean send-128 324.71 ( 0.00%) 314.78 * -3.06%* 326.18 * 0.45%*
> > > Hmean send-256 641.55 ( 0.00%) 628.67 * -2.01%* 648.12 * 1.02%*
> > > Hmean send-1024 2525.28 ( 0.00%) 2448.26 * -3.05%* 2543.73 * 0.73%*
> > > Hmean send-2048 4836.14 ( 0.00%) 4712.08 * -2.57%* 4867.69 * 0.65%*
> > > Hmean send-3312 7540.83 ( 0.00%) 7425.45 * -1.53%* 7621.06 * 1.06%*
> > > Hmean send-4096 9124.53 ( 0.00%) 8948.82 * -1.93%* 9276.25 * 1.66%*
> > > Hmean send-8192 15589.67 ( 0.00%) 15486.35 * -0.66%* 15819.98 * 1.48%*
> > > Hmean send-16384 26386.47 ( 0.00%) 25752.25 * -2.40%* 26773.74 * 1.47%*
> > >
> > > The perf diff between nouclamp and uclamp-static-key when uclamp is
> > > disabled in the fast path:
> > >
> > > 8.73% -1.55% [kernel.kallsyms] [k] try_to_wake_up
> > > 0.07% +0.04% [kernel.kallsyms] [k] deactivate_task
> > > 0.13% -0.02% [kernel.kallsyms] [k] activate_task
> > >
> > > The diff between nouclamp and uclamp-static-key when uclamp is enabled
> > > in the fast path:
> > >
> > > 8.73% -0.72% [kernel.kallsyms] [k] try_to_wake_up
> > > 0.13% +0.39% [kernel.kallsyms] [k] activate_task
> > > 0.07% +0.38% [kernel.kallsyms] [k] deactivate_task
> > >
> > > Fixes: 69842cba9ace ("sched/uclamp: Add CPU's clamp buckets refcounting")
> > > Reported-by: Mel Gorman <[email protected]>
> > > Signed-off-by: Qais Yousef <[email protected]>
> > > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> > > Tested-by: Lukasz Luba <[email protected]>
> > > Link: https://lkml.kernel.org/r/[email protected]
> > > Signed-off-by: Sasha Levin <[email protected]>
> > > ---
> >
> > Greg/Peter/Mel
> >
> > Should this go to 5.4 too? Not saying it should, but I don't know if distros
> > could care about potential performance hit that this patch addresses.
>
> If you want to provide a backported version of this to 5.4.y, that you
> have tested that works properly, I will be glad to queue it up.
The conflict was simple enough to resolve. I'll test them tomorrow and post
2 patches.
Thanks!
--
Qais Yousef