2023-10-09 13:36:47

by Greg Kroah-Hartman

Subject: [PATCH 5.10 000/226] 5.10.198-rc1 review

This is the start of the stable review cycle for the 5.10.198 release.
There are 226 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.198-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 5.10.198-rc1

Florian Westphal <[email protected]>
netfilter: nftables: exthdr: fix 4-byte stack OOB write

Florian Westphal <[email protected]>
netfilter: nf_tables: fix kdoc warnings after gc rework

John David Anglin <[email protected]>
parisc: Restore __ldcw_align for PA-RISC 2.0 processors

Shay Drory <[email protected]>
RDMA/mlx5: Fix NULL string error

Bernard Metzler <[email protected]>
RDMA/siw: Fix connection failure handling

Konstantin Meskhidze <[email protected]>
RDMA/uverbs: Fix typo of sizeof argument

Leon Romanovsky <[email protected]>
RDMA/cma: Fix truncation compilation warning in make_cma_ports

Mark Zhang <[email protected]>
RDMA/cma: Initialize ib_sa_multicast structure to 0 when join

Duje Mihanović <[email protected]>
gpio: pxa: disable pinctrl calls for MMP_GPIO

Bartosz Golaszewski <[email protected]>
gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config()

Christophe JAILLET <[email protected]>
IB/mlx4: Fix the size of a buffer in add_port_entries()

Dan Carpenter <[email protected]>
of: dynamic: Fix potential memory leak in of_changeset_action()

Leon Romanovsky <[email protected]>
RDMA/core: Require admin capabilities to set system parameters

Fedor Pchelkin <[email protected]>
dm zoned: free dmz->ddev array in dmz_put_zoned_devices

Ivan Babrou <[email protected]>
cpupower: add Makefile dependencies for install targets

Xin Long <[email protected]>
sctp: update hb timer immediately after users change hb_interval

Xin Long <[email protected]>
sctp: update transport state when processing a dupcook packet

Neal Cardwell <[email protected]>
tcp: fix delayed ACKs for MSS boundary condition

Neal Cardwell <[email protected]>
tcp: fix quick-ack counting to count actual ACKs of new data

Chengfeng Ye <[email protected]>
tipc: fix a potential deadlock on &tx->lock

Ben Wolsieffer <[email protected]>
net: stmmac: dwmac-stm32: fix resume on STM32 MCU

Florian Westphal <[email protected]>
netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure

Xin Long <[email protected]>
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp

Dan Carpenter <[email protected]>
net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()

Jeremy Cline <[email protected]>
net: nfc: llcp: Add lock when modifying device list

Shigeru Yoshida <[email protected]>
net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg

Fabio Estevam <[email protected]>
net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent

David Howells <[email protected]>
ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()

Eric Dumazet <[email protected]>
net: fix possible store tearing in neigh_periodic_work()

Mauricio Faria de Oliveira <[email protected]>
modpost: add missing else to the "of" check

Trond Myklebust <[email protected]>
NFSv4: Fix a nfs4_state_manager() race

Arnd Bergmann <[email protected]>
ima: rework CONFIG_IMA dependency block

Junxiao Bi <[email protected]>
scsi: target: core: Fix deadlock due to recursive locking

Oleksandr Tymoshenko <[email protected]>
ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig

Richard Fitzgerald <[email protected]>
regmap: rbtree: Fix wrong register marked as in-cache when creating new node

Felix Fietkau <[email protected]>
wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling

Alexandra Diupina <[email protected]>
drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close()

Leon Hwang <[email protected]>
bpf: Fix tr dereferencing

Pin-yen Lin <[email protected]>
wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet

Arnd Bergmann <[email protected]>
wifi: iwlwifi: dbg_ini: fix structure packing

Zhihao Cheng <[email protected]>
ubi: Refuse attaching if mtd's erasesize is 0

Rob Herring <[email protected]>
arm64: Add Cortex-A520 CPU part definition

Jordan Rife <[email protected]>
net: prevent rewrite of msg_name in sock_sendmsg()

Qu Wenruo <[email protected]>
btrfs: reject unknown mount options early

Jordan Rife <[email protected]>
net: replace calls to sock->ops->connect() with kernel_connect()

Gustavo A. R. Silva <[email protected]>
wifi: mwifiex: Fix tlv_buf_left calculation

Gustavo A. R. Silva <[email protected]>
qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info

Dinghao Liu <[email protected]>
scsi: zfcp: Fix a double put in zfcp_port_enqueue()

Greg Kroah-Hartman <[email protected]>
Revert "PCI: qcom: Disable write access to read only registers for IP v2.3.3"

Greg Kroah-Hartman <[email protected]>
Revert "clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz"

Ming Lei <[email protected]>
block: fix use-after-free of q->q_usage_counter

Ilya Dryomov <[email protected]>
rbd: take header_rwsem in rbd_dev_refresh() only when updating

Ilya Dryomov <[email protected]>
rbd: decouple parent info read-in from updating rbd_dev

Ilya Dryomov <[email protected]>
rbd: decouple header read-in from updating rbd_dev->header

Ilya Dryomov <[email protected]>
rbd: move rbd_dev_refresh() definition

Nathan Chancellor <[email protected]>
drm/mediatek: Fix backport issue in mtk_drm_gem_prime_vmap()

Zheng Yejian <[email protected]>
ring-buffer: Fix bytes info in per_cpu buffer stats

Vlastimil Babka <[email protected]>
ring-buffer: remove obsolete comment for free_buffer_page()

Trond Myklebust <[email protected]>
NFSv4: Fix a state manager thread deadlock regression

Benjamin Coddington <[email protected]>
NFS: rename nfs_client_kset to nfs_kset

Benjamin Coddington <[email protected]>
NFS: Cleanup unused rpc_clnt variable

Johan Hovold <[email protected]>
spi: zynqmp-gqspi: fix clock imbalance on probe failure

Dinghao Liu <[email protected]>
spi: spi-zynqmp-gqspi: Fix runtime PM imbalance in zynqmp_qspi_probe

Greg Ungerer <[email protected]>
fs: binfmt_elf_efpic: fix personality for ELF-FDPIC

Matthias Schiffer <[email protected]>
ata: libata-sata: increase PMP SRST timeout to 10s

Damien Le Moal <[email protected]>
ata: libata-core: Do not register PM operations for SAS ports

Damien Le Moal <[email protected]>
ata: libata-core: Fix port and device removal

Damien Le Moal <[email protected]>
ata: libata-core: Fix ata_port_request_pm() locking

Mika Westerberg <[email protected]>
net: thunderbolt: Fix TCPv6 GSO checksum calculation

Nick Desaulniers <[email protected]>
bpf: Fix BTF_ID symbol generation collision in tools/

Jiri Olsa <[email protected]>
bpf: Fix BTF_ID symbol generation collision

Josef Bacik <[email protected]>
btrfs: properly report 0 avail for very full file systems

Steven Rostedt (Google) <[email protected]>
ring-buffer: Update "shortest_full" in polling

Ben Wolsieffer <[email protected]>
proc: nommu: /proc/<pid>/maps: release mmap read lock

Trond Myklebust <[email protected]>
Revert "SUNRPC dont update timeout value on connection reset"

Heiner Kallweit <[email protected]>
i2c: i801: unregister tco_pdev in i801_probe() error path

Niklas Cassel <[email protected]>
ata: libata-scsi: ignore reserved bits for REPORT SUPPORTED OPERATION CODES

Kailang Yang <[email protected]>
ALSA: hda: Disable power save for solving pop issue on Lenovo ThinkCentre M70q

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: disallow rule removal from chain binding

Pan Bian <[email protected]>
nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()

Andy Shevchenko <[email protected]>
serial: 8250_port: Check IRQ data before use

Daniel Starke <[email protected]>
Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"

Vishal Goel <[email protected]>
Smack:- Use overlay inode label in smack_inode_copy_up()

Roberto Sassu <[email protected]>
smack: Retrieve transmuting information in smack_inode_getsecurity()

Roberto Sassu <[email protected]>
smack: Record transmuting in smk_transmuted

Irvin Cote <[email protected]>
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev

Phil Sutter <[email protected]>
netfilter: nft_exthdr: Fix for unsafe packet data read

Phil Sutter <[email protected]>
netfilter: nft_exthdr: Search chunks in SCTP packets only

Mika Westerberg <[email protected]>
watchdog: iTCO_wdt: Set NO_REBOOT if the watchdog is not already running

Mika Westerberg <[email protected]>
watchdog: iTCO_wdt: No need to stop the timer in probe

Pratyush Yadav <[email protected]>
nvme-pci: do not set the NUMA node of device if it has none

Christoph Hellwig <[email protected]>
nvme-pci: factor out a nvme_pci_alloc_dev helper

Christoph Hellwig <[email protected]>
nvme-pci: factor the iod mempool creation into a helper

Mario Limonciello <[email protected]>
ACPI: Check StorageD3Enable _DSD property in ACPI code

Chengming Zhou <[email protected]>
cgroup: Fix suspicious rcu_dereference_check() usage warning

Chengming Zhou <[email protected]>
sched/cpuacct: Optimize away RCU read lock

Chengming Zhou <[email protected]>
sched/cpuacct: Fix charge percpu cpuusage

Andrey Ryabinin <[email protected]>
sched/cpuacct: Fix user/system in shown cpuacct.usage*

Arnaldo Carvalho de Melo <[email protected]>
perf build: Define YYNOMEM as YYNOABORT for bison < 3.81

Thomas Zimmermann <[email protected]>
fbdev/sh7760fb: Depend on FB=y

Johnathan Mantey <[email protected]>
ncsi: Propagate carrier gain/loss events to the NCSI controller

Benjamin Gray <[email protected]>
powerpc/watchpoints: Disable preemption in thread_change_pc()

Hans Verkuil <[email protected]>
media: vb2: frame_vector.c: replace WARN_ONCE with a comment

Stanislav Fomichev <[email protected]>
bpf: Clarify error expectations from bpf_clone_redirect

Han Xu <[email protected]>
spi: nxp-fspi: reset the FLSHxCR1 registers

Niklas Cassel <[email protected]>
ata: libata-eh: do not clear ATA_PFLAG_EH_PENDING in ata_eh_reset()

Michal Grzedzicki <[email protected]>
scsi: pm80xx: Avoid leaking tags when processing OPC_INB_SET_CONTROLLER_CONFIG command

Michal Grzedzicki <[email protected]>
scsi: pm80xx: Use phy-specific SAS address when sending PHY_START command

David Thompson <[email protected]>
platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig

Steven Rostedt (Google) <[email protected]>
ring-buffer: Do not attempt to read past "commit"

Ricardo B. Marliere <[email protected]>
selftests: fix dependency checker script

Zheng Yejian <[email protected]>
ring-buffer: Avoid softlockup in ring_buffer_resize()

Zheng Yejian <[email protected]>
selftests/ftrace: Correctly enable event in instance-event.tc

Javed Hasan <[email protected]>
scsi: qedf: Add synchronization between I/O completions and abort

Helge Deller <[email protected]>
parisc: irq: Make irq_stack_union static to avoid sparse warning

Helge Deller <[email protected]>
parisc: drivers: Fix sparse warning

Helge Deller <[email protected]>
parisc: iosapic.c: Fix sparse warnings

Helge Deller <[email protected]>
parisc: sba: Fix compile warning wrt list of SBA devices

Sergey Senozhatsky <[email protected]>
dma-debug: don't call __dma_entry_alloc_check_leak() under free_entries_lock

William A. Kennington III <[email protected]>
i2c: npcm7xx: Fix callback completion ordering

Wenhua Lin <[email protected]>
gpio: pmic-eic-sprd: Add can_sleep flag for PMIC EIC chip

Max Filippov <[email protected]>
xtensa: boot/lib: fix function prototypes

Randy Dunlap <[email protected]>
xtensa: boot: don't add include-dirs

Randy Dunlap <[email protected]>
xtensa: iss/network: make functions static

Max Filippov <[email protected]>
xtensa: add default definition for XCHAL_HAVE_DIV32

Dan Carpenter <[email protected]>
power: supply: ucs1002: fix error code in ucs1002_get_property()

Tony Lindgren <[email protected]>
bus: ti-sysc: Fix SYSC_QUIRK_SWSUP_SIDLE_ACT handling for uart wake-up

Tony Lindgren <[email protected]>
ARM: dts: ti: omap: motorola-mapphone: Fix abe_clkctrl warning on boot

Tony Lindgren <[email protected]>
ARM: dts: Unify pwm-omap-dmtimer node names

Gireesh Hiremath <[email protected]>
ARM: dts: am335x: Guardian: Update beeper label

Geert Uytterhoeven <[email protected]>
ARM: dts: motorola-mapphone: Drop second ti,wlcore compatible value

Carl Philipp Klemm <[email protected]>
ARM: dts: motorola-mapphone: Add 1.2GHz OPP

Tony Lindgren <[email protected]>
ARM: dts: motorola-mapphone: Configure lower temperature passive cooling

Tony Lindgren <[email protected]>
ARM: dts: ti: omap: Fix bandgap thermal cells addressing for omap3/4

Krzysztof Kozlowski <[email protected]>
ARM: dts: omap: correct indentation

Timo Alho <[email protected]>
clk: tegra: fix error return case for recalc_rate

Adam Ford <[email protected]>
bus: ti-sysc: Fix missing AM35xx SoC matching

Julien Panis <[email protected]>
bus: ti-sysc: Use fsleep() instead of usleep_range() in sysc_reset()

Christoph Hellwig <[email protected]>
MIPS: Alchemy: only build mmc support helpers if au1xmmc is enabled

Qu Wenruo <[email protected]>
btrfs: reset destination buffer when read_extent_buffer() gets invalid range

Werner Fischer <[email protected]>
ata: ahci: Add Elkhart Lake AHCI controller

Mario Limonciello <[email protected]>
ata: ahci: Rename board_ahci_mobile

Paul Menzel <[email protected]>
ata: ahci: Add support for AMD A85 FCH (Hudson D4)

Paul Menzel <[email protected]>
ata: libata: Rename link flag ATA_LFLAG_NO_DB_DELAY

Xiao Liang <[email protected]>
netfilter: nft_exthdr: Fix non-linear header modification

Florian Westphal <[email protected]>
netfilter: exthdr: add support for tcp option removal

Pablo Neira Ayuso <[email protected]>
netfilter: nft_exthdr: break evaluation if setting TCP option fails

Florian Westphal <[email protected]>
netfilter: nf_tables: add and use nft_thoff helper

Florian Westphal <[email protected]>
netfilter: nf_tables: add and use nft_sk helper

Phil Sutter <[email protected]>
netfilter: nft_exthdr: Support SCTP chunks

Jan Engelhardt <[email protected]>
netfilter: use actual socket sk for REJECT action

Konrad Dybcio <[email protected]>
media: venus: hfi_venus: Write to VIDC_CTRL_INIT after unmasking interrupts

Dikshita Agarwal <[email protected]>
media: venus: hfi: Add a 6xx boot logic

Bryan O'Donoghue <[email protected]>
media: venus: core: Add differentiator IS_V6(core)

Dikshita Agarwal <[email protected]>
media: venus: hfi: Define additional 6xx registers

Bryan O'Donoghue <[email protected]>
media: venus: hfi,pm,firmware: Convert to block relative addressing

Bryan O'Donoghue <[email protected]>
media: venus: core: Add io base variables for each block

Wolfram Sang <[email protected]>
mmc: renesas_sdhi: register irqs before registering controller

Wolfram Sang <[email protected]>
mmc: tmio: support custom irq masks

Wolfram Sang <[email protected]>
mmc: renesas_sdhi: populate SCC pointer at the proper place

Wolfram Sang <[email protected]>
mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup

Werner Sembach <[email protected]>
Input: i8042 - add quirk for TUXEDO Gemini 17 Gen1/Clevo PD70PN

Huacai Chen <[email protected]>
Input: i8042 - rename i8042-x86ia64io.h to i8042-acpipnpio.h

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: double hook unregistration in netns path

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: unregister flowtable hooks on netns exit

Xiaoke Wang <[email protected]>
i2c: mux: demux-pinctrl: check the return value of devm_kstrdup()

Christophe JAILLET <[email protected]>
gpio: tb10x: Fix an error handling path in tb10x_gpio_probe()

Artem Chernyshev <[email protected]>
net: rds: Fix possible NULL-pointer dereference

Sebastian Andrzej Siewior <[email protected]>
bnxt_en: Flush XDP for bnxt_poll_nitroa0()'s NAPI

Sebastian Andrzej Siewior <[email protected]>
locking/seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested()

Ahmed S. Darwish <[email protected]>
seqlock: Prefix internal seqcount_t-only macros with a "do_"

Peter Zijlstra <[email protected]>
seqlock: Rename __seqprop() users

Arnd Bergmann <[email protected]>
seqlock: avoid -Wshadow warnings

Jozsef Kadlecsik <[email protected]>
netfilter: ipset: Fix race between IPSET_CMD_CREATE and IPSET_CMD_SWAP

Ziyang Xuan <[email protected]>
team: fix null-ptr-deref when team device type is changed

Eric Dumazet <[email protected]>
net: bridge: use DEV_STATS_INC()

Jie Wang <[email protected]>
net: hns3: add 5ms delay before clear firmware reset irq source

Jian Shen <[email protected]>
net: hns3: only enable unicast promisc when mac table full

Josh Poimboeuf <[email protected]>
x86/srso: Fix SBPB enablement for spec_rstack_overflow=off

Josh Poimboeuf <[email protected]>
x86/srso: Fix srso_show_state() side effect

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Fail IPC send if still busy

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Don't override scu in intel_scu_ipc_dev_simple_command()

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Check status upon timeout in ipc_wait_for_interrupt()

Stephen Boyd <[email protected]>
platform/x86: intel_scu_ipc: Check status after timeout in busy_loop()

Eric Dumazet <[email protected]>
dccp: fix dccp_v4_err()/dccp_v6_err() again

Kajol Jain <[email protected]>
powerpc/perf/hv-24x7: Update domain value check

Kyle Zeng <[email protected]>
ipv4: fix null-deref in ipv4_link_failure

Ivan Vecera <[email protected]>
i40e: Fix VF VLAN offloading when port VLAN is configured

Shengjiu Wang <[email protected]>
ASoC: imx-audmix: Fix return error with devm_clk_get()

Sabrina Dubroca <[email protected]>
selftests: tls: swap the TX and RX sockets in some tests

Kees Cook <[email protected]>
selftests/tls: Add {} to avoid static checker warning

Toke Høiland-Jørgensen <[email protected]>
bpf: Avoid deadlock when using queue and stack maps from NMI

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: disallow element removal on anonymous sets

Jerome Brunet <[email protected]>
ASoC: meson: spdifin: start hw on dai probe

Florian Westphal <[email protected]>
netfilter: nf_tables: fix memleak when more than 255 elements expired

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_hash: try later when GC hits EAGAIN on iteration

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_pipapo: stop GC iteration if GC transaction allocation fails

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_rbtree: use read spinlock to avoid datapath contention

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_rbtree: skip sync GC for new elements in this transaction

Florian Westphal <[email protected]>
netfilter: nf_tables: defer gc run if previous batch is still pending

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: use correct lock to protect gc_list

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: GC transaction race with abort path

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: GC transaction race with netns dismantle

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path

Florian Westphal <[email protected]>
netfilter: nf_tables: don't fail inserts if duplicate has expired

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: remove busy mark and gc batch API

Pablo Neira Ayuso <[email protected]>
netfilter: nft_set_hash: mark set element as dead when deleting from packet path

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: adapt set backend to use GC transaction API

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: GC transaction API to avoid race with control plane

Florian Westphal <[email protected]>
netfilter: nf_tables: don't skip expired elements during walk

Pablo Neira Ayuso <[email protected]>
netfilter: nf_tables: integrate pipapo into commit protocol

Steven Rostedt (Google) <[email protected]>
tracing: Have event inject files inc the trace array ref count

Jan Kara <[email protected]>
ext4: do not let fstrim block system suspend

Jan Kara <[email protected]>
ext4: move setting of trimmed bit into ext4_try_to_trim_range()

Kemeng Shi <[email protected]>
ext4: replace the traditional ternary conditional operator with with max()/min()

Dmitry Monakhov <[email protected]>
ext4: mark group as trimmed only if it was fully scanned

Lukas Czerner <[email protected]>
ext4: change s_last_trim_minblks type to unsigned long

Lukas Bulwahn <[email protected]>
ext4: scope ret locally in ext4_try_to_trim_range()

Wang Jianchao <[email protected]>
ext4: add new helper interface ext4_try_to_trim_range()

Wang Jianchao <[email protected]>
ext4: remove the 'group' parameter of ext4_trim_extent

Szuying Chen <[email protected]>
ata: libahci: clear pending interrupt status

Hannes Reinecke <[email protected]>
ata: ahci: Drop pointless VPRINTK() calls and convert the remaining ones

Steven Rostedt (Google) <[email protected]>
tracing: Increase trace array ref count on enable and filter files

Trond Myklebust <[email protected]>
SUNRPC: Mark the cred for revalidation if the server rejects it

Trond Myklebust <[email protected]>
NFS/pNFS: Report EINVAL errors from connect() to the server

Trond Myklebust <[email protected]>
NFS: Use the correct commit info in nfs_join_page_group()


-------------

Diffstat:

Makefile | 4 +-
arch/arm/boot/dts/am335x-guardian.dts | 9 +-
arch/arm/boot/dts/am3517-evm.dts | 2 +-
arch/arm/boot/dts/logicpd-torpedo-baseboard.dtsi | 2 +-
arch/arm/boot/dts/motorola-mapphone-common.dtsi | 33 +-
arch/arm/boot/dts/omap-gpmc-smsc911x.dtsi | 6 +-
arch/arm/boot/dts/omap-gpmc-smsc9221.dtsi | 6 +-
arch/arm/boot/dts/omap3-cm-t3517.dts | 12 +-
arch/arm/boot/dts/omap3-cpu-thermal.dtsi | 3 +-
arch/arm/boot/dts/omap3-gta04.dtsi | 8 +-
arch/arm/boot/dts/omap3-ldp.dts | 2 +-
arch/arm/boot/dts/omap3-n900.dts | 40 +-
arch/arm/boot/dts/omap3-zoom3.dts | 44 +--
arch/arm/boot/dts/omap4-cpu-thermal.dtsi | 29 +-
arch/arm/boot/dts/omap443x.dtsi | 1 +
arch/arm/boot/dts/omap4460.dtsi | 1 +
arch/arm/boot/dts/omap5-cm-t54.dts | 64 +--
arch/arm64/include/asm/cputype.h | 2 +
arch/mips/alchemy/devboards/db1000.c | 4 +
arch/mips/alchemy/devboards/db1200.c | 6 +
arch/mips/alchemy/devboards/db1300.c | 4 +
arch/parisc/include/asm/ldcw.h | 36 +-
arch/parisc/include/asm/ropes.h | 3 +
arch/parisc/include/asm/spinlock_types.h | 5 -
arch/parisc/kernel/drivers.c | 2 +-
arch/parisc/kernel/irq.c | 2 +-
arch/powerpc/kernel/hw_breakpoint.c | 7 +-
arch/powerpc/perf/hv-24x7.c | 2 +-
arch/x86/kernel/cpu/bugs.c | 4 +-
arch/xtensa/boot/Makefile | 3 +-
arch/xtensa/boot/lib/zmem.c | 5 +-
arch/xtensa/include/asm/core.h | 4 +
arch/xtensa/platforms/iss/network.c | 4 +-
block/blk-core.c | 2 -
block/blk-sysfs.c | 2 +
drivers/acpi/device_pm.c | 29 ++
drivers/ata/ahci.c | 111 +++---
drivers/ata/ahci_brcm.c | 2 +-
drivers/ata/ahci_xgene.c | 4 -
drivers/ata/libahci.c | 49 +--
drivers/ata/libata-core.c | 41 +-
drivers/ata/libata-eh.c | 13 +-
drivers/ata/libata-sata.c | 2 +-
drivers/ata/libata-scsi.c | 2 +-
drivers/ata/libata-transport.c | 9 +-
drivers/ata/libata.h | 2 +
drivers/base/regmap/regcache-rbtree.c | 3 +-
drivers/block/rbd.c | 412 ++++++++++---------
drivers/bus/ti-sysc.c | 31 +-
drivers/char/agp/parisc-agp.c | 2 -
drivers/clk/imx/clk-pll14xx.c | 2 +
drivers/clk/tegra/clk-bpmp.c | 2 +-
drivers/gpio/gpio-aspeed.c | 2 +-
drivers/gpio/gpio-pmic-eic-sprd.c | 1 +
drivers/gpio/gpio-pxa.c | 1 +
drivers/gpio/gpio-tb10x.c | 6 +-
drivers/gpu/drm/mediatek/mtk_drm_gem.c | 2 +-
drivers/i2c/busses/i2c-i801.c | 1 +
drivers/i2c/busses/i2c-npcm7xx.c | 17 +-
drivers/i2c/muxes/i2c-demux-pinctrl.c | 4 +
drivers/infiniband/core/cma.c | 2 +-
drivers/infiniband/core/cma_configfs.c | 2 +-
drivers/infiniband/core/nldev.c | 1 +
drivers/infiniband/core/uverbs_main.c | 2 +-
drivers/infiniband/hw/mlx4/sysfs.c | 2 +-
drivers/infiniband/hw/mlx5/main.c | 2 +-
drivers/infiniband/sw/siw/siw_cm.c | 16 +-
.../serio/{i8042-x86ia64io.h => i8042-acpipnpio.h} | 13 +-
drivers/input/serio/i8042.h | 2 +-
drivers/md/dm-zoned-target.c | 15 +-
drivers/media/platform/qcom/venus/core.c | 12 +
drivers/media/platform/qcom/venus/core.h | 11 +
drivers/media/platform/qcom/venus/firmware.c | 28 +-
drivers/media/platform/qcom/venus/hfi_venus.c | 94 +++--
drivers/media/platform/qcom/venus/hfi_venus_io.h | 114 ++++--
drivers/media/platform/qcom/venus/pm_helpers.c | 12 +-
drivers/mmc/host/renesas_sdhi_core.c | 19 +-
drivers/mmc/host/tmio_mmc.h | 1 +
drivers/mmc/host/tmio_mmc_core.c | 8 +-
drivers/mtd/ubi/build.c | 7 +
drivers/net/dsa/mv88e6xxx/chip.c | 6 +-
drivers/net/dsa/mv88e6xxx/global1.c | 31 --
drivers/net/dsa/mv88e6xxx/global1.h | 1 -
drivers/net/dsa/mv88e6xxx/global2.c | 2 +-
drivers/net/dsa/mv88e6xxx/global2.h | 1 +
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 +
.../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 7 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 7 +-
drivers/net/ethernet/qlogic/qed/qed_ll2.h | 2 +-
drivers/net/ethernet/stmicro/stmmac/dwmac-stm32.c | 7 +-
drivers/net/ethernet/ti/am65-cpsw-nuss.c | 1 +
drivers/net/team/team.c | 10 +-
drivers/net/thunderbolt.c | 3 +-
drivers/net/usb/smsc75xx.c | 4 +-
drivers/net/wan/fsl_ucc_hdlc.c | 12 +-
drivers/net/wireless/intel/iwlwifi/fw/error-dump.h | 6 +-
.../net/wireless/marvell/mwifiex/11n_rxreorder.c | 4 +-
drivers/net/wireless/marvell/mwifiex/sta_rx.c | 16 +-
.../net/wireless/mediatek/mt76/mt76x02_eeprom.c | 7 -
drivers/net/wireless/mediatek/mt76/mt76x2/eeprom.c | 13 +-
drivers/nvme/host/pci.c | 149 +++----
drivers/of/dynamic.c | 6 +-
drivers/parisc/iosapic.c | 4 +-
drivers/parisc/iosapic_private.h | 4 +-
drivers/pci/controller/dwc/pcie-qcom.c | 2 -
drivers/platform/mellanox/Kconfig | 1 +
drivers/platform/x86/intel_scu_ipc.c | 66 ++--
drivers/power/supply/ucs1002_power.c | 3 +-
drivers/s390/scsi/zfcp_aux.c | 9 +-
drivers/scsi/pm8001/pm8001_hwi.c | 2 +-
drivers/scsi/pm8001/pm80xx_hwi.c | 4 +-
drivers/scsi/qedf/qedf_io.c | 10 +-
drivers/scsi/qedf/qedf_main.c | 7 +-
drivers/spi/spi-nxp-fspi.c | 7 +
drivers/spi/spi-zynqmp-gqspi.c | 24 +-
drivers/target/target_core_device.c | 11 +-
drivers/tty/n_gsm.c | 4 +-
drivers/tty/serial/8250/8250_port.c | 5 +-
drivers/video/fbdev/Kconfig | 2 +-
drivers/watchdog/iTCO_wdt.c | 26 +-
fs/binfmt_elf_fdpic.c | 5 +-
fs/btrfs/extent_io.c | 8 +-
fs/btrfs/super.c | 6 +-
fs/ext4/ext4.h | 2 +-
fs/ext4/mballoc.c | 138 ++++---
fs/nfs/direct.c | 8 +-
fs/nfs/flexfilelayout/flexfilelayout.c | 1 +
fs/nfs/nfs4proc.c | 4 +-
fs/nfs/nfs4state.c | 47 ++-
fs/nfs/sysfs.c | 16 +-
fs/nfs/write.c | 23 +-
fs/nilfs2/gcinode.c | 6 +-
fs/proc/task_nommu.c | 27 +-
include/linux/acpi.h | 5 +
include/linux/bpf.h | 2 +-
include/linux/btf_ids.h | 2 +-
include/linux/cgroup.h | 3 +-
include/linux/if_team.h | 2 +
include/linux/libata.h | 4 +-
include/linux/netfilter/nf_conntrack_sctp.h | 1 +
include/linux/nfs_page.h | 4 +-
include/linux/seqlock.h | 104 ++---
include/net/netfilter/ipv4/nf_reject.h | 4 +-
include/net/netfilter/ipv6/nf_reject.h | 5 +-
include/net/netfilter/nf_tables.h | 136 +++----
include/net/tcp.h | 6 +-
include/uapi/linux/bpf.h | 4 +-
include/uapi/linux/netfilter/nf_tables.h | 2 +
kernel/bpf/queue_stack_maps.c | 21 +-
kernel/dma/debug.c | 20 +-
kernel/sched/cpuacct.c | 84 ++--
kernel/trace/ring_buffer.c | 42 +-
kernel/trace/trace.c | 27 ++
kernel/trace/trace.h | 2 +
kernel/trace/trace_events.c | 6 +-
kernel/trace/trace_events_inject.c | 3 +-
mm/frame_vector.c | 6 +-
net/bridge/br_forward.c | 4 +-
net/bridge/br_input.c | 4 +-
net/core/neighbour.c | 4 +-
net/dccp/ipv4.c | 9 +-
net/dccp/ipv6.c | 9 +-
net/ipv4/netfilter/ipt_REJECT.c | 3 +-
net/ipv4/netfilter/nf_reject_ipv4.c | 6 +-
net/ipv4/netfilter/nft_reject_ipv4.c | 3 +-
net/ipv4/route.c | 4 +-
net/ipv4/tcp_input.c | 13 +
net/ipv4/tcp_output.c | 7 +-
net/ipv6/netfilter/ip6t_REJECT.c | 2 +-
net/ipv6/netfilter/nf_reject_ipv6.c | 5 +-
net/ipv6/netfilter/nft_reject_ipv6.c | 3 +-
net/l2tp/l2tp_ip6.c | 2 +-
net/ncsi/ncsi-aen.c | 5 +
net/netfilter/ipset/ip_set_core.c | 12 +-
net/netfilter/ipvs/ip_vs_sync.c | 4 +-
net/netfilter/nf_conntrack_proto_sctp.c | 43 +-
net/netfilter/nf_tables_api.c | 436 ++++++++++++++++++---
net/netfilter/nf_tables_core.c | 2 +-
net/netfilter/nf_tables_trace.c | 6 +-
net/netfilter/nft_exthdr.c | 193 ++++++++-
net/netfilter/nft_flow_offload.c | 2 +-
net/netfilter/nft_payload.c | 10 +-
net/netfilter/nft_reject_inet.c | 6 +-
net/netfilter/nft_set_hash.c | 87 ++--
net/netfilter/nft_set_pipapo.c | 115 ++++--
net/netfilter/nft_set_rbtree.c | 199 ++++++----
net/netfilter/nft_synproxy.c | 4 +-
net/netfilter/nft_tproxy.c | 4 +-
net/nfc/llcp_core.c | 2 +
net/rds/rdma_transport.c | 8 +-
net/rds/tcp_connect.c | 2 +-
net/sctp/associola.c | 3 +-
net/sctp/socket.c | 1 +
net/socket.c | 29 +-
net/sunrpc/clnt.c | 4 +-
net/tipc/crypto.c | 4 +-
scripts/mod/file2alias.c | 2 +-
security/integrity/ima/Kconfig | 21 +-
security/smack/smack.h | 1 +
security/smack/smack_lsm.c | 65 ++-
sound/pci/hda/hda_intel.c | 1 +
sound/soc/fsl/imx-audmix.c | 2 +-
sound/soc/meson/axg-spdifin.c | 49 +--
tools/include/linux/btf_ids.h | 2 +-
tools/include/uapi/linux/bpf.h | 4 +-
tools/perf/util/Build | 6 +
tools/power/cpupower/Makefile | 8 +-
tools/power/cpupower/bench/Makefile | 2 +-
.../ftrace/test.d/instances/instance-event.tc | 2 +-
tools/testing/selftests/kselftest_deps.sh | 77 +++-
tools/testing/selftests/net/tls.c | 11 +-
211 files changed, 2713 insertions(+), 1552 deletions(-)



2023-10-09 19:33:53

by Pavel Machek

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!

> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.

4.14, 4.19 and 6.1 tests ok, 5.10 seems to have problems:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1030540843

Let's see arm64_defconfig:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5254610954

...and this seems to be a real failure:

https://lava.ciplatform.org/scheduler/job/1018088

[ 62.871632] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
[ 62.874253] rcar-dmac e6700000.dma-controller: deferred probe timeout, ignoring dependency
[ 62.889345] rcar-dmac e7300000.dma-controller: deferred probe timeout, ignoring dependency
[ 62.892139] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
[ 62.901256] rcar-dmac e7310000.dma-controller: deferred probe timeout, ignoring dependency
[ 62.906431] Mem abort info:
[ 62.906438] ESR = 0x96000004
[ 62.917751] rcar-dmac ec700000.dma-controller: deferred probe timeout, ignoring dependency
[ 62.920548] EC = 0x25: DABT (current EL), IL = 32 bits
[ 62.920551] SET = 0, FnV = 0
[ 62.920554] EA = 0, S1PTW = 0
[ 62.920559] Data abort info:
[ 62.927031] renesas_sdhi_internal_dmac ee100000.mmc: mmc1 base at 0x00000000ee100000, max clock rate 200 MHz
[ 62.931976] rcar-dmac ec720000.dma-controller: deferred probe timeout, ignoring dependency
[ 62.934138] ISV = 0, ISS = 0x00000004
[ 62.934145] CM = 0, WnR = 0
[ 62.940844] ravb e6800000.ethernet: deferred probe timeout, ignoring dependency
[ 62.943210] [0000000000000018] user address but active_mm is swapper
[ 62.943221] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 62.954866] ravb e6800000.ethernet eth0: Base address at 0xe6800000, fc:28:99:92:7b:e0, IRQ 118.
[ 62.961296] Modules linked in:
[ 62.961313] CPU: 5 PID: 135 Comm: kworker/u12:2 Not tainted 5.10.198-rc1-g18c65c1b4996 #1
[ 63.007289] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
[ 63.013658] Workqueue: events_unbound async_run_entry_fn
[ 63.018971] pstate: 20000005 (nzCv daif -PAN -UAO -TCO BTYPE=--)
[ 63.024982] pc : renesas_sdhi_reset_scc+0x94/0xe0
[ 63.029681] lr : renesas_sdhi_reset_scc+0x60/0xe0
[ 63.034379] sp : ffff800012353ab0
[ 63.037688] x29: ffff800012353ab0 x28: ffff80001110b2c0
[ 63.042998] x27: 0000000000000000 x26: ffff0005c03f6e80
[ 63.048308] x25: ffff0005c11c7a90 x24: ffff0005c0822010
[ 63.053618] x23: ffff0005c0822000 x22: ffff0005c08221d0
[ 63.058928] x21: ffff0005c11c7a80 x20: 0000000000000020

Let's see bbb_defconfig:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5254611119

Fails:

https://lava.ciplatform.org/scheduler/job/1018083

bootz 0x82000000 - 0x88000000
zimage: Bad magic!
bootloader-commands timed out after 281 seconds
end: 2.4.3 bootloader-commands (duration 00:04:41) [common]

Not sure about this one.

Let's see arm_shmobile_defconfig:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5254611233

That's:

https://lava.ciplatform.org/scheduler/job/1018084

Seems similar to previous failure:

[ 2.092944] usbcore: registered new interface driver usbhid
[ 2.098710] sh_mobile_sdhi ee140000.mmc: Got CD GPIO
[ 2.103206] usbhid: USB HID core driver
[ 2.108224] sh_mobile_sdhi ee140000.mmc: Got WP GPIO
[ 2.124168] 8<--- cut here ---
[ 2.124476] sgtl5000 0-000a: sgtl5000 revision 0x11
[ 2.127222] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
[ 2.127228] pgd = (ptrval)
[ 2.140755] rcar_sound ec500000.sound: probed
[ 2.142917] [0000000c] *pgd=00000000
[ 2.147915] NET: Registered protocol family 10
[ 2.150849] Internal error: Oops: 5 [#1] SMP ARM
[ 2.155700] sh_mmcif ee200000.mmc: Chip version 0x0003, clock rate 12MHz
[ 2.159894] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.10.198-rc1-g18c65c1b4996 #1
[ 2.174486] Hardware name: Generic RZ/G1 (Flattened Device Tree)
[ 2.174540] Segment Routing with IPv6
[ 2.180501] Workqueue: events_unbound async_run_entry_fn
[ 2.184234] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 2.189455] PC is at renesas_sdhi_reset_scc+0x34/0x50
[ 2.189462] LR is at sd_ctrl_write16+0x30/0x48
[ 2.195810] NET: Registered protocol family 17
[ 2.200409] pc : [<c05da960>] lr : [<c05da754>] psr: 60000013
[ 2.200415] sp : c10a9e30 ip : 00000024 fp : c11b3cc0
[ 2.204877] can: controller area network core
[ 2.209282] r10: c11ae410 r9 : c11ae400 r8 : c18e9d48
[ 2.209288] r7 : fffffe00 r6 : c18e9d48 r5 : c18e9d40 r4 : c1970b80
[ 2.215590] NET: Registered protocol family 29
[ 2.220759] r3 : 0000000c r2 : 00000006 r1 : 00000001 r0 : 00000000
[ 2.225117] can: raw protocol
[ 2.230322] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 2.236848] can: broadcast manager protocol

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-09 20:13:44

by Pavel Machek

Subject: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Mon 2023-10-09 21:33:22, Pavel Machek wrote:
> Hi!
>
> > This is the start of the stable review cycle for the 5.10.198 release.
> > There are 226 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> > Anything received after that time might be too late.
>
> 4.14, 4.19 and 6.1 tests ok, 5.10 seems to have problems:

Guessing from stack traces, these may be relevant:

|e10d3d256 b161d8 o: 5.10| mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup
|493b70c48 d14ac6 o: 5.10| mmc: renesas_sdhi: populate SCC pointer at the proper place
|c508545f4 0d856c o: 5.10| mmc: tmio: support custom irq masks
|8df1f0639 74f45d o: 5.10| mmc: renesas_sdhi: register irqs before registering controller

Leaving below for context...

Best regards,
Pavel

> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1030540843
>
> Let's see arm64_defconfig:
>
> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5254610954
>
> ...and this seems to be a real failure:
>
> https://lava.ciplatform.org/scheduler/job/1018088
>
> [ 62.871632] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> [ 62.874253] rcar-dmac e6700000.dma-controller: deferred probe timeout, ignoring dependency
> [ 62.889345] rcar-dmac e7300000.dma-controller: deferred probe timeout, ignoring dependency
> [ 62.892139] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
> [ 62.901256] rcar-dmac e7310000.dma-controller: deferred probe timeout, ignoring dependency
> [ 62.906431] Mem abort info:
> [ 62.906438] ESR = 0x96000004
> [ 62.917751] rcar-dmac ec700000.dma-controller: deferred probe timeout, ignoring dependency
> [ 62.920548] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 62.920551] SET = 0, FnV = 0
> [ 62.920554] EA = 0, S1PTW = 0
> [ 62.920559] Data abort info:
> [ 62.927031] renesas_sdhi_internal_dmac ee100000.mmc: mmc1 base at 0x00000000ee100000, max clock rate 200 MHz
> [ 62.931976] rcar-dmac ec720000.dma-controller: deferred probe timeout, ignoring dependency
> [ 62.934138] ISV = 0, ISS = 0x00000004
> [ 62.934145] CM = 0, WnR = 0
> [ 62.940844] ravb e6800000.ethernet: deferred probe timeout, ignoring dependency
> [ 62.943210] [0000000000000018] user address but active_mm is swapper
> [ 62.943221] Internal error: Oops: 96000004 [#1] PREEMPT SMP
> [ 62.954866] ravb e6800000.ethernet eth0: Base address at 0xe6800000, fc:28:99:92:7b:e0, IRQ 118.
> [ 62.961296] Modules linked in:
> [ 62.961313] CPU: 5 PID: 135 Comm: kworker/u12:2 Not tainted 5.10.198-rc1-g18c65c1b4996 #1
> [ 63.007289] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> [ 63.013658] Workqueue: events_unbound async_run_entry_fn
> [ 63.018971] pstate: 20000005 (nzCv daif -PAN -UAO -TCO BTYPE=--)
> [ 63.024982] pc : renesas_sdhi_reset_scc+0x94/0xe0
> [ 63.029681] lr : renesas_sdhi_reset_scc+0x60/0xe0
> [ 63.034379] sp : ffff800012353ab0
> [ 63.037688] x29: ffff800012353ab0 x28: ffff80001110b2c0
> [ 63.042998] x27: 0000000000000000 x26: ffff0005c03f6e80
> [ 63.048308] x25: ffff0005c11c7a90 x24: ffff0005c0822010
> [ 63.053618] x23: ffff0005c0822000 x22: ffff0005c08221d0
> [ 63.058928] x21: ffff0005c11c7a80 x20: 0000000000000020
>
> Let's see bbb_defconfig:
>
> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5254611119
>
> Fails:
>
> https://lava.ciplatform.org/scheduler/job/1018083
>
> bootz 0x82000000 - 0x88000000
> zimage: Bad magic!
> bootloader-commands timed out after 281 seconds
> end: 2.4.3 bootloader-commands (duration 00:04:41) [common]
>
> Not sure about this one.
>
> Let's see arm_shmobile_defconfig:
>
> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5254611233
>
> That's:
>
> https://lava.ciplatform.org/scheduler/job/1018084
>
> Seems similar to previous failure:
>
> [ 2.092944] usbcore: registered new interface driver usbhid
> [ 2.098710] sh_mobile_sdhi ee140000.mmc: Got CD GPIO
> [ 2.103206] usbhid: USB HID core driver
> [ 2.108224] sh_mobile_sdhi ee140000.mmc: Got WP GPIO
> [ 2.124168] 8<--- cut here ---
> [ 2.124476] sgtl5000 0-000a: sgtl5000 revision 0x11
> [ 2.127222] Unable to handle kernel NULL pointer dereference at virtual address 0000000c
> [ 2.127228] pgd = (ptrval)
> [ 2.140755] rcar_sound ec500000.sound: probed
> [ 2.142917] [0000000c] *pgd=00000000
> [ 2.147915] NET: Registered protocol family 10
> [ 2.150849] Internal error: Oops: 5 [#1] SMP ARM
> [ 2.155700] sh_mmcif ee200000.mmc: Chip version 0x0003, clock rate 12MHz
> [ 2.159894] CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.10.198-rc1-g18c65c1b4996 #1
> [ 2.174486] Hardware name: Generic RZ/G1 (Flattened Device Tree)
> [ 2.174540] Segment Routing with IPv6
> [ 2.180501] Workqueue: events_unbound async_run_entry_fn
> [ 2.184234] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> [ 2.189455] PC is at renesas_sdhi_reset_scc+0x34/0x50
> [ 2.189462] LR is at sd_ctrl_write16+0x30/0x48
> [ 2.195810] NET: Registered protocol family 17
> [ 2.200409] pc : [<c05da960>] lr : [<c05da754>] psr: 60000013
> [ 2.200415] sp : c10a9e30 ip : 00000024 fp : c11b3cc0
> [ 2.204877] can: controller area network core
> [ 2.209282] r10: c11ae410 r9 : c11ae400 r8 : c18e9d48
> [ 2.209288] r7 : fffffe00 r6 : c18e9d48 r5 : c18e9d40 r4 : c1970b80
> [ 2.215590] NET: Registered protocol family 29
> [ 2.220759] r3 : 0000000c r2 : 00000006 r1 : 00000001 r0 : 00000000
> [ 2.225117] can: raw protocol
> [ 2.230322] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 2.236848] can: broadcast manager protocol
>
> Best regards,
> Pavel



--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-09 20:25:11

by Florian Fainelli

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On 10/9/23 05:59, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.198-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on
BMIPS_GENERIC:

Tested-by: Florian Fainelli <[email protected]>
--
Florian

2023-10-09 22:57:58

by Shuah Khan

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On 10/9/23 06:59, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.198-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2023-10-10 09:57:37

by Jon Hunter

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Mon, 09 Oct 2023 14:59:21 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.198-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

All tests passing for Tegra ...

Test results for stable-v5.10:
10 builds: 10 pass, 0 fail
26 boots: 26 pass, 0 fail
68 tests: 68 pass, 0 fail

Linux version: 5.10.198-rc1-g18c65c1b4996
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000,
tegra20-ventana, tegra210-p2371-2180,
tegra210-p3450-0000, tegra30-cardhu-a04

Tested-by: Jon Hunter <[email protected]>

Jon

2023-10-10 11:18:35

by Pavel Machek

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!

> > > This is the start of the stable review cycle for the 5.10.198 release.
> > > There are 226 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> > > Anything received after that time might be too late.
> >
> > 4.14, 4.19 and 6.1 tests ok, 5.10 seems to have problems:
>
> Guessing from stack traces, these may be relevant:

So bisection reveals these are relevant:

|e10d3d256 b161d8 o: 5.10| mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup

Ok

|493b70c48 d14ac6 o: 5.10| mmc: renesas_sdhi: populate SCC pointer at the proper place

Testing now: https://gitlab.com/cip-project/cip-kernel/linux-cip/-/pipelines/1031822035

|c508545f4 0d856c o: 5.10| mmc: tmio: support custom irq masks
|8df1f0639 74f45d o: 5.10| mmc: renesas_sdhi: register irqs before registering controller

Fail: https://gitlab.com/cip-project/cip-kernel/linux-cip/-/pipelines/1031786077

I should be able to pinpoint the specific commit with two more tests.
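
The "two more tests" estimate follows from plain bisection over the remaining candidates. A minimal sketch, assuming the commits are listed oldest first, that the last one is the known-failing end, and that every commit after the culprit also fails. The `is_bad` callback is a hypothetical stand-in for a real CI pipeline run, and the culprit picked in the simulation is only illustrative:

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Binary-search for the first failing commit.

    Assumes `commits` is ordered oldest to newest, that the last entry is
    already known to fail, and that every commit after the culprit fails too.
    Returns the culprit and the number of test runs that were needed.
    """
    lo, hi = 0, len(commits) - 1
    runs = 0
    while lo < hi:
        mid = (lo + hi) // 2
        runs += 1                 # one CI pipeline per probe
        if is_bad(commits[mid]):
            hi = mid              # culprit is mid or earlier
        else:
            lo = mid + 1          # culprit is after mid
    return commits[lo], runs


# Candidates still open once the first commit in the list tested "Ok".
candidates = ["493b70c48", "c508545f4", "8df1f0639"]

# Hypothetical outcome: pretend everything from this commit on fails.
pretend_bad_from = "493b70c48"


def simulated_ci(commit: str) -> bool:
    """Stand-in for a real pipeline; not part of any actual CI setup."""
    return candidates.index(commit) >= candidates.index(pretend_bad_from)


culprit, runs = first_bad(candidates, simulated_ci)
print(f"first bad commit: {culprit} after {runs} test runs")
```

With three open candidates the loop never needs more than two probes, which matches the estimate.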

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-10 12:06:28

by Pavel Machek

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!
> > > > This is the start of the stable review cycle for the 5.10.198 release.
> > > > There are 226 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> > > > Anything received after that time might be too late.
> > >
> > > 4.14, 4.19 and 6.1 tests ok, 5.10 seems to have problems:
> >
> > Guessing from stack traces, these may be relevant:
>
> So bisection reveals these are relevant:
>
> |e10d3d256 b161d8 o: 5.10| mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup
>
> Ok
>
> |493b70c48 d14ac6 o: 5.10| mmc: renesas_sdhi: populate SCC pointer at the proper place
>
> Testing now: https://gitlab.com/cip-project/cip-kernel/linux-cip/-/pipelines/1031822035

And testing failed. So

commit f5799b4e142884c2e7aa99f813113af4a3395ffb
Author: Wolfram Sang <[email protected]>
Date: Tue Nov 10 15:20:57 2020 +0100

mmc: renesas_sdhi: populate SCC pointer at the proper place

[ Upstream commit d14ac691bb6f6ebaa7eeec21ca04dd47300ff5b6 ]

seems to be the buggy commit that breaks renesas boards in 5.10.

> |c508545f4 0d856c o: 5.10| mmc: tmio: support custom irq masks

Testing too: https://gitlab.com/cip-project/cip-kernel/linux-cip/-/pipelines/1031834627

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-10 18:19:14

by Guenter Roeck

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Mon, Oct 09, 2023 at 02:59:21PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 159 pass: 159 fail: 0
Qemu test results:
total: 495 pass: 495 fail: 0

Tested-by: Guenter Roeck <[email protected]>

Guenter

2023-10-10 19:07:28

by Wolfram Sang

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi Pavel,

> And testing failed. So
>
> commit f5799b4e142884c2e7aa99f813113af4a3395ffb
> Author: Wolfram Sang <[email protected]>
> Date: Tue Nov 10 15:20:57 2020 +0100
>
> mmc: renesas_sdhi: populate SCC pointer at the proper place
>
> [ Upstream commit d14ac691bb6f6ebaa7eeec21ca04dd47300ff5b6 ]
>
> seems to be the buggy commit that breaks renesas boards in 5.10.

This patch was part of a series. Did the other two patches come with it?

b161d87dfd3d ("mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup")
45bffc371fef ("mmc: renesas_sdhi: only reset SCC when its pointer is populated")

If not, I could imagine that could lead to a crash. No idea why only
with 5.10, though.
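
The two fault addresses reported earlier in the thread (0x18 on arm64, 0x0c on 32-bit ARM) are at least consistent with a small register offset being added to a NULL base pointer, which is what an unpopulated SCC pointer would produce. A rough sketch of that arithmetic; the register index 0x006 is taken from r2 in the 32-bit dump, while the shift values and the helper name `fault_address` are assumptions chosen for illustration, not something read out of the driver here:

```python
def fault_address(base: int, reg: int, bus_shift: int) -> int:
    """Address touched when register index `reg` is scaled by `bus_shift`
    and added to the mapping base."""
    return base + (reg << bus_shift)


NULL_BASE = 0      # an unpopulated (NULL) SCC base pointer
REG_INDEX = 0x006  # value seen in r2 of the 32-bit ARM register dump

# The shift per platform is an assumption picked to match the two reports.
for name, shift in (("arm64", 2), ("arm", 1)):
    print(f"{name}: {fault_address(NULL_BASE, REG_INDEX, shift):#x}")
```

If that reading is right, both oopses are the same bug seen through different bus widths, which would fit Pavel's "seems similar to previous failure" remark.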

Happy hacking,

Wolfram



2023-10-10 19:16:03

by Greg Kroah-Hartman

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Tue, Oct 10, 2023 at 09:07:01PM +0200, Wolfram Sang wrote:
> Hi Pavel,
>
> > And testing failed. So
> >
> > commit f5799b4e142884c2e7aa99f813113af4a3395ffb
> > Author: Wolfram Sang <[email protected]>
> > Date: Tue Nov 10 15:20:57 2020 +0100
> >
> > mmc: renesas_sdhi: populate SCC pointer at the proper place
> >
> > [ Upstream commit d14ac691bb6f6ebaa7eeec21ca04dd47300ff5b6 ]
> >
> > seems to be the buggy commit that breaks renesas boards in 5.10.
>
> This patch was part of a series. Did the other two patches come with it?
>
> b161d87dfd3d ("mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup")

Yes.

> 45bffc371fef ("mmc: renesas_sdhi: only reset SCC when its pointer is populated")

No :(

> If not, I could imagine that could lead to a crash. No idea why only
> with 5.10, though.

The above commit is only in 5.11, so newer kernels should be fine.

I'll go queue up the one missing patch now, thanks.
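
The check done by hand above amounts to a set difference between the upstream series and what actually landed in the stable queue. A minimal sketch using only the hashes and subjects quoted in this thread; the helper name `missing_from_backport` is made up for illustration:

```python
# Upstream commits named in the thread as belonging to one series.
series = {
    "d14ac691bb6f": "mmc: renesas_sdhi: populate SCC pointer at the proper place",
    "b161d87dfd3d": "mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup",
    "45bffc371fef": "mmc: renesas_sdhi: only reset SCC when its pointer is populated",
}

# Upstream commits that were confirmed to be part of 5.10.198-rc1.
backported = {"d14ac691bb6f", "b161d87dfd3d"}


def missing_from_backport(series: dict, backported: set) -> list:
    """Return series commits that were not backported, in series order."""
    return [sha for sha in series if sha not in backported]


for sha in missing_from_backport(series, backported):
    print(f'missing: {sha} ("{series[sha]}")')
```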

greg k-h

2023-10-11 01:40:44

by Naresh Kamboju

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Mon, 9 Oct 2023 at 19:06, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.198-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h


Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <[email protected]>

## Build
* kernel: 5.10.198-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-5.10.y
* git commit: 18c65c1b4996e3f6f8986a05eceaf427355a7a4f
* git describe: v5.10.197-227-g18c65c1b4996
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.10.y/build/v5.10.197-227-g18c65c1b4996

## Test Regressions (compared to v5.10.197)

## Metric Regressions (compared to v5.10.197)

## Test Fixes (compared to v5.10.197)

## Metric Fixes (compared to v5.10.197)

## Test result summary
total: 92938, pass: 73992, fail: 2587, skip: 16291, xfail: 68

## Build Summary
* arc: 5 total, 5 passed, 0 failed
* arm: 115 total, 115 passed, 0 failed
* arm64: 42 total, 42 passed, 0 failed
* i386: 34 total, 34 passed, 0 failed
* mips: 27 total, 26 passed, 1 failed
* parisc: 4 total, 0 passed, 4 failed
* powerpc: 26 total, 25 passed, 1 failed
* riscv: 12 total, 11 passed, 1 failed
* s390: 12 total, 12 passed, 0 failed
* sh: 14 total, 12 passed, 2 failed
* sparc: 8 total, 8 passed, 0 failed
* x86_64: 38 total, 38 passed, 0 failed
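
The per-architecture lines above can be tallied mechanically, which makes it easier to see which architectures had build failures at all. A small sketch that parses exactly the lines shown in this report; the regular expression and the `tally` helper are illustrative and not part of any LKFT tooling:

```python
import re

BUILD_SUMMARY = """\
* arc: 5 total, 5 passed, 0 failed
* arm: 115 total, 115 passed, 0 failed
* arm64: 42 total, 42 passed, 0 failed
* i386: 34 total, 34 passed, 0 failed
* mips: 27 total, 26 passed, 1 failed
* parisc: 4 total, 0 passed, 4 failed
* powerpc: 26 total, 25 passed, 1 failed
* riscv: 12 total, 11 passed, 1 failed
* s390: 12 total, 12 passed, 0 failed
* sh: 14 total, 12 passed, 2 failed
* sparc: 8 total, 8 passed, 0 failed
* x86_64: 38 total, 38 passed, 0 failed
"""

ROW = re.compile(r"^\* (\w+): (\d+) total, (\d+) passed, (\d+) failed$",
                 re.MULTILINE)


def tally(text: str):
    """Sum the build counts and list architectures with failed builds."""
    total = passed = failed = 0
    failing = []
    for arch, t, p, f in ROW.findall(text):
        total += int(t)
        passed += int(p)
        failed += int(f)
        if int(f) > 0:
            failing.append(arch)
    return total, passed, failed, failing


total, passed, failed, failing = tally(BUILD_SUMMARY)
print(f"{total} builds, {passed} passed, {failed} failed")
print("architectures with failures:", ", ".join(failing))
```

The architectures with failed builds are all outside the four that the "no regressions" statement covers.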

## Test suites summary
* boot
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-drivers-dma-buf
* kselftest-efivarfs
* kselftest-exec
* kselftest-filesystems
* kselftest-filesystems-binderfs
* kselftest-filesystems-epoll
* kselftest-firmware
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-sigaltstack
* kselftest-size
* kselftest-tc-testing
* kselftest-timens
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-user_events
* kselftest-vDSO
* kselftest-vm
* kselftest-watchdog
* kselftest-x86
* kselftest-zram
* kunit
* kvm-unit-tests
* libgpiod
* log-parser-boot
* log-parser-test
* ltp-cap_bounds
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-filecaps
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-fsx
* ltp-hugetlb
* ltp-io
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-securebits
* ltp-smoke
* ltp-syscalls
* ltp-tracing
* network-basic-tests
* perf
* rcutorture
* v4l2-compliance

--
Linaro LKFT
https://lkft.linaro.org

2023-10-11 09:19:09

by Pavel Machek

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!

> Tested on arm64 and x86 for 5.10.198-rc1,
>
> Kernel repo:https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> Branch: linux-5.10.y
> Version: 5.10.198-rc1
> Commit: 18c65c1b4996e3f6f8986a05eceaf427355a7a4f
> Compiler: gcc version 7.3.0 (GCC)
>
> arm64:
> --------------------------------------------------------------------
> Testcase Result Summary:
> total: 9023
> passed: 9023
> failed: 0
> timeout: 0
> --------------------------------------------------------------------
>
> x86:
> --------------------------------------------------------------------
> Testcase Result Summary:
> total: 9023
> passed: 9023
> failed: 0
> timeout: 0
> --------------------------------------------------------------------
> Tested-by: Hulk Robot <[email protected]>

Thanks for the testing. Please avoid top-posting and remove irrelevant
content when replying. (Yes, I actually scrolled 900 lines to see
there's nothing to see there).

Best regards,
Pavel

--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-11 09:38:56

by Pavel Machek

Subject: Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!

> This is the start of the stable review cycle for the 5.10.198 release.
> There are 226 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 11 Oct 2023 13:00:55 +0000.
> Anything received after that time might be too late.

Now I'm confused. a8d8122 seems to be labeled as 5.10.198 (not rc),
and it was released early.

It is broken w.r.t. Renesas hw, as reported before.

Best regards,
Pavel

--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-24 19:22:37

by Pavel Machek

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!

> > > And testing failed. So
> > >
> > > commit f5799b4e142884c2e7aa99f813113af4a3395ffb
> > > Author: Wolfram Sang <[email protected]>
> > > Date: Tue Nov 10 15:20:57 2020 +0100
> > >
> > > mmc: renesas_sdhi: populate SCC pointer at the proper place
> > >
> > > [ Upstream commit d14ac691bb6f6ebaa7eeec21ca04dd47300ff5b6 ]
> > >
> > > seems to be the buggy commit that breaks renesas boards in 5.10.
> >
> > This patch was part of a series. Did the other two patches come with it?
> >
> > b161d87dfd3d ("mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup")
>
> Yes.
>
> > 45bffc371fef ("mmc: renesas_sdhi: only reset SCC when its pointer is populated")
>
> No :(
>
> > If not, I could imagine that could lead to a crash. No idea why only
> > with 5.10, though.
>
> The above commit is only in 5.11, so newer kernels should be fine.
>
> I'll go queue up the one missing patch now, thanks.

Thank you. Patch indeed appears to be in 5.10.199.

But we still have failures on Renesas with 5.10.199-rc2:

https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849

And they still happen during MMC init:

[ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
[ 2.638846] INFO: trying to register non-static key.
[ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
[ 2.649066] The code is fine but needs lockdep annotation, or maybe
[ 2.649069] you didn't initialize this object before use?
[ 2.649071] turning off the locking correctness validator.
[ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
[ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
[ 2.649086] Call trace:
[ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
[ 2.661354] dump_backtrace+0x0/0x194
[ 2.661361] show_stack+0x14/0x20
[ 2.667430] usbcore: registered new interface driver usbhid
[ 2.672230] dump_stack+0xe8/0x130
[ 2.672238] register_lock_class+0x480/0x514
[ 2.672244] __lock_acquire+0x74/0x20ec
[ 2.681113] usbhid: USB HID core driver
[ 2.687450] lock_acquire+0x218/0x350
[ 2.687456] _raw_spin_lock+0x58/0x80
[ 2.687464] tmio_mmc_irq+0x410/0x9ac
[ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
[ 2.744936] __handle_irq_event_percpu+0xbc/0x340
[ 2.749635] handle_irq_event+0x60/0x100
[ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
[ 2.757644] __handle_domain_irq+0x7c/0xdc
[ 2.761736] efi_header_end+0x4c/0xd0
[ 2.765393] el1_irq+0xcc/0x180
[ 2.768530] arch_cpu_idle+0x14/0x2c
[ 2.772100] default_idle_call+0x58/0xe4
[ 2.776019] do_idle+0x244/0x2c0
[ 2.779242] cpu_startup_entry+0x20/0x6c
[ 2.783160] rest_init+0x164/0x28c
[ 2.786561] arch_call_rest_init+0xc/0x14
[ 2.790565] start_kernel+0x4c4/0x4f8
[ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
[ 2.803011] Mem abort info:

from https://lava.ciplatform.org/scheduler/job/1025535
from
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .

Is there something else missing?

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-25 10:48:21

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi Pavel,

On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> > > > And testing failed. So
> > > >
> > > > commit f5799b4e142884c2e7aa99f813113af4a3395ffb
> > > > Author: Wolfram Sang <[email protected]>
> > > > Date: Tue Nov 10 15:20:57 2020 +0100
> > > >
> > > > mmc: renesas_sdhi: populate SCC pointer at the proper place
> > > >
> > > > [ Upstream commit d14ac691bb6f6ebaa7eeec21ca04dd47300ff5b6 ]
> > > >
> > > > seems to be the buggy commit that breaks renesas boards in 5.10.
> > >
> > > This patch was part of a series. Did the other two patches come with it?
> > >
> > > b161d87dfd3d ("mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup")
> >
> > Yes.
> >
> > > 45bffc371fef ("mmc: renesas_sdhi: only reset SCC when its pointer is populated")
> >
> > No :(
> >
> > > If not, I could imagine that could lead to a crash. No idea why only
> > > with 5.10, though.
> >
> > The above commit is only in 5.11, so newer kernels should be fine.
> >
> > I'll go queue up the one missing patch now, thanks.
>
> Thank you. Patch indeed appears to be in 5.10.199.
>
> But we still have failures on Renesas with 5.10.199-rc2:
>
> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
>
> And they still happen during MMC init:
>
> [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> [ 2.638846] INFO: trying to register non-static key.
> [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> [ 2.649069] you didn't initialize this object before use?
> [ 2.649071] turning off the locking correctness validator.
> [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> [ 2.649086] Call trace:
> [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> [ 2.661354] dump_backtrace+0x0/0x194
> [ 2.661361] show_stack+0x14/0x20
> [ 2.667430] usbcore: registered new interface driver usbhid
> [ 2.672230] dump_stack+0xe8/0x130
> [ 2.672238] register_lock_class+0x480/0x514
> [ 2.672244] __lock_acquire+0x74/0x20ec
> [ 2.681113] usbhid: USB HID core driver
> [ 2.687450] lock_acquire+0x218/0x350
> [ 2.687456] _raw_spin_lock+0x58/0x80
> [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> [ 2.749635] handle_irq_event+0x60/0x100
> [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> [ 2.757644] __handle_domain_irq+0x7c/0xdc
> [ 2.761736] efi_header_end+0x4c/0xd0
> [ 2.765393] el1_irq+0xcc/0x180
> [ 2.768530] arch_cpu_idle+0x14/0x2c
> [ 2.772100] default_idle_call+0x58/0xe4
> [ 2.776019] do_idle+0x244/0x2c0
> [ 2.779242] cpu_startup_entry+0x20/0x6c
> [ 2.783160] rest_init+0x164/0x28c
> [ 2.786561] arch_call_rest_init+0xc/0x14
> [ 2.790565] start_kernel+0x4c4/0x4f8
> [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> [ 2.803011] Mem abort info:
>
> from https://lava.ciplatform.org/scheduler/job/1025535
> from
> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
>
> Is there something else missing?

I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-10-25 10:53:46

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi Pavel,

On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
<[email protected]> wrote:
> On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> > > > > And testing failed. So
> > > > >
> > > > > commit f5799b4e142884c2e7aa99f813113af4a3395ffb
> > > > > Author: Wolfram Sang <[email protected]>
> > > > > Date: Tue Nov 10 15:20:57 2020 +0100
> > > > >
> > > > > mmc: renesas_sdhi: populate SCC pointer at the proper place
> > > > >
> > > > > [ Upstream commit d14ac691bb6f6ebaa7eeec21ca04dd47300ff5b6 ]
> > > > >
> > > > > seems to be the buggy commit that breaks renesas boards in 5.10.
> > > >
> > > > This patch was part of a series. Did the other two patches come with it?
> > > >
> > > > b161d87dfd3d ("mmc: renesas_sdhi: probe into TMIO after SCC parameters have been setup")
> > >
> > > Yes.
> > >
> > > > 45bffc371fef ("mmc: renesas_sdhi: only reset SCC when its pointer is populated")
> > >
> > > No :(
> > >
> > > > If not, I could imagine that could lead to a crash. No idea why only
> > > > with 5.10, though.
> > >
> > > The above commit is only in 5.11, so newer kernels should be fine.
> > >
> > > I'll go queue up the one missing patch now, thanks.
> >
> > Thank you. Patch indeed appears to be in 5.10.199.
> >
> > But we still have failures on Renesas with 5.10.199-rc2:
> >
> > https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
> >
> > And they still happened during MMC init:
> >
> > [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> > [ 2.638846] INFO: trying to register non-static key.
> > [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> > [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> > [ 2.649069] you didn't initialize this object before use?
> > [ 2.649071] turning off the locking correctness validator.
> > [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> > [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> > [ 2.649086] Call trace:
> > [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> > [ 2.661354] dump_backtrace+0x0/0x194
> > [ 2.661361] show_stack+0x14/0x20
> > [ 2.667430] usbcore: registered new interface driver usbhid
> > [ 2.672230] dump_stack+0xe8/0x130
> > [ 2.672238] register_lock_class+0x480/0x514
> > [ 2.672244] __lock_acquire+0x74/0x20ec
> > [ 2.681113] usbhid: USB HID core driver
> > [ 2.687450] lock_acquire+0x218/0x350
> > [ 2.687456] _raw_spin_lock+0x58/0x80
> > [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> > [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> > [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> > [ 2.749635] handle_irq_event+0x60/0x100
> > [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> > [ 2.757644] __handle_domain_irq+0x7c/0xdc
> > [ 2.761736] efi_header_end+0x4c/0xd0
> > [ 2.765393] el1_irq+0xcc/0x180
> > [ 2.768530] arch_cpu_idle+0x14/0x2c
> > [ 2.772100] default_idle_call+0x58/0xe4
> > [ 2.776019] do_idle+0x244/0x2c0
> > [ 2.779242] cpu_startup_entry+0x20/0x6c
> > [ 2.783160] rest_init+0x164/0x28c
> > [ 2.786561] arch_call_rest_init+0xc/0x14
> > [ 2.790565] start_kernel+0x4c4/0x4f8
> > [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> > [ 2.803011] Mem abort info:
> >
> > from https://lava.ciplatform.org/scheduler/job/1025535
> > from
> > https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
> >
> > Is there something else missing?
>
> I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
> seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
> with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.

Sorry, I looked at the wrong log on R-Car M3-W.
I do see the issue with v5.10.198, but not with v5.10.199.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]


2023-10-25 12:36:15

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi Pavel,

On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
<[email protected]> wrote:
> On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
> <[email protected]> wrote:
> > On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> > > But we still have failures on Renesas with 5.10.199-rc2:
> > >
> > > https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
> > >
> > > And they still happened during MMC init:
> > >
> > > [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> > > [ 2.638846] INFO: trying to register non-static key.
> > > [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> > > [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> > > [ 2.649069] you didn't initialize this object before use?
> > > [ 2.649071] turning off the locking correctness validator.
> > > [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> > > [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> > > [ 2.649086] Call trace:
> > > [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> > > [ 2.661354] dump_backtrace+0x0/0x194
> > > [ 2.661361] show_stack+0x14/0x20
> > > [ 2.667430] usbcore: registered new interface driver usbhid
> > > [ 2.672230] dump_stack+0xe8/0x130
> > > [ 2.672238] register_lock_class+0x480/0x514
> > > [ 2.672244] __lock_acquire+0x74/0x20ec
> > > [ 2.681113] usbhid: USB HID core driver
> > > [ 2.687450] lock_acquire+0x218/0x350
> > > [ 2.687456] _raw_spin_lock+0x58/0x80
> > > [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> > > [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> > > [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> > > [ 2.749635] handle_irq_event+0x60/0x100
> > > [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> > > [ 2.757644] __handle_domain_irq+0x7c/0xdc
> > > [ 2.761736] efi_header_end+0x4c/0xd0
> > > [ 2.765393] el1_irq+0xcc/0x180
> > > [ 2.768530] arch_cpu_idle+0x14/0x2c
> > > [ 2.772100] default_idle_call+0x58/0xe4
> > > [ 2.776019] do_idle+0x244/0x2c0
> > > [ 2.779242] cpu_startup_entry+0x20/0x6c
> > > [ 2.783160] rest_init+0x164/0x28c
> > > [ 2.786561] arch_call_rest_init+0xc/0x14
> > > [ 2.790565] start_kernel+0x4c4/0x4f8
> > > [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> > > [ 2.803011] Mem abort info:
> > >
> > > from https://lava.ciplatform.org/scheduler/job/1025535
> > > from
> > > https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
> > >
> > > Is there something else missing?
> >
> > I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
> > seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
> > with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
>
> Sorry, I looked at the wrong log on R-Car M3-W.
> I do see the issue with v5.10.198, but not with v5.10.199.

It seems to be an intermittent issue. Investigating...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]


2023-10-25 17:07:05

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven <[email protected]> wrote:
> On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
> <[email protected]> wrote:
> > On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
> > <[email protected]> wrote:
> > > On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> > > > But we still have failures on Renesas with 5.10.199-rc2:
> > > >
> > > > https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
> > > >
> > > > And they still happened during MMC init:
> > > >
> > > > [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> > > > [ 2.638846] INFO: trying to register non-static key.
> > > > [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> > > > [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> > > > [ 2.649069] you didn't initialize this object before use?
> > > > [ 2.649071] turning off the locking correctness validator.
> > > > [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> > > > [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> > > > [ 2.649086] Call trace:
> > > > [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> > > > [ 2.661354] dump_backtrace+0x0/0x194
> > > > [ 2.661361] show_stack+0x14/0x20
> > > > [ 2.667430] usbcore: registered new interface driver usbhid
> > > > [ 2.672230] dump_stack+0xe8/0x130
> > > > [ 2.672238] register_lock_class+0x480/0x514
> > > > [ 2.672244] __lock_acquire+0x74/0x20ec
> > > > [ 2.681113] usbhid: USB HID core driver
> > > > [ 2.687450] lock_acquire+0x218/0x350
> > > > [ 2.687456] _raw_spin_lock+0x58/0x80
> > > > [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> > > > [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> > > > [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> > > > [ 2.749635] handle_irq_event+0x60/0x100
> > > > [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> > > > [ 2.757644] __handle_domain_irq+0x7c/0xdc
> > > > [ 2.761736] efi_header_end+0x4c/0xd0
> > > > [ 2.765393] el1_irq+0xcc/0x180
> > > > [ 2.768530] arch_cpu_idle+0x14/0x2c
> > > > [ 2.772100] default_idle_call+0x58/0xe4
> > > > [ 2.776019] do_idle+0x244/0x2c0
> > > > [ 2.779242] cpu_startup_entry+0x20/0x6c
> > > > [ 2.783160] rest_init+0x164/0x28c
> > > > [ 2.786561] arch_call_rest_init+0xc/0x14
> > > > [ 2.790565] start_kernel+0x4c4/0x4f8
> > > > [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> > > > [ 2.803011] Mem abort info:
> > > >
> > > > from https://lava.ciplatform.org/scheduler/job/1025535
> > > > from
> > > > https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
> > > >
> > > > Is there something else missing?
> > >
> > > I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
> > > seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
> > > with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
> >
> > Sorry, I looked at the wrong log on R-Car M3-W.
> > I do see the issue with v5.10.198, but not with v5.10.199.
>
> It seems to be an intermittent issue. Investigating...

After spending too much time on bisecting, the bad guy turns out to
be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
registering controller") in v5.10.198.

Adding debug information shows the lock is mmc_host.lock.

It is definitely initialized:

renesas_sdhi_probe()
{
        ...
        tmio_mmc_host_alloc()
            mmc_alloc_host
                spin_lock_init(&host->lock);
        ...
        devm_request_irq()
            -> tmio_mmc_irq
                tmio_mmc_cmd_irq()
                    spin_lock(&host->lock);
        ...
}

That leaves us with a missing lockdep annotation?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]


2023-10-25 18:41:18

by Guenter Roeck

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On 10/25/23 10:05, Geert Uytterhoeven wrote:
> On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven <[email protected]> wrote:
>> On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
>> <[email protected]> wrote:
>>> On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
>>> <[email protected]> wrote:
>>>> On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
>>>>> But we still have failures on Renesas with 5.10.199-rc2:
>>>>>
>>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
>>>>>
>>>>> And they still happened during MMC init:
>>>>>
>>>>> [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
>>>>> [ 2.638846] INFO: trying to register non-static key.
>>>>> [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
>>>>> [ 2.649066] The code is fine but needs lockdep annotation, or maybe
>>>>> [ 2.649069] you didn't initialize this object before use?
>>>>> [ 2.649071] turning off the locking correctness validator.
>>>>> [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
>>>>> [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
>>>>> [ 2.649086] Call trace:
>>>>> [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
>>>>> [ 2.661354] dump_backtrace+0x0/0x194
>>>>> [ 2.661361] show_stack+0x14/0x20
>>>>> [ 2.667430] usbcore: registered new interface driver usbhid
>>>>> [ 2.672230] dump_stack+0xe8/0x130
>>>>> [ 2.672238] register_lock_class+0x480/0x514
>>>>> [ 2.672244] __lock_acquire+0x74/0x20ec
>>>>> [ 2.681113] usbhid: USB HID core driver
>>>>> [ 2.687450] lock_acquire+0x218/0x350
>>>>> [ 2.687456] _raw_spin_lock+0x58/0x80
>>>>> [ 2.687464] tmio_mmc_irq+0x410/0x9ac
>>>>> [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
>>>>> [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
>>>>> [ 2.749635] handle_irq_event+0x60/0x100
>>>>> [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
>>>>> [ 2.757644] __handle_domain_irq+0x7c/0xdc
>>>>> [ 2.761736] efi_header_end+0x4c/0xd0
>>>>> [ 2.765393] el1_irq+0xcc/0x180
>>>>> [ 2.768530] arch_cpu_idle+0x14/0x2c
>>>>> [ 2.772100] default_idle_call+0x58/0xe4
>>>>> [ 2.776019] do_idle+0x244/0x2c0
>>>>> [ 2.779242] cpu_startup_entry+0x20/0x6c
>>>>> [ 2.783160] rest_init+0x164/0x28c
>>>>> [ 2.786561] arch_call_rest_init+0xc/0x14
>>>>> [ 2.790565] start_kernel+0x4c4/0x4f8
>>>>> [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
>>>>> [ 2.803011] Mem abort info:
>>>>>
>>>>> from https://lava.ciplatform.org/scheduler/job/1025535
>>>>> from
>>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
>>>>>
>>>>> Is there something else missing?
>>>>
>>>> I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
>>>> seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
>>>> with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
>>>
>>> Sorry, I looked at the wrong log on R-Car M3-W.
>>> I do see the issue with v5.10.198, but not with v5.10.199.
>>
>> It seems to be an intermittent issue. Investigating...
>
> After spending too much time on bisecting, the bad guy turns out to
> be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
> registering controller") in v5.10.198.
>
> Adding debug information shows the lock is mmc_host.lock.
>
> It is definitely initialized:
>
> renesas_sdhi_probe()
> {
>         ...
>         tmio_mmc_host_alloc()
>             mmc_alloc_host
>                 spin_lock_init(&host->lock);
>         ...
>         devm_request_irq()
>             -> tmio_mmc_irq
>                 tmio_mmc_cmd_irq()
>                     spin_lock(&host->lock);
>         ...
> }
>
> That leaves us with a missing lockdep annotation?
>

Is it possible that the lock initialization is overwritten?
I seem to recall a recent case where this happened.

Also, there is
spin_lock_init(&_host->lock);
in tmio_mmc_host_probe(), and tmio_mmc_host_probe() is called after
devm_request_irq().

Also, how would lockdep annotation help with "Unable to handle
kernel NULL pointer dereference at virtual address 0000000000000014"
in the log above?

Guenter

2023-10-25 19:54:24

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi Günter,

On Wed, Oct 25, 2023 at 8:39 PM Guenter Roeck <[email protected]> wrote:
> On 10/25/23 10:05, Geert Uytterhoeven wrote:
> > On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven <[email protected]> wrote:
> >> On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
> >> <[email protected]> wrote:
> >>> On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
> >>> <[email protected]> wrote:
> >>>> On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> >>>>> But we still have failures on Renesas with 5.10.199-rc2:
> >>>>>
> >>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
> >>>>>
> >>>>> And they still happened during MMC init:
> >>>>>
> >>>>> [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> >>>>> [ 2.638846] INFO: trying to register non-static key.
> >>>>> [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> >>>>> [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> >>>>> [ 2.649069] you didn't initialize this object before use?
> >>>>> [ 2.649071] turning off the locking correctness validator.
> >>>>> [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> >>>>> [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> >>>>> [ 2.649086] Call trace:
> >>>>> [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> >>>>> [ 2.661354] dump_backtrace+0x0/0x194
> >>>>> [ 2.661361] show_stack+0x14/0x20
> >>>>> [ 2.667430] usbcore: registered new interface driver usbhid
> >>>>> [ 2.672230] dump_stack+0xe8/0x130
> >>>>> [ 2.672238] register_lock_class+0x480/0x514
> >>>>> [ 2.672244] __lock_acquire+0x74/0x20ec
> >>>>> [ 2.681113] usbhid: USB HID core driver
> >>>>> [ 2.687450] lock_acquire+0x218/0x350
> >>>>> [ 2.687456] _raw_spin_lock+0x58/0x80
> >>>>> [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> >>>>> [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> >>>>> [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> >>>>> [ 2.749635] handle_irq_event+0x60/0x100
> >>>>> [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> >>>>> [ 2.757644] __handle_domain_irq+0x7c/0xdc
> >>>>> [ 2.761736] efi_header_end+0x4c/0xd0
> >>>>> [ 2.765393] el1_irq+0xcc/0x180
> >>>>> [ 2.768530] arch_cpu_idle+0x14/0x2c
> >>>>> [ 2.772100] default_idle_call+0x58/0xe4
> >>>>> [ 2.776019] do_idle+0x244/0x2c0
> >>>>> [ 2.779242] cpu_startup_entry+0x20/0x6c
> >>>>> [ 2.783160] rest_init+0x164/0x28c
> >>>>> [ 2.786561] arch_call_rest_init+0xc/0x14
> >>>>> [ 2.790565] start_kernel+0x4c4/0x4f8
> >>>>> [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> >>>>> [ 2.803011] Mem abort info:
> >>>>>
> >>>>> from https://lava.ciplatform.org/scheduler/job/1025535
> >>>>> from
> >>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
> >>>>>
> >>>>> Is there something else missing?
> >>>>
> >>>> I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
> >>>> seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
> >>>> with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
> >>>
> >>> Sorry, I looked at the wrong log on R-Car M3-W.
> >>> I do see the issue with v5.10.198, but not with v5.10.199.
> >>
> >> It seems to be an intermittent issue. Investigating...
> >
> > After spending too much time on bisecting, the bad guy turns out to
> > be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
> > registering controller") in v5.10.198.
> >
> > Adding debug information shows the lock is mmc_host.lock.
> >
> > It is definitely initialized:
> >
> > renesas_sdhi_probe()
> > {
> >         ...
> >         tmio_mmc_host_alloc()
> >             mmc_alloc_host
> >                 spin_lock_init(&host->lock);
> >         ...
> >         devm_request_irq()
> >             -> tmio_mmc_irq
> >                 tmio_mmc_cmd_irq()
> >                     spin_lock(&host->lock);
> >         ...
> > }
> >
> > That leaves us with a missing lockdep annotation?
>
> Is it possible that the lock initialization is overwritten?
> I seem to recall a recent case where this happened.
>
> Also, there is
> spin_lock_init(&_host->lock);
> in tmio_mmc_host_probe(), and tmio_mmc_host_probe() is called after
> devm_request_irq().

Unless I am missing something, that is initializing tmio_mmc_host.lock,
which is a different lock than mmc_host.lock?

> Also, how would lockdep annotation help with "Unable to handle
> kernel NULL pointer dereference at virtual address 0000000000000014"
> in the log above?

For the log from v5.10.198-rc1-g18c65c1b4996, that happened because
it lacked commit 1e3d016a95067ab3 ("mmc: renesas_sdhi: only reset
SCC when its pointer is populated"), according to earlier messages in
this thread.

For the NULL pointer dereference in 5.10.199-rc2, I'm not sure.
I didn't see that on R-Car M3-W...
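
As a general aid for reading such an oops: a small non-zero fault address
like 0x14 usually means a structure member at that offset was accessed
through a NULL base pointer. The sketch below only illustrates that
arithmetic. The structure and function names are hypothetical and are not
taken from the driver; which pointer was actually NULL in this log is not
known from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, chosen only so that one member sits at offset 0x14. */
struct hypothetical_state {
	uint32_t earlier_fields[5];	/* 5 * 4 bytes = 0x14 bytes */
	uint32_t member_at_0x14;
};

/*
 * Address that would be touched when reading ->member_at_0x14 through a
 * NULL struct pointer: base 0 plus the member offset. Nothing is
 * dereferenced here, only the arithmetic is shown.
 */
uintptr_t fault_address_via_null_base(void)
{
	const uintptr_t null_base = 0;

	return null_base + offsetof(struct hypothetical_state, member_at_0x14);
}

/* Self-check for the illustration above. */
void check_fault_address(void)
{
	assert(fault_address_via_null_base() == 0x14);
}
```

In practice one would look for a structure in the faulting function that
has a member at offset 0x14 and then ask which pointer to it can still be
NULL at that point in probe.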

According to my logs, I never saw this lockdep issue in MMC on mainline
before, so it's a bit hard to guess what's missing...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]


2023-10-25 21:27:55

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Wed, Oct 25, 2023 at 9:53 PM Geert Uytterhoeven <[email protected]> wrote:
> On Wed, Oct 25, 2023 at 8:39 PM Guenter Roeck <[email protected]> wrote:
> > On 10/25/23 10:05, Geert Uytterhoeven wrote:
> > > On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven <[email protected]> wrote:
> > >> On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
> > >> <[email protected]> wrote:
> > >>> On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
> > >>> <[email protected]> wrote:
> > >>>> On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> > >>>>> But we still have failures on Renesas with 5.10.199-rc2:
> > >>>>>
> > >>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
> > >>>>>
> > >>>>> And they still happened during MMC init:
> > >>>>>
> > >>>>> [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> > >>>>> [ 2.638846] INFO: trying to register non-static key.
> > >>>>> [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> > >>>>> [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> > >>>>> [ 2.649069] you didn't initialize this object before use?
> > >>>>> [ 2.649071] turning off the locking correctness validator.
> > >>>>> [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> > >>>>> [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> > >>>>> [ 2.649086] Call trace:
> > >>>>> [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> > >>>>> [ 2.661354] dump_backtrace+0x0/0x194
> > >>>>> [ 2.661361] show_stack+0x14/0x20
> > >>>>> [ 2.667430] usbcore: registered new interface driver usbhid
> > >>>>> [ 2.672230] dump_stack+0xe8/0x130
> > >>>>> [ 2.672238] register_lock_class+0x480/0x514
> > >>>>> [ 2.672244] __lock_acquire+0x74/0x20ec
> > >>>>> [ 2.681113] usbhid: USB HID core driver
> > >>>>> [ 2.687450] lock_acquire+0x218/0x350
> > >>>>> [ 2.687456] _raw_spin_lock+0x58/0x80
> > >>>>> [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> > >>>>> [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> > >>>>> [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> > >>>>> [ 2.749635] handle_irq_event+0x60/0x100
> > >>>>> [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> > >>>>> [ 2.757644] __handle_domain_irq+0x7c/0xdc
> > >>>>> [ 2.761736] efi_header_end+0x4c/0xd0
> > >>>>> [ 2.765393] el1_irq+0xcc/0x180
> > >>>>> [ 2.768530] arch_cpu_idle+0x14/0x2c
> > >>>>> [ 2.772100] default_idle_call+0x58/0xe4
> > >>>>> [ 2.776019] do_idle+0x244/0x2c0
> > >>>>> [ 2.779242] cpu_startup_entry+0x20/0x6c
> > >>>>> [ 2.783160] rest_init+0x164/0x28c
> > >>>>> [ 2.786561] arch_call_rest_init+0xc/0x14
> > >>>>> [ 2.790565] start_kernel+0x4c4/0x4f8
> > >>>>> [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> > >>>>> [ 2.803011] Mem abort info:
> > >>>>>
> > >>>>> from https://lava.ciplatform.org/scheduler/job/1025535
> > >>>>> from
> > >>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
> > >>>>>
> > >>>>> Is there something else missing?
> > >>>>
> > >>>> I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
> > >>>> seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
> > >>>> with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
> > >>>
> > >>> Sorry, I looked at the wrong log on R-Car M3-W.
> > >>> I do see the issue with v5.10.198, but not with v5.10.199.
> > >>
> > >> It seems to be an intermittent issue. Investigating...
> > >
> > > After spending too much time on bisecting, the bad guy turns out to
> > > be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
> > > registering controller") in v5.10.198.
> > >
> > > Adding debug information shows the lock is mmc_host.lock.
> > >
> > > It is definitely initialized:
> > >
> > > > renesas_sdhi_probe()
> > > > {
> > > >         ...
> > > >         tmio_mmc_host_alloc()
> > > >             mmc_alloc_host
> > > >                 spin_lock_init(&host->lock);

Initializing mmc_host.lock.

> > > >         ...
> > > >         devm_request_irq()
> > > >             -> tmio_mmc_irq
> > > >                 tmio_mmc_cmd_irq()
> > > >                     spin_lock(&host->lock);

Locking tmio_mmc_host.lock, but ...

> > > >         ...
> > > > }
> > >
> > > That leaves us with a missing lockdep annotation?
> >
> > Is it possible that the lock initialization is overwritten?
> > I seem to recall a recent case where this happened.
> >
> > Also, there is
> > spin_lock_init(&_host->lock);
> > in tmio_mmc_host_probe(), and tmio_mmc_host_probe() is called after
> > devm_request_irq().
>
> Unless I am missing something, that is initializing tmio_mmc_host.lock,
> which is a different lock than mmc_host.lock?

... tmio_mmc_host.lock is initialized only here.

Now the question remains why this is not triggered in mainline.
More investigation to do tomorrow...
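
The ordering problem described above can be modelled in plain user-space C.
This is a minimal sketch under stated assumptions, not the driver code: all
toy_* types and functions are hypothetical stand-ins, a lock is reduced to
an "initialized" flag, and the interrupt is assumed to fire as soon as the
handler has been requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a spinlock: it only remembers whether it was initialized. */
struct toy_lock {
	bool initialized;
};

/* Models mmc_host, whose lock is initialized when the host is allocated. */
struct toy_mmc_host {
	struct toy_lock lock;
};

/* Models tmio_mmc_host, which has its own, separate lock. */
struct toy_tmio_host {
	struct toy_mmc_host *mmc;
	struct toy_lock lock;
};

static void toy_lock_init(struct toy_lock *lock)
{
	lock->initialized = true;
}

/* Returns false where lockdep would report an unregistered (non-static) key. */
static bool toy_lock_acquire(const struct toy_lock *lock)
{
	return lock->initialized;
}

/* Models the interrupt handler: it takes the tmio-level lock, not mmc->lock. */
static bool toy_irq_handler(const struct toy_tmio_host *host)
{
	return toy_lock_acquire(&host->lock);
}

/*
 * Models probe with two possible orderings. Returns true if the interrupt
 * handler found its lock initialized.
 */
bool toy_probe(bool irq_before_host_probe)
{
	struct toy_mmc_host mmc = { .lock = { .initialized = false } };
	struct toy_tmio_host host = { .mmc = &mmc, .lock = { .initialized = false } };
	bool irq_saw_initialized_lock;

	/* Host allocation step: only the mmc-level lock is initialized here. */
	toy_lock_init(&mmc.lock);

	if (irq_before_host_probe) {
		/* IRQ requested (and firing) before the host probe step. */
		irq_saw_initialized_lock = toy_irq_handler(&host);
		toy_lock_init(&host.lock);
	} else {
		/* Host probe step first, IRQ afterwards. */
		toy_lock_init(&host.lock);
		irq_saw_initialized_lock = toy_irq_handler(&host);
	}

	return irq_saw_initialized_lock;
}

/* Self-check: only the "IRQ first" ordering hits the uninitialized lock. */
void toy_check_orderings(void)
{
	assert(!toy_probe(true));
	assert(toy_probe(false));
}
```

In this model the mmc-level lock is initialized in both orderings, which is
why looking only at the mmc_alloc_host path makes the code appear correct.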

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]


2023-10-25 22:02:32

by Pavel Machek

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

Hi!

> > > > I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
> > > > seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
> > > > with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
> > >
> > > Sorry, I looked at the wrong log on R-Car M3-W.
> > > I do see the issue with v5.10.198, but not with v5.10.199.
> >
> > It seems to be an intermittent issue. Investigating...
>
> After spending too much time on bisecting, the bad guy turns out to
> be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
> registering controller") in v5.10.198.

Thanks for bisection, let me take a look.

I reverted 6d3745bbc3341d3b on top of 5.10.199 and that solved my
issues:

https://gitlab.com/cip-project/cip-kernel/linux-cip/-/pipelines/1049624365

(Strangely, I also saw a failure on qemu that this fixed. I guess
that must be some kind of glitch).

Best regards,
Pavel
--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany



2023-10-26 01:23:31

by Guenter Roeck

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On 10/25/23 15:00, Pavel Machek wrote:
> Hi!
>
>>>>> I don't have a HopeRun HiHope RZ/G2M, but both v5.10.198 and v5.10.199
>>>>> seem to work fine on Salvator-XS with R-Car H3 ES2.0 and Salvator-X
>>>>> with R-Car M3-W ES1.0, using a config based on latest renesas_defconfig.
>>>>
>>>> Sorry, I looked at the wrong log on R-Car M3-W.
>>>> I do see the issue with v5.10.198, but not with v5.10.199.
>>>
>>> It seems to be an intermittent issue. Investigating...
>>
>> After spending too much time on bisecting, the bad guy turns out to
>> be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
>> registering controller") in v5.10.198.
>
> Thanks for bisection, let me take a look.
>
> I reverted 6d3745bbc3341d3b on top of 5.10.199 and that solved my
> issues:
>
> https://gitlab.com/cip-project/cip-kernel/linux-cip/-/pipelines/1049624365
>
> (Strangely, I also saw a failure on qemu that this fixed. I guess
> that must be some kind of glitch).
>

qemu interrupt timing is different, which can result in exposing race
conditions which are not seen with real hardware. Plus, of course, there
is always the possibility that the qemu emulation is buggy.

What is your qemu command line? I'd like to add it to my tests if possible.

Thanks,
Guenter

2023-10-26 12:08:45

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Wed, Oct 25, 2023 at 11:26 PM Geert Uytterhoeven
<[email protected]> wrote:
> On Wed, Oct 25, 2023 at 9:53 PM Geert Uytterhoeven <[email protected]> wrote:
> > On Wed, Oct 25, 2023 at 8:39 PM Guenter Roeck <[email protected]> wrote:
> > > On 10/25/23 10:05, Geert Uytterhoeven wrote:
> > > > On Wed, Oct 25, 2023 at 2:35 PM Geert Uytterhoeven <[email protected]> wrote:
> > > >> On Wed, Oct 25, 2023 at 12:53 PM Geert Uytterhoeven
> > > >> <[email protected]> wrote:
> > > >>> On Wed, Oct 25, 2023 at 12:47 PM Geert Uytterhoeven
> > > >>> <[email protected]> wrote:
> > > >>>> On Tue, Oct 24, 2023 at 9:22 PM Pavel Machek <[email protected]> wrote:
> > > >>>>> But we still have failures on Renesas with 5.10.199-rc2:
> > > >>>>>
> > > >>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/1047368849
> > > >>>>>
> > > >>>>> And they still happened during MMC init:
> > > >>>>>
> > > >>>>> [ 2.638013] renesas_sdhi_internal_dmac ee100000.mmc: Got CD GPIO
> > > >>>>> [ 2.638846] INFO: trying to register non-static key.
> > > >>>>> [ 2.644192] ledtrig-cpu: registered to indicate activity on CPUs
> > > >>>>> [ 2.649066] The code is fine but needs lockdep annotation, or maybe
> > > >>>>> [ 2.649069] you didn't initialize this object before use?
> > > >>>>> [ 2.649071] turning off the locking correctness validator.
> > > >>>>> [ 2.649080] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.199-rc2-arm64-renesas-ge31b6513c43d #1
> > > >>>>> [ 2.649082] Hardware name: HopeRun HiHope RZ/G2M with sub board (DT)
> > > >>>>> [ 2.649086] Call trace:
> > > >>>>> [ 2.655106] SMCCC: SOC_ID: ARCH_SOC_ID not implemented, skipping ....
> > > >>>>> [ 2.661354] dump_backtrace+0x0/0x194
> > > >>>>> [ 2.661361] show_stack+0x14/0x20
> > > >>>>> [ 2.667430] usbcore: registered new interface driver usbhid
> > > >>>>> [ 2.672230] dump_stack+0xe8/0x130
> > > >>>>> [ 2.672238] register_lock_class+0x480/0x514
> > > >>>>> [ 2.672244] __lock_acquire+0x74/0x20ec
> > > >>>>> [ 2.681113] usbhid: USB HID core driver
> > > >>>>> [ 2.687450] lock_acquire+0x218/0x350
> > > >>>>> [ 2.687456] _raw_spin_lock+0x58/0x80
> > > >>>>> [ 2.687464] tmio_mmc_irq+0x410/0x9ac
> > > >>>>> [ 2.688556] renesas_sdhi_internal_dmac ee160000.mmc: mmc0 base at 0x00000000ee160000, max clock rate 200 MHz
> > > >>>>> [ 2.744936] __handle_irq_event_percpu+0xbc/0x340
> > > >>>>> [ 2.749635] handle_irq_event+0x60/0x100
> > > >>>>> [ 2.753553] handle_fasteoi_irq+0xa0/0x1ec
> > > >>>>> [ 2.757644] __handle_domain_irq+0x7c/0xdc
> > > >>>>> [ 2.761736] efi_header_end+0x4c/0xd0
> > > >>>>> [ 2.765393] el1_irq+0xcc/0x180
> > > >>>>> [ 2.768530] arch_cpu_idle+0x14/0x2c
> > > >>>>> [ 2.772100] default_idle_call+0x58/0xe4
> > > >>>>> [ 2.776019] do_idle+0x244/0x2c0
> > > >>>>> [ 2.779242] cpu_startup_entry+0x20/0x6c
> > > >>>>> [ 2.783160] rest_init+0x164/0x28c
> > > >>>>> [ 2.786561] arch_call_rest_init+0xc/0x14
> > > >>>>> [ 2.790565] start_kernel+0x4c4/0x4f8
> > > >>>>> [ 2.794233] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000014
> > > >>>>> [ 2.803011] Mem abort info:
> > > >>>>>
> > > >>>>> from https://lava.ciplatform.org/scheduler/job/1025535
> > > >>>>> from
> > > >>>>> https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/5360973735 .
> > > >>>>>
> > > >>>>> Is there something else missing?
> > > >>
> > > >> It seems to be an intermittent issue. Investigating...
> > > >
> > > > After spending too much time on bisecting, the bad guy turns out to
> > > > be commit 6d3745bbc3341d3b ("mmc: renesas_sdhi: register irqs before
> > > > registering controller") in v5.10.198.
> > > >
> > > > Adding debug information shows the lock is mmc_host.lock.
> > > >
> > > > It is definitely initialized:
> > > >
> > > > renesas_sdhi_probe()
> > > > {
> > > > ...
> > > > tmio_mmc_host_alloc()
> > > > mmc_alloc_host
> > > > spin_lock_init(&host->lock);
>
> Initializing mmc_host.lock.
>
> > > > ...
> > > > devm_request_irq()
> > > > -> tmio_mmc_irq
> > > > tmio_mmc_cmd_irq()
> > > > spin_lock(&host->lock);
>
> Locking tmio_mmc_host.lock, but ...
>
> > > > ...
> > > > }
> > > >
> > > > That leaves us with a missing lockdep annotation?
> > >
> > > Is it possible that the lock initialization is overwritten ?
> > > I seem to recall a recent case where this happens.
> > >
> > > Also, there is
> > > spin_lock_init(&_host->lock);
> > > in tmio_mmc_host_probe(), and tmio_mmc_host_probe() is called after
> > > devm_request_irq().
> >
> > Unless I am missing something, that is initializing tmio_mmc_host.lock,
> > which is a different lock than mmc_host.lock?
>
> ... tmio_mmc_host.lock is initialized only here.
>
> Now the question remains why this is not triggered in mainline.
> More investigation to do tomorrow...

| --- a/drivers/mmc/host/renesas_sdhi_core.c
| +++ b/drivers/mmc/host/renesas_sdhi_core.c
| @@ -1011,6 +1011,8 @@ int renesas_sdhi_probe(struct platform_device *pdev,
| renesas_sdhi_start_signal_voltage_switch;
| host->sdcard_irq_setbit_mask = TMIO_STAT_ALWAYS_SET_27;
| host->reset = renesas_sdhi_reset;

host->sdcard_irq_mask_all is not initialized in this branch

| + } else {
| + host->sdcard_irq_mask_all = TMIO_MASK_ALL;
| }

| /* Orginally registers were 16 bit apart, could be 32 or 64 nowadays */
| @@ -1098,9 +1100,7 @@ int renesas_sdhi_probe(struct platform_device *pdev,
| host->ops.hs400_complete = renesas_sdhi_hs400_complete;
| }

| - ret = tmio_mmc_host_probe(host);
| - if (ret < 0)
| - goto edisclk;
| + sd_ctrl_write32_as_16_and_16(host, CTL_IRQ_MASK, host->sdcard_irq_mask_all);

Fails to disable interrupts for real as host->sdcard_irq_mask_all is
still zero.

| num_irqs = platform_irq_count(pdev);
| if (num_irqs < 0) {
| @@ -1127,6 +1127,10 @@ int renesas_sdhi_probe(struct platform_device *pdev,
| goto eirq;
| }

| + ret = tmio_mmc_host_probe(host);

Initializes host->sdcard_irq_mask_all when needed and disables
interrupts:

if (!_host->sdcard_irq_mask_all)
_host->sdcard_irq_mask_all = TMIO_MASK_ALL;
tmio_mmc_disable_mmc_irqs(_host, _host->sdcard_irq_mask_all);

If the interrupt came in before, we have an issue.

| + if (ret < 0)
| + goto edisclk;
| +
| dev_info(&pdev->dev, "%s base at %pa, max clock rate %u MHz\n",
| mmc_hostname(host->mmc), &res->start, host->mmc->f_max / 1000000);

The solution is to backport commit 9f12cac1bb88e329 ("mmc: renesas_sdhi:
use custom mask for TMIO_MASK_ALL") in v5.13.
As this doesn't backport cleanly, I'll submit a (tested) patch.
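
For illustration only, the ordering problem described above can be
modelled in plain user-space C. This is a rough sketch, not driver code:
struct model_host, MODEL_MASK_ALL (including its value) and all model_*
functions are hypothetical names invented for this example. It assumes
that a set bit in the mask register means "source masked", and that the
lock and the default mask are only set up by the tmio_mmc_host_probe()
step quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for TMIO_MASK_ALL; the value is an assumption. */
#define MODEL_MASK_ALL 0xffffffffu

/* Hypothetical user-space model of the state that matters here. */
struct model_host {
	uint32_t irq_mask_reg;        /* set bit = interrupt source masked */
	uint32_t sdcard_irq_mask_all; /* zero until somebody fills it in */
	bool lock_initialized;        /* stands in for spin_lock_init() */
	bool irq_requested;           /* stands in for devm_request_irq() */
};

/*
 * An interrupt can be delivered once the handler is registered and at
 * least one source is left unmasked.
 */
bool model_irq_can_fire(const struct model_host *h)
{
	return h->irq_requested &&
	       (h->irq_mask_reg & MODEL_MASK_ALL) != MODEL_MASK_ALL;
}

/* Models the late step: init the lock, default the mask, mask everything. */
void model_host_probe(struct model_host *h)
{
	h->lock_initialized = true;
	if (!h->sdcard_irq_mask_all)
		h->sdcard_irq_mask_all = MODEL_MASK_ALL;
	h->irq_mask_reg |= h->sdcard_irq_mask_all;
}

/*
 * Models the reordered probe sequence. Returns true if the handler
 * could run before the lock has been initialized.
 */
bool model_probe(struct model_host *h, bool preset_mask_all)
{
	bool hazard;

	if (preset_mask_all) /* models the added "else" branch */
		h->sdcard_irq_mask_all = MODEL_MASK_ALL;

	/* Early mask write: masks nothing if the field is still zero. */
	h->irq_mask_reg = h->sdcard_irq_mask_all;
	h->irq_requested = true;
	hazard = model_irq_can_fire(h) && !h->lock_initialized;
	model_host_probe(h);
	return hazard;
}

/* Checks both orderings against the behaviour described in the text. */
void model_self_check(void)
{
	struct model_host broken = {0};
	struct model_host fixed = {0};

	assert(model_probe(&broken, false));  /* zero mask: window is open */
	assert(!model_probe(&fixed, true));   /* preset mask: window closed */
}
```

Calling model_self_check() from a small main() exercises both cases: with
the mask field left at zero, the early write masks nothing and the
handler can run before the lock is set up; with the field preset, the
same sequence leaves no such window.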

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-10-26 12:22:07

by Geert Uytterhoeven

Subject: Re: renesas_sdhi problems in 5.10-stable was Re: [PATCH 5.10 000/226] 5.10.198-rc1 review

On Thu, Oct 26, 2023 at 2:08 PM Geert Uytterhoeven <[email protected]> wrote:
> The solution is to backport commit 9f12cac1bb88e329 ("mmc: renesas_sdhi:
> use custom mask for TMIO_MASK_ALL") in v5.13.
> As this doesn't backport cleanly, I'll submit a (tested) patch.

https://lore.kernel.org/r/1b9fda30f2d86fab50341a947d17b5206a2c7507.1698321354.git.geert+renesas@glider.be

Gr{oetje,eeting}s,

Geert
