2022-10-14 00:58:33

by Dave Airlie

Subject: [git pull] drm fixes for 6.1-rc1

Hi Linus,

Round of fixes for the merge window stuff, bunch of amdgpu and i915
changes, this should have the gcc11 warning fix, amongst other
changes.

Dave.

drm-next-2022-10-14:
drm fixes for 6.1-rc1

amdgpu:
- DC mutex fix
- DC SubVP fixes
- DCN 3.2.x fixes
- DCN 3.1.x fixes
- SDMA 6.x fixes
- Enable DPIA for 3.1.4
- VRR fixes
- VRAM BO swapping fix
- Revert dirty fb helper change
- SR-IOV suspend/resume fixes
- Work around GCC array bounds check fail warning
- UMC 8.10 fixes
- Misc fixes and cleanups

i915:
- Round to closest in g4x+ HDMI clock readout
- Update MOCS table for EHL
- Fix PSR_IMR/IIR field handling
- Fix watermark calculations for gen12+/DG2 modifiers
- Reject excessive dotclocks early
- Fix revocation of non-persistent contexts
- Handle migration for dpt
- Fix display problems after resume
- Allow control over the flags when migrating
- Consider DG2_RC_CCS_CC when migrating buffers

The following changes since commit bafaf67c42f4b547bf4fb329ac6dcb28b05de15e:

Revert "drm/sched: Use parent fence instead of finished" (2022-10-07
12:58:39 +1000)

are available in the Git repository at:

git://anongit.freedesktop.org/drm/drm tags/drm-next-2022-10-14

for you to fetch changes up to fc3523a833c9c109e68209f1ecdd15864373e66a:

Merge tag 'amd-drm-fixes-6.1-2022-10-12' of
https://gitlab.freedesktop.org/agd5f/linux into drm-next (2022-10-14
07:47:25 +1000)

----------------------------------------------------------------
drm fixes for 6.1-rc1

amdgpu:
- DC mutex fix
- DC SubVP fixes
- DCN 3.2.x fixes
- DCN 3.1.x fixes
- SDMA 6.x fixes
- Enable DPIA for 3.1.4
- VRR fixes
- VRAM BO swapping fix
- Revert dirty fb helper change
- SR-IOV suspend/resume fixes
- Work around GCC array bounds check fail warning
- UMC 8.10 fixes
- Misc fixes and cleanups

i915:
- Round to closest in g4x+ HDMI clock readout
- Update MOCS table for EHL
- Fix PSR_IMR/IIR field handling
- Fix watermark calculations for gen12+/DG2 modifiers
- Reject excessive dotclocks early
- Fix revocation of non-persistent contexts
- Handle migration for dpt
- Fix display problems after resume
- Allow control over the flags when migrating
- Consider DG2_RC_CCS_CC when migrating buffers

----------------------------------------------------------------
Alex Deucher (7):
drm/amdgpu: switch sdma buffer function tear down to a helper
drm/amdgpu: fix SDMA suspend/resume on SR-IOV
drm/amd/display: make dcn32_split_stream_for_mpc_or_odm static
drm/amd/display: fix indentation in dc.c
drm/amd/display: make virtual_disable_link_output static
drm/amd/display: add a license to cursor_reg_cache.h
drm/amd/display: fix transfer function passed to build_coefficients()

Alexey Kodanev (2):
drm/amd/pm: vega10_hwmgr: fix potential off-by-one overflow in
'performance_levels'
drm/amd/pm: smu7_hwmgr: fix potential off-by-one overflow in
'performance_levels'

Alvin Lee (5):
drm/amd/display: Only commit SubVP state after pipe programming
drm/amd/display: Block SubVP if rotation being used
drm/amd/display: Disable GSL when enabling phantom pipe
drm/amd/display: For SubVP pipe split case use min transition into MPO
drm/amd/display: Fix watermark calculation

Aric Cyr (4):
Revert "drm/amd/display: correct hostvm flag"
drm/amd/display: Fix vupdate and vline position calculation
drm/amd/display: 3.2.206
drm/amd/display: 3.2.207

Arunpravin Paneer Selvam (1):
drm/amdgpu: Fix VRAM BO swap issue

Aurabindo Pillai (2):
drm/amd/display: Do not trigger timing sync for phantom pipes
drm/amd/display: Add HUBP surface flip interrupt handler

Bokun Zhang (1):
drm/amdgpu: Fix SDMA engine resume issue under SRIOV

Candice Li (2):
drm/amdgpu: Update umc v8_10_0 headers
drm/amdgpu: Add poison mode query for umc v8_10_0

Charlene Liu (1):
drm/amd/display: prevent S4 test from failing

Daniel Gomez (1):
drm/amd/display: Fix mutex lock in dcn10

Dave Airlie (3):
Merge tag 'drm-intel-next-fixes-2022-10-06-1' of
git://anongit.freedesktop.org/drm/drm-intel into drm-next
Merge tag 'drm-intel-next-fixes-2022-10-13' of
git://anongit.freedesktop.org/drm/drm-intel into drm-next
Merge tag 'amd-drm-fixes-6.1-2022-10-12' of
https://gitlab.freedesktop.org/agd5f/linux into drm-next

Dillon Varone (8):
drm/amd/display: Program SubVP in dc_commit_state_no_check
drm/amd/display: Reorder FCLK P-state switch sequence for DCN32
drm/amd/display: Increase compbuf size prior to updating clocks
drm/amd/display: Fix merging dynamic ODM+MPO configs on DCN32
Revert "drm/amd/display: skip commit minimal transition state"
drm/amd/display: Use correct pixel clock to program DTBCLK DTO's
drm/amd/display: Acquire FCLK DPM levels on DCN32
drm/amd/display: Fix bug preventing FCLK Pstate allow message being sent

Dmytro Laktyushkin (3):
drm/amd/display: fix dcn315 dml detile overestimation
drm/amd/display: add dummy pstate workaround to dcn315
drm/amd/display: always allow pstate change when no dpps are
active on dcn315

Dong Chenchen (1):
drm/amd/display: Removed unused variable 'sdp_stream_enable'

Eric Bernstein (1):
drm/amd/display: Fix disable DSC logic in the DIO code

Fangzhi Zuo (1):
drm/amd/display: Validate DSC After Enable All New CRTCs

George Shen (1):
drm/amd/display: Add missing SDP registers to DCN32 reglist

Guenter Roeck (1):
drm/amd/display: fix array-bounds error in
dc_stream_remove_writeback() [take 2]

Hamza Mahfooz (1):
Revert "drm/amdgpu: use dirty framebuffer helper"

Ian Chen (1):
drm/amd/display: Refactor edp ILR caps codes

Iswara Nagulendran (1):
drm/amd/display: Allow PSR exit when panel is disconnected

Josip Pavic (1):
drm/amd/display: do not compare integers of different widths

Jouni Högander (1):
drm/i915/psr: Fix PSR_IMR/IIR field handling

Jun Lei (1):
drm/amd/display: Add a helper to map ODM/MPC/Multi-Plane resources

Leo (Hanghong) Ma (1):
drm/amd/display: AUX tracing cleanup

Leo Chen (1):
drm/amd/display: Add log for LTTPR

Lewis Huang (1):
drm/amd/display: Keep OTG on when Z10 is disable

Li Zhong (1):
drivers/amd/pm: check the return value of amdgpu_bo_kmap

Martin Leung (3):
drm/amd/display: block odd h_total timings from halving pixel rate
drm/amd/display: unblock mcm_luts
drm/amd/display: zeromem mypipe heap struct before using it

Matthew Auld (3):
drm/i915/display: handle migration for dpt
drm/i915: allow control over the flags when migrating
drm/i915/display: consider DG2_RC_CCS_CC when migrating buffers

Max Tseng (1):
drm/amd/display: Use the same cursor info across features

Meenakshikumar Somasundaram (1):
drm/amd/display: Display does not light up after S4 resume

Nicholas Kazlauskas (1):
drm/amd/display: Update PMFW z-state interface for DCN314

Philip Yang (2):
drm/amdgpu: Set vmbo destroy after pt bo is created
drm/amdgpu: Correct amdgpu_amdkfd_total_mem_size calculation

Randy Dunlap (1):
drm/amd/display: clean up dcn32_fpu.c kernel-doc

Rodrigo Siqueira (14):
drm/amd/display: Drop unused code for DCN32/321
drm/amd/display: Update DCN321 hook that deals with pipe aquire
drm/amd/display: Fix SubVP control flow in the MPO context
drm/amd/display: Remove OPTC lock check
drm/amd/display: Adding missing HDMI ACP SEND register
drm/amd/display: Add PState change high hook for DCN32
drm/amd/display: Enable 2 to 1 ODM policy if supported
drm/amd/display: Disconnect DSC for unused pipes during ODM transition
drm/amd/display: update DSC for DCN32
drm/amd/display: Minor code style change
drm/amd/display: Add a missing hook to DCN20
drm/amd/display: Use set_vtotal_min_max to configure OTG VTOTAL
drm/amd/display: Drop uncessary OTG lock check
drm/amd/display: Clean some DCN32 macros

Roman Li (1):
drm/amd/display: Enable dpia support for dcn314

Ruili Ji (1):
drm/amdgpu: Enable F32_WPTR_POLL_ENABLE in mqd

Shirish S (1):
drm/amd/display: explicitly disable psr_feature_enable appropriately

Sonny Jiang (1):
drm/amdgpu: Enable VCN PG on GC11_0_1

Tao Zhou (4):
drm/amdgpu: remove check for CE in RAS error address query
drm/amdgpu: define RAS convert_error_address API
drm/amdgpu: define convert_error_address for umc v8.7
drm/amdgpu: fix coding style issue for mca notifier

Tejas Upadhyay (1):
drm/i915/ehl: Update MOCS table for EHL

Thomas Hellström (1):
drm/i915: Fix display problems after resume

Tvrtko Ursulin (1):
drm/i915/guc: Fix revocation of non-persistent contexts

Ville Syrjälä (7):
drm/i915: Round to closest in g4x+ HDMI clock readout
drm/i915: Fix watermark calculations for gen12+ RC CCS modifier
drm/i915: Fix watermark calculations for gen12+ MC CCS modifier
drm/i915: Fix watermark calculations for gen12+ CCS+CC modifier
drm/i915: Fix watermark calculations for DG2 CCS modifiers
drm/i915: Fix watermark calculations for DG2 CCS+CC modifier
drm/i915: Reject excessive dotclocks early

Vladimir Stempen (2):
drm/amd/display: properly configure DCFCLK when enable/disable Freesync
drm/amd/display: increase hardware status wait time

Wenjing Liu (3):
drm/amd/display: fix integer overflow during MSA V_Freq calculation
drm/amd/display: write all 4 bytes of FFE_PRESET dpcd value
drm/amd/display: Add missing mask sh for SYM32_TP_SQ_PULSE register

Yang Li (3):
drm/amd/display: clean up one inconsistent indenting
drm/amd/display: clean up one inconsistent indenting
drm/amd/display: Simplify bool conversion

Yang Yingliang (3):
drm/amd/display: change to enc314_stream_encoder_dp_blank static
drm/amdgpu/sdma: add missing release_firmware() in
amdgpu_sdma_init_microcode()
drm/amd/display: fix build error on arm64

Yuan Can (1):
drm/amd/display: Remove unused struct i2c_id_config_access

Yunxiang Li (1):
drm/amd/display: Fix vblank refcount in vrr transition

Zhikai Zhai (1):
drm/amd/display: skip commit minimal transition state

drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 14 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 29 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 17 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 7 +-
drivers/gpu/drm/amd/amdgpu/cik_sdma.c | 6 +-
drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c | 6 +-
drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c | 6 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 29 +--
drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 11 +-
drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c | 15 +-
drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c | 17 +-
drivers/gpu/drm/amd/amdgpu/si_dma.c | 5 +-
drivers/gpu/drm/amd/amdgpu/soc21.c | 1 +
drivers/gpu/drm/amd/amdgpu/umc_v6_1.c | 10 +-
drivers/gpu/drm/amd/amdgpu/umc_v6_7.c | 165 ++++++--------
drivers/gpu/drm/amd/amdgpu/umc_v8_10.c | 78 ++++---
drivers/gpu/drm/amd/amdgpu/umc_v8_7.c | 63 +++---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v11.c | 3 +-
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 71 +++---
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_psr.c | 8 +-
drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c | 7 -
.../amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c | 4 +-
.../drm/amd/display/dc/clk_mgr/dcn314/dcn314_smu.c | 11 +-
.../amd/display/dc/clk_mgr/dcn32/dcn32_clk_mgr.c | 85 +++++---
drivers/gpu/drm/amd/display/dc/core/dc.c | 105 ++++++++-
drivers/gpu/drm/amd/display/dc/core/dc_link.c | 11 +-
drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 70 +++---
drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 53 ++++-
drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 8 +-
drivers/gpu/drm/amd/display/dc/dc.h | 8 +-
drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c | 147 ++++++++++++-
drivers/gpu/drm/amd/display/dc/dc_dmub_srv.h | 1 +
drivers/gpu/drm/amd/display/dc/dc_link.h | 4 +
drivers/gpu/drm/amd/display/dc/dce/dce_aux.c | 13 +-
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.c | 1 +
.../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 239 +++++----------------
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.c | 40 +---
drivers/gpu/drm/amd/display/dc/dcn10/dcn10_optc.h | 1 -
.../gpu/drm/amd/display/dc/dcn10/dcn10_resource.c | 66 +++++-
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hubp.c | 30 +++
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 25 +--
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_optc.c | 1 +
.../gpu/drm/amd/display/dc/dcn21/dcn21_hubbub.c | 8 +-
.../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 13 +-
drivers/gpu/drm/amd/display/dc/dcn30/dcn30_dpp.c | 4 +
drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c | 3 +-
.../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c | 4 +
.../drm/amd/display/dc/dcn301/dcn301_resource.c | 2 +-
.../display/dc/dcn31/dcn31_hpo_dp_stream_encoder.c | 20 +-
drivers/gpu/drm/amd/display/dc/dcn31/dcn31_optc.c | 2 -
.../gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 15 +-
.../display/dc/dcn314/dcn314_dio_stream_encoder.c | 2 +-
.../drm/amd/display/dc/dcn314/dcn314_resource.c | 16 +-
.../drm/amd/display/dc/dcn315/dcn315_resource.c | 15 +-
.../drm/amd/display/dc/dcn316/dcn316_resource.c | 13 +-
.../amd/display/dc/dcn32/dcn32_dio_link_encoder.c | 7 -
.../amd/display/dc/dcn32/dcn32_dio_link_encoder.h | 4 -
.../display/dc/dcn32/dcn32_dio_stream_encoder.c | 57 +++--
.../display/dc/dcn32/dcn32_dio_stream_encoder.h | 14 +-
.../display/dc/dcn32/dcn32_hpo_dp_link_encoder.h | 1 +
.../gpu/drm/amd/display/dc/dcn32/dcn32_hubbub.c | 1 +
drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hubp.c | 6 +-
drivers/gpu/drm/amd/display/dc/dcn32/dcn32_hwseq.c | 42 ++--
drivers/gpu/drm/amd/display/dc/dcn32/dcn32_optc.c | 2 +-
.../gpu/drm/amd/display/dc/dcn32/dcn32_resource.c | 31 +++
.../gpu/drm/amd/display/dc/dcn32/dcn32_resource.h | 22 ++
.../amd/display/dc/dcn32/dcn32_resource_helpers.c | 88 ++++++++
.../display/dc/dcn321/dcn321_dio_link_encoder.c | 1 -
.../drm/amd/display/dc/dcn321/dcn321_resource.c | 6 +-
.../gpu/drm/amd/display/dc/dml/calcs/dcn_calcs.c | 118 +++++-----
.../gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.c | 96 +++------
.../gpu/drm/amd/display/dc/dml/dcn31/dcn31_fpu.h | 1 +
.../amd/display/dc/dml/dcn31/display_mode_vba_31.c | 15 ++
.../gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 131 ++++++-----
.../amd/display/dc/dml/dcn32/display_mode_vba_32.c | 21 +-
.../gpu/drm/amd/display/dc/dml/display_mode_lib.c | 1 +
.../gpu/drm/amd/display/dc/dml/display_mode_lib.h | 1 +
drivers/gpu/drm/amd/display/dc/inc/core_types.h | 6 +-
drivers/gpu/drm/amd/display/dc/inc/dcn_calcs.h | 19 +-
drivers/gpu/drm/amd/display/dc/inc/hw/clk_mgr.h | 15 +-
.../drm/amd/display/dc/inc/hw/cursor_reg_cache.h | 99 +++++++++
drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h | 4 +
drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h | 5 +
.../drm/amd/display/dc/inc/hw/timing_generator.h | 1 -
drivers/gpu/drm/amd/display/dc/inc/resource.h | 6 +
.../gpu/drm/amd/display/dc/link/link_hwss_hpo_dp.c | 2 +-
.../drm/amd/display/dc/virtual/virtual_link_hwss.c | 2 +-
drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 1 +
drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h | 140 ++++++++++--
drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c | 1 +
.../drm/amd/display/modules/color/color_gamma.c | 2 +-
.../amd/include/asic_reg/umc/umc_8_10_0_offset.h | 2 +
.../amd/include/asic_reg/umc/umc_8_10_0_sh_mask.h | 3 +
drivers/gpu/drm/amd/pm/legacy-dpm/kv_dpm.c | 5 +-
.../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +-
.../gpu/drm/amd/pm/powerplay/hwmgr/vega10_hwmgr.c | 2 +-
drivers/gpu/drm/i915/display/g4x_hdmi.c | 2 +-
drivers/gpu/drm/i915/display/intel_display.c | 18 ++
drivers/gpu/drm/i915/display/intel_fb_pin.c | 62 ++++--
drivers/gpu/drm/i915/display/intel_psr.c | 78 ++++---
drivers/gpu/drm/i915/display/skl_watermark.c | 16 +-
drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 +-
drivers/gpu/drm/i915/gem/i915_gem_object.c | 37 +++-
drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 +
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 3 +-
drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 5 +-
drivers/gpu/drm/i915/gt/intel_context.c | 5 +-
drivers/gpu/drm/i915/gt/intel_context.h | 3 +-
drivers/gpu/drm/i915/gt/intel_ggtt.c | 8 +-
drivers/gpu/drm/i915/gt/intel_mocs.c | 8 +
drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 26 +--
drivers/gpu/drm/i915/i915_reg.h | 16 +-
116 files changed, 1830 insertions(+), 1081 deletions(-)
create mode 100644 drivers/gpu/drm/amd/display/dc/inc/hw/cursor_reg_cache.h


2022-10-14 05:09:09

by Linus Torvalds

Subject: Re: [git pull] drm fixes for 6.1-rc1

On Thu, Oct 13, 2022 at 5:29 PM Dave Airlie <[email protected]> wrote:
>
> Round of fixes for the merge window stuff, bunch of amdgpu and i915
> changes, this should have the gcc11 warning fix, amongst other
> changes.

Some of those amd changes aren't "fixes". They are some major code changes.

We're still in the merge window, so I'm letting it slide, but calling
them "fixes" really stretches things. They are fixes exactly the same
way completely new development can "fix" things.

Linus

2022-10-14 05:34:12

by pr-tracker-bot

Subject: Re: [git pull] drm fixes for 6.1-rc1

The pull request you sent on Fri, 14 Oct 2022 10:29:19 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-next-2022-10-14

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9c9155a3509a2ebdb06d77c7a621e9685c802eac

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

2022-10-16 08:31:10

by Arthur Marsh

Subject: Re: [git pull] drm fixes for 6.1-rc1

From: Arthur Marsh <[email protected]>

Hi, the "drm fixes for 6.1-rc1" commit caused the amdgpu module to fail
with my Cape Verde radeonsi card.

I haven't been able to bisect the problem to an individual commit, but
attach a dmesg extract below.

I'm happy to supply any other configuration information and test patches.

Arthur.

Linux version 6.0.0+ (root@am64) (gcc-12 (Debian 12.2.0-5) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #5179 SMP PREEMPT_DYNAMIC Fri Oct 14 17:00:40 ACDT 2022
Command line: BOOT_IMAGE=/vmlinuz-6.0.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro single amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0 page_owner=on amdgpu.gpu_recovery=1
...

[drm] amdgpu kernel modesetting enabled.
amdgpu 0000:01:00.0: vgaarb: deactivate vga console
Console: switching to colour dummy device 80x25
[drm] initializing kernel modesetting (VERDE 0x1002:0x682B 0x1458:0x22CA 0x87).
[drm] register mmio base: 0xFE8C0000
[drm] register mmio size: 262144
[drm] add ip block number 0 <si_common>
[drm] add ip block number 1 <gmc_v6_0>
[drm] add ip block number 2 <si_ih>
[drm] add ip block number 3 <gfx_v6_0>
[drm] add ip block number 4 <si_dma>
[drm] add ip block number 5 <si_dpm>
[drm] add ip block number 6 <dce_v6_0>
[drm] add ip block number 7 <uvd_v3_1>
[drm] BIOS signature incorrect 5b 7
resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dffff window]
caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
amdgpu 0000:01:00.0: No more image in the PCI ROM
amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
amdgpu: ATOM BIOS: xxx-xxx-xxx
amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
[drm] PCIE gen 2 link speeds already enabled
[drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
RTL8211B Gigabit Ethernet r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
r8169 0000:03:00.0 eth0: Link is Down
amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[drm] Detected VRAM RAM=2048M, BAR=256M
[drm] RAM width 128bits DDR3
[drm] amdgpu: 2048M of VRAM memory ready
[drm] amdgpu: 3979M of GTT memory ready.
[drm] GART: num cpu pages 262144, num gpu pages 262144
amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400A00000).
[drm] Internal thermal controller with fan control
[drm] amdgpu: dpm initialized
[drm] AMDGPU Display Connectors
[drm] Connector 0:
[drm] HDMI-A-1
[drm] HPD1
[drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
[drm] Encoders:
[drm] DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm] DVI-D-1
[drm] HPD2
[drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
[drm] Encoders:
[drm] DFP2: INTERNAL_UNIPHY
[drm] Connector 2:
[drm] VGA-1
[drm] DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
[drm] Encoders:
[drm] CRT1: INTERNAL_KLDSCP_DAC1
[drm] Found UVD firmware Version: 64.0 Family ID: 13
amdgpu: Move buffer fallback to memcpy unavailable
[drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -19
amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
BUG: kernel NULL pointer dereference, address: 0000000000000090
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 447 Comm: udevd Not tainted 6.0.0+ #5179
Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
Call Trace:
<TASK>
amdgpu_fence_driver_sw_fini+0xc2/0xd0 [amdgpu]
amdgpu_device_fini_sw+0x17/0x3c0 [amdgpu]
amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
devm_drm_dev_init_release+0x4a/0x70 [drm]
release_nodes+0x40/0xb0
devres_release_all+0x89/0xc0
device_unbind_cleanup+0xe/0x70
really_probe+0x245/0x3a0
? pm_runtime_barrier+0x61/0xb0
__driver_probe_device+0x78/0x170
driver_probe_device+0x2d/0xb0
__driver_attach+0xdc/0x1d0
? __device_attach_driver+0x100/0x100
bus_for_each_dev+0x69/0xa0
bus_add_driver+0x1d4/0x230
? _raw_spin_unlock+0x15/0x40
driver_register+0x89/0xe0
? 0xffffffffc0c3b000
do_one_initcall+0x44/0x200
? __kmem_cache_alloc_node+0x90/0x360
? kmalloc_trace+0x38/0xc0
do_init_module+0x4a/0x1e0
__do_sys_finit_module+0xb5/0x130
do_syscall_64+0x3a/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fd81ff5b1b9
Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 1c 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc5b37cbb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 000055e5f2f6a140 RCX: 00007fd81ff5b1b9
RDX: 0000000000000000 RSI: 000055e5f2f67e30 RDI: 0000000000000017
RBP: 000055e5f2f67e30 R08: 0000000000000000 R09: 000055e5f2f46700
R10: 0000000000000017 R11: 0000000000000246 R12: 0000000000020000
R13: 0000000000000000 R14: 000055e5f2f65b00 R15: 0000000000000024
</TASK>
Modules linked in: amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd
realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
CR2: 0000000000000090
---[ end trace 0000000000000000 ]---
RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
note: udevd[447] exited with preempt_count 1
udevd[433]: worker [447] terminated by signal 9 (Killed)
udevd[433]: worker [447] failed while handling '/devices/pci0000:00/0000:00:02.0/0000:01:00.0'
r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Adding 4194300k swap on /dev/sda4. Priority:-2 extents:1 across:4194300k FS
EXT4-fs (sda5): re-mounted. Quota mode: none.
lp: driver loaded but no devices found
ppdev: user-space parallel port driver
it87: Found IT8716F chip at 0xe80, revision 3
ACPI Warning: SystemIO range 0x0000000000000E85-0x0000000000000E86 conflicts with OpRegion 0x0000000000000E85-0x0000000000000E86 (\_SB.PCI0.SBRG.ASOC.HWRE) (20220331/utaddress-204)
ACPI: OSL: Resource conflict; ACPI support missing from driver?
BUG: unable to handle page fault for address: 00000000000065c0
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#2] PREEMPT SMP NOPTI
CPU: 2 PID: 55 Comm: kworker/2:1 Tainted: G D 6.0.0+ #5179
Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
Workqueue: events output_poll_execute [drm_kms_helper]
RIP: 0010:amdgpu_device_rreg.part.0+0x39/0x100 [amdgpu]
Code: 6c 24 08 48 89 fb 4c 89 64 24 10 44 8d 24 b5 00 00 00 00 4c 3b a7 88 08 00 00 89 f5 73 70 83 e2 02 74 2f 4c 03 a3 90 08 00 00 <45> 8b 24 24 48 8b 43 08 0f b7 70 3e 66 90 44 89 e0 48 8b 1c 24 48
RSP: 0018:ffffbeb3c0717c48 EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff99bae8260000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000001970 RDI: ffff99bae8260000
RBP: 0000000000001970 R08: ffffbeb3c0717e08 R09: 0000000000000000
R10: 0000000000000018 R11: fefefefefefefeff R12: 00000000000065c0
R13: ffffbeb3c0717d70 R14: 0000000000000000 R15: 000000010005e340
FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
Call Trace:
<TASK>
amdgpu_i2c_pre_xfer+0x163/0x180 [amdgpu]
bit_xfer+0x36/0x530 [i2c_algo_bit]
__i2c_transfer+0x185/0x550
i2c_transfer+0xa2/0x110
amdgpu_display_ddc_probe+0xbd/0x100 [amdgpu]
amdgpu_connector_vga_detect+0x8e/0x200 [amdgpu]
drm_helper_probe_detect_ctx+0x7b/0xd0 [drm_kms_helper]
output_poll_execute+0x152/0x220 [drm_kms_helper]
process_one_work+0x1ae/0x370
worker_thread+0x4d/0x3b0
? rescuer_thread+0x380/0x380
kthread+0xe3/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Modules linked in: max6650 hwmon_vid parport_pc ppdev lp parport amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci
scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
CR2: 00000000000065c0
---[ end trace 0000000000000000 ]---
RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
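
The two "#PF:" line pairs in the log above can be read off the low bits of the x86 page-fault error code. The following is a minimal sketch, not kernel code: the macro and function names are hypothetical (the kernel has its own X86_PF_* definitions), and only the three lowest architectural bits are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Low bits of the x86 page-fault error code (names are illustrative). */
#define PF_PROT  (1u << 0) /* 0: page not present, 1: protection violation */
#define PF_WRITE (1u << 1) /* 0: read access,      1: write access */
#define PF_USER  (1u << 2) /* 0: supervisor mode,  1: user mode */

static inline bool pf_is_present(unsigned int code) { return code & PF_PROT; }
static inline bool pf_is_write(unsigned int code)   { return code & PF_WRITE; }
static inline bool pf_is_user(unsigned int code)    { return code & PF_USER; }

/* Hypothetical helper: write a one-line description of the code into buf. */
static inline int pf_describe(unsigned int code, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%s %s access, %s",
			pf_is_user(code) ? "user" : "supervisor",
			pf_is_write(code) ? "write" : "read",
			pf_is_present(code) ? "protection violation"
					    : "not-present page");
}
```

With this reading, error_code(0x0002) in the first oops is a supervisor write to a not-present page, and error_code(0x0000) in the second is a supervisor read of a not-present page, matching the text the kernel printed. Separately, the second fault address 0x65c0 equals 4 * 0x1970, where 0x1970 is the value in RSI and the first DDC register listed for VGA-1; that would be consistent with a register read through a NULL MMIO base after the failed init, though the log alone does not prove it.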

2022-10-16 21:46:58

by Dave Airlie

Subject: Re: [git pull] drm fixes for 6.1-rc1

On Sun, 16 Oct 2022 at 18:09, Arthur Marsh
<[email protected]> wrote:
>
> From: Arthur Marsh <[email protected]>
>
> Hi, the "drm fixes for 6.1-rc1" commit caused the amdgpu module to fail
> with my Cape Verde radeonsi card.
>
> I haven't been able to bisect the problem to an individual commit, but
> attach a dmesg extract below.
>
> I'm happy to supply any other configuration information and test patches.
>

Can you try reverting the commit below? It's the only thing I can spot
that might affect a card that old, since most changes in that request
were for display hw you don't have.

commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9
Author: Arunpravin Paneer Selvam <[email protected]>
Date: Tue Oct 4 07:33:39 2022 -0700

drm/amdgpu: Fix VRAM BO swap issue

DRM buddy manager allocates the contiguous memory requests in
a single block or multiple blocks. So for the ttm move operation
(in case of low VRAM memory) we should consider all the blocks to
compute the total memory size, which is compared with the struct
ttm_resource num_pages in order to verify that the blocks are
contiguous for the eviction process.

v2: Added a Fixes tag
v3: Rewrite the code to save a bit of calculations and
variables (Christian)

Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu")
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
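
The idea in the commit message above (walk every block, add up the sizes, and compare the total against the resource's page count) can be sketched in plain C. This is only an illustration under stated assumptions: struct vram_block and blocks_are_contiguous() are hypothetical names, not the amdgpu or drm_buddy API, and the real patch may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one block handed out by a buddy-style allocator. */
struct vram_block {
	uint64_t start; /* byte offset of the block in VRAM */
	uint64_t size;  /* length of the block in bytes */
};

/*
 * Return true when the blocks, taken in order, form one gap-free range
 * whose total length equals expected_size (for example num_pages times
 * the page size).
 */
static inline bool blocks_are_contiguous(const struct vram_block *blocks,
					 size_t count, uint64_t expected_size)
{
	uint64_t total = 0;
	size_t i;

	assert(blocks != NULL || count == 0);

	for (i = 0; i < count; i++) {
		/* Each block must begin exactly where the previous one ended. */
		if (i > 0 &&
		    blocks[i].start != blocks[i - 1].start + blocks[i - 1].size)
			return false;
		total += blocks[i].size;
	}

	return total == expected_size;
}
```

Looking only at the first block would misjudge an allocation that was split into several blocks, which is the case the commit message describes.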


Thanks,
Dave.

> Arthur.
>
> Linux version 6.0.0+ (root@am64) (gcc-12 (Debian 12.2.0-5) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #5179 SMP PREEMPT_DYNAMIC Fri Oct 14 17:00:40 ACDT 2022
> Command line: BOOT_IMAGE=/vmlinuz-6.0.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro single amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0 page_owner=on amdgpu.gpu_recovery=1
> ...
>
> [drm] amdgpu kernel modesetting enabled.
> amdgpu 0000:01:00.0: vgaarb: deactivate vga console
> Console: switching to colour dummy device 80x25
> [drm] initializing kernel modesetting (VERDE 0x1002:0x682B 0x1458:0x22CA 0x87).
> [drm] register mmio base: 0xFE8C0000
> [drm] register mmio size: 262144
> [drm] add ip block number 0 <si_common>
> [drm] add ip block number 1 <gmc_v6_0>
> [drm] add ip block number 2 <si_ih>
> [drm] add ip block number 3 <gfx_v6_0>
> [drm] add ip block number 4 <si_dma>
> [drm] add ip block number 5 <si_dpm>
> [drm] add ip block number 6 <dce_v6_0>
> [drm] add ip block number 7 <uvd_v3_1>
> [drm] BIOS signature incorrect 5b 7
> resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dffff window]
> caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
> amdgpu 0000:01:00.0: No more image in the PCI ROM
> amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
> amdgpu: ATOM BIOS: xxx-xxx-xxx
> amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
> amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
> [drm] PCIE gen 2 link speeds already enabled
> [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
> RTL8211B Gigabit Ethernet r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
> r8169 0000:03:00.0 eth0: Link is Down
> amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
> amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
> [drm] Detected VRAM RAM=2048M, BAR=256M
> [drm] RAM width 128bits DDR3
> [drm] amdgpu: 2048M of VRAM memory ready
> [drm] amdgpu: 3979M of GTT memory ready.
> [drm] GART: num cpu pages 262144, num gpu pages 262144
> amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400A00000).
> [drm] Internal thermal controller with fan control
> [drm] amdgpu: dpm initialized
> [drm] AMDGPU Display Connectors
> [drm] Connector 0:
> [drm] HDMI-A-1
> [drm] HPD1
> [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
> [drm] Encoders:
> [drm] DFP1: INTERNAL_UNIPHY
> [drm] Connector 1:
> [drm] DVI-D-1
> [drm] HPD2
> [drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
> [drm] Encoders:
> [drm] DFP2: INTERNAL_UNIPHY
> [drm] Connector 2:
> [drm] VGA-1
> [drm] DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
> [drm] Encoders:
> [drm] CRT1: INTERNAL_KLDSCP_DAC1
> [drm] Found UVD firmware Version: 64.0 Family ID: 13
> amdgpu: Move buffer fallback to memcpy unavailable
> [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -19
> amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
> amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
> amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
> BUG: kernel NULL pointer dereference, address: 0000000000000090
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 0 P4D 0
> Oops: 0002 [#1] PREEMPT SMP NOPTI
> CPU: 3 PID: 447 Comm: udevd Not tainted 6.0.0+ #5179
> Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
> FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
> Call Trace:
> <TASK>
> amdgpu_fence_driver_sw_fini+0xc2/0xd0 [amdgpu]
> amdgpu_device_fini_sw+0x17/0x3c0 [amdgpu]
> amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
> devm_drm_dev_init_release+0x4a/0x70 [drm]
> release_nodes+0x40/0xb0
> devres_release_all+0x89/0xc0
> device_unbind_cleanup+0xe/0x70
> really_probe+0x245/0x3a0
> ? pm_runtime_barrier+0x61/0xb0
> __driver_probe_device+0x78/0x170
> driver_probe_device+0x2d/0xb0
> __driver_attach+0xdc/0x1d0
> ? __device_attach_driver+0x100/0x100
> bus_for_each_dev+0x69/0xa0
> bus_add_driver+0x1d4/0x230
> ? _raw_spin_unlock+0x15/0x40
> driver_register+0x89/0xe0
> ? 0xffffffffc0c3b000
> do_one_initcall+0x44/0x200
> ? __kmem_cache_alloc_node+0x90/0x360
> ? kmalloc_trace+0x38/0xc0
> do_init_module+0x4a/0x1e0
> __do_sys_finit_module+0xb5/0x130
> do_syscall_64+0x3a/0x90
> entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fd81ff5b1b9
> Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 1c 0d 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffc5b37cbb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> RAX: ffffffffffffffda RBX: 000055e5f2f6a140 RCX: 00007fd81ff5b1b9
> RDX: 0000000000000000 RSI: 000055e5f2f67e30 RDI: 0000000000000017
> RBP: 000055e5f2f67e30 R08: 0000000000000000 R09: 000055e5f2f46700
> R10: 0000000000000017 R11: 0000000000000246 R12: 0000000000020000
> R13: 0000000000000000 R14: 000055e5f2f65b00 R15: 0000000000000024
> </TASK>
> Modules linked in: amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd
> realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
> CR2: 0000000000000090
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
> FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
> note: udevd[447] exited with preempt_count 1
> udevd[433]: worker [447] terminated by signal 9 (Killed)
> udevd[433]: worker [447] failed while handling '/devices/pci0000:00/0000:00:02.0/0000:01:00.0'
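
Reading the first oops: the bytes around the faulting instruction appear to decode to a pointer load from `(%rbx)`, a compare against `%rbx`, and then `c6 80 90 00 00 00 01`, a one-byte store of 1 to `0x90(%rax)`. With RAX at 0 that gives the reported address 0x90. This is consistent with a linked-list walk that starts from a head whose `next` pointer is NULL, for example a zero-filled structure whose list was never initialized, though the trace alone does not prove it. The userspace C sketch below illustrates that failure mode and a defensive check; all `sketch_*` names are hypothetical and the entity layout is not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked list node, modeled loosely on the kernel's list_head. */
struct sketch_list {
	struct sketch_list *next;
	struct sketch_list *prev;
};

/* Hypothetical entity; the real structure's layout is not reproduced. */
struct sketch_entity {
	struct sketch_list link; /* must stay the first member, see below */
	bool stopped;
};

static void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

static void sketch_list_add_tail(struct sketch_list *node,
				 struct sketch_list *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/*
 * Marks every entity on the list as stopped and returns how many were seen.
 * Returns -1 for a head that was never initialized (next == NULL), the case
 * in which an unguarded walk would store through a near-NULL address.
 */
static int sketch_stop_all(struct sketch_list *head)
{
	struct sketch_list *pos;
	int seen = 0;

	assert(head != NULL);
	if (head->next == NULL)
		return -1;

	for (pos = head->next; pos != head; pos = pos->next) {
		/* link is the first member, so the node address is the entity address. */
		struct sketch_entity *e = (struct sketch_entity *)pos;

		e->stopped = true;
		seen++;
	}
	return seen;
}
```

A zero-filled head is rejected with -1, an initialized but empty head yields 0, and a head with two entities yields 2 with both marked as stopped.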
> r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> Adding 4194300k swap on /dev/sda4. Priority:-2 extents:1 across:4194300k FS
> EXT4-fs (sda5): re-mounted. Quota mode: none.
> lp: driver loaded but no devices found
> ppdev: user-space parallel port driver
> it87: Found IT8716F chip at 0xe80, revision 3
> ACPI Warning: SystemIO range 0x0000000000000E85-0x0000000000000E86 conflicts with OpRegion 0x0000000000000E85-0x0000000000000E86 (\_SB.PCI0.SBRG.ASOC.HWRE) (20220331/utaddress-204)
> ACPI: OSL: Resource conflict; ACPI support missing from driver?
> BUG: unable to handle page fault for address: 00000000000065c0
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: 0000 [#2] PREEMPT SMP NOPTI
> CPU: 2 PID: 55 Comm: kworker/2:1 Tainted: G D 6.0.0+ #5179
> Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
> Workqueue: events output_poll_execute [drm_kms_helper]
> RIP: 0010:amdgpu_device_rreg.part.0+0x39/0x100 [amdgpu]
> Code: 6c 24 08 48 89 fb 4c 89 64 24 10 44 8d 24 b5 00 00 00 00 4c 3b a7 88 08 00 00 89 f5 73 70 83 e2 02 74 2f 4c 03 a3 90 08 00 00 <45> 8b 24 24 48 8b 43 08 0f b7 70 3e 66 90 44 89 e0 48 8b 1c 24 48
> RSP: 0018:ffffbeb3c0717c48 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff99bae8260000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000001970 RDI: ffff99bae8260000
> RBP: 0000000000001970 R08: ffffbeb3c0717e08 R09: 0000000000000000
> R10: 0000000000000018 R11: fefefefefefefeff R12: 00000000000065c0
> R13: ffffbeb3c0717d70 R14: 0000000000000000 R15: 000000010005e340
> FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
> Call Trace:
> <TASK>
> amdgpu_i2c_pre_xfer+0x163/0x180 [amdgpu]
> bit_xfer+0x36/0x530 [i2c_algo_bit]
> __i2c_transfer+0x185/0x550
> i2c_transfer+0xa2/0x110
> amdgpu_display_ddc_probe+0xbd/0x100 [amdgpu]
> amdgpu_connector_vga_detect+0x8e/0x200 [amdgpu]
> drm_helper_probe_detect_ctx+0x7b/0xd0 [drm_kms_helper]
> output_poll_execute+0x152/0x220 [drm_kms_helper]
> process_one_work+0x1ae/0x370
> worker_thread+0x4d/0x3b0
> ? rescuer_thread+0x380/0x380
> kthread+0xe3/0x110
> ? kthread_complete_and_exit+0x20/0x20
> ret_from_fork+0x22/0x30
> </TASK>
> Modules linked in: max6650 hwmon_vid parport_pc ppdev lp parport amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci
> scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
> CR2: 00000000000065c0
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
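
In the second oops, RSI holds 0x1970, which matches the first DDC register listed for VGA-1 above, and the faulting address 0x65c0 equals 0x1970 * 4. That fits a 32-bit register read where the index is scaled to a byte offset and added to a mapping base that was 0 at the time. The offset is also well inside the 262144-byte register window reported earlier, so a size check alone would not catch it. This is an interpretation of the register dump, not a confirmed diagnosis. The arithmetic, as a small C sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Registers are assumed to be 32 bits wide, so index N sits at byte N * 4. */
static uint64_t sketch_reg_byte_offset(uint32_t reg_index)
{
	return (uint64_t)reg_index * 4u;
}

/* True when the byte offset lies inside a register window of mmio_size bytes. */
static bool sketch_reg_in_range(uint32_t reg_index, uint64_t mmio_size)
{
	return sketch_reg_byte_offset(reg_index) < mmio_size;
}

/*
 * Address a read would touch for a given mapping base. A base of 0 models
 * a mapping pointer that has already been cleared.
 */
static uint64_t sketch_reg_address(uint64_t mmio_base, uint32_t reg_index)
{
	return mmio_base + sketch_reg_byte_offset(reg_index);
}
```

Passing index 0x1970 with a base of 0 gives 0x65c0, the address in CR2, while the range check against 262144 still succeeds.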

2022-10-17 01:26:12

by Arthur Marsh

Subject: Re: [git pull] drm fixes for 6.1-rc1

Thanks Dave, I reverted commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9 on top of 6.1-rc1, and the resulting kernel loaded amdgpu fine on my PC with the Cape Verde GPU.

Regards,

Arthur.

On 17 October 2022 8:14:18 am ACDT, Dave Airlie <[email protected]> wrote:
>On Sun, 16 Oct 2022 at 18:09, Arthur Marsh
><[email protected]> wrote:
>>
>> From: Arthur Marsh <[email protected]>
>>
>> Hi, the "drm fixes for 6.1-rc1" commit caused the amdgpu module to fail
>> with my Cape Verde radeonsi card.
>>
>> I haven't been able to bisect the problem to an individual commit, but
>> attach a dmesg extract below.
>>
>> I'm happy to supply any other configuration information and test patches.
>>
>
>Can you try reverting the commit below? It's the only thing I can spot that might
>affect a card that old, since most changes in that pull request were for
>display hw you don't have.
>
>commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9
>Author: Arunpravin Paneer Selvam <[email protected]>
>Date: Tue Oct 4 07:33:39 2022 -0700
>
> drm/amdgpu: Fix VRAM BO swap issue
>
> DRM buddy manager allocates the contiguous memory requests in
> a single block or multiple blocks. So for the ttm move operation
> (incase of low vram memory) we should consider all the blocks to
> compute the total memory size which compared with the struct
> ttm_resource num_pages in order to verify that the blocks are
> contiguous for the eviction process.
>
> v2: Added a Fixes tag
> v3: Rewrite the code to save a bit of calculations and
> variables (Christian)
>
> Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu")
> Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
> Reviewed-by: Christian König <[email protected]>
> Signed-off-by: Alex Deucher <[email protected]>
>
>
>Thanks,
>Dave.
>
>> Arthur.
>>
>> Linux version 6.0.0+ (root@am64) (gcc-12 (Debian 12.2.0-5) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #5179 SMP PREEMPT_DYNAMIC Fri Oct 14 17:00:40 ACDT 2022
>> Command line: BOOT_IMAGE=/vmlinuz-6.0.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro single amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0 page_owner=on amdgpu.gpu_recovery=1
>> ...
>>
>> [drm] amdgpu kernel modesetting enabled.
>> amdgpu 0000:01:00.0: vgaarb: deactivate vga console
>> Console: switching to colour dummy device 80x25
>> [drm] initializing kernel modesetting (VERDE 0x1002:0x682B 0x1458:0x22CA 0x87).
>> [drm] register mmio base: 0xFE8C0000
>> [drm] register mmio size: 262144
>> [drm] add ip block number 0 <si_common>
>> [drm] add ip block number 1 <gmc_v6_0>
>> [drm] add ip block number 2 <si_ih>
>> [drm] add ip block number 3 <gfx_v6_0>
>> [drm] add ip block number 4 <si_dma>
>> [drm] add ip block number 5 <si_dpm>
>> [drm] add ip block number 6 <dce_v6_0>
>> [drm] add ip block number 7 <uvd_v3_1>
>> [drm] BIOS signature incorrect 5b 7
>> resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dffff window]
>> caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
>> amdgpu 0000:01:00.0: No more image in the PCI ROM
>> amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
>> amdgpu: ATOM BIOS: xxx-xxx-xxx
>> amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
>> amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
>> [drm] PCIE gen 2 link speeds already enabled
>> [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
>> RTL8211B Gigabit Ethernet r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
>> r8169 0000:03:00.0 eth0: Link is Down
>> amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
>> amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
>> [drm] Detected VRAM RAM=2048M, BAR=256M
>> [drm] RAM width 128bits DDR3
>> [drm] amdgpu: 2048M of VRAM memory ready
>> [drm] amdgpu: 3979M of GTT memory ready.
>> [drm] GART: num cpu pages 262144, num gpu pages 262144
>> amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400A00000).
>> [drm] Internal thermal controller with fan control
>> [drm] amdgpu: dpm initialized
>> [drm] AMDGPU Display Connectors
>> [drm] Connector 0:
>> [drm] HDMI-A-1
>> [drm] HPD1
>> [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
>> [drm] Encoders:
>> [drm] DFP1: INTERNAL_UNIPHY
>> [drm] Connector 1:
>> [drm] DVI-D-1
>> [drm] HPD2
>> [drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
>> [drm] Encoders:
>> [drm] DFP2: INTERNAL_UNIPHY
>> [drm] Connector 2:
>> [drm] VGA-1
>> [drm] DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
>> [drm] Encoders:
>> [drm] CRT1: INTERNAL_KLDSCP_DAC1
>> [drm] Found UVD firmware Version: 64.0 Family ID: 13
>> amdgpu: Move buffer fallback to memcpy unavailable
>> [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -19
>> amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
>> amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
>> amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
>> BUG: kernel NULL pointer dereference, address: 0000000000000090
>> #PF: supervisor write access in kernel mode
>> #PF: error_code(0x0002) - not-present page
>> PGD 0 P4D 0
>> Oops: 0002 [#1] PREEMPT SMP NOPTI
>> CPU: 3 PID: 447 Comm: udevd Not tainted 6.0.0+ #5179
>> Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
>> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>> FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>> Call Trace:
>> <TASK>
>> amdgpu_fence_driver_sw_fini+0xc2/0xd0 [amdgpu]
>> amdgpu_device_fini_sw+0x17/0x3c0 [amdgpu]
>> amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>> devm_drm_dev_init_release+0x4a/0x70 [drm]
>> release_nodes+0x40/0xb0
>> devres_release_all+0x89/0xc0
>> device_unbind_cleanup+0xe/0x70
>> really_probe+0x245/0x3a0
>> ? pm_runtime_barrier+0x61/0xb0
>> __driver_probe_device+0x78/0x170
>> driver_probe_device+0x2d/0xb0
>> __driver_attach+0xdc/0x1d0
>> ? __device_attach_driver+0x100/0x100
>> bus_for_each_dev+0x69/0xa0
>> bus_add_driver+0x1d4/0x230
>> ? _raw_spin_unlock+0x15/0x40
>> driver_register+0x89/0xe0
>> ? 0xffffffffc0c3b000
>> do_one_initcall+0x44/0x200
>> ? __kmem_cache_alloc_node+0x90/0x360
>> ? kmalloc_trace+0x38/0xc0
>> do_init_module+0x4a/0x1e0
>> __do_sys_finit_module+0xb5/0x130
>> do_syscall_64+0x3a/0x90
>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> RIP: 0033:0x7fd81ff5b1b9
>> Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 1c 0d 00 f7 d8 64 89 01 48
>> RSP: 002b:00007ffc5b37cbb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> RAX: ffffffffffffffda RBX: 000055e5f2f6a140 RCX: 00007fd81ff5b1b9
>> RDX: 0000000000000000 RSI: 000055e5f2f67e30 RDI: 0000000000000017
>> RBP: 000055e5f2f67e30 R08: 0000000000000000 R09: 000055e5f2f46700
>> R10: 0000000000000017 R11: 0000000000000246 R12: 0000000000020000
>> R13: 0000000000000000 R14: 000055e5f2f65b00 R15: 0000000000000024
>> </TASK>
>> Modules linked in: amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd
>> realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
>> CR2: 0000000000000090
>> ---[ end trace 0000000000000000 ]---
>> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>> FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>> note: udevd[447] exited with preempt_count 1
>> udevd[433]: worker [447] terminated by signal 9 (Killed)
>> udevd[433]: worker [447] failed while handling '/devices/pci0000:00/0000:00:02.0/0000:01:00.0'
>> r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
>> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>> Adding 4194300k swap on /dev/sda4. Priority:-2 extents:1 across:4194300k FS
>> EXT4-fs (sda5): re-mounted. Quota mode: none.
>> lp: driver loaded but no devices found
>> ppdev: user-space parallel port driver
>> it87: Found IT8716F chip at 0xe80, revision 3
>> ACPI Warning: SystemIO range 0x0000000000000E85-0x0000000000000E86 conflicts with OpRegion 0x0000000000000E85-0x0000000000000E86 (\_SB.PCI0.SBRG.ASOC.HWRE) (20220331/utaddress-204)
>> ACPI: OSL: Resource conflict; ACPI support missing from driver?
>> BUG: unable to handle page fault for address: 00000000000065c0
>> #PF: supervisor read access in kernel mode
>> #PF: error_code(0x0000) - not-present page
>> PGD 0 P4D 0
>> Oops: 0000 [#2] PREEMPT SMP NOPTI
>> CPU: 2 PID: 55 Comm: kworker/2:1 Tainted: G D 6.0.0+ #5179
>> Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
>> Workqueue: events output_poll_execute [drm_kms_helper]
>> RIP: 0010:amdgpu_device_rreg.part.0+0x39/0x100 [amdgpu]
>> Code: 6c 24 08 48 89 fb 4c 89 64 24 10 44 8d 24 b5 00 00 00 00 4c 3b a7 88 08 00 00 89 f5 73 70 83 e2 02 74 2f 4c 03 a3 90 08 00 00 <45> 8b 24 24 48 8b 43 08 0f b7 70 3e 66 90 44 89 e0 48 8b 1c 24 48
>> RSP: 0018:ffffbeb3c0717c48 EFLAGS: 00010206
>> RAX: 0000000000000000 RBX: ffff99bae8260000 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: 0000000000001970 RDI: ffff99bae8260000
>> RBP: 0000000000001970 R08: ffffbeb3c0717e08 R09: 0000000000000000
>> R10: 0000000000000018 R11: fefefefefefefeff R12: 00000000000065c0
>> R13: ffffbeb3c0717d70 R14: 0000000000000000 R15: 000000010005e340
>> FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
>> Call Trace:
>> <TASK>
>> amdgpu_i2c_pre_xfer+0x163/0x180 [amdgpu]
>> bit_xfer+0x36/0x530 [i2c_algo_bit]
>> __i2c_transfer+0x185/0x550
>> i2c_transfer+0xa2/0x110
>> amdgpu_display_ddc_probe+0xbd/0x100 [amdgpu]
>> amdgpu_connector_vga_detect+0x8e/0x200 [amdgpu]
>> drm_helper_probe_detect_ctx+0x7b/0xd0 [drm_kms_helper]
>> output_poll_execute+0x152/0x220 [drm_kms_helper]
>> process_one_work+0x1ae/0x370
>> worker_thread+0x4d/0x3b0
>> ? rescuer_thread+0x380/0x380
>> kthread+0xe3/0x110
>> ? kthread_complete_and_exit+0x20/0x20
>> ret_from_fork+0x22/0x30
>> </TASK>
>> Modules linked in: max6650 hwmon_vid parport_pc ppdev lp parport amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci
>> scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
>> CR2: 00000000000065c0
>> ---[ end trace 0000000000000000 ]---
>> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>> FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0


2022-10-17 06:28:34

by Christian König

Subject: Re: [git pull] drm fixes for 6.1-rc1

Arun, please take a look at this ASAP.

Thanks,
Christian.

On 17.10.22 at 03:13, Arthur Marsh wrote:
> Thanks Dave, I reverted commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9 on top of 6.1-rc1, and the resulting kernel loaded amdgpu fine on my PC with the Cape Verde GPU.
>
> Regards,
>
> Arthur.
>
> On 17 October 2022 8:14:18 am ACDT, Dave Airlie <[email protected]> wrote:
>> On Sun, 16 Oct 2022 at 18:09, Arthur Marsh
>> <[email protected]> wrote:
>>> From: Arthur Marsh <[email protected]>
>>>
>>> Hi, the "drm fixes for 6.1-rc1" commit caused the amdgpu module to fail
>>> with my Cape Verde radeonsi card.
>>>
>>> I haven't been able to bisect the problem to an individual commit, but
>>> attach a dmesg extract below.
>>>
>>> I'm happy to supply any other configuration information and test patches.
>>>
>> Can you try reverting the commit below? It's the only thing I can spot that might
>> affect a card that old, since most changes in that pull request were for
>> display hw you don't have.
>>
>> commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9
>> Author: Arunpravin Paneer Selvam <[email protected]>
>> Date: Tue Oct 4 07:33:39 2022 -0700
>>
>> drm/amdgpu: Fix VRAM BO swap issue
>>
>> DRM buddy manager allocates the contiguous memory requests in
>> a single block or multiple blocks. So for the ttm move operation
>> (incase of low vram memory) we should consider all the blocks to
>> compute the total memory size which compared with the struct
>> ttm_resource num_pages in order to verify that the blocks are
>> contiguous for the eviction process.
>>
>> v2: Added a Fixes tag
>> v3: Rewrite the code to save a bit of calculations and
>> variables (Christian)
>>
>> Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu")
>> Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
>> Reviewed-by: Christian König <[email protected]>
>> Signed-off-by: Alex Deucher <[email protected]>
>>
>>
>> Thanks,
>> Dave.
>>
>>> Arthur.
>>>
>>> Linux version 6.0.0+ (root@am64) (gcc-12 (Debian 12.2.0-5) 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #5179 SMP PREEMPT_DYNAMIC Fri Oct 14 17:00:40 ACDT 2022
>>> Command line: BOOT_IMAGE=/vmlinuz-6.0.0+ root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro single amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0 page_owner=on amdgpu.gpu_recovery=1
>>> ...
>>>
>>> [drm] amdgpu kernel modesetting enabled.
>>> amdgpu 0000:01:00.0: vgaarb: deactivate vga console
>>> Console: switching to colour dummy device 80x25
>>> [drm] initializing kernel modesetting (VERDE 0x1002:0x682B 0x1458:0x22CA 0x87).
>>> [drm] register mmio base: 0xFE8C0000
>>> [drm] register mmio size: 262144
>>> [drm] add ip block number 0 <si_common>
>>> [drm] add ip block number 1 <gmc_v6_0>
>>> [drm] add ip block number 2 <si_ih>
>>> [drm] add ip block number 3 <gfx_v6_0>
>>> [drm] add ip block number 4 <si_dma>
>>> [drm] add ip block number 5 <si_dpm>
>>> [drm] add ip block number 6 <dce_v6_0>
>>> [drm] add ip block number 7 <uvd_v3_1>
>>> [drm] BIOS signature incorrect 5b 7
>>> resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dffff window]
>>> caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
>>> amdgpu 0000:01:00.0: No more image in the PCI ROM
>>> amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
>>> amdgpu: ATOM BIOS: xxx-xxx-xxx
>>> amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
>>> amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
>>> [drm] PCIE gen 2 link speeds already enabled
>>> [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
>>> RTL8211B Gigabit Ethernet r8169-0-300:00: attached PHY driver (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
>>> r8169 0000:03:00.0 eth0: Link is Down
>>> amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
>>> amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
>>> [drm] Detected VRAM RAM=2048M, BAR=256M
>>> [drm] RAM width 128bits DDR3
>>> [drm] amdgpu: 2048M of VRAM memory ready
>>> [drm] amdgpu: 3979M of GTT memory ready.
>>> [drm] GART: num cpu pages 262144, num gpu pages 262144
>>> amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400A00000).
>>> [drm] Internal thermal controller with fan control
>>> [drm] amdgpu: dpm initialized
>>> [drm] AMDGPU Display Connectors
>>> [drm] Connector 0:
>>> [drm] HDMI-A-1
>>> [drm] HPD1
>>> [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
>>> [drm] Encoders:
>>> [drm] DFP1: INTERNAL_UNIPHY
>>> [drm] Connector 1:
>>> [drm] DVI-D-1
>>> [drm] HPD2
>>> [drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
>>> [drm] Encoders:
>>> [drm] DFP2: INTERNAL_UNIPHY
>>> [drm] Connector 2:
>>> [drm] VGA-1
>>> [drm] DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
>>> [drm] Encoders:
>>> [drm] CRT1: INTERNAL_KLDSCP_DAC1
>>> [drm] Found UVD firmware Version: 64.0 Family ID: 13
>>> amdgpu: Move buffer fallback to memcpy unavailable
>>> [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP block <uvd_v3_1> failed -19
>>> amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
>>> amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
>>> amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
>>> BUG: kernel NULL pointer dereference, address: 0000000000000090
>>> #PF: supervisor write access in kernel mode
>>> #PF: error_code(0x0002) - not-present page
>>> PGD 0 P4D 0
>>> Oops: 0002 [#1] PREEMPT SMP NOPTI
>>> CPU: 3 PID: 447 Comm: udevd Not tainted 6.0.0+ #5179
>>> Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
>>> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>>> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>> FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>>> Call Trace:
>>> <TASK>
>>> amdgpu_fence_driver_sw_fini+0xc2/0xd0 [amdgpu]
>>> amdgpu_device_fini_sw+0x17/0x3c0 [amdgpu]
>>> amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>>> devm_drm_dev_init_release+0x4a/0x70 [drm]
>>> release_nodes+0x40/0xb0
>>> devres_release_all+0x89/0xc0
>>> device_unbind_cleanup+0xe/0x70
>>> really_probe+0x245/0x3a0
>>> ? pm_runtime_barrier+0x61/0xb0
>>> __driver_probe_device+0x78/0x170
>>> driver_probe_device+0x2d/0xb0
>>> __driver_attach+0xdc/0x1d0
>>> ? __device_attach_driver+0x100/0x100
>>> bus_for_each_dev+0x69/0xa0
>>> bus_add_driver+0x1d4/0x230
>>> ? _raw_spin_unlock+0x15/0x40
>>> driver_register+0x89/0xe0
>>> ? 0xffffffffc0c3b000
>>> do_one_initcall+0x44/0x200
>>> ? __kmem_cache_alloc_node+0x90/0x360
>>> ? kmalloc_trace+0x38/0xc0
>>> do_init_module+0x4a/0x1e0
>>> __do_sys_finit_module+0xb5/0x130
>>> do_syscall_64+0x3a/0x90
>>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>> RIP: 0033:0x7fd81ff5b1b9
>>> Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 1c 0d 00 f7 d8 64 89 01 48
>>> RSP: 002b:00007ffc5b37cbb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> RAX: ffffffffffffffda RBX: 000055e5f2f6a140 RCX: 00007fd81ff5b1b9
>>> RDX: 0000000000000000 RSI: 000055e5f2f67e30 RDI: 0000000000000017
>>> RBP: 000055e5f2f67e30 R08: 0000000000000000 R09: 000055e5f2f46700
>>> R10: 0000000000000017 R11: 0000000000000246 R12: 0000000000020000
>>> R13: 0000000000000000 R14: 000055e5f2f65b00 R15: 0000000000000024
>>> </TASK>
>>> Modules linked in: amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd
>>> realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
>>> CR2: 0000000000000090
>>> ---[ end trace 0000000000000000 ]---
>>> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>>> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>> FS: 00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>>> note: udevd[447] exited with preempt_count 1
>>> udevd[433]: worker [447] terminated by signal 9 (Killed)
>>> udevd[433]: worker [447] failed while handling '/devices/pci0000:00/0000:00:02.0/0000:01:00.0'
>>> r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
>>> IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>> Adding 4194300k swap on /dev/sda4. Priority:-2 extents:1 across:4194300k FS
>>> EXT4-fs (sda5): re-mounted. Quota mode: none.
>>> lp: driver loaded but no devices found
>>> ppdev: user-space parallel port driver
>>> it87: Found IT8716F chip at 0xe80, revision 3
>>> ACPI Warning: SystemIO range 0x0000000000000E85-0x0000000000000E86 conflicts with OpRegion 0x0000000000000E85-0x0000000000000E86 (\_SB.PCI0.SBRG.ASOC.HWRE) (20220331/utaddress-204)
>>> ACPI: OSL: Resource conflict; ACPI support missing from driver?
>>> BUG: unable to handle page fault for address: 00000000000065c0
>>> #PF: supervisor read access in kernel mode
>>> #PF: error_code(0x0000) - not-present page
>>> PGD 0 P4D 0
>>> Oops: 0000 [#2] PREEMPT SMP NOPTI
>>> CPU: 2 PID: 55 Comm: kworker/2:1 Tainted: G D 6.0.0+ #5179
>>> Hardware name: System manufacturer System Product Name/M3A78 PRO, BIOS 1701 01/27/2011
>>> Workqueue: events output_poll_execute [drm_kms_helper]
>>> RIP: 0010:amdgpu_device_rreg.part.0+0x39/0x100 [amdgpu]
>>> Code: 6c 24 08 48 89 fb 4c 89 64 24 10 44 8d 24 b5 00 00 00 00 4c 3b a7 88 08 00 00 89 f5 73 70 83 e2 02 74 2f 4c 03 a3 90 08 00 00 <45> 8b 24 24 48 8b 43 08 0f b7 70 3e 66 90 44 89 e0 48 8b 1c 24 48
>>> RSP: 0018:ffffbeb3c0717c48 EFLAGS: 00010206
>>> RAX: 0000000000000000 RBX: ffff99bae8260000 RCX: 0000000000000000
>>> RDX: 0000000000000000 RSI: 0000000000001970 RDI: ffff99bae8260000
>>> RBP: 0000000000001970 R08: ffffbeb3c0717e08 R09: 0000000000000000
>>> R10: 0000000000000018 R11: fefefefefefefeff R12: 00000000000065c0
>>> R13: ffffbeb3c0717d70 R14: 0000000000000000 R15: 000000010005e340
>>> FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
>>> Call Trace:
>>> <TASK>
>>> amdgpu_i2c_pre_xfer+0x163/0x180 [amdgpu]
>>> bit_xfer+0x36/0x530 [i2c_algo_bit]
>>> __i2c_transfer+0x185/0x550
>>> i2c_transfer+0xa2/0x110
>>> amdgpu_display_ddc_probe+0xbd/0x100 [amdgpu]
>>> amdgpu_connector_vga_detect+0x8e/0x200 [amdgpu]
>>> drm_helper_probe_detect_ctx+0x7b/0xd0 [drm_kms_helper]
>>> output_poll_execute+0x152/0x220 [drm_kms_helper]
>>> process_one_work+0x1ae/0x370
>>> worker_thread+0x4d/0x3b0
>>> ? rescuer_thread+0x380/0x380
>>> kthread+0xe3/0x110
>>> ? kthread_complete_and_exit+0x20/0x20
>>> ret_from_fork+0x22/0x30
>>> </TASK>
>>> Modules linked in: max6650 hwmon_vid parport_pc ppdev lp parport amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci
>>> scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
>>> CR2: 00000000000065c0
>>> ---[ end trace 0000000000000000 ]---
>>> RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>> Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>>> RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>> RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>> RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>> RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>> R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>> R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>> FS: 0000000000000000(0000) GS:ffff99bb67c80000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0

Subject: Re: [git pull] drm fixes for 6.1-rc1

Hi Arthur,

Is this an old Radeon card?

Thanks,
Arun

On 10/17/2022 11:50 AM, Christian König wrote:
> Arun please take a look into this ASAP.
>
> Thanks,
> Christian.
>
> On 17.10.22 at 03:13, Arthur Marsh wrote:
>> Thanks Dave, I reverted patch
>> 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9 against 6.1-rc1 and the
>> resulting kernel loaded amdgpu fine on my pc with Cape Verde GPU.
>>
>> Regards,
>>
>> Arthur.
>>
>> On 17 October 2022 8:14:18 am ACDT, Dave Airlie <[email protected]>
>> wrote:
>>> On Sun, 16 Oct 2022 at 18:09, Arthur Marsh
>>> <[email protected]> wrote:
>>>> From: Arthur Marsh <[email protected]>
>>>>
>>>> Hi, the "drm fixes for 6.1-rc1" commit caused the amdgpu module to
>>>> fail with my Cape Verde radeonsi card.
>>>>
>>>> I haven't been able to bisect the problem to an individual commit, but
>>>> attach a dmesg extract below.
>>>>
>>>> I'm happy to supply any other configuration information and test
>>>> patches.
>>>>
>>> Can you try reverting this commit? It's the only thing I can spot that might
>>> affect a card that old, since most changes in that request were for
>>> display hw you don't have.
>>>
>>> commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9
>>> Author: Arunpravin Paneer Selvam <[email protected]>
>>> Date:   Tue Oct 4 07:33:39 2022 -0700
>>>
>>>     drm/amdgpu: Fix VRAM BO swap issue
>>>
>>>     DRM buddy manager allocates the contiguous memory requests in
>>>     a single block or multiple blocks. So for the ttm move operation
>>>     (in case of low vram memory) we should consider all the blocks to
>>>     compute the total memory size, which is compared with the struct
>>>     ttm_resource num_pages in order to verify that the blocks are
>>>     contiguous for the eviction process.
>>>
>>>     v2: Added a Fixes tag
>>>     v3: Rewrite the code to save a bit of calculations and
>>>         variables (Christian)
>>>
>>>     Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu")
>>>     Signed-off-by: Arunpravin Paneer Selvam
>>> <[email protected]>
>>>     Reviewed-by: Christian König <[email protected]>
>>>     Signed-off-by: Alex Deucher <[email protected]>
>>>
>>>
>>> Thanks,
>>> Dave.
>>>
>>>> Arthur.
>>>>
>>>>   Linux version 6.0.0+ (root@am64) (gcc-12 (Debian 12.2.0-5)
>>>> 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #5179 SMP
>>>> PREEMPT_DYNAMIC Fri Oct 14 17:00:40 ACDT 2022
>>>>   Command line: BOOT_IMAGE=/vmlinuz-6.0.0+
>>>> root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro single
>>>> amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0
>>>> page_owner=on amdgpu.gpu_recovery=1
>>>> ...
>>>>
>>>>   [drm] amdgpu kernel modesetting enabled.
>>>>   amdgpu 0000:01:00.0: vgaarb: deactivate vga console
>>>>   Console: switching to colour dummy device 80x25
>>>>   [drm] initializing kernel modesetting (VERDE 0x1002:0x682B
>>>> 0x1458:0x22CA 0x87).
>>>>   [drm] register mmio base: 0xFE8C0000
>>>>   [drm] register mmio size: 262144
>>>>   [drm] add ip block number 0 <si_common>
>>>>   [drm] add ip block number 1 <gmc_v6_0>
>>>>   [drm] add ip block number 2 <si_ih>
>>>>   [drm] add ip block number 3 <gfx_v6_0>
>>>>   [drm] add ip block number 4 <si_dma>
>>>>   [drm] add ip block number 5 <si_dpm>
>>>>   [drm] add ip block number 6 <dce_v6_0>
>>>>   [drm] add ip block number 7 <uvd_v3_1>
>>>>   [drm] BIOS signature incorrect 5b 7
>>>>   resource sanity check: requesting [mem 0x000c0000-0x000dffff],
>>>> which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dffff
>>>> window]
>>>>   caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
>>>>   amdgpu 0000:01:00.0: No more image in the PCI ROM
>>>>   amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
>>>>   amdgpu: ATOM BIOS: xxx-xxx-xxx
>>>>   amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature
>>>> not supported
>>>>   amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
>>>>   [drm] PCIE gen 2 link speeds already enabled
>>>>   [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment
>>>> size is 9-bit
>>>>   RTL8211B Gigabit Ethernet r8169-0-300:00: attached PHY driver
>>>> (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
>>>>   r8169 0000:03:00.0 eth0: Link is Down
>>>>   amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 -
>>>> 0x000000F47FFFFFFF (2048M used)
>>>>   amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 -
>>>> 0x000000FF3FFFFFFF
>>>>   [drm] Detected VRAM RAM=2048M, BAR=256M
>>>>   [drm] RAM width 128bits DDR3
>>>>   [drm] amdgpu: 2048M of VRAM memory ready
>>>>   [drm] amdgpu: 3979M of GTT memory ready.
>>>>   [drm] GART: num cpu pages 262144, num gpu pages 262144
>>>>   amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at
>>>> 0x000000F400A00000).
>>>>   [drm] Internal thermal controller with fan control
>>>>   [drm] amdgpu: dpm initialized
>>>>   [drm] AMDGPU Display Connectors
>>>>   [drm] Connector 0:
>>>>   [drm]   HDMI-A-1
>>>>   [drm]   HPD1
>>>>   [drm]   DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
>>>>   [drm]   Encoders:
>>>>   [drm]     DFP1: INTERNAL_UNIPHY
>>>>   [drm] Connector 1:
>>>>   [drm]   DVI-D-1
>>>>   [drm]   HPD2
>>>>   [drm]   DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
>>>>   [drm]   Encoders:
>>>>   [drm]     DFP2: INTERNAL_UNIPHY
>>>>   [drm] Connector 2:
>>>>   [drm]   VGA-1
>>>>   [drm]   DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973 0x1973
>>>>   [drm]   Encoders:
>>>>   [drm]     CRT1: INTERNAL_KLDSCP_DAC1
>>>>   [drm] Found UVD firmware Version: 64.0 Family ID: 13
>>>>   amdgpu: Move buffer fallback to memcpy unavailable
>>>>   [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP
>>>> block <uvd_v3_1> failed -19
>>>>   amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
>>>>   amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
>>>>   amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
>>>>   BUG: kernel NULL pointer dereference, address: 0000000000000090
>>>>   #PF: supervisor write access in kernel mode
>>>>   #PF: error_code(0x0002) - not-present page
>>>>   PGD 0 P4D 0
>>>>   Oops: 0002 [#1] PREEMPT SMP NOPTI
>>>>   CPU: 3 PID: 447 Comm: udevd Not tainted 6.0.0+ #5179
>>>>   Hardware name: System manufacturer System Product Name/M3A78 PRO,
>>>> BIOS 1701    01/27/2011
>>>>   RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>>>   Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc
>>>> cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f
>>>> <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>>>>   RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>>>   RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>>>   RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>>>   RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>>>   R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>>>   R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>>>   FS:  00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000)
>>>> knlGS:0000000000000000
>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>   CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>>>>   Call Trace:
>>>>    <TASK>
>>>>    amdgpu_fence_driver_sw_fini+0xc2/0xd0 [amdgpu]
>>>>    amdgpu_device_fini_sw+0x17/0x3c0 [amdgpu]
>>>>    amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>>>>    devm_drm_dev_init_release+0x4a/0x70 [drm]
>>>>    release_nodes+0x40/0xb0
>>>>    devres_release_all+0x89/0xc0
>>>>    device_unbind_cleanup+0xe/0x70
>>>>    really_probe+0x245/0x3a0
>>>>    ? pm_runtime_barrier+0x61/0xb0
>>>>    __driver_probe_device+0x78/0x170
>>>>    driver_probe_device+0x2d/0xb0
>>>>    __driver_attach+0xdc/0x1d0
>>>>    ? __device_attach_driver+0x100/0x100
>>>>    bus_for_each_dev+0x69/0xa0
>>>>    bus_add_driver+0x1d4/0x230
>>>>    ? _raw_spin_unlock+0x15/0x40
>>>>    driver_register+0x89/0xe0
>>>>    ? 0xffffffffc0c3b000
>>>>    do_one_initcall+0x44/0x200
>>>>    ? __kmem_cache_alloc_node+0x90/0x360
>>>>    ? kmalloc_trace+0x38/0xc0
>>>>    do_init_module+0x4a/0x1e0
>>>>    __do_sys_finit_module+0xb5/0x130
>>>>    do_syscall_64+0x3a/0x90
>>>>    entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>>   RIP: 0033:0x7fd81ff5b1b9
>>>>   Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89 f8
>>>> 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05
>>>> <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 1c 0d 00 f7 d8 64 89 01 48
>>>>   RSP: 002b:00007ffc5b37cbb8 EFLAGS: 00000246 ORIG_RAX:
>>>> 0000000000000139
>>>>   RAX: ffffffffffffffda RBX: 000055e5f2f6a140 RCX: 00007fd81ff5b1b9
>>>>   RDX: 0000000000000000 RSI: 000055e5f2f67e30 RDI: 0000000000000017
>>>>   RBP: 000055e5f2f67e30 R08: 0000000000000000 R09: 000055e5f2f46700
>>>>   R10: 0000000000000017 R11: 0000000000000246 R12: 0000000000020000
>>>>   R13: 0000000000000000 R14: 000055e5f2f65b00 R15: 0000000000000024
>>>>    </TASK>
>>>>   Modules linked in: amdgpu(+) snd_emu10k1_synth snd_emux_synth
>>>> snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event
>>>> snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video
>>>> kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper
>>>> snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core
>>>> ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec
>>>> snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss
>>>> snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit
>>>> fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport
>>>> k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button
>>>> sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs
>>>> blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic
>>>> uas usb_storage sg sd_mod hid_generic t10_pi usbhid hid sr_mod
>>>> cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp libahci
>>>> ohci_pci firewire_ohci libata firewire_core crc_itu_t xhci_pci
>>>> scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd
>>>>    realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore
>>>> libphy usb_common
>>>>   CR2: 0000000000000090
>>>>   ---[ end trace 0000000000000000 ]---
>>>>   RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>>>   Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc
>>>> cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f
>>>> <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>>>>   RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>>>   RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>>>   RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>>>   RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>>>   R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>>>   R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>>>   FS:  00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000)
>>>> knlGS:0000000000000000
>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>   CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>>>>   note: udevd[447] exited with preempt_count 1
>>>>   udevd[433]: worker [447] terminated by signal 9 (Killed)
>>>>   udevd[433]: worker [447] failed while handling
>>>> '/devices/pci0000:00/0000:00:02.0/0000:01:00.0'
>>>>   r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
>>>>   IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>>>   Adding 4194300k swap on /dev/sda4.  Priority:-2 extents:1
>>>> across:4194300k FS
>>>>   EXT4-fs (sda5): re-mounted. Quota mode: none.
>>>>   lp: driver loaded but no devices found
>>>>   ppdev: user-space parallel port driver
>>>>   it87: Found IT8716F chip at 0xe80, revision 3
>>>>   ACPI Warning: SystemIO range
>>>> 0x0000000000000E85-0x0000000000000E86 conflicts with OpRegion
>>>> 0x0000000000000E85-0x0000000000000E86 (\_SB.PCI0.SBRG.ASOC.HWRE)
>>>> (20220331/utaddress-204)
>>>>   ACPI: OSL: Resource conflict; ACPI support missing from driver?
>>>>   BUG: unable to handle page fault for address: 00000000000065c0
>>>>   #PF: supervisor read access in kernel mode
>>>>   #PF: error_code(0x0000) - not-present page
>>>>   PGD 0 P4D 0
>>>>   Oops: 0000 [#2] PREEMPT SMP NOPTI
>>>>   CPU: 2 PID: 55 Comm: kworker/2:1 Tainted: G D            6.0.0+
>>>> #5179
>>>>   Hardware name: System manufacturer System Product Name/M3A78 PRO,
>>>> BIOS 1701    01/27/2011
>>>>   Workqueue: events output_poll_execute [drm_kms_helper]
>>>>   RIP: 0010:amdgpu_device_rreg.part.0+0x39/0x100 [amdgpu]
>>>>   Code: 6c 24 08 48 89 fb 4c 89 64 24 10 44 8d 24 b5 00 00 00 00 4c
>>>> 3b a7 88 08 00 00 89 f5 73 70 83 e2 02 74 2f 4c 03 a3 90 08 00 00
>>>> <45> 8b 24 24 48 8b 43 08 0f b7 70 3e 66 90 44 89 e0 48 8b 1c 24 48
>>>>   RSP: 0018:ffffbeb3c0717c48 EFLAGS: 00010206
>>>>   RAX: 0000000000000000 RBX: ffff99bae8260000 RCX: 0000000000000000
>>>>   RDX: 0000000000000000 RSI: 0000000000001970 RDI: ffff99bae8260000
>>>>   RBP: 0000000000001970 R08: ffffbeb3c0717e08 R09: 0000000000000000
>>>>   R10: 0000000000000018 R11: fefefefefefefeff R12: 00000000000065c0
>>>>   R13: ffffbeb3c0717d70 R14: 0000000000000000 R15: 000000010005e340
>>>>   FS:  0000000000000000(0000) GS:ffff99bb67c80000(0000)
>>>> knlGS:0000000000000000
>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>   CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
>>>>   Call Trace:
>>>>    <TASK>
>>>>    amdgpu_i2c_pre_xfer+0x163/0x180 [amdgpu]
>>>>    bit_xfer+0x36/0x530 [i2c_algo_bit]
>>>>    __i2c_transfer+0x185/0x550
>>>>    i2c_transfer+0xa2/0x110
>>>>    amdgpu_display_ddc_probe+0xbd/0x100 [amdgpu]
>>>>    amdgpu_connector_vga_detect+0x8e/0x200 [amdgpu]
>>>>    drm_helper_probe_detect_ctx+0x7b/0xd0 [drm_kms_helper]
>>>>    output_poll_execute+0x152/0x220 [drm_kms_helper]
>>>>    process_one_work+0x1ae/0x370
>>>>    worker_thread+0x4d/0x3b0
>>>>    ? rescuer_thread+0x380/0x380
>>>>    kthread+0xe3/0x110
>>>>    ? kthread_complete_and_exit+0x20/0x20
>>>>    ret_from_fork+0x22/0x30
>>>>    </TASK>
>>>>   Modules linked in: max6650 hwmon_vid parport_pc ppdev lp parport
>>>> amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul
>>>> snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof
>>>> snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd
>>>> drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec
>>>> ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus snd_rawmidi
>>>> snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core
>>>> snd_seq_device drm kvm snd_hwdep snd_pcm_oss snd_mixer_oss evdev
>>>> serio_raw snd_pcm irqbypass i2c_algo_bit fb_sys_fops syscopyarea
>>>> sysfillrect emu10k1_gp pcspkr gameport k10temp snd_timer sysimgblt
>>>> snd acpi_cpufreq wmi soundcore button sp5100_tco asus_atk0110 ext4
>>>> crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress
>>>> libcrc32c crc32c_generic uas usb_storage sg sd_mod hid_generic
>>>> t10_pi usbhid hid sr_mod cdrom crc64_rocksoft crc64 ata_generic
>>>> ahci pata_atiixp libahci ohci_pci firewire_ohci libata
>>>> firewire_core crc_itu_t xhci_pci
>>>>    scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd realtek ehci_hcd
>>>> mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
>>>>   CR2: 00000000000065c0
>>>>   ---[ end trace 0000000000000000 ]---
>>>>   RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>>>   Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc cc
>>>> cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74 0f
>>>> <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9 99 8e
>>>>   RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>>>   RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>>>   RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>>>   RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>>>   R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>>>   R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>>>   FS:  0000000000000000(0000) GS:ffff99bb67c80000(0000)
>>>> knlGS:0000000000000000
>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>   CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
>

2022-10-17 07:28:24

by Christian König

Subject: Re: [git pull] drm fixes for 6.1-rc1

Hi Arun,

The hw generation doesn't matter. This error message here:

amdgpu: Move buffer fallback to memcpy unavailable

indicates that the detection of linear buffers still doesn't work as
expected, or that we have a bug somewhere else.

Maybe the limiting that applies when SDMA moves are not available isn't
working correctly?

Regards,
Christian.
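
The commit message quoted below says that every block of a buddy
allocation has to be considered when deciding whether a buffer is
contiguous. A minimal userspace sketch of that idea in C follows. It
assumes a hypothetical struct vram_block and a hypothetical helper
blocks_are_linear(); neither is the kernel's actual type or function,
and this is not the code from commit 312b4dc11d4f.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for one block handed out by a buddy allocator.
 * This is NOT the kernel's struct drm_buddy_block.
 */
struct vram_block {
	uint64_t start_page;	/* first page covered by the block */
	uint64_t num_pages;	/* length of the block in pages */
};

/*
 * Return true when the blocks, taken in order, form one gap-free range
 * that covers exactly total_pages pages. A single block is the trivial
 * case; several blocks count as linear only if each one starts where
 * the previous one ends.
 */
static bool blocks_are_linear(const struct vram_block *blocks, size_t count,
			      uint64_t total_pages)
{
	uint64_t covered = 0;
	size_t i;

	assert(blocks != NULL || count == 0);

	if (count == 0)
		return false;

	for (i = 0; i < count; i++) {
		/* Any hole between neighbouring blocks breaks linearity. */
		if (i > 0 && blocks[i].start_page !=
		    blocks[i - 1].start_page + blocks[i - 1].num_pages)
			return false;
		covered += blocks[i].num_pages;
	}

	/* The summed size must match the size the resource claims. */
	return covered == total_pages;
}
```

With two blocks {0, 4} and {4, 4} and a total of 8 pages the helper
reports a linear buffer; with {0, 4} and {16, 4} it does not, which is
the kind of case where a memcpy fallback over a single mapping would
not be usable.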

On 17.10.22 at 08:54, Arunpravin Paneer Selvam wrote:
> Hi Arthur,
>
> Is this an old Radeon card?
>
> Thanks,
> Arun
>
> On 10/17/2022 11:50 AM, Christian König wrote:
>> Arun please take a look into this ASAP.
>>
>> Thanks,
>> Christian.
>>
>> On 17.10.22 at 03:13, Arthur Marsh wrote:
>>> Thanks Dave, I reverted patch
>>> 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9 against 6.1-rc1 and the
>>> resulting kernel loaded amdgpu fine on my pc with Cape Verde GPU.
>>>
>>> Regards,
>>>
>>> Arthur.
>>>
>>> On 17 October 2022 8:14:18 am ACDT, Dave Airlie <[email protected]>
>>> wrote:
>>>> On Sun, 16 Oct 2022 at 18:09, Arthur Marsh
>>>> <[email protected]> wrote:
>>>>> From: Arthur Marsh <[email protected]>
>>>>>
>>>>> Hi, the "drm fixes for 6.1-rc1" commit caused the amdgpu module to
>>>>> fail with my Cape Verde radeonsi card.
>>>>>
>>>>> I haven't been able to bisect the problem to an individual commit,
>>>>> but attach a dmesg extract below.
>>>>>
>>>>> I'm happy to supply any other configuration information and test
>>>>> patches.
>>>>>
>>>> Can you try reverting this commit? It's the only thing I can spot that might
>>>> affect a card that old, since most changes in that request were for
>>>> display hw you don't have.
>>>>
>>>> commit 312b4dc11d4f74bfe03ea25ffe04c1f2fdd13cb9
>>>> Author: Arunpravin Paneer Selvam <[email protected]>
>>>> Date:   Tue Oct 4 07:33:39 2022 -0700
>>>>
>>>>     drm/amdgpu: Fix VRAM BO swap issue
>>>>
>>>>     DRM buddy manager allocates the contiguous memory requests in
>>>>     a single block or multiple blocks. So for the ttm move operation
>>>>     (in case of low vram memory) we should consider all the blocks to
>>>>     compute the total memory size, which is compared with the struct
>>>>     ttm_resource num_pages in order to verify that the blocks are
>>>>     contiguous for the eviction process.
>>>>
>>>>     v2: Added a Fixes tag
>>>>     v3: Rewrite the code to save a bit of calculations and
>>>>         variables (Christian)
>>>>
>>>>     Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to
>>>> amdgpu")
>>>>     Signed-off-by: Arunpravin Paneer Selvam
>>>> <[email protected]>
>>>>     Reviewed-by: Christian König <[email protected]>
>>>>     Signed-off-by: Alex Deucher <[email protected]>
>>>>
>>>>
>>>> Thanks,
>>>> Dave.
>>>>
>>>>> Arthur.
>>>>>
>>>>>   Linux version 6.0.0+ (root@am64) (gcc-12 (Debian 12.2.0-5)
>>>>> 12.2.0, GNU ld (GNU Binutils for Debian) 2.39) #5179 SMP
>>>>> PREEMPT_DYNAMIC Fri Oct 14 17:00:40 ACDT 2022
>>>>>   Command line: BOOT_IMAGE=/vmlinuz-6.0.0+
>>>>> root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro single
>>>>> amdgpu.audio=1 amdgpu.si_support=1 radeon.si_support=0
>>>>> page_owner=on amdgpu.gpu_recovery=1
>>>>> ...
>>>>>
>>>>>   [drm] amdgpu kernel modesetting enabled.
>>>>>   amdgpu 0000:01:00.0: vgaarb: deactivate vga console
>>>>>   Console: switching to colour dummy device 80x25
>>>>>   [drm] initializing kernel modesetting (VERDE 0x1002:0x682B
>>>>> 0x1458:0x22CA 0x87).
>>>>>   [drm] register mmio base: 0xFE8C0000
>>>>>   [drm] register mmio size: 262144
>>>>>   [drm] add ip block number 0 <si_common>
>>>>>   [drm] add ip block number 1 <gmc_v6_0>
>>>>>   [drm] add ip block number 2 <si_ih>
>>>>>   [drm] add ip block number 3 <gfx_v6_0>
>>>>>   [drm] add ip block number 4 <si_dma>
>>>>>   [drm] add ip block number 5 <si_dpm>
>>>>>   [drm] add ip block number 6 <dce_v6_0>
>>>>>   [drm] add ip block number 7 <uvd_v3_1>
>>>>>   [drm] BIOS signature incorrect 5b 7
>>>>>   resource sanity check: requesting [mem 0x000c0000-0x000dffff],
>>>>> which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000dffff
>>>>> window]
>>>>>   caller pci_map_rom+0x68/0x1b0 mapping multiple BARs
>>>>>   amdgpu 0000:01:00.0: No more image in the PCI ROM
>>>>>   amdgpu 0000:01:00.0: amdgpu: Fetched VBIOS from ROM BAR
>>>>>   amdgpu: ATOM BIOS: xxx-xxx-xxx
>>>>>   amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature
>>>>> not supported
>>>>>   amdgpu 0000:01:00.0: amdgpu: PCIE atomic ops is not supported
>>>>>   [drm] PCIE gen 2 link speeds already enabled
>>>>>   [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment
>>>>> size is 9-bit
>>>>>   RTL8211B Gigabit Ethernet r8169-0-300:00: attached PHY driver
>>>>> (mii_bus:phy_addr=r8169-0-300:00, irq=MAC)
>>>>>   r8169 0000:03:00.0 eth0: Link is Down
>>>>>   amdgpu 0000:01:00.0: amdgpu: VRAM: 2048M 0x000000F400000000 -
>>>>> 0x000000F47FFFFFFF (2048M used)
>>>>>   amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 -
>>>>> 0x000000FF3FFFFFFF
>>>>>   [drm] Detected VRAM RAM=2048M, BAR=256M
>>>>>   [drm] RAM width 128bits DDR3
>>>>>   [drm] amdgpu: 2048M of VRAM memory ready
>>>>>   [drm] amdgpu: 3979M of GTT memory ready.
>>>>>   [drm] GART: num cpu pages 262144, num gpu pages 262144
>>>>>   amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table
>>>>> at 0x000000F400A00000).
>>>>>   [drm] Internal thermal controller with fan control
>>>>>   [drm] amdgpu: dpm initialized
>>>>>   [drm] AMDGPU Display Connectors
>>>>>   [drm] Connector 0:
>>>>>   [drm]   HDMI-A-1
>>>>>   [drm]   HPD1
>>>>>   [drm]   DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f
>>>>> 0x194f
>>>>>   [drm]   Encoders:
>>>>>   [drm]     DFP1: INTERNAL_UNIPHY
>>>>>   [drm] Connector 1:
>>>>>   [drm]   DVI-D-1
>>>>>   [drm]   HPD2
>>>>>   [drm]   DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953
>>>>> 0x1953
>>>>>   [drm]   Encoders:
>>>>>   [drm]     DFP2: INTERNAL_UNIPHY
>>>>>   [drm] Connector 2:
>>>>>   [drm]   VGA-1
>>>>>   [drm]   DDC: 0x1970 0x1970 0x1971 0x1971 0x1972 0x1972 0x1973
>>>>> 0x1973
>>>>>   [drm]   Encoders:
>>>>>   [drm]     CRT1: INTERNAL_KLDSCP_DAC1
>>>>>   [drm] Found UVD firmware Version: 64.0 Family ID: 13
>>>>>   amdgpu: Move buffer fallback to memcpy unavailable
>>>>>   [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* sw_init of IP
>>>>> block <uvd_v3_1> failed -19
>>>>>   amdgpu 0000:01:00.0: amdgpu: amdgpu_device_ip_init failed
>>>>>   amdgpu 0000:01:00.0: amdgpu: Fatal error during GPU init
>>>>>   amdgpu 0000:01:00.0: amdgpu: amdgpu: finishing device.
>>>>>   BUG: kernel NULL pointer dereference, address: 0000000000000090
>>>>>   #PF: supervisor write access in kernel mode
>>>>>   #PF: error_code(0x0002) - not-present page
>>>>>   PGD 0 P4D 0
>>>>>   Oops: 0002 [#1] PREEMPT SMP NOPTI
>>>>>   CPU: 3 PID: 447 Comm: udevd Not tainted 6.0.0+ #5179
>>>>>   Hardware name: System manufacturer System Product Name/M3A78
>>>>> PRO, BIOS 1701    01/27/2011
>>>>>   RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>>>>   Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc
>>>>> cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74
>>>>> 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9
>>>>> 99 8e
>>>>>   RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>>>>   RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>>>>   RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>>>>   RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>>>>   R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>>>>   R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>>>>   FS:  00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000)
>>>>> knlGS:0000000000000000
>>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>   CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>>>>>   Call Trace:
>>>>>    <TASK>
>>>>>    amdgpu_fence_driver_sw_fini+0xc2/0xd0 [amdgpu]
>>>>>    amdgpu_device_fini_sw+0x17/0x3c0 [amdgpu]
>>>>>    amdgpu_driver_release_kms+0x12/0x30 [amdgpu]
>>>>>    devm_drm_dev_init_release+0x4a/0x70 [drm]
>>>>>    release_nodes+0x40/0xb0
>>>>>    devres_release_all+0x89/0xc0
>>>>>    device_unbind_cleanup+0xe/0x70
>>>>>    really_probe+0x245/0x3a0
>>>>>    ? pm_runtime_barrier+0x61/0xb0
>>>>>    __driver_probe_device+0x78/0x170
>>>>>    driver_probe_device+0x2d/0xb0
>>>>>    __driver_attach+0xdc/0x1d0
>>>>>    ? __device_attach_driver+0x100/0x100
>>>>>    bus_for_each_dev+0x69/0xa0
>>>>>    bus_add_driver+0x1d4/0x230
>>>>>    ? _raw_spin_unlock+0x15/0x40
>>>>>    driver_register+0x89/0xe0
>>>>>    ? 0xffffffffc0c3b000
>>>>>    do_one_initcall+0x44/0x200
>>>>>    ? __kmem_cache_alloc_node+0x90/0x360
>>>>>    ? kmalloc_trace+0x38/0xc0
>>>>>    do_init_module+0x4a/0x1e0
>>>>>    __do_sys_finit_module+0xb5/0x130
>>>>>    do_syscall_64+0x3a/0x90
>>>>>    entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>>>   RIP: 0033:0x7fd81ff5b1b9
>>>>>   Code: 08 44 89 e0 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 48 89
>>>>> f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
>>>>> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 1c 0d 00 f7 d8 64 89
>>>>> 01 48
>>>>>   RSP: 002b:00007ffc5b37cbb8 EFLAGS: 00000246 ORIG_RAX:
>>>>> 0000000000000139
>>>>>   RAX: ffffffffffffffda RBX: 000055e5f2f6a140 RCX: 00007fd81ff5b1b9
>>>>>   RDX: 0000000000000000 RSI: 000055e5f2f67e30 RDI: 0000000000000017
>>>>>   RBP: 000055e5f2f67e30 R08: 0000000000000000 R09: 000055e5f2f46700
>>>>>   R10: 0000000000000017 R11: 0000000000000246 R12: 0000000000020000
>>>>>   R13: 0000000000000000 R14: 000055e5f2f65b00 R15: 0000000000000024
>>>>>    </TASK>
>>>>>   Modules linked in: amdgpu(+) snd_emu10k1_synth snd_emux_synth
>>>>> snd_seq_midi_emul snd_seq_virmidi snd_seq_midi snd_seq_midi_event
>>>>> snd_seq wmi_bmof snd_emu10k1 edac_mce_amd gpu_sched drm_buddy
>>>>> video kvm_amd drm_ttm_helper ttm snd_util_mem drm_display_helper
>>>>> snd_ac97_codec ccp drm_kms_helper snd_hda_codec_hdmi rng_core
>>>>> ac97_bus snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec
>>>>> snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss
>>>>> snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit
>>>>> fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport
>>>>> k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button
>>>>> sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs
>>>>> blake2b_generic xor raid6_pq zstd_compress libcrc32c
>>>>> crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid
>>>>> hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp
>>>>> libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t
>>>>> xhci_pci scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd
>>>>>    realtek ehci_hcd mdio_devres i2c_piix4 scsi_common usbcore
>>>>> libphy usb_common
>>>>>   CR2: 0000000000000090
>>>>>   ---[ end trace 0000000000000000 ]---
>>>>>   RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>>>>   Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc
>>>>> cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74
>>>>> 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9
>>>>> 99 8e
>>>>>   RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>>>>   RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>>>>   RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>>>>   RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>>>>   R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>>>>   R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>>>>   FS:  00007fd81fcd9840(0000) GS:ffff99bb67cc0000(0000)
>>>>> knlGS:0000000000000000
>>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>   CR2: 0000000000000090 CR3: 0000000111822000 CR4: 00000000000006e0
>>>>>   note: udevd[447] exited with preempt_count 1
>>>>>   udevd[433]: worker [447] terminated by signal 9 (Killed)
>>>>>   udevd[433]: worker [447] failed while handling
>>>>> '/devices/pci0000:00/0000:00:02.0/0000:01:00.0'
>>>>>   r8169 0000:03:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
>>>>>   IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>>>>   Adding 4194300k swap on /dev/sda4.  Priority:-2 extents:1
>>>>> across:4194300k FS
>>>>>   EXT4-fs (sda5): re-mounted. Quota mode: none.
>>>>>   lp: driver loaded but no devices found
>>>>>   ppdev: user-space parallel port driver
>>>>>   it87: Found IT8716F chip at 0xe80, revision 3
>>>>>   ACPI Warning: SystemIO range
>>>>> 0x0000000000000E85-0x0000000000000E86 conflicts with OpRegion
>>>>> 0x0000000000000E85-0x0000000000000E86 (\_SB.PCI0.SBRG.ASOC.HWRE)
>>>>> (20220331/utaddress-204)
>>>>>   ACPI: OSL: Resource conflict; ACPI support missing from driver?
>>>>>   BUG: unable to handle page fault for address: 00000000000065c0
>>>>>   #PF: supervisor read access in kernel mode
>>>>>   #PF: error_code(0x0000) - not-present page
>>>>>   PGD 0 P4D 0
>>>>>   Oops: 0000 [#2] PREEMPT SMP NOPTI
>>>>>   CPU: 2 PID: 55 Comm: kworker/2:1 Tainted: G D 6.0.0+ #5179
>>>>>   Hardware name: System manufacturer System Product Name/M3A78
>>>>> PRO, BIOS 1701    01/27/2011
>>>>>   Workqueue: events output_poll_execute [drm_kms_helper]
>>>>>   RIP: 0010:amdgpu_device_rreg.part.0+0x39/0x100 [amdgpu]
>>>>>   Code: 6c 24 08 48 89 fb 4c 89 64 24 10 44 8d 24 b5 00 00 00 00
>>>>> 4c 3b a7 88 08 00 00 89 f5 73 70 83 e2 02 74 2f 4c 03 a3 90 08 00
>>>>> 00 <45> 8b 24 24 48 8b 43 08 0f b7 70 3e 66 90 44 89 e0 48 8b 1c
>>>>> 24 48
>>>>>   RSP: 0018:ffffbeb3c0717c48 EFLAGS: 00010206
>>>>>   RAX: 0000000000000000 RBX: ffff99bae8260000 RCX: 0000000000000000
>>>>>   RDX: 0000000000000000 RSI: 0000000000001970 RDI: ffff99bae8260000
>>>>>   RBP: 0000000000001970 R08: ffffbeb3c0717e08 R09: 0000000000000000
>>>>>   R10: 0000000000000018 R11: fefefefefefefeff R12: 00000000000065c0
>>>>>   R13: ffffbeb3c0717d70 R14: 0000000000000000 R15: 000000010005e340
>>>>>   FS:  0000000000000000(0000) GS:ffff99bb67c80000(0000)
>>>>> knlGS:0000000000000000
>>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>   CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
>>>>>   Call Trace:
>>>>>    <TASK>
>>>>>    amdgpu_i2c_pre_xfer+0x163/0x180 [amdgpu]
>>>>>    bit_xfer+0x36/0x530 [i2c_algo_bit]
>>>>>    __i2c_transfer+0x185/0x550
>>>>>    i2c_transfer+0xa2/0x110
>>>>>    amdgpu_display_ddc_probe+0xbd/0x100 [amdgpu]
>>>>>    amdgpu_connector_vga_detect+0x8e/0x200 [amdgpu]
>>>>>    drm_helper_probe_detect_ctx+0x7b/0xd0 [drm_kms_helper]
>>>>>    output_poll_execute+0x152/0x220 [drm_kms_helper]
>>>>>    process_one_work+0x1ae/0x370
>>>>>    worker_thread+0x4d/0x3b0
>>>>>    ? rescuer_thread+0x380/0x380
>>>>>    kthread+0xe3/0x110
>>>>>    ? kthread_complete_and_exit+0x20/0x20
>>>>>    ret_from_fork+0x22/0x30
>>>>>    </TASK>
>>>>>   Modules linked in: max6650 hwmon_vid parport_pc ppdev lp parport
>>>>> amdgpu(+) snd_emu10k1_synth snd_emux_synth snd_seq_midi_emul
>>>>> snd_seq_virmidi snd_seq_midi snd_seq_midi_event snd_seq wmi_bmof
>>>>> snd_emu10k1 edac_mce_amd gpu_sched drm_buddy video kvm_amd
>>>>> drm_ttm_helper ttm snd_util_mem drm_display_helper snd_ac97_codec
>>>>> ccp drm_kms_helper snd_hda_codec_hdmi rng_core ac97_bus
>>>>> snd_rawmidi snd_hda_intel snd_intel_dspcfg snd_hda_codec
>>>>> snd_hda_core snd_seq_device drm kvm snd_hwdep snd_pcm_oss
>>>>> snd_mixer_oss evdev serio_raw snd_pcm irqbypass i2c_algo_bit
>>>>> fb_sys_fops syscopyarea sysfillrect emu10k1_gp pcspkr gameport
>>>>> k10temp snd_timer sysimgblt snd acpi_cpufreq wmi soundcore button
>>>>> sp5100_tco asus_atk0110 ext4 crc16 mbcache jbd2 btrfs
>>>>> blake2b_generic xor raid6_pq zstd_compress libcrc32c
>>>>> crc32c_generic uas usb_storage sg sd_mod hid_generic t10_pi usbhid
>>>>> hid sr_mod cdrom crc64_rocksoft crc64 ata_generic ahci pata_atiixp
>>>>> libahci ohci_pci firewire_ohci libata firewire_core crc_itu_t
>>>>> xhci_pci
>>>>>    scsi_mod ohci_hcd r8169 ehci_pci xhci_hcd realtek ehci_hcd
>>>>> mdio_devres i2c_piix4 scsi_common usbcore libphy usb_common
>>>>>   CR2: 00000000000065c0
>>>>>   ---[ end trace 0000000000000000 ]---
>>>>>   RIP: 0010:drm_sched_fini+0x80/0xa0 [gpu_sched]
>>>>>   Code: 76 83 0e c4 c6 85 8c 01 00 00 00 5b 5d 41 5c 41 5d c3 cc
>>>>> cc cc cc 4c 8d 63 f0 4c 89 e7 e8 08 99 8e c4 48 8b 03 48 39 d8 74
>>>>> 0f <c6> 80 90 00 00 00 01 48 8b 00 48 39 d8 75 f1 4c 89 e7 e8 c9
>>>>> 99 8e
>>>>>   RSP: 0018:ffffbeb3c06bfbb8 EFLAGS: 00010213
>>>>>   RAX: 0000000000000000 RBX: ffff99bae8269a98 RCX: ffff99bab703afc0
>>>>>   RDX: 0000000000000001 RSI: ffff99bab703afe8 RDI: 0000000000000000
>>>>>   RBP: ffff99bae82699f0 R08: ffffffff85cd0bc2 R09: 0000000000000010
>>>>>   R10: 0000000000000035 R11: ffff99bb594806c0 R12: ffff99bae8269a88
>>>>>   R13: ffff99bae82699f8 R14: ffff99bae82665e8 R15: 0000000000000000
>>>>>   FS:  0000000000000000(0000) GS:ffff99bb67c80000(0000)
>>>>> knlGS:0000000000000000
>>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>   CR2: 00000000000065c0 CR3: 000000008980a000 CR4: 00000000000006e0
>>
>

2022-10-17 08:12:48

by Dave Airlie

Subject: Re: [git pull] drm fixes for 6.1-rc1

On Mon, 17 Oct 2022 at 17:07, Christian König <[email protected]> wrote:
>
> Hi Arun,
>
> the hw generation doesn't matter. This error message here:
>
> amdgpu: Move buffer fallback to memcpy unavailable
>
> indicates that the detection of linear buffers still doesn't work as
> expected or that we have a bug somewhere else.
>
> Maybe the limiting when SDMA moves are not available isn't working
> correctly?

It is a CAPE_VERDE, so maybe something with the SI UVD memory limitations?

Dave.

2022-10-17 08:30:06

by Christian König

Subject: Re: [git pull] drm fixes for 6.1-rc1

Am 17.10.22 um 10:01 schrieb Dave Airlie:
> On Mon, 17 Oct 2022 at 17:07, Christian König <[email protected]> wrote:
>> Hi Arun,
>>
>> the hw generation doesn't matter. This error message here:
>>
>> amdgpu: Move buffer fallback to memcpy unavailable
>>
>> indicates that the detection of linear buffers still doesn't work as
>> expected or that we have a bug somewhere else.
>>
>> Maybe the limiting when SDMA moves are not available isn't working
>> correctly?
> It is a CAPE_VERDE, so maybe something with the SI UVD memory limitations?

Yeah, good point. Could be that we try to move something into the UVD
memory window and that something isn't allocated linearly.

Arun can you trace the allocation and make sure that all kernel
allocations have the CONTIGUOUS flag set?

Thanks,
Christian.

>
> Dave.

Subject: Re: [git pull] drm fixes for 6.1-rc1

Hi Christian,

Looks like we have to exit the loop if there are no blocks to compare.
Maybe that's why the function returns false.

@Arthur Marsh Could you please test the attached patch.

Thanks,
Arun

On 10/17/2022 1:39 PM, Christian König wrote:
> Am 17.10.22 um 10:01 schrieb Dave Airlie:
>> On Mon, 17 Oct 2022 at 17:07, Christian König
>> <[email protected]> wrote:
>>> Hi Arun,
>>>
>>> the hw generation doesn't matter. This error message here:
>>>
>>> amdgpu: Move buffer fallback to memcpy unavailable
>>>
>>> indicates that the detection of linear buffers still doesn't work as
>>> expected or that we have a bug somewhere else.
>>>
>>> Maybe the limiting when SDMA moves are not available isn't working
>>> correctly?
>> It is a CAPE_VERDE, so maybe something with the SI UVD memory
>> limitations?
>
> Yeah, good point. Could be that we try to move something into the UVD
> memory window and that something isn't allocated linearly.
>
> Arun can you trace the allocation and make sure that all kernel
> allocations have the CONTIGUOUS flag set?
>
> Thanks,
> Christian.
>
>>
>> Dave.
>


Attachments:
0001-drm-amdgpu-Fix-for-BO-move-issue.patch (980.00 B)
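
The attached patch is not reproduced in this archive. As a rough illustration of the idea described in the message above (stop walking the block list once there is no next block left to compare against), a minimal self-contained sketch might look like the following. The struct and function names here are hypothetical stand-ins and are not taken from the amdgpu driver or from the patch itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one allocated VRAM block; not the driver's type. */
struct vram_block {
	uint64_t start;            /* byte offset of the block */
	uint64_t size;             /* length of the block in bytes */
	struct vram_block *next;   /* next block in the allocation, or NULL */
};

/*
 * Return true if every block begins exactly where the previous one ends.
 * The loop condition tests cur->next, so the walk stops at the last block
 * instead of comparing it against a non-existent successor.
 */
static bool blocks_contiguous(const struct vram_block *first)
{
	const struct vram_block *cur;

	if (!first)
		return false;   /* nothing allocated: treat as not contiguous */

	for (cur = first; cur->next; cur = cur->next) {
		if (cur->start + cur->size != cur->next->start)
			return false;
	}

	return true;
}
```

In this sketch a single-block allocation is reported as contiguous, because the loop body never runs when the first block has no successor.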

2022-10-18 02:02:41

by Arthur Marsh

Subject: Re: [git pull] drm fixes for 6.1-rc1

Thanks Arunpravin, with your patch applied to the 6.1-rc1 code, the resulting kernel loaded the amdgpu module on my PC with a Cape Verde GPU card with no problems.

Regards,

Arthur.

On 18 October 2022 7:10:45 am ACDT, Arunpravin Paneer Selvam <[email protected]> wrote:
>Hi Christian,
>
>Looks like we have to exit the loop if there are no blocks to compare.
>Maybe that's why the function returns false.
>
>@Arthur Marsh Could you please test the attached patch.
>
>Thanks,
>Arun
>
>On 10/17/2022 1:39 PM, Christian König wrote:
>> Am 17.10.22 um 10:01 schrieb Dave Airlie:
>>> On Mon, 17 Oct 2022 at 17:07, Christian König <[email protected]> wrote:
>>>> Hi Arun,
>>>>
>>>> the hw generation doesn't matter. This error message here:
>>>>
>>>> amdgpu: Move buffer fallback to memcpy unavailable
>>>>
>>>> indicates that the detection of linear buffers still doesn't work as
>>>> expected or that we have a bug somewhere else.
>>>>
>>>> Maybe the limiting when SDMA moves are not available isn't working
>>>> correctly?
>>> It is a CAPE_VERDE, so maybe something with the SI UVD memory limitations?
>>
>> Yeah, good point. Could be that we try to move something into the UVD memory window and that something isn't allocated linearly.
>>
>> Arun can you trace the allocation and make sure that all kernel allocations have the CONTIGUOUS flag set?
>>
>> Thanks,
>> Christian.
>>
>>>
>>> Dave.
>>

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.