Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752064Ab3FYThr (ORCPT ); Tue, 25 Jun 2013 15:37:47 -0400
Received: from mail-ie0-f180.google.com ([209.85.223.180]:47621 "EHLO
        mail-ie0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751691Ab3FYThi (ORCPT );
        Tue, 25 Jun 2013 15:37:38 -0400
MIME-Version: 1.0
X-Originating-IP: [178.83.130.250]
Date: Tue, 25 Jun 2013 21:37:37 +0200
Subject: Re: Linux 3.10-rc7
From: Daniel Vetter
To: Linus Torvalds
Cc: Shuah Khan, Chris Wilson, Linux Kernel Mailing List,
        "shuahkhan@gmail.com", Dave Airlie, "Barnes, Jesse", intel-gfx
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 9391
Lines: 177

On Tue, Jun 25, 2013 at 9:05 PM, Linus Torvalds wrote:
> Adding the appropriate cc'd.. I'm not seeing why this would start
> happening now, but there's been a number of commits that touch the
> intel crtc 'active' state and hotplug logic, so I'm assuming one of
> them is to blame.. Lots of small changes around
> ironlake_crtc_mode_set() etc.
>
> This warning seems to imply that the pll activity count is buggered.
> Anybody? Chris/Daniel/Dave?

Hm, looks like a new one. On a quick guess it's an ugly interaction
between the new state restore code we've added for vt-switchless
resume in 3.10 and the lack of proper pch pll refcount reconstruction
when taking over foreign state.
I'm thinking of
1) hibernate to disk with every crtc switched off
2) after power cycle boot into the loader kernel with the crtc enabled
3) restore the hibernated kernel image
4) restored kernel reads out current hw state and restores the old
   state by disabling everything
5) we hit the WARN(!pll->active) and also the follow-up assert since
   the hw pch pll is indeed on (it's driving the crtc we're disabling
   after all), but our state reconstruction failed to track this
   properly.

Dunno on a quick guess what to do this late in the -rc to duct-tape
this WARN away, since we should at least try to shut down the pch pll
(which we currently won't do).

For 3.11 I've completely revamped the pch pll code with massively
increased paranoia and much better tracking of the hw state. It's not
all merged yet (some will probably miss 3.11), but the refcounting
part is all in -next. So I think the first step would be to test
latest linux-next (or the drm-intel-nightly branch from
http://cgit.freedesktop.org/~danvet/drm-intel/ if you just want the
drm parts on top of a recent -rc).

Also it'd be good to corroborate my guess of what's going on with a
dmesg with drm debugging enabled (drm.debug=0xe).

Adding more lists to cc + Jesse since he's the guilty one for the
vt-switchless state restore stuff.

Cheers, Daniel

> Linus
>
> On Tue, Jun 25, 2013 at 5:28 AM, Shuah Khan wrote:
>>
>> I started seeing this message during the suspend to disk in reboot mode
>> test back in 3.10-rc6, and 3.10-rc7 has the same problem. I haven't
>> started debugging yet. Looking to see if this is a known issue. I don't
>> see this problem on 3.9.7.
>>
>> [ 2097.548150] WARNING: at drivers/gpu/drm/i915/intel_display.c:1656
>> ironlake_crtc_disable+0x865/0x890 [i915]()
>> [ 2097.548177] Modules linked in: rfcomm bnep parport_pc ppdev arc4
>> iwldvm mac80211 ext2 i915 joydev coretemp kvm_intel snd_hda_codec_hdmi
>> iwlwifi snd_hda_codec_realtek kvm snd_hda_intel snd_hda_codec cfg80211
>> snd_hwdep snd_pcm drm_kms_helper btusb snd_seq_midi bluetooth drm
>> snd_rawmidi uvcvideo snd_seq_midi_event ghash_clmulni_intel snd_seq
>> aesni_intel ablk_helper videobuf2_core cryptd tpm_infineon videodev
>> snd_timer lrw gf128mul snd_seq_device glue_helper psmouse snd aes_x86_64
>> media samsung_laptop videobuf2_vmalloc microcode videobuf2_memops
>> serio_raw soundcore tpm_tis snd_page_alloc i2c_algo_bit lpc_ich wmi
>> video mac_hid lp parport hid_generic usbhid hid r8169
>> [ 2097.548180] CPU: 3 PID: 2736 Comm: kworker/u16:0 Not tainted
>> 3.10.0-rc7+ #2
>> [ 2097.548181] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
>> 900X3C/900X3D/900X4C/900X4D/SAMSUNG_NP1234567890, BIOS P03AAC 07/12/2012
>> [ 2097.548186] Workqueue: events_unbound async_run_entry_fn
>> [ 2097.548189] 0000000000000009 ffff8803f05a5a58 ffffffff81690b04
>> ffff8803f05a5a98
>> [ 2097.548190] ffffffff81043880 ffff88040a978000 ffff8803f074e000
>> ffff88040a978000
>> [ 2097.548192] ffff88040a97a688 00000000000e0300 ffff88040af6e000
>> ffff8803f05a5aa8
>> [ 2097.548192] Call Trace:
>> [ 2097.548197] [] dump_stack+0x19/0x1b
>> [ 2097.548201] [] warn_slowpath_common+0x70/0xa0
>> [ 2097.548203] [] warn_slowpath_null+0x1a/0x20
>> [ 2097.548214] [] ironlake_crtc_disable+0x865/0x890
>> [i915]
>> [ 2097.548223] [] __intel_set_mode+0x458/0xd00 [i915]
>> [ 2097.548232] []
>> intel_modeset_setup_hw_state+0x679/0xa60 [i915]
>> [ 2097.548236] [] ? mutex_lock+0x1d/0x50
>> [ 2097.548242] [] __i915_drm_thaw+0x137/0x1d0 [i915]
>> [ 2097.548249] [] i915_resume+0x7b/0xd0 [i915]
>> [ 2097.548255] [] i915_pm_resume+0x16/0x20 [i915]
>> [ 2097.548258] [] pci_pm_restore+0x73/0xd0
>> [ 2097.548259] [] ? pci_pm_default_resume+0x50/0x50
>> [ 2097.548261] [] ? pci_pm_default_resume+0x50/0x50
>> [ 2097.548264] [] dpm_run_callback.isra.3+0x3b/0x70
>> [ 2097.548265] [] device_resume+0xd4/0x1b0
>> [ 2097.548267] [] async_resume+0x21/0x50
>> [ 2097.548269] [] async_run_entry_fn+0x3b/0x140
>> [ 2097.548272] [] process_one_work+0x16e/0x3f0
>> [ 2097.548274] [] worker_thread+0x122/0x370
>> [ 2097.548276] [] ? rescuer_thread+0x330/0x330
>> [ 2097.548277] [] kthread+0xc0/0xd0
>> [ 2097.548279] [] ? kthread_create_on_node+0x130/0x130
>> [ 2097.548281] [] ret_from_fork+0x7c/0xb0
>> [ 2097.548283] [] ? kthread_create_on_node+0x130/0x130
>> [ 2097.548284] ---[ end trace 989c661118e428a8 ]---
>>
>> [ 2097.548295] WARNING: at drivers/gpu/drm/i915/intel_display.c:1091
>> assert_pch_pll+0x18f/0x1f0 [i915]()
>> [ 2097.548296] PCH PLL state for reg c6014 assertion failure (expected
>> off, current on), val=89086008
>> [ 2097.548315] Modules linked in: rfcomm bnep parport_pc ppdev arc4
>> iwldvm mac80211 ext2 i915 joydev coretemp kvm_intel snd_hda_codec_hdmi
>> iwlwifi snd_hda_codec_realtek kvm snd_hda_intel snd_hda_codec cfg80211
>> snd_hwdep snd_pcm drm_kms_helper btusb snd_seq_midi bluetooth drm
>> snd_rawmidi uvcvideo snd_seq_midi_event ghash_clmulni_intel snd_seq
>> aesni_intel ablk_helper videobuf2_core cryptd tpm_infineon videodev
>> snd_timer lrw gf128mul snd_seq_device glue_helper psmouse snd aes_x86_64
>> media samsung_laptop videobuf2_vmalloc microcode videobuf2_memops
>> serio_raw soundcore tpm_tis snd_page_alloc i2c_algo_bit lpc_ich wmi
>> video mac_hid lp parport hid_generic usbhid hid r8169
>> [ 2097.548317] CPU: 3 PID: 2736 Comm: kworker/u16:0 Tainted: G W
>> 3.10.0-rc7+ #2
>> [ 2097.548318] Hardware name: SAMSUNG ELECTRONICS CO., LTD.
>> 900X3C/900X3D/900X4C/900X4D/SAMSUNG_NP1234567890, BIOS P03AAC 07/12/2012
>> [ 2097.548321] Workqueue: events_unbound async_run_entry_fn
>> [ 2097.548323] 0000000000000009 ffff8803f05a59b8 ffffffff81690b04
>> ffff8803f05a59f8
>> [ 2097.548324] ffffffff81043880 0000000000000009 ffff88040a978000
>> ffff88040a97a688
>> [ 2097.548325] 0000000000000000 0000000000000000 0000000089086008
>> ffff8803f05a5a58
>> [ 2097.548326] Call Trace:
>> [ 2097.548328] [] dump_stack+0x19/0x1b
>> [ 2097.548330] [] warn_slowpath_common+0x70/0xa0
>> [ 2097.548332] [] warn_slowpath_fmt+0x46/0x50
>> [ 2097.548360] [] assert_pch_pll+0x18f/0x1f0 [i915]
>> [ 2097.548382] [] ironlake_crtc_disable+0x5a9/0x890
>> [i915]
>> [ 2097.548404] [] __intel_set_mode+0x458/0xd00 [i915]
>> [ 2097.548429] []
>> intel_modeset_setup_hw_state+0x679/0xa60 [i915]
>> [ 2097.548438] [] ? mutex_lock+0x1d/0x50
>> [ 2097.548454] [] __i915_drm_thaw+0x137/0x1d0 [i915]
>> [ 2097.548472] [] i915_resume+0x7b/0xd0 [i915]
>> [ 2097.548490] [] i915_pm_resume+0x16/0x20 [i915]
>> [ 2097.548496] [] pci_pm_restore+0x73/0xd0
>> [ 2097.548500] [] ? pci_pm_default_resume+0x50/0x50
>> [ 2097.548504] [] ? pci_pm_default_resume+0x50/0x50
>> [ 2097.548509] [] dpm_run_callback.isra.3+0x3b/0x70
>> [ 2097.548513] [] device_resume+0xd4/0x1b0
>> [ 2097.548517] [] async_resume+0x21/0x50
>> [ 2097.548524] [] async_run_entry_fn+0x3b/0x140
>> [ 2097.548529] [] process_one_work+0x16e/0x3f0
>> [ 2097.548534] [] worker_thread+0x122/0x370
>> [ 2097.548540] [] ? rescuer_thread+0x330/0x330
>> [ 2097.548544] [] kthread+0xc0/0xd0
>> [ 2097.548549] [] ? kthread_create_on_node+0x130/0x130
>> [ 2097.548553] [] ret_from_fork+0x7c/0xb0
>> [ 2097.548558] [] ? kthread_create_on_node+0x130/0x130
>> [ 2097.548560] ---[ end trace 989c661118e428a9 ]---
>>
>> -- Shuah
>>
>> Shuah Khan, Linux Kernel Developer - Open Source Group Samsung Research
>> America (Silicon Valley) shuah.kh@samsung.com | (970) 672-0658

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch