Date: Fri, 25 May 2012 18:06:21 +0200
From: Takashi Iwai
To: Jörg-Volker Peetz
Cc: Tejun Heo, Fengguang Wu, linux-kernel@vger.kernel.org
Subject: Re: Linux 3.4 released
In-Reply-To: <4FBFA637.6000704@web.de>

At Fri, 25 May 2012 17:33:11 +0200,
Jörg-Volker Peetz wrote:
>
> Hello,
>
> Takashi Iwai wrote, on 05/25/12 09:25:
> > At Wed, 23 May 2012 13:26:57 -0700,
> > Tejun Heo wrote:
> >>
> >> Cc'ing Takashi.  Hi!
> >
> > Also Cc'ed Fengguang, who worked on ELD stuff.
> >
> >> On Wed, May 23, 2012 at 09:56:36PM +0200, Jörg-Volker Peetz wrote:
> >>> May 23 21:32:33 hostname kernel: XXX delayed_work_timer_fn: cwq
> >>> (null), fn=hdmi_repoll_eld
> >>
> >> So, we have the winner.
> >>
> >> Takashi, sound/pci/hda/patch_hdmi.c::hdmi_repoll_eld() is causing
> >> workqueue code dereference %NULL pointer.  It *looks* like something
> >> is corrupting the work item while it's queued.
> >> It could be a
> >> workqueue bug but I don't think that's likely - the code has been
> >> stable for quite some time now.  I glanced through the code and
> >> nothing stands out.  Does something ring a bell?
> >
> > I also don't know of this problem.  My initial thought was that the
> > work struct placed right after sink_eld in struct hdmi_spec_per_pin is
> > overwritten wrongly by reading some ELD data.  But I failed to spot
> > out the bug...
> >
> > Reading back through the thread, the problem seems triggered via usb
> > video cam.  I wonder how this is connected to the HDMI audio.
> >
> > To get things straight: does this bug happen even without HDMI, DP or
> > DVI cable plugged, i.e. only with the laptop without connecting to the
> > external digital output?
> >
> yes it happens without any HDMI cable plugged. The notebook is only connected to
> an ethernet cable and the power cable. I'll append /var/log/dmesg, it also
> contains the kernel command line with "radeon.audio=1".
>
> The computer has two graphic chips:
> ATI Mobility Radeon HD 4200 integrated graphics (non-free firmware R600_rlc.bin)
> ATI Mobility Radeon HD 5470 graphic (512MB) (non-free firmware CEDAR_*.bin)
> During booting, the discrete GPU is switched off using vga switcheroo:
>
> $ mount -t debugfs none /sys/kernel/debug
> $ echo -n OFF > /sys/kernel/debug/vgaswitcheroo/switch

This explains the codec stall, at least.  Disabling the D-GPU also
disables the HD-audio controller.  Once it's disabled, even accessing
the PCI device may trigger an Oops.  It's a known problem.

Support for vga-switcheroo in HD-audio was added recently, and I sent
a pull request to Linus today.  Try the latest Linus tree and pull the
hda-switcheroo tag of the sound git tree onto it:

  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git tags/hda-switcheroo

I'm not sure whether this is related to the workq Oops, though.
At least, you can try without disabling the D-GPU to check whether you
see the same workq problem.

> For the sound kernel module the following options are set in
> /etc/modprobe.d/alsa-base.conf:
>
> options snd-hda-intel model=hp-dv7-4000 enable_msi=1
>
> >
> >>> (without line-break).
> >>>
> >>> By the way, don't know if this is related, I have a phenomenon with a spurious
> >>> interrupt with every linux version I've used before on this notebook. Half a
> >>> minute after starting the system the computer produces approx. 220 lines like
> >>>
> >>> ... kernel: hda-intel: spurious response 0x0:0x0, last cmd=0x170503
> >>>
> >>> Now with 3.4.0, I see an additional message right before (the minute before) the
> >>> "XXX ..." line:
> >>>
> >>> ...kernel: hda_intel: azx_get_response timeout, switching to single_cmd mode:
> >>> last cmd=0x003f0900
> >>
> >> These too seem to be for you, Takashi. :)
> >
> > This means essentially the codec communication got stalled.  This is a
> > bad signal.  It happens often with a wrong HD-audio verb, but often
> > with a bad IRQ, whatever.
> >
> > I'd need alsa-info.sh output (run with --no-upload option) for further
> > analysis.
> >
> >
> > thanks,
> >
> > Takashi
>
> My first try to run the alsa-info.sh script with the plain 3.4 kernel produced
> the same kernel oops freezing the notebook (and /tmp is mounted on tmpfs).
> Therefore I applied the patch from Tejun to produce a usable output.
> I attach it also. As you will notice, it contains the line beginning with "XXX"
> due to Tejun's patch.

If you run alsa-info.sh on a 3.4 or earlier kernel, run it without
disabling the D-GPU.


thanks,

Takashi