2011-05-22 13:01:41

by Olaf Freyer

Subject: Erroneous package power limit notification since kernel 2.6.39

Hi,

Since switching to kernel 2.6.39, I receive the reports below right after
rebooting my system while it is still "warm".
(Sensors report about 50°C. I don't see these reports on a cold boot early
in the morning.)
I don't get any such reports with 2.6.38.7 or any earlier kernel I have
used.

May 22 14:41:34 localhost kernel: [ 57.525844] CPU4: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525848] CPU0: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525851] CPU1: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525854] CPU2: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525856] CPU5: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525859] CPU3: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525861] Disabling lock debugging due to kernel taint
May 22 14:41:34 localhost kernel: [ 57.525869] CPU6: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525872] CPU7: Package power limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.536890] CPU1: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536893] CPU4: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536896] CPU2: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536899] CPU3: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536901] CPU5: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536904] CPU0: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536915] CPU6: Package power limit normal
May 22 14:41:34 localhost kernel: [ 57.536918] CPU7: Package power limit normal

When I run "mcelog --syslog", I receive these reports in /var/log/messages:

May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 0
May 22 14:41:55 localhost mcelog: CPU 1 THERMAL EVENT TSC 263055cdd1
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 1 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 1 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 1
May 22 14:41:55 localhost mcelog: CPU 2 THERMAL EVENT TSC 263055e95d
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 2 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 2 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 2
May 22 14:41:55 localhost mcelog: CPU 5 THERMAL EVENT TSC 263055fe7b
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 5 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 5 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 3
May 22 14:41:55 localhost mcelog: CPU 3 THERMAL EVENT TSC 2630561550
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 3 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 3 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 4
May 22 14:41:55 localhost mcelog: CPU 0 THERMAL EVENT TSC 26305627e1
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 0 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 0 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 5
May 22 14:41:55 localhost mcelog: CPU 4 THERMAL EVENT TSC 2630562c03
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 4 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 4 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 6
May 22 14:41:55 localhost mcelog: CPU 7 THERMAL EVENT TSC 2630568665
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 7 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 7 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 7
May 22 14:41:55 localhost mcelog: CPU 6 THERMAL EVENT TSC 2630568af6
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 6 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260c00 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 6 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 8
May 22 14:41:55 localhost mcelog: CPU 4 THERMAL EVENT TSC 2631c812ca
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 4 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 4 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 9
May 22 14:41:55 localhost mcelog: CPU 2 THERMAL EVENT TSC 2631c82b05
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 2 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 2 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 10
May 22 14:41:55 localhost mcelog: CPU 3 THERMAL EVENT TSC 2631c83eb2
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 3 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 3 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 11
May 22 14:41:55 localhost mcelog: CPU 5 THERMAL EVENT TSC 2631c852d5
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 5 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 5 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 12
May 22 14:41:55 localhost mcelog: CPU 0 THERMAL EVENT TSC 2631c8668e
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 0 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 0 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 13
May 22 14:41:55 localhost mcelog: CPU 1 THERMAL EVENT TSC 2631c86895
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 1 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 1 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 14
May 22 14:41:55 localhost mcelog: CPU 7 THERMAL EVENT TSC 2631c8df4e
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 7 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 7 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
May 22 14:41:55 localhost mcelog: Unsupported new Family 6 Model 2a CPU:
only decoding architectural errors
May 22 14:41:55 localhost mcelog: HARDWARE ERROR. This is *NOT* a
software problem!
May 22 14:41:55 localhost mcelog: Please contact your hardware vendor
May 22 14:41:55 localhost mcelog: MCE 15
May 22 14:41:55 localhost mcelog: CPU 6 THERMAL EVENT TSC 2631c8e40f
May 22 14:41:55 localhost mcelog: TIME 1306068115 Sun May 22 14:41:55 2011
May 22 14:41:55 localhost mcelog: Processor 6 below trip temperature.
Throttling disabled
May 22 14:41:55 localhost mcelog: STATUS c000000088260800 MCGSTATUS 0
May 22 14:41:55 localhost mcelog: MCGCAP c09 APICID 6 SOCKETID 0
May 22 14:41:55 localhost mcelog: CPUID Vendor Intel Family 6 Model 42
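
As a rough aid for reading the two STATUS values in the mcelog output above
(c000000088260c00 and c000000088260800), here is a minimal sketch. It assumes
the low 32 bits are the raw thermal status MSR value and that the bit layout
follows the Intel SDM description of the thermal status registers (bit 10 =
power limitation status, bit 11 = power limitation log, bits 22:16 = digital
readout). The meaning of the top two bits is only a guess (possibly an event
tag added by the kernel), and the function name is made up for illustration.

```python
def decode_therm_status(status: int) -> dict:
    """Decode a few fields of a thermal STATUS value as printed by mcelog.

    Assumed layout (Intel SDM thermal status MSRs), not verified against
    this particular CPU model:
      bit 10     power limitation status (currently active)
      bit 11     power limitation log (sticky: has happened since last clear)
      bits 22:16 digital readout (degrees below the throttle temperature)
    """
    low = status & 0xFFFFFFFF
    return {
        # Assumption: the top two bits may be a kernel-added event tag.
        "top_two_bits": status >> 62,
        "power_limit_active": bool(low & (1 << 10)),
        "power_limit_logged": bool(low & (1 << 11)),
        "digital_readout": (low >> 16) & 0x7F,
    }


for raw in (0xC000000088260C00, 0xC000000088260800):
    print(hex(raw), decode_therm_status(raw))
```

Under these assumptions, the first value has both the status and the log bit
set, while the second has only the log bit set, which would fit the
"notification" followed by "normal" messages in the kernel log.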

I haven't dealt with any kernel issues before, so I'm somewhat lost as to
what to do next.
If there is anything I can do to track down the cause, or any additional
information I need to provide,
just tell me what to do.

Best regards,
Olaf Freyer


2011-05-25 15:45:09

by Olaf Freyer

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

Hi,

I tried to git-bisect to find the root cause of this issue, but feel somewhat
lost after spending two evenings without any useful results.

I started with v2.6.38 as last known good and v2.6.39-rc1 as first known bad
version. My presumption was that bisection would now just happen between those
two versions - but somehow I ended up building 2.6.38-rc3+, 2.6.38-rc7+,
2.6.38-rc2+ and 2.6.38-rc1+ during this process, i.e. versions that lie outside
the range I intended to bisect. Or maybe I just haven't understood the whole
process yet.

Both of my bisection attempts ended near the same set of drm/i915 changes that
resulted in non-bootable kernels. Considering I'm using an Intel(R) Core(TM)
i7-2720QM with Intel Sandy Bridge integrated graphics, those might sound
somewhat plausible to someone who knows the internals, but they don't help me
at all.

I'm not yet 100% sure, but I think the package power limit notification events I
see in /var/log/messages might coincide with the moment of my xorg startup. (But
those are just wall clock time estimations.) I will try some bootups without X
later this evening and report back if those package power limit notification
events also trigger without X running.

The "result" of my last git bisection attempt was:

git bisect start
# good: [521cb40b0c44418a4fd36dc633f575813d59a43d] Linux 2.6.38
git bisect good 521cb40b0c44418a4fd36dc633f575813d59a43d
# bad: [0ce790e7d736cedc563e1fb4e998babf5a4dbc3d] Linux 2.6.39-rc1
git bisect bad 0ce790e7d736cedc563e1fb4e998babf5a4dbc3d
# bad: [179198373cf374f0ef793f1023c1cdd83b53674d] Merge branch 'nfs-for-2.6.39'
of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
git bisect bad 179198373cf374f0ef793f1023c1cdd83b53674d
# good: [6445ced8670f37cfc2c5e24a9de9b413dbfc788d] Merge branch 'staging-next'
of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
git bisect good 6445ced8670f37cfc2c5e24a9de9b413dbfc788d
# good: [d72751ede1b9bf993d7bd3377305c8e9e36a3cc4] Merge branch 'for-davem' of
ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
git bisect good d72751ede1b9bf993d7bd3377305c8e9e36a3cc4
# good: [16d8775700f1815076f879719ce14b33f50a3171] Merge branch 'for-linus' of
master.kernel.org:/home/rmk/linux-2.6-arm
git bisect good 16d8775700f1815076f879719ce14b33f50a3171
# bad: [fd34b0dee4d237ce9332cc62b03adebfe4fa9f9d] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6
git bisect bad fd34b0dee4d237ce9332cc62b03adebfe4fa9f9d
# good: [bd35fe5a7930bf83ed56422ea4e4b6471ee6f739] drm/nouveau: fix
__nouveau_fence_wait performance
git bisect good bd35fe5a7930bf83ed56422ea4e4b6471ee6f739
# bad: [942b0e95c34f1ba432d08e1c0288ed032d32c3b2] drm/radeon/kms: fix typo in
atom overscan setup
git bisect bad 942b0e95c34f1ba432d08e1c0288ed032d32c3b2
# bad: [308977ac03117706c342099a40919b3da2667cce] drm/i915: Use DEBUG_KMS for
the self-refresh watermarks
git bisect bad 308977ac03117706c342099a40919b3da2667cce
# skip: [417ae1476de3ae9689a374d70565f41b3474641e] drm/i915: Include TLB miss
latency in g4x watermark computations
git bisect skip 417ae1476de3ae9689a374d70565f41b3474641e
# skip: [b0b544cd37c060e261afb2cf486296983fcb56da] drm/i915: Use PM QoS to
prevent C-State starvation of gen3 GPU
git bisect skip b0b544cd37c060e261afb2cf486296983fcb56da
# skip: [fe4402931e43e81a4129eba41d05cf8907603af5] Merge branch
'drm-intel-fixes' into drm-intel-next
git bisect skip fe4402931e43e81a4129eba41d05cf8907603af5
# bad: [987a709e1589cf10e250e04ce9df910b735d4f60] drm/i915: remove now
unnecessary delays in eDP panel power sequencing
git bisect bad 987a709e1589cf10e250e04ce9df910b735d4f60
# skip: [18b2190ca5bd3f19717421b1591c79c9b0372428] drm/i915: allow 945 to
control self refresh (CxSR) automatically
git bisect skip 18b2190ca5bd3f19717421b1591c79c9b0372428
# skip: [6f06ce184c765fd8d50669a8d12fdd566c920859] drm/i915: set phase sync
pointer override enable before setting phase sync pointer
git bisect skip 6f06ce184c765fd8d50669a8d12fdd566c920859
# skip: [b24e71798871089da1a4ab049db2800afc1aac0c] drm/i915: add pipe/plane
enable/disable functions
git bisect skip b24e71798871089da1a4ab049db2800afc1aac0c
# skip: [f67a559daaa0e2ba616bfe9438f202bc57bc8c72] drm/i915: skip FDI & PCH
enabling for DP_A
git bisect skip f67a559daaa0e2ba616bfe9438f202bc57bc8c72
# skip: [ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0] drm/i915: add panel lock
assertion function
git bisect skip ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0
# skip: [ccab5c82759e2ace74b2e84f82d1e0eedd932571] drm/i915: tune Sandy Bridge
DRPS constants
git bisect skip ccab5c82759e2ace74b2e84f82d1e0eedd932571
# skip: [a37f2f87edc1b6e5932becf6e51535d36b690f2a] drm/i915: Remove unused code:
i915_enable_interrupt()
git bisect skip a37f2f87edc1b6e5932becf6e51535d36b690f2a
# skip: [aa9b500ddf1a6318e7cf8b1754696edddae86db9] drm/i915: Honour LVDS sync
polarity from EDID
git bisect skip aa9b500ddf1a6318e7cf8b1754696edddae86db9
# skip: [c0c06bd244179f754d68684fd87674585a153e40] drm/i915/ringbuffer: Kill an
annoyingly frequent debug message
git bisect skip c0c06bd244179f754d68684fd87674585a153e40
# skip: [311bd68e024f9006db66cbadc3bd9f62fd663f4b] drm/i915: Trivial sparse fixes
git bisect skip 311bd68e024f9006db66cbadc3bd9f62fd663f4b
# skip: [63d7bbe9ded4146e3f78e5742b119fa1fdb52665] drm/i915: add PLL
enable/disable functions
git bisect skip 63d7bbe9ded4146e3f78e5742b119fa1fdb52665
# skip: [0fc932b8ec36116bb759105ce910b0475e63112a] drm/i915: factor out FDI
disable and add FDI assertions
git bisect skip 0fc932b8ec36116bb759105ce910b0475e63112a
# bad: [bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2] Merge branch 'drm-intel-fixes'
into drm-intel-next
git bisect bad bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2
# skip: [040484af3a4efa65786b6e107fbe74747679e17c] drm/i915: add transcoder
enable/disable functions
git bisect skip 040484af3a4efa65786b6e107fbe74747679e17c
# skip: [633f2ea26665d37bb3c8ae30799aa14988622653] drm/i915: Disable SSC for
outputs other than LVDS or DP
git bisect skip 633f2ea26665d37bb3c8ae30799aa14988622653
# skip: [d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e] drm/i915: assert panel is
unlocked before writing transcoder timing regs
git bisect skip d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e
# skip: [9a4114ffa7b6f5f4635e3745a8dc051d15d4596a] drm/i915/bios: Change default
clock source on PineView to use SSC
git bisect skip 9a4114ffa7b6f5f4635e3745a8dc051d15d4596a
# skip: [92f2584a083986c05fc811bbdf380c3fa7c12296] drm/i915: add PCH DPLL
enable/disable functions
git bisect skip 92f2584a083986c05fc811bbdf380c3fa7c12296
# skip: [65993d64a31844ad444694efb2d159eb9c883e49] drm/i915: don't enable plane,
pipe and PLL prematurely
git bisect skip 65993d64a31844ad444694efb2d159eb9c883e49
# skip: [01fe9dbde19a1a27b8ee63e2d964562962e1eb78] drm/i915: Use ACPI OpRegion
to determine lid status
git bisect skip 01fe9dbde19a1a27b8ee63e2d964562962e1eb78

Regards
Olaf Freyer

2011-05-29 19:53:34

by Maciej Rutecki

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

On Sunday, 22 May 2011 at 15:01:30 Olaf Freyer wrote:
> Hi,
>
> Since switching to kernel 2.6.39, I receive the reports below right after
> rebooting my system while it is still "warm".
> (Sensors report about 50°C. I don't see these reports on a cold boot early
> in the morning.)
> I don't get any such reports with 2.6.38.7 or any earlier kernel I have
> used.
>

I created a Bugzilla entry at
https://bugzilla.kernel.org/show_bug.cgi?id=36182
for your bug report, please add your address to the CC list in there, thanks!

--
Maciej Rutecki
http://www.maciek.unixy.pl

2011-06-02 16:51:13

by Olaf Freyer

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

Hi,

> I'm not yet 100% sure, but I think the package power limit notification events
> I see in /var/log/messages might coincide with the moment of my xorg startup.

I tested it for a few evenings and I can now confirm my previous assumption.
Running the system on console for hours doesn't trigger any such event -
as soon as I start up X it happens at once.

Is there any extra info I can provide to get this issue solved?

Regards
Olaf Freyer

2011-06-26 16:28:17

by Florian Mickler

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

On Wed, 25 May 2011 15:40:24 +0000 (UTC)
Olaf Freyer wrote:

> Hi,
>
> I tried to git-bisect to find the root cause of this issue, but feel somewhat
> lost after spending two evenings without any useful results.
>
> I started with v2.6.38 as last known good and v2.6.39-rc1 as first known bad
> version. My presumption was that bisection would now just happen between those
> two versions - but somehow I ended up building 2.6.38-rc3+, 2.6.38-rc7+,
> 2.6.38-rc2+ and 2.6.38-rc1+ during this process, i.e. versions that lie outside
> the range I intended to bisect. Or maybe I just haven't understood the whole
> process yet.

This is to be expected. You are asked to test commits that were based
on a version before 2.6.38 but were merged into Linus' tree after
2.6.38. They therefore entered Linus' tree between 2.6.38 and 2.6.39,
but the kernel built from them still reports a pre-2.6.38 version.

I hope you did trust git-bisect and tested those versions?
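
To make the explanation above concrete, here is a toy model of the situation.
It is only a sketch: the commit names and the graph are made up, and the real
git implementation is more involved. It treats the bisection candidates as
"everything reachable from the bad commit that is not reachable from the good
commit", which is why commits on a branch that forked before the release can
show up.

```python
# Toy commit graph: each commit maps to its list of parents.
# All names are hypothetical and only mimic the shape of the real history.
parents = {
    "v2.6.38-rc1": [],
    "rc-work": ["v2.6.38-rc1"],
    "v2.6.38": ["rc-work"],
    "topic-1": ["v2.6.38-rc1"],        # topic branch forked before the release
    "topic-2": ["topic-1"],
    "merge": ["v2.6.38", "topic-2"],   # ...but merged after the release
    "v2.6.39-rc1": ["merge"],
}


def ancestors(commit):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


# Commits that bisection has to consider: reachable from bad, not from good.
candidates = ancestors("v2.6.39-rc1") - ancestors("v2.6.38")
print(sorted(candidates))
```

In this toy graph the candidates include topic-1 and topic-2, even though
both were created on top of v2.6.38-rc1 and a build of them would identify
itself as an -rc1-based kernel.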

>
> Both of my bisection attempts ended near the same set of drm/i915 changes that
> resulted in non-bootable kernels. Considering I'm using an Intel(R) Core(TM)
> i7-2720QM with Intel Sandy Bridge integrated graphics, those might sound
> somewhat plausible to someone who knows the internals, but they don't help me
> at all.

There are two untested commits left, but they don't seem to be relevant.
Guessing the results for those gives me:

The first bad commit could be any of:
b0b544cd37c060e261afb2cf486296983fcb56da
f67a559daaa0e2ba616bfe9438f202bc57bc8c72
18b2190ca5bd3f19717421b1591c79c9b0372428
6f06ce184c765fd8d50669a8d12fdd566c920859
0fc932b8ec36116bb759105ce910b0475e63112a
311bd68e024f9006db66cbadc3bd9f62fd663f4b
040484af3a4efa65786b6e107fbe74747679e17c
ccab5c82759e2ace74b2e84f82d1e0eedd932571
aa9b500ddf1a6318e7cf8b1754696edddae86db9
d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e
92f2584a083986c05fc811bbdf380c3fa7c12296
9a4114ffa7b6f5f4635e3745a8dc051d15d4596a
633f2ea26665d37bb3c8ae30799aa14988622653
63d7bbe9ded4146e3f78e5742b119fa1fdb52665
417ae1476de3ae9689a374d70565f41b3474641e
ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0
b24e71798871089da1a4ab049db2800afc1aac0c
fe4402931e43e81a4129eba41d05cf8907603af5
65993d64a31844ad444694efb2d159eb9c883e49
c0c06bd244179f754d68684fd87674585a153e40
01fe9dbde19a1a27b8ee63e2d964562962e1eb78
a37f2f87edc1b6e5932becf6e51535d36b690f2a
bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2
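
For anyone who wants to cross-check a candidate list like the one above
against a bisect log, here is a small sketch. It assumes the log has the
plain "git bisect skip <hash>" line format shown in the quoted log, with each
command on its own line; the function name is made up, and the excerpt is
just four lines copied from that log.

```python
import re


def skipped_commits(bisect_log: str) -> list[str]:
    """Collect the hashes from 'git bisect skip <hash>' lines, in log order."""
    return re.findall(
        r"^git bisect skip ([0-9a-f]+)$", bisect_log, flags=re.MULTILINE
    )


# Short excerpt copied from the bisect log in this thread.
excerpt = """\
git bisect bad 308977ac03117706c342099a40919b3da2667cce
git bisect skip 417ae1476de3ae9689a374d70565f41b3474641e
git bisect skip b0b544cd37c060e261afb2cf486296983fcb56da
git bisect bad 987a709e1589cf10e250e04ce9df910b735d4f60
"""

for commit in skipped_commits(excerpt):
    print(commit)
```

Run over the full log, this should yield the skipped commits, which together
with the last commit marked bad make up the list of possible first bad
commits.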



I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
cause. Can you check whether the appended revert of that commit makes
the notifications disappear?


>
> I'm not yet 100% sure, but I think the package power limit notification events I
> see in /var/log/messages might coincide with the moment of my xorg startup. (But
> those are just wall clock time estimations.) I will try some bootups without X
> later this evening and report back if those package power limit notification
> events also trigger without X running.
>
> The "result" of my last git bisection attempt was:
>
> git bisect start
> # good: [521cb40b0c44418a4fd36dc633f575813d59a43d] Linux 2.6.38
> git bisect good 521cb40b0c44418a4fd36dc633f575813d59a43d
> # bad: [0ce790e7d736cedc563e1fb4e998babf5a4dbc3d] Linux 2.6.39-rc1
> git bisect bad 0ce790e7d736cedc563e1fb4e998babf5a4dbc3d
> # bad: [179198373cf374f0ef793f1023c1cdd83b53674d] Merge branch 'nfs-for-2.6.39'
> of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
> git bisect bad 179198373cf374f0ef793f1023c1cdd83b53674d
> # good: [6445ced8670f37cfc2c5e24a9de9b413dbfc788d] Merge branch 'staging-next'
> of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
> git bisect good 6445ced8670f37cfc2c5e24a9de9b413dbfc788d
> # good: [d72751ede1b9bf993d7bd3377305c8e9e36a3cc4] Merge branch 'for-davem' of
> ssh://master.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
> git bisect good d72751ede1b9bf993d7bd3377305c8e9e36a3cc4
> # good: [16d8775700f1815076f879719ce14b33f50a3171] Merge branch 'for-linus' of
> master.kernel.org:/home/rmk/linux-2.6-arm
> git bisect good 16d8775700f1815076f879719ce14b33f50a3171
> # bad: [fd34b0dee4d237ce9332cc62b03adebfe4fa9f9d] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6
> git bisect bad fd34b0dee4d237ce9332cc62b03adebfe4fa9f9d
> # good: [bd35fe5a7930bf83ed56422ea4e4b6471ee6f739] drm/nouveau: fix
> __nouveau_fence_wait performance
> git bisect good bd35fe5a7930bf83ed56422ea4e4b6471ee6f739
> # bad: [942b0e95c34f1ba432d08e1c0288ed032d32c3b2] drm/radeon/kms: fix typo in
> atom overscan setup
> git bisect bad 942b0e95c34f1ba432d08e1c0288ed032d32c3b2
> # bad: [308977ac03117706c342099a40919b3da2667cce] drm/i915: Use DEBUG_KMS for
> the self-refresh watermarks
> git bisect bad 308977ac03117706c342099a40919b3da2667cce
> # skip: [417ae1476de3ae9689a374d70565f41b3474641e] drm/i915: Include TLB miss
> latency in g4x watermark computations
> git bisect skip 417ae1476de3ae9689a374d70565f41b3474641e
> # skip: [b0b544cd37c060e261afb2cf486296983fcb56da] drm/i915: Use PM QoS to
> prevent C-State starvation of gen3 GPU
> git bisect skip b0b544cd37c060e261afb2cf486296983fcb56da
> # skip: [fe4402931e43e81a4129eba41d05cf8907603af5] Merge branch
> 'drm-intel-fixes' into drm-intel-next
> git bisect skip fe4402931e43e81a4129eba41d05cf8907603af5
> # bad: [987a709e1589cf10e250e04ce9df910b735d4f60] drm/i915: remove now
> unnecessary delays in eDP panel power sequencing
> git bisect bad 987a709e1589cf10e250e04ce9df910b735d4f60
> # skip: [18b2190ca5bd3f19717421b1591c79c9b0372428] drm/i915: allow 945 to
> control self refresh (CxSR) automatically
> git bisect skip 18b2190ca5bd3f19717421b1591c79c9b0372428
> # skip: [6f06ce184c765fd8d50669a8d12fdd566c920859] drm/i915: set phase sync
> pointer override enable before setting phase sync pointer
> git bisect skip 6f06ce184c765fd8d50669a8d12fdd566c920859
> # skip: [b24e71798871089da1a4ab049db2800afc1aac0c] drm/i915: add pipe/plane
> enable/disable functions
> git bisect skip b24e71798871089da1a4ab049db2800afc1aac0c
> # skip: [f67a559daaa0e2ba616bfe9438f202bc57bc8c72] drm/i915: skip FDI & PCH
> enabling for DP_A
> git bisect skip f67a559daaa0e2ba616bfe9438f202bc57bc8c72
> # skip: [ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0] drm/i915: add panel lock
> assertion function
> git bisect skip ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0
> # skip: [ccab5c82759e2ace74b2e84f82d1e0eedd932571] drm/i915: tune Sandy Bridge
> DRPS constants
> git bisect skip ccab5c82759e2ace74b2e84f82d1e0eedd932571
> # skip: [a37f2f87edc1b6e5932becf6e51535d36b690f2a] drm/i915: Remove unused code:
> i915_enable_interrupt()
> git bisect skip a37f2f87edc1b6e5932becf6e51535d36b690f2a
> # skip: [aa9b500ddf1a6318e7cf8b1754696edddae86db9] drm/i915: Honour LVDS sync
> polarity from EDID
> git bisect skip aa9b500ddf1a6318e7cf8b1754696edddae86db9
> # skip: [c0c06bd244179f754d68684fd87674585a153e40] drm/i915/ringbuffer: Kill an
> annoyingly frequent debug message
> git bisect skip c0c06bd244179f754d68684fd87674585a153e40
> # skip: [311bd68e024f9006db66cbadc3bd9f62fd663f4b] drm/i915: Trivial sparse fixes
> git bisect skip 311bd68e024f9006db66cbadc3bd9f62fd663f4b
> # skip: [63d7bbe9ded4146e3f78e5742b119fa1fdb52665] drm/i915: add PLL
> enable/disable functions
> git bisect skip 63d7bbe9ded4146e3f78e5742b119fa1fdb52665
> # skip: [0fc932b8ec36116bb759105ce910b0475e63112a] drm/i915: factor out FDI
> disable and add FDI assertions
> git bisect skip 0fc932b8ec36116bb759105ce910b0475e63112a
> # bad: [bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2] Merge branch 'drm-intel-fixes'
> into drm-intel-next
> git bisect bad bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2
> # skip: [040484af3a4efa65786b6e107fbe74747679e17c] drm/i915: add transcoder
> enable/disable functions
> git bisect skip 040484af3a4efa65786b6e107fbe74747679e17c
> # skip: [633f2ea26665d37bb3c8ae30799aa14988622653] drm/i915: Disable SSC for
> outputs other than LVDS or DP
> git bisect skip 633f2ea26665d37bb3c8ae30799aa14988622653
> # skip: [d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e] drm/i915: assert panel is
> unlocked before writing transcoder timing regs
> git bisect skip d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e
> # skip: [9a4114ffa7b6f5f4635e3745a8dc051d15d4596a] drm/i915/bios: Change default
> clock source on PineView to use SSC
> git bisect skip 9a4114ffa7b6f5f4635e3745a8dc051d15d4596a
> # skip: [92f2584a083986c05fc811bbdf380c3fa7c12296] drm/i915: add PCH DPLL
> enable/disable functions
> git bisect skip 92f2584a083986c05fc811bbdf380c3fa7c12296
> # skip: [65993d64a31844ad444694efb2d159eb9c883e49] drm/i915: don't enable plane,
> pipe and PLL prematurely
> git bisect skip 65993d64a31844ad444694efb2d159eb9c883e49
> # skip: [01fe9dbde19a1a27b8ee63e2d964562962e1eb78] drm/i915: Use ACPI OpRegion
> to determine lid status
> git bisect skip 01fe9dbde19a1a27b8ee63e2d964562962e1eb78
>
> Regards
> Olaf Freyer
>

------>8---------->8--------------->8-------------->8------------

Revert "drm/i915: tune Sandy Bridge DRPS constants"

This reverts parts of commit ccab5c82759e2ace74b2e84f82d1e0eedd932571.

It does not touch debugfs.c

Conflicts:

drivers/gpu/drm/i915/i915_debugfs.c

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 5d5def7..a245742 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3378,28 +3378,15 @@
#define GEN6_RP_DOWN_TIMEOUT 0xA010
#define GEN6_RP_INTERRUPT_LIMITS 0xA014
#define GEN6_RPSTAT1 0xA01C
-#define GEN6_CAGF_SHIFT 8
-#define GEN6_CAGF_MASK (0x7f << GEN6_CAGF_SHIFT)
#define GEN6_RP_CONTROL 0xA024
#define GEN6_RP_MEDIA_TURBO (1<<11)
#define GEN6_RP_USE_NORMAL_FREQ (1<<9)
#define GEN6_RP_MEDIA_IS_GFX (1<<8)
#define GEN6_RP_ENABLE (1<<7)
-#define GEN6_RP_UP_IDLE_MIN (0x1<<3)
-#define GEN6_RP_UP_BUSY_AVG (0x2<<3)
-#define GEN6_RP_UP_BUSY_CONT (0x4<<3)
-#define GEN6_RP_DOWN_IDLE_CONT (0x1<<0)
+#define GEN6_RP_UP_BUSY_MAX (0x2<<3)
+#define GEN6_RP_DOWN_BUSY_MIN (0x2<<0)
#define GEN6_RP_UP_THRESHOLD 0xA02C
#define GEN6_RP_DOWN_THRESHOLD 0xA030
-#define GEN6_RP_CUR_UP_EI 0xA050
-#define GEN6_CURICONT_MASK 0xffffff
-#define GEN6_RP_CUR_UP 0xA054
-#define GEN6_CURBSYTAVG_MASK 0xffffff
-#define GEN6_RP_PREV_UP 0xA058
-#define GEN6_RP_CUR_DOWN_EI 0xA05C
-#define GEN6_CURIAVG_MASK 0xffffff
-#define GEN6_RP_CUR_DOWN 0xA060
-#define GEN6_RP_PREV_DOWN 0xA064
#define GEN6_RP_UP_EI 0xA068
#define GEN6_RP_DOWN_EI 0xA06C
#define GEN6_RP_IDLE_HYSTERSIS 0xA070
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index aa43e7b..be9890e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7101,18 +7101,18 @@ void gen6_enable_rps(struct drm_i915_private *dev_priv)
I915_WRITE(GEN6_RP_INTERRUPT_LIMITS,
18 << 24 |
6 << 16);
- I915_WRITE(GEN6_RP_UP_THRESHOLD, 10000);
- I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 1000000);
+ I915_WRITE(GEN6_RP_UP_THRESHOLD, 90000);
+ I915_WRITE(GEN6_RP_DOWN_THRESHOLD, 100000);
I915_WRITE(GEN6_RP_UP_EI, 100000);
- I915_WRITE(GEN6_RP_DOWN_EI, 5000000);
+ I915_WRITE(GEN6_RP_DOWN_EI, 300000);
I915_WRITE(GEN6_RP_IDLE_HYSTERSIS, 10);
I915_WRITE(GEN6_RP_CONTROL,
GEN6_RP_MEDIA_TURBO |
GEN6_RP_USE_NORMAL_FREQ |
GEN6_RP_MEDIA_IS_GFX |
GEN6_RP_ENABLE |
- GEN6_RP_UP_BUSY_AVG |
- GEN6_RP_DOWN_IDLE_CONT);
+ GEN6_RP_UP_BUSY_MAX |
+ GEN6_RP_DOWN_BUSY_MIN);

if (wait_for((I915_READ(GEN6_PCODE_MAILBOX) & GEN6_PCODE_READY) == 0,
500))

2011-06-28 20:46:51

by Olaf Freyer

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

Am 26.06.2011 18:27, schrieb Florian Mickler:
>> Both of my bisection attempts ended near the same set of drm/i915 changes that
>> resulted in non-bootable kernels. Considering I'm using a Intel(R) Core(TM)
>> i7-2720QM with some Intel Sandybridge Chipset graphics those might sound
>> somewhat plausible to someone knowing the internals, but don't help me at all.
> There are 2 untested commits left. But they don't seem to be relevant.
> Guessing on those gives me:
>
> The first bad commit could be any of:
> b0b544cd37c060e261afb2cf486296983fcb56da
> f67a559daaa0e2ba616bfe9438f202bc57bc8c72
> 18b2190ca5bd3f19717421b1591c79c9b0372428
> 6f06ce184c765fd8d50669a8d12fdd566c920859
> 0fc932b8ec36116bb759105ce910b0475e63112a
> 311bd68e024f9006db66cbadc3bd9f62fd663f4b
> 040484af3a4efa65786b6e107fbe74747679e17c
> ccab5c82759e2ace74b2e84f82d1e0eedd932571
> aa9b500ddf1a6318e7cf8b1754696edddae86db9
> d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e
> 92f2584a083986c05fc811bbdf380c3fa7c12296
> 9a4114ffa7b6f5f4635e3745a8dc051d15d4596a
> 633f2ea26665d37bb3c8ae30799aa14988622653
> 63d7bbe9ded4146e3f78e5742b119fa1fdb52665
> 417ae1476de3ae9689a374d70565f41b3474641e
> ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0
> b24e71798871089da1a4ab049db2800afc1aac0c
> fe4402931e43e81a4129eba41d05cf8907603af5
> 65993d64a31844ad444694efb2d159eb9c883e49
> c0c06bd244179f754d68684fd87674585a153e40
> 01fe9dbde19a1a27b8ee63e2d964562962e1eb78
> a37f2f87edc1b6e5932becf6e51535d36b690f2a
> bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2
>
>
>
> I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
> cause. Can you check if the appended revert of that commit makes
> things disappear?
It seems you guessed correctly - reverting the commit makes
those notifications go away at once.

2011-06-28 20:59:53

by Jesse Barnes

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

On Tue, 28 Jun 2011 22:46:41 +0200
Olaf Freyer wrote:

> Am 26.06.2011 18:27, schrieb Florian Mickler:
> >> Both of my bisection attempts ended near the same set of drm/i915 changes that
> >> resulted in non-bootable kernels. Considering I'm using a Intel(R) Core(TM)
> >> i7-2720QM with some Intel Sandybridge Chipset graphics those might sound
> >> somewhat plausible to someone knowing the internals, but don't help me at all.
> > There are 2 untested commits left. But they don't seem to be relevant.
> > Guessing on those gives me:
> >
> > The first bad commit could be any of:
> > b0b544cd37c060e261afb2cf486296983fcb56da
> > f67a559daaa0e2ba616bfe9438f202bc57bc8c72
> > 18b2190ca5bd3f19717421b1591c79c9b0372428
> > 6f06ce184c765fd8d50669a8d12fdd566c920859
> > 0fc932b8ec36116bb759105ce910b0475e63112a
> > 311bd68e024f9006db66cbadc3bd9f62fd663f4b
> > 040484af3a4efa65786b6e107fbe74747679e17c
> > ccab5c82759e2ace74b2e84f82d1e0eedd932571
> > aa9b500ddf1a6318e7cf8b1754696edddae86db9
> > d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e
> > 92f2584a083986c05fc811bbdf380c3fa7c12296
> > 9a4114ffa7b6f5f4635e3745a8dc051d15d4596a
> > 633f2ea26665d37bb3c8ae30799aa14988622653
> > 63d7bbe9ded4146e3f78e5742b119fa1fdb52665
> > 417ae1476de3ae9689a374d70565f41b3474641e
> > ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0
> > b24e71798871089da1a4ab049db2800afc1aac0c
> > fe4402931e43e81a4129eba41d05cf8907603af5
> > 65993d64a31844ad444694efb2d159eb9c883e49
> > c0c06bd244179f754d68684fd87674585a153e40
> > 01fe9dbde19a1a27b8ee63e2d964562962e1eb78
> > a37f2f87edc1b6e5932becf6e51535d36b690f2a
> > bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2
> >
> >
> >
> > I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
> > cause. Can you check if the appended revert of that commit makes
> > things disappear?
> It seems like you guessed perfectly correct - reverting the commit makes
> those notifications go away at once.
>

Without this reverted you see messages? I missed the earlier stuff,
what message are you seeing?

--
Jesse Barnes, Intel Open Source Technology Center

2011-06-28 21:09:51

by Olaf Freyer

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

Am 28.06.2011 22:59, schrieb Jesse Barnes:
> On Tue, 28 Jun 2011 22:46:41 +0200
> Olaf Freyer wrote:
>
>> Am 26.06.2011 18:27, schrieb Florian Mickler:
>>>> Both of my bisection attempts ended near the same set of drm/i915 changes that
>>>> resulted in non-bootable kernels. Considering I'm using a Intel(R) Core(TM)
>>>> i7-2720QM with some Intel Sandybridge Chipset graphics those might sound
>>>> somewhat plausible to someone knowing the internals, but don't help me at all.
>>> There are 2 untested commits left. But they don't seem to be relevant.
>>> Guessing on those gives me:
>>>
>>> The first bad commit could be any of:
>>> b0b544cd37c060e261afb2cf486296983fcb56da
>>> f67a559daaa0e2ba616bfe9438f202bc57bc8c72
>>> 18b2190ca5bd3f19717421b1591c79c9b0372428
>>> 6f06ce184c765fd8d50669a8d12fdd566c920859
>>> 0fc932b8ec36116bb759105ce910b0475e63112a
>>> 311bd68e024f9006db66cbadc3bd9f62fd663f4b
>>> 040484af3a4efa65786b6e107fbe74747679e17c
>>> ccab5c82759e2ace74b2e84f82d1e0eedd932571
>>> aa9b500ddf1a6318e7cf8b1754696edddae86db9
>>> d9b6cb568bc6eca8db88357bf8bbb92d42a91b1e
>>> 92f2584a083986c05fc811bbdf380c3fa7c12296
>>> 9a4114ffa7b6f5f4635e3745a8dc051d15d4596a
>>> 633f2ea26665d37bb3c8ae30799aa14988622653
>>> 63d7bbe9ded4146e3f78e5742b119fa1fdb52665
>>> 417ae1476de3ae9689a374d70565f41b3474641e
>>> ea0760cfc00b9e534423fdaf630d1c8ce7a5ede0
>>> b24e71798871089da1a4ab049db2800afc1aac0c
>>> fe4402931e43e81a4129eba41d05cf8907603af5
>>> 65993d64a31844ad444694efb2d159eb9c883e49
>>> c0c06bd244179f754d68684fd87674585a153e40
>>> 01fe9dbde19a1a27b8ee63e2d964562962e1eb78
>>> a37f2f87edc1b6e5932becf6e51535d36b690f2a
>>> bdd92c9ad287e03a2ec52f5a89c470cd5caae1c2
>>>
>>>
>>>
>>> I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
>>> cause. Can you check if the appended revert of that commit makes
>>> things disappear?
>> It seems like you guessed perfectly correct - reverting the commit makes
>> those notifications go away at once.
>>
> Without this reverted you see messages? I missed the earlier stuff,
> what message are you seeing?
>
Since 2.6.39 I see these as soon as I start up the X server:

May 22 14:41:34 localhost kernel: [ 57.525844] CPU4: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525848] CPU0: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525851] CPU1: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525854] CPU2: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525856] CPU5: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525859] CPU3: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525861] Disabling lock
debugging due to kernel taint
May 22 14:41:34 localhost kernel: [ 57.525869] CPU6: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.525872] CPU7: Package power
limit notification (total events = 1)
May 22 14:41:34 localhost kernel: [ 57.536890] CPU1: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536893] CPU4: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536896] CPU2: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536899] CPU3: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536901] CPU5: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536904] CPU0: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536915] CPU6: Package power
limit normal
May 22 14:41:34 localhost kernel: [ 57.536918] CPU7: Package power
limit normal

2011-06-28 21:18:46

by Jesse Barnes

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

On Tue, 28 Jun 2011 23:09:45 +0200
Olaf Freyer wrote:
> >>> I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
> >>> cause. Can you check if the appended revert of that commit makes
> >>> things disappear?
> >> It seems like you guessed perfectly correct - reverting the commit makes
> >> those notifications go away at once.
> >>
> > Without this reverted you see messages? I missed the earlier stuff,
> > what message are you seeing?
> >
> Since 2.6.39 I saw those as soon as I start up the xserver:
>
> May 22 14:41:34 localhost kernel: [ 57.525844] CPU4: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525848] CPU0: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525851] CPU1: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525854] CPU2: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525856] CPU5: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525859] CPU3: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525861] Disabling lock
> debugging due to kernel taint
> May 22 14:41:34 localhost kernel: [ 57.525869] CPU6: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.525872] CPU7: Package power
> limit notification (total events = 1)
> May 22 14:41:34 localhost kernel: [ 57.536890] CPU1: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536893] CPU4: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536896] CPU2: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536899] CPU3: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536901] CPU5: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536904] CPU0: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536915] CPU6: Package power
> limit normal
> May 22 14:41:34 localhost kernel: [ 57.536918] CPU7: Package power
> limit normal

Ok interesting, didn't realize X startup was so GPU intensive. :)

The patch you reverted will definitely cause the GPU to ramp up its
frequency much faster than before, but it sounds like on your system
you might also see it with the revert if you run something GPU
intensive like nexuiz.

The CPU (and by extension the GPU) will take care of itself though; if
things get too hot or over power, it will clock throttle to keep itself
in a safe range.

--
Jesse Barnes, Intel Open Source Technology Center

2011-06-28 22:02:06

by Olaf Freyer

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

Am 28.06.2011 23:18, schrieb Jesse Barnes:
> On Tue, 28 Jun 2011 23:09:45 +0200
> Olaf Freyer wrote:
>>>>> I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
>>>>> cause. Can you check if the appended revert of that commit makes
>>>>> things disappear?
>>>> It seems like you guessed perfectly correct - reverting the commit makes
>>>> those notifications go away at once.
>>>>
>>> Without this reverted you see messages? I missed the earlier stuff,
>>> what message are you seeing?
>>>
>> Since 2.6.39 I saw those as soon as I start up the xserver:
>>
>> May 22 14:41:34 localhost kernel: [ 57.525848] CPU0: Package power
>> limit notification (total events = 1)
>> May 22 14:41:34 localhost kernel: [ 57.536904] CPU0: Package power
>> limit normal
> Ok interesting, didn't realize X startup was so GPU intensive. :)
>
> The patch you reverted will definitely cause the GPU to ramp up its
> frequency much faster than before, but it sounds like on your system
> you might also see it with the revert if you run something GPU
> intensive like nexuiz.
>
> The CPU (and by extension the GPU) will take care of itself though; if
> things get too hot or over power, it will clock throttle to keep itself
> in a safe range.
I also see the message a lot during normal daily use of my computer
(just Firefox, Thunderbird and IntelliJ). Since 2.6.39, seeing things like
CPU3: Package power limit notification (total events = 90809)
after a normal day in the office has become routine.

I just ran nexuiz for about 30 minutes with the revert patch applied,
and not a single message appeared in my logs.

2011-06-28 22:07:10

by Jesse Barnes

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

On Wed, 29 Jun 2011 00:01:58 +0200
Olaf Freyer wrote:

> Am 28.06.2011 23:18, schrieb Jesse Barnes:
> > On Tue, 28 Jun 2011 23:09:45 +0200
> > Olaf Freyer wrote:
> >>>>> I'd guess ccab5c82759e2ace74b2e84f82d1e0eedd932571 could be the
> >>>>> cause. Can you check if the appended revert of that commit makes
> >>>>> things disappear?
> >>>> It seems like you guessed perfectly correct - reverting the commit makes
> >>>> those notifications go away at once.
> >>>>
> >>> Without this reverted you see messages? I missed the earlier stuff,
> >>> what message are you seeing?
> >>>
> >> Since 2.6.39 I saw those as soon as I start up the xserver:
> >>
> >> May 22 14:41:34 localhost kernel: [ 57.525848] CPU0: Package power
> >> limit notification (total events = 1)
> >> May 22 14:41:34 localhost kernel: [ 57.536904] CPU0: Package power
> >> limit normal
> > Ok interesting, didn't realize X startup was so GPU intensive. :)
> >
> > The patch you reverted will definitely cause the GPU to ramp up its
> > frequency much faster than before, but it sounds like on your system
> > you might also see it with the revert if you run something GPU
> > intensive like nexuiz.
> >
> > The CPU (and by extension the GPU) will take care of itself though; if
> > things get too hot or over power, it will clock throttle to keep itself
> > in a safe range.
> I also see the message alot during my daily average usage of my computer
> (just using Firefox, Thunderbird and IntelliJ) - seeing things like
> CPU3: Package power limit notification (total events = 90809)
> after a normal day in the office became normal since 2.6.39.
>
> I just gave nexuiz a try for about 30 minutes with the reversal patch
> applied -
> and not a single message appeared in my logs.

Sounds like with the patch reverted we can't drive your GPU and CPU
hard enough to generate the messages. Not sure if that's a good thing
or a bad thing though...

--
Jesse Barnes, Intel Open Source Technology Center

2011-06-30 06:37:22

by Olaf Freyer

Subject: Re: Erroneous package power limit notification since kernel 2.6.39

Am 29.06.2011 00:06, schrieb Jesse Barnes:
> On Wed, 29 Jun 2011 00:01:58 +0200
> Olaf Freyer wrote:
>
>> Am 28.06.2011 23:18, schrieb Jesse Barnes:
>>> Ok interesting, didn't realize X startup was so GPU intensive. :)
>>>
>>> The patch you reverted will definitely cause the GPU to ramp up its
>>> frequency much faster than before, but it sounds like on your system
>>> you might also see it with the revert if you run something GPU
>>> intensive like nexuiz.
>>>
>>> The CPU (and by extension the GPU) will take care of itself though; if
>>> things get too hot or over power, it will clock throttle to keep itself
>>> in a safe range.
>> I also see the message alot during my daily average usage of my computer
>> (just using Firefox, Thunderbird and IntelliJ) - seeing things like
>> CPU3: Package power limit notification (total events = 90809)
>> after a normal day in the office became normal since 2.6.39.
>>
>> I just gave nexuiz a try for about 30 minutes with the reversal patch
>> applied -
>> and not a single message appeared in my logs.
> Sounds like with the patch reverted we can't drive your GPU and CPU
> hard enough to generate the messages. Not sure if that's a good thing
> or a bad thing though...
>
I'm not sure either. I saw a single notification event yesterday while
in the office; previously I would have received 70000-90000 in that time.
I find the sheer number of notifications unsettling, and a message about
a "real" issue might even get lost in between them.

Maybe there is a possible compromise between the situation before and
after the patch? I'm willing to lose a few percent of GPU performance just
for the sake of getting rid of those notification events...