From: Alok Kataria
To: "jacob.jun.pan@linux.intel.com"
CC: "rui.zhang@intel.com", "linux-kernel@vger.kernel.org", "eric.ernst@intel.com", "rjw@sisk.pl"
Subject: Re: Regression in intel_powerclamp, due to cpu whitelist removal
Date: Thu, 20 Oct 2016 04:02:55 +0000
Message-ID: <1476936498.2694.21.camel@vmware.com>
References: <2FF1D5AB-46C6-4BEC-A5A7-EC9C16A99919@vmware.com> <20161019204530.3d2ec1d5@jacob-builder>
In-Reply-To: <20161019204530.3d2ec1d5@jacob-builder>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2016-10-19 at 20:45 -0700, Jacob Pan wrote:
> On Tue, 18 Oct 2016 14:20:49 +0000
> Alok Kataria wrote:
>
> > Hi Jacob, Zhang,
> >
> > One of your recent commits, “thermal/powerclamp: remove cpu
> > whitelist” [1], has caused a regression in the kernel.
> >
> > That commit changed powerclamp_probe from requiring all of the
> > following features:
> >
> > X86_FEATURE_NONSTOP_TSC
> > X86_FEATURE_CONSTANT_TSC
> > X86_FEATURE_MWAIT
> > X86_FEATURE_ARAT
> >
> > to *any* of them.
> > The problem is clamp_thread still wants to use
> > mwait_idle_with_hints even if the CPU doesn't support it.
> >
> Hi Alok,
>
> You are right, it should be AND not OR.
>
> +Eric who has a patch to address this.
>
> https://patchwork.kernel.org/patch/9365005/

Thanks Jacob.

Also, I don't see stable copied on that submission. Shouldn't this be a
candidate for backporting to all affected kernel versions?

Thanks,
Alok

>
> Rui/Rafael,
>
> Could you consider this as an urgent fix?
>
> Jacob
> > This was reported by our users when running Ubuntu 16.10
> > (4.8.0-22-generic) inside a VMware VM, though as mentioned above I
> > don’t think it is specific to our platform. We have seen kernel
> > panics due to invalid opcode because of this. Below is the stack
> > trace for your reference.
> >
> > [ 5.736416] invalid opcode: 0000 [#1] SMP
> > [ 5.736455] Modules linked in: vmw_vsock_vmci_transport vsock
> > vmw_balloon intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul
> > ghash_clmulni_intel aesni_intel aes_x86_64 lrw glue_helper
> > ablk_helper cryptd intel_rapl_perf input_leds joydev serio_raw
> > snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi
> > snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd
> > soundcore i2c_piix4 shpchp vmw_vmci nfit floppy(+) mac_hid parport_pc
> > ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
> > ahci libahci e1000 mptspi mptscsih psmouse mptbase vmwgfx
> > scsi_transport_spi ttm drm_kms_helper syscopyarea sysfillrect
> > sysimgblt fb_sys_fops drm pata_acpi fjes
> > [ 5.744370] CPU: 1 PID: 912 Comm: kidle_inject/1 Not tainted 4.8.0-22-generic #24-Ubuntu
> > [ 5.744373] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> > [ 5.744375] task: ffff9658f7a663c0 task.stack: ffff9658fa908000
> > [ 5.744378] RIP: 0010:[] [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> > [ 5.744380] RSP: 0018:ffff9658fa90be00 EFLAGS: 00010246
> > [ 5.744383] RAX: ffff9658fa908008 RBX: 00000000fffee0a6 RCX: 0000000000000000
> > [ 5.744386] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
> > [ 5.744388] RBP: ffff9658fa90bec0 R08: ffff9658fa908000 R09: 0000000000000000
> > [ 5.744391] R10: 000000000001cbf7 R11: 0000000000000000 R12: ffffffff8db581a0
> > [ 5.744393] R13: ffff9658fa908000 R14: 0000000000000000 R15: ffff9658fa908000
> > [ 5.744396] FS: 0000000000000000(0000) GS:ffff9658fc640000(0000) knlGS:0000000000000000
> > [ 5.744398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 5.744401] CR2: 00007ffa6cc262e8 CR3: 000000003ab3b000 CR4: 00000000001406e0
> > [ 5.744403] Stack:
> > [ 5.744406] 0000000000000001 ffff9658f7a66dc0 ffff9658fc659200 00000000e878d638
> > [ 5.744409] 0000000000000001 00000002fc659200 0000000000000001 ffff9658fa908008
> > [ 5.744411] 0000000000000000 ffff9658fc64fea8 00000000fffee0a6 ffffffffc05720a0
> > [ 5.744414] Call Trace:
> > [ 5.744416] [] ? pkg_state_counter+0xa0/0xa0 [intel_powerclamp]
> > [ 5.744419] [] ? powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [ 5.744421] [] ? powerclamp_set_cur_state+0x170/0x170 [intel_powerclamp]
> > [ 5.744424] [] kthread+0xd8/0xf0
> > [ 5.744427] [] ret_from_fork+0x1f/0x40
> > [ 5.744429] [] ? kthread_create_on_node+0x1e0/0x1e0
> > [ 5.744432] Code: cc e9 ba 00 00 00 eb 19 0f 1f 00 0f ae f0 65 48
> > 8b 04 25 04 69 01 00 0f ae b8 08 c0 ff ff 0f ae f0 31 d2 48 8b 44 24
> > 38 48 89 d1 <0f> 01 c8 49 8b 45 08 a8 08 75 0b b9 01 00 00 00 4c 89
> > f0 0f 01
> > [ 5.744434] RIP [] clamp_thread+0x2b8/0x5d0 [intel_powerclamp]
> > [ 5.744437] RSP
> > [ 5.744440] invalid opcode: 0000 [#2] SMP
> > [ 5.744452] ---[ end trace cf659c4076bf2804 ]---
> >
> > Looking at the instruction at the RIP shows that
> > the kernel attempted to execute the “monitor” instruction.
> >
> >   8b8:   0f 01 c8       monitor %rax,%rcx,%rdx
> >   8bb:   49 8b 45 08    mov 0x8(%r13),%rax
> >
> > To fix this, I think you should restore the explicit feature check
> > “if block” that was removed in the above-mentioned commit. Can you
> > please look at this?
> >
> > Thanks,
> > Alok
> >
> >
> > [1] b721ca0d192754deccb89fb01c77e41e6fd91ad9
> > https://github.com/torvalds/linux/commit/b721ca0d192754deccb89fb01c77e41e6fd91ad9
> >
> > [Jacob Pan]
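
For anyone following the thread, the "AND not OR" point can be sketched in
plain userspace C. This is only an illustration and not the driver code or
the patch referenced above: the FEAT_* bit values and the probe_any() and
probe_all() helpers are hypothetical names made up for this example, and the
kernel's own X86_FEATURE_* handling works differently.

```c
#include <stdbool.h>

/* Hypothetical feature bits for illustration only; these are not the
 * kernel's X86_FEATURE_* encodings. */
enum {
    FEAT_NONSTOP_TSC  = 1u << 0,
    FEAT_CONSTANT_TSC = 1u << 1,
    FEAT_MWAIT        = 1u << 2,
    FEAT_ARAT         = 1u << 3,
};

/* Every feature the clamping code is assumed to rely on. */
#define FEAT_REQUIRED \
    (FEAT_NONSTOP_TSC | FEAT_CONSTANT_TSC | FEAT_MWAIT | FEAT_ARAT)

/* "OR" semantics: passes as soon as one required bit is present. */
static bool probe_any(unsigned int cpu_features)
{
    return (cpu_features & FEAT_REQUIRED) != 0;
}

/* "AND" semantics: passes only when every required bit is present. */
static bool probe_all(unsigned int cpu_features)
{
    return (cpu_features & FEAT_REQUIRED) == FEAT_REQUIRED;
}
```

With a hypothetical CPU that exposes the TSC features and ARAT but not MWAIT,
probe_any() accepts it while probe_all() rejects it, which matches the case
described in the report where the probe succeeds and the idle-injection
thread later hits an unsupported instruction.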