From: Andy Lutomirski
Date: Fri, 29 Apr 2016 13:49:28 -0700
Subject: Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
To: Dave Hansen
Cc: Yu-cheng Yu, X86 ML, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar, "linux-kernel@vger.kernel.org", Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya, "Ravi V. Shankar", Fenghua Yu
In-Reply-To: <5723C6A7.4020704@linux.intel.com>
References: <5723A353.7060209@linux.intel.com> <20160429195741.GA15402@test-lenovo> <5723BE1F.7040300@linux.intel.com> <20160429200709.GA15412@test-lenovo> <5723C6A7.4020704@linux.intel.com>

On Fri, Apr 29, 2016 at 1:40 PM, Dave Hansen wrote:
> On 04/29/2016 01:25 PM, Andy Lutomirski wrote:
>> On Fri, Apr 29, 2016 at 1:07 PM, Yu-cheng Yu wrote:
>>> On Fri, Apr 29, 2016 at 01:03:43PM -0700, Dave Hansen wrote:
>>>> That's not feasible. Think of dynamic libraries or just-in-time
>>>> compilers. What instruction set does /usr/bin/java use, for instance? :)
>>>
>>> The java argument is true. In that case or when the bitmask is
>>> missing, we can allocate for all supported features.
>>
>> I actually want to see us moving in the direction of unconditionally
>> allocating everything on process startup. If we can stop using CR0.TS
>> entirely, I think everything will be better.
>
> We can absolutely allocate the worst-case XSAVE buffer at task startup
> for folks that never want to see a latency spike in the life of the app
> no matter what.
>
> But I also think it would be pretty nice if 'ls' didn't pay the 2k cost
> to have AVX-512 state if it's not using AVX-512. We also don't have to
> do this with CR0.TS. We'd actually use a combination of out-of-line
> (not appended to task_struct) XSAVE buffers and XGETBV1 to check the
> size of our XSAVE buffer before we call XSAVE* and resize it when needed.
>
> Maybe nobody will ever care enough about 2kbytes/thread, though.

I suspect we're so far above 2k/thread that no one cares.

That being said, when I wrote this email, I wasn't thinking about
compacted form at all. I think we should allocate a viable xstate
area of some sort on startup and use
xsaves/xrstors/xsaveopt/whatever without fiddling with TS and eagerly
save and restore even if no extended state whatsoever has been used.

I'm certainly okay in principle with reallocating. However, what do
we do if we run out of memory when trying to reallocate?

-- 
Andy Lutomirski
AMA Capital Management, LLC
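
To make the reallocation question concrete, here is a rough userspace
sketch of the "check the size before saving, grow when needed" idea.
It is not kernel code and not anything from the patch set. The names
xstate_buf, xstate_required_size, xstate_buf_ensure and failing_realloc
are hypothetical, features_in_use merely stands in for the bitmap that
XGETBV with ECX=1 would report, and the component sizes are hard-coded
for illustration where real code would enumerate them via CPUID leaf 0xD.
The one policy it assumes is that a failed grow leaves the old buffer
intact and reports -ENOMEM; what the caller should then do is exactly
the open question.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* XCR0-style feature bits (bit positions follow the architectural layout). */
#define XFEATURE_AVX        (1u << 2)
#define XFEATURE_OPMASK     (1u << 5)
#define XFEATURE_ZMM_HI256  (1u << 6)
#define XFEATURE_HI16_ZMM   (1u << 7)
#define XFEATURE_AVX512     (XFEATURE_OPMASK | XFEATURE_ZMM_HI256 | XFEATURE_HI16_ZMM)

/* Hypothetical out-of-line buffer, i.e. not embedded in a task structure. */
struct xstate_buf {
    void  *data;
    size_t size;
};

typedef void *(*realloc_fn)(void *, size_t);

/* Illustrative sizes only; alignment padding between components is ignored. */
size_t xstate_required_size(uint32_t features_in_use)
{
    size_t size = 512 + 64;  /* legacy FXSAVE region + XSAVE header */

    if (features_in_use & XFEATURE_AVX)
        size += 256;
    if (features_in_use & XFEATURE_OPMASK)
        size += 64;
    if (features_in_use & XFEATURE_ZMM_HI256)
        size += 512;
    if (features_in_use & XFEATURE_HI16_ZMM)
        size += 1024;
    return size;
}

/*
 * Grow the buffer if the in-use features no longer fit. Never shrinks.
 * Returns 0 on success, -ENOMEM if the allocator fails; on failure the
 * existing buffer and its size are left untouched.
 */
int xstate_buf_ensure(struct xstate_buf *buf, uint32_t features_in_use,
                      realloc_fn alloc)
{
    assert(buf != NULL && alloc != NULL);

    size_t need = xstate_required_size(features_in_use);
    if (need <= buf->size)
        return 0;

    void *bigger = alloc(buf->data, need);
    if (bigger == NULL)
        return -ENOMEM;

    /* Zero only the newly added tail so stale heap data is never restored. */
    memset((char *)bigger + buf->size, 0, need - buf->size);
    buf->data = bigger;
    buf->size = need;
    return 0;
}

/* Test helper that simulates running out of memory. */
void *failing_realloc(void *ptr, size_t size)
{
    (void)ptr;
    (void)size;
    return NULL;
}
```

Passing the allocator in as a parameter is only there so the failure
path can be exercised; with failing_realloc the call returns -ENOMEM and
the previously saved state is still usable, which at least keeps the
task's existing state from being lost.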